The objective of this programming assignment is to be able to implement and analyze efficient algorithms for fragments of XPath queries.
-
Preliminaries
This assignment will use the streaming format for the XML. An XML document will be given as input to the implemented algorithms a whitespace delimited file having the following format:
0|1 <whitespace> element1 0|1 <whitespace> element2
where 0|1 is a bit indicating a startElement or endElement event respectively, and element is the name of the element.
-
Assignments
-
Assignment 1 – SimpleQuery
Implement a streaming algorithm for XPath queries of the form
//e1/e2/…/en
where ei are element names.
-
Assignment 2 – LazyDFA
Implement a streaming algorithm using the lazy DFA method explained in
for queries of the form:
//p1//p2//…//pn
where each pi is an path of the form:
ei1/ei2/…/eim
and eij are element names.
-
-
Execution
-
Compile
ant compile
-
Create file jar
ant jar
-
Execute file jar
ant run -Darg0=<path of xml file> -Darg1=<xpath query>
eg:
ant run -Darg0=test/input.txt -Darg1=//a
-
Delete files
ant clean
-
-
Reference:
[1] Todd J Green, Ashish Gupta, Gerome Miklau, Makoto Onizuka, and Dan Suciu. Processing XML streams with deterministic automata and stream indexes. ACM TODS, 29(4):752–788, 2004.
[2] Maynooth University – Constructing a DFA from an NFA Link
Leave a Reply