Iterative SAX Parsing
January 19th, 2007 by James Carr
Today I spent my personal development time writing a SAX parser that allows iterative parsing: it stops on a specified “stop tag” for each iteration and exposes an Iterator interface. The power is in the custom handler, which uses closures to implement callbacks on each node encounter. In future work I hope to get it to the point where one can simply call next() on the iterator and get back the custom object that was just built. This is just a starting point, and there’s a lot of room for improvement (XPATH FILTERING!!!).
Disclaimer: the data (and format) is made up. It’s not from work.
The XML:
<?xml version="1.0"?>
<records>
    <record type="stolen">
        <vin>SDC3ER4320RMT5321</vin>
        <name>James Carr</name>
    </record>
    <record type="maint">
        <vin>RGGHWERTGBWERGBQE23R</vin>
        <name>Som Won</name>
    </record>
    <record type="sold">
        <vin>24werf34g34gqe4rgq</vin>
        <name>Das Boot</name>
    </record>
</records>
Setting a simple string:
// parser, handler and name are fields of the test class; name starts out as "".
public void testElementEncounterCallbackSetName() throws SAXException, IOException {
    parser = (PullSaxParser) XMLReaderFactory.createXMLReader("org.jamescarr.parsers.PullSaxParser");
    handler = new ClosureContentHandler();
    parser.setContentHandler(handler);

    // Stop after each <record> element.
    parser.iterateOn("record");

    // Copy the text of every <name> element into the name field.
    handler.onElement("name", new Closure<SimpleNode>() {
        public void execute(SimpleNode arg) {
            name = arg.getValue();
        }
    });

    parser.parse(getInputSourceFromString(XML_STRING_OF_THREE_RECORDS));
    Iterator<?> iter = parser.iterator();

    // Nothing has been handled until the first call to next().
    assertEquals("", name);
    iter.next();
    assertEquals("James Carr", name);
    iter.next();
    assertEquals("Som Won", name);
    iter.next();
    assertEquals("Das Boot", name);
}
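To make the callback idea concrete, here is a minimal sketch of a handler that maps element names to callbacks, built only on the JDK's standard SAX classes. It is not the ClosureContentHandler used above: the class name CallbackHandlerSketch, the collectNames helper, and the use of java.util.function.Consumer in place of Closure<SimpleNode> are all assumptions made for illustration, and it parses the whole document in one pass rather than iterating.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for ClosureContentHandler; not the author's class.
class CallbackHandlerSketch extends DefaultHandler {
    private final Map<String, Consumer<String>> callbacks = new HashMap<>();
    private final StringBuilder text = new StringBuilder();

    // Register a callback that receives the trimmed text of each matching element.
    // Returns this so registrations can be chained.
    CallbackHandlerSketch onElement(String elementName, Consumer<String> callback) {
        callbacks.put(elementName, callback);
        return this;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        Consumer<String> callback = callbacks.get(qName);
        if (callback != null) {
            callback.accept(text.toString().trim());
        }
        text.setLength(0);
    }

    // Illustrative helper: collect the text of every <name> element in the document.
    static List<String> collectNames(String xml) {
        List<String> names = new ArrayList<>();
        CallbackHandlerSketch handler = new CallbackHandlerSketch().onElement("name", names::add);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return names;
    }
}
```

Calling CallbackHandlerSketch.collectNames on the sample XML would fire the "name" callback once per record, which is the same shape as the onElement("name", ...) registration in the test above, minus the pause between records.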
Constructing an object:
// Field used to hand out record ids; the first record gets 1001.
int i = 1000;

public void testConstructRecordObjectFromEachElement() throws IOException, SAXException {
    parser = (PullSaxParser) XMLReaderFactory.createXMLReader("org.jamescarr.parsers.PullSaxParser");
    handler = new ClosureContentHandler();
    parser.setContentHandler(handler);
    parser.iterateOn("record");

    // Create a fresh record when <record> opens, then fill it in as the child elements are read.
    handler.onElementStart("record", new Closure<SimpleNode>() {
        public void execute(SimpleNode arg) {
            record = new CustomerRecord(++i);
            record.setType(arg.getAttribute("type"));
        }
    }).onElement("name", new Closure<SimpleNode>() {
        public void execute(SimpleNode arg) {
            record.setName(arg.getValue());
        }
    }).onElement("vin", new Closure<SimpleNode>() {
        public void execute(SimpleNode arg) {
            record.setVin(arg.getValue());
        }
    });

    // start
    assertNull(record);
    parser.parse(getInputSourceFromString(XML_STRING_OF_THREE_RECORDS));
    Iterator<?> iter = parser.iterator();

    iter.next();
    assertEquals(1001, record.getId());
    assertEquals("James Carr", record.getName());
    assertEquals("stolen", record.getType());
    assertEquals("SDC3ER4320RMT5321", record.getVin());

    iter.next();
    assertEquals(1002, record.getId());
    assertEquals("Som Won", record.getName());
    assertEquals("maint", record.getType());
    assertEquals("RGGHWERTGBWERGBQE23R", record.getVin());
}
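The goal mentioned at the top, calling next() and getting back the object that was just built, can be sketched with the JDK's StAX pull API. To be clear, this swaps in StAX for SAX and is not how PullSaxParser works; RecordSketch and RecordIteratorSketch are hypothetical names, and the vin and name element names are hard-coded to match the made-up sample data.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical record type, for illustration only.
class RecordSketch {
    String type;
    String vin;
    String name;
}

// Reads one stop-tag element per call to next(). Uses StAX, not the author's PullSaxParser.
class RecordIteratorSketch implements Iterator<RecordSketch> {
    private final XMLStreamReader reader;
    private final String stopTag;
    private RecordSketch pending; // record read ahead by hasNext(), if any

    RecordIteratorSketch(String xml, String stopTag) {
        try {
            this.reader = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not open XML stream", e);
        }
        this.stopTag = stopTag;
    }

    @Override
    public boolean hasNext() {
        if (pending == null) {
            pending = advance();
        }
        return pending != null;
    }

    @Override
    public RecordSketch next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        RecordSketch result = pending;
        pending = null;
        return result;
    }

    // Pull events until the stop tag closes, then return the record built so far.
    // Returns null once the document is exhausted.
    private RecordSketch advance() {
        try {
            RecordSketch current = null;
            StringBuilder text = new StringBuilder();
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    text.setLength(0);
                    if (reader.getLocalName().equals(stopTag)) {
                        current = new RecordSketch();
                        current.type = reader.getAttributeValue(null, "type");
                    }
                } else if (event == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT && current != null) {
                    String tag = reader.getLocalName();
                    if (tag.equals("vin")) {
                        current.vin = text.toString().trim();
                    } else if (tag.equals("name")) {
                        current.name = text.toString().trim();
                    } else if (tag.equals(stopTag)) {
                        return current; // stop here until the next call
                    }
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read XML stream", e);
        }
    }
}
```

With the sample XML, new RecordIteratorSketch(xml, "record") would hand back one populated record per next() call and read no further into the stream than the closing tag of that record.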







