Java StAX: XMLEventReader - The Iterator API
Jakob Jenkov
The XMLEventReader class in Java StAX provides an iterator-style API for parsing XML.
It lets you move from event to event in the XML and decide for yourself when to advance to the next one.
An "event" here is, for instance, the beginning of an element, the end of an element, or a run of text.
These are largely the same events you would get from a SAX parser, except that you pull them from the reader instead of having them pushed to a handler.
You create an XMLEventReader via the javax.xml.stream.XMLInputFactory class. Here is how that looks:
    XMLInputFactory factory = XMLInputFactory.newInstance();

    // get a Reader connected to XML input from somewhere
    Reader reader = getXmlReader();

    try {
        XMLEventReader eventReader = factory.createXMLEventReader(reader);
    } catch (XMLStreamException e) {
        e.printStackTrace();
    }
Once created, you can iterate through the XML input from the underlying Reader. Here is how that looks:
    while (eventReader.hasNext()) {
        XMLEvent event = eventReader.nextEvent();

        if (event.getEventType() == XMLStreamConstants.START_ELEMENT) {
            StartElement startElement = event.asStartElement();
            System.out.println(startElement.getName().getLocalPart());
        }
        // handle more event types here...
    }
You obtain an XMLEvent object from the XMLEventReader by calling its nextEvent() method. From the event object you can check what type of event you've got by calling its getEventType() method. Depending on the type of event you have encountered, you take different actions.
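The two fragments above can be combined into one small runnable program. This is only a sketch: the class name ElementNameLister, the method elementNames() and the sample XML are made up for illustration, and the input comes from a StringReader rather than a real file or network stream.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class; the name and the sample XML are illustrative only.
class ElementNameLister {

    // Returns the local names of all start elements, in document order.
    static List<String> elementNames(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLEventReader eventReader = factory.createXMLEventReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        try {
            while (eventReader.hasNext()) {
                XMLEvent event = eventReader.nextEvent();
                if (event.getEventType() == XMLStreamConstants.START_ELEMENT) {
                    names.add(event.asStartElement().getName().getLocalPart());
                }
            }
        } finally {
            // Releases the parser's resources; the underlying Reader is not closed.
            eventReader.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(elementNames("<order><item/><item/></order>"));
    }
}
```

Keeping the reader inside a single method also sidesteps the scoping problem of declaring it inside a try block and using it afterwards.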
XML Stream Events
Below is a list of the events you can encounter in an XML stream. There are constants for each of these events in the javax.xml.stream.XMLStreamConstants interface.
- ATTRIBUTE
- CDATA
- CHARACTERS
- COMMENT
- DTD
- END_DOCUMENT
- END_ELEMENT
- ENTITY_DECLARATION
- ENTITY_REFERENCE
- NAMESPACE
- NOTATION_DECLARATION
- PROCESSING_INSTRUCTION
- SPACE
- START_DOCUMENT
- START_ELEMENT
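To get a feel for which of these events a parser actually delivers, you can count them for a small document. The sketch below assumes a hypothetical helper class EventTypeCounter and maps only some of the constants to names; anything else is reported as OTHER. Note that with an XMLEventReader, attributes and namespaces are normally reached through the StartElement event rather than delivered as separate ATTRIBUTE or NAMESPACE events.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;

// Hypothetical demo class; not part of the StAX API.
class EventTypeCounter {

    // Translates an event type constant into a readable name.
    static String typeName(int eventType) {
        switch (eventType) {
            case XMLStreamConstants.START_DOCUMENT:         return "START_DOCUMENT";
            case XMLStreamConstants.END_DOCUMENT:           return "END_DOCUMENT";
            case XMLStreamConstants.START_ELEMENT:          return "START_ELEMENT";
            case XMLStreamConstants.END_ELEMENT:            return "END_ELEMENT";
            case XMLStreamConstants.CHARACTERS:             return "CHARACTERS";
            case XMLStreamConstants.CDATA:                  return "CDATA";
            case XMLStreamConstants.SPACE:                  return "SPACE";
            case XMLStreamConstants.COMMENT:                return "COMMENT";
            case XMLStreamConstants.PROCESSING_INSTRUCTION: return "PROCESSING_INSTRUCTION";
            case XMLStreamConstants.DTD:                    return "DTD";
            default:                                        return "OTHER(" + eventType + ")";
        }
    }

    // Counts how often each event type occurs, in order of first appearance.
    static Map<String, Integer> countEvents(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        Map<String, Integer> counts = new LinkedHashMap<>();
        try {
            while (reader.hasNext()) {
                counts.merge(typeName(reader.nextEvent().getEventType()), 1, Integer::sum);
            }
        } finally {
            reader.close();
        }
        return counts;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(countEvents("<root><child>text</child></root>"));
    }
}
```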
XMLEvent Processing
From the XMLEvent object you can get access to the corresponding XML data. You can also get information about where (line number + column number) in the XML stream the event was encountered.
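The position information is available through the event's getLocation() method, which returns a javax.xml.stream.Location. Here is a minimal sketch, assuming a hypothetical helper lineOf() that looks up the line on which a given element starts. A parser may report -1 when it does not know the position, and the exact column values depend on the implementation.

```java
import java.io.StringReader;
import javax.xml.stream.Location;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class; not part of the StAX API.
class EventLocator {

    // Returns the line number reported for the first start element with the
    // given local name, or -1 if no such element is found.
    static int lineOf(String xml, String localName) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals(localName)) {
                    Location location = event.getLocation();
                    return location.getLineNumber();
                }
            }
            return -1;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root>\n  <child/>\n</root>";
        System.out.println("child starts on line " + lineOf(xml, "child"));
    }
}
```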
You can turn the event object into a more specific event type object, by calling one of these 3 methods:
- asStartElement()
- asEndElement()
- asCharacters()
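There are no as...() methods for the other event types. For events like START_DOCUMENT or PROCESSING_INSTRUCTION you instead cast the XMLEvent to the matching interface in the javax.xml.stream.events package (for instance StartDocument or ProcessingInstruction), typically after checking the type with getEventType() or one of the is...() methods such as isProcessingInstruction(). Namespace and attribute information is normally reached through the StartElement object. Luckily, we will most often only need the START_ELEMENT, END_ELEMENT, and CHARACTERS events.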
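For event types without a dedicated as...() method, one approach is to check the type and cast the event to the matching interface from javax.xml.stream.events. The sketch below does this for processing instructions; the class ProcessingInstructionLister and the sample XML are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.ProcessingInstruction;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class; not part of the StAX API.
class ProcessingInstructionLister {

    // Returns the targets of all processing instructions in the document.
    static List<String> targets(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        List<String> targets = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isProcessingInstruction()) {
                    // No asProcessingInstruction() exists, so cast instead.
                    ProcessingInstruction pi = (ProcessingInstruction) event;
                    targets.add(pi.getTarget());
                }
            }
        } finally {
            reader.close();
        }
        return targets;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(targets("<root><?render mode=\"fast\"?></root>"));
    }
}
```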
XMLEvent.asStartElement()
The asStartElement()
method returns a java.xml.stream.StartElement
object. From this
object you can get the name of the element, get the namespaces of the element, and the attributes of the element.
See the Java 6 JavaDoc for more detail.
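As a sketch of how the attributes can be read, the example below collects the attributes of the first start element into a map. The class StartElementInspector and the sample XML are assumptions made for illustration. The iterator is declared as Iterator<?> and each entry is cast, so the code compiles both on older Java versions, where getAttributes() returns a raw Iterator, and on newer ones.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class; not part of the StAX API.
class StartElementInspector {

    // Returns the attributes (local name -> value) of the first start element.
    static Map<String, String> firstElementAttributes(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        Map<String, String> result = new LinkedHashMap<>();
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    StartElement start = event.asStartElement();
                    Iterator<?> attributes = start.getAttributes();
                    while (attributes.hasNext()) {
                        Attribute attribute = (Attribute) attributes.next();
                        result.put(attribute.getName().getLocalPart(), attribute.getValue());
                    }
                    break; // only the first element is of interest here
                }
            }
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(firstElementAttributes("<book id=\"42\" lang=\"en\"/>"));
    }
}
```

If you only need a single attribute, StartElement.getAttributeByName(new QName("id")) looks it up directly and returns null when the attribute is absent.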
XMLEvent.asEndElement()
The asEndElement() method returns a javax.xml.stream.events.EndElement object. From this object you can get the element name and the namespaces that go out of scope when the element ends.
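End element events are mostly useful together with start element events, for instance to keep track of where in the document you are. The following sketch records the path of each element at the moment it is closed. The class ElementPathTracker is hypothetical, and the stack-based bookkeeping is just one way of doing it.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class; not part of the StAX API.
class ElementPathTracker {

    // Returns the slash-separated path of every element, in closing order.
    static List<String> closedPaths(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        Deque<String> open = new ArrayDeque<>();   // currently open elements, root first
        List<String> paths = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    open.addLast(event.asStartElement().getName().getLocalPart());
                } else if (event.isEndElement()) {
                    EndElement end = event.asEndElement();
                    open.removeLast(); // well-formed XML: this is the matching start tag
                    String parentPath = String.join("/", open);
                    String name = end.getName().getLocalPart();
                    paths.add(parentPath.isEmpty() ? name : parentPath + "/" + name);
                }
            }
        } finally {
            reader.close();
        }
        return paths;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(closedPaths("<a><b/><c/></a>"));
    }
}
```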
XMLEvent.asCharacters()
The asCharacters() method returns a javax.xml.stream.events.Characters object. From this object you can obtain the characters themselves, as well as see if the characters are CDATA, white space, or ignorable white space.
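A common use of the Characters object is to collect text while skipping the white space that only serves as indentation. Here is a minimal sketch, assuming a hypothetical class TextCollector. It also sets the XMLInputFactory.IS_COALESCING property so that adjacent pieces of character data are merged into one event, since a parser is otherwise allowed to split text across several CHARACTERS events.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class; not part of the StAX API.
class TextCollector {

    // Concatenates all character data that is not purely white space.
    static String collectText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into single events.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isCharacters()) {
                    Characters characters = event.asCharacters();
                    if (!characters.isWhiteSpace()) {
                        text.append(characters.getData());
                    }
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(collectText("<msg>\n  <b>Hello</b>\n</msg>"));
    }
}
```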