Java SAX Parsing Example

Jakob Jenkov
Last update: 2014-05-21

In this text I will show you an example of how to parse an XML file using a SAX parser and build an object graph from the parsed XML.

Here is the XML document I want to parse:

<driverVehicleInfo>
    <vehicle>
        <vehicleId>1</vehicleId>
        <name>Limousine</name>
    </vehicle>
    <vehicle>
        <vehicleId>2</vehicleId>
        <name>Aston Martin</name>
    </vehicle>
    <vehicle>
        <vehicleId>3</vehicleId>
        <name>Bus</name>
    </vehicle>

    <driver>
        <driverId>1</driverId>
        <name>John Doe</name>
        <vehicleId>1</vehicleId>
        <vehicleId>2</vehicleId>
    </driver>
    <driver>
        <driverId>2</driverId>
        <name>Joe Blocks</name>
        <vehicleId>3</vehicleId>
    </driver>
</driverVehicleInfo>

I want to parse this XML structure into an object structure of Driver objects linked to Vehicle objects. Here are the definitions of these two classes. Each class goes in its own file, in the same package as the handler shown later (xml.sax). I have skipped the getters and setters to shorten the code. The toString() methods are not necessary; they are only there to make it easier to print the results in the example.

import java.util.ArrayList;
import java.util.List;

public class Driver {
    public String driverId = null;
    public String name     = null;

    public List<Vehicle> vehicles = new ArrayList<Vehicle>();

    public String toString() {
        return this.driverId + " : " +
               this.name     + " : " +
               this.vehicles;
    }
}

public class Vehicle {
    public String vehicleId = null;
    public String name      = null;

    public String toString() {
        return  this.vehicleId + " : " +
                this.name;
    }
}

Here is the DefaultHandler subclass that does the parsing. Notice how it uses two stacks to keep track of the XML elements parsed, and of the Driver and Vehicle objects created.

Notice also that if the <vehicle> elements had been listed last in the XML file instead of first, this handler would not have worked. The reason is that when the <driver> elements are processed, they expect the vehicles Map to be filled in already. If the <vehicle> elements were listed after the <driver> elements, the vehicles Map would be empty when the <driver> elements were processed, and no vehicles would be linked to the drivers.

I will leave it as an exercise for you to decipher what is going on in there. The mere fact that it is not entirely easy to understand is a message in itself. SAX may be fast, but the code you write using it does not always look too elegant.

package xml.sax;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import java.util.*;

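/**
 * Builds Driver and Vehicle objects from SAX events. One stack tracks the
 * names of the open XML elements, the other the objects under construction.
 */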
public class SaxHandler extends DefaultHandler {

    public List<Driver>         drivers  = new ArrayList<Driver>();
    public Map<String, Vehicle> vehicles = new HashMap<String, Vehicle>();

    private Stack<String> elementStack = new Stack<String>();
    private Stack<Object> objectStack  = new Stack<Object>();


    public void startElement(String uri, String localName,
        String qName, Attributes attributes) throws SAXException {

        this.elementStack.push(qName);

        if("driver".equals(qName)){
            Driver driver = new Driver();
            this.objectStack.push(driver);
            this.drivers.add(driver);
        } else if("vehicle".equals(qName)){
            this.objectStack.push(new Vehicle());
        }
    }

    public void endElement(String uri, String localName,
        String qName) throws SAXException {

        this.elementStack.pop();

        if("vehicle".equals(qName) || "driver".equals(qName)){
            Object object = this.objectStack.pop();
            if("vehicle".equals(qName)){
                Vehicle vehicle = (Vehicle) object;
                this.vehicles.put(vehicle.vehicleId, vehicle);
            }
        }
    }

    public void characters(char[] ch, int start, int length)
        throws SAXException {

        String value = new String(ch, start, length).trim();
        if(value.length() == 0) return; // ignore white space

        if("driverId".equals(currentElement())){
            Driver driver = (Driver) this.objectStack.peek();
            driver.driverId = (driver.driverId != null ?
                               driver.driverId  : "") + value;
        } else if("name"  .equals(currentElement()) &&
                  "driver".equals(currentElementParent())){

            Driver driver = (Driver) this.objectStack.peek();
            driver.name = (driver.name != null ?
                                driver.name  : "") + value;

        } else if("vehicleId".equals(currentElement()) &&
                  "driver"   .equals(currentElementParent())){

            Driver driver = (Driver) this.objectStack.peek();
            Vehicle vehicle = this.vehicles.get(value);
            if(vehicle != null) driver.vehicles.add(vehicle);

        } else if("vehicleId".equals(currentElement()) &&
                  "vehicle"  .equals(currentElementParent())){

            Vehicle vehicle   = (Vehicle) this.objectStack.peek();
            vehicle.vehicleId = (vehicle.vehicleId != null ?
                                    vehicle.vehicleId  : "")  + value;

        } else if("name"   .equals(currentElement()) &&
                  "vehicle".equals(currentElementParent())){

            Vehicle vehicle = (Vehicle) this.objectStack.peek();
            vehicle.name = (vehicle.name != null ? vehicle.name  : "")
                               + value;
        }
    }

    private String currentElement() {
        return this.elementStack.peek();
    }

    private String currentElementParent() {
        if(this.elementStack.size() < 2) return null;
        return this.elementStack.get(this.elementStack.size()-2);
    }

}
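
One detail worth knowing: a SAX parser is allowed to deliver the text of a single element in more than one characters() call. The handler above copes with that for the id and name fields by appending each chunk, but the vehicleId lookup inside <driver> uses whichever chunk happens to arrive. A common way around this is to collect the text in a buffer and only use it in endElement(). Here is a minimal sketch of that idea. The class name BufferedTextHandler, its values list and the parse() helper are my own names for illustration, not part of the example above.

```java
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.SAXParserFactory;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: records "elementName=text" for every element
// that contains non-blank text directly before its end tag.
class BufferedTextHandler extends DefaultHandler {

    final List<String> values = new ArrayList<>();

    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName,
        String qName, Attributes attributes) {
        text.setLength(0); // start with an empty buffer for each element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // may be called several times for one element, so only append here
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // the complete text of the element is available at this point
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.add(qName + "=" + value);
        }
        text.setLength(0);
    }

    // Hypothetical helper: parses an XML string and returns the recorded values.
    static List<String> parse(String xml) throws Exception {
        BufferedTextHandler handler = new BufferedTextHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }
}
```

For the input <vehicle><vehicleId>1</vehicleId><name>Aston Martin</name></vehicle>, parse() returns the list [vehicleId=1, name=Aston Martin].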

Here is the code that runs the example:

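package xml.sax; // same package as SaxHandler

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.FileInputStream;
import java.io.InputStream;

public class SaxParserExample {

    public static void main(String[] args) {
        SAXParserFactory factory = SAXParserFactory.newInstance();

        // try-with-resources closes the file even if parsing fails
        try (InputStream xmlInput =
                 new FileInputStream("data/sax-example.xml")) {

            SAXParser  saxParser = factory.newSAXParser();
            SaxHandler handler   = new SaxHandler();
            saxParser.parse(xmlInput, handler);

            for (Driver driver : handler.drivers) {
                System.out.println(driver);
            }
        } catch (Exception err) {
            err.printStackTrace();
        }
    }
}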

And here is the output produced using the given XML file, handler and runner example:

1 : John Doe : [1 : Limousine, 2 : Aston Martin]
2 : Joe Blocks : [3 : Bus]
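
As mentioned earlier, the handler only works when the <vehicle> elements come before the <driver> elements. One possible way to lift that restriction is to store only the vehicle ids while parsing, and resolve them in endDocument(), when all vehicles are known. The sketch below shows the idea under simplifying assumptions: it uses plain maps of strings instead of the Driver and Vehicle classes, and the names DeferredLinkHandler, vehiclesByDriver and parse() are made up for this illustration.

```java
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.SAXParserFactory;
import java.io.StringReader;
import java.util.*;

// Illustrative sketch only: links drivers to vehicle names after parsing,
// so the order of <vehicle> and <driver> elements does not matter.
class DeferredLinkHandler extends DefaultHandler {

    // result: driverId -> names of the driver's vehicles
    final Map<String, List<String>> vehiclesByDriver = new LinkedHashMap<>();

    private final Map<String, String>       vehicleNames = new HashMap<>();
    private final Map<String, List<String>> pendingIds   = new LinkedHashMap<>();
    private final Deque<String>             path         = new ArrayDeque<>();
    private final StringBuilder             text         = new StringBuilder();

    private String vehicleId, vehicleName, driverId;
    private List<String> driverVehicleIds;

    @Override
    public void startElement(String uri, String localName,
        String qName, Attributes attributes) {
        path.push(qName);
        text.setLength(0);
        if ("driver".equals(qName)) driverVehicleIds = new ArrayList<>();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        path.pop();
        String parent = path.peek(); // null once the root element is closed
        String value  = text.toString().trim();
        text.setLength(0);

        if ("vehicle".equals(parent)) {
            if ("vehicleId".equals(qName)) vehicleId   = value;
            else if ("name".equals(qName)) vehicleName = value;
        } else if ("driver".equals(parent)) {
            if ("driverId".equals(qName))       driverId = value;
            else if ("vehicleId".equals(qName)) driverVehicleIds.add(value);
        } else if ("vehicle".equals(qName)) {
            vehicleNames.put(vehicleId, vehicleName);
        } else if ("driver".equals(qName)) {
            pendingIds.put(driverId, driverVehicleIds); // resolved later
        }
    }

    @Override
    public void endDocument() {
        // every <vehicle> has been seen by now, so the ids can be resolved
        for (Map.Entry<String, List<String>> entry : pendingIds.entrySet()) {
            List<String> names = new ArrayList<>();
            for (String id : entry.getValue()) {
                String name = vehicleNames.get(id);
                if (name != null) names.add(name);
            }
            vehiclesByDriver.put(entry.getKey(), names);
        }
    }

    // Hypothetical helper: parses an XML string and returns the linked result.
    static Map<String, List<String>> parse(String xml) throws Exception {
        DeferredLinkHandler handler = new DeferredLinkHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.vehiclesByDriver;
    }
}
```

The price of this approach is that the pending ids have to be kept in memory until the end of the document, which gives up a bit of the streaming advantage SAX has.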
