StackStalk
Friday, January 3, 2014

SAX Parser to read XML file in Java


SAX Parser Introduction

Simple API for XML (SAX) is an event-driven, serial-access mechanism for reading XML documents. Because the parser streams through the document instead of building an in-memory tree, SAX is generally faster and uses less memory than tree-based APIs such as DOM. An application that uses SAX provides an instance of a handler class to the parser. When the parser detects an XML construct, it calls the corresponding method of the handler, passing information about the construct that was detected. The most commonly used handler interfaces are ContentHandler, whose methods are called when XML constructs are recognized, and ErrorHandler, whose methods are called when an error occurs. The DefaultHandler class used in the program below provides default implementations of both.
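
To make the callback model concrete, here is a minimal sketch that records each event the parser reports for a small in-memory document. The class name SaxEventTrace and its trace method are hypothetical names chosen for this illustration; the JAXP and SAX calls themselves are standard. It assumes the text of a small element arrives in a single characters call, which the API does not guarantee.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of the program in this post.
class SaxEventTrace {

    // Parses the XML text and returns the callbacks in the order the parser made them.
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            // ContentHandler callbacks: called as XML constructs are recognized
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attrs) {
                events.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end " + qName);
            }

            @Override
            public void characters(char[] buf, int offset, int len) {
                // Ignore whitespace-only text between elements
                String text = new String(buf, offset, len).trim();
                if (!text.isEmpty()) {
                    events.add("text " + text);
                }
            }

            // ErrorHandler callback: called when the document is not well-formed
            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                events.add("fatalError at line " + e.getLineNumber());
                throw e;
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<CD><TITLE>Empire Burlesque</TITLE></CD>")) {
            System.out.println(event);
        }
    }
}
```

Running main prints the start, text and end events in document order. Passing a malformed document such as "<CD>" should make the parser call fatalError, and the exception then propagates out of parse.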

Simple Java program implementation of SAX parser

This is the input data file. We are interested in parsing the TITLE and YEAR elements from this XML file.
<CATALOG>
 <CD>
  <TITLE>Empire Burlesque</TITLE>
  <ARTIST>Bob Dylan</ARTIST>
  <COUNTRY>USA</COUNTRY>
  <COMPANY>Columbia</COMPANY>
  <PRICE>10.90</PRICE>
  <YEAR>1985</YEAR>
 </CD>
 <CD>
  <TITLE>Hide your heart</TITLE>
  <ARTIST>Bonnie Tyler</ARTIST>
  <COUNTRY>UK</COUNTRY>
  <COMPANY>CBS Records</COMPANY>
  <PRICE>9.90</PRICE>
  <YEAR>1988</YEAR>
 </CD>
 <CD>
  <TITLE>Greatest Hits</TITLE>
  <ARTIST>Dolly Parton</ARTIST>
  <COUNTRY>USA</COUNTRY>
  <COMPANY>RCA</COMPANY>
  <PRICE>9.90</PRICE>
  <YEAR>1982</YEAR>
 </CD>
 <CD>
  <TITLE>Still got the blues</TITLE>
  <ARTIST>Gary Moore</ARTIST>
  <COUNTRY>UK</COUNTRY>
  <COMPANY>Virgin records</COMPANY>
  <PRICE>10.20</PRICE>
  <YEAR>1990</YEAR>
 </CD>
</CATALOG>
This is the main application program. Here we set up the SAX parser object and create the handler object. Then we open the input XML file and initiate the parsing.
package com.sourcetricks.MySaxParser;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;

public class MySaxParser {

 public static void main(String[] args) {

  try {
   // Setup the parser
   SAXParserFactory parserFactory = SAXParserFactory.newInstance();
   SAXParser parser = parserFactory.newSAXParser();
   
   // Setup the handler
   MyEchoHandler handler = new MyEchoHandler();
   
   // Read the XML file; try-with-resources closes the stream after parsing
   File inputFile = new File("resources/input-data.xml");
   try (InputStream inputStream = new FileInputStream(inputFile)) {
    // Parse the XML file using the handler
    parser.parse(inputStream, handler);
   }
  } catch (ParserConfigurationException | SAXException | IOException e) {
   e.printStackTrace();
  } 
 }
}
This is the handler class, which extends DefaultHandler. Here we override the document, element and character event methods to read the XML element values.
package com.sourcetricks.MySaxParser;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MyEchoHandler extends DefaultHandler {

 // Text collected for the current TITLE or YEAR element; null otherwise
 private StringBuilder textBuffer = null;
 private boolean elementOfInterest = false;
 
 // Document events
 @Override
 public void startDocument() throws SAXException { 
  System.out.println("startDocument");
 }

 @Override
 public void endDocument() throws SAXException {
  System.out.println("endDocument");
 }

 // Element events
 @Override
 public void startElement(String namespaceURI,
   String sName, String qName,
   Attributes attrs) throws SAXException { 
  // Only the text of TITLE and YEAR elements is collected
  elementOfInterest = qName.equalsIgnoreCase("TITLE") || qName.equalsIgnoreCase("YEAR");
 }

 @Override
 public void endElement(String namespaceURI,
   String sName, String qName ) throws SAXException {
  // Nothing was collected, so this is not an element of interest
  if ( textBuffer == null )
   return;
  
  System.out.println(qName + " = " + textBuffer);
  textBuffer = null;
  elementOfInterest = false;
 }

 // Character events
 @Override
 public void characters(char buf[], int offset, 
   int len) throws SAXException {
  if ( ! elementOfInterest )
   return;
  
  // The parser may deliver one element's text in several chunks, so append
  if ( textBuffer == null ) {
   textBuffer = new StringBuilder();
  }
  textBuffer.append(buf, offset, len);
 }
}
This is the output.
startDocument
TITLE = Empire Burlesque
YEAR = 1985
TITLE = Hide your heart
YEAR = 1988
TITLE = Greatest Hits
YEAR = 1982
TITLE = Still got the blues
YEAR = 1990
endDocument
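
The handler above prints each value as soon as its element ends. A common variation is to collect the values into objects instead. The following is a minimal sketch of that idea, assuming the same CATALOG/CD layout as the input file; the CdCollector class, its nested Cd class and the parse helper are hypothetical names introduced for this illustration, and it matches element names case-sensitively.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration: builds one Cd object per CD element.
class CdCollector extends DefaultHandler {

    // Hypothetical value class holding the two fields of interest
    static final class Cd {
        String title;
        String year;
    }

    private final List<Cd> cds = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Cd current;

    List<Cd> getCds() {
        return cds;
    }

    @Override
    public void startElement(String uri, String localName,
            String qName, Attributes attrs) {
        if (qName.equals("CD")) {
            current = new Cd();
        }
        // Start collecting text afresh for every element
        text.setLength(0);
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        // Text may arrive in several chunks, so always append
        text.append(buf, offset, len);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return;
        }
        switch (qName) {
            case "TITLE":
                current.title = text.toString().trim();
                break;
            case "YEAR":
                current.year = text.toString().trim();
                break;
            case "CD":
                // The CD element is complete, so store it
                cds.add(current);
                current = null;
                break;
            default:
                break;
        }
    }

    // Convenience method for the sketch: parses XML text held in a string
    static List<Cd> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        CdCollector handler = new CdCollector();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.getCds();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<CATALOG><CD><TITLE>Empire Burlesque</TITLE>"
                + "<ARTIST>Bob Dylan</ARTIST><YEAR>1985</YEAR></CD></CATALOG>";
        for (Cd cd : parse(xml)) {
            System.out.println(cd.title + " (" + cd.year + ")");
        }
    }
}
```

Keeping the collected objects in the handler and exposing them through a getter lets the calling code decide what to do with the data after parsing finishes.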

Copyright © StackStalk