StackStalk

Friday, January 3, 2014

SAX Parser to read XML file in Java


SAX Parser Introduction

Simple API for XML (SAX) is an event-driven, serial-access mechanism for reading XML documents. Because the parser streams through the document and does not build an in-memory tree, SAX is typically faster and uses less memory than tree-based approaches such as DOM. An application that uses SAX provides an instance of a handler class to the parser. When the parser detects an XML construct, it calls the corresponding method of the handler, passing it information about the construct that was detected. The most commonly used handler interfaces are ContentHandler, whose methods are called when XML constructs are recognized, and ErrorHandler, whose methods are called when an error occurs. (DocumentHandler is the older SAX1 interface and is deprecated in favor of ContentHandler.) The DefaultHandler class provides default implementations of both interfaces.
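To make the event-driven model concrete, here is a minimal sketch that records the order in which the parser invokes the handler callbacks. The class name SaxEventTrace and the trace method are hypothetical names chosen for illustration, and the XML is parsed from an in-memory string rather than a file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original article
public class SaxEventTrace {

    // Parses the XML text and returns the callbacks in the order the parser made them
    static List<String> trace(String xml) {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attrs) {
                events.add("startElement " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement " + qName);
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : trace("<CD><TITLE>Empire Burlesque</TITLE></CD>")) {
            System.out.println(event);
        }
    }
}
```

The callbacks arrive in document order: each element's start event comes before the events of its children, and its end event comes after them. The application never asks the parser for data; it only reacts to what the parser reports.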

Simple Java program implementation of SAX parser

This is the input data file. We are interested in parsing the TITLE and YEAR elements from this XML file.
<CATALOG>
 <CD>
  <TITLE>Empire Burlesque</TITLE>
  <ARTIST>Bob Dylan</ARTIST>
  <COUNTRY>USA</COUNTRY>
  <COMPANY>Columbia</COMPANY>
  <PRICE>10.90</PRICE>
  <YEAR>1985</YEAR>
 </CD>
 <CD>
  <TITLE>Hide your heart</TITLE>
  <ARTIST>Bonnie Tyler</ARTIST>
  <COUNTRY>UK</COUNTRY>
  <COMPANY>CBS Records</COMPANY>
  <PRICE>9.90</PRICE>
  <YEAR>1988</YEAR>
 </CD>
 <CD>
  <TITLE>Greatest Hits</TITLE>
  <ARTIST>Dolly Parton</ARTIST>
  <COUNTRY>USA</COUNTRY>
  <COMPANY>RCA</COMPANY>
  <PRICE>9.90</PRICE>
  <YEAR>1982</YEAR>
 </CD>
 <CD>
  <TITLE>Still got the blues</TITLE>
  <ARTIST>Gary Moore</ARTIST>
  <COUNTRY>UK</COUNTRY>
  <COMPANY>Virgin records</COMPANY>
  <PRICE>10.20</PRICE>
  <YEAR>1990</YEAR>
 </CD>
</CATALOG>
This is the main application program. Here we set up the SAX parser object and create the handler object. Then we read the input XML file and initiate the parsing.
package com.sourcetricks.MySaxParser;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;

public class MySaxParser {

 public static void main(String[] args) {

  // Open the XML file; try-with-resources closes the stream when parsing ends
  try (InputStream inputStream = new FileInputStream(new File("resources/input-data.xml"))) {
   // Set up the parser
   SAXParserFactory parserFactory = SAXParserFactory.newInstance();
   SAXParser parser = parserFactory.newSAXParser();
   
   // Set up the handler
   MyEchoHandler handler = new MyEchoHandler();
   
   // Parse the XML file using the handler
   parser.parse(inputStream, handler);
  } catch (ParserConfigurationException | SAXException | IOException e) {
   e.printStackTrace();
  }
 }
}
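The program above only prints a stack trace if parsing fails. As a rough sketch of how the ErrorHandler side of DefaultHandler could be used to report where a problem occurred, the example below overrides fatalError and records the line number. The names SaxErrorReport, ReportingHandler and firstError are hypothetical and chosen for illustration; the input is an in-memory string for brevity.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original article
public class SaxErrorReport {

    // Remembers the first fatal error the parser reports
    static class ReportingHandler extends DefaultHandler {
        String firstError = null;

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            if (firstError == null) {
                firstError = "line " + e.getLineNumber() + ": " + e.getMessage();
            }
            throw e; // rethrow so that parsing stops, as DefaultHandler does
        }
    }

    // Returns null for well-formed input, otherwise a description of the first fatal error
    static String firstError(String xml) {
        ReportingHandler handler = new ReportingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Already recorded by fatalError(); nothing more to do in this sketch
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.firstError;
    }

    public static void main(String[] args) {
        // The CD element is never closed, so the document is not well-formed
        System.out.println(firstError("<CATALOG><CD></CATALOG>"));
    }
}
```

The exact wording of the message depends on the parser implementation, so the sketch only relies on the line number and message accessors of SAXParseException.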
This is our handler class, which extends DefaultHandler. Here we override the document events, element events and character events to read the XML element values.
package com.sourcetricks.MySaxParser;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MyEchoHandler extends DefaultHandler {

 private StringBuffer textBuffer = null;
 private boolean elementOfInterest = false;
 
 // Document events
 public void startDocument() throws SAXException { 
  System.out.println("startDocument");
 }

 public void endDocument() throws SAXException {
  System.out.println("endDocument");
 }

 // Element events
 public void startElement(String namespaceURI,
   String sName, String qName,
   Attributes attrs) throws SAXException { 
  if ( qName.equalsIgnoreCase("TITLE") || qName.equalsIgnoreCase("YEAR") ) {
   elementOfInterest = true;
  }
  else {
   elementOfInterest = false;
  }
 }

 public void endElement(String namespaceURI,
   String sName, String qName ) throws SAXException {
  if ( textBuffer == null )
   return;
  
  System.out.println(qName + " = " + textBuffer);
  textBuffer = null;
  elementOfInterest = false;
 }

 // Character events
 public void characters(char buf[], int offset, 
   int len) throws SAXException {
  if ( ! elementOfInterest )
   return;
  
  // Accumulate the characters delivered by the parser in the buffer.
  // The parser may deliver the text of one element in several chunks.
  if ( textBuffer == null ) {
   textBuffer = new StringBuffer();
  }
  textBuffer.append(buf, offset, len);
 }
}
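The handler above prints each value as soon as it is seen. In practice you often want to collect the values for later use. The following is a minimal sketch of that variation, assuming the same CD/TITLE/YEAR element names as the input file; the class name CdCollector and the collect method are hypothetical, and the XML is again parsed from a string to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, not part of the original article
public class CdCollector extends DefaultHandler {

    private final List<String> entries = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private String title;
    private String year;

    @Override
    public void startElement(String uri, String localName,
            String qName, Attributes attrs) {
        // Start with an empty buffer for every element
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for the text of a single element
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("TITLE")) {
            title = text.toString().trim();
        } else if (qName.equals("YEAR")) {
            year = text.toString().trim();
        } else if (qName.equals("CD")) {
            // One CD is complete: store it and reset the per-CD state
            entries.add(title + " (" + year + ")");
            title = null;
            year = null;
        }
    }

    // Parses the XML text and returns one "title (year)" entry per CD element
    static List<String> collect(String xml) {
        CdCollector handler = new CdCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.entries;
    }

    public static void main(String[] args) {
        String xml = "<CATALOG><CD><TITLE>Empire Burlesque</TITLE>"
                + "<ARTIST>Bob Dylan</ARTIST><YEAR>1985</YEAR></CD></CATALOG>";
        for (String entry : collect(xml)) {
            System.out.println(entry);
        }
    }
}
```

This sketch compares element names with equals rather than equalsIgnoreCase because XML names are case-sensitive. It also uses StringBuilder instead of StringBuffer, since the handler is only used from a single thread.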
This is the output.
startDocument
TITLE = Empire Burlesque
YEAR = 1985
TITLE = Hide your heart
YEAR = 1988
TITLE = Greatest Hits
YEAR = 1982
TITLE = Still got the blues
YEAR = 1990
endDocument