StackStalk
Saturday, January 18, 2014

DOM Parser to read XML file in Java

In this article, we explore the DOM parser approach to reading an XML file: we load the file, parse it, and iterate over the resulting tree to extract the required content.

DOM Parser Introduction

The Document Object Model (DOM) API for XML is more memory intensive than the SAX parser. Refer to SAX Parser for an example implementation of the SAX parser.

If the XML content is large, the SAX parser approach is recommended. In the DOM parsing approach, we load the entire contents of an XML file into an in-memory tree structure and then iterate through the tree to read the content. The DOM parser is typically advantageous when we need to modify the XML document.
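To illustrate why the in-memory tree helps with modification, here is a minimal sketch, not taken from the original program. It assumes the XML arrives as a string; the class name DomUpdateSketch, the method updateFirstPrice, and the discounted attribute are hypothetical names chosen for this illustration. It parses the document, changes one element and one attribute in place, and serializes the tree back to text with the standard javax.xml.transform API.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this illustration
public class DomUpdateSketch {

    // Parses the XML, sets the PRICE of the first CD, and returns the serialized result.
    // Checked parser exceptions are wrapped to keep the sketch short.
    public static String updateFirstPrice(String xml, String newPrice) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // The whole tree is in memory, so any node can be changed in place
            Element firstCd = (Element) doc.getElementsByTagName("CD").item(0);
            firstCd.getElementsByTagName("PRICE").item(0).setTextContent(newPrice);
            firstCd.setAttribute("discounted", "true"); // hypothetical attribute

            // Serialize the modified tree back to text
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not update XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<CATALOG><CD id=\"1\"><TITLE>Empire Burlesque</TITLE>"
                + "<PRICE>10.90</PRICE></CD></CATALOG>";
        System.out.println(updateFirstPrice(xml, "8.50"));
    }
}
```

With a streaming parser such as SAX, the same change would mean re-emitting the whole document while reading it, because no tree is kept to edit.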

A sample implementation of a DOM parser is listed below. Here we read the XML file and create a Document object in memory. We then iterate through the tree and extract the required elements and attributes. It is common practice to use a POJO to store the contents for application use.

Simple Java program implementation of DOM parser

This is the input XML file we want to parse. We need the id attribute of each CD element and its TITLE and ARTIST child elements.
<CATALOG>
 <CD id="1">
  <TITLE>Empire Burlesque</TITLE>
  <ARTIST>Bob Dylan</ARTIST>
  <COUNTRY>USA</COUNTRY>
  <COMPANY>Columbia</COMPANY>
  <PRICE>10.90</PRICE>
  <YEAR>1985</YEAR>
 </CD>
 <CD id="2">
  <TITLE>Hide your heart</TITLE>
  <ARTIST>Bonnie Tyler</ARTIST>
  <COUNTRY>UK</COUNTRY>
  <COMPANY>CBS Records</COMPANY>
  <PRICE>9.90</PRICE>
  <YEAR>1988</YEAR>
 </CD>
 <CD id="3">
  <TITLE>Greatest Hits</TITLE>
  <ARTIST>Dolly Parton</ARTIST>
  <COUNTRY>USA</COUNTRY>
  <COMPANY>RCA</COMPANY>
  <PRICE>9.90</PRICE>
  <YEAR>1982</YEAR>
 </CD>
 <CD id="4">
  <TITLE>Still got the blues</TITLE>
  <ARTIST>Gary Moore</ARTIST>
  <COUNTRY>UK</COUNTRY>
  <COMPANY>Virgin records</COMPANY>
  <PRICE>10.20</PRICE>
  <YEAR>1990</YEAR>
 </CD>
</CATALOG>
We create a POJO to hold the required data for application use.
package com.sourcetricks.MyDomParser;

public class CD {
 private String id;
 private String title;
 private String artist;
 
 public String getId() {
  return id;
 }
 public void setId(String id) {
  this.id = id;
 }
 public String getTitle() {
  return title;
 }
 public void setTitle(String title) {
  this.title = title;
 }
 public String getArtist() {
  return artist;
 }
 public void setArtist(String artist) {
  this.artist = artist;
 }
 public void print() {
  System.out.println("ID = " + id);
  System.out.println("Title = " + title);
  System.out.println("Artist = " + artist);
 }
}
This is the main application program. It reads the XML file, parses it, and iterates over the tree to extract the required content. Finally, we print the POJOs.
package com.sourcetricks.MyDomParser;

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;

public class MyDomParser {
 public static void main(String[] args) {

  List<CD> cdList = new ArrayList<CD>();
  
  try {
   // Setup the parser
   DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
   DocumentBuilder builder = builderFactory.newDocumentBuilder();
   
   // Read the XML file
   File inputFile = new File("resources/input-data.xml");
   
   // Parse the XML file; try-with-resources closes the stream afterwards
   Document doc;
   try (InputStream inputStream = new FileInputStream(inputFile)) {
    doc = builder.parse(inputStream);
   }
   
   // Get all CD elements
   NodeList cdElements = doc.getElementsByTagName("CD");
   for ( int i = 0; i < cdElements.getLength(); i++ ) {
    Node currentNode = cdElements.item(i);
    
    // Seen the CD tag
    if ( currentNode instanceof Element ) {
     
     // Store in a pojo
     CD cd = new CD();
     
     // Read attribute of CD element
     cd.setId(((Element) currentNode).getAttribute("id"));
     
     // Child elements under CD
     NodeList childNodes = currentNode.getChildNodes();
     for ( int j = 0; j < childNodes.getLength(); j++ ) {
      Node childNode = childNodes.item(j);
      
      if ( childNode instanceof Element ) {
       if ( childNode.getNodeName().equalsIgnoreCase("title") ) {
        cd.setTitle(childNode.getTextContent());
       }
       else if ( childNode.getNodeName().equalsIgnoreCase("artist") ) {
        cd.setArtist(childNode.getTextContent());
       }
       // Include other elements as needed
      }
     }
     
     cdList.add(cd);
    }
   }
  } catch (Exception e) {
   e.printStackTrace();
  } 
  
  // Print contents of CD list
  for ( CD c : cdList ) {
   c.print();
  }
 }
}
This is the output.
ID = 1
Title = Empire Burlesque
Artist = Bob Dylan
ID = 2
Title = Hide your heart
Artist = Bonnie Tyler
ID = 3
Title = Greatest Hits
Artist = Dolly Parton
ID = 4
Title = Still got the blues
Artist = Gary Moore
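As a possible variation on the child-node loop above, the sketch below looks up each required child with Element.getElementsByTagName instead of iterating over every child node. It is an illustration under stated assumptions, not the program from this article: the class name CdLookupSketch and the helpers textOf and summarize are hypothetical, and the XML is parsed from a string so the example runs without an input file. Note that getElementsByTagName matches tag names case-sensitively and searches all descendants, not only direct children.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this illustration
public class CdLookupSketch {

    // Returns the text of the first descendant with the given tag, or null if absent
    public static String textOf(Element parent, String tagName) {
        NodeList matches = parent.getElementsByTagName(tagName);
        return matches.getLength() > 0 ? matches.item(0).getTextContent() : null;
    }

    // Builds one "id: title by artist" summary per CD element.
    // Checked parser exceptions are wrapped to keep the sketch short.
    public static List<String> summarize(String xml) {
        List<String> summaries = new ArrayList<>();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // getElementsByTagName returns only element nodes, so the cast is safe
            NodeList cds = doc.getElementsByTagName("CD");
            for (int i = 0; i < cds.getLength(); i++) {
                Element cd = (Element) cds.item(i);
                summaries.add(cd.getAttribute("id") + ": "
                        + textOf(cd, "TITLE") + " by " + textOf(cd, "ARTIST"));
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return summaries;
    }

    public static void main(String[] args) {
        String xml = "<CATALOG><CD id=\"1\"><TITLE>Empire Burlesque</TITLE>"
                + "<ARTIST>Bob Dylan</ARTIST></CD></CATALOG>";
        for (String line : summarize(xml)) {
            System.out.println(line); // prints "1: Empire Burlesque by Bob Dylan"
        }
    }
}
```

The explicit child loop in the main program remains the better fit when many child elements are needed, since it walks each CD's children only once.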