.NET Issues: Information from XML without Parsing in Java

Introduction

To gather information from an XML file, you normally parse it with DOM, SAX, StAX or a similar API. In some cases, though, you can get the information you need with plain string processing, and sometimes that performs well too. This idea cannot be generalized to every kind of XML document, and each parsing methodology has its own pros and cons. What matters is getting the required information to the system quickly and smoothly, so developers sometimes adopt a different approach to get the best result. In this short post I describe a small trick for gathering information from an XML file without conventional parsing in one particular scenario. I also compare the performance of both approaches (XML parsing and text processing).

Technicalities

I was working on a project where a destination system used to provide its result as an XML file, and the source system used to display that exact result to the end system. This is a very specific scenario in which a custom technique can give the best result. To make this concrete, consider a typical situation: you call an external system connected to a device such as a POS (Point of Sale) terminal, which only reports information about your transaction. The transaction may succeed or fail. Your system is concerned only with the core content of the XML, and it performs many transactions per second. The conventional way is to parse the XML file and display the data enclosed in each tag. Whether to use DOM or SAX for that is debatable and depends on the situation. In this particular scenario the XML document is very small, and the source system displays only the data, without any pre- or post-processing. I am not saying that XML parsing is wrong here, but if you can process the text of the XML file smartly, it can help considerably. Consider the following source XML files and the resulting output.

Successful Transaction

<StatusMsg>Transaction Successful</StatusMsg>

</Status>

</Transaction>

Failed Transaction

<StatusMsg>Transaction Failed</StatusMsg>

<Reason>UnExpected Error in reading Card</Reason>

</Status>

</Transaction>

Now the source system displays the following information.

001
Transaction Successful
09:09:2013
14:54:53

The above situation is hypothetical. The objective is to show how you can change your business algorithm to boost the performance of your application. It is not always necessary to follow the conventional approach to achieve good performance.

In the above case, the text displayed to the system is very small. You can certainly use a conventional parsing technique to show the information. But if you have a pile of XML documents of this structure and you gather the information by XML parsing, there may be a slight performance penalty. To avoid it, you can apply plain text processing with a simple regular expression. Both approaches are described below.

XML Parsing Technique

The following steps are required to get the information from the XML file.

Get the XML contents as a String
Parse the XML contents using DOM parsing
Visit each element or tag of the XML doc and extract the contents
Display the whole contents to the system

The brief code snippet for XML parsing is given below.

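import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public static void processTxn(String contents) {
    DocumentBuilderFactory docBuilderFact = DocumentBuilderFactory.newInstance();
    Document doc;
    try {
        // Build a DOM tree from the XML held in the string
        DocumentBuilder docBuilder = docBuilderFact.newDocumentBuilder();
        doc = docBuilder.parse(new InputSource(new StringReader(contents)));
    } catch (ParserConfigurationException | SAXException | IOException e) {
        // Without a parsed document there is nothing to display
        e.printStackTrace();
        return;
    }
    System.out.println("---------------Message From System-------------");
    // recursiveParsing (not shown here) walks the tree starting at the root element
    recursiveParsing(doc.getDocumentElement());
}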
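The recursiveParsing helper called above is not shown in this post. A minimal sketch of what such a helper might do, assuming it simply visits every element and collects the text it finds, is given below. The class name DomTextWalker, the method names and the sample XML are all hypothetical and only illustrate the recursive visit described in the steps.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTextWalker {

    // Hypothetical stand-in for recursiveParsing: appends the trimmed,
    // non-empty content of every text node, in document order, one per line.
    static void collectText(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                out.append(text).append('\n');
            }
            return;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collectText(children.item(i), out);
        }
    }

    // Parses the XML string with DOM and returns the collected text.
    static String textOf(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            collectText(doc.getDocumentElement(), out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample document invented for illustration; the element names are assumptions.
        String xml = "<Transaction><Status><StatusMsg>Transaction Successful</StatusMsg></Status></Transaction>";
        System.out.print(textOf(xml));
    }
}
```

Collecting into a StringBuilder instead of printing directly makes the helper easier to test, but printing inside the recursion would work just as well for the scenario described here.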

Text Processing Technique

The following steps are required to get the information from the XML file.

Get the XML contents as a String
Remove all the tags (<Tag>)
Display the whole contents to system

The brief code snippet for removal of XML tags is given below.

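import java.util.regex.Pattern;

// Compiled once and reused; matches any tag such as <Tag> or </Tag>
private static final Pattern TAG = Pattern.compile("<.*?>");

public static String removeXmlTags(String contents) {
    // replaceAll already scans the whole input, so no find() loop is needed
    return TAG.matcher(contents).replaceAll("");
}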
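One thing to watch with plain tag removal: if the XML arrives on a single line, the values run together once the tags are gone. A possible variation, assuming each value should end up on its own line as in the display shown earlier, is to split on the tags instead of deleting them. The class name TagStripDemo, the pattern "<[^>]*>" and the sample XML are illustrative choices, not part of the original code. Like the original approach, this sketch leaves entity references such as &amp;amp; undecoded and does not handle comments or CDATA sections.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class TagStripDemo {

    // Matches one tag: '<', then anything except '>', then '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Splits the XML text on its tags and keeps the non-blank pieces.
    static List<String> textLines(String xml) {
        List<String> lines = new ArrayList<>();
        for (String piece : TAG.split(xml)) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample document invented for illustration; the element names are assumptions.
        String xml = "<Transaction><Id>001</Id><StatusMsg>Transaction Successful</StatusMsg></Transaction>";
        for (String line : textLines(xml)) {
            System.out.println(line);
        }
    }
}
```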

Comparison

To compare the time taken by the two approaches, let us run an experiment with 10 observations. The result is shown below.

********** Time Taken in TEXT Processing *************

NANOSECONDS MILLISECONDS SECONDS

5919837 5.919837 0.005920

********** Time Taken in XML Processing *************

NANOSECONDS MILLISECONDS SECONDS

35151999 35.151999 0.035152
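The post does not show how these times were measured. A rough sketch of one way to produce rows in the same three-column form, assuming System.nanoTime() around the call being measured, is given below. The class name TimingSketch, its methods and the sample task are hypothetical. Figures obtained this way include JIT warm-up and similar noise, so they will vary from run to run and from machine to machine.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class TimingSketch {

    // Runs the task the given number of times and returns the elapsed nanoseconds.
    static long timeNanos(Runnable task, int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    // Formats one result row as: nanoseconds, milliseconds, seconds.
    static String formatRow(long nanos) {
        return String.format(Locale.ROOT, "%d %.6f %.6f",
                nanos, nanos / 1_000_000.0, nanos / 1_000_000_000.0);
    }

    public static void main(String[] args) {
        // Sample input and task invented for illustration.
        Pattern tag = Pattern.compile("<.*?>");
        String xml = "<Status><StatusMsg>Transaction Successful</StatusMsg></Status>";
        long elapsed = timeNanos(() -> tag.matcher(xml).replaceAll(""), 10);
        System.out.println("NANOSECONDS MILLISECONDS SECONDS");
        System.out.println(formatRow(elapsed));
    }
}
```

For measurements that need to be trusted, a benchmarking harness with proper warm-up would be a better choice than a single timed loop.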

If you run the code in profile mode in the NetBeans IDE, you can see the difference in time consumption.

[Image: NetBeans profiler output]

Thursday, September 19, 2013
