XMLStarlet: a Unix toolkit for XML

An introduction to a quick solution tool that allows manipulating verbose XML files with a minimum of typing.


XML is everywhere. A quick Google search shows more than 100 million articles about the subject. XML proponents gush about its ability to provide structure and yet remain human readable. XML critics are quick to mention that XML is so verbose that being human readable does not necessarily make it human comprehensible. Both sides are correct. Yet, despite the ongoing arguments, XML is already integrated into many software products and the rate of adoption is still on the rise. That means you need to learn tools and techniques that will allow you to use XML effectively.

There are many ways to work with XML. If you need an overview of tools and techniques, Free Software Magazine published one in its first issue (XML: The answer to everything?). In this article I will be using XMLStarlet, a tool based on XSLT (Extensible Stylesheet Language Transformations). I will show you how to use this tool to quickly extract information from XML files.

The catch-22 of XSLT

XSLT, while a very powerful technique for manipulating XML, is itself written in XML. So to escape the verbosity of XML, you still have to put up with the same verbosity in XSLT.

XMLStarlet changes all of this. It is a multi-platform free software utility written by Mikhail Grushinskiy on top of the libxml2 and libxslt libraries. While XMLStarlet has the full functionality of the XSLT engine underneath, its interface is compact in the style of UNIX utilities such as grep and find. I am assuming, of course, that you find UNIX utilities simple; if you do not, the trade-off of clarity for compactness may not be worth it.

To me, XMLStarlet is the UNIX toolkit for the XML world, as it is capable of queries (think find and grep), editing (sed) and other, more XML-specific operations such as validation, formatting and canonicalization.

Setting up

XMLStarlet is available for Linux, Solaris, Mac OS X and Windows. I use it on Windows, but the examples should work the same way on all platforms. Please ensure that XMLStarlet’s binary (xml or xml.exe) is on the path ahead of any other program of the same name. This article is based on version 1.0.0 of the program.

Once you have installed XMLStarlet by following the bundled instructions, try running the basic command xml. It should print out a long list of options.

sel   Select data or query XML document(s) (XPATH, etc)
ed    Edit/Update XML document(s)
tr    Transform XML document(s) using XSLT
val   Validate XML document(s) (well-formed/DTD/XSD/RelaxNG)
fo    Format XML document(s)
el    Display element structure of XML document

XMLStarlet’s partial list of options
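
The advice about the binary being on the path can be checked with a small shell sketch. It assumes a POSIX shell. The name `xmlstarlet` is included because some Linux packages install the program under that name rather than `xml`, and `XML_BIN` is only an illustrative variable name.

```shell
# Find which XMLStarlet binary name, if any, comes first on the PATH.
# "xml" is the name used in this article; "xmlstarlet" is the name
# some distribution packages use instead.
XML_BIN=""
for candidate in xml xmlstarlet; do
  if command -v "$candidate" >/dev/null 2>&1; then
    XML_BIN=$(command -v "$candidate")
    break
  fi
done

if [ -n "$XML_BIN" ]; then
  echo "Using: $XML_BIN"
else
  echo "No XMLStarlet binary found on the PATH"
fi
```

If the reported program is not XMLStarlet (another tool may also be called xml), running it without arguments will not show the sel, ed, tr, val, fo and el options listed above.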

A simple XML file

I will be using a basic phone book XML example to show the abilities of XMLStarlet. This could be something that your mobile phone would store or export its list of contacts to.

Sample phonebook XML file

<?xml version="1.0"?>
<phonebook>
 <contact>
  <name>John Doe</name>
  <phone type="home">555-1234</phone>
  <phone type="work">555-9876</phone>
 </contact>
 <contact>
  <name>Chris Jones</name>
  <phone type="work">555-9876</phone>
  <phone type="home">555-4567</phone>
  <phone type="mobile">555-5555</phone>
 </contact>
 <contact>
  <name>Jane Exciting</name>
  <phone type="home">555-1234</phone>
  <phone type="work">555-6543</phone>
 </contact>
</phonebook>

The list holds three contacts, each containing a person’s name and two or three of his or her phone numbers. In real life, you would probably have several dozen contacts, each with more field types. But even with this simple example, you can see how hard it could be to locate the information hidden within all the tags.

When you look at the information presented here, from inside the application owning the format, you most probably see one contact entry at a time, possibly even one phone number per screen.

XMLStarlet allows you to get below the proprietary interface and manipulate multiple entries in one go.

For this article, I will assume that the example XML is in the file phonebook.xml.
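
To follow along, the sample can be saved and given a first look with the el option from the list above. This is a minimal sketch assuming a POSIX shell; the fallback to the name `xmlstarlet` and the `XML_BIN` variable are my own additions, and the `count(//contact)` query is just one illustrative XPath expression.

```shell
# Save the sample phone book exactly as shown in the article.
cat > phonebook.xml <<'EOF'
<?xml version="1.0"?>
<phonebook>
 <contact>
  <name>John Doe</name>
  <phone type="home">555-1234</phone>
  <phone type="work">555-9876</phone>
 </contact>
 <contact>
  <name>Chris Jones</name>
  <phone type="work">555-9876</phone>
  <phone type="home">555-4567</phone>
  <phone type="mobile">555-5555</phone>
 </contact>
 <contact>
  <name>Jane Exciting</name>
  <phone type="home">555-1234</phone>
  <phone type="work">555-6543</phone>
 </contact>
</phonebook>
EOF

# Use "xml" if present, otherwise try "xmlstarlet" (an assumption about
# how the program may be packaged on your system).
XML_BIN=$(command -v xml || command -v xmlstarlet || true)

if [ -n "$XML_BIN" ]; then
  # Show the element structure, one path per element.
  "$XML_BIN" el phonebook.xml
  # Count the contact elements with an XPath expression.
  "$XML_BIN" sel -t -v "count(//contact)" -n phonebook.xml
else
  echo "XMLStarlet not found on the PATH; only phonebook.xml was created" >&2
fi
```

The XPath expression is wrapped in double quotes so that the shell passes it to the program untouched.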


Copyright information

This article is made available under the "Attribution" Creative Commons License 3.0 available from http://creativecommons.org/licenses/by/3.0/.

Biography

Alexandre Rafal...: Alexandre Rafalovitch is a developer with more than 15 years of IT experience. He specializes in Java and XML technology. Alexandre worked for several years as a BEA senior technical support engineer, which gave him a strong impetus to study quick solution tools.

double quotes on xpath expression

Submitted by Anonymous visitor on Thu, 2006-08-17 07:42.

I tried

xml sel -t -m //contact[phone/@type='mobile'] -v name -o ":" -v phone[@type='mobile'] -n phonebook.xml

But this did not work for me: there was no output on stdout. I had to add double quotes to make it work:


xml sel -t -m "//contact[phone/@type='mobile']" -v name -o ":" -v "phone[@type='mobile']" -n phonebook.xml
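
One likely explanation, if this was run in a Unix-style shell, is that the shell strips the single quotes before the program ever sees them. The sketch below assumes a POSIX sh or bash with default options and no matching file names in the current directory; printf stands in for xml so that the argument it would receive becomes visible.

```shell
# printf prints each argument exactly as the shell hands it over.
unquoted=$(printf '%s\n' //contact[phone/@type='mobile'])
quoted=$(printf '%s\n' "//contact[phone/@type='mobile']")

echo "unquoted: $unquoted"   # the single quotes are gone
echo "quoted:   $quoted"     # the expression arrives intact
```

Without the inner quotes, XPath reads mobile as the name of a child element rather than as a string, so the comparison matches nothing and no output appears. Other shells may behave differently: zsh, for instance, may refuse the unquoted form outright because of the square brackets.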

If you tried this on a *nix

Submitted by Daniel Escasa on Wed, 2007-03-14 01:52.

If you tried this on a *nix system, the left and right brackets... Well, if you tried it on a *nix system, you know what those do :)

Daniel O. Escasa
IT consultant and writer for hire
contributor, Free Software Magazine (http://www.freesoftwaremagazine.com)
personal blog at http://descasa.i.ph

excellent article

Submitted by Anonymous visitor on Tue, 2007-03-13 17:19.

excellent article, thanks!

How to update node text value ?

Submitted by tsoueid (not verified) on Fri, 2007-08-17 23:00.

Hello,

Is it possible in xmlstarlet to update the value of a node by using concat and self:: ?

The command below is not parsing concat and self:: correctly for me:

C:\>xml ed -u //phone[@type='work'] -v "concat(':',self::phone)" phonebook.xml

<?xml version="1.0"?>

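<phonebook>
 <contact>
  <name>John Doe</name>
  <phone type="home">555-1234</phone>
  <phone type="work">concat(':',self::phone)</phone>
 </contact>
 <contact>
  <name>Chris Jones</name>
  <phone type="work">concat(':',self::phone)</phone>
  <phone type="home">555-4567</phone>
  <phone type="mobile">555-5555</phone>
 </contact>
 <contact>
  <name>Jane Exciting</name>
  <phone type="home">555-1234</phone>
  <phone type="work">555-6543</phone>
 </contact>
</phonebook>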
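
A possible way around this, sketched under some assumptions: the -v option of ed inserts its argument as literal text, which matches the output above. Later XMLStarlet releases also document an -x (--expr) form of -u that evaluates its argument as XPath, but it may not be usable in the 1.0.0 release this article is based on, so check the output of xml ed --help first. The sketch assumes a POSIX shell rather than the Windows prompt shown in the question; the file name work-phone.xml and the `XML_BIN` variable are illustrative.

```shell
# A one-contact sample, so the sketch does not depend on other files.
cat > work-phone.xml <<'EOF'
<?xml version="1.0"?>
<phonebook>
 <contact>
  <name>John Doe</name>
  <phone type="home">555-1234</phone>
  <phone type="work">555-9876</phone>
 </contact>
</phonebook>
EOF

# Use "xml" if present, otherwise try "xmlstarlet".
XML_BIN=$(command -v xml || command -v xmlstarlet || true)

if [ -n "$XML_BIN" ]; then
  # -v would insert the text literally; -x asks for it to be evaluated
  # as XPath, with "." standing for the phone element being updated.
  "$XML_BIN" ed -u "//phone[@type='work']" -x "concat(':', .)" work-phone.xml
else
  echo "XMLStarlet not found on the PATH; nothing to run" >&2
fi
```

The result goes to standard output, as in the question, and the input file itself is left unchanged.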

