Stylish XML

Part one of this article looked at how Cascading Style Sheets (CSS) can be used to make XML documents look good in a web browser. In part two, I’ll explore the more complex eXtensible Stylesheet Language (XSL) and how it can be used to transform XML into HTML and PDF documents.

Limitations of XML/CSS

By the end of part one of this article I had my tasty pico de gallo recipe marked up with XML tags and nicely styled using CSS. It looked great in my Firefox browser. Unfortunately, one of the major problems with using XML/CSS is that it doesn’t work for everyone. Older, proprietary browsers and text-only browsers can only understand HTML and get terribly confused when trying to interpret XML/CSS. If I want to share my delicious recipes with someone using Netscape 4 or Lynx, I’m going to have to convert my XML into a format that their browser can handle. This means leaving behind CSS and constructing a new style sheet using the eXtensible Stylesheet Language (XSL).

As long as I’m in the mood for creating new things, I may as well create a new XML recipe to use for the examples in this article. You already have a recipe for a nice appetizer from part one, and now, for part two, you just need a drink to go with it. I can’t think of anything better to have with pico de gallo and tortilla chips than a cold margarita, so that’s what I’ll use in the examples. By the way, the margarita recipe and all of the style sheets in this article are available as a compressed download.
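If you don’t have the download handy, here is a rough sketch of what a file like margarita.xml might look like. Only the recipe and title elements are named in this article; every other element name (ingredients, directions, heading, item, step) and the recipe text itself are placeholders assumed for illustration, so the real file may well differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical margarita.xml. Only <recipe> and <title> come from the
     article; all other element names and the content are assumptions. -->
<recipe>
  <title>Margarita</title>
  <ingredients>
    <heading>Ingredients</heading>
    <item>Tequila</item>
    <item>Orange liqueur</item>
    <item>Fresh lime juice</item>
  </ingredients>
  <directions>
    <heading>Directions</heading>
    <step>Shake everything with ice.</step>
    <step>Strain into a salt-rimmed glass.</step>
  </directions>
</recipe>
```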

Transforming with XSL

An XSL Transformation, or XSLT, is the process of transforming an XML document into another type of document, most often another XML document. Now why on earth would someone want to transform one XML document into another XML document? I’ll give you a hint, Grasshopper: HTML is a close cousin of XML, and XSLT knows how to write it. I can build a custom XSL style sheet, apply it with a tool called an XSLT processor and presto, my XML is magically transformed into HTML.

In the examples, I will be using the XSLT processor called xsltproc to process the XSL style sheets. The xsltproc tool should be available as a package for most GNU/Linux distributions and Apple’s OS X, if you want to follow along. The basic syntax for the command is xsltproc -o [output-file] [style-sheet] [input-file]. Or, in the case of this article’s examples, xsltproc -o output.html recipe-style.xsl margarita.xml. If you don’t have access to xsltproc, or if you’re just feeling a little apathetic about typing all these commands, the output files from each of the examples are included alongside the other files in the compressed download.

Simple beginnings

The most basic XSL style sheet is one that does nothing at all. Of course, there are a few headers that are required, but for the most part the style sheet is devoid of any template rules. When using a blank style sheet like this, XSLT’s built-in rules take over and copy only the text through, so it appears that xsltproc simply strips the tags from the XML recipe and dumps it as plain text to the output file. Close, but not quite. Take a look at the first line of the output and you will see a document-type declaration for HTML. This was added during processing, because the second line of recipe-style.xsl specifies HTML as the output format. So I’m on the right track, but output.html displayed in a browser looks really bad. That’s because the only output I have produced so far is a plain-text file masquerading as HTML.
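An “empty” style sheet along these lines could look like the sketch below. This is an assumption about the shape of recipe-style.xsl at this stage, not a copy of the file in the download; it only relies on standard XSLT 1.0 elements.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <!-- No template rules yet, so the processor falls back on XSLT's
       built-in rules, which pass the text content straight through. -->
</xsl:stylesheet>
```

The xsl:output element on the second line is what asks the processor for HTML rather than XML output.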

Output produced by an “empty” style sheet

Introducing the XSL template

A real HTML document should have tags and currently output.html has none. To make a valid HTML file I’ll need to fix up the XSL style sheet to at least produce <html>, <head> and <body> tags in the appropriate places. This can be done by adding a single XSLT processing rule called a template. You can think of an XSLT template as being like a word processor’s find-and-replace feature. You use the “match” attribute to tell the template what XML element it should find. Everything between the template’s starting and ending tags is the replacement text. To convert my margarita recipe to HTML I need my style sheet to match the recipe element in the XML source file and replace it with appropriate HTML elements in the output document.
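As a minimal sketch, a style sheet with a single template matching the recipe element might look like this. The exact layout of the HTML skeleton is my assumption; the article only says that html, head and body tags are produced.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- "Find" the recipe element and "replace" it with an HTML skeleton. -->
  <xsl:template match="recipe">
    <html>
      <head></head>
      <body></body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Nothing in this template asks the processor to look inside the recipe element, so the children of recipe are never processed.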

HTML tags are here, but the recipe’s gone!

It wasn’t hard to add the template rule to my recipe-style.xsl style sheet and get the HTML tags I wanted to see. Unfortunately, although I am one step closer to a valid HTML document, it seems that all of the recipe’s content has now disappeared. This is because I have started using processing rules in my style sheet, but I have not followed through by specifying where to place the recipe’s content. Adding a simple <xsl:apply-templates /> element between the HTML <body> tags will fix things up and get my recipe content back.
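The fix can be sketched as follows, again assuming a style sheet with one template for the recipe element. The only change from a bare HTML skeleton is the xsl:apply-templates element inside the body.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="recipe">
    <html>
      <head></head>
      <body>
        <!-- Process the children of recipe and place the results here. -->
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Because no other templates exist yet, the children are handled by XSLT’s built-in rules, which is why the content comes back as unformatted text.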

<xsl:apply-templates /> brings the recipe’s content back

Templates galore

Now that I have created some basic HTML output, my recipe just needs a little aesthetic improvement. OK, so it needs a lot of improvement, especially in terms of appropriate line breaks and whitespace, but this is something I can easily do by employing more XSL templates. So far I’ve used a single XSL template in my recipe-style.xsl file. There’s no reason I should stop at just one. In fact, I can create a template for each XML element in my margarita.xml source file to give the HTML output a nice look.

Since I am catering to people with older, proprietary browsers and text-only browsers, I’ll use standard HTML tags to achieve the basic style I want. A big improvement to my HTML output can be made just by adding <p> and <br> tags to get line breaks in the correct places. I can also change the font size for the title and section headings by using <h1> and <h2> tags, respectively. Four more simple templates added to recipe-style.xsl give the HTML output a more appealing look.
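Four templates of this kind might look like the sketch below. The title element is named in the article, but heading, item and step are hypothetical element names I am assuming for the section headings, ingredients and directions; substitute whatever names the recipe file actually uses.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="recipe">
    <html>
      <head></head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- Recipe title becomes a level-one heading. -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <!-- Section headings (hypothetical element name) become level-two headings. -->
  <xsl:template match="heading">
    <h2><xsl:apply-templates/></h2>
  </xsl:template>

  <!-- Each ingredient (hypothetical element name) ends with a line break. -->
  <xsl:template match="item">
    <xsl:apply-templates/><br/>
  </xsl:template>

  <!-- Each step (hypothetical element name) gets its own paragraph. -->
  <xsl:template match="step">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

The style sheet itself must be well-formed XML, so the line break is written as an empty br element; the HTML output method takes care of writing it out in HTML form.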

More templates for a better look

The value of a title

My recipe already has the title displayed as a nice big level-one heading, but it really should have a title between the HTML <head> tags as well. It’s not a big deal in terms of the look of the document, but it is required for my output to be considered valid HTML.

I already have a template for the XML title element that displays it between <h1> tags in the HTML output and now I need to use the title again in another place, this time between <title> tags. While it is possible to write two templates for the title element and apply them differently in different situations, it is far easier to use an XSL <xsl:value-of /> element.

Remember when I said that templates were like using find-and-replace in a word processor? Well, <xsl:value-of /> is like using just the find without the replace, and it uses a select attribute to specify what to find. For example, if I want to know what text is between the <title> tags in my XML document I would simply use <xsl:value-of select="title" /> in my XSL style sheet to find it. Strategically placing this value-of element in my XSL style sheet will result in the recipe title appearing in the HTML header of my output as well as in the body. The result doesn’t look that much different in a browser, but by adding this title the HTML will now pass W3C’s markup validation.
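One place to put the value-of element is inside the template for the recipe element, as in this sketch. It assumes that title is a direct child of recipe, which is why the short path select="title" is enough.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="recipe">
    <html>
      <head>
        <!-- Look up the text of the recipe's title child and drop it here. -->
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- The same title still appears in the body as a level-one heading. -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>
</xsl:stylesheet>
```

The title element inside head is plain HTML output, while the select="title" path refers to the XML source element of the same name.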

HTML that validates

Transforming to PDF and more

With just a few simple templates, XSL does a great job of transforming my XML-based margarita recipe into HTML, but that’s not all it can do. With a little more work I can also produce documents for Adobe Acrobat or OpenOffice all from the same XML source file. I’ll need to construct a new XSL style sheet and install a new tool called a Formatting Objects processor, but otherwise the procedure is very similar to what we’ve covered so far.

The reason for the new style sheet and the new tool is that transforming to PDF is a two-step process. First, I will use XSL templates to transform my XML-based margarita recipe into another type of XML document. Sound familiar? Only this time I’m not transforming into HTML, but rather into a markup language called Formatting Objects (FO). The second step is to take the FO markup and run it through the FO processor to get a PDF document.
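To give a feel for the first step, here is a minimal sketch of a style sheet that produces FO markup. It is not the sample from the download: the page size, margins and font sizes are arbitrary choices, the master name “page” is made up, and heading, item and step are hypothetical element names assumed for the recipe’s section headings, ingredients and directions.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- FO is ordinary XML, so the output method is xml this time. -->
  <xsl:output method="xml" indent="yes"/>
  <!-- Drop whitespace-only text so stray newlines don't land in the flow. -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="recipe">
    <fo:root>
      <!-- Describe the page: size and margins are arbitrary here. -->
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
            page-height="11in" page-width="8.5in"
            margin-top="1in" margin-bottom="1in"
            margin-left="1in" margin-right="1in">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <!-- Pour the recipe's content into the body region of that page. -->
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <xsl:apply-templates/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <xsl:template match="title">
    <fo:block font-size="18pt" font-weight="bold" space-after="12pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <!-- heading, item and step are hypothetical element names. -->
  <xsl:template match="heading">
    <fo:block font-size="14pt" font-weight="bold" space-before="10pt" space-after="6pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="item | step">
    <fo:block><xsl:apply-templates/></fo:block>
  </xsl:template>
</xsl:stylesheet>
```

The templates follow the same find-and-replace pattern as the HTML ones; only the replacement vocabulary changes, with fo:block standing in for tags like <h1> and <p>.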

Formatting Objects markup may look scary at first, but it’s not bad after you play around with it a bit. To help you get comfortable I have included a sample XSL style sheet and Formatting Objects output as part of the files in the compressed download.

Further exploration

This article barely scratches the surface of everything that can be done with XSL and FO. For those of you who want to go further, I have listed some helpful references below.

The W3Schools website has some good tutorials covering XSL transforms and XSL formatting objects.

There are some XSL style sheets that you can download and examine both from my web site and from the DocBook XSL repository.

You can also learn a lot from using the various tools. Available XSLT processors include xsltproc and Saxon. A couple of my favorite Formatting Objects processors are Apache FOP and XMLMind’s FO Converter. All but xsltproc are Java-based and may be used on a variety of platforms.
