The trouble of writing a standards compliant website


One of my tasks at work is to write, enhance and maintain a small website for my boss. Having been given free rein, I decided (of course) to host it on a LAMP server. No trouble there. Not wanting to use outdated technology that would require extensive rewriting after a few years, I decided to stick to standards, and so I learnt XHTML 1.1.

Break a leg, or break a page.

Of nice standards

XHTML 1.1 is the latest W3C-recommended web publishing language. It retains the nice parts of HTML, fuses them with the modularity of XML 1.0, requires you to separate content, presentation and interaction, and lets you rapidly create a page that will look the same regardless of the web browser used. In theory.

Of browsers, standards compliance and bragging rights

Right now, there are roughly four browser families representing most of the browsers used to access the web.

  • Trident-based: Internet Explorer and skinned mshtml browsers. Most used. Most buggy. Most standards-abusing.
  • Gecko-based: Mozilla, Firefox, Seamonkey, Camino, and others. Sticks to standards mostly, but has gone through countless versions and revisions.
  • Opera: used on PCs sometimes, on mobile devices mostly; quite standards compliant in general, but has a few annoying quirks.
  • KHTML: Apple Safari and Konqueror. Tries to comply with standards, but is still mostly an HTML 4.01 engine.

Of those, only two really understand XHTML 1.1: Gecko and Opera. Both include an XML parser, and generate errors if the page is badly written. Firefox’s source code viewer is really nice.
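
To give an idea of what "generating errors if the page is badly written" means, here is a minimal Python sketch that uses the standard library's strict XML parser as a stand-in for a browser's XML parser. The function name and both sample fragments are made up for illustration; real browsers report errors in their own way.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return None if markup is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Hypothetical fragments: the first closes its <br />, the second does not.
good = "<p>Line one<br />Line two</p>"
bad = "<p>Line one<br>Line two</p>"

print(well_formedness_error(good))  # nothing to report
print(well_formedness_error(bad))   # the parser's complaint about the tags
```

A tag-soup HTML parser would quietly accept the second fragment; an XML parser refuses it outright, which is exactly what makes these two engines useful for catching mistakes.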

Of owning one’s mistakes

As a meticulous designer, I wrote the code entirely by hand and dismissed the idea of using a WYSIWYG editor; as such, any error in the code has to be corrected by yours truly.

When the time came for me to test and validate my website, I had the horrible surprise of getting slightly different results in Firefox and Opera (easily fixed by reordering my font priorities), unworkable content in Konqueror (which required a slightly more extensive rewrite) and...

A download prompt in Internet Explorer.

Of MIME types, HTTP headers and headaches

You will, of course, tell me this: XHTML needs to be served with the application/xhtml+xml MIME type (Apache sends it automatically if your file ends in .xhtml, and for .php files you can set it with PHP’s header() function). If the browser doesn’t advertise support for it, you can fall back to application/xml or text/xml. If that gets you nowhere either, you can send different content with the more classical text/html MIME type.
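
The fallback chain described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used on the site: the function names are invented, the preference order is taken from the paragraph above, and wildcard entries are deliberately not counted as support.

```python
# Preference order from the paragraph above; text/html is the last resort.
PREFERRED = ("application/xhtml+xml", "application/xml", "text/xml")
FALLBACK = "text/html"


def parse_accept(header):
    """Map each media range of an Accept header to its q-value (default 1.0)."""
    ranges = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media = parts[0].lower()
        if not media:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # unreadable q-value: treat as "not accepted"
        ranges[media] = q
    return ranges


def choose_mime_type(accept_header):
    """Pick the best MIME type the client names explicitly, else text/html."""
    accepted = parse_accept(accept_header or "")
    for mime in PREFERRED:
        if accepted.get(mime, 0.0) > 0.0:
            return mime
    return FALLBACK


# Hypothetical header, for illustration only.
print(choose_mime_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
```

Only types the client spells out count here, so a header that contains nothing but a wildcard lands on text/html.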

However, Konqueror and, above all, Internet Explorer lie about their capabilities: Konqueror will open the file and try to process it as regular HTML (which may succeed or fail), while Internet Explorer will merely ask you to save the file to disk... so that you can open the page with a standards-compliant browser, perhaps.

If you look at what IE is supposed to support, you’ll get: “*/*”. Meaning it is supposed to support any kind of file (which, of course, it can’t). Meaning that, if you want to provide a scaled-down version of the page, you’ll need to do browser detection. That makes MIME types irrelevant, which is highly annoying given that some browsers do support them nicely!

Konqueror is a bit more truthful: it says that it supports “*/*” but that it prefers text/html; for application/xhtml+xml, it can try opening the file as HTML 4, or start another browser that has registered the application/xhtml+xml MIME type, if one is available. This can be used to set up a script that would, for example, replace all instances of the id= attribute with name=, which would make image maps work. Beyond that, badly applied CSS on too deeply nested DIVs, radically different font scaling and more can make or break a page.
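
The id-to-name rewriting mentioned above could look roughly like the following Python sketch. It is a narrower variant than "all instances": it only touches map elements, since those are what image maps depend on. The function name and the sample markup are hypothetical, and the regular expression is naive (it assumes no ">" inside attribute values).

```python
import re

# Matches the start of a <map ...> tag up to its id attribute.
MAP_ID = re.compile(r"(<map\b[^>]*?\s)id=", re.IGNORECASE)


def downgrade_map_ids(markup):
    """Rewrite <map id="..."> as <map name="..."> for HTML 4 era browsers."""
    return MAP_ID.sub(r"\1name=", markup)


# Hypothetical XHTML 1.1 fragment.
xhtml = (
    '<div id="menu"><img src="nav.png" alt="Navigation" usemap="#nav" /></div>'
    '<map id="nav"><area shape="rect" coords="0,0,10,10" href="a.html" alt="A" /></map>'
)

print(downgrade_map_ids(xhtml))
```

Other id attributes, such as the one on the div, are left alone so that CSS selectors keep working.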

Opera 9 has an integrated XML parser on par with Gecko’s (it wasn’t the case in 8.x); it supports application/xhtml+xml quite well apart from a few bugs (like image maps, which require the older syntax to work) and it displays the page without requiring too many tricks.

Gecko is a big winner here: to support all versions starting from Mozilla 1.0rc3, I merely needed to use one single trick. What a relief.

What you may end up doing

The easy way out would be to rewrite the page in HTML 4.01. Another would be to force IE to support XHTML 1.1 and use only those tags all browsers recognize (yes, it’s possible! It requires a dozen additional lines of code, wastes even more bandwidth when IE downloads the W3C DTDs, and then renders the page in Quirks mode, but IE does accept application/xml and parse the XML code). Or else, do as I do: write the pages in XHTML 1.0 Strict following Appendix C, default the MIME type to text/html, detect those few browsers that understand application/xhtml+xml, and send them the higher-level MIME type.
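
For illustration, that last approach might be sketched as a tiny Python WSGI application. This is an assumption-laden sketch rather than the setup actually used for the site (which runs on a LAMP server): the page content, the names wants_xhtml and application, and the choice to send a Vary header are all mine.

```python
XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"

# Hypothetical XHTML 1.0 Strict page, kept minimal.
PAGE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">'
    "<head><title>Demo</title></head><body><p>Hello.</p></body></html>"
).encode("utf-8")


def wants_xhtml(accept_header):
    """True only if application/xhtml+xml is named explicitly with q > 0."""
    for item in accept_header.split(","):
        media, _, params = item.partition(";")
        if media.strip().lower() != XHTML_TYPE:
            continue
        q = 1.0
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        return q > 0.0
    return False  # wildcards such as */* do not count as support


def application(environ, start_response):
    """Serve the same page, defaulting to text/html unless XHTML is requested."""
    accept = environ.get("HTTP_ACCEPT", "")
    content_type = XHTML_TYPE if wants_xhtml(accept) else HTML_TYPE
    start_response("200 OK", [
        ("Content-Type", content_type + "; charset=utf-8"),
        ("Vary", "Accept"),  # tell caches the response depends on Accept
        ("Content-Length", str(len(PAGE))),
    ])
    return [PAGE]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, application).serve_forever() should be enough. The same bytes go out in both cases; only the Content-Type changes, which is the whole point of following Appendix C.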

Darn it, being a responsible web developer is hard sometimes...

Comments

Ryan Cartwright's picture

Darn it, being a responsible Web developer is hard sometimes...

I whole-heartedly agree. We're currently in the process of moving our website to a CMS - naturally we're using a FOSS one (exponent). One of the things I was keen to do was avoid (heck - ban) the use of tables and produce the whole thing using CSS.

We're using the so-called "holy grail" 3-column layout, and trying to get it to render even remotely the same in each of the four main browser engines is a thankless task. Aside from the IE hacks you need to put into the CSS (which of course make it non-compliant with W3C standards), I've also come across a lot of issues between Opera/KHTML/Gecko.

In the end, though, it's worth it, as we'll have a site which will be wholly usable by all web users on their terms (OS, interface, browser, font preferences etc.) and with content manageable by our staff.

At least that's what I keep telling myself when I get bogged down figuring out why ltr renders wrong in Konqueror but right in Firefox, Opera and even Safari (IE? just don't go there!).

A good post though. :o)

Mitch Meyran's picture

Heed my advice: remove as many DIVs as you can - it will solve most of the problems with Konqueror. Instead, don't hesitate to apply CSS to tags, even converting inline elements to blocks if need be. Don't use classes too much, but define ids for some primary layout elements and set styles for their children (like #leftcolumn {} #leftcolumn h2 {display:block}).
Create a single CSS file for Gecko, Konq and Opera (not too difficult), then only at the end try to fix it for IE6 (IE7 reacts a lot like Opera). Don't hesitate to use the * html hack, and experiment with CSS expressions (the only IE6 CSS abuse that may become part of CSS3 specs...)
It worked not too badly on a 4-parts layout I recently cleaned up.
---
A computer is like air conditioning: it becomes useless when you open windows.

nederhoed's picture
Submitted by nederhoed on

Have you tried validating your pages while developing?

  • Firefox Validation Extension
  • W3C validation

I think using 'text/html' as mime-type with an XHTML DOCTYPE in every page should work fine.

Good luck! Robert R.

--
Working on being
http://nederhoed.com/weblog/

Mitch Meyran's picture

The problem with using text/html with an XHTML 1.1 page is that there ARE differences between HTML4, XHTML 1.0 Strict and XHTML 1.1. For example, 'object' stacking, which allows elegant fallback cascades of content (animated SVG => Apple Flash => JPEG...), wouldn't be supported by an HTML4 browser and would likely crash it.

Even worse, your carefully crafted XHTML page would be rendered using Quirks mode, due to it containing a DOCTYPE incompatible with its mimetype.

I actually tried all validators: all those using an XML validator failed with text/html because of the incorrect mimetype; using text/xml or better solved the problem. Those using an SGML validator (like W3C's) don't take the DOCTYPE into account, since they are SYNTAX (not structure) validators.

Actually, Firefox detects errors in case of incorrect mimetype: try using an anchor written this way: <a id='thing' />; it would be perfectly legal in XHTML1.1, but in HTML (or XHTML using text/html) the tag would be considered open until it met another 'a' tag.

And Firefox does (starting with version 2.0; I submitted a bug report for 1.5, and it was solved in Bon Echo) highlight the slash '/'.
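
To show the XML half of that anchor example, here is a small Python sketch using the standard library's XML parser; the surrounding paragraph and the describe helper are made up for the demonstration. It does not reproduce the HTML behaviour, where a tag-soup parser ignores the slash and keeps the anchor open.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment built around the self-closed anchor discussed above.
fragment = "<p>Before <a id='thing' />after</p>"

paragraph = ET.fromstring(fragment)
anchor = paragraph.find("a")


def describe(element):
    """Summarise what the XML parser stored inside and after an element."""
    return {
        "text": element.text,      # content inside the element
        "tail": element.tail,      # text following its closing point
        "children": len(element),  # number of child elements
    }


info = describe(anchor)
print(info)
```

Parsed as XML, the anchor is empty and "after" sits outside it as trailing text. Served as text/html, that same word would end up inside the still-open anchor.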

Meaning that no, using text/html with XHTML1.1 isn't fine. And it only gets worse when embedding SVG or MathML.

Cheers!


Author information

Mitch Meyran's picture

Biography

Have you ever fixed a computer with a hammer, glue and a soldering iron? Why not? It's fun!