XML: WYSIWYG to WYSIWYM


It all started with cavemen and their cave drawings. All cave drawings were WYSIWYM (What You See Is What You Mean). I mean (no pun intended), if you saw a cave drawing in which a hunter was chasing a mammoth, it meant that a hunter was chasing a mammoth. There were no two ways to interpret a cave drawing. Then came alphabets and words. With words came plain text and documents. Then came SGML and XML, for adding information to a document about its structure and content. An XML document contains both content (words) and markup that indicates what role the content plays. It’s a markup language that adds meaning to the meaning. With the advent of XML, there arose a need for XML editors.

Basic Terminology

DTD

Document Type Definition. A DTD, written in a declaration syntax that XML inherited from SGML, defines the structure and syntax of an XML document. Essentially, it declares the elements, attributes and entities a document may use, and the content model of each element.

Open Standard DTD

Open Standard DTDs are Document Type Definitions that are publicly available for use with various XML-aware applications. An example of an Open Standard DTD is the DocBook XML DTD. In contrast, WordML, developed by Microsoft, is a proprietary XML format. WordML needs to be licensed in order to be used by non-Microsoft applications.

XML

XML (eXtensible Markup Language) is a meta markup language that can be used to create other markup languages. XHTML is an example of a markup language created using XML.

XML Schema

An XML Schema is similar to a DTD, except that it is written in XML and is also capable of defining data types and putting constraints on the content.

XSLT

XSLT (Extensible Stylesheet Language Transformations) is a language for writing stylesheets: sets of templates that define what transformations need to be performed on an XML document. XSLT allows XML data to be shared among XML-aware applications that don’t necessarily use the same XML schema. In the context of document authoring and content creation, an XSLT stylesheet defines what the formatted output will look like.
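
To make the idea of a transformation concrete, here is a minimal sketch in Python. It is not XSLT: the Python standard library has no XSLT processor, so a small rule table stands in for a stylesheet. The element names (`article`, `title`, `para`) and the `RULES` mapping are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table standing in for a stylesheet:
# source element name -> HTML element name.
RULES = {"article": "div", "title": "h1", "para": "p"}


def transform(node: ET.Element) -> ET.Element:
    """Rebuild the tree, renaming each element according to RULES."""
    # Elements without a rule fall back to a generic <span>.
    out = ET.Element(RULES.get(node.tag, "span"))
    out.text = node.text
    out.tail = node.tail
    for child in node:
        out.append(transform(child))
    return out


source = ET.fromstring(
    "<article><title>Cave drawings</title>"
    "<para>A hunter chases a mammoth.</para></article>"
)
html = ET.tostring(transform(source), encoding="unicode")
print(html)
```

The source document says nothing about headings or paragraphs being big or small, bold or plain; that decision lives entirely in the rule table, which is the role an XSLT stylesheet plays in a real toolchain.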

Valid

An XML document that is well-formed and conforms to the rules defined by an XML DTD or schema is valid.

Well-formed

An XML document that conforms to the syntax rules of XML is well-formed.
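
A well-formedness check can be sketched with Python's standard `xml.etree.ElementTree` parser, which rejects any input that breaks XML's syntax rules. The sample documents and the helper name `is_well_formed` are made up for this illustration. Note that this parser does not validate against a DTD or schema; checking validity needs a validating parser, for example the third-party lxml library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = "<article><title>Cave drawings</title></article>"
bad = "<article><title>Cave drawings</para></article>"  # mismatched end tag

print(is_well_formed(good))
print(is_well_formed(bad))
```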

Proprietary file formats

The popular way of authoring a document is to use WYSIWYG tools (e.g. MS Word or FrameMaker). Documents created using these applications carry formatting information but lack semantic information (i.e. information about the structure of the content). Since the formatting information is proprietary to each vendor and the file format is binary, documents created using these applications are locked into the file format of the application’s vendor. If you wish to switch word processors (e.g. from Microsoft Word to Corel’s WordPerfect), you have to go through the painful process of converting all of your documents to the new proprietary format, and you often lose valuable data during the conversion.

Enter XML-based document authoring

In contrast, XML allows a document author to create content in a “presentation neutral” format that captures the semantics of the content, rather than the presentational format. XML introduced a way of authoring documents that is completely free from the limitations of old-fashioned, non-interchangeable, binary file formats (e.g. DOC, XLS, etc.). Open Standard XML Document Type Definitions (DTDs) and XML Schemas allow document authors to create consistent, structured documents regardless of the XML editor they are using. A true XML editor supports any XML DTD or XML Schema. Thus, the document author is able to move from one XML editor to another without the fear of the content being locked into one proprietary file format. This is good for the document author, but not good for the software vendor who developed the editor. The only way for vendors to compete for customers is to offer editors that provide more editing features. If they don’t, the customer will just move to a different editor, without losing any content. This has paved the way for the XML editor wars, where each vendor wants to develop an editor that has better features than its competitors’ editors.

Features offered by XML editors

Despite the fact that XML is the Holy Grail for publishers and document authors, nobody wants to use Vi or Notepad to create an XML document. They want tools that make XML editing easy and painless. So the paradigms of WYSIWYG and WYSIWYM were introduced to the XML world. Essentially, document authors want XML editors that can:

  1. Highlight syntax errors
  2. Automatically complete XML Tags
  3. Indent properly
  4. Check for validity against an XML Schema or a DTD
  5. Check for XML well-formedness
  6. Allow viewing and editing of XML documents in a tree view, and
  7. Fix the kitchen sink
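
As a rough illustration of the tree view in the list above, the sketch below prints one line per element, indented by nesting depth. It uses only Python's standard `xml.etree.ElementTree`; the sample document and the function name `tree_view` are assumptions for this example, not taken from any particular editor.

```python
import xml.etree.ElementTree as ET


def tree_view(node: ET.Element, depth: int = 0) -> list:
    """Return one line per element, indented two spaces per nesting level."""
    attrs = "".join(f' {key}="{value}"' for key, value in node.attrib.items())
    lines = ["  " * depth + node.tag + attrs]
    for child in node:
        lines.extend(tree_view(child, depth + 1))
    return lines


doc = ET.fromstring(
    '<book id="b1"><chapter><title>Intro</title>'
    "<para>Text</para></chapter></book>"
)
print("\n".join(tree_view(doc)))
```

A graphical editor would draw the same structure as collapsible nodes, but the underlying walk over parent and child elements is the same.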

Various types of XML editors

The positive outcome of the XML editor wars is that we currently have a large number of very good, feature-rich XML editors to choose from. These editors can be divided into three categories: WYSIWYM, WYSIWYG and Text-Based.

WYSIWYM

What You See Is What You Mean is a paradigm for XML editors that accurately display the information being conveyed, rather than the actual formatting. Since XML doesn’t define the actual formatting of the content, these editors are very useful for visually creating and managing the data. A WYSIWYM editor allows a document author to edit an XML document by interacting with feedback text generated by the editor. This text presents both the knowledge already defined and the options (tags, elements, attributes, etc.) for extending and modifying it. Thus, a WYSIWYM XML editor relieves the document author of the need to memorize all of the tags, elements and attributes of an XML DTD or Schema. WYSIWYM XML editors are also known as semantic editors.
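
The way such an editor offers options at the current location can be sketched as a lookup in a content model. The example below is a toy: `CONTENT_MODEL` is a hand-written dictionary standing in for a real DTD or Schema, and the element names and the `allowed_children` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model standing in for a DTD or Schema:
# parent element name -> child element names allowed inside it.
CONTENT_MODEL = {
    "article": ["title", "section"],
    "section": ["title", "para", "section"],
    "para": ["emphasis"],
}


def allowed_children(cursor: ET.Element) -> list:
    """Return the elements the toy model allows inside the element at the cursor."""
    return CONTENT_MODEL.get(cursor.tag, [])


doc = ET.fromstring(
    "<article><title>T</title><section><para>Hi</para></section></article>"
)
para = doc.find("section/para")  # pretend the cursor sits inside this <para>

print(allowed_children(doc))
print(allowed_children(para))
```

A real editor derives the same kind of answer from the DTD or Schema itself, which is why the author never has to memorize it.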

ButterflyXML is a powerful WYSIWYM editor, freely available under the GNU General Public License. In addition to the features mentioned above, it is capable of automatically generating an XML Schema or a DTD based on an XML file.

ButterflyXML is capable of presenting the elements that are available at the location of the cursor as defined by the DTD or the Schema
Right clicking on any tag will display all the child and sibling tags that are available as defined in the XML Schema or DTD
ButterflyXML offers a Tree View of the XML document. A tree view helps the author to better visualize the semantics of the content

WYSIWYG

What You See Is What You Get is a paradigm that comes from word processing and publishing. It refers to editing software that makes sure that the image seen on the screen closely corresponds to the final formatted output. That does not mean that WYSIWYG XML editors exclude the semantics of the XML content. All it means is that a WYSIWYG XML editor allows the document author to edit the XML content with the XSLT applied to the content. So the document author doesn’t see the XML elements and attributes; instead, he/she sees the formatted output as defined by the XSLT.

Vex (Visual Editor XML) is a very capable WYSIWYG editor that uses CSS to provide authors with an MS Word-like user interface for creating and editing documents. It is well suited for "document-style" documents such as DocBook and XHTML, but is not the best choice for data collection.

Like ButterflyXML, Vex is also capable of listing the elements that are valid and inserting them at the current caret location. Available attributes for each element are displayed as well
Unlike ButterflyXML, Vex does NOT provide a tree view of the whole XML document; however, it does provide an outline of the document. Outlines provide a good, high-level view of the semantics of the document

Text Based

Text-based editors are the most rudimentary type of XML editor: the document’s author has to manually type in all of the tags and attributes. However, these editors usually check for well-formedness and sometimes for validity as well. Both Vi and Emacs can be configured to act as powerful text-based XML editors. Documentation is available for configuring either of these free editors as an XML editor.

Author information

Biography

Saqib Ali is a Senior Systems Administrator and Technology Evangelist at Seagate Technology. He also manages a free software web-based application that allows online conversion of DocBook XML to HTML or PDF. Saqib is also an active contributor to The Linux Documentation Project.
