andreas.com FAQ: Writing with XML

The following is my user’s perspective on using XML with the Arbortext Epic XML editing environment at a startup in 2001. Some of this may have changed. –andreas

There’s tremendous interest in XML. It’s been around for a number of years, but it was always a “coming soon to a theater near you” kind of thing. We had FrameMaker or Word, and both of those could produce documentation from concept to print. One could do everything on one platform (for example, write in Word on a PC, output to PDF, and view the PDF on a PC). One really didn’t need another solution, nor was there much need for cross-platform solutions (companies and their users lived either on Windows or on UNIX, and in most cases, they weren’t obligated to use the other platform). XML was the solution, but there wasn’t a problem.

In the last few years (1999-2001), we’ve seen an extreme proliferation of platforms. Windows and UNIX are no longer the only platforms. We now have Palm OS PDAs (Palm, Handspring, Sony Clié) and many other kinds of PDAs (go into any office store and you’ll see a dozen different types), dozens of cell phones, WebTV, TiVo, other TV devices, and so on. All of these devices can be linked. This means documentation needs to be output to multiple devices with different operating systems (such as Palm devices or WAP cell phones) and screens (a 19″ color screen versus a 3″ diagonal black-and-white PDA screen). All of this can be delivered via the web or wireless as well.

So… XML’s long-awaited problem has finally arrived: multiple interlinked platforms. If you want an XML mission statement: write a single text file, have it automatically output in different formats for different devices, and distribute it automatically via web or wireless.
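
To make that mission statement concrete, here is a minimal Python sketch. It is not the setup we used. The element names (`doc`, `title`, `para`) and the two render functions are invented for the example; a real system would do this work with stylesheets and converters.

```python
import textwrap
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical single-source document; the element names are invented for this sketch.
SOURCE = """<doc>
  <title>Resetting the device</title>
  <para>Hold the power button for five seconds, then release it.</para>
  <para>The device restarts and shows the home screen.</para>
</doc>"""


def to_html(root):
    """Render the source as HTML for a desktop browser."""
    parts = ["<h1>" + escape(root.findtext("title", "")) + "</h1>"]
    for para in root.findall("para"):
        parts.append("<p>" + escape(para.text or "") + "</p>")
    return "\n".join(parts)


def to_small_screen(root, width=30):
    """Render the same source as narrow plain text for a small screen."""
    lines = [root.findtext("title", "").upper(), ""]
    for para in root.findall("para"):
        lines.extend(textwrap.wrap(para.text or "", width=width))
        lines.append("")
    return "\n".join(lines).rstrip()


root = ET.fromstring(SOURCE)
html_output = to_html(root)
text_output = to_small_screen(root)
print(html_output)
print(text_output)
```

The point is only that the writer touches one file; each output format is a separate, automated rendering of it.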

For corporate documentation, the end of Frame and Word is in sight. Just as HTML became a standard (for the past few years at work, I have written nearly everything in XML or HTML), XML will become the standard format for documentation. XML has other advantages: it is license-free (no company owns it), platform-independent (it runs on all systems), and easy to develop with.

I see a few hands, so let’s take some questions.

What about business cases, localization, content re-use, and connection to applications or web sites?

At our startup…

  • Localization was just theory. Few localization bureaus had XML installed. It was too expensive and too complicated.
  • Content re-use: we did this quite a bit. With a few tags, we could output the same file in personalized versions to various OEMs, staff, etc.
  • Connection to applications: This is the really different thing about XML. At other companies, I ran tech pubs basically as a standalone team; we had our own tools, etc. But with XML, the file is just another engineering file. We used CVS to check out files from the engineering server, edited them with Epic, and checked them back into the engineering server with CVS. At night, the build ran and the XML converters processed the files into the various formats: HTML, PDF, WinHelp, etc. These were then sent to various clients, internal users, and so on. XML created versions for various OEMs; each got its own version with its logo and corporate information. XML automated the production and distribution of documentation. Everyone had the “release-of-the-day”, or as someone called it, a rolling release. With print, you had a two-to-three-month lead time; with XML, you can output as often as you like: daily outputs of docs via wireless onto 300 million cell phones and PDAs.
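
The per-OEM versions can be sketched roughly like this in Python. This is an illustration under assumptions, not our actual build: the `oem` attribute, the vendor names, and the `build_for` function are all made up, and the real work was done by the XML converters in the nightly build.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical source file. The "oem" attribute and the vendor names are
# invented for this sketch; they are not what our files actually used.
SOURCE = """<chapter>
  <title>Getting support</title>
  <para>Restart the device before you call support.</para>
  <para oem="acme">Call Acme support at extension 100.</para>
  <para oem="globex">Call Globex support at extension 200.</para>
</chapter>"""


def build_for(root, oem):
    """Return a copy of the tree without elements tagged for other OEMs."""
    variant = copy.deepcopy(root)
    # Snapshot the element lists so that removing children does not disturb iteration.
    for parent in list(variant.iter()):
        for child in list(parent):
            tagged = child.get("oem")
            if tagged is not None and tagged != oem:
                parent.remove(child)
    return variant


root = ET.fromstring(SOURCE)
for oem in ("acme", "globex"):
    variant = build_for(root, oem)
    print(oem, [p.text for p in variant.findall("para")])
```

Untagged content goes to everyone; tagged content goes only to the matching OEM. A nightly job would run something like this once per OEM and hand each result to the format converters.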

What about costs, ROI (return on investment), writer productivity, and so on?

  • You never hear about this in a sales presentation because they don’t want to scare you. The Arbortext Epic XML authoring environment cost around $50,000. (We did the server installation ourselves and saved about $10,000.)
    XML also requires a number of highly skilled people in the process. An XML solution includes XML, XSL (the layout language), FOS (the converters for various formats), servers, and so on. It took an XML-experienced fellow more than two months to get the setup to actually work, and that was with generous help from several engineers. Much of the work had to be done in-house. Arbortext was not helpful.
  • As for productivity: well… as a text editor, Epic was primitive, poorly documented, and poorly supported. It was little better than Notepad. Many essential writer’s tools were missing. Arbortext got away with this because they had little competition. (This was the state of Epic in 2001. It may have changed.)
    As XML becomes more widespread, better tools may be written and the price may drop.

What is the experience with semantic taxonomies? Part of the power of SGML/XML is supposed to be content semantics.

  • XML stands for eXtensible Markup Language. In theory, you can make up your own tags. But that’s just theory. In reality, every industry council (aerospace, pharmaceuticals, trucking, etc.) has XML committees and each one has published its own collection of XML tags, which are specific to that industry.
  • For computer documentation, we used the DocBook set of tags. There are some 300 tags, but you’ll never use 80% of them. In practice, we used about 20-30 tags, and of these, I used perhaps 15 for 98% of my docs. It’s like HTML, where you use just a few tags for nearly everything.
  • Content semantics was mostly theory at our startup. We used XML mostly for layout tagging. The future of this is in price or item tagging, but that would be an issue for pricing and catalog management departments. It’s not really a matter for computer documentation.
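
If you’re curious how few tags your own files use, a quick count is easy. Here is a small Python sketch; the sample text is made up (though the element names are real DocBook ones) and `tag_counts` is just a name I picked for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# A small made-up sample using real DocBook element names.
SAMPLE = """<chapter>
  <title>Installing</title>
  <section>
    <title>Before you begin</title>
    <para>Check that you have <command>tar</command> installed.</para>
    <itemizedlist>
      <listitem><para>Back up your files.</para></listitem>
      <listitem><para>Close other programs.</para></listitem>
    </itemizedlist>
  </section>
</chapter>"""


def tag_counts(xml_text):
    """Count how often each element name occurs in the document."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())


counts = tag_counts(SAMPLE)
for tag, count in counts.most_common():
    print(f"{tag}: {count}")
```

Run over a whole doc set, a count like this shows the handful of tags that carry nearly everything.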

What about control over layout on various platforms?

  • The text “flows into the container,” which means that section headings can appear at the bottom of a page, an image can be separated from its descriptive paragraph, and so on. In theory, this can be controlled, but we couldn’t figure out how to do it, and since we didn’t print our docs, we didn’t really bother. Arbortext themselves couldn’t figure out how to do it. Their documentation was a layout mess.

What does XML mean to writers?

For the last 15 years, tech pubs for me was a draft-to-print process. I did the text, the layout, the production, worked with print vendors, supervised the book printing, and so on. The value of technical writers was in the collection of skills in creating printed books.

XML sales critters will say “XML allows you to be your Inner Writer!” But folks, to be honest here, writers aren’t highly paid for their Inner Writer. They’re highly paid for their Inner Graphics Dude, their Inner Print Vendor Relations Person, and their Inner Nerd who can use 10 different tools and two dozen undocumented kludges to produce a printed manual.

Some people may say that a writer’s real skill is subject matter expertise (SME) in areas such as SQL, telecoms, VoIP, or whatever. Yes, there’s a value in that, but the real value is the production skills: they can produce and deliver books.

Look, to put it differently, before XML, writers were the chef at a little French restaurant where they could whip up hors d’oeuvres, make turtle soup from scratch, prepare the seared Atlantic salmon, select the wines, follow up with a perfect handmade crème brûlée, train the waiters, set the flower arrangements, pick the music for the violins, and so on. We did it all and no one else could do it.

With XML, we chop lettuce, dump it into the Happy Clown, and it spits out a burger. We don’t even get to make fries.

At the moment, very few people know how to use XML, and there’s demand for it, but from a writer’s point of view, it’s easier to use than HTML, and we all know how well THAT is paid. For the writer, XML requires very few skills, and that may lead to a drop in income. Small shops may continue to ignore XML, either because they are reluctant to spend $50,000, or they don’t need to output to ten devices simultaneously. But I’d guess that most mid-sized and larger companies will switch to XML in the next few years because it allows better/cheaper/faster distribution of content.

(Update: Feb. 2005. I reviewed this document. In 2001, we didn’t realize the impact of offshoring. In 2002 and 2003, tens of thousands of jobs were offshored. Now, in early 2005, I would not advise anyone to go into technical writing as a profession. Writers can be hired in India or Malaysia for $2 per hour. It would not make much sense for a company in the USA to install an XML system and hire US writers if they can get the same results at a fraction of the cost in Asia. If your company wants an XML solution, I suggest you first talk with companies that provide offshoring services. –andreas)

All of this is my experience. Your mileage may vary. If you’re using XML, write to me about XML.
