AODC 2009 day 1 – Structured authoring

This week I’m attending the 2009 Australasian Online Documentation and Content Conference (AODC) in Melbourne. Today, the first day of the conference, the speakers have already given us a lot to think about.

Here are some notes I took from the session on structured authoring by Dave Gash of HyperTrain dot Com. I hope these notes are useful to people who couldn’t be at the conference this year. The AODC organisers will also publish the session slides and supplementary material once the conference is over.

A Painless Introduction to Structured Authoring

In this session Dave Gash discussed the benefits and pitfalls of structured authoring as opposed to the more traditional linear narrative format. He also touched briefly on DITA as the prime technology for a structured authoring environment.

When introducing Dave, Tony Self said, “Dave is known for his shirts, and he hasn’t let us down today.” Dave was wearing a black shirt imprinted with colourful guitars of all sorts. This set the tone for Dave’s live-wire style of presentation. He moved around the stage, chatting to and taunting people in the audience, while at the same time conveying lots of information.

Dave’s session covered the following points: “Look at paradigm shift from linear narratives to structured authoring. Compare and contrast the two methods. Explore structured authoring methodology. Look at some code examples.” He laughed, “This is the one and only time I’m going to use that phrase ‘paradigm shift’. I hate that phrase but it’s appropriate in this case.”

Here’s a good definition of structured authoring:

“Structured writing is a set of publishing workflow practices that lets you define and enforce consistent information structure and facilitates content development, sharing and reuse.”

We need to differentiate between structured authoring (the methodology we use to organise and structure information) and XML (the technology we use to implement the plan).

These are some of the problems Dave mentioned concerning the linear narrative format:

  • There’s too much repetition, rewriting and local formatting, and not enough content re-use.
  • Authors spend too much time doing things that are not writing, but are required for production of the documentation.

He listed the advantages that structured authoring can provide, including:

  • Better control over content versions.
  • More efficient use of a writer’s time.
  • Easier sharing of content across different media and different formats.
  • And more. Take a look at Dave’s presentation slides when the AODC makes them available.

Contrasting linear versus structured authoring, in linear content authoring:

  • Content is authored in a WYSIWYG editing tool.
  • Loose standards allow tweaking.
  • Structure depends largely on output medium.
  • Content and format are intertwined.
  • Re-use of content is done via copy and paste.

On the other hand, in structured authoring:

  • Content is authored in a WYSIOO (What You See is One Option) editing tool.
  • Strict standards do not allow tweaking.
  • Structure is completely independent of output.
  • Authors cannot determine final appearance: content and format are separate.
  • Re-use of content is accomplished via cross-references.

To sum it up: The goal is valid content (structured) instead of attractive output (linear).
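To make the last bullet concrete, here is a minimal sketch in Python of re-use by reference instead of copy and paste. It is my own illustration, not code from Dave's session: the chunk id, the element names and the `<include ref="..."/>` element are all invented for the example, and DITA has its own re-use mechanisms that this does not try to reproduce.

```python
import xml.etree.ElementTree as ET

# A small library of reusable chunks, keyed by id. The id and the
# <include ref="..."/> element are hypothetical, made up for this sketch.
CHUNKS = {
    "warn-backup": "<note>Back up your data before you continue.</note>",
}

def resolve_includes(element):
    """Replace every <include ref="..."/> child with a copy of the referenced chunk."""
    for index, child in enumerate(list(element)):
        if child.tag == "include":
            chunk = ET.fromstring(CHUNKS[child.get("ref")])
            chunk.tail = child.tail  # keep any text that followed the reference
            element.remove(child)
            element.insert(index, chunk)
        else:
            resolve_includes(child)
    return element

# Two topics point at the same warning instead of each carrying a pasted copy.
install = ET.fromstring('<topic><title>Install</title><include ref="warn-backup"/></topic>')
upgrade = ET.fromstring('<topic><title>Upgrade</title><include ref="warn-backup"/></topic>')

install_xml = ET.tostring(resolve_includes(install), encoding="unicode")
upgrade_xml = ET.tostring(resolve_includes(upgrade), encoding="unicode")
```

If the wording of the warning changes, it changes in one place (`CHUNKS`) and both topics pick it up the next time they are resolved.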

There are a number of benefits on the corporate and management side, such as improved document quality, content re-use, higher author productivity and more flexibility for varied output devices. In a nutshell: savings in personnel, software, support and maintenance costs.

For the writer/author, there are benefits and challenges. On the one hand, there are new tools and new procedures to learn, new job responsibilities that are narrower and less flexible than before, and no control over formatting and publishing of the documentation. On the positive side, the change will keep your skill set current. Content is still king and you get more time to write it. No more copy and paste, and no more chasing around to find all the repetitions if you have to change something.

Dave also said you can throw away your responsibilities for formatting and publishing the documentation. As content developer, you are responsible for the content only. So if something breaks, it’s not your problem. I have to admit that I disagree with this point 😉 I guess that I enjoy the “holistic” approach to document production, where the writer does have a say in the presentation. So if my team were to move towards structured authoring, I’d definitely train myself up on the structure and publishing side as well as the content development skills. But I do recognise the need for content to be able to squeeze into all sorts of output formats and media. So I see the benefits of content being format- and medium-agnostic. I guess I’d fit into a structured authoring environment quite easily, once I started getting my sense of satisfaction from producing perfect content in an efficient environment.

Next, Dave showed us some code examples of HTML (presentational markup) as opposed to XML (semantic markup). He explained how, in linear authoring, the typography of a document defines its information structure, while in structured authoring the tags define the information structure.

Dave gave us a real-world example: A technical writer writes a feature list, showing the set of features provided by a particular piece of software. This feature list might end up as an introduction to a user guide, as marketing literature, in a press release, in a software review site fact sheet, etc. It looks different in each output format, but it’s written only once.
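Here is a rough sketch of that idea in Python, using only the standard library. It is an illustration I put together, not Dave's code: the product name, the element names (`featurelist`, `feature`, `name`, `summary`) and the two output functions are assumptions made for the example. The point is simply that the feature list is written once, in semantic markup, and each output decides for itself how it should look.

```python
import xml.etree.ElementTree as ET
from html import escape

# Semantic source: the tags say what each item *is*, not how it should look.
# Product and element names are hypothetical, invented for this sketch.
SOURCE = """
<featurelist product="ExampleApp">
  <feature><name>Search</name><summary>Find any page by keyword.</summary></feature>
  <feature><name>Export</name><summary>Save pages as PDF.</summary></feature>
</featurelist>
""".strip()

def to_html(root):
    """Render the feature list as an HTML fragment, e.g. for a user guide."""
    items = "".join(
        f"<li><strong>{escape(f.findtext('name'))}</strong>: "
        f"{escape(f.findtext('summary'))}</li>"
        for f in root.findall("feature")
    )
    return f"<h2>{escape(root.get('product'))} features</h2><ul>{items}</ul>"

def to_text(root):
    """Render the same list as plain text, e.g. for a press release."""
    lines = [f"{root.get('product')} features"]
    lines += [
        f"- {f.findtext('name')}: {f.findtext('summary')}"
        for f in root.findall("feature")
    ]
    return "\n".join(lines)

root = ET.fromstring(SOURCE)
html_output = to_html(root)   # one source, first output format
text_output = to_text(root)   # same source, second output format
```

Adding a third output, say a fact sheet for a review site, would mean writing another small function. The content itself would not change.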

Now Dave dived into the technical details of XML, the technology most used to implement structured authoring. He explained what a schema is and showed us some schema code. The schema controls what the editor allows you to enter. The editor checks against the schema to validate the content you enter.
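As a very simplified picture of what “checking against the schema” means, here is a toy validator in Python. This is my own sketch under stated assumptions: the `SCHEMA` dictionary and the element names are invented, and a real schema language (DTD, XML Schema, RELAX NG) can express far more than “allowed” and “required” children.

```python
import xml.etree.ElementTree as ET

# A toy "schema": for each element, which children are allowed and which are required.
# The rules and element names are hypothetical, made up for this sketch.
SCHEMA = {
    "task": {"allowed": {"title", "step"}, "required": {"title"}},
    "title": {"allowed": set(), "required": set()},
    "step": {"allowed": set(), "required": set()},
}

def validate(element, path=""):
    """Return a list of human-readable problems found in the element tree."""
    here = f"{path}/{element.tag}"
    rule = SCHEMA.get(element.tag)
    if rule is None:
        return [f"{here}: unknown element"]
    problems = []
    child_tags = [child.tag for child in element]
    # Report children the rule does not permit.
    for tag in child_tags:
        if tag not in rule["allowed"]:
            problems.append(f"{here}: <{tag}> is not allowed here")
    # Report required children that are absent.
    for tag in sorted(rule["required"] - set(child_tags)):
        problems.append(f"{here}: missing required <{tag}>")
    # Descend into the children that were allowed.
    for child in element:
        if child.tag in rule["allowed"]:
            problems.extend(validate(child, here))
    return problems

good = ET.fromstring("<task><title>Install</title><step>Run setup.</step></task>")
bad = ET.fromstring("<task><step>Run setup.</step><b>Done!</b></task>")

good_problems = validate(good)  # no problems expected
bad_problems = validate(bad)    # a stray <b> and a missing <title>
```

The second document shows the “no tweaking” rule in action: the author tried to slip in a bold tag and left out the title, and the check reports both.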

Dave showed us the workflow when using a linear narrative, and contrasted it with a structured authoring workflow. In linear authoring, the writer writes the content into a single tool, maybe interacting with a CMS. Then the writer instructs the tool to publish the content into different formats. In a structured authoring workflow, the writer writes the content into the editing tool, very likely working with a CMS. Then the information architect uses a structure tool and a style tool to structure the content. Then the publisher uses a tool to combine the style, structure and content to produce the different output formats. So there are more clearly defined roles and job responsibilities.

In structured authoring, information chunking becomes vital. Technical writers already chunk information, i.e. divide it into logical pieces. For structured authoring:

  • Think in smaller chunks.
  • Consider where the information may be re-used.

Now Dave touched on DITA, the most successful XML standard for structured authoring. The DITA Open Toolkit is free, and there are a number of free plugins to extend the toolkit. For example, there’s the GUI publishing interface, WinANT, developed by Tony Self.

There are a number of authoring/publishing tools. Dave’s favourites are:

  • JustSystems XMetaL
  • Pixware XMLmind XML Editor
  • OxygenXML <oXygen/>

Other well-known tools are starting to support DITA to a certain extent:

  • FrameMaker
  • RoboHelp
  • Flare
  • Author-IT

The question time at the end of the session prompted a number of interesting comments, including these:

  • Emily observed that the publishing and structuring side of document production may not be a full-time job, especially in smaller shops. So it may be quite possible for one person to fill all three roles — content author, information architect and publisher.
  • How long does it take to move to a structured authoring environment? The answer depends on a lot of things, such as the amount of legacy documentation that needs converting and the level of enthusiasm in the existing team. There are a number of places in the States that have taken a year or more to get the first project out the door, and that’s without converting the existing documentation.
  • With content re-use over multiple output formats, is there a danger of having the different output formats looking too much the same? Answer: Yes, there is a danger. It’s up to the formatting people to decide whether they want the content to look the same or different in the different locations.

Thank you Dave for a great introduction to structured authoring.

AODC – separating content, structure, format and behaviour

This week I’m attending the Australasian Online Documentation and Content Conference (AODC) on the Gold Coast in Queensland, Australia. With his inimitable flair and style, Dave Gash presented a session this morning entitled “The Search for the UA Grail: True Separation of Content, Structure, Format and Behaviour”.

Dave is the owner of HyperTrain dot Com, specialising in training and consulting for hypertext developers. Today he told us what’s wrong with the way we traditionally do things, what’s wrong with the conventional wisdom on how we might improve our way of working, and what’s a better way.

What’s wrong with the traditional way we do things

The basic problem is that we write our content, e.g. a web page, and tweak elements of it on the fly. For example, we might make some text bold, or colour other text, or whatever. The result is spaghetti code — difficult to maintain and share.

What’s wrong with the conventional wisdom on improving the above situation

People usually say we should “separate format from content”. But what is “format”? That term is too vague. And the phrase implies that everything that’s not “content” is “format”. Wrong.

The better way

We should separate our document into four components instead of two:

  • Content (which you might realise via XML)
  • Structure (XSLT)
  • Format (CSS)
  • Behaviour (JavaScript)

What we’re aiming for:

  • Maintainability — you can change one of the above four components without breaking the others.
  • Re-usability — you can re-use the same bit of JavaScript, for example, in other documents.
  • Separation of skill sets — different people can work on the component they know best and enjoy most.
  • Simplified updating of content — content is likely to be the component you update most often.

How to do it

Dave demonstrated the procedure we would follow in order to separate a document into the above four components. There are five basic steps. Dave walked us through the details of each step, using code examples of CSS, JavaScript, XML and XSLT. In summary, the steps are:

  1. Identify all JavaScript and move it to an external JS file.
  2. Identify JavaScript that could be better done in CSS. Examples are “onmouseover” and “onmouseout” event handlers that change the style of the text item, and image swaps. Use the CSS “hover” pseudo-class instead.
    • Dave’s tip: You don’t have to specifically code the “I’m through hovering” handler because it’s implicit in the pseudo-class.
  3. Move all CSS styles to an external file. Convert local formatting to classes too.
    • Dave’s tip: If the boss says “Change the list spacing in all lists on all pages”, it’s in one spot — change it and take the rest of the day off.
  4. Add semantic markup to the content, using XML.
  5. Now it’s time for some XSLT. Identify the output HTML you want, then write the XSL transforms to produce it.
    • Write small, individual templates to create the HTML for each specific XML tag. Then use the “magic” <xsl:apply-templates/> element to pull it all together. This nests the processing of the templates, so that the transforms will just keep happening for each XML element, hierarchically down and back up the tree, until they’re all done.

The XSLT generates the HTML and links in the CSS and JavaScript.
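For readers who haven't met XSLT, the “one small template per tag, then apply-templates” pattern from step 5 can be imitated in a few lines of Python. To be clear, this is an analogy I wrote for these notes, not XSLT and not Dave's code: the `template` decorator, the element names and the CSS class names are all invented for the sketch.

```python
import xml.etree.ElementTree as ET
from html import escape

TEMPLATES = {}  # maps an XML tag name to the function that renders it

def template(tag):
    """Register a function as the template for one XML tag."""
    def register(func):
        TEMPLATES[tag] = func
        return func
    return register

def apply_templates(element):
    """Process text and child elements in document order, in the spirit of <xsl:apply-templates/>."""
    parts = [escape(element.text or "")]
    for child in element:
        parts.append(transform(child))            # recurse down into the child
        parts.append(escape(child.tail or ""))    # text that follows the child
    return "".join(parts)

def transform(element):
    """Run the template for this tag; tags without a template just pass their content through."""
    handler = TEMPLATES.get(element.tag, apply_templates)
    return handler(element)

@template("topic")
def topic(el):
    return f'<div class="topic">{apply_templates(el)}</div>'

@template("title")
def title(el):
    return f"<h1>{apply_templates(el)}</h1>"

@template("para")
def para(el):
    return f"<p>{apply_templates(el)}</p>"

@template("term")
def term(el):
    return f'<em class="term">{apply_templates(el)}</em>'

doc = ET.fromstring(
    "<topic><title>Backups</title>"
    "<para>Run a <term>full backup</term> weekly.</para></topic>"
)
html_output = transform(doc)
```

Each template knows about exactly one tag and hands everything inside it back to `apply_templates`, which is why the processing walks down the tree and back up without any template needing to know the overall document shape.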

Dave has made the code available on the “Downloads” tab of the HyperTrain dot Com web site.

A recommended editing tool: EditPlus.

Thank you for a great session, Dave. And a special thanks for changing “behavior” to “behaviour” throughout your presentation, just so that we Ozzies felt comfortable 😉
