AODC 2009 day 1 – Structured authoring

This week I’m attending the 2009 Australasian Online Documentation and Content Conference (AODC) in Melbourne. Today, the first day of the conference, the speakers have already given us a lot to think about.

Here are some notes I took from the session on structured authoring by Dave Gash of HyperTrain dot Com. I hope these notes are useful to people who couldn’t be at the conference this year. The AODC organisers will also publish the session slides and supplementary material once the conference is over.

A Painless Introduction to Structured Authoring

In this session Dave Gash discussed the benefits and pitfalls of structured authoring as opposed to the more traditional linear narrative format. He also touched briefly on DITA as the prime technology for a structured authoring environment.

When introducing Dave, Tony Self said, “Dave is known for his shirts, and he hasn’t let us down today.” Dave was wearing a black shirt imprinted with colourful guitars of all sorts. This set the tone for Dave’s live-wire style of presentation. He moved around the stage, chatting to and taunting people in the audience, while at the same time conveying lots of information.

Dave’s session covered the following points: “Look at paradigm shift from linear narratives to structured authoring. Compare and contrast the two methods. Explore structured authoring methodology. Look at some code examples.” He laughed, “This is the one and only time I’m going to use that phrase ‘paradigm shift’. I hate that phrase but it’s appropriate in this case.”

Here’s a good definition of structured authoring:

“Structured writing is a set of publishing workflow practices that lets you define and enforce consistent information structure and facilitates content development, sharing and reuse.”

We need to differentiate between structured authoring (the methodology we use to organise and structure information) and XML (the technology we use to implement the plan).

These are some of the problems Dave mentioned concerning the linear narrative format:

  • There’s too much repetition, rewriting and local formatting, and not enough content re-use.
  • Authors spend too much time doing things that are not writing, but are required for production of the documentation.

He listed the advantages that structured authoring can provide, including:

  • Better control over content versions.
  • More efficient use of a writer’s time.
  • Easier sharing of content across different media and different formats.
  • And more. Take a look at Dave’s presentation slides when the AODC makes them available.

Contrasting linear versus structured authoring, in linear content authoring:

  • Content is authored in a WYSIWYG editing tool.
  • Loose standards allow tweaking.
  • Structure depends largely on output medium.
  • Content and format are intertwined.
  • Re-use of content is done via copy and paste.

On the other hand, in structured authoring:

  • Content is authored in a WYSIOO (What You See is One Option) editing tool.
  • Strict standards do not allow tweaking.
  • Structure is completely independent of output.
  • Authors cannot determine final appearance: content and format are separate.
  • Re-use of content is accomplished via cross-references.

To sum it up: The goal is valid content (structured) instead of attractive output (linear).

There are a number of benefits on the corporate and management side, such as improved document quality, content re-use, higher author productivity and more flexibility for varied output devices. In a nutshell: savings in personnel, software, support and maintenance costs.

For the writer/author, there are benefits and challenges. On the one hand, there are new tools and new procedures to learn, new job responsibilities that are narrower and less flexible than before, and no control over formatting and publishing of the documentation. On the positive side, the change will keep your skill set current. Content is still king and you get more time to write it. No more copy and paste, and no more chasing around to find all the repetitions if you have to change something.

Dave also said you can throw away your responsibilities for formatting and publishing the documentation. As a content developer, you are responsible for the content only. So if something breaks, it’s not your problem. I have to admit that I disagree with this point 😉 I guess that I enjoy the “holistic” approach to document production, where the writer does have a say in the presentation. So if my team were to move towards structured authoring, I’d definitely train myself up on the structure and publishing side as well as the content development skills. But I do recognise the need for content to be able to squeeze into all sorts of output formats and media. So I see the benefits of content being format- and medium-agnostic. I guess I’d fit into a structured authoring environment quite easily, once I started getting my sense of satisfaction from producing perfect content in an efficient environment.

Next, Dave showed us some code examples of HTML (presentational markup) as opposed to XML (semantic markup). He explained how, in linear authoring, the typography of a document defines its information structure, while in structured authoring the tags define the information structure.
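
I didn’t copy down Dave’s code, so here is a minimal sketch of my own in Python to illustrate the difference between markup that describes appearance and markup that describes meaning. Both snippets, and all the element names in the XML one (feature-list, feature, requirement), are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Markup that describes appearance: bold text, bullet lists.
html_source = """
<div>
  <b>Features</b>
  <ul>
    <li>Spell checker</li>
    <li>PDF export</li>
  </ul>
  <b>Requirements</b>
  <ul>
    <li>512 MB of memory</li>
  </ul>
</div>
""".strip()

# Markup that describes meaning. The element names are hypothetical.
xml_source = """
<product>
  <feature-list>
    <feature>Spell checker</feature>
    <feature>PDF export</feature>
  </feature-list>
  <requirements>
    <requirement>512 MB of memory</requirement>
  </requirements>
</product>
""".strip()

# In the first version, every list item looks the same to a program.
html_items = [li.text for li in ET.fromstring(html_source).iter("li")]

# In the second version, we can ask for the features by name.
features = [f.text for f in ET.fromstring(xml_source).iter("feature")]

print(html_items)  # features and requirements mixed together
print(features)    # only the features
```

A human reader can tell the features from the requirements in the first snippet because of the bold headings. A program can only guess, whereas in the second snippet the tags say so directly.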

Dave gave us a real-world example: A technical writer writes a feature list, showing the set of features provided by a particular piece of software. This feature list might end up as an introduction to a user guide, as marketing literature, in a press release, in a software review site fact sheet, etc. It looks different in each output format, but it’s written only once.
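
The feature list idea can be sketched in a few lines of Python. This is my own illustration under stated assumptions, not something Dave showed: the product name, the element names and the two rendering functions are all hypothetical, and a real publishing chain would use proper stylesheets rather than string formatting.

```python
import xml.etree.ElementTree as ET
from html import escape

# The single source, written once. Element names are hypothetical.
source = """
<feature-list product="ExampleApp">
  <feature>spell checking</feature>
  <feature>PDF export</feature>
  <feature>full-text search</feature>
</feature-list>
""".strip()

def to_html(root):
    """Render the feature list as an HTML bullet list, e.g. for a user guide."""
    items = "".join(f"<li>{escape(f.text)}</li>" for f in root.iter("feature"))
    return f"<h2>{escape(root.get('product'))} features</h2><ul>{items}</ul>"

def to_sentence(root):
    """Render the same list as one sentence, e.g. for a press release.

    Assumes the list holds at least two features.
    """
    names = [f.text for f in root.iter("feature")]
    listed = ", ".join(names[:-1]) + " and " + names[-1]
    return f"{root.get('product')} offers {listed}."

root = ET.fromstring(source)
print(to_html(root))
print(to_sentence(root))
```

Adding a fourth feature to the source changes both outputs, and the writer never touches the formatting code.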

Now Dave dived into the technical details of XML, the technology most used to implement structured authoring. He explained what a schema is and showed us some schema code. The schema controls what the editor allows you to enter. The editor checks against the schema to validate the content you enter.
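
To give a feel for what “checking against a schema” means, here is a toy sketch in Python. It is only a stand-in: real schemas (DTD, XML Schema, RELAX NG) are far richer, and real editors use proper validators. The SCHEMA dictionary, the validate function and the element names are all my own assumptions for the illustration.

```python
import xml.etree.ElementTree as ET

# A toy "schema": for each element, the child elements it may contain.
# The element names are hypothetical.
SCHEMA = {
    "task": {"title", "step"},
    "title": set(),
    "step": set(),
}

def validate(element, schema=SCHEMA):
    """Return a list of error messages; an empty list means the content is valid."""
    if element.tag not in schema:
        return [f"unknown element <{element.tag}>"]
    errors = []
    for child in element:
        if child.tag not in schema[element.tag]:
            errors.append(f"<{child.tag}> is not allowed inside <{element.tag}>")
        else:
            errors.extend(validate(child, schema))
    return errors

good = ET.fromstring("<task><title>Install</title><step>Run setup</step></task>")
bad = ET.fromstring("<task><title>Install</title><b>Run setup</b></task>")

print(validate(good))  # no errors
print(validate(bad))   # the <b> element is rejected
```

The second document is rejected because the writer tried to format the step as bold text instead of marking it up as a step. That is the kind of “tweaking” a schema-aware editor prevents.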

Dave showed us the workflow when using a linear narrative, and contrasted it with a structured authoring workflow. In linear authoring, the writer writes the content into a single tool, maybe interacting with a CMS. Then the writer instructs the tool to publish the content into different formats. In a structured authoring workflow, the writer writes the content into the editing tool, very likely working with a CMS. Then the information architect uses a structure tool and a style tool to structure the content. Then the publisher uses a tool to combine the style, structure and content to produce the different output formats. So there are more clearly defined roles and job responsibilities.

In structured authoring, information chunking becomes vital. Technical writers already chunk information, i.e. divide it into logical pieces. For structured authoring:

  • Think in smaller chunks.
  • Consider where the information may be re-used.
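
Here is a minimal Python sketch of how a small chunk can be re-used by reference instead of by copy and paste. It is an assumption-laden illustration of the general idea: the include element, the ref attribute and the resolve function are hypothetical and do not reproduce the syntax of any particular standard or tool.

```python
import copy
import xml.etree.ElementTree as ET

# A small library of reusable chunks, each with an id.
library = ET.fromstring("""
<library>
  <note id="backup-warning">Back up your data before you continue.</note>
</library>
""".strip())

# Two topics that both need the warning. They point at it instead of copying it.
topics = ET.fromstring("""
<topics>
  <topic id="upgrade"><title>Upgrading</title><include ref="backup-warning"/></topic>
  <topic id="restore"><title>Restoring</title><include ref="backup-warning"/></topic>
</topics>
""".strip())

def resolve(topics, library):
    """Replace each <include ref="..."/> with a copy of the referenced chunk.

    Raises KeyError if a reference points at a chunk that does not exist.
    """
    chunks = {el.get("id"): el for el in library}
    # Take a snapshot of the elements first, because we modify the tree as we go.
    for parent in list(topics.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "include":
                parent[index] = copy.deepcopy(chunks[child.get("ref")])
    return topics

resolved = resolve(topics, library)
for topic in resolved.iter("topic"):
    print(topic.findtext("title"), "->", topic.findtext("note"))
```

If the wording of the warning changes, it changes in one place in the library and both topics pick it up the next time they are published.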

Now Dave touched on DITA, the most successful XML standard for structured authoring. The DITA Open Toolkit is free, and there are a number of free plugins to extend the toolkit. For example, there’s the GUI publishing interface, WinANT, developed by Tony Self.

There are a number of authoring/publishing tools. Dave’s favourites are:

  • JustSystems XMetaL
  • PixWare XMLMind Editor
  • OxygenXML <oXygen/>

Other well-known tools are starting to support DITA to a certain extent:

  • FrameMaker
  • RoboHelp
  • Flare
  • Author-it

The question time at the end of the session prompted a number of interesting comments, including these:

  • Emily observed that the publishing and structuring side of document production may not be a full-time job, especially in smaller shops. So it may be quite possible for one person to fill all three roles — content author, information architect and publisher.
  • How long does it take to move to a structured authoring environment? The answer depends on a lot of things, such as the amount of legacy documentation that needs converting and the level of enthusiasm in the existing team. There are a number of places in the States that have taken a year or more to get the first project out the door, and that’s without converting the existing documentation.
  • With content re-use over multiple output formats, is there a danger of having the different output formats looking too much the same? Answer: Yes, there is a danger. It’s up to the formatting people to decide whether they want the content to look the same or different in the different locations.

Thank you Dave for a great introduction to structured authoring.

About Sarah Maddox

Technical writer, author and blogger in Sydney

Posted on 20 May 2009, in AODC, open standards, technical writing, xml. 3 Comments.

  1. I see a few questionable statements among your notes:

    * Dave doesn’t mention a technology for structured writing that doesn’t use XML.

    * Structured writing can achieve better control over content versions. Counterpoint: Content version control is independent of structured writing methodology.

    * One advantage of structured authoring is easier sharing of content across different media and different formats. Counterpoint: This is possible only after the appropriate content analysis has been performed in order to ensure that acceptable deliverables can be produced for each context of use. That is, significant work must be done up front; structured authoring does not produce this benefit by itself.

    * In structured authoring, re-use of content is accomplished via cross-references. Counterpoint: This is a somewhat ambiguous statement and could refer to reuse inside a content object or reuse for building deliverables from entire content objects. The former is problematic and the requirements for reuse inside content must be analyzed in detail. The latter is not strictly “cross-referencing” but rather assembling deliverables from content objects.

    * Structured authoring produces improved document quality. Counterpoint: If a document deliverable is defined as a collection of potentially reusable content objects, then the quality of the entire deliverable is not only a matter of the quality of the individual content objects but also of how the content objects are put together and how their combined use (content relationships) is presented coherently to the reader. So merely using structured authoring for individual content objects is not sufficient to produce a quality document deliverable.

    * For structured authoring, writers must think in terms of smaller chunks. Counterpoint: Smaller than what? Smaller than chapters? Smaller than major topics within a chapter? The challenge is that many topics are intended to be used within a certain context within an application (or user task). Separating a content object from its context of use becomes problematic as a writing exercise. Dave doesn’t acknowledge the challenges of designing and implementing more or less context-insensitive writing of content objects. This is still a very emerging area of practice for technical communicators.

  2. Hallo Paul

    It’s great to hear from someone who has so much input into the pros and cons of structured authoring. Dave’s talk was an introductory session, so it wasn’t really the opportunity for such in-depth analysis. I’m not in a structured-authoring shop at the moment, but I know that it’s a hotly debated topic. There was an interesting Q&A time after Dave’s session too. A couple of other conference presenters covered related topics such as single-source publishing and content reuse.

    Thank you for your detailed point-by-point comments. They are really interesting. It would be great if other people who _are_ writing in a strictly structured environment could respond to your points too.

    Looking at your first point, would you be able to tell us of some non-XML structured writing that you’ve been involved with?

    From what I’ve heard, your point that “significant work must be done up front” is spot on. In my environment, we employ content reuse to a limited extent. And that’s not even structured, i.e. the content type is not semantically tagged. Even so, I already see the need for some sort of management tool. Otherwise, how do you know what content you already have available for re-use? Matt Armstrong from Author-it gave us a tantalising insight into “authoring memory” at last year’s AODC conference. I blogged about it too.

    The sacrifice of context is a big consideration when authoring for content re-use and/or single-source publishing. I guess you have to weigh up the requirements of your specific environment.

    Thank you again for a thought-provoking comment. Any more counterpoints from anyone?

    Cheers — Sarah

  3. Hi Sarah,

    Using the DocBook DTD is an example of structured writing that is distinct from DITA and that builds upon either an SGML or XML technology base. At the time of its invention, DocBook was intended to support production of more or less traditional documents, but a DocBook document can be assembled from content contained in distinct files, so reuse of those pieces in more than one document can take place.

    Structured FrameMaker is another tool for composing and demonstrating compliance with an SGML DTD. One can even use Unstructured FrameMaker’s features to mark up document content semantically, if one chooses to conform to a set of target elements that are defined to be FrameMaker compatible.

    At this point in history, I believe that content reuse promoters don’t have enough evidence from successful projects to make the claim that repurposing content, not just reusing content, can be achieved cost-effectively on a large scale. Again, the key factor is the context of the content’s use and meaning. To make an item of content truly reusable, all of the information that describes that content’s context of use must be present in, or proximal to, it. I haven’t yet seen this issue sufficiently analyzed or solved by any practitioner in our field.

    Paul K. Sholar
    @bkwdgreencomet [Twitter]
