Monthly Archives: May 2010
AODC Day 3: Introduction to DITA Conditional Publishing
A couple of weeks ago I attended AODC 2010, the Australasian Online Documentation and Content conference. We were in Darwin, in Australia’s “Top End”. This post is my summary of one of the sessions at the conference and is derived from my notes taken during the presentation. All the credit goes to Dave Gash, the presenter. Any mistakes or omissions are my own.
This year’s AODC included a number of useful sessions on DITA, the Darwin Information Typing Architecture. I’ve already written about Tony Self’s session, an update on DITA features and tools, and about Suchi Govindarajan’s session, an introduction to DITA.
Dave Gash presented one of the more advanced DITA sessions, titled “Introduction to DITA Conditional Publishing”.
At the beginning of his talk, Dave made an announcement. He has presented in countries all over the world, many times, and he has never ever ever before done a presentation in shorts!

Introducing the session
To kick off, Dave answered the question, “Why do we care about conditional processing?” One of the tenets of DITA is re-use. You may have hundreds or even thousands of topics. In any single documentation set, you probably don’t want to publish every piece of the documentation every time.
Conditional processing is a way to determine which content is published at any one time.
Dave’s talk covered these subjects:
- A review of DITA topics, maps and publishing flow
- The use of metadata
- The mechanics of conditional processing
- Some examples
Metadata and the build process
Dave ran us through a quick review of the DITA build process and the concept of metadata. Metadata has many uses. Dave talked specifically about metadata for the control of content publication.
Metadata via attributes
There are a number of attributes available on most DITA elements. These are some of the attributes Dave discussed:
- audience – a group of intended readers
- product – the product name
- platform – the target platform
- rev – product version number
- otherprops – you can use this for other properties
Example:
<step audience="advanced">
Using metadata for conditional processing
Basically, you use the metadata to filter the content. For example, let’s assume you are writing the installation guide for a software application. You may store all the instructions for Linux, Windows and Mac OS in one file. When publishing, you can filter by operating system and produce separate output for each OS.
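As a rough sketch of the installation guide example, a single task topic might tag each OS-specific step with the platform attribute. The file name, topic id, step wording and attribute values here are my own assumptions, not taken from Dave's slides.

```xml
<!-- install.dita: hypothetical file; ids, values and wording are illustrative only -->
<task id="install_app">
  <title>Installing the application</title>
  <taskbody>
    <steps>
      <!-- No platform attribute: this step appears in every output -->
      <step><cmd>Download the installer for your operating system.</cmd></step>
      <!-- Each of these steps applies to one platform only -->
      <step platform="windows"><cmd>Double-click setup.exe.</cmd></step>
      <step platform="linux"><cmd>Run ./install.sh from a terminal.</cmd></step>
      <step platform="mac"><cmd>Drag the application into the Applications folder.</cmd></step>
    </steps>
  </taskbody>
</task>
```

A build that filters out the linux and mac values would leave only the first two steps in the published output.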
In general, you can put metadata in these 3 locations (layers):
- maps – metadata on the <map> element. You might use metadata at this layer to build a manual from similar topics for specific versions of a product.
- topics – metadata to select an entire topic. You might use metadata at this layer to build a documentation set for review by a specific person.
- elements – metadata on individual XML elements inside a topic. You might use this metadata to select steps that are relevant for beginners, as opposed to intermediate or advanced users.
Dave gave us some guidelines on how to decide which of the above layers to use under given circumstances.
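To make the layers a little more concrete, here is a small sketch of the first two. The file names, ids and attribute values are all invented for illustration. At the map layer, one common approach is to put the attribute on a topic reference inside the map, so that the whole topic comes or goes with that reference. At the topic layer, the attribute sits on the topic's root element.

```xml
<!-- Fragment 1, guide.ditamap (hypothetical): metadata at the map layer -->
<map>
  <title>Product guide</title>
  <topicref href="whats_new_v1.dita" product="version1"/>
  <topicref href="whats_new_v2.dita" product="version2"/>
  <!-- No attribute: included in every build -->
  <topicref href="getting_started.dita"/>
</map>

<!-- Fragment 2, open_issues.dita (hypothetical): metadata at the topic layer -->
<concept id="open_issues" audience="reviewer">
  <title>Open issues for review</title>
  <conbody>
    <p>Questions that only the reviewer needs to see.</p>
  </conbody>
</concept>
```

These are two separate files shown in one listing. The element layer works the same way, with the attribute on an element such as a step inside the topic.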
Defining the build conditions to control the filtering
Use the ditaval file to define the filter conditions. This file contains the conditions that we want to match on, and actions to take when they’re matched. The build file contains a reference to the ditaval file, making sure it drives the build.
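If your build runs the DITA Open Toolkit from Ant, the reference to the ditaval file might look roughly like the sketch below. Treat it as an assumption-laden example: the project name, paths and file names are hypothetical, and parameter names differ between toolkit versions. Recent DITA Open Toolkit documentation uses args.filter, and as far as I know older releases used dita.input.valfile, so check the parameter list for your version.

```xml
<!-- build.xml: a hypothetical Ant wrapper around the DITA Open Toolkit -->
<project name="install-guide" default="windows-html">
  <!-- Assumed location of the toolkit installation -->
  <property name="dita.dir" location="/opt/dita-ot"/>

  <target name="windows-html">
    <ant antfile="${dita.dir}/build.xml">
      <!-- Map to publish and output format (both illustrative) -->
      <property name="args.input" location="install_guide.ditamap"/>
      <property name="transtype" value="xhtml"/>
      <property name="output.dir" location="out/windows"/>
      <!-- The ditaval file that drives the filtering for this build -->
      <property name="args.filter" location="windows.ditaval"/>
    </ant>
  </target>
</project>
```

Producing output for another operating system would then mean adding a second target that points at a different ditaval file and output folder.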
Dave talked us through the <prop> element in the ditaval file, and its attributes:
- att – attribute to be processed
- val – value to be matched
- action – action to take when match is found
A hint: You can use the same attribute in different layers (map, topic and element). Also, you don’t need to specify the location. The build will find the attributes, based on the <prop> element in the ditaval file.
Next we looked at the “include” and “exclude” actions. Remember, the action is one of the attributes in the <prop> element, as described above. Here’s an example of an action:
<prop att="audience" val="novice" action="exclude" />
Dave’s recommendation, very strongly put, is:
Don’t use “include”. Stick to “exclude”.
The basic rule is: Everything not explicitly excluded is included.
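Putting that rule into practice, a ditaval file for a Windows-only build of an installation guide might look something like this. It is a minimal sketch: the file name and the attribute values (linux, mac, novice) are assumptions and would need to match whatever values your own topics use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- windows.ditaval: hypothetical file name and values -->
<val>
  <!-- Drop anything marked as Linux-only or Mac-only -->
  <prop att="platform" val="linux" action="exclude"/>
  <prop att="platform" val="mac" action="exclude"/>
  <!-- Drop content written for novices, at whichever layer it is tagged -->
  <prop att="audience" val="novice" action="exclude"/>
</val>
```

Content tagged platform="windows", and content with no platform attribute at all, is not mentioned in the file, so it stays in.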
Dave’s final recommendation
Go get DITA and play with it!
My conclusion
It was great to have a focus on the conditional publishing side of DITA. It’s something I haven’t had a chance to get into before. Now I know the basics, which rounds off the DITA picture for me. Thank you Dave for an entertaining and information-packed talk.
Crocs and docs and more AODC posts to come
This is just a quick post to let you know I haven’t forgotten my promise to write up my notes on the last few AODC 2010 sessions, including my own presentation! Things are just a bit busy on the home front at the moment. In the meantime, would you like to see some photos from the Spectacular Jumping Crocodile Cruise that I took on the day after the conference ended? My ever-observant companion, the Travelling Worm, has posted some pictures and words.
AODC 2010: Uncle Dave’s Trivia Night
Over the past few days I’ve posted a number of sober, studious, serious summaries of the sessions at AODC 2010. “A three-day technical writer talk fest, yawn.” Not so! On occasion we do break out and indulge in a trivia night.
Uncle Dave’s Trivia Night is an AODC tradition. It happens in the evening of the conference’s second day, usually a Thursday, and attendance is compulsory. Well, it’s as compulsory as anything at AODC. It ranks up there with the AODC rules, read out dutifully by Tony or Dave at some random time during the conference. “No swearing”, “no mobile phones” and “no hooking”. Almost all the rules are merrily ignored.
Anyway, I digress. (Digression is of course encouraged at AODC. I would win a bonus point for it at Trivia Night.)

The Trivia Night trophy is of incalculable value and much coveted. It’s a fully automatic, magnificently functional antique shoe shiner. You can get some idea of its value from the way Tony holds it as he shows it to us awe-struck trivia devotees:

Of course, the winning team gains the privilege of having their name engraved on the trophy for all eternity:

Dave Gash is the intrepid compiler of the questions and the hero of the night. Here he is, introducing the first round (of questions, that is, though beer was well represented too):

We divided into teams of four or five. (Numbers are approximate, just like quantum physics.) I was in the best team of the night. As you will see, we excelled throughout and in every way. To begin with, we chose the name “Team Rocket”. Judging from Tony’s expression when we announced our name, it was not a good choice. And so it turned out. Tony deducted a point from our score immediately, for a poor choice of name!
Tony is the judge, final arbiter and awarder of points. Scrupulously fair, benevolently strict, unfailingly impartial and completely incorruptible, the judge responds well to a free beer or any other suitable bribe.
The other teams were the “Mindel Maniacs” (boo, hiss), the “Dribbling Scribblers” and the infamous, annually-sprouting “Farkin Iceholes”.

The trivia quiz consists of 5 rounds, each containing 8 questions. Then there’s that final, 41st question where fortunes are gambled, lives are won or lost, and strategy is all. You can wager all or part of your total score on that single last question. If you get the answer right, the number of points you wagered is added to your score. If you get it wrong, the number is deducted from your score.
Round 1 came in with a bang. Team Rocket scored a big fat zero. We also lost a point for being slow. We did temporarily gain a point because our team name had become amusingly ironic. Alas however, in the face of our judge’s unmistakable disappointment at the original name, we had sneakily changed it to “Pocket Rockets”, so we lost another point for that deception.
To our utmost surprise and no little pleasure, Tony then awarded us a round of drinks. “Why?” came our startled cry. As encouragement, was the reply. Our total score was now -3 points + 4 drinks. I think I lost count somewhere along the way.
During the proceedings, the Dribbling Scribblers lost a point for attempting to bribe the judge. The amount offered, 50c, was considered demeaning.
By the end of round 3, the Pocket Rockets (aka Team Rocket) had managed to bring our score up to the grand total of zero. In round 4, we scored 7 points (wow!) and gained a bonus point by bribing the judge with some jelly beans:

And now the decider! It was time for that all-important, win or lose last question. Our total score was 13-and-a-half. The Dribbling Scribblers had 16. The Mindel Maniacs (boo, hiss) had 22-and-a-half. The Farkin Iceholes (it got funnier and funnier through the evening as Tony and Dave tried to pronounce this name politely) had 24.
Strategy is all. We wagered 13 points, got the final answer wrong, and ended up with a grand score of half a point. The Mindel Maniacs (who?) were the only team who got the answer right. They won the contest with a total of 42 points.
Here’s Dave with the winning team (what was their name again?) and the magnificent trophy:

Here’s Team Rocket (aka the Pocket Rockets) proudly displaying our prizes for coming last. We were instructed to look disappointed, but only Matthew could pull that off:

Wanna know what we won? A coaster, a pen and a sweet. The sweet was an afterthought:

Dave paid us the final accolade:
You played a hell of a game!
I think he meant it sincerely.
The end (but not really)
If you came here looking for the serious side of technical communication, you’re in the wrong place. Ha ha, just kidding. I’ve already posted a number of summaries from this great AODC conference and there are still a few more sessions that I want to write up, including my own. Coming up just as soon as I find the time to convert my notes into blog posts.
Uncle Dave’s Trivia Night was an experience to be remembered. Thanks to Tony and Dave and all the AODC triviaphiles.
AODC Day 3: Converting to Structured Content
This week I’m at AODC 2010: The Australasian Online Documentation and Content conference. We’re in Darwin, in the “top end” of Australia. This post is my summary of one of the sessions at the conference. The post is derived from my notes taken during the presentation. All the credit goes to Dr Alan Burton, the presenter. Any mistakes or omissions are my own.
The first session on Friday was titled “Converting to Structured Content”, by Dr Alan Burton. Introducing his talk, Alan showed us some “scary” pictures of the world he lives in. Scary indeed: a wall of code. With a charming smile, he admitted to being a “hardcore geek”. From the moment Alan started speaking, we could see that he has a good dose of the AODC sense of humour.

The problem that Alan’s talk addressed is this: You’re told that you need to move your content to XML. You have loads and loads of unstructured content. It’s in FrameMaker, Word, other desktop publishing applications, or even more fun: it’s on paper.
Who does the conversion?
Alan discussed various options to consider when deciding who will convert the documents to a structured format:
- You can do it yourself, if you have the technical knowledge required.
- The company’s IT department could do it. Alan jokingly referred to the “IT Crowd” — are you happy for Moss to do your conversion?
- You can outsource it to someone like Alan. (He acknowledges with another disarming smile that there’s no-one quite like him.)
- If you consider outsourcing the work to an overseas company, take into account that this can lead to difficulties when communicating requirements.
Analysis
You need a sample of the documentation that’s to be converted. What’s more, it must be a representative sample. For example, check whether the documentation as a whole contains tables, images, special characters, mathematical symbols. If so, are they represented in the sample? For best results, you need to be given the full documentation set in the analysis phase.
Find out whether the documents comply with a template. If the answer is yes, ask how long the documentation has been around and how many authors have had a hand in it. What are the chances that it really does comply with the template?
Here’s Alan’s recommendation, assuming your source documents are in Word: Convert all the Word documents to RTF and then feed them through an analysis tool. He extracts a complete list of all styles that appear in the documents. If there is a large number of styles, you may have a problem. He is building a set of utilities that let him look at styles, special characters, symbols etc within the documents, without himself having to read each and every document. In this way, he builds up a picture of how easy or difficult it will be to convert the document.
Ask questions such as: “I see that you use italics to represent words that appear in the glossary. Do you use italics for any other purpose, such as titles or Latin words?”
Do the analysis before you choose or create your DTD, and before you buy any software.
Document the results of your analysis in detail.
Requirements
After the analysis phase, you know what you have, but you still need to find out what you actually want. Things to consider:
- Metadata
- Indexing for search
- Online and/or paper output
- Cross-referencing
Markup specification
Alan creates a “markup specification”, which is a detailed requirement specification for the conversion. You need to add examples for each requirement. This documentation provides the input into the design of your DTD.
Mapping document
Alan creates a document that maps the styles of the source document to the XML elements and attributes. The developers will use this document to create or customise the software that does the conversion.
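Alan did not show the format of his mapping document in my notes, and it could just as well be a table in a word-processing document. Purely as an illustration, here is the kind of information it might hold, written in a made-up XML form. The element and attribute names, the style names and the target elements are all invented.

```xml
<!-- style-map.xml: an invented format, for illustration only -->
<style-map source="Word" target="custom DTD">
  <!-- Paragraph styles and the elements they become -->
  <map style="Heading 1" element="title"/>
  <map style="Body Text" element="p"/>
  <map style="List Number" element="step"/>
  <!-- Character formatting that needs a decision from the authors -->
  <map style="italic" element="term" note="Only where the word is in the glossary"/>
  <map style="italic" element="cite" note="Titles of other publications"/>
</style-map>
```

The two entries for italics show why the questions from the analysis phase matter: one source style can map to more than one target element, and a conversion script cannot tell them apart without a rule.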
Checking the results of the conversion
When you receive the converted documentation, you need to check it to see if it’s actually what you want.
These are some of the problem areas that Alan mentioned:
- Tables
- Graphics — format, naming conventions, storage
- Cross-references
- Metadata
- Forms — these are very hard to recreate in XML
- Mathematical and scientific symbols and formulae
FrameMaker has a Migration Guide that tells you how to convert from unstructured FrameMaker to structured FrameMaker content. This is a useful tool. You can then export to XML. You do need to do some cleanup afterwards, either manually or with scripting.
You could convert from Word to FrameMaker, then to Structured FrameMaker.
Converting from Word to XML, you usually convert from Word to RTF and then to XML. Alan has done this once, in a very controlled environment and with just one document. It can become fairly complicated.
Converting to DITA is very challenging. You will probably need to split input files into different output files. It is unlikely that you can automate the insertion of metadata. You may need to rename graphics and create multiple graphics formats.
Bottom line
You will never end up with a clean result immediately, no matter what any tool may claim. You will always need to do some manual cleanup, and you will probably need to get a programmer to do some scripting.
My conclusion
This was a good, fast-paced walk through some scenarios and guidelines for converting from unstructured to structured content. I could tell that Alan has a lot more information to share. Thank you for a great session, Alan.


