The DITA debate

I’m coming to the conclusion that there are specific types of content that suit a DITA environment, and that the converse is also true: DITA is not the best solution for every content type. (DITA is the Darwin Information Typing Architecture, an XML architecture for designing, writing, managing, and publishing information.)

“Well, duh,” you may say. 🙂 I’ve never worked in a DITA environment, but I’ve attended two in-depth training courses and heard a number of case studies that walked through successful DITA implementations. The most recent was at the ASTC (NSW) conference last week, where Gareth Oakes presented a case study of an automotive content management system that he designed and implemented in collaboration with Repco. The content is stored and managed in DITA format, and published on a website. (See my report on the session: Repco and DITA.) This was a convincing case study of a situation where DITA has succeeded very well.

In my analysis, the DITA implementations that work well are those where the content consists of a large number of topics, and where those topics have an identical structure. It’s almost as if you’re building a database of content. Good examples are catalogues of automotive spare parts, machine repair instructions, safety procedures, aircraft manufacturing manuals, and so on.

Apart from sheer volume and support for a standard layout, this type of content has other requirements that DITA can satisfy well, including the ability to automatically build a variety of manuals by combining the topics into different configurations (via DITA maps), and multi-channel publishing.
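To make the map idea concrete, here’s a minimal sketch of a DITA map that assembles a manual from standalone topics. (The file names and titles are hypothetical, invented for illustration; the map and topicref elements are standard DITA.)

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map>
  <title>Model X service manual</title>
  <!-- Each topicref pulls in a self-contained topic file -->
  <topicref href="topics/replacing-brake-pads.dita"/>
  <topicref href="topics/bleeding-brake-lines.dita"/>
</map>
```

A second map, say for a Model Y manual, could reference the same topic files in a different order, or leave some out entirely, which is how one set of topics yields several manuals.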

On the other hand, some content doesn’t benefit much from such a highly structured storage format. Potentially, the overhead of a DITA environment is overkill and the costs may outweigh the benefits. If contributors to the docs are not tech writers or developers, asking them to learn DITA, or specific source control and editor rules, can be a deterrent.

Dare I say it: Much of the documentation we write in the software industry falls into the latter category. Our topics tend to be lengthy, less uniform in structure, and more discursive than, say, an auto parts manual. API reference docs are an exception, but they’re auto-generated from software code anyway. We also don’t usually need to recombine the topics into different output configurations, such as different models of a car.

What do you think? Please contradict me. 🙂 Do you have examples that gainsay or support the above conclusions? I’d love to see some examples of well-structured and well-presented documentation produced from DITA source.

About Sarah Maddox

Technical writer, author and blogger in Sydney

Posted on 24 October 2014, in technical writing. 14 Comments.

  1. Hi Sarah,

    I’m not going to contradict you, but I am going to add another reason why DITA adoption in the software industry is still very patchy.

    For many small-to-medium-sized tech businesses the short-term costs of implementing DITA for user documentation appear prohibitive. Even if a forecasting tool (like the Rockley Group worksheet) clearly shows a positive ROI in a relatively short time frame, many small-to-medium-sized businesses won’t commit their limited resources to moving to DITA.

    Perhaps when tech comm software products offer a complete DITA solution – the database, the authoring tools, and the publication engine all in one package – we may see adoption grow.

  2. If you:

    * have a lot of conditional text (for different audiences, spelling variations, or products)
    * need to share content with, or receive content from, third parties
    * have content with a long shelf life, where the content needs to be in a device-independent format

    then there’s a case.
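The conditional-text case above maps onto DITA’s filtering attributes and DITAVAL files. A minimal sketch (the product values are hypothetical; the `product` attribute and DITAVAL `prop` element are standard DITA):

```xml
<!-- In a topic: mark a paragraph for one product variant -->
<p product="enterprise">This feature requires an Enterprise licence.</p>

<!-- enterprise.ditaval: a filter file passed to the build -->
<val>
  <prop att="product" val="basic" action="exclude"/>
  <prop att="product" val="enterprise" action="include"/>
</val>
```

Swapping in a different DITAVAL file at build time produces the other variant from the same source, with no changes to the topics themselves.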

  3. Hi Sarah,

    Another big plus point for DITA is where you need to produce the same documentation in multiple languages. By structuring a book into topics, translation costs are reduced as you can keep track of which topics have been updated in your master version and need re-translating for each release of the book.



  4. Hi Sarah,

    It’s great you started this discussion!

    This architecture is very flexible: it allows for content reuse and reduces translation costs, true enough. But let’s picture this: a new tech writer starts working for a company that has adopted DITA and created tens and tens of thousands of topics in its database. It can be really overwhelming for the newcomer to search for content that is suitable for reuse, and more often than not people have to learn it the hard way – by putting lots of blood, sweat, and tears into writing their own text and then discovering that a similar text was out there in the database the whole time. *great facepalm* Some of my colleagues who have been working with DITA for a considerable amount of time are still extremely disconcerted by this aspect.

    The long and the short of it is, when implemented on a relatively small scale, DITA is all cozy and pliable, but as it expands beyond a reasonable limit, new topics sprout up like mushrooms, and you do have to take pains not to lose track of the reusable content.
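For readers unfamiliar with how that reuse works in practice, here’s a sketch using DITA’s `conref` mechanism, which pulls a shared element in by reference instead of copying it. (The file names and ids are hypothetical; the conref syntax is standard DITA.)

```xml
<!-- library/warnings.dita: a topic holding reusable elements -->
<concept id="warning-library">
  <title>Shared warnings</title>
  <conbody>
    <note id="hot-surface" type="warning">Allow the unit to cool before servicing.</note>
  </conbody>
</concept>

<!-- In any other topic: reference the shared note rather than duplicating it -->
<note conref="library/warnings.dita#warning-library/hot-surface"/>
```

The catch the commenter describes is exactly this: conref only saves work if the writer can *find* `hot-surface` among tens of thousands of topics before writing their own version.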

  5. @Vadim – you raise a valid point about ensuring the volume of topics does not hamper re-use opportunities. Search is not the only mechanism to identify potential topics for re-use though. In assessing whether DITA is appropriate, I think a component content management system (CCMS) needs to be considered as an integral part of the solution. David Farbey mentioned the need for a complete DITA solution, comprising database, authoring tools and publishing engine. I agree, and I think there are products on the market which address this need. The Ixiasoft DITA CMS is one such example, but there are others.

    @Sarah – Some of the challenges for DITA are not that different to wikis in my view. Organic growth of content, consistent presentation, sense of ownership, managing re-use (ref: Confluence include macro). For sure DITA is a stricter environment for authors to work in, and it might require a jump from more familiar tools/methodologies, and I’m not saying it’s easy, but there are some applications where it can deliver substantial benefits.

    For examples in the software world, SAP are using DITA extensively.

    • Hallo Charles
      Thanks for the SAP link – it’s great to see a DITA-based content site in action. The site looks very good. There are some typographical inconsistencies that leap out at a tech writer’s eye 🙂 and the controls behave a little randomly in that they flicker in and out of existence (I’m using Chrome) – but those are of course not the result of the DITA implementation. One thing that does bug me, though, and that I’ve seen on a couple of other DITA-based sites, is that the navigation is a little hard to grok. I click on a link, end up somewhere, and its relationship to the content as a whole, or to where I’ve come from, isn’t obvious.

      That said, I ended up on a related site where the navigation structure is much clearer – I wonder if that section is based on DITA too?


  6. I think it really depends on the products to be documented. A very important factor that nobody has mentioned before is the scalability of DITA. You can start from scratch with a small budget, a small team, and a small DITA environment without a CMS. Later, you can easily migrate to a CMS. If you start with, for example, a wiki, and your documentation requirements (languages, variables, variants, modules) change or grow after a while, you’ll run into problems.

  7. “Potentially, the overhead of a DITA environment is overkill and the costs may outweigh the benefits.”

    I agree, but there is a point where you will surpass the overhead and then increase quality. My experience is that there is A LOT of overhead in using a DITA/CCMS combination. You probably want to have 1 or 2 people on your team devoted full time to IA/DITA/CCMS administration – and I’m speaking about a relatively small group of writers maintaining only about 20K pages of content. For my group, the DITA/CCMS combination lets us maintain quality uniformly across this mid-sized docset.

    I think at its heart, using DITA/CCMS system just reflects the complexities in managing a lot of content. If your content management tasks aren’t demanding there’s no reason to use DITA – unless you think your products/documentation offerings are going to grow hugely in the future.

  8. Sarah,

    Welcome to the hornet’s nest!

    The problem with these sorts of discussions is what people are comparing DITA to. I’m no DITA fan, but I can certainly appreciate that it may be a substantial improvement for many people over what they were doing before.

    The problem is, what we were all doing before is often out of date. We have two big new issues to deal with today: hypertext, and scale. Actually, scale is really just an aspect of hypertext, because before hypertext an information set of 10,000 pages was likely to be made up of 20 to 40 books, each of which was produced separately and had little or no connection to the other books. The tools only had to scale to 500 pages, not 10,000. Now we have one information set of 10,000 pages — and (usually) growing.

    Traditional tech comm tools do not scale up well to this level. HATs, word processors, and DTP tools can’t handle this. Nor can the information designs they are designed to support. Wikis can handle the volume, but struggle to manage things. (Wikipedia manages, of course, but it takes an army of volunteers.)

    DITA does not scale well either, as several of the comments above illustrate. You very soon require a big-iron CCMS to maintain any kind of order and discipline, and even then it is a lot of manual work. On the other hand, the DITA+CCMS combination may, with proper modeling and management, scale better than any of the other traditional approaches, or wikis.

    This comes with significant up-front costs, though. In particular, success depends in large part on getting the content model right up front, which is hard when you have no experience either of the tools or of this way of working. This means you are likely going to need consultants to do it for you, or you are likely to create a mess that you will have a hard time cleaning up later.

    And you can still run into problems if your model turns out not to work so well after a while, either because your consultants got it wrong or because business needs or conditions change. Indeed, in many cases, people are implementing models from the manuals-and-help era with tools that will take several years to pay back their investment, even as the manuals-and-help era fades in the rear-view mirror.

    But then again, it is not like any of the other tools are necessarily any better at this.

    Wikis are an important exception to this picture because, while they have serious issues with management, they belong to the hypertext era, not the manuals-and-help era.

    What we really need, and what I am working on with the SPFE project, is structured authoring for the hypertext era.

    But as far as the what-is-DITA-good-for question goes, I always encourage people to think hard about what kind of content they hope/expect to be producing in five years. Don’t implement a system around today’s designs if you are going to be producing something else in five years.

  9. Sarah, I’m glad you raised this point. I implemented DITA for my content inexpensively, using OxygenXML and publishing to its native webhelp output. For source control, we use Mercurial. This approach makes DITA less expensive than almost any HAT on the market.

    However, in the API doc space, DITA isn’t so common. Most API doc tools use Markdown, since developers contribute.

    Our use case for DITA is with conditional processing. With 4 different programming languages, I often have a topic like “Activating the license” with conditional text for the different language samples.

    However, I could have bypassed DITA by simply using jQuery tabs to accomplish very similar things, which is what most people do in API doc.

    Some of our other projects have various roles, with documentation overlapping for different roles. In that case, DITA’s conditional processing works well (as would any HAT, though).

    DITA has a lot of hype and industry clout. In some tech comm circles, you may think if you’re not using DITA, you’re out of date. But I’m not convinced it’s the best approach, particularly for API doc. I also have some issues with the way relationship tables are processed by the Open Toolkit, which is an issue that almost no one seems to experience because they’re all writing in tiny little topics (leading to an information fragmentation experience).

    Can I ask, in your authoring environment, how do you handle content re-use and conditional processing?

  10. “We also don’t usually need to recombine the topics into different output configurations, such as different models of a car.”

    Not true. I’ve worked on many pieces of software that had many different iterations/offerings (basic, enterprise, cloud). With DITA you can single source between all the different offerings AND all versions. That means if some change comes up that applies to all versions, you’re updating 1 DITA topic, not 5 different files for 5 different editions/versions.
