Blog Archives

The Making of “The Language of Content Strategy” (stc14)

This week I’m attending STC Summit 2014, the annual conference of the Society for Technical Communication. Where feasible, I’ll take notes from the sessions I attend, and share them on this blog. All credit goes to the presenters, and any mistakes are mine.

The Language of Content Strategy is a book by Scott Abel and Rahel Anne BailieIn this session at STC Summit 2014, Scott Abel discussed the content strategy, tools and technologies behind the making of the book.

Problem statement

Time, money, skills and experience are in short supply. Hand-crafting content is expensive, time consuming and not scalable.

The demands of the audience are changing. People use social media, rather than going to a specific website to gather information.

To meet the demands of content delivery today, we need to adopt manufacturing principles. The is made possible by content engineering: The application of engineering discipline to the design and delivery of content.

Case study: The making of  The Language of Content Strategy

In this session, Scott will show us how he and Rahel created a book, an eBook, a website, and a set of learning materials, from a single source, without breaking the bank. They did it by harnessing technology and crowd sourcing.

Scott talked about the differences in approach between technologists and editorialists. Conflict and time wasting arise because of a lack of a common language. Rahel and Scott wanted to craft a solution: A crowd-sourced book about content strategy that is both a case study in content engineering and a practical example of content marketing.

Setting up

The team started with careful analysis of the educational landscape, contributors, and more. Then they defined the content types they needed.

  • The smallest unit of content they would create would be a term and definition pair.
  • Another content type is an essay of 250 words.
  • Then there are contributor bios, statements of importance, and resources.

For the authoring environment, the team selected Atlassian Confluence. It’s a wiki with support for XML content re-use.

They also chose a gimmick: 52. The project included 52 terms, 52 definitions, by 52 experts, published over 52 weeks, and one of the output formats was 52 cards.

Then they selected a team of experts: the best and brightest in tangentially-related fields.

Other roles and responsibilities: markup specialist, editor, indexer, peer reviewers, and a graphic artist.

The source data

The source was authored in Confluence wiki. The content types are clearly labelled: Biography, importance statement, topic name, definition, etc.

The output

In the various output formats, the content is structured differently but still consists of the various topic types. For example, in the printed book every chapter is two pages long, and consistently structured. The eBook format is slightly different, as are the website format and the flash cards learning format.

Each Thursday, one chapter is automatically published. The web output also contains audio files, photos, and additional resources that are not contained in the book.

The advantages of a future-proofed content strategy

The team was able to add content after the fact, such as the audio files for accessibility. The content strategy was designed to future proof the content, so the team was able to adjust to challenges and opportunities. And the strategy is repeatable. Now that it’s been done, it can be done again.

Scott told an amusing story of how he disobeyed his own rules, and tried to create another channel by copying and pasting instead of using the single-sourced content. A marketing person asked him to create a slide deck from the content. He was on a plane, without WiFi, so decided to do it by cutting and pasting. Needless to say, this didn’t work. By the end of the flight he had only 13 slides of the required 52, and had run out of laptop battery!

Cost

The cost of the project came in at under $10,000USD.

  • Approximately $4000USD forgraphic design, indexing, editing, markup assistance, audio tracks and hosting, the URL for the first year, and site hosting for a year.
  • Approximately $5,440 for book donations, postage, Adobe InDesign, Confluence Wiki, and overhead/administrative costs.

Scott’s promise

Scott finished by saying that if you want to undertake a similar project, ask him. He will try to help.

This was a fun and inspiring talk. Thanks Scott!

Confluence full-text search using Python and grep

The standard search in Confluence wiki searches the visible content of the page. It also offers keywords for some specific searches, such as macro names and page titles. But sometimes we need to find things that the search  cannot find, because the content of the relevant XML elements is not indexed. This post offers a solution of sorts: Copy the XML storage format of your pages into text files on your local machine, then use a powerful search like grep to do the work.

Here are some examples of the problem:

  • We may want to find all pages that reference a certain image, or other attachment. It’s easy enough to find the page(s) where the image is attached. But it’s not possible to find all pages that display a given image which is attached to another page.
  • It’s possible to search for all occurrences of a macro name, using the macroName: keyword in the search. But it’s not possible to search for parameter values. This means, for example, you can’t search for all pages that include content from a given page.

I’ve written a script to solve the problem, by downloading the storage format from Confluence onto your local machine, where you can use all sorts of powerful text searches. You’re welcome to use the script, with the proviso that it’s not perfect.

Python script: getConfluencePageContent

The script is in a repository on Bitbucket: https://bitbucket.org/sarahmaddox/confluence-full-text-search.

Note: To run the script successfully, you need access to Confluence, and the Confluence remote API must be enabled.

Installing Python

To run the script, you need to install Python. The scripts are designed for Python 3, not Python 2. There were fairly significant changes in Python 3.

  1. Download Python 3.2.3 or later: http://www.python.org/getit/
    (I downloaded python-3.2.3.amd64.msi, because I’m working on a 64-bit Windows machine.)
  2. Run the installer to install Python on your computer.
    (I left all the options at their default values.)
  3. Add the location of your Python installation to your path variable in Windows:
    1. Go to ‘Start’ > ‘Control Panel’ > ‘System’ > ‘Advanced system settings’
    2. Click ‘Environment Variables’.
    3. In the ‘System variables’ section, select ‘Path’.
    4. Click ‘Edit’.
    5. Add the following to the end of the path, assuming that you installed Python in the default location:
      ;C:\Python32
    6. Click ‘OK’ three times.
    7. Open a command window and type ‘python’ to see if all is OK. You should see something like this:

Confluence full-text search using Python and grep

Getting the script

Go to the Bitbucket repository and choose ‘Downloads’ > ‘Branches’, then download the zip file and unzip it into a directory on your computer.

Running the script to get the content of your pages

To use the getConfluencePageContent script:

  1. Enable the remote API (XML-RPC & SOAP) on your Confluence site.
  2. Open the getConfluencePageContent script in Python’s ‘IDLE’ GUI.  (Right-click on the script and choose ‘Edit with IDLE’.)
  3. Run the script from within IDLE. (Press F5.)
  4. The Python shell will open and prompt you for some information:
    • Confluence URL – The base URL of your Confluence site. If the site uses SSL, enter ‘HTTPS’ instead of ‘HTTP’. For example: https://my.confluence.com
    • Username – Confluence will use this username to access the pages. This username must have ‘view’ access to all the spaces and pages that you want to check.
    • Password – The password for the above username.
    • Space key – A Confluence space key. Case is not important – the match is not case-sensitive.
    • Output directory name – The directory where the script should put its results. The script will create this directory. Make sure it does not yet exist.
  5. Look for the output directory as a sibling of the directory that contains the getConfluencePageContent script. In other words, the output directory will appear in your file system at the same level as the script’s directory.

Python Shell

Python shell (IDLE)

 

Output of the script

The Bitbucket repository contains an example of the output, based on the Demonstration space shipped with Confluence. See the outputexample directory in the repository. For example, this file contains the content of the page titled ‘Welcome to Confluence’.

The script gets the content of all pages in the given Confluence space. It puts the content of each page into a separate text file, in a given directory.

The script creates the output directory as a sibling of the directory that contains the getConfluencePageContent script. In other words, the output directory will appear in your file system at the same level as the script’s directory.

The file name is a combination of the page name and page ID. To prevent problems when creating the files, the script removes all non-alphanumeric characters from the file name. To ensure uniqueness, it appends the page ID to the page name when creating the file name.

The content is in the form of the Confluence storage format, which is a type of XML consisting of HTML with Confluence-specific elements. (Docs.)

The script also writes a line at the top of each file, containing the URL of the page, and marked with asterisks for easy grepping.

Notes:

  • The script will show an error if the output directory already exists.
  • If you see the following error message, you need to enable the remote API (XML-RPC & SOAP) on your Confluence site: xmlrpc.client.ProtocolError: <ProtocolError for localhost:8090/rpc/xmlrpc: 403 Forbidden>

Grep and winGrep

Now that you have the page content in text form, the world’s your oyster. :) You can use the full power of text search tools. If you’re on UNIX, you’ll already know about grep.

If you’re on Windows, let me introduce grepWin. It’s a free, powerful search tool that you can install on Windows. It offers regular expression (regexp) searches as well as standard searches, and it has a very nice UI (user interface).

This screenshot shows a search for an image called ‘step-2-image-1-confluence-demo-space.png’. The image is attached to one page, and referenced in two pages. QED. :D

grepWin

grepWin

 

Comments welcome!

I’d love to know if you think you’ll find the script useful, and if you have any ideas for improving it.

Want an XML schema viewer in Confluence wiki?

You got it. :) Avisi have developed two nifty macros to display an XML schema (XSD) in tabular and graphic format on a Confluence page. The XSD Viewer is a new add-on for Confluence wiki, and the Avisi developers are keen for input from technical writers and others interested in XML schemas.

I’ve been playing around with the add-on, so I’d love to show you a couple of examples and tell you how to get it working for yourself. I’ve also chatted with Yanne from Avisi, who says that he and his team would love to have your feedback.

Example 1:  A purchase order schema

I’ve grabbed the sample schema for a purchase order from MSDN: http://msdn.microsoft.com/en-us/library/ms256129.aspx. I’ve instructed the XSD viewer to start with the purchaseOrder element, and show a depth of 2 levels.

Want an XML schema viewer in Confluence?

Example 2: Graham Hannington’s schema for the Confluence storage format

Hehe, if you put Confluence and XSD in the same blog post, then ‘twould be remiss not to include Graham’s XML schema for the Confluence storage format. :D

The XSD Viewer is using confluence.xsd, starting with the image element.

Want an XML schema viewer for Confluence?

One point of interest here is that the confluence.xsd file references two other schema files: confluence-ri.xsd and confluence-xhtml.xsd. All I had to do to make this work, was to attach all three XSD files to the page. This screenshot shows the attachments on the above page:

Want an XML schema viewer in Confluence?

Hiccups

A couple of times, the XSD Viewer has declined to show any rows in the table. I’m not sure why this occurs. If it happens to you too, it’s worth letting the Avisi team know.

My environment

I’m using Confluence 5.0.1, with version 1.1.1 of the XSD Viewer. I’m running Confluence on my Windows 7 laptop, and I’m using Chrome to view the wiki pages.

How to get your own XSD viewer

To make this happen, you need to do the following:

  1. Download and install Confluence, if you don’t already have it. You can try it for free for 30 days. See the Confluence download page.
  2. Download the XSD Viewer add-on and install it into Confluence. The add-on is also available for free for 30 days. See the XSD Viewer page on the Atlassian Marketplace.
  3. Create a page in Confluence.
  4. Attach your XSD file to the Confluence page, just as you would attach a screenshot or other file. See the documentation on adding attachments.
  5. Edit the page.
  6. Add the “XSD Image” and/or the “XSD Table” macros to the page. See the documentation for the XSD Viewer.
  7. Save the page.

Resources

Useful links:

Feedback so far

I’ve given Yanne at Avisi some feedback already:

  • At first the error messages were a bit too generic to be useful. Avisi have already followed up on this in the latest version of the add-on, which gives more specific error messages. Great!
  • Currently the macro autocomplete in Confluence is triggered by “XSD”. Suggestion: Add “schema” and “XML” to the list of triggers.
  • Add the option to add a border and other styling to the image.

The Avisi team like the latter two suggestions, and are waiting for more feedback before implementing them. Would you be interested in an XSD viewer in Confluence, and what requirements would you have for it?

New style for Confluence documentation – what do you think

To mark the impending release of Confluence 5.0, we’ve applied a new style to the Confluence documentation. It’s done by means of some snazzy CSS, created by Andrew Prentice, Valter Fatia and Paul Watson.

What do you think of the new look? We’d love your feedback on the styles, the way some information is hidden until you hover over it (try zooming your cursor around the page to find the hidden bits) and the contrast with the standard Confluence 5.0 look.

Our customised styles

We’ve applied custom CSS only to the latest documentation space on the wiki – that’s the documentation for Confluence 5.0. This space is using the Documentation theme, but with a lot of CSS on top:

Documentation space with custom stylesheets

Documentation space with custom stylesheets

The standard Confluence 5.0 styles in the Documentation theme

The documentation for Confluence 4.3 is on the same wiki, but we haven’t applied the custom stylesheets. The wiki is already running Confluence 5.0, so you can see the new 5.0 look and feel without any custom styling. The space is using the Documentation theme:

Documentation space with standard styling

Documentation space with standard styling

The standard Confluence 5.0 styles in the default theme

Just for completeness, here’s the Atlassian Training space, on the same wiki, but using the default Confluence 5.0 theme:

A space using the default Confluence 5.0 theme

A space using the default Confluence 5.0 theme

How do you add CSS to a Confluence space?

It’s all in the documentation: Styling Confluence with CSS.

Thoughts? :)

Confluence tip – how to hide child pages in the Documentation theme

This hint is for people who are using the Documentation theme in Confluence wiki, and want to hide the child pages that are shown at the bottom of every page. After all, the left-hand navigation bar in  the Documentation theme shows a page tree, including all parent and child pages. So it’s probably overkill to show the children at the bottom of every page too.

Confluence tip - how to hide child pages

To hide the child pages, add some CSS to the space. This is the CSS you need:

#children-section{   
display:none;   
}

I’ve tested the above CSS in Confluence 4.3 and 5.0.

To add the CSS to your space, you need space administrator permissions. Go to BrowseSpace AdminStylesheet, edit the stylesheet and dump the above code into the text box.

The documentation has more guidelines on using custom stylesheets: Styling Confluence with CSS.

Have fun!

Follow

Get every new post delivered to your Inbox.

Join 1,446 other followers

%d bloggers like this: