The Value of Semantic Tags
By Eric Armstrong on Apr 13, 2008
So what's wrong with using <b>, <i>, and <tt>, anyway? What's so useful about identifying things as menu items, APIs, or filenames? Here's the list of reasons that surfaced at the recent 2008 DITA/CMS Conference. What are your thoughts?
At their session on DITA code reviews, IBM's Carolyn Inkster and Sharon Rouiller showed the results of using their "bad tag finder"--a CSS stylesheet that made text marked with typographic tags like bold and italic stand out with large, brash fonts and brilliant color, so it was easy to spot. The idea was to quickly identify text that would be better off with semantic tags like filename or menu item.
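As a rough illustration of the same idea, and not the presenters' CSS stylesheet, a short Python script could list every typographically tagged span in a topic so a reviewer can decide which semantic tag belongs there. The function name, the sample topic, and the choice of tags to flag are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Typographic elements to flag (an assumed list for this sketch).
TYPOGRAPHIC_TAGS = {"b", "i", "u", "tt"}

def find_typographic_tags(xml_text):
    """Return (tag, text) pairs for each typographic element, in document order."""
    root = ET.fromstring(xml_text)
    return [
        (elem.tag, "".join(elem.itertext()).strip())
        for elem in root.iter()
        if elem.tag in TYPOGRAPHIC_TAGS
    ]

# Hypothetical topic mixing typographic and semantic markup.
SAMPLE = """<topic id="t1">
  <title>Saving a file</title>
  <body>
    <p>Choose <b>File</b>, then <b>Save As</b>, and type <tt>report.txt</tt>.</p>
    <p>Or choose <uicontrol>Save</uicontrol> to overwrite <filepath>report.txt</filepath>.</p>
  </body>
</topic>"""

if __name__ == "__main__":
    for tag, text in find_typographic_tags(SAMPLE):
        print(f"<{tag}> {text}")
```

Only the <b> and <tt> spans in the first paragraph are reported; the text already marked with semantic tags in the second paragraph is left alone.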
They were then asked, "What is the value of adding semantic tags?" (Especially after a conversion, when lingering typographic elements like <tt> and <b> will encode such information, what makes it desirable to convert them all to the appropriate semantic tags?)
Here are the reasons they gave:
- Automated link insertion: When messages are marked with a semantic tag, all the messages in a document can be automatically linked to a troubleshooting guide that details the possible causes and ways to deal with them. Alternatively, the troubleshooting guide could be automatically populated with links to each area where a particular message is discussed. Semantic tags make that kind of automated document construction possible.
- Information filtering: Users can filter information using metadata tags, so they can leave out information that pertains to products they're not interested in. (That's not quite the same principle as semantic tags, though...)
- Different typographic conventions in different languages: There is no such thing as "bold" in Chinese, so for that locale it makes more sense to use a different color. But for other languages, bold works fine. Some languages might even have color-coding conventions (or acquire them over time), where orange is used for one thing and red is used for something else. Semantic tagging makes it easy to produce documents for any convention, in any locale.
- Translation control: The existence of semantic tags makes it possible to identify terms that shouldn't be translated, both for translators and translation memory systems--for example, product names.
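To make the automated-linking point concrete, here is a minimal sketch of the second variant: building the link list that a troubleshooting guide could be populated with. It assumes messages are marked with a <msgnum> element and that each topic lives in a file named after its id; the function names, sample topics, and href scheme are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def index_messages(topics, tag="msgnum"):
    """Map each message number to the ids of the topics that mention it."""
    index = defaultdict(list)
    for xml_text in topics:
        root = ET.fromstring(xml_text)
        topic_id = root.get("id")
        for elem in root.iter(tag):
            number = "".join(elem.itertext()).strip()
            # Record each topic once per message, in the order encountered.
            if topic_id not in index[number]:
                index[number].append(topic_id)
    return dict(index)

def troubleshooting_links(index):
    """Render one link per (message, topic) pair; the href scheme is assumed."""
    return [
        f'<xref href="{topic_id}.dita">{number}</xref>'
        for number in sorted(index)
        for topic_id in index[number]
    ]

# Hypothetical topics that mention tagged messages.
TOPICS = [
    '<topic id="install"><body><p>If you see <msgnum>E1001</msgnum>, '
    'check the disk.</p></body></topic>',
    '<topic id="upgrade"><body><p><msgnum>E1001</msgnum> and '
    '<msgnum>E2040</msgnum> can occur.</p></body></topic>',
]

if __name__ == "__main__":
    for link in troubleshooting_links(index_messages(TOPICS)):
        print(link)
```

None of this works if the message numbers are merely set in <tt>: the script would have no way to tell a message from a filename or a command.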
To that list, I would add the following:
- Intelligent search and replace: You can change every occurrence of a term in menu items, for example, without touching the same term where it occurs in a filename.
- Automated processing: If you have a part-number specialization, it becomes possible to populate your parts list automatically from a database. Alternatively, the editable document could become the gold standard, and the database could be populated from that.
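The search-and-replace case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: menu items are tagged <uicontrol>, filenames are tagged <filepath>, and the function name and sample paragraph are invented for the example.

```python
import xml.etree.ElementTree as ET

def replace_in_tag(xml_text, tag, old, new):
    """Replace `old` with `new` only in the direct text of `tag` elements."""
    root = ET.fromstring(xml_text)
    for elem in root.iter(tag):
        # Only the element's own leading text is changed; nested children
        # and the text that follows the element are left untouched.
        if elem.text:
            elem.text = elem.text.replace(old, new)
    return ET.tostring(root, encoding="unicode")

# Hypothetical paragraph where the same term appears in two roles.
SAMPLE = (
    "<p>Choose <uicontrol>Export</uicontrol> to write "
    "<filepath>Export.log</filepath>.</p>"
)

if __name__ == "__main__":
    print(replace_in_tag(SAMPLE, "uicontrol", "Export", "Publish"))
```

The menu item becomes "Publish" while the filename Export.log stays as it was, which a plain text search and replace could not guarantee.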