Monday Jul 09, 2007

Talking with Microsoft's Gray Knowlton about MSOXML accessibility

Late last week, Gray Knowlton wrote a lengthy blog comment, sharing some of his thoughts about my review of the Microsoft white paper "Accessibility of Ecma Office Open XML File Formats" that Gray posted to the OpenXML Developer website. Given the importance of office document accessibility, and the attention that ODF and MSOXML are getting with respect to accessibility, I wanted to continue the conversation directly in my blog (vs. buried in a comment to his comment). The rest of this is written directly to Gray...

First, Gray, thank you for taking the time to have this discussion - and thank you for correcting my misspelling of your name (oops!). You mentioned that you undertook this review "in part to educate a few individuals who claimed that “Open XML does not support accessibility”" but that "it was not intended to be a definitive guide to accessibility of Open XML". Can you tell me, has Microsoft - or anyone else - done a thorough review of MSOXML accessibility? Education, and "correction" of "significant misunderstandings" is one thing; a thoughtful and thorough accessibility review is another. Is this white paper the former, or the latter? If as you say it is the former, what about that thoughtful and thorough accessibility review?

Gray, at the start of your blog comment you say "I’m not sure that the “who did this?” question matters as much as your post seems to indicate", and you spend several paragraphs describing your (non-accessibility) background at Microsoft and Adobe. You also noted the name Reed Shaffner, as "a member of my team specifically focused on the accessibility of the Microsoft Office system", though you don't mention whether he was involved in writing this white paper (by the way, is this Reed Shaffner the author of these two blog posts? the new hire from Duke University who last year was a college senior and who presented a paper Linking Pgm allozyme and nucleotide variation in blue mussels at a Colorado Evolution conference?).

The reason I raise the question of authorship of the white paper is to better ascertain the level of accessibility expertise involved in the review. MSOXML has gone through the ECMA standards process, is up for vote and adoption by the International Organization for Standardization (ISO), and is being evaluated by many countries and U.S. States for use and standardization in their organizations. Knowing whether a thoughtful and thorough accessibility evaluation was done on MSOXML has an impact on folks in many of these places.

Elsewhere in your comments, regarding my question about the use of WCAG 1.0, you note: "We are aware that the WCAG1.0 guidelines might not be the most appropriate of benchmarks today, but there are few finalised alternatives. I do work with our accessibility team on evaluating Open XML against the developing standards, and as I am sure you are aware we are active participants in these processes; however it would not be appropriate at this point to publish anything until those efforts are completed." I understand that issue, but then why do you use the XML Accessibility guidelines, which are only in draft form? It doesn't seem consistent to me...

And speaking of questions, I hope that we can continue this conversation, and that you might address the other questions I raised in my review. To quickly summarize them:

Blog question #2 (unaddressed): When and how will the accessibility failings cited in the paper be fixed?

In your evaluation, you note that MSOXML fails to support WCAG 1.0 checkpoints 4.2, 5.2, 9.4, 10.2, 12.1, 12.2, and 12.4, and that MSOXML only partially supports WCAG 1.0 checkpoints 6.4, 8.1, 9.1, and 11.1. Some of these have quite significant impacts on folks with disabilities, including especially those who use assistive technologies to access office documents. For example, WCAG 1.0 Checkpoint 9.4 requires that you have a logical TAB order through all of the links, form controls, and objects within a document. Without this, keyboard-only users won't be able to get to and manipulate all document content; screen reader users may miss some sections of content altogether. Another example: WCAG 1.0 Checkpoint 12.4 requires that all labels be explicitly associated with the objects they are labeling. Without this, blind folks using screen readers have a lot of trouble figuring out what a control is when they TAB to it - imagine hearing "edit field; edit field; edit field" as you TAB through a form, instead of "name edit field; address edit field; city edit field".
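To make the Checkpoint 12.4 example concrete, here is a minimal sketch of why explicit label association matters. The `Control` type and the `announce` function are hypothetical, invented purely for illustration; they are not MSOXML markup and not the API of any real screen reader.

```python
from typing import List, NamedTuple, Optional


class Control(NamedTuple):
    """A hypothetical form control as assistive technology might see it."""
    role: str                    # e.g. "edit field"
    label: Optional[str] = None  # explicitly associated label, if the format stores one


def announce(controls: List[Control]) -> List[str]:
    """Return what might be spoken as the user TABs through the controls in order."""
    return [f"{c.label} {c.role}" if c.label else c.role for c in controls]


# Without label association, every control sounds the same.
unlabeled = [Control("edit field") for _ in range(3)]

# With label association, each control can be identified.
labeled = [Control("edit field", name) for name in ("name", "address", "city")]

print("; ".join(announce(unlabeled)))
print("; ".join(announce(labeled)))
```

The only difference between the two lists is whether the label travels with the control; if the file format has no place to record that association, no amount of cleverness in the screen reader can recover it.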

Blog question #4 (unaddressed): How does MSOXML meet accessibility needs outside of WCAG 1.0 & the W3C XML Accessibility draft? How well does MSOXML support translating into DAISY and Braille?

Perhaps your comment "The scope of the original project was not intended to provide the more significant contribution to the accessibility community you describe in your post" answers this, but it wasn't clear to me from your reply, and I don't want to make poor assumptions (misspelling your name is bad enough!). Does Microsoft know how well MSOXML supports the requirements of DAISY book creation? How well it supports the creation of Braille publications? How well it meets other accessibility needs? Does Microsoft have any experienced accessibility people looking at MSOXML, evaluating against accessibility needs independent of W3C accessibility specs?

Blog question #5 (unaddressed): Is there a difference between the phrases "supported" and "fully supported" in the white paper? When the white paper says that a provision is only "partially supported", what is missing?

Since you use "supported" in some places, and "fully supported" in others, it strongly implies there is a difference. Is there? Is merely "supported" less than full support? And when you note that MSOXML only "partially supports" an accessibility checkpoint, you don't say what is missing. Could you please elaborate on that missing support?

Blog question #6 (unaddressed): Why is the white paper so lacking in clear "supports" statements about the XML Accessibility guidelines?

The style change from the white paper text about WCAG 1.0 vs. XML Accessibility checkpoints almost suggests two different authors. But whatever the reason for it, it leaves a reader like me unable to form a clear picture of how MSOXML stacks up against the XML Accessibility checkpoints.

Separate from these questions - which I hope to read your answers to soon - I want to touch again on this comment of yours: "the scope of the original project was not intended to provide the more significant contribution to the accessibility community you describe in your post, but I posted this project to specifically for this reason. I do hope this project can become the contribution that you and others have expressed interest in evaluating." Many people - me among them - are not in a position to review a 6,000+ page specification. Microsoft has been working on accessibility for at least the past 20 years, and is the author of that massive specification. Shouldn't Microsoft have its experienced accessibility experts working on this, rather than hoping that others work on it for you?

On a different topic, I'm curious about your comment regarding my P.S. about an accessibility problem I noted in the PDF edition of the white paper. You said: "I am somewhat embarrassed to have not used the tools in Acrobat for making PDF documents more accessible. I did create a Tagged PDF, as you noted, but I haven’t installed Acrobat on my new hardware yet, so I did not have the ability to edit the PDF to correct this easily." Did you use Word 2007 to create the white paper, and the Export to PDF function to create the PDF? In other words, is this the result of a bug in the accessibility functionality of Word 2007's PDF export feature (as you say that you need Adobe Acrobat to "edit the PDF to correct this easily")?

Finally, fair is fair. You asked me a question:

Speaking of PDF, would you mind please pointing out the list of W3C recommendations supported by PDF/X-1a, PDF/X-3 and PDF/A? It’s been a few years, but I don’t recall the use of XForms, SVG or MathML in these specifications. These are all ISO standards today, so I’m curious to compare this with the ODF example you cited in your post.

First, let me refer you to Andrew Kirkpatrick of Adobe, who writes the Adobe Accessibility Blog, and who along with me and 40 other folks is a member of the Telecommunications and Electronic and Information Technology Advisory Committee providing recommendations for updates of accessibility standards issued under Section 508 of the Rehabilitation Act and guidelines under Section 255 of the Telecommunications Act. Perhaps you met Andrew during your time at Adobe? In any case, Andrew and Adobe are the experts on PDF accessibility, not me and Sun...

That said, please let me observe that PDF/X-1a (also known as ISO 15930-1:2001) and PDF/X-3 (also known as ISO 15930-3:2002) are "graphic technology standards" for use in "prepress digital data exchange". They are subsets of the full PDF standard. They are not for editable office documents; rather they are for the final step in a print document's life prior to being sent to the printer (whether that document started out life as a spreadsheet or a glossy 4-color magazine advertisement). Mathematical equation editing happens upstream - in places like office documents or other formats dedicated to preserving all of the semantic meaning of the equation. Likewise the gathering of form data.

Anyway, where I believe it makes sense to ask these questions is with PDF/A-1 (also known as ISO 19005-1:2005). This is a "document management" standard, which is described as an "electronic document file format for long-term preservation". It is here (rather than while in transit to a printer) that the preservation of accessibility information is important, and where our accessibility attentions make sense. I haven't yet read that standard, so I can't speak to what W3C specifications it does or does not incorporate. As I find out more, I'll post the answers here.

Tuesday Jul 03, 2007

Reviewing the "Accessibility of Ecma Office Open XML File Formats"

Since the start of the discussions around office document accessibility nearly two years ago - with the publication of Commonwealth of Massachusetts' Enterprise Technical Reference Model v3.5 in September 2005 specifying the use of OpenDocument format for office documents - I have seen NO clear and technical discussion of the accessibility of the Microsoft OXML format. Rather, in meetings I've been part of, Microsoft representatives have simply stated that since Microsoft Office is accessible (only when running on Microsoft Windows, and for the blind only when running with expensive 3rd party assistive technologies), it is automatically the case that the underlying file format fully supports everything needed for accessibility. This has been the first, last, and only word on the accessibility of MSOXML even while many European countries and American States have considered standardizing on an office file format - and grappled with accessibility concerns arising from that consideration. And while several folks I know in the accessibility field have contemplated undertaking such an evaluation of the file format, the 6,000+ page specification for MSOXML proved an effective deterrent.

That finally changed yesterday morning, with the publication of the "Accessibility of Ecma Office Open XML File Formats" white paper at the OpenXML Developer website. This welcome - if late - development allows us to finally start to have a real technical discussion of MSOXML accessibility, and to start to compare MSOXML accessibility support to what is in OpenDocument Format v1.1.

The 38 page "white paper discusses the Accessibility features of Open XML and using the Web Content Accessibility Guidelines 1.0 and XML Accessibility Guidelines ." It begins with this cover text:

Microsoft is offering this document as a contribution to, to further understanding of Open XML within the development community. Microsoft offers this to as a project to which others may contribute, to help improve the support for assistive technology in the development of software using the Ecma Office Open XML Format Specification.

This Microsoft-proffered document goes through each of the checkpoints of the May 1999 Web Content Accessibility Guidelines v1.0, and the 3rd Draft, October 2002 XML Accessibility Guidelines, and describes whether and how MSOXML supports these checkpoints. One particularly nice thing this document does for many of the checkpoints is to illustrate in XML precisely which MSOXML tags one would use to support that checkpoint. For example, for Checkpoint 1.1 of WCAG 1.0, on pages 7 & 8 the white paper shows how to use the tag to annotate shapes, grouped objects, and line/arrows with text equivalents, and the tag to do the same thing for images, charts, and 3D objects. These XML fragment illustrations are present for most of the provisions that Microsoft claims MSOXML supports.

While a very welcome contribution to the discussion of office file format accessibility, this document raises a number of new questions, even as it answers others:

  1. Who did this review? The white paper contains no list of authors, nor even the name of one or more groups at Microsoft. The only person associated with this document is Gray Knowlton, a member with no bio, and only one posting to the website to date. He appears to be a Group Product Manager for Microsoft Office, focused on Technical Product Management. What about people with accessibility expertise at Microsoft? I presume at least one of them was involved, but with what background? What about folks from the disability community? Was anyone with technical background from a blindness organization invited to contribute? Anyone with physical impairment expertise? Anyone making assistive technologies? Anyone making file conversion software to take office documents into Braille or DAISY? It is difficult to trust a document that has no attribution. Given the high stakes involved, it is difficult to accept something without peer review from experts who don't stand to profit from the results of the review.

  2. When and how will the accessibility failings cited in the paper be fixed? While the white paper introduction notes that the "adoption of Accessibility in the Open XML standard will help many technology providers carry forward the information stored in billions of existing documents AND preserve the information in that document intended to enable accessibility", it is silent on the question of whether and how MSOXML accessibility support itself will improve. For example, the white paper notes that MSOXML fails to support WCAG 1.0 checkpoints 4.2, 5.2, 9.4, 10.2, 12.1, 12.2, and 12.4. The white paper further notes that MSOXML only partially supports WCAG 1.0 checkpoints 6.4, 8.1, 9.1, and 11.1. Some of these are particularly important for blind users needing to understand the context of table cells and for good Braille and DAISY transcription of tables - issues we found in ODF v1.0 and fixed in ODF v1.1. Will these things get fixed in the future? If so, when? By whom? With what outside review (if any)? To appear in what update of the specification?

  3. Why use WCAG 1.0? It is widely recognized that WCAG 1.0 is very old. The Web Content Accessibility Guidelines v2.0 is in "last call". The white paper uses a draft of the XML accessibility specification, but then oddly doesn't do the same with WCAG 2.0, relying instead on a document that is over 8 years old.

  4. The white paper notes in many places that many of the WCAG 1.0 checkpoints are inappropriate for an office document (this is separate from the checkpoints noted above that the document says are either not supported or not fully supported). This suggests that the document author(s) don't feel that WCAG 1.0 is a fully appropriate standard to use. Yet there is no discussion about this - about why those standards were chosen and not others. Also no discussion about the accessibility purposes to which one might put an office document (e.g. meeting the needs of Braille or DAISY transcription).

  5. The white paper sometimes uses the phrase "this checkpoint is fully supported in Open XML file format", and other times the phrase "this checkpoint is supported in Open XML file format" (as distinct from the phrasing "this checkpoint is partially supported in Open XML file format"). Is there a difference between "supported" and "fully supported"? For those checkpoints that are merely "supported", what is missing (if anything)? Further, for many of the checkpoints that are "partially supported", the white paper doesn't state what is missing. Without undertaking a thorough reading of the 6,000+ page specification, it is difficult to know exactly what is missing. Without knowing who did this review and wrote this white paper, there is nobody we can ask this question of...

  6. In the review of MSOXML against the XML Accessibility guidelines, there is no definitive statement on whether and how MSOXML supports the first 8 checkpoints (1.1 through 3.2); only summaries of what those checkpoints say. For many other XML Accessibility guidelines checkpoints, there is some description of what MSOXML does, but no definitive statement about whether that means that the checkpoint is "fully supported", "supported", or "partially supported" by MSOXML (or yet some other thing). Does this mean that MSOXML fails to support those checkpoints? Does it mean that the checkpoints are inappropriate for an office document? Something else? It just isn't clear.

In addition to these important questions, this white paper nicely sidesteps what I believe is the intent of WCAG 1.0's Guideline 11: "Using W3C technologies and guidelines", and most specifically checkpoint 11.1 "Use W3C technologies when they are available and appropriate for a task and use the latest versions when supported." Unlike ODF, MSOXML makes almost no use of existing W3C standards. Instead of using the W3C XForms specification for embedded forms, it invents its own XML terminology for forms. Instead of using the SVG specification for vector graphics, it uses its own vector graphics encoding. Instead of using the MathML specification for mathematical equations, it uses its own math encoding. Instead of using the W3C date format, it invents its own (and perpetuates a leap year bug from the first releases of Microsoft Excel, rather than having the software that converts from .xls into MSOXML calculate the correct value). Yet oddly, on this topic, the white paper claims that checkpoint 11.1 is "partially supported", with the cheeky claim that "the Open XML file format has defined namespaces and elements and use [sic] them." This lack of re-use of existing (and accessibility-vetted) W3C XML standards is perhaps the main reason for the MSOXML specification running to 6,000+ pages, and is repeatedly cited by ISO voting members in their objections to the Fast Track ballot of MSOXML.
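For readers who haven't run into the leap year bug mentioned above, here is a minimal sketch of what a converter would have to do to calculate the correct value. It assumes the day-serial convention of Excel's 1900 date system, in which serial 1 is 1 January 1900 and serial 60 is treated as 29 February 1900, a day that never existed (1900 was not a leap year). The function name `serial_to_date` is my own, for illustration only, and is not taken from any specification.

```python
from datetime import date, timedelta

# Serial that the 1900 date system assigns to the nonexistent 29 Feb 1900.
PHANTOM_LEAP_DAY = 60


def serial_to_date(serial: int) -> date:
    """Convert a 1900-system day serial (1 = 1 Jan 1900) to a real calendar date."""
    if serial < 1:
        raise ValueError("serial must be >= 1")
    if serial == PHANTOM_LEAP_DAY:
        raise ValueError("serial 60 denotes 29 Feb 1900, which never existed")
    # Serials after the phantom day are one too high, so shift them back by one.
    offset = serial - 1 if serial < PHANTOM_LEAP_DAY else serial - 2
    return date(1900, 1, 1) + timedelta(days=offset)


print(serial_to_date(59))     # last real day before the phantom leap day
print(serial_to_date(61))     # first real day after it
print(serial_to_date(39272))  # a recent date, for comparison
```

Every consumer of the format has to carry this special case, whereas an ISO 8601 style date string such as the W3C date format would need no correction at all.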

P.S. a final note: I was disappointed to see that the PDF edition of this white paper failed to make full use of the accessibility features of PDF and be a properly tagged document. The first 8 words of the document appear doubled to screen readers: "AAcccceessssiibbiilliittyy ooff EEccmmaa OOffffiiccee OOppeenn XXMMLL FFiillee FFoorrmmaattss"


Peter Korn

