Wednesday May 29, 2013

javadoc TLC

The recent series of patches [1] for javadoc completes the work started during JDK 7 to change the internal data model for standard javadoc doclets from strings to a document tree. As a result of this work, there should no longer be any unnecessary internal conversion from tree nodes to strings and back again.

As a side-effect of this work, some bugs were uncovered and fixed: literal uses of '<', '>', and '&' were not always written as entities, and conversely, some HTML fragments were treated as plain text, so that those characters were incorrectly replaced with entities. Oops. Also, the indentation of method signatures should now be fixed, so that parameters and thrown exceptions are vertically aligned, as used to be the case.
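To illustrate the two kinds of bug, here is a minimal sketch of the distinction between literal text and HTML fragments. It is not the code in the doclet; the class name HtmlText and the method escape are made up for this example.

```java
// Hypothetical helper, not part of javadoc: shows which strings need entities.
final class HtmlText {
    private HtmlText() { }

    /** Replaces the characters that are special in HTML text with entities. */
    static String escape(String literal) {
        StringBuilder sb = new StringBuilder(literal.length());
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Literal text, such as a generic type in a signature, must be escaped.
        System.out.println(escape("Map<String, List<T>>"));

        // An HTML fragment from a documentation comment is already markup,
        // so it must be written as-is; escaping it would show the tags as text.
        String fragment = "<b>never</b> returns null";
        System.out.println(fragment);
    }
}
```

Getting this wrong in one direction produces invalid HTML; getting it wrong in the other shows raw tags to the reader.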

A more important side-effect is that the code to generate HTML content has been consolidated within the package, leaving the main internal taglet API to be more format-neutral. This should make it easier to provide doclets that write to alternate formats.

1. hg: jdk8/tl/langtools: 17 new changesets

Thursday Jun 02, 2011

What's Up, JavaDoc?

The Java documentation tool, javadoc, has been somewhat neglected in recent releases, but in JDK 7, it's been getting some amount of long-overdue TLC, albeit mostly under the covers.

Internally, the biggest change has been a rewrite of much of the standard doclet's page-generation code. Previously, javadoc worked by processing data structures modelling the API and then generating the HTML files through a combination of building strings and writing directly to an output stream, which means you need to know, in order, everything that has to be written. As anyone who has tried to do this knows, this is hard to get right; in a number of places javadoc got it wrong, and as a result it generated invalid HTML. Oops.

Now, the doclet works by creating an HTML "document tree" using a new family of internal HTMLTree classes. This allows pages to be created non-linearly when necessary, and allows the page to be written by simply walking the document tree. There's a special node that is used to hold user-provided HTML fragments, which may come from documentation comments or from command-line options. For now, these fragments are not checked for validity, but given valid input, javadoc now generates valid HTML as output.
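The idea of a document tree can be sketched in a few lines. This is only an illustration, not the internal classes themselves: the names HtmlTreeSketch, Node, Text, RawHtml and Element are all invented here, and attributes, indentation and doctype handling are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a document tree; not the javadoc internals.
final class HtmlTreeSketch {
    /** A node that knows how to write itself as HTML. */
    interface Node {
        void write(StringBuilder out);
    }

    /** Literal text: special characters become entities when written. */
    static final class Text implements Node {
        private final String text;
        Text(String text) { this.text = text; }
        public void write(StringBuilder out) {
            out.append(text.replace("&", "&amp;")
                           .replace("<", "&lt;")
                           .replace(">", "&gt;"));
        }
    }

    /** User-provided HTML fragment: written as-is and not checked. */
    static final class RawHtml implements Node {
        private final String html;
        RawHtml(String html) { this.html = html; }
        public void write(StringBuilder out) { out.append(html); }
    }

    /** An element with child nodes; the end tag can never be forgotten. */
    static final class Element implements Node {
        private final String name;
        private final List<Node> children = new ArrayList<>();
        Element(String name) { this.name = name; }
        Element add(Node child) { children.add(child); return this; }
        public void write(StringBuilder out) {
            out.append('<').append(name).append('>');
            for (Node child : children) {
                child.write(out);
            }
            out.append("</").append(name).append('>');
        }
    }

    static String render() {
        Element body = new Element("body");
        Element summary = new Element("ul");   // placed early, filled in later
        Element details = new Element("div");
        body.add(new Element("h1").add(new Text("Class List<E>")));
        body.add(summary);
        body.add(details);

        // Non-linear construction: add to the details and the summary together,
        // even though the summary appears earlier on the page.
        details.add(new Element("p")
                .add(new RawHtml("Returns <code>true</code> if empty.")));
        summary.add(new Element("li").add(new Text("isEmpty()")));

        // Writing the page is just a walk over the tree.
        StringBuilder out = new StringBuilder();
        body.write(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

With a plain output stream, the summary would have to be complete before anything after it could be written. With a tree, the order of construction and the order of output are independent.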

As part of the work to make sure that javadoc generates valid HTML output, we have updated the output to meet the Section 508 accessibility guidelines as well. This has caused some minor changes in the visual appearance, such as ensuring tables have captions, headings, and so on.

Also as part of this work, we have updated javadoc to use CSS and a stylesheet. This means that it is reasonably easy to change the appearance of the generated documentation by simply replacing the stylesheet in the generated documentation.

There have been some other, more subtle changes as well. javadoc used to be written in such a way that it could only be executed once in any VM. This was not a significant restriction as long as javadoc was run using the command-line tool, which starts a new VM for each invocation, but it was a significant impediment to speeding up test execution so that we can test javadoc more, and more often. We have also started the work to convert javadoc to use the Compiler API, although more work in this area is required.
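To make the single-VM point concrete, here is a minimal sketch of running javadoc twice within one VM, the way a test harness might. It assumes a JDK 8 or later runtime, where the javax.tools.DocumentationTool API is available; that API is newer than the work described in this entry and is used here only as a convenient entry point. The class name InProcessJavadoc is made up.

```java
import java.io.ByteArrayOutputStream;
import javax.tools.DocumentationTool;
import javax.tools.ToolProvider;

// Hypothetical harness: invokes the javadoc tool in the current VM.
final class InProcessJavadoc {
    private InProcessJavadoc() { }

    /** Runs javadoc with the given arguments and returns its exit code. */
    static int run(String... args) {
        DocumentationTool tool = ToolProvider.getSystemDocumentationTool();
        if (tool == null) {
            // Assumption: a full JDK is in use; some runtimes do not ship the tool.
            throw new IllegalStateException("no javadoc tool available in this runtime");
        }
        // Capture the tool's output so that repeated runs do not flood the console.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayOutputStream err = new ByteArrayOutputStream();
        return tool.run(null, out, err, args);
    }

    public static void main(String[] args) {
        // Two invocations in the same VM; a real test would pass source files.
        for (int i = 1; i <= 2; i++) {
            System.out.println("run " + i + " exit code: " + run("-help"));
        }
    }
}
```

Avoiding a new VM per invocation is what makes it practical to run a large number of javadoc tests quickly.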

What's Next?

Now that javadoc uses a better foundation to build and generate compliant HTML, we are much better placed going forward to consider more radical changes to the contents of the pages, including the possible use of JavaScript for search and menu operations.

Having turned our attention to the way javadoc writes its output, it's now time to also turn our attention to the way it reads its input. In order to be sure we are generating truly compliant (X)HTML, we need to be able to detect issues in any user-provided (X)HTML fragments, in documentation comments or options. One way to facilitate this will be to extend the com.sun.source Compiler Tree API to provide structured access to the content of documentation comments.

Finally, the com.sun.javadoc API has largely been superseded by the new javax.lang.model Language Model API, and so with new language features on the horizon that will require javadoc support, it may be appropriate to migrate the standard doclet onto the newer API.

Thanks to Bhavesh Patel for providing feedback on this entry, and for working on the features described here.

