Saturday, July 19, 2008

Automating Documentation

Lately I have been doing quite a bit of documentation of other people's code, on the premise (with apologies to George Bernard Shaw) that "those who can, code; those who can't, document" :-).

The documentation is primarily for home-grown application level frameworks based on Spring, which allow us to plug in custom behavior using implementations of predefined classes, using hook points in the standard strategy code. The hooks are exposed as bean properties of the strategy bean, and the properties default to classes that define the standard behavior. As you can imagine, this can quickly lead to XML hell, unless you know what the hooks allow, and what custom classes already exist to modify a certain behavior. So documenting the custom classes and where they fit in can help new developers get up to speed on the framework more quickly.

Yet another reason to document is for non-coders to look at and suggest improvements. Due to the nature of our business, a lot of people outside the programming group have a very solid understanding of our technology, and are therefore in a great position to suggest improvements we (coders) haven't thought of. However, they don't write code anymore, so they cannot see in-line comments in code.

So while I firmly maintain (like most programmers) that the best place for documentation is in-line in the code itself, there are enough reasons to spend the effort documenting the code for people who have yet to look at the code, or who never will. However, rather than writing documentation separately from the code, my preferred approach is to generate it from the in-line code comments.

This approach has (at least) three important advantages. One, it encourages programmers to write better in-line documentation in their classes. Two, it allows the documentation to keep up with a rapidly evolving code base without getting stale. Three, it eliminates the drudgery of writing documentation, one reason why, in any project, documentation never keeps up with code, unless you have a dedicated documentation writer for your project.

So, given an applicationContext.xml file, it's easy to pull out the names of the beans that are of a certain class, like so:

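    // needs java.util.Arrays, java.util.HashSet and java.util.Set
    ApplicationContext context = 
      new ClassPathXmlApplicationContext("classpath:applicationContext.xml");
    // getBeanNamesForType() returns a String[], so wrap it in a List first
    Set<String> beanNames = new HashSet<String>(
      Arrays.asList(context.getBeanNamesForType(MyStrategyClass.class)));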

Given these bean names, we can use a standard XML parser toolkit such as JDOM to parse out the beans whose names are in our list of bean names.

    private static final Namespace BEANS_NS = 
      Namespace.getNamespace("http://www.springframework.org/schema/beans");
    ...
    // root is the <beans> root element of the parsed applicationContext.xml
    List<Element> beanElements = root.getChildren("bean", BEANS_NS);
    for (Element beanElement : beanElements) {
      String id = beanElement.getAttributeValue("id");
      // only document the beans that are of the strategy class
      if (beanNames.contains(id)) {
        documentBean(beanElement);
      }
    }

The documentBean() method will go through each of the properties in the bean and attempt to document them as well. If the property has a ref attribute, then it grabs the bean definition for that ref and calls documentBean() on the ref element. I do not show the code for the documentBean() method since it is very implementation specific (I write to a wiki format, some others may write to a DocBook XML or HTML format, etc).
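
For readers who want a starting point, here is a rough sketch of the recursion described above. It is not the implementation used for this post: it swaps JDOM for the JDK's built-in DOM parser so that it runs without extra libraries, the BeanDocumenter class and its bullet-list output format are made up for illustration, and it only handles the ref and value attributes on property elements (nested ref or bean elements are ignored).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: documents a bean and follows its ref attributes. */
final class BeanDocumenter {
  private final Map<String, Element> beansById = new LinkedHashMap<String, Element>();

  BeanDocumenter(String contextXml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder().parse(
          new ByteArrayInputStream(contextXml.getBytes(StandardCharsets.UTF_8)));
      // Index every top-level <bean> by id so that refs can be resolved later.
      NodeList children = doc.getDocumentElement().getChildNodes();
      for (int i = 0; i < children.getLength(); i++) {
        Node node = children.item(i);
        if (node instanceof Element && "bean".equals(node.getLocalName())) {
          Element bean = (Element) node;
          beansById.put(bean.getAttribute("id"), bean);
        }
      }
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse context XML", e);
    }
  }

  /** Returns an indented bullet list for the bean and everything it refers to. */
  String document(String beanId) {
    StringBuilder out = new StringBuilder();
    documentBean(beansById.get(beanId), 0, new HashSet<String>(), out);
    return out.toString();
  }

  private void documentBean(Element bean, int depth, Set<String> path, StringBuilder out) {
    if (bean == null) {
      return; // the ref points at a bean that is not defined in this file
    }
    String id = bean.getAttribute("id");
    indent(out, depth).append("* ").append(id)
        .append(" (").append(bean.getAttribute("class")).append(")\n");
    if (!path.add(id)) {
      return; // circular reference: print the bean once, but do not recurse
    }
    NodeList children = bean.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node node = children.item(i);
      if (!(node instanceof Element) || !"property".equals(node.getLocalName())) {
        continue;
      }
      Element property = (Element) node;
      indent(out, depth + 1).append("* ").append(property.getAttribute("name"));
      if (property.hasAttribute("ref")) {
        String ref = property.getAttribute("ref");
        out.append(" -> ").append(ref).append('\n');
        documentBean(beansById.get(ref), depth + 2, path, out);
      } else {
        out.append(" = ").append(property.getAttribute("value")).append('\n');
      }
    }
    path.remove(id);
  }

  private static StringBuilder indent(StringBuilder out, int depth) {
    for (int i = 0; i < depth; i++) {
      out.append("  ");
    }
    return out;
  }
}
```

Calling new BeanDocumenter(xml).document("strategy") returns one bullet per bean and property, with referenced beans nested underneath the property that points to them. The path set is there because Spring allows circular references between setter-injected beans, and without it the recursion would never end.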

The code above will expose the structure, but does not yet describe what each bean does. For that, I rely on the class level Javadocs for each class. To extract the class level Javadocs, I use the QDox library, using the following code. QDox parses the source Java files, so you will need those around in a defined SOURCE_PATH somewhere.

  private static final String SOURCE_PATH = "src/main/java/";
  ...
  private String getClassLevelJavadoc(String className) throws Exception {
    // map the fully qualified class name to its source file under SOURCE_PATH
    File javaFile = new File(SOURCE_PATH + StringUtils.replace(className, ".", "/") + ".java");
    JavaDocBuilder builder = new JavaDocBuilder();
    JavaSource source = builder.addSource(javaFile);
    JavaClass[] javaClasses = source.getClasses();
    // a source file can declare several classes, so pick the one we asked for
    JavaClass mainClass = null;
    for (JavaClass javaClass : javaClasses) {
      if (javaClass.getFullyQualifiedName().equals(className)) {
        mainClass = javaClass;
        break;
      }
    }
    if (mainClass == null) {
      // no class in the file matches the requested name, nothing to document
      return null;
    }
    String comment = mainClass.getComment();
    // post-process the comment (implementation specific)
    // ...
    return comment;
  }

You may need to post-process the comments returned if you are writing to a wiki. For example, my post-processing code replaces a single newline with a space, but two newlines with a single one, and escapes WikiWords.
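
As an illustration only, the post-processing described above could look something like the sketch below. The WikiCommentFormatter class is hypothetical, and two details are assumptions rather than anything from my actual code: a WikiWord is taken to be two or more capitalized runs joined together (such as MyStrategyClass), and a WikiWord is escaped by prefixing it with "!", which is the convention in some wikis but not all of them.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch: turns a Javadoc comment into wiki-friendly text. */
final class WikiCommentFormatter {
  // Assumed WikiWord shape: two or more capitalized runs, e.g. "MyStrategyClass".
  private static final Pattern WIKI_WORD =
      Pattern.compile("\\b(?:[A-Z][a-z0-9]+){2,}\\b");

  static String format(String comment) {
    if (comment == null) {
      return ""; // the class has no Javadoc comment
    }
    String text = comment.replace("\r\n", "\n").trim();
    // A lone newline is just line wrapping inside a paragraph: make it a space.
    text = text.replaceAll("(?<!\n)\n(?!\n)", " ");
    // Two or more newlines mark a paragraph break: collapse them to one.
    text = text.replaceAll("\n{2,}", "\n");
    // Escape WikiWords so the wiki does not turn them into links
    // (the "!" prefix is an assumption, adjust it for your wiki).
    return WIKI_WORD.matcher(text).replaceAll("!$0");
  }
}
```

For example, WikiCommentFormatter.format("Selects results.\nUses MyStrategyClass by default.\n\nSecond paragraph.") returns "Selects results. Uses !MyStrategyClass by default.\nSecond paragraph.". The single newlines have to be handled before the paragraph breaks are collapsed, otherwise the collapsed breaks would be turned into spaces as well.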

QDox is not limited to pulling out only the class-level Javadocs. It parses the source file into a bean that allows you to access both class and method level tags by name, as well as a host of other things. However, in my case, my documentation needs are satisfied by a well-written class level Javadoc comment, so that's what I used.

A better known approach in the Java world to do this sort of thing is to use XDoclet. In SQLUnit, one of my open source projects, I used it to parse out custom @sqlunit.xxx class and method level Javadoc tags and convert them to DocBook XML snippets, which I then imported into the source for my User Guide. While QDox solves a similar problem, it is simpler to use in my opinion, since you don't have to write XML converters.
