Published on ONDotNet.com (http://www.ondotnet.com/)


Using NDoc: Adding World-Class Documentation to Your .NET Components

by Shawn Van Ness
12/09/2002

I've never been a big fan of source-code-based documentation generators -- tools that attempt to produce reference documentation by mining specially formatted comments out of source code; at least, not a big fan of any of the popular ones that have existed for C++ and Java -- and certainly not for any purpose greater than supplementing an internal spec doc. The concept is clearly of great value: by scanning the source code, the doc-generator can alert the author to any code items that are missing documentation. But historically, in this developer's opinion, they've always suffered from too many problems to be truly useful -- from buggy parsers to a wide range of usability issues, I've just never met a doc-generator that was worth the effort.

In fact, I so despised the lot of them that I once took a crack at writing one myself. But the rumors are true -- I've recently met a new documentation generator, and I've fallen in love. Its name is NDoc, and I do believe it loves me too.

NDoc Is Not Just Another Source-Code-Based Documentation Generator

Strictly speaking, NDoc isn't really a source-code-based documentation generator at all. It's simply a very fancy set of XSLT templates that consume the XML doc-comments emitted by the C# compiler and emit a folder full of HTML files (in a style very reminiscent of the .NET docs on MSDN), along with the associated manifests for building compiled HtmlHelp modules, or CHM files. All this is wrapped in a nice GUI and command-line interface, of course.

What's the problem with source-code-based documentation generators? Historically, tools that mean to generate docs based on source code face two fundamental problems. The first problem is that they rely on third-party parsers, which have a tendency to be buggy. Let's face it, writing a parser for languages like C++ and IDL is a serious challenge. Modern languages like Java and C#, perhaps less so -- but still it's a non-trivial exercise, and there are bound to be bugs, oversights, and omissions. Sometimes it's something complex, like a nested class with a static constructor, other times it's something simple like a variable name containing an "ü" character. But there always seems to be some obscure corner case in the language's grammar that doesn't play well with the documentation generator's parser.

The other problem -- suffered by virtually all traditional documentation generators -- is that they tend to swamp one's code with "green". That is, if you actually want to write more than a sentence or two for a class or method (heaven forbid!), your actual code will become lost in a sea of doc-comments.

Furthermore, any serious development project with external documentation deliverables will, at some point, probably want to hand off the reference documentation to a team of technical writers. But nobody wants a horde of tech writers stomping all over their product's source code control system! What to do? There's just no easy way to separate doc-comments from source -- the two are inextricably linked (which, arguably, is the point of having a source-code-based documentation generator in the first place).

And all of this is to say nothing of the fact that source code text editors do not typically offer a very user-friendly interface for writing and maintaining English-language documentation.

I attempted to solve this second problem when I created DocGen (see References) by keeping the English-language documentation in a separate XML file, and "weaving" it into the code structure parsed from an IDL file. The technique worked well -- the XML file itself could be handed off to a technical writer, even checked out separately so as not to interfere with the development effort proper. And yet, the tool could still alert us to missing documentation, as it attempted to match the contents of the IDL file to elements in the XML file. Unfortunately, DocGen failed to rise above the first problem (IDL is a very ugly language, and our parser was a veritable nest of bugs). But the premise was sound.

NDoc (and .NET) to the Rescue!

Addressing the first problem, NDoc doesn't rely on a buggy C# parser. It doesn't have to scan your C# code at all -- it uses .NET reflection to peer inside of your compiled assemblies' metadata tables, which contain the name of every class, structure, method, event, and so forth, in your entire project. Unlike type libraries in the days of COM, the type information in a .NET assembly is complete and true, with nothing left to the imagination -- perfectly suited for generating full-fidelity documentation!
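
To give a feel for what reflection makes available, here is a minimal sketch (not NDoc's actual code) that lists the public types in an assembly and the members each one declares. The MetadataWalker class and its Describe method are names invented for this illustration; the reflection calls themselves (Assembly.GetExportedTypes and Type.GetMembers) are standard .NET APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper, for illustration only; NDoc's real implementation differs.
public static class MetadataWalker
{
    // Returns one line per public type, followed by one line per
    // public member declared directly on that type.
    public static List<string> Describe(Assembly assembly)
    {
        var lines = new List<string>();
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance |
                                   BindingFlags.Static | BindingFlags.DeclaredOnly;

        foreach (Type type in assembly.GetExportedTypes())
        {
            lines.Add("type " + type.FullName);
            foreach (MemberInfo member in type.GetMembers(flags))
            {
                // MemberType is an enum: Constructor, Method, Property, Event, Field, ...
                lines.Add("  " + member.MemberType + " " + member.Name);
            }
        }
        return lines;
    }
}
```

Passing an assembly loaded with Assembly.LoadFrom to Describe yields the same inventory of names that a doc-generator needs, without ever touching the source text.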

Addressing the second problem, NDoc doesn't force you to drown your source code in a sea of doc comments -- it fetches documentation text from an XML file. NDoc matches documentation from this XML file against type information from a .NET assembly to produce a folder full of HTML files (and optionally, a compiled CHM file). Although many programmers out there do choose to use C#'s doc-comment feature to generate this XML file, it's not strictly mandatory. And, as we'll see later, there are ways to point the C# compiler to an external XML file, as well.

Now how much would you pay? Nothing? That's fine, because NDoc is freely available on SourceForge!

A Quick Overview of C#'s XML Documentation Feature

The C# compiler includes a command-line switch (/doc:<file>) that instructs the compiler to emit an XML file containing docs mined from the C# source code. Such docs are marked up as XML, embedded within line comments that begin with a triple forward-slash sequence, like so:


using System;

namespace Arithex.Samples
{
  /// <summary>
  /// This is a summary of the Foo class.
  /// </summary>
  public class Foo
  {
    /// <summary>
    /// This is a summary of the Bar method.
    /// </summary>
    public void Bar(int x, int y)
    { }
  }
}

The XML produced for the above C# code looks something like this:


<?xml version="1.0"?>
<doc>
  <assembly>
    <name>ClassLibrary1</name>
  </assembly>
  <members>
    <member name="T:Arithex.Samples.Foo">
      <summary>
      This is a summary of the Foo class.
      </summary>
    </member>
    <member name="M:Arithex.Samples.Foo.Bar(System.Int32,System.Int32)">
      <summary>
      This is a summary of the Bar method.
      </summary>
    </member>
  </members>
</doc>

Now, this XML "schema" is a bit obtuse -- it models pretty much everything as a <member> element, with a single attribute, name, that indicates whether the item is a type, a method, a property, an event, or a field, via a single-letter prefix. The details (such as the types of a method's parameters) are also encoded into the name attribute in a crufty text format. It's ugly, but I suppose it gets the job done -- NDoc can consume it, and that's all that matters! For a thorough tour of C#'s doc comment feature and the XML it generates, I refer you to the MSDN links in the References section of this article. However, as a convenient reference, I've created a mapping of which XML doc-comment elements apply to which aspects of the C# language, and vice versa, in Tables 1 and 2.
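
As a rough illustration of that encoding, the sketch below decodes the single-letter prefix of a member name and strips off the parameter list. The DocMemberId class and its method names are hypothetical; only the five kinds mentioned above are handled, and anything else is reported as "unknown".

```csharp
using System;

// Hypothetical helper for picking apart <member name="..."> values.
public static class DocMemberId
{
    // Maps the single-letter prefix to the kind of item it denotes.
    public static string KindOf(string memberName)
    {
        Validate(memberName);
        switch (memberName[0])
        {
            case 'T': return "type";
            case 'M': return "method";
            case 'P': return "property";
            case 'E': return "event";
            case 'F': return "field";
            default:  return "unknown";
        }
    }

    // Strips the prefix and any parameter list, leaving the qualified name.
    public static string QualifiedName(string memberName)
    {
        Validate(memberName);
        string body = memberName.Substring(2);
        int paren = body.IndexOf('(');
        return paren < 0 ? body : body.Substring(0, paren);
    }

    // Every ID is expected to look like "X:Some.Qualified.Name".
    private static void Validate(string memberName)
    {
        if (memberName == null || memberName.Length < 3 || memberName[1] != ':')
            throw new ArgumentException("Expected an ID such as \"T:Namespace.Type\".", "memberName");
    }
}
```

For the Bar method in the listing above, KindOf reports a method and QualifiedName returns Arithex.Samples.Foo.Bar; the parameter types remain available in the text between the parentheses.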

Table 1: XML doc-comment elements vs. C# language elements

XML doc element   C# language element
---------------   -------------------
<c>               n/a -- used to mark up code, inline
<code>            n/a -- used to mark up blocks of code, as in an example
<example>         n/a -- used to declare a block of example code
<exception>       constructors, methods, properties -- used to declare the exceptions your code might throw
<include>         n/a -- used to import some XML docs from an external XML file
<list>            n/a -- used to mark up a list of items
<para>            n/a -- used to separate paragraphs
<param>           methods, constructors, delegates -- used to provide documentation for individual parameters
<paramref>        n/a -- used to mark up a reference to a method parameter
<permission>      methods, constructors, properties -- used to declare which .NET permissions are needed in order to call a member
<remarks>         classes, structs, interfaces, delegates, enums, methods, constructors, properties, events, fields -- this is the "long" description of a C# code element
<returns>         methods, delegates
<see>             n/a -- used to mark up a link to another documentation item, inline
<seealso>         anything -- used to provide a manifest of related documentation items
<summary>         classes, structs, interfaces, delegates, enums, methods, constructors, properties, events, fields -- this is the "short" description of a C# code element
<value>           properties

Table 2: C# language elements vs. XML doc-comment elements

C# language element   XML doc elements
-------------------   ----------------
class                 <summary>, <remarks>, <seealso>*
struct                <summary>, <remarks>, <seealso>*
interface             <summary>, <remarks>, <seealso>*
delegate              <summary>, <remarks>, <seealso>*, <param>*, <returns>
enum                  <summary>, <remarks>, <seealso>*
constructor           <summary>, <remarks>, <seealso>*, <param>*, <permission>*, <exception>*
property              <summary>, <remarks>, <seealso>*, <value>, <permission>*, <exception>*
method                <summary>, <remarks>, <seealso>*, <param>*, <returns>, <permission>*, <exception>*
event                 <summary>, <remarks>, <seealso>*
field                 <summary>, <remarks>, <seealso>*

*may have multiple instances of this element
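
To see several of these elements working together, consider the following sketch. The Account class is hypothetical and exists only to show where each tag goes; the tags are the ones listed in the tables above.

```csharp
using System;

namespace Arithex.Samples
{
    /// <summary>A hypothetical bank account, used only to show the tags.</summary>
    /// <remarks>A longer discussion of the class would go here.</remarks>
    public class Account
    {
        private decimal balance;

        /// <summary>Creates an account with an opening balance.</summary>
        /// <param name="openingBalance">The initial balance.</param>
        public Account(decimal openingBalance)
        {
            balance = openingBalance;
        }

        /// <summary>Gets the current balance.</summary>
        /// <value>The amount currently held in the account.</value>
        public decimal Balance
        {
            get { return balance; }
        }

        /// <summary>Withdraws money from the account.</summary>
        /// <param name="amount">The amount to withdraw.</param>
        /// <returns>The balance remaining after the withdrawal.</returns>
        /// <exception cref="System.ArgumentOutOfRangeException">
        /// <paramref name="amount"/> is negative or exceeds <see cref="Balance"/>.
        /// </exception>
        /// <seealso cref="Balance"/>
        public decimal Withdraw(decimal amount)
        {
            if (amount < 0 || amount > balance)
                throw new ArgumentOutOfRangeException("amount");
            balance -= amount;
            return balance;
        }
    }
}
```

Even this small class shows the "green" problem described earlier: the doc-comments already outweigh the code they describe.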

Using the <include> Tag To Keep Docs Out of Sight, But Not Out of Mind

As I mentioned earlier, one of the traditional problems with source-code-based documentation generators is that in order to write full-fledged documentation for your product, you must swamp your code in comments. (Too many comments? Yes indeed, it is always possible to have too much of a good thing.)

The doc-comment feature of the C# compiler addresses this problem nicely by allowing you to keep the meat of your documentation in a separate XML file. You need only insert one line of doc-comment code, the <include> element, to link that part of your source code to the appropriate bits in your XML doc.

Use of the <include> element requires you to specify two attributes: file, the relative path to the XML file, and path, an XPath expression that points to the XML node(s) to include. If you're new to XPath, this is a perfect opportunity to learn this important bit of XML technology -- see the References section for links to some XPath tutorials.


/// <include file='myrefdocs.xml' 
///     path='//Class[@name="Foo"]/summary'
/// />
/// <include file='myrefdocs.xml' 
///     path='//Class[@name="Foo"]/remarks'
/// />
/// <include file='myrefdocs.xml' 
///     path='//Class[@name="Foo"]/seealso'
/// />
public class Foo
{ ... }

The XPath expressions in the above example grab the <summary>, <remarks>, and <seealso> child elements from every <Class> element whose name attribute is "Foo". The <Library>, <Namespace>, and <Class> element names are my own invention -- you're free to invent your own XML grammar to organize your documentation. The matching myrefdocs.xml file looks like this:


<?xml version="1.0"?>
<Library>
  <Namespace name="Arithex.Samples">
    <Class name="Foo">
      <summary>This is a sample class.</summary>
      <remarks>
        Here are some interesting remarks about the Foo sample class.
      </remarks>
      <seealso cref="Bar" />
    </Class>
  </Namespace>
</Library>

For serious software engineering projects, it's well worth your time to make use of the <include> element in this way. You can split the documentation up among multiple XML files, in any way you wish -- you'll be able to hand off these XML doc files to a team of tech writers, or even other engineers, who'll be able to check the files in and out of source code control independent of the actual C# source code, thus avoiding many merging-related headaches for you or your team. For example, in a project I'm currently working on, we've decided to split the XML documentation into separate peer files for each C# file -- where Foo.cs would be accompanied by a Foo.xml file, which contains the meat of the documentation. For smaller projects, keeping all of the docs in a single XML file may serve you better. You can even choose to mix and match (perhaps keeping the <summary> elements inline in your C# code, and everything else out-of-line in a separate XML file).

NDoc Extensions to C#'s XML Documentation Tags

There are a few important bits missing from Microsoft's specification of C#'s XML Documentation feature. The most notable of these is a way to specify a common summary for the overloads of a function. This is where NDoc really comes into its own: because the format of the docs output by the C# compiler is XML, it's trivial for NDoc to extend the schema to include any missing bits. The C# compiler doesn't complain -- it will pass through any XML tags it doesn't recognize into the resulting documentation, where they can be interpreted by third-party tools (like NDoc).

NDoc extends the set of Microsoft-defined tags by defining a new <overloads> tag, a new <events> tag for documenting the events raised by a member function, and extending the grammar of the <see> and <seealso> tags to allow linking to arbitrary URLs (not just other code items).

You can find some rudimentary documentation for these extended tags in the tags.html file, which is included in the NDoc distribution. The <overloads> tag requires a bit of explanation -- it comes in two flavors: simple and complex.

The simple form of <overloads> is used very much like a <summary> tag, to provide a short description string for the suite of overloaded methods. (If you neglect to provide an <overloads> tag, NDoc will use the <summary> tag on the first overload to describe the overloaded method.)
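
A minimal sketch of the simple form follows. The MessageLog class is hypothetical, and placing the <overloads> tag on just one of the overloads is my understanding of NDoc's convention rather than something taken from its documentation, so check tags.html before relying on it.

```csharp
using System;
using System.Text;

// Hypothetical class, used only to show where <overloads> might go.
public class MessageLog
{
    private readonly StringBuilder text = new StringBuilder();

    /// <overloads>Appends a message to the log.</overloads>
    /// <summary>Appends a message once.</summary>
    /// <param name="message">The text to append.</param>
    public void Write(string message)
    {
        Write(message, 1);
    }

    /// <summary>Appends a message a given number of times.</summary>
    /// <param name="message">The text to append.</param>
    /// <param name="count">How many copies to append.</param>
    public void Write(string message, int count)
    {
        for (int i = 0; i < count; i++)
            text.AppendLine(message);
    }

    /// <summary>Returns everything written so far.</summary>
    public override string ToString()
    {
        return text.ToString();
    }
}
```

The C# compiler treats <overloads> as just another unrecognized tag and passes it through, so the code compiles with or without NDoc in the picture.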

Figure 1: NDoc-generated text for an overloaded method


The more complex form of <overloads> allows you to specify other common documentation items, such as a <remarks> section. Unfortunately, it's not a general-purpose dumping ground for all documentation items that are relevant to the individual overloaded methods -- you can't, for example, place documentation for the overloads' common parameters in the <overloads> element and expect them to show up. However, with a little creative application of the <include> tag, you can easily achieve this effect. The sample project accompanying this article demonstrates the combined use of the <overloads> and <include> tags, for an overloaded method named Solve:


<Method name="Solve">
  <docs>
    <overloads>Attempts to solve the function.</overloads>
    <param name="guessX1">An initial guess.</param>
    <param name="guessX2">Another initial guess.</param>
    <param name="resultPrecision">
      Desired precision for solution.
    </param>
    <param name="resultStatus">
      The outcome of the solution.
    </param>
    <returns>
      The root of the equation (or Double.NaN, upon error).
    </returns>
  </docs>
</Method>

<Method name="Solve1">
  <docs>
    <summary>Attempts to solve the function, using the default 
      max. number of iterations.</summary>
    <remarks>Attempts to solve the function, using the default 
      max. number of iterations.</remarks>
  </docs>
</Method>

<Method name="Solve2">
  <docs>
    <summary>Attempts to solve the function, using the specified 
      max. number of iterations.</summary>
    <remarks>Attempts to solve the function, using the specified 
      max. number of iterations.</remarks>
    <param name="maxIterations">
      Maximum number of iterations to search.
    </param>
  </docs>
</Method>
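
The C# side of this arrangement is not reproduced here, so what follows is only a sketch of how the overloads might pull in the XML above. The file name SolverDocs.xml, the Solver class, the SolveStatus enum, the DefaultMaxIterations constant, and the method signatures are all assumptions made for the illustration; only the element and parameter names come from the listing.

```csharp
using System;

// Hypothetical outcome type for the resultStatus parameter.
public enum SolveStatus { Converged, Failed }

// Hypothetical host class; the XML file name and XPath layout are assumptions.
public abstract class Solver
{
    public const int DefaultMaxIterations = 100;

    // Shared docs (overloads, common params, returns) plus this overload's own.
    /// <include file='SolverDocs.xml' path='//Method[@name="Solve"]/docs/*' />
    /// <include file='SolverDocs.xml' path='//Method[@name="Solve1"]/docs/*' />
    public double Solve(double guessX1, double guessX2,
                        double resultPrecision, out SolveStatus resultStatus)
    {
        return Solve(guessX1, guessX2, resultPrecision,
                     DefaultMaxIterations, out resultStatus);
    }

    // Only the shared params and returns here, so <overloads> appears once.
    /// <include file='SolverDocs.xml' path='//Method[@name="Solve"]/docs/param' />
    /// <include file='SolverDocs.xml' path='//Method[@name="Solve"]/docs/returns' />
    /// <include file='SolverDocs.xml' path='//Method[@name="Solve2"]/docs/*' />
    public abstract double Solve(double guessX1, double guessX2,
                                 double resultPrecision, int maxIterations,
                                 out SolveStatus resultStatus);
}
```

The point of the arrangement is that the common parameter descriptions are written once, under the Solve entry, and each overload adds only what is unique to it.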

Linking to External Documentation

One of the nicer features of NDoc is the way it provides seamless links to documentation in the .NET Framework SDK -- it even goes so far as to pull documentation for inherited members out of the XML doc-comment files that accompany the assemblies in which they are defined -- but it can only provide links for the types it knows about.

For other types -- types defined in third-party assemblies -- you're on your own. That's not to say the situation is impossible, just that you have to do a little work.

Let's say you have a method with a parameter of type AcmeSoft.Foo, defined in AcmeSoft.dll. You know that AcmeSoft has wonderful documentation for their product, available online at acmesoft.com, but there's a problem: NDoc will generate a hyperlink from your method's page to a non-existent help topic "AcmeSoft.Foo.html". If you try to follow this link in the HtmlHelp browser, you'll be greeted with an error message. That page doesn't exist in your CHM file, because that type doesn't exist anywhere in your assembly, so NDoc didn't generate it. To rectify the situation, all you need to do is supply the missing ingredient: author an HTML page (manually), name it "AcmeSoft.Foo.html," and pre-place it into the NDoc output directory before running NDoc (or at least before running the HtmlHelp compiler).

It's a bit of a hack -- relying on an accident of NDoc's implementation, perhaps -- but it works, until a future version of NDoc offers a better alternative. And by writing your own page, you're allowed the flexibility you need to warn the user that they're about to start reading third-party documentation, and/or that you're about to connect them to the Internet, etc.

Conclusion

With each new project I work on, I grow continuously more impressed with the flexibility and versatility of NDoc. It's saved me hundreds of hours of tedious labor engineering my own MSDN-style documentation solution, and it looks good doing it!

For a jumpstart into the wonderful world of NDoc, download the sample project accompanying this article, and see if you don't fall in love yourself.

References

Shawn Van Ness is an independent consultant specializing in the development of secure, efficient, and robust distributed software systems, based on .NET, COM, and XML technologies.



Copyright © 2009 O'Reilly Media, Inc.