Evidence-based management needs comprehensible information; metrics are distilled facts: not a bad fit.

Here is a series of blogs giving a metric that can be useful in many areas of XML project management, from verifying the suitability of adopting a particular schema, to making sure that only work and capabilities arising from business requirements are being carried out, to estimating the price variation that a schema change may entail.

Everyone using XML already uses a metric: well-formedness! Validity is also a metric. (I am simplifying away the difference between a metric and a measure in these blogs: pedants please lower your hackles!) But the metrics for XML on the Web are either concerned with communications and information theory, or are based on programming complexity measures, or are a little polluted by voodoo ideology about good structures and bad structures; I don’t buy into the latter, at least not at the current state of knowledge. But there is a need for a good set of metrics for XML project management and scoping, and to inform XML schema governance, so I thought people might be interested in some of the metrics I have been developing and using.

They all address different, but to me vitally important, aspects of XML projects, and most are, I hope, common sense. Of course, you can make up your own metrics as well: but I think it is good to at least have a basic vocabulary of XML metrics to use or adapt or decry as appropriate.

Element and Attribute Count

This most basic and coarse metric asks the question “How many element and attribute names are there?”

Take a schema or document set, count the unique element names and the unique attribute names, and sum them.
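
For a document set, the counting step could be sketched roughly as below. This is only an illustration, not a reference implementation: the function name `element_and_attribute_count` and the sample document are made up for the example, and it assumes that names are compared in their namespace-qualified form and that an element and an attribute sharing the same name are counted separately.

```python
import xml.etree.ElementTree as ET


def element_and_attribute_count(xml_texts):
    """Sum of unique element names and unique attribute names in a document set.

    Assumption: names are compared as ElementTree reports them, i.e. with any
    namespace URI included, so the same local name in two namespaces counts twice.
    """
    element_names = set()
    attribute_names = set()
    for text in xml_texts:
        root = ET.fromstring(text)
        # iter() walks the root element and every descendant element.
        for node in root.iter():
            element_names.add(node.tag)
            attribute_names.update(node.attrib.keys())
    return len(element_names) + len(attribute_names)


# Hypothetical sample: 4 element names (table, row, name, city), 1 attribute name (id).
sample = """<table>
  <row id="1"><name>Ann</name><city>Oslo</city></row>
  <row id="2"><name>Bob</name><city>Lima</city></row>
</table>"""

print(element_and_attribute_count([sample]))  # 5
```

Repeated rows do not change the result, because only distinct names are counted. For a schema rather than a document set, you would count the declared names instead.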

It is a fine metric for schemas where elements or attributes only appear in a single context, with a single meaning. For example, a flat database dump of a single table with 50 fields has a metric of 51; a dump of a single table with 100 fields has a metric of 101; the idea that in some sense the second table is twice as big as the first (as the metric suggests) is obvious.

For other kinds of documents, it becomes less attractive. Mixed content, multiple contexts, attributes used on multiple elements, all these things make a document or schema somehow more complicated, and the Element and Attribute Count metric doesn’t reflect that.