I have recently been getting back into XML processing for some upcoming work with business glossaries. So, I will be examining the relevant business glossary formats in forthcoming blog posts…

As a precursor to that, I would like to delve a bit into XML design by bringing up something that has troubled me on several occasions over the past six months. First, while writing a simple iTunes de-dupe utility, I had occasion to parse and process the iTunes XML format. There are many good articles on this format, like this xml.com article entitled Hacking iTunes. The iTunes format is an XML data-dump of a data structure called a “plist”, which is essentially a collection of key/value pairs. In fact, “plist” stands for “property list”. Using XML to persist data structures does bring some minimal benefits from text encoding, but it works against the larger goal of XML being easily understandable and thus processable by many applications. So dumping data structures to XML is not evil, but it is not recommended either. The reason is that the data remains fairly tightly coupled to the program that produced it, and thus the semantic value of the data, as a standalone entity, is diminished. In short: Better to design XML documents than dump XML data.
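To make that coupling concrete, here is a minimal sketch in Python of what a consumer has to do with a plist-style dump. The fragment is a simplified illustration of the key/value layout, not a verbatim copy of an iTunes library file, and the helper name plist_dict_to_python is my own invention for this example.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative plist-style fragment (not a real iTunes library file).
PLIST_FRAGMENT = """
<dict>
    <key>Track ID</key><integer>101</integer>
    <key>Name</key><string>Blue in Green</string>
    <key>Artist</key><string>Miles Davis</string>
</dict>
"""


def plist_dict_to_python(dict_elem):
    """Pair each <key> with the sibling element that follows it (hypothetical helper)."""
    children = list(dict_elem)
    result = {}
    # Meaning depends on position: even slots hold keys, odd slots hold values.
    for key_elem, value_elem in zip(children[0::2], children[1::2]):
        if key_elem.tag != "key":
            raise ValueError(f"expected <key>, found <{key_elem.tag}>")
        text = value_elem.text or ""
        # The element name only tells us the data type, not what the value means.
        result[key_elem.text] = int(text) if value_elem.tag == "integer" else text
    return result


track = plist_dict_to_python(ET.fromstring(PLIST_FRAGMENT))
print(track["Artist"])
```

Notice that the reader has to rebuild the original in-memory dictionary before it can ask any question about a track. The element names (dict, key, string, integer) describe the producing program's data types rather than the music library itself.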

Besides iTunes, this cropped up again about a month ago when I was examining a Microsoft dump of system information on Windows Server 2003. I looked for that format on the web but was unable to locate it. If you have a link to it, please post a comment with it. That system information format fell into the same trap of simply dumping a data structure to XML. The problem I have with this is that it shifts the semantics out of the element, which is the more stable part of a document, and into the element's value, which is the more variable part. An element can have a unique ID. An element can be described via one or more schema types. An element can be reliably referenced externally. A value offers none of these. Thus, the semantics belong in the elements and not in the element (or attribute) values.
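Here is a rough side-by-side sketch of that difference, again in Python. Both fragments are made up for illustration: the "designed" vocabulary (library, track, name, artist) is a hypothetical design of mine, and the "dumped" fragment is a simplified plist-style layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical designed vocabulary: element names carry the meaning.
DESIGNED = """
<library>
    <track id="t101">
        <name>Blue in Green</name>
        <artist>Miles Davis</artist>
    </track>
</library>
"""

# Simplified plist-style dump: the meaning lives in the text of <key>.
DUMPED = """
<dict>
    <key>Name</key><string>Blue in Green</string>
    <key>Artist</key><string>Miles Davis</string>
</dict>
"""

designed_root = ET.fromstring(DESIGNED)
# The element name and its id attribute can be addressed directly with a path.
designed_artist = designed_root.find("./track[@id='t101']/artist").text

dumped_root = ET.fromstring(DUMPED)
children = list(dumped_root)
# Here we must compare text content, then step to the next sibling for the value.
dumped_artist = next(
    children[i + 1].text
    for i, child in enumerate(children[:-1])
    if child.tag == "key" and child.text == "Artist"
)

print(designed_artist, dumped_artist)
```

Both lookups return the same string, but only the first one names something a schema could describe or an external document could point at. In the second, "Artist" is just character data that happens to sit in front of the value.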

Do you agree?

Do you have other examples of data structures dumped to XML?

As a related reference, this xml.com article calls this a “dynamic document”. However, since I am arguing here that an XML document is an expression of design, I would say that is, at best, a misnomer. To me, dumping data structures to XML seems to be a case of “interoperability lip service”. So, do you design XML documents or dump XML data? Either way, why?

Until next time, see you in the trenches. - Mike