The recent working draft of XML Schemas 1.1 (Structures) seems sensible, small-scoped and solid to me. I hope vendors will hop on board and implement it when it comes out.

The drivers I see for XSD 1.1 include:

  • The XQuery/XPath2/XSLT2 effort came up with a tweaked version of the XSD type system, so XSD 1.1 is a response to that. Fair enough. (The draft also adopts the notion of subsumption in reformulating the rules on restriction, which comes from XQuery people.)
  • Catching up (like a hippo racing) a little with the initial technologies proved by the ISO DSDL effort: they have added an assert element like Schematron’s for check clauses on complex types; they have updated the ambiguity rules to do with wildcards and opened up the cardinality rules on xs:all elements (changes mooted in Michael Sperberg-McQueen’s Applications of Brzozowski derivatives to XML Schema processing). Behind this is an acceptance that DFAs are not the only possible implementation formalism or technology; the bibliography adds papers (by XML Schema Working Group members) on derivatives (as used by James Clark in Jing) and forest/tree-regular languages (as advocated by Murata-san).
  • More promising for the future, they have layered the grammar into one language that allows ambiguity and another that has the ambiguity constraints: it probably doesn’t alter anything yet, but it does provide an on-ramp for reconciliation with RELAX NG; in the RELAX NG community, checks for ambiguity are (if data binding needs them) a subsequent layer on the grammar rather than being built in.
  • Developers have needed to explain their design choices better: so the draft has quite a bit more material detailing the different ways that applications may apply or implement XML Schemas. Different infosets (PSVI) are defined, and different conformance levels and different invocation strategies are given.
  • And, of course, there are some things due to housekeeping: some missing properties of the PSVI, some better explanations, some wordsmithing due to adopting “is” instead of “must be”.

Of course, I am most interested in the new assert element. It is based on the assert element from my Schematron schema language; Eddie Robertsson created some XSLT stylesheets for embedding assertions in XML Schemas, and it has proved quite popular and useful. And certainly the ability to constrain types rather than names is useful for XML Schemas. They have done the right thing by defining a larger version of XPath that can be used, though the draft seems quite fuzzy about whether to use XPath 1 or XPath 2: I cannot imagine that will not get sorted out, though.
As with key/keyref and uniqueness, I think their assertions could be translated into Schematron readily enough.
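
To make the translation idea concrete, here is a minimal sketch in Python (standard library only) of how an XSD assertion on a complex type might be turned into a Schematron rule. Everything in it is my own illustration rather than anything from the draft: the RangeType schema, the function name xsd_asserts_to_schematron, and the generated message wording are all hypothetical, and I assume the assert sits directly inside a named complex type. The sketch only handles top-level element declarations with unprefixed type names, and ignores namespaces, local declarations and derived types. Because Schematron rules are keyed on names and paths rather than types, the translator has to look up which elements use each type.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
SCH = "http://purl.oclc.org/dsdl/schematron"  # ISO Schematron namespace

# Hypothetical schema: the placement of xs:assert directly inside the
# complex type is an assumption made for illustration.
XSD_SOURCE = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="range" type="RangeType"/>
  <xs:complexType name="RangeType">
    <xs:attribute name="min" type="xs:integer"/>
    <xs:attribute name="max" type="xs:integer"/>
    <xs:assert test="@max >= @min"/>
  </xs:complexType>
</xs:schema>
"""


def xsd_asserts_to_schematron(xsd_text):
    """Build a Schematron schema element from xs:assert elements.

    Simplification: only top-level elements that reference a named
    complex type by an unprefixed name are translated.
    """
    schema = ET.fromstring(xsd_text)

    # Collect the assertion tests attached to each named complex type.
    tests_by_type = {}
    for ctype in schema.findall(f"{{{XS}}}complexType"):
        tests = [a.get("test") for a in ctype.findall(f"{{{XS}}}assert")]
        if tests:
            tests_by_type[ctype.get("name")] = tests

    sch = ET.Element(f"{{{SCH}}}schema")
    pattern = ET.SubElement(sch, f"{{{SCH}}}pattern")

    # Schematron rules match on names, so emit one rule per element
    # declaration whose type carries assertions.
    for decl in schema.findall(f"{{{XS}}}element"):
        name = decl.get("name")
        tests = tests_by_type.get(decl.get("type"))
        if not tests:
            continue
        rule = ET.SubElement(pattern, f"{{{SCH}}}rule", context=name)
        for test in tests:
            assertion = ET.SubElement(rule, f"{{{SCH}}}assert", test=test)
            # The XSD assert has no message, so a placeholder is
            # synthesized; a real schema author would write this by hand.
            assertion.text = f"A {name} element should satisfy: {test}"
    return sch


if __name__ == "__main__":
    ET.register_namespace("sch", SCH)
    result = xsd_asserts_to_schematron(XSD_SOURCE)
    print(ET.tostring(result, encoding="unicode"))
```

Note that the one thing the translator cannot recover from the XSD is the message text, so it has to invent a placeholder.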

I should emphasize that their assertions are not Schematron assertions (despite using the same element and attribute names and concept) because in Schematron the natural language statement is the crux of the schema: in XSD assertions, while you can add documentation, there is no similar emphasis on providing a framework to deliver human-comprehensible messages in terms of the problem domain to users (programmers, end users). I judge validation technologies on the extent to which they can provide explicit statements of policy and specific diagnostics to humans: I think grammars all fall over in this regard because they encourage you to elide the whys (why is order required in a certain case?) and consequently fight against efforts to align operational constraints with business requirements.

I would have preferred for this draft to allow attributes in content models, following RELAX NG. However, they have thrown a bone to people who need ISO DSDL features but are stuck with XML Schemas and have gone down the path-based validation route more, which are both signs of progress, I think. So all in all, well done XML Schema WG!