A speech from Jon Bosak is always scintillating. This year he gave the closing keynote at the XML 2006 conference, to commemorate the 10th anniversary of Jon first presenting XML at the same conference. The first half of the talk looks back at the development and history of XML; the second half fills in the gap concerning recent developments of UBL and the current state of play. Most heartening for me is that the emotional nub of the second part of Jon’s talk is actually a challenge to, or chastisement of, software vendors for not supporting Schematron.
Jon seems to feel his experience with Schematron is emblematic of a broader unresponsiveness by software vendors: they are interested in getting ROI on their platform, not in enabling users to solve their problems, and they are not listening to users.
The thing that probably impresses me most about Jon Bosak is the amount of buy-in he was able to generate for XML. He herded more than one hundred fractious and opinionated developers, corralled us and channeled our energy so that, at the end of the day, not only were we happy with XML but most of us thought of it somehow as being “our baby”. Jon moved from Novell to Sun, from ISO to W3C to OASIS, and from SGML to XML to the UBL (Universal Business Language) initiative. UBL seems just as difficult an arena, if not more so, but if anyone has the people skills to bring it together, I think Jon does; Jon seems to work by allowing people their own space and opinions even though they may differ from his own. His work comes from the “increase the size of the cake” school rather than the “let’s only support what we need” school.
Now, of course, to some extent Jon’s speech follows one strong stream of thought in Sun: that large systems are better built out of standard, simple components using a DIY approach, rather than by a platform. I’d say that Tim Bray’s anti-WS-* comments and James Clark’s anti-XSD comments come out of much the same mindset. Indeed, XML came out of this mindset. (And DIY fits in with Sun’s business, of course, so it is not surprising that someone with this view works there. Remember the early slogan “XML gives Java something to do”, which is now being replaced by “XML gives JavaScript something to do”, which may in turn be replaced by “JSON gives ECMAScript something to do”.)
Another blogger on Jon’s speech is Microsoft’s Mike Champion. Mike is a guy whose opinions I respect enough to disagree with. When Jon talks about software vendors, it is hard not to think that Mike is one of the people Jon would be addressing. Now Mike is hardly a rabid complicator: his blog post “XML Schema is the root of WS-Evil?” agrees with some of Dare Obasanjo’s criticisms of XSD and WS-*. There is just as much diversity of opinion at Microsoft as at Sun, it seems.
Anyway, Mike responds to Jon’s criticism of vendors’ non-agility and lack of responsiveness to user requirements, which Jon put as “Furthermore (I was told), business users will never adopt a solution that depends on an additional XSLT pass because it would require them to learn something new (never mind that this “new” thing had been widely employed in other contexts for years).” Mike’s spin is “Those vendors may know their customers, and suspect that they really don’t want to learn yet another XML technology just because it would offer an elegant solution to a somewhat peripheral problem.”
What is interesting about Mike’s comments is that they are not at all a criticism of Schematron (in fact, Mike has said nice things about it several times; Dare Obasanjo even wrote a good article on it for MSDN). They are a comment on XSD: that it is so bloated and enervating that users are at their breaking point, or resistance point.
This was a thought that I mentioned to many people in the early days of Schematron. It has always been clear to me that, apart from pioneers, desperados, hackers and custom system integrators (of the style that Jon Bosak mentions), XSD would suck the mental oxygen out of the air for several years. Both in a positive way, as people explore the new possibilities of XSD, and in a negative way, as they pathetically hope against hope to find ways to express constraints that are not ultimately related to database storage issues.
But the air is clearing now, it seems to me. I am regularly hearing of larger projects successfully using Schematron. Its use in the Lloyd’s markets is a good example, but I am also hearing of government pilot projects. I don’t consider Schematron to be a killer app; instead I think it is one of those technologies that will gradually become just part of the furniture. The reasons? It is powerful, easy to implement using ubiquitous technology, standard, without rivals, and most importantly, human-centered. I think it fits in really well with that kind of XML approach. One big thing that Schematron has going for it is that it is terribly platform-neutral; you can use it with XSD systems or RELAX NG or DTDs or even with no schemas; you can use it for message gateways, for forms validation, for business rules checking, as well as for document validation. Just this week, I’ve posted my pre-beta implementation of ISO Schematron to the schematron-love-in mailing list, in order to get feedback from developers.
Jon is a can-do pragmatist: “I’m still the kind of person that starts looking for an open-source solution if the commercial vendors don’t have the wit or the nerve to provide the capabilities I’m looking for.”
But more than that, he understands the human-centred vision behind XML (and Schematron): “We’ve wandered off into the weeds of commercialization and forgotten that the web we’ve got is the most primitive form of hypertext that could be imagined, which is why it works, and I don’t want to deny that. But this focus on the money to be made right at the start has led us into an explosion of XML applications that focus purely on the exchange of data between computer systems. We’ve lost track of the human aspect of this to the point where even an organization whose very purpose is the advancement of XML considers it unsuitable for human consumption and requires its specifications to be issued in forms tied to the printed page.”


I think Schematron has its place in the world, but the amount of power that it has by using XPath is also one of its barriers to adoption. At the B2B standards specification level, the vast majority of people working on these specifications are business analysts. The typical technical person who would know and understand XPath does not usually participate at the level where the data specifications are being defined. Schematron is great as an add-on to either RelaxNG or XSD, but as a standalone do-it-all tool, I don't think it fits every situation. If you can get Schematron to be transformed and created out of a data model, i.e. UML, then it has a better opportunity to be implemented more widely.
Schematron fits great as an add-on for describing and validating business rules or external code lists. The same business folks that cringe at XSD and RelaxNG are just as confused by Schematron. So in many ways it still suffers from the same human-centered difficulties XSD and Relax suffer.
Hi David. Yes, you are right that it would be good if vendors implemented UML-to-Schematron and ER-to-Schematron converters; I think it is a matter of time, as Schematron becomes part of the furniture.
I don't have a killer app mentality about Schematron; it is the kind of thing that will be slow and steady and grassroots-driven rather than explosive: for example, I think some organizations won't adopt it until governance requirements and internal quality strictures force IT people to put in business rules validation.
But there are a lot of potential Schematron products and systems that have not been explored yet. I am testing an open source Schematron implementation for Ant this week, for example; it makes batch validation much easier for developers using Ant. The more that this kind of basic infrastructure has Schematron bindings, the more viable Schematron becomes as a solution that can seep up to the non-technical level.
I have already shown in a previous blog how ER diagrams can be converted into Schematron. So UML class diagrams should be possible along the same lines. (I guess it would be possible to make a version of Schematron with OCL as the query language binding for assertion tests, but it doesn't sound appealing to me.)
Interestingly, because ER and UML class diagrams frequently don't force a particular element order and don't have the same element appearing in multiple contexts with different occurrence constraints within a content model, generation of Schematron-style constraints should be much more straightforward than generating grammars.
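To give a feel for why this kind of generation is straightforward, here is a minimal sketch in Python. It is not the converter from my earlier blog; the `MODEL` dictionary, the `order` entity and its fields, and the function names are all hypothetical, made up for illustration. It assumes each entity maps to an element of the same name and each field to a child element, and it only generates occurrence constraints. Only the ISO Schematron namespace and the `schema`/`pattern`/`rule`/`assert` element names come from the standard.

```python
import xml.etree.ElementTree as ET

SCH_NS = "http://purl.oclc.org/dsdl/schematron"  # ISO Schematron namespace

# Hypothetical ER-style model: entity -> {field: (min_occurs, max_occurs)}.
# max_occurs of None means "unbounded".
MODEL = {
    "order": {"customer": (1, 1), "item": (1, None), "note": (0, 1)},
}


def occurrence_asserts(field, lo, hi):
    """Yield (xpath_test, message_fragment) pairs for one field's cardinality."""
    if lo == hi:
        yield f"count({field}) = {lo}", f"must have exactly {lo} {field} element(s)"
        return
    if lo > 0:
        yield f"count({field}) >= {lo}", f"must have at least {lo} {field} element(s)"
    if hi is not None:
        yield f"count({field}) <= {hi}", f"must have at most {hi} {field} element(s)"


def model_to_schematron(model):
    """Build a Schematron schema element with one pattern and rule per entity."""
    ET.register_namespace("sch", SCH_NS)
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    for entity, fields in model.items():
        pattern = ET.SubElement(
            schema, f"{{{SCH_NS}}}pattern", {"id": f"{entity}-occurrence"}
        )
        rule = ET.SubElement(pattern, f"{{{SCH_NS}}}rule", {"context": entity})
        for field, (lo, hi) in fields.items():
            for test, message in occurrence_asserts(field, lo, hi):
                assertion = ET.SubElement(rule, f"{{{SCH_NS}}}assert", {"test": test})
                # The assertion text is the natural-language statement of the rule.
                assertion.text = f"Each {entity} {message}."
    return schema


if __name__ == "__main__":
    print(ET.tostring(model_to_schematron(MODEL), encoding="unicode"))
```

Note that nothing here depends on element order or on content-model context, which is the point: each cardinality in the model becomes an independent assertion.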
As for resistance to XPath being a blocker for Schematron (especially for business rules and codes), the distinctive thing about Schematron is that requirements analysts can operate entirely in terms of natural language statements. I don't believe that business folk don't speak natural language. Techies can add the assertion tests later, just as currently they would write in some non-standard/long-winded format.
I think having that code generation ability from a meta-model or data model to Schematron would help greatly. It's the right tool for specific jobs, particularly business rules. However, I wouldn't necessarily want to code by hand the assertions needed to enforce the data schematic constraints on something as robust as the OAGI 9.0 repository or even the UBL 2.0 repositories. Some of the XPaths that could be involved would be quite ugly. Now using it to enforce business rules is a much smaller subset as compared to the large data schematic these repositories have. These rules are typically coded by the techies who are more than likely to be familiar with XPath, but even this group may or may not be as familiar with it as one would think. It amazes me how many programmers that deal with XML don't know XSLT or XPath. They data bind everything to a programming language like Java or C#.
Many of the people designing the B2B standards are not the techies, and most large B2B organizations don't have techies working on the Schema data model. The people doing the work are using a program of some sort to go from the Platform Independent Model to the Platform Specific Model. Most of them don't know or even want to know what the specific code is generating, they just want to push a button and it generates. That is the level that Schematron or any language needs to get to. It is what XSD is starting to get to for the base languages. Just take a look at hyperModel for a good example of UML to XSD code generation and XSD to UML.
I'm of the feeling that even if a tool generates the code, you should at least understand what it is it generates. :)
Yes, techies who are writing business rules using Java and C# will probably rejoice at how much simpler using XPaths can be in many cases, and how much simpler using Schematron is once you have more than a couple of constraints. The current generation of programmers has grown up on CSS selectors and XSLT1 and XSD's XPath subset, with JUnit assertions and C/Java assert(); the technical infrastructure is seeded with people who grok Schematron fast.
But when you say "I think having that code generation ability from a meta-model or data model to schematron would help greatly" then the issue is not with Schematron but with people who make code generation systems: vendors in particular. Hence Jon's call.
I certainly agree that there is lots of scope for tools to present user interaction in terms of the metamodel, and hide Schematron; Schematron even has a feature called abstract patterns to parameterize XPaths so that you can separate modeling constraints from implementation details, too.
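For readers who haven't met abstract patterns, the idea can be sketched roughly as follows. This is an illustration in Python of the general mechanism (a pattern written with `$name` parameter references, instantiated by substituting concrete values), not the ISO Schematron algorithm itself. The dictionary layout, the `required-child` pattern and the `substitute` and `instantiate` functions are hypothetical, and the sketch substitutes only in the `context` and `test` strings.

```python
import re

# Hypothetical abstract pattern: the modeling constraint ("a parent must
# contain a child") is stated once, with $parent and $child as parameters.
ABSTRACT_PATTERN = {
    "id": "required-child",
    "rules": [
        {
            "context": "$parent",
            "asserts": [
                {
                    "test": "count($child) >= 1",
                    "message": "The parent element must contain the required child.",
                }
            ],
        }
    ],
}

_PARAM_REF = re.compile(r"\$(\w+)")


def substitute(text, params):
    """Replace each $name with its parameter value; leave unknown names untouched."""
    return _PARAM_REF.sub(lambda m: params.get(m.group(1), m.group(0)), text)


def instantiate(abstract, params):
    """Return a concrete pattern with parameter references filled in."""
    return {
        "is-a": abstract["id"],
        "rules": [
            {
                "context": substitute(rule["context"], params),
                "asserts": [
                    {"test": substitute(a["test"], params), "message": a["message"]}
                    for a in rule["asserts"]
                ],
            }
            for rule in abstract["rules"]
        ],
    }


if __name__ == "__main__":
    # The same modeling constraint bound to two different implementations.
    for params in (
        {"parent": "order", "child": "customer"},
        {"parent": "invoice", "child": "total"},
    ):
        concrete = instantiate(ABSTRACT_PATTERN, params)
        for rule in concrete["rules"]:
            for a in rule["asserts"]:
                print(rule["context"], "->", a["test"])
```

A tool could let the analyst pick the abstract pattern and supply the parameters from the metamodel, so the XPaths never have to be shown to them.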
Jeff,
Where can I find that Schematron implementation for Ant you're testing?
Who is Jeff?
Christophe Lauret at Allette Systems is writing it. I am testing it right now, actually. I think he plans to release it as open source next week, and it would be great if someone would take on the task of progressing it through Apache to become a standard part of Ant. If we can get Schematron into a few of the major open source platforms, such as Ant and libxslt, it really would help ease of adoption.
Rick,
Sorry for the name-calling. By the way that Schematron thing is a no nonsense, kick ass way of doing business rules validation. Thank you for inventing it.
Thanks.
I get called Jeff all the time, for example this week in a comment on another post. What surprises me is when people call me Jeff who haven't seen or heard my surname... I must look like a real Jeff. My uncle, Uncle Dick (aka Prof. Derrick Jelliffe of UCLA School of Tropical Medicine), always said it could be worse.