I’ve always been fascinated by dictionaries. We create dictionaries one word at a time, attempting to nail down within a printable tome all of the words that form a language, and for many people such dictionaries represent something immutable, a stamp of authenticity that states that this is the proper way to represent legitimate forms of spelling, legitimate statements of meaning, and in many cases some form of an audit trail that attempts to provide the history of the words.
For all that, dictionaries are fundamentally arbitrary. Someone has to make an editorial decision as to whether a given word is in fact a legitimate word within the language. Is sombrero an English word, or a Spanish one? Is degauss, to remove a magnetic field, a term of technical jargon or a formal member of the language? Is it theater or theatre? My teenage daughter talks about her and her friends being “random”, by which she means that they’re not easily categorizable by modern marketing efforts. Random, to me, is a stochastic term that means that there is no readily discernible pattern dictating the results of a given function. While there is some commonality to both definitions, they are not the same. Is my daughter’s definition of random valid? Is mine?
As I become more heavily involved with ontology and computational semantics, I find this distinction between the notion of a standard as being authoritative vs. the arbitrary nature of developing such standards both troubling and insightful. A standard, if it is well designed, is not something etched in stone; rather, it represents (in the best case) the ideal balance between internal integrity and the necessity of capturing complex (and often intangible) ideas in a world where the basis of authority is often remarkably flimsy.
Several years ago, when I was staying with my uncle in Washington, DC after graduating from college, I went by the Library of Congress (something that I did quite often that year) to check out some of the secondary libraries that are part of the library complex. Sitting at the computer catalog in the music and film library, close enough to the front desk to hear the conversations, I saw a very genteel lady come up to the librarian at the front desk.
“Excuse me,” she said, in a clipped British accent.
“Yes?” the librarian responded. “Can I help you?”
“Please. I’m trying to find the origin of the term ‘chorale’.”
“I can help you with that. Let me check the Oxford English Dictionary.”
“Oh, dear. You see that’s the problem, isn’t it?”
“Huh?”
“Well, I’m from the Oxford English Dictionary …”
Who do the authorities go to when seeking meaning?
Ontologies are tricky things. The first part, defining the terminology inherent within an ontology, places a significant burden upon the professional ontologist (or even the well meaning amateur), because such ontologies must perforce describe conventions, rather than just terms. You are creating the terms of discussion, the set of primitives that make up the relevant language, whether that language is describing a workflow or, well, a chorale.
Yet beyond this initial stage comes the more complex process of defining the relationships between the various tokens or terms of the ontology. Alfred North Whitehead and Bertrand Russell attempted to do this with the intrinsic predicates of logic, in the years before Kurt Gödel dashed their hopes of ever completely defining mathematics irrevocably. No ontology can ever be complete, for there will always be predicates that exist which cannot in fact be stated within that ontology conclusively. This is the curse of the ontologist, the realization that there are in fact intrinsic limits to the ability to define any language.
We, programmers and computational linguists, are the intellectual descendants of these three prodigies, tasked with the process of using language to drive computers and build models. Yet all too often I wonder whether we have, individually or collectively, lost track of the ultimate futility of what we are attempting to do, and of the Heisenbergian uncertainty associated with the process of “standardizing” the languages, human or computational, that all of our efforts are subject to.
Lately, one of the more interesting, and heretical, notions that I have been playing with is the idea that schemas not only can’t be considered to be canonical, but that thinking of schemas as being canonical can prove to be extraordinarily crippling to building complex applications. One of the things that XML does, more than any other language, is to provide a clear-cut vehicle for separating the schema from the schema instance. In most computer languages, schema (also known as type, in this context) is inextricably woven together with the instance - the type acts as a formal template, and changing that type (through inheritance or other similar mechanisms) will also change the instance in a one-to-one relationship.
Yet with XML, the schema is not in fact intrinsically bound to the instance. For most people, this likely doesn’t make a big difference, but I think in fact that this decoupling of schema and instance is perhaps one of the most profound and important aspects of XML, because it recognizes that even definitions can be malleable.
Consider one of the more troubling aspects of form design - the concept of dynamic enumerations. Suppose that you have a particular field in your XML-based invoice system that is meant to contain the access code for one branch of a company (say, Starbucks). Now, in any given city there are likely dozens of Starbucks, and the number seems to be increasing daily. This is information that you cannot in fact encode within a normal XSD schema. You can describe the fact that a given store access code may have a particular pattern or you can create a pointer to a web service, but that pointer still requires that there be some explicit “defining” done beyond that which can be encoded in schema.
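To make the split concrete, here is a minimal Python sketch of the two halves of that check. The code format (an "SB-" prefix and five digits) and the set of open stores are both invented for illustration; the set stands in for whatever web service or database would actually supply the live list.

```python
import re

# Static rule: the lexical shape of a store code, the kind of thing a
# schema pattern can state. The "SB-" prefix and five digits are made up.
STORE_CODE_PATTERN = re.compile(r"SB-\d{5}")


def matches_pattern(code: str) -> bool:
    """Check the part a static schema can express: the code's form."""
    return STORE_CODE_PATTERN.fullmatch(code) is not None


def is_valid_store_code(code: str, current_codes: set[str]) -> bool:
    """Check both the static pattern and membership in the live list.

    current_codes is a stand-in for an external source (a web service,
    a database query) that knows which stores exist right now.
    """
    return matches_pattern(code) and code in current_codes


# Hypothetical snapshot of the live list; in practice it changes daily.
open_stores = {"SB-00017", "SB-00452"}

print(is_valid_store_code("SB-00017", open_stores))  # well formed and listed
print(is_valid_store_code("SB-99999", open_stores))  # well formed, not listed
print(is_valid_store_code("17", open_stores))        # fails the pattern itself
```

The pattern can live in a schema forever; the set cannot, which is exactly the part that needs some "defining" outside the schema.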
This is one of the difficulties inherent in model driven architecture (MDA). It is rather fiendishly difficult to describe that model completely, because part of the nature of that model is that it may be both dynamic (dependent upon information that may be transitory in nature) and malleable (even static parts of the model may have relevancies that are difficult to encode within any one given schematic language). Up until now, this really hasn’t been that big of an issue - relational database models were considerably more static and self-constrained in terms of their schematic descriptions, and as such the types of MDAs possible from a relational database system were fairly limited.
However, with XSLT you introduce the notion of transformability - you can dynamically create schemas that in turn describe not only models but also possess the capability of displaying those models within a given context (such as XForms, XHTML, or SVG) that allow for the modification of the appropriate instances. This raises some tantalizing prospects - if you can generate your interfaces automatically from schemas, you can eliminate a lot of very expensive programming … and you can also handle the depiction of remarkably complex forms that can handle a great deal of interdependency.
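As a rough illustration of generating an interface from a schema, the sketch below walks a tiny schema and emits one labelled input per typed element. It uses Python's standard library rather than XSLT purely to keep the example short; the invoice schema, the `form_from_schema` name, and the mapping from schema types to input types are all assumptions of mine, not part of any particular toolkit.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# A deliberately tiny, made-up schema for an invoice.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="invoice">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="storeCode" type="xs:string"/>
        <xs:element name="amount" type="xs:decimal"/>
        <xs:element name="issued" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Assumed mapping from schema types to HTML input types.
INPUT_TYPES = {"xs:decimal": "number", "xs:date": "date"}


def form_from_schema(schema_text: str) -> str:
    """Emit one labelled input per typed element declared in the schema."""
    root = ET.fromstring(schema_text)
    form = ET.Element("form")
    for decl in root.iter(XS + "element"):
        xsd_type = decl.get("type")
        if xsd_type is None:
            continue  # a container element has no input of its own
        label = ET.SubElement(form, "label")
        label.text = decl.get("name")
        ET.SubElement(label, "input", {
            "name": decl.get("name"),
            "type": INPUT_TYPES.get(xsd_type, "text"),
        })
    return ET.tostring(form, encoding="unicode")


print(form_from_schema(SCHEMA))
```

Nothing in `form_from_schema` mentions invoices; hand it a different schema and it produces a different form.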
I recently had a comment on a different blog post that the forms generated in this manner are visually uninteresting. While I find some validity in the criticism, it is perhaps not so much a reflection of the process (or concept) as a consequence of the fact that our understanding of what is meant by schema tends to be limited to static schemas. In point of fact, you can create extraordinarily interesting (and powerful) forms by recognizing that three additional points must apply:
- An XML schema (in the most generic sense) is an attempt to model or describe an XML structure. So long as the schema does not in fact violate the validity of the schematic instance, it is a legitimate mechanism for providing one facet of a definition. I call this the Principle of Instance Dominance, because in this case what is most important in the XML world is not the schema, but the schema instance, in direct contravention to the way programming is usually done.
- By dint of the first point, any schema that provides a consistent definition of a given instance is as valid as any other schema, to the extent that a given instance may in fact have multiple (potentially an infinite number of) different schema types. This rather extraordinary statement means that schemas are a lot like bosons - you can stack multiple bosons in the same space without violating the laws of physics. I call this the Principle of Schematic Transience.
- Finally, any given schema system implies that there is a mechanism for processing that particular schematic language that may similarly exist in parallel with other such mechanisms, and so long as the schematic languages used to validate the instances do not violate the validity of the instance within the context of other schemas on that instance, such processing is valid and legitimate. I call this the Principle of Validation Independence.
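The three principles above can be sketched in a few lines of Python. Each "schema" here is just a function that inspects the same instance and reports on one facet of it; the invoice document, the function names, and the three facets are hypothetical, chosen only to show independent descriptions coexisting on one instance.

```python
import xml.etree.ElementTree as ET

# The instance comes first; every schema below merely describes it.
INSTANCE = ET.fromstring(
    "<invoice><storeCode>SB-00017</storeCode><amount>12.50</amount></invoice>"
)


def structural_schema(doc):
    """One facet: the elements that must be present."""
    return doc.find("storeCode") is not None and doc.find("amount") is not None


def business_schema(doc):
    """Another facet: a rule about values, indifferent to the first check."""
    amount = doc.findtext("amount")
    return amount is not None and float(amount) > 0


def presentation_schema(doc):
    """A third facet: every child is a leaf, so a flat form can show it."""
    return all(len(child) == 0 for child in doc)


SCHEMAS = {
    "structure": structural_schema,
    "business": business_schema,
    "presentation": presentation_schema,
}


def validate(doc, schemas):
    """Run every schema independently against the same instance."""
    return {name: check(doc) for name, check in schemas.items()}


print(validate(INSTANCE, SCHEMAS))
```

No schema in the dictionary knows the others exist, and adding a fourth changes nothing about the first three.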
Similarly, in creating MDA solutions I find that resolving potentially ambiguous situations (such as identifying a containing element that holds a collection of disparate containers and elements as a “tabset”) can radically simplify the process of letting the content itself drive the presentation, and the views so created are considerably more interesting (and can be done in far less time) than the equivalent architectures for hand-derived forms.
Of course this also raises the rather worrisome prospect that so much more work will be needed to specify the additional cloud of quasi-schemas (and their processors) that the resulting effort will far exceed designing the same piece manually. There is a certain degree of legitimacy in such concerns - the amount of effort involved in building such “generators” is not insignificant, and requires some fairly sophisticated XSLT skills. However, in most cases, what becomes critical in building such systems is to recognize that you have both semantic-bound and semantic-less processors at work, and the more you can move toward the latter, the greater the degree of reuse of your code.
A semantic-bound processor is one that needs to know something about the semantics inherent in the instance in order to perform some actions. A Schematron “processor” is semantically bound - you are generally looking for specific named nodes in order to perform either an assertion or a report on the given object. However, the generator for a Schematron is semantic-less (relative to the initial instance); it does not in fact need to know anything about the semantics of the instance, only the semantics of Schematron itself. Similarly, a language that describes such things as the fact that certain elements should be treated as bags while others are either records or collections of records is semantically bound, but the generator for processing these entities should be semantic-less - the generator should not know anything about the semantics of the instance.
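Here is a small sketch of that distinction, again in Python rather than XSLT for brevity. The generator knows only how Schematron arranges its schema, pattern, rule, and assert elements; everything about invoices arrives as data. The `generate_schematron` name and the (context, test, message) triples are my own illustrative convention, and the namespace used is the ISO Schematron one.

```python
import xml.etree.ElementTree as ET

SCH_NS = "http://purl.oclc.org/dsdl/schematron"
ET.register_namespace("sch", SCH_NS)


def generate_schematron(rules):
    """Build a Schematron document from (context, test, message) triples.

    The generator never interprets the strings it is given; it only
    knows the shape of a Schematron schema.
    """
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    pattern = ET.SubElement(schema, f"{{{SCH_NS}}}pattern")
    for context, test, message in rules:
        rule = ET.SubElement(pattern, f"{{{SCH_NS}}}rule", {"context": context})
        check = ET.SubElement(rule, f"{{{SCH_NS}}}assert", {"test": test})
        check.text = message
    return ET.tostring(schema, encoding="unicode")


# The domain knowledge lives in data, not in the generator.
invoice_rules = [
    ("invoice", "storeCode", "An invoice must name a store."),
    ("invoice/amount", ". > 0", "The amount must be positive."),
]

print(generate_schematron(invoice_rules))
```

The output is semantically bound (it names `invoice` and `amount`), but the function that produced it would serve a library catalog just as well.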
The beauty of such semantic-less processing is that the tools for building the resultant generators will perforce be applicable to a large number of potential instances, regardless of their underlying semantics. This is the real power of such model driven processing … by working with information at a level where the semantics is irrelevant (by placing as much of the semantics into manipulatable schemas as you can) you are able to create very sophisticated applications regardless of the domain involved without having to become enmeshed in the minutiae of scripting - not that the scripting isn’t there, but the scripting acts in an encapsulated fashion as a binding to provide behavior to elements in the declarative module, and so doesn’t need to be exposed to the integrating developer.
Thus, by seeing a schema as a dynamic document, one constrained by the instances it describes but that is nonetheless quite malleable even given that, model driven architecture becomes viable. There is one last aspect of this that’s worth exploring - visual presentation. A CSS document is a form of schema, though this point is typically lost because people tend to view CSS as ultimately static and typically applied at the very end of the presentation pipeline.
However, CSS can also be generated from XSLT - and in many ways should be generated from XSLT in model driven applications. You can create reasonably sophisticated applications just knowing the general shape of the various elements in the schema (bag, collection, record, rich text, external list, and so forth). Yet if you assume the two stage transformation where you process the schema at the semantic-less level in order to generate an intermediate transformation that in turn is able to handle customization at the semantic-bound level, possibly with XForms or other bound-node technology, you can handle the relevant CSS generation where it differs from stock by treating it as a CSS schema at the second level of transformation, likely from some XML “style-page” source.
While this may seem like a lot of work to build this customization (such as placing form elements in a very tightly constrained grid that looks like a specific paper form) the important thing to remember here is that you would have done this work anyway. A well designed generative system should provide the relevant hooks for you to provide your own customizations, but the customizations, since they require human judgment and human aesthetics, still need to be done by a human being. By creating XML-based CSS “schemas”, you also gain the advantage of being able to automatically create visual interfaces for automating the task of creating the customization, as well as creating a mechanism for documenting the semantics so that a CSS programmer won’t have to try to guess what exactly “.thisDiv” meant in a five thousand line CSS document.
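A toy version of such an XML-based CSS “schema” might look like the following. The `stylePage`, `style`, and `prop` element names and the `doc` attribute are invented for this sketch, and the transformation is written in Python rather than XSLT only to keep it compact; the point is that the documentation travels with the rule it describes.

```python
import xml.etree.ElementTree as ET

# A made-up "style-page" vocabulary: each <style> targets one selector,
# carries its own documentation, and holds one <prop> per declaration.
STYLE_PAGE = """
<stylePage>
  <style selector=".record" doc="One row of the invoice form">
    <prop name="display">grid</prop>
    <prop name="grid-template-columns">10em 1fr</prop>
  </style>
  <style selector=".bag" doc="Unordered group of fields">
    <prop name="border">1px solid gray</prop>
  </style>
</stylePage>
"""


def css_from_style_page(xml_text):
    """Turn the XML style page into CSS text, keeping the docs as comments."""
    page = ET.fromstring(xml_text)
    blocks = []
    for style in page.findall("style"):
        lines = []
        doc = style.get("doc")
        if doc:
            lines.append(f"/* {doc} */")
        lines.append(f"{style.get('selector')} {{")
        for prop in style.findall("prop"):
            lines.append(f"  {prop.get('name')}: {prop.text.strip()};")
        lines.append("}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


print(css_from_style_page(STYLE_PAGE))
```

Because the source is XML, the same style page could just as easily feed a generated editing form or a documentation page as the stylesheet itself.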
I’m hoping in the next couple of months to publish an open source project that encapsulates these ideas. There are others who are playing in this arena as well (most notably Dan McCreary, a friend of mine and XML expert who has been doing some very nice things with schema driven development), and I find on my XForms.org site that others who are working with XForms are beginning to see the benefits of this approach as well. My hope (and intent) is to move us all past the point of being independent developers striking out in the darkness and pull us together into a cohesive movement, built around the principle that model driven (or more properly, schema driven) design is both viable and necessary, especially for large scale systems. If you find that the words I’ve written here resonate, please contact me.
Kurt Cagle is a writer and software architect specializing in XML-based technologies. He’s also the webmaster of XForms.org (http://www.xforms.org), a community forum focused on XForms and Model Driven Architecture.


I think the downside of ontological brittleness and linguistic absolutism may be greater than you suggest here.
I notice, in watching technical fields become popularized, that a certain ontological perversity can actually close off the expression of an important concept. A technical example came to my attention just recently. For some reason, it has become popular to treat the term "algorithm" as applying to any computational procedure and to have computer codes (heretofore, realizations of algorithms) count as algorithms directly. The problem is that there are some profound results in the foundation of computing that apply to an earlier notion. Now there is complete misunderstanding of those results because they do not apply to the conflated popular sense, and this leads people to argue with the results (e.g., the Church-Turing thesis) when what has occurred is definitional sleight-of-hand.
There are socially more-significant cases under the guise of "framing the discourse" where, for example, "supporting our troops" means keeping them in harm's way.
Coming back to MDA and your practical exercise, one I am keen to see more of, it will be interesting to see how one deals with incommensurate models that are not reconcilable by some mechanical procedure, much as there are expressions and related concepts in different natural languages that do not lend themselves to satisfactory inter-language translations.
Yawn.
This sounds much like the mental meanderings we go through whilst sat on the toilet.
Orcmid,
I don't, in general, see MDAs as being whole solutions, just as I find that XForms is a surprisingly good answer to about 95% of a given problem and very limited in that remaining 5% ... unless you have an extension mechanism that makes it possible to set up additional functionality. There needs to be room for imperative code; the question overall is at what point in the stack the imperative code is introduced. If you can model bindings in a declarative manner, then there are advantages to keeping such bindings as behavior and manipulating the XML as an abstraction before the bindings need to be imposed. The bindings don't disappear - they are a very necessary part of the process, as they impose the notion of componentization - but the idea within MDA is to push such bindings to the periphery of the problem domain - the point where a model gets instantiated as an application.
Your point about algorithms is well taken, however, and one that I have seen as well. Computer code is an implementation of an algorithm, and while it is possible to show from the implementation what the concept behind the algorithm is, a computer program is not (in general) a mathematical proof. However, it can also be argued that there are algorithms which are so complex that they are beyond unaugmented human ability to prove them, with the result that a computer program, particularly one that is generated through computational processes itself (think MathML or Mathematica) may in fact be the only way that the algorithm itself can be, if not defined, then at least described.
As to ontological brittleness - yes, absolutely, I agree with you, and I thought that was what I was trying to say in my original post. Modeling is hard. In many programming applications, the model itself is actually very seldom explicitly stated as such, but rather is much like your algorithm example - it emerges as a fairly amorphous concept that's defined implicitly by the interrelationship of pieces, rather than being explicitly declared. An MDA approach puts the hardest part - development of the model - first, and as such goes against the grain for many programmers. The advantage of MDAs is that once you do have the model (which is generally considerably more than just a formal XML schema, another point I was trying to make) then building the application is trivial and can be nearly completely automated.
MDAs are not a magical panacea, and they generally work best only in those situations that tend to deal with strongly relational data - insurance processing systems, medical systems, process management, workflow situations and the like - systems where the underlying data model can be articulated but may be fluid - such as schemas that may in fact change definitions over time. It's an approach or methodology, and like all such, it is very effective in some domains and less so in others.
The question here is: "are dictionaries prescriptive or descriptive?"
Users of dictionaries think of them as prescriptive (because they want to be understood); authors of dictionaries think of them as descriptive (because they want to reflect usage). I don't think that leads to logical issues.
The web has changed the model completely. Wikipedia is written by the people, not by an old British expert, however wise. That makes it much more descriptive (i.e. representative of common usage); and hence it's also much more prescriptive (i.e. an authoritative reference).
Kurt,
Love your column and tune in regularly. I recently developed a user-interface framework that is (sort of) MDA. The view is derived from what is essentially an annotated custom schema language. The mantra is "declarative" view. Your point that the schema itself is fluid based on a particular instance is well taken. The common approach when you have a metadata language that doesn't exactly fit a spec such as CSS or XML Schema is to map your custom elements or attributes to "named" tokens that in turn map to tokens in a CSS file (e.g. a required field maps to CSS class .required). As options multiply, so do tokens in various languages... I like the approach of having a recognized schema language drive the output of other presentation tags. I wonder how you would approach the problem of embedding custom hooks within standard languages. You mentioned the example of enumerations. In my case, enumerations are more like expanded searches. Results can be presented by category, facet, limited by security role, etc. How would you represent that? In a schema, how do you say something like "given this user and this form data that he's already entered, limit the possible selection for this field to the results of this dynamic query"? For this case, do you go through the pain of extending a known standard or do you go the route of a private DSL that you then map to standards when it is a good match?
Again, thanks for the post. Always interested in what you're up to.
Regards,
Brent
Everywhere you find a link, put a function call.
Brent,
Kind of working backwards here:
I differentiate in my own efforts between enumerants, which are exclusive tokens or terms that are schematically defined and fixed, and feeds, which are external lists whose items usually include both an internal token and a label (and which may have additional metadata associated with them). The use of "feeds" here is meant to (strongly) suggest the association between a feed of resources and an RSS/Atom feed, particularly since I use Atom as my transport protocol for the feeds.
Since I'm using an XForms idiom, the referents themselves are usually defined as an XML data instance -
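<xf:model>
    <!-- other instances -->
    <!-- Receives the feed; the placeholder root is an assumption, added because
         many processors expect non-empty initial instance data -->
    <xf:instance id="myFeedData1">
        <feed xmlns="http://www.w3.org/2005/Atom"/>
    </xf:instance>
    <!-- Parameters that are serialized into the query string of the GET -->
    <xf:instance id="myFeedParams1">
        <params>
            <a>1</a>
            <b>2</b>
        </params>
    </xf:instance>
    <xf:bind id="myFeedBind1" nodeset="instance('myFeedParams1')"/>
    <xf:submission
        id="myFeedGet1"
        method="get"
        action="http://www.myresources.com/atom/feed.xq"
        bind="myFeedBind1"
        replace="instance"
        instance="myFeedData1"
    />
    <!-- Fetch the feed as soon as the form is ready -->
    <xf:action ev:event="xforms-ready">
        <xf:send submission="myFeedGet1"/>
    </xf:action>
    <!-- ... -->
</xf:model>
<xf:select1 ref="someDataValue">
    <xf:itemset nodeset="instance('myFeedData1')//atom:entry">
        <xf:label ref="atom:title"/>
        <xf:value ref="atom:id"/>
    </xf:itemset>
</xf:select1>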
Where atom:id is of the form "EnumerantTypeURI:EnumerantValue".
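To make the atom:id convention concrete, a feed returned by the submission might look roughly like the sketch below. The enumerant type URI, the labels, and the timestamps are all hypothetical placeholders, and feed-level metadata that Atom requires (such as an author) is trimmed for brevity. The one firm detail is the namespace: the atom prefix used in the itemset has to be bound to http://www.w3.org/2005/Atom for the XPath expressions to match.

```xml
<!-- Hypothetical feed response; every value here is a placeholder -->
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Colors</title>
    <id>http://www.myresources.com/enum/color</id>
    <updated>2008-01-01T00:00:00Z</updated>
    <entry>
        <!-- EnumerantTypeURI:EnumerantValue -->
        <id>http://www.myresources.com/enum/color:red</id>
        <title>Red</title>
        <updated>2008-01-01T00:00:00Z</updated>
    </entry>
    <entry>
        <id>http://www.myresources.com/enum/color:blue</id>
        <title>Blue</title>
        <updated>2008-01-01T00:00:00Z</updated>
    </entry>
</feed>
```

With a feed of this shape, the select1 would show "Red" and "Blue" as labels and store the corresponding atom:id strings in someDataValue.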
Thus, while feeds usually bear a superficial resemblance to static enumerants, in point of fact they require considerably more resources to support, and in general it is not advantageous for them to be bound explicitly into the static infrastructure. Keeping them external is useful both for computed lists (such as filtered content, especially content which may change parametrically) and for selections that have dozens of potential enumerants and as such would add considerably to the initial download time of the page.
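One way to handle a list that changes parametrically, sketched here under assumptions rather than taken from the actual framework, is to bind a control to one of the feed parameters and re-send the submission whenever that value changes. The control and its label are hypothetical; the instance and submission ids are the ones used in the markup above, and the xf and ev prefixes are assumed to be bound as they are there.

```xml
<!-- Hypothetical control bound to parameter "a" of the feed request -->
<xf:input ref="instance('myFeedParams1')/a">
    <xf:label>Filter</xf:label>
    <!-- Re-run the GET whenever the parameter changes, so that the
         itemset is rebuilt from the freshly returned feed -->
    <xf:send ev:event="xforms-value-changed" submission="myFeedGet1"/>
</xf:input>
```

Because the submission replaces the myFeedData1 instance, any select1 whose itemset points at that instance picks up the new entries without further wiring. Restrictions such as security roles would still have to be enforced by whatever serves the feed.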
The mappings to generate this are contained in an intermediate tokenized form that's mapped through a couple of XSLTs, as you surmised. I'll try to make that clearer moving forward.