Efforts have been underway recently to develop a schema language for JSON, analogous to the XML Schema Definition Language (XSD) or RelaxNG languages in the XML arena. Similarly, a JSON transformation language is being proposed and bandied about in various AJAX circles as web2 developers attempt to take the best of what XML has to offer and recast it from the angle-bracket modality to the braced modality.
These efforts are intriguing, and for the most part people within the XML community are now wearing the same rather confused expression that I remember seeing on the SGML generation as they watched the young turks of the XML movement push their view out to the world - “Didn’t we already DO that?”
I find, watching this phenomenon continue, that I’ve begun to question some of the dogma that I’ve subscribed to for the last several years. It is not so much a matter of questioning whether XML veered away from its true path somewhere along the line (with the possible exception of SOAP, and the rather dubious benefits of the long-winded API that DOM became, I think that XML’s done rather well for itself as a technology). Rather, I wonder whether it is time for those who hold the biggest trust of XML knowledge to ask if perhaps the JSON folk are raising some questions about the fundamental nature of what it is we WANT out of XML.
Over the years, based upon an interest in the long distant past in role playing games (including a stint as a contractor at both TSR and Wizards of the Coast, spaced more than two decades apart), I’ve developed a kind of stock game character class, one that illustrates fairly nicely the idea of complex structures. In XML, a standard character may be rendered as:
<gameCharacter>
<identity>
<name>Aleria Delamare</name>
<gender>Female</gender>
<species>Half-Elf</species>
<vocation>Mage</vocation>
<lcAlignment>Neutral</lcAlignment>
<geAlignment>Good</geAlignment>
</identity>
<scores>
<game_level>5</game_level>
<health>16</health>
<gold>125</gold>
</scores>
<attributes>
<strength>12</strength>
<intelligence>17</intelligence>
<wisdom>16</wisdom>
<constitution>12</constitution>
<dexterity>15</dexterity>
<charisma>17</charisma>
</attributes>
<spells>
<spell level="1">Create Light</spell>
<spell level="1">Charm Person</spell>
<spell level="2">Minor Illusion</spell>
<spell level="2">Predict Weather</spell>
<spell level="3">Mask</spell>
</spells>
</gameCharacter>
While this particular breakdown makes the specific characteristics of my half-elf mage reasonably obvious, the challenge comes in doing something with this information.
For instance, suppose that you wished to get a listing of all first level spells that Mga. (magistra, for those not in the know) Delamare has in her possession. Doing this through DOM is singularly painful (all code samples are JavaScript):
function getSpellsByLevel(doc,level){
    // Collect every <spell> element in the document (no namespace)
    var spellNodes = doc.documentElement.getElementsByTagNameNS("","spell");
    var spell_arr = [];
    for (var index=0;index != spellNodes.length;index++){
        var spell = spellNodes.item(index);
        // getAttribute returns a string, so == lets the number 1 match "1"
        if (spell.getAttribute("level")==level){
            spell_arr.push(spell.firstChild.nodeValue);
        }
    }
    return spell_arr;
}
// This varies based upon the JavaScript implementation
var character = document.getInstanceDocument("aleria.xml");
var spells = getSpellsByLevel(character,1);
Is it any wonder that the AJAX community rebelled at something like that? The issue of addressability is very much at play here, given especially that one of the early reasons for the success of XML in the first place was the abstraction of its internal address space. The tree-walking navigation of DOM, while necessary at a very low level, does not necessarily lend itself to providing an intuitive, clean interface for working with the hierarchical XML infoset, something which more traditional class-oriented languages such as Java or (for that matter) JavaScript do quite well.
Of course, if being able to address that XML structure is so critical, then obviously XPath should be able to provide a far better solution - and it does … sort of. The XPath for retrieving the spell list is pretty simple:
/gameCharacter/spells/spell[@level=1]/text()
Nice, concise, making use of the folder metaphor for retrieving the relevant content. The problem is not in the XPath, but in the invocation of the XPath, as can be seen in the XPath implementation used by Firefox (and more or less endorsed by the W3C).
var xpEvaluator = new XPathEvaluator();
var xpResolver = xpEvaluator.createNSResolver(doc);
var xpResult = xpEvaluator.evaluate("/gameCharacter/spells/spell[@level=1]/text()",doc,xpResolver,XPathResult.ORDERED_NODE_ITERATOR_TYPE,null);
var spell_arr = [], spellNode;
while ((spellNode = xpResult.iterateNext())!=null){
    spell_arr.push(spellNode.nodeValue); // keep the text, not the node itself
}
Um, yeah. Nuff said.
The lack of standardization on XPath, combined with the fairly byzantine interfaces designed by the W3C (understandable, perhaps, given the context), has generally meant that most developers working with XML will not reach for XPath even when it is the best tool out there for accessing context, because the amount of work necessary to invoke that XPath just makes it too painful unless it’s buried deep in fairly non-standardized wrapper classes.
The notation for encoding the game player as a JavaScript object is very straightforward, and on a legibility scale is somewhat cleaner:
var gameCharacter = {
identity:{
name:'Aleria Delamare',
gender:'Female',
species:'Half-Elf',
vocation:'Mage',
lcAlignment:'Neutral',
geAlignment:'Good'
},
scores:{
game_level:5,
health:16,
gold:125
},
attributes:{
strength:12,
intelligence:17,
wisdom:16,
constitution:12,
dexterity:15,
charisma:17
},
spell:[
{name:"Create Light",level:1},
{name:"Charm Person",level:1},
{name:"Minor Illusion",level:2},
{name:"Predict Weather",level:2},
{name:"Mask",level:3}
]
};
While JavaScript notation doesn’t have quite the same query capabilities as XPath, the exercise with JavaScript and a JSON object like gameCharacter illustrates much of the simplicity of the approach:
var spells = gameCharacter.spell;
var spell_arr=[];
for (var index=0;index<spells.length;index++){
    if (spells[index].level==1){spell_arr.push(spells[index].name);}
}
The notation is short, sweet, easy to code and is consistent with normal JavaScript practices. This simplicity in both declaring and working with JavaScript (and by extension with JSON) raises significant questions about whether JSON may in fact be better in this role of web communication than XML.
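As a hedged aside, the same lookup can also be phrased as a filter over the array, which comes a little closer to the predicate style of XPath. The sketch below assumes an engine that provides Array.prototype.filter and map (standard since ECMAScript 5) and works on a cut-down copy of the gameCharacter object; the helper name spellsByLevel is purely illustrative.

```javascript
// Cut-down copy of the gameCharacter object (spell list only).
var gameCharacter = {
  spell: [
    {name: "Create Light", level: 1},
    {name: "Charm Person", level: 1},
    {name: "Minor Illusion", level: 2},
    {name: "Predict Weather", level: 2},
    {name: "Mask", level: 3}
  ]
};

// Hypothetical helper: names of all spells at the given level.
function spellsByLevel(character, level) {
  return character.spell
    .filter(function (spell) { return spell.level === level; })
    .map(function (spell) { return spell.name; });
}

var firstLevel = spellsByLevel(gameCharacter, 1);
// firstLevel is ["Create Light", "Charm Person"]
```

The filter callback plays the role of the [@level=1] predicate and the map callback the role of text(), though unlike XPath it only looks at one fixed level of the object.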
Admittedly, JSON does have some limitations. One of the most significant is the fact that JSON has no intrinsic concept of namespace. Namespaces are not universally loved in the XML world, but as people work with an ever larger gamut of XML, the advantages of namespaces usually become apparent quickly. What this means in practice is that while it is possible to use lexical namespace constructs in JSON, they are only coding conventions - there is no intrinsic capability within JavaScript for differentiating on namespaces.
Another problem faced by JSON is what I call the ABA dilemma. A JavaScript object, like any hash or associative array system, can effectively store only one “name” at any given level of object. Thus, expressions like
{a:"foo",b:"bar",a:"bat"}
do not work in JavaScript: an object holds only one value per key, so the second a replaces the first. On the other hand, the equivalent in XML -
<root>
<a>foo</a>
<b>bar</b>
<a>bat</a>
</root>
is perfectly legitimate. As that structure is frequently found in documents (such as alternating <p> tags with <ul> tags) this can be a problem in encoding XML documents. You CAN use arrays:
{a:["foo","bat"],b:["bar"]}
but then you lose any sense of order. Ultimately, the only real way around this is to assign positions explicitly to each entity:
{a:[{name:"foo",position:0},{name:"bat",position:2}],b:[{name:"bar",position:1}]}
though this can be difficult to reparse and adds additional layers of metadata that were implicit in the XML.
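To give a sense of what that reparsing involves, here is a minimal sketch that rebuilds document order from the position-annotated form. The helper name inDocumentOrder and the {tag, value, position} result shape are assumptions made for illustration, not part of any existing convention.

```javascript
// The position-annotated encoding of <a>foo</a><b>bar</b><a>bat</a>.
var encoded = {
  a: [{name: "foo", position: 0}, {name: "bat", position: 2}],
  b: [{name: "bar", position: 1}]
};

// Hypothetical helper: flatten every entry, then sort by explicit position.
function inDocumentOrder(obj) {
  var items = [];
  for (var tag in obj) {
    if (!Object.prototype.hasOwnProperty.call(obj, tag)) { continue; }
    for (var i = 0; i < obj[tag].length; i++) {
      var entry = obj[tag][i];
      items.push({tag: tag, value: entry.name, position: entry.position});
    }
  }
  items.sort(function (x, y) { return x.position - y.position; });
  return items;
}

var ordered = inDocumentOrder(encoded);
var tags = ordered.map(function (item) { return item.tag; }).join(",");
// tags is "a,b,a"
```

Every consumer of the data has to perform this flatten-and-sort step before it can see what the XML stated simply by putting one element after another.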
However, it can (and indeed should) be argued that the principal role of JSON is to act as a lightweight object transport envelope, not a document one, and the ABA problem occurs only rarely in object representations. Moreover, in all likelihood, if you did have such a document structure, it’s very likely that this information is being transported as HTML strings within a JSON container (or e4x objects, as discussed momentarily), and as such will likely occur comparatively seldom. Admittedly, I can think up a number of counterexamples where this assertion isn’t true, enough that making the assumption could cause problems down the road.
I’ve deliberately held off discussing ECMAScript for XML (e4x) here, because, at least for Mozilla Firefox and Adobe Flash, e4x readily addresses all of the same issues that JSON does. Indeed, if doc holds a string representation of the XML in question, retrieving a list of all first level spells takes three lines of code:
var edoc = new XML(doc);
var spells=edoc.spells.spell.(@level==1);
for each (var spell in spells){print(spell.toString());}
I suspect that a number of factors, from an incomplete Mozilla implementation to the reluctance of developers to embrace XML after battling DOM implementations and a comparative lack of information about the technology, have kept e4x from penetrating the web space as fully as it could have. JSON also has the advantage of being portable to a number of different server implementations, whereas even if you could work with e4x on the client, server side XML has similarly remained largely hidden behind SOAP servers or syndication library tools.
From the standpoint of the XML community, I think all of this should be examined closely. I do not in general see JSON replacing XML - there are simply too many things that XML is already doing where JSON objects offer little to no real advantage. However, there are also places, especially in the areas of messaging and syndication, where JSON actually offers quite a bit of advantage, and there are more than a few situations where a given service (especially a REST based one) should realistically be able to handle both formats.
JSON is not XML, and I think it would be foolish for the XML community to try to treat it as if it was. However, at the same time, being able to render JSON as XML and vice versa makes a great deal of sense in a number of circumstances, especially in areas such as transformations and validation (indeed, it’s not hard to see something like RelaxNG, which already has a JSON-like notation, being modified in very limited ways in order to be used as a JSON validator).
To that end, I’d like to push a proposal to both the XML and AJAX communities, something that I think needs to be taken up by the W3C, the OpenAJAX alliance and JSON.org especially. Establish a set of conventions within JSON that most readily facilitate JSON being used in an XML context. These conventions should be syntactical, things that can be done with hash key naming conventions that can be picked up by a JSON/XML bridge to transform between the two formats. This would serve a number of purposes; first of all, it keeps the web world from a schism that could be devastating for interoperability for both sides; it makes it possible for JSON developers to take advantage of mature and sophisticated XML-based systems and it makes it possible for XML developers to create lightweight messaging systems to the client without having to spend a great deal of effort in writing ad-hoc translation software.
Finally, and perhaps most importantly, it also makes it possible for the XML and JSON communities to work in tandem with one another rather than at cross-purposes. Certainly a little competition is good, but this kind of competition does nothing for either side but duplicate effort needlessly and fragment the web.
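To make the idea of key-naming conventions a little more concrete, here is one possible sketch of the JSON-to-XML half of such a bridge. The conventions it uses ("@" prefixed keys become attributes, "#text" holds text content, arrays repeat an element) are assumptions chosen for illustration, not an existing standard, and the function names toXml and escapeXml are likewise hypothetical.

```javascript
// Escape the characters that are significant in XML text and attributes.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical conventions (assumptions, not a published standard):
//   "@key"  : attribute on the enclosing element
//   "#text" : text content of the enclosing element
//   array   : the element is repeated once per item
function toXml(name, value) {
  if (value === null || value === undefined) {
    return "<" + name + "/>";
  }
  if (Object.prototype.toString.call(value) === "[object Array]") {
    var parts = [];
    for (var i = 0; i < value.length; i++) { parts.push(toXml(name, value[i])); }
    return parts.join("");
  }
  if (typeof value !== "object") {
    return "<" + name + ">" + escapeXml(value) + "</" + name + ">";
  }
  var attrs = "";
  var body = "";
  for (var key in value) {
    if (!Object.prototype.hasOwnProperty.call(value, key)) { continue; }
    if (key.charAt(0) === "@") {
      attrs += " " + key.substring(1) + '="' + escapeXml(value[key]) + '"';
    } else if (key === "#text") {
      body += escapeXml(value[key]);
    } else {
      body += toXml(key, value[key]);
    }
  }
  return "<" + name + attrs + ">" + body + "</" + name + ">";
}

var spells = {
  spell: [
    {"@level": 1, "#text": "Create Light"},
    {"@level": 3, "#text": "Mask"}
  ]
};
var xml = toXml("spells", spells);
// xml is '<spells><spell level="1">Create Light</spell><spell level="3">Mask</spell></spells>'
```

A real bridge would also need the reverse mapping and an answer to the ABA ordering problem, which is exactly why agreed conventions would be more useful than each project inventing its own.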
I also think that this offers the W3C an opportunity to coordinate with ECMA on the full endorsement of e4x as a viable XML version. The W3C DOM has its place (and won’t be replaced any time soon) but it’s also looking increasingly antiquated and cumbersome given such things as e4x, JSON and Microsoft’s LINQ. As the W3C is already undertaking this effort on one front with the Efficient XML Interchange (EXI) format for the encoding of “binary” XML, the opportunity is there to establish the fact that the XML DOM is not the only way to manipulate XML.
I’ll be talking about this more in the future, including giving a talk about “Lightweight XML” at the XML 2007 conference in Boston this Winter. It’s time that we take a long, hard look at what XML is all about - I suspect that the answers may surprise us all.
Kurt Cagle is an author, web developer and information architect, and is the webmaster for XForms.org. He writes regularly for O’Reilly Books on XML.com and maintains his own blog, Metaphorical Web. He lives in Victoria, British Columbia, where he enjoys jumping into piles of leaves.


"DOM, while necessary at a very low level, does not necessarily lend itself to ... AARGGGGHHHH" ... What? To what?
First, JSON's aba "problem" is resolved as:
[{a:"foo"},{b:"bar"},{a:"bat"}]
This is nice because it explicitly indicates order. In XML, this has to be mandated using a schema. I personally prefer to see context within the document itself when it can remain concise without duplication though it does mean you can't sloppily create the document but that's the same if your document must adhere to a schema.
XPath is great ... for talking about XML. It shouldn't be the primary means for accessing data within a programming language, which, by the way, is 99.9999% of the way structured documents are processed. This has been one of my biggest issues for the past few years. Languages should be finding ways to represent and interrogate documents using native syntax.
For example, take Java, an XML document should be converted to a language-specific semantic so that accessing "/gameCharacter/spells[@level=1]/text()" becomes like
foreach (gameCharacter.spells, {level==1}) println text();
Of course, XPaths can be far more complicated than a typical language syntax can reasonably represent but the point is that the document will ultimately be processed with a particular language so document processing should be tightly coupled with the language syntax and semantics.
Hm, so they want a way to write schemas for a language that is a stripped-down version of a language which has schemas (more or less). It sounds to me like they want.... JavaScript?
@q,
This is nice because it explicitly indicates order. In XML, this has to be mandated using a schema.
Are you on crack? WTF? What does,
[{a:"foo"},{b:"bar"},{a:"bat"}]
... suggest implicitly about order that the XML fragment,
<root>
<a>foo</a>
<b>bar</b>
<a>bat</a>
</root>
... does not? I do not need an XML schema to tell me that elements a, b, a are ordered.
The assumed "problem" with XML is not XML itself. The problem is w/ people who make silly assumptions about XML w/o first verifying those assumptions are correct.
s/implicitly/explicitly
For example, take Java, an XML document should be converted to a language-specific semantic so that accessing "/gameCharacter/spells[@level=1]/text()" becomes like
foreach (gameCharacter.spells, {level==1}) println text();
No it doesn't. /gameCharacter/spells[@level=1]/text() returns the text nodes that are descendants of /gameCharacter/spells whose @level is == 1. foreach (gameCharacter.spells, {level==1}) println text(); will iterate through those text nodes and print them out to the screen or other specified device. In other words you're dealing with the difference between a data set and a function that processes that data set.
M. David,
Methinks that "q" is solving the _JSON_ order problem, not suggesting XML has one. And, yes, the results in XML look very similar to JSON.
XML has HISTORY. Lots of tools; well understood
JSON is great for client side AND I get to stand on the shoulders of Giants - I can make JSON schema better, faster. We have the technology. (Now if I only had $6 million, man!)
For example, I can do cross selectors in XPath2, but it's a pain using old processors. Also, XSLT is still NOT a functional language: in jsPath, I can define my OWN cross selectors and do WHATEVER I want with MY data.
q,
Good point on the ABA resolution, and one that basically confirms my initial thesis. As I was writing this, I had to ask what the drawbacks were to JSON in terms of encoding XML; the internal answer was not much, and the ABA issue was really the only one that came to mind.
XPath emerged for two reasons - given the hierarchical nature of a given XML document, a folder paradigm was a fairly obvious metaphor for encapsulating that information, and the predicate model also assumed navigation in directions beyond descendants. That you can argue that the "." notation is superior in some (many) respects should indicate that there are in fact other valid syntactical mechanisms for accessing data. That e4x can encapsulate much the same information using dot notation is proof that such a paradigm is quite effective for working with XML.
There is one point that is worth noting here, however. XPath is in many respects more akin to SQL than it is to class manipulation - it works on sets, not individual contexts. This isn't necessarily a superior approach, but it is a different one than JavaScript takes.
Note that neither XML nor JS explicitly REQUIRE schemas (unlike languages such as C++ or Java).
Ooops, forgot to indicate it was me writing the previous comment.
@M. David,
In other words your dealing with the difference between a data set and a function that processes that data set.
I'll just ask a question, within any XPath implementation to process "/gameCharacter/spells[@level=1]/text()" what will they have to do prior to returning the set?
Ok, I can't resist, I'll answer that. They'll traverse and iterate. I was only making a point that processing XPaths abstracts away processing details at the expense of code uniformity. All I'm saying is that XPath is great, I just wish languages implemented such capabilities natively.
And no, I'm not on crack. I was saying that using "[]" in JSON explicitly mandates an ordered list. There is nothing in your XML that does the same. It's partially implicit in XML but if I just have
<root>
<a>foo</a>
<b>bar</b>
</root>
How would I know that is supposed to become [{a:"foo"},{b:"bar"}] as opposed to {a:"foo",b:"bar"}?
I'd actually prefer the latter but if the schema indicates it must support "aba", my JSON creation would fail to reproduce the XML in certain circumstances. But when seeing [{a:"foo"},{b:"bar"},{a:"bat"}] you immediately understand the ordering requirement without referring to a schema.
As I see it, the notation
{root:[{a:"foo"},{b:"bar"},{a:"bat"}]}
produces a JS tracking of:
root[0].a, root[1].b, root[2].a
This is where e4x has the advantage:
<root>
<a>foo</a>
<b>bar</b>
<a>bat</a>
</root>
encodes as:
root.a[0],root.b,root.a[1]
and you can work with all a nodes with the expression:
for each (var a in root.a){process(a);}
Using the full encoding layer described above, the same usage with object notation is a little more hairy:
for (var index=0;index != root.length;index++){
    var node = root[index];
    for (var objname in node){
        if (objname=="a"){process(node[objname]);}
    }
}
That highlights the fundamental difference between the two notations - e4x is implicitly designed to work with node-sets, while js isn't. This is more noticeable when you start talking about descendants:
/root/*/*/bin[starts-with(.,'fin')]
requires a lot of JavaScript object code to duplicate, and the generic:
//bin[starts-with(.,'fin')]
is worse.
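As a rough illustration of what duplicating the descendant search involves, here is a minimal sketch of a recursive walker over plain JavaScript objects. The helper name findDescendants and the sample data are hypothetical, and it only approximates the XPath: it matches on property names and has no notion of attributes, parents or document order beyond property order.

```javascript
// Hypothetical helper: collect every value stored under `key`, at any depth,
// that passes `test`. A rough analogue of //key[predicate].
function findDescendants(node, key, test, found) {
  found = found || [];
  if (node === null || typeof node !== "object") { return found; }
  for (var name in node) {
    if (!Object.prototype.hasOwnProperty.call(node, name)) { continue; }
    var value = node[name];
    if (name === key && test(value)) { found.push(value); }
    // Recurse into nested objects and arrays alike.
    findDescendants(value, key, test, found);
  }
  return found;
}

// Sample data, invented for this example.
var root = {
  shelf: {box: {bin: "finch"}},
  crate: [{bin: "final"}, {bin: "other"}],
  bin: "finish"
};

var hits = findDescendants(root, "bin", function (value) {
  return typeof value === "string" && value.indexOf("fin") === 0;
});
// hits is ["finch", "final", "finish"]
```

It works, but the traversal that XPath hides behind // has to be written out, and maintained, by hand.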
var spells = gameCharacter.spell;
var spell_arr=[];
for ([index,spell] in spells){
if (spell.level==1){spell_arr.push(spell.name);}
}
The notation is short, sweet, easy to code
Maybe, but it's 5 lines versus 8 lines for the equivalent DOM code. The XPath code is 7 lines. Yes, the JSON version is simpler and more elegant, but is it really a quantum leap?
E4X is very attractive to me; seeing it in the title of this article was why I read it. It looks to me like the most readable way of handling XML in JS. If MSIE supported it, I'd probably have used it in my last project.