December 2007 Archives

Kurt Cagle

Once again into the breach, dear friends. I’ve made it a habit over the last several years to put together a list of both my own forecasts for the upcoming year in the realm of XML technologies and technology in general and, with some misgivings, to see how far off the mark I was from the LAST time that I made such a list. The exercise is useful, and should be a part of any analyst’s toolkit, because it forces you to look both at the trends and at potential disruptors to trends (which I’ve come to realize are also trends, albeit considerably more difficult to spot).

So, without further ado, I bring up the list from last year …

Rick Jelliffe

The vogue quip that “a camel is a horse designed by committee” probably makes more sense to people who don’t live in a desert country. From here in Australia, camels seem to be a very plausible design. It is the speaker, actually, who is wrong: what you need is a camel when you are in the desert, a horse on the plains, a yak in the mountains, perhaps a porpoise in the sea, and an elephant in the jungle.

The ongoing XML Schemas trainwreck shows little sign of improvement; that users have so repetitively stated their problem and received no satisfaction from the W3C shows how disenfranchised they are. I am thinking about these things again this week for three reasons.

First, I saw (only 2 years too late) the AT&T-originated guidelines on XML Schemas Best Practices which underlie a best-practices checker tool at Java.net. It goes through the capabilities of a particular class of application (while assuming that everyone is interested in the same class of applications, grrr: “XML” is not just what one set of software uses) and gives a list of what will cause problems or be unportable. Some (like deprecating <appinfo>) are dubious, but most seem well-founded. It is a good document for anyone reading.

The tables in A.2 and A.3 are especially interesting, or horrific in practical terms. None of the software supported derivation of complex types by restriction fully, most not at all. None fully supported ID datatypes. Only one implementation fully supported enumerations. Basically, type derivation of complex types was a complete non-starter.

The other reason I am thinking about it was for work. A customer wants to use MS InfoPath with a schema I have been working on. But, predictably, InfoPath has a range of things it doesn’t support. Many of them (replacing “unbounded” for the cardinality of choice groups with some reasonable number) are trivial, but it is the same issue.
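To show how mechanical that sort of workaround is, here is a minimal Python sketch that caps unbounded choice groups in a schema. The cap of 50, the function name and the sample schema are all invented for illustration; none of it comes from InfoPath's documentation, and a real schema would need the cap agreed with whoever owns the data.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

def cap_unbounded_choices(xsd_text, cap=50):
    """Return the schema text with maxOccurs="unbounded" on xs:choice replaced by cap."""
    # Keep the conventional xs: prefix when serializing.
    ET.register_namespace("xs", XS)
    root = ET.fromstring(xsd_text)
    for choice in root.iter(f"{{{XS}}}choice"):
        if choice.get("maxOccurs") == "unbounded":
            choice.set("maxOccurs", str(cap))  # cap value is an assumption
    return ET.tostring(root, encoding="unicode")

# Hypothetical schema fragment, not the customer's schema.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="doc">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="para" type="xs:string"/>
        <xs:element name="note" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

capped = cap_unbounded_choices(SAMPLE)
print(capped)
```

The catch, of course, is that the capped schema is now a different schema from the one everybody else is using, which is exactly the profile problem.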

A little over a year ago, Paul Klee had a great summary article on XML.COM, XML Schemas Profile. It mentions the 2005 W3C Workshop on XML Schema 1.0 User Experiences, and the do-nothing Chair’s report (“No-one wants anything, and if they do they don’t agree, and if they agree it cannot be done, and if it could be done other people don’t want it, and if other people do want it they actually want something else, and if they don’t want something else it would be confusing.”) It looks like very strong leadership for inertia, and it cheeses me off that their laziness affects me and my clients at the end of 2007.

One positive thing that has come out has been the W3C Basic XML Schemas Databinding Patterns which lists various XPaths that databinding tools can have. (It mentions how to use these in Schematron, which is good too!) But it doesn’t come up to the level of a profile. (And, to be fair, the W3C Schema WG has also upgraded XSD to reduce some gotchas that have been reported, such as allowing unbounded on all groups.)

Why not? Because, as far as I can make out, the idea is that if we all pretend that XML Schemas is a unified and whole specification, one size that can fit all, then somehow it will magically happen. But fantasy is a really poor substitute for reality. Time and time again I have seen clients happy about XML Schemas and its promises, only to have their hopes dashed as they realize that as soon as they need to start deploying they have to use subsets, and that there is no support from “standards” to help interoperability.

The third thing? DIS29500 gave XML Schemas that worked in MSXML, but failed in Xerces. This was raised as an issue (by Japan among others) and the schema is being reworked to support Xerces. (The issue is to do with circular imports IIRC: I think the new schemas will be in a single file per namespace and that will help the RELAX NG conversion too.) Again, this is an issue we are dealing with in late 2007.

And that is what you get when you have a large standard that is not sufficiently modular and focussed to support its main applications: guaranteed non-interoperability. This lack of modularity has been an issue that has been relentlessly pointed out to the W3C XML Schema Working Group and just as relentlessly ignored: and the result is that it is surprising if we find a schema that works out-of-the-box with the particular tools desired for a job.

Why is it that we are going into 2008 and we still have exactly the same kinds of problems that were clearly expressed as real problems in the 2005 experience workshop, and which were predicted vociferously before then?

M. David Peterson

The countdown begins…

M. David Peterson

So in about 6 days all direct addressed EC2 instances will be shut down. This day comes with *PLENTY* of warning, so decommissioning the 3 direct addressed EC2 instances that we still have running has been planned for a while. Of course, why do something now if you can just as easily put it off until later? ;-)

Okay, so maybe that’s not the best philosophy in life, but when you’ve designed your server infrastructure around worst case scenario disaster recovery, the thought of “losing” an instance or three doesn’t present the type of anxiety you would normally expect, so in the case of EC2, it actually works pretty well.

That said, as per the following screen scrape *even if* we didn’t design our system with a worst case scenario mentality, we’d probably still be okay,

Kurt Cagle

The unification of XML and SQL relational data has taken another significant step forward recently with the introduction of significant new XML functionality in mySQL, the world’s most popular open source database. In versions 5.1 and 6.0, mySQL adds the ability to retrieve tables (and JOINS) as XML results, to retrieve SQL schemas as XML files, to both select content via a subset of XPath and to update content using similar functions, and the like, as related recently in an article on the mySQL site: http://dev.mysql.com/tech-resources/articles/xml-in-mysql5.1-6.0.html .

M. David Peterson

It seems Safari is the only browser that will leave you wondering why on earth it, seemingly at random, refuses to make even the most token attempt at accessing a particular URI via the document function. That’s because each of the other browsers will automatically URL encode GET requests, whereas Safari will not, and as such will throw an internal (I assume internal to the underlying OS?) error. Of course it won’t tell you it threw an error, which will be the source of significant hair pulling, but nonetheless an error has been thrown, somewhere. ;-)

I’m not immediately finding anything in the XSLT 1.0 spec that even remotely touches on whether or not it is the job of the transformation engine, the underlying system, or the developer to properly URL encode a request, so I can only assume that regardless of whether or not it’s a pain in the a$$, not URL encoding requests made via the document function is completely within the realm of a standards compliant XSLT processor. Anyone in the know care to clarify?

In the meantime, one of the better resources I’ve found for both quick and easy reference as well as on-the-fly encoding of any given URI is located @ http://www.blooberry.com/indexdot/html/topics/urlencoding.htm. If you find yourself about ready to rip your hair out because Safari refuses to make any attempt at retrieving the document located at any given URI, check the above resource. Chances are pretty good that something as simple as a | character not being properly URL encoded is the culprit.
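If you would rather encode the URI before it ever reaches the stylesheet, a minimal Python sketch of the idea follows. The function name and the example.org URI are made up for illustration, and the set of characters left untouched is my own choice (the URI delimiters, plus % so that already-encoded sequences are not encoded twice), not something any spec or browser mandates.

```python
from urllib.parse import quote

def encode_uri_for_document(uri):
    """Percent-encode characters that should not appear raw in a URI."""
    # Leave the URI delimiters alone; keep '%' so existing escapes survive.
    return quote(uri, safe=":/?#[]@!$&'()*+,;=%")

raw = "http://example.org/feed?tags=xml|xslt"  # hypothetical URI
encoded = encode_uri_for_document(raw)
print(encoded)  # http://example.org/feed?tags=xml%7Cxslt
```

The encoded string is what you would hand to document() in place of the raw one.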

M. David Peterson

RFC 2068: Section 3.2.1: General Syntax

Note: Servers should be cautious about depending on URI lengths above 255 bytes, because some older client or proxy implementations may not properly support these lengths.

Okay, so if I’m making a web service call to a particular URI it’s more than likely going to be inside of my own code base as opposed to inside of someone’s client. And in the cases where it’s not, chances are pretty good that this same client doesn’t support the extended functionality of my shiny new asynchronous Web 2.x+ app. So whether or not a client supports URI lengths over 255 bytes is probably less of a concern given that these same clients couldn’t support my application in the first place.

But let’s set aside the most likely client-side scenarios and assume nothing: RFC 2068 is about a week shy of being 11 years old. Is the 255 byte URI length recommendation still applicable? From the client perspective, possibly not. But what about from the proxy perspective? And are there clients (possibly mobile browsers?) that I’m not taking into consideration that still impose a limitation on the URI length?

NOTE: As of October 27th, 2007 the limit inside of Internet Explorer is 2083 bytes. Is 2083 bytes today’s equivalent of the 255 byte recommendation of 11 years ago? (You would have to assume that MSFT didn’t arbitrarily arrive at this figure, basing the limitation on known limitations of the existing infrastructure of the Internet, correct?)
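While waiting for an answer, it is easy enough to at least flag the URIs that cross either line. Here is a minimal Python sketch; the two thresholds are the figures quoted above, while the function name, the labels and the example URI are invented for illustration. It measures the UTF-8 byte length, which is an assumption on my part about how the limits should be counted.

```python
# Thresholds taken from the figures discussed above.
LIMITS = {
    "RFC 2068 cautionary length": 255,
    "Internet Explorer limit": 2083,
}

def uri_length_report(uri):
    """Return the URI's byte length and, per limit, whether the URI fits within it."""
    size = len(uri.encode("utf-8"))
    return size, {name: size <= limit for name, limit in LIMITS.items()}

# Hypothetical web service call with a 300-character query value.
long_uri = "http://example.org/service?q=" + "x" * 300
size, fits = uri_length_report(long_uri)
print(size, fits)
```

A URI like this one fits comfortably inside the IE figure but is well past the old 255 byte caution, which is the range where the proxy question matters.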

Rick Jelliffe

Sean McGrath’s Master Foo On Structured Documents makes a similar point to my Standardize the jellybeans not the jars, and is worth a read.

However, there is one big problem with open content models, and using generic containers: many automated XML tools only use schema information and not instance information to do their stuff. This is a problem I am facing right at the moment, actually: a customer wants to use Brand X tool which lets you map from controls on a form to elements in a schema, but also wants to use an industry-standard schema which uses data values.

For example, the tool would like a document like this:

    <Customer>
        ...
        <homephone>1234</homephone>
        <businessphone>1324</businessphone>
        <fax>123</fax>
        ...
    </Customer>

but the industry standard has

    <Party>
        <Person>
            <PersonTypeCode tc="1">Customer</PersonTypeCode>
            ...
            <Phone>
                <PhoneTypeCode tc="1">Home</PhoneTypeCode>
                <DialNumber>1234</DialNumber>
            </Phone>
            <Phone>
                <PhoneTypeCode tc="2">Business</PhoneTypeCode>
                <DialNumber>1234</DialNumber>
            </Phone>
            <Phone>
                <PhoneTypeCode tc="12">Fax</PhoneTypeCode>
                <DialNumber>1234</DialNumber>
            </Phone>
        </Person>
    </Party>

In the first case the XPath to the fax number is
//Customer/fax

In the second case the XPath is
//Party/Person[PersonTypeCode='Customer']/Phone[PhoneTypeCode/@tc="12"]/DialNumber
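To make the difference concrete, here is a small Python sketch that pulls the fax number out of both shapes using only the standard library. The sample documents are cut-down versions of the two above with dial numbers I made up, and since ElementTree supports only a subset of XPath, the type-code test is done in ordinary code rather than in the path; the function names are mine.

```python
import xml.etree.ElementTree as ET

# Cut-down sample data; the numbers are invented for illustration.
FLAT = """<Customer>
  <homephone>1234</homephone>
  <businessphone>1324</businessphone>
  <fax>123</fax>
</Customer>"""

GENERIC = """<Party>
  <Person>
    <PersonTypeCode tc="1">Customer</PersonTypeCode>
    <Phone>
      <PhoneTypeCode tc="1">Home</PhoneTypeCode>
      <DialNumber>1234</DialNumber>
    </Phone>
    <Phone>
      <PhoneTypeCode tc="12">Fax</PhoneTypeCode>
      <DialNumber>5678</DialNumber>
    </Phone>
  </Person>
</Party>"""

def fax_from_flat(xml_text):
    # The element name alone identifies the fax number.
    return ET.fromstring(xml_text).findtext("fax")

def fax_from_generic(xml_text):
    # The element names say nothing; the data values have to be inspected.
    root = ET.fromstring(xml_text)
    for phone in root.findall("Person[PersonTypeCode='Customer']/Phone"):
        if phone.find("PhoneTypeCode[@tc='12']") is not None:
            return phone.findtext("DialNumber")
    return None

print(fax_from_flat(FLAT))        # 123
print(fax_from_generic(GENERIC))  # 5678
```

A tool that only reads the schema can discover the first path by itself; it has no way of discovering the second.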

This kind of issue is a common problem, and the answer is almost always either to forgo the graphical tools (sometimes the application’s backend can handle more complicated XPaths than the IDE GUI can) or to transform the data in and out so that the application works with data in an optimal form (which requires having a customized schema for the particular application or class of application.) In many cases, it seems that the large standard schemas are either “jack of all trades but master of none” or that they really are designed for neutral data interchange and adopters should expect to have to do some information-preserving transforms in and out.
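A rough Python sketch of the inbound half of such a transform is below. The mapping from type codes to flat element names is hypothetical, taken from the example documents rather than from any real industry schema, and the sketch only goes one way: a genuinely information-preserving transform would also need the reverse mapping and a policy for type codes it does not recognise (here they are silently skipped).

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from type codes to the flat names the form tool wants.
PHONE_NAMES = {"1": "homephone", "2": "businessphone", "12": "fax"}

def party_to_customer(party_xml):
    """Flatten the first Customer person in a Party document into a Customer element."""
    party = ET.fromstring(party_xml)
    person = party.find("Person[PersonTypeCode='Customer']")
    customer = ET.Element("Customer")
    if person is None:
        return customer
    for phone in person.findall("Phone"):
        type_code = phone.find("PhoneTypeCode")
        if type_code is None:
            continue
        name = PHONE_NAMES.get(type_code.get("tc"))
        if name is not None:  # unmapped codes are dropped in this sketch
            ET.SubElement(customer, name).text = phone.findtext("DialNumber")
    return customer

# Invented sample data in the shape of the industry-standard example.
SAMPLE = """<Party>
  <Person>
    <PersonTypeCode tc="1">Customer</PersonTypeCode>
    <Phone><PhoneTypeCode tc="1">Home</PhoneTypeCode><DialNumber>1234</DialNumber></Phone>
    <Phone><PhoneTypeCode tc="12">Fax</PhoneTypeCode><DialNumber>5678</DialNumber></Phone>
  </Person>
</Party>"""

flat = party_to_customer(SAMPLE)
print(ET.tostring(flat, encoding="unicode"))
```

The form tool then binds to the flat document, and a mirror-image transform writes the values back into the standard structure on the way out.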

Either way, Sean’s blog is in the ballpark.

David A. Chappell

In recent articles and presentations I have been postulating that a concept called “next generation Grid Enabled SOA”, a.k.a. “SOA Grid” and “Not your MOM’s Bus”, combines conventional SOA infrastructure technologies such as BPEL and ESB with middle tier data grid technology to provide a new level of predictable scalability and high availability for SOA based applications.

I often get asked - “How much better is it? What’s the ROI?”

Rick Jelliffe

As the anti-OOXML crowd’s technical and editorial objections evaporate, and consequently as the reasonable people increasingly see that ISO is delivering a good result for them and jump ship, the rabid anti-OOXML misinformation campaign is ramping up. The basic strategy is to say that things are so bad that no improvement is possible, and indeed that any improvement is complicity.

But it is quite possible for the different sides to engage civilly and constructively.

OOXML Forum: AM

On Friday last week UNSW CyberSpace Law and Technology Center organized a really good day-long seminar in Sydney on the technical and legal feasibility of implementing Office Open XML, to try to get people talking.

The morning was a technical meeting: I was honoured to be invited to speak first, with 30 minutes on ISO and SC34 standards and I will be putting up my slides later. Other invited speakers included Matthew Cruikshank (who was very active in New Zealand’s vote), and Colin Jackson (NZ government angle.) Also speaking in little 10-minute slots were a representative of IBM (same old material), and Lars Rasmussen from (notorious VML-users) Google Maps (he has a meeting report here but I don’t know why he has an “issue” with me; in any case, like Matthew, I think he is a goody.) MS had a 3-man contingent, and people were obviously trying to be on their best behaviour: Oliver Bell from Singapore was there and had blogged. Not many fireworks, I was tired and grumpy from travel so probably just as well. I was pleased to also see Standards Australia’s Alistair Teggart there too; he made some good clarifications. Prof David Vaile was very interested in what people had to say, and frequently asked for clarifications or expansions. Gnome Foundation’s Jeff Waugh was lively (in fact, he called the technical issues boring…clearly a big picture man) matched only by UNSW’s Pia Waugh. (Clarification: Jeff’s point was, I think, that other issues were more important than the technical/editorial issues of the Australian ballot comments.) (I am sure I have missed some who spoke!)

I’d stereotype the various opinions as people who didn’t see why there should be two standards, people who didn’t see the value of even one standard, people who saw the value in their own standards, people who didn’t see the value of their competition’s standards, and people who thought there should be many more standards (err, probably only me.) (Updated: I originally had some names against these stereotyped positions, but as they are probably not even fair stereotypes I’ve removed them, they don’t help the gag. If you think I have misrepresented you in any of my blogs, please write and I will certainly try to fix things.)

OOXML Forum: PM

The afternoon was the legal session, and very interesting. Unfortunately it only looked at the OSP and didn’t cover any standards law (relation to law of fraud, anti-trust, etc.) probably because in many countries there has been no relevant case law, and in each national jurisdiction the situation will be different. The US law is quite advanced (or, at least, explicit) here: Australia really needs some legislation to clarify the duties and rights of stakeholders in voluntary national and international standards processes. I have previously posted some material on this blog, Standards and IP, for people who are interested.

First up in the afternoon was legal background material by Ron Yu, a very likeable guy. He has made a report that is available from the CyberSpace Law Centre’s website. It was mainly a discussion paper raising various issues that people had made, rather than a definitive position on anything. Then MS’ Steve Mutkoski gave a talk on OSP, mainly focusing on the similarities and differences from the Adobe, Sun and IBM equivalents. He was one of the legal team who drew up OSP and pushed for less legalese in it (using “promise” rather than “covenant” for example.) His main thrust was that the differences between the Sun, IBM and MS licenses were only cosmetic. Steve made his points well.

As it turned out, from discussions it emerged that there was really only one bone of contention, which was the meaning of “required” in the OSP. Now this is something that I have blogged about before: see the lengthy comment (search for “Matthew:”) and also here (search “Kurt #2”).

MS’ legal department have been absolutely hopeless in helping people figure this out, and if OOXML fails at the final vote, they and Steve Ballmer have the lion’s share of the blame. Many in the open source community react strongly to the memory of MS’s FUDing on Linux patents; rather than (as I tend to do) saying “oh good, at least in the standards process they are opening up their IPR” (i.e. due to things like standardization and the OSP) the MS FUD has raised suspicion to the level where people say “since they clearly want to enforce their IPR, they cannot be genuine about OSP” (i.e. there is some trick there). That Steve Mutkoski was so unprepared to answer questions about what “required” meant shows, I think, that this issue (which has been repeatedly and constantly raised over the last year) just has completely flown over the heads of the MS legal department. This part of the QA session was a pretty disappointing performance.

The issue here is that the OSP only covers “required portions” of the spec: MS, IBM, etc. promise not to enforce their patents on those portions unless you sue them. When I looked at the OSP, I went ahead and looked at all the other licenses to see what “required portion” meant, since it was clearly some kind of legal term. IBM’s license is better, because it spells it out; MS thinks it is unnecessary to spell it out since lawyers would know; they wanted to keep the OSP to one page. But they got it wrong: people think that normal language is being used.

W3C (and OASIS) dealt with this very problem. I think the OSP should be redrafted to follow their wording (and of IETF), and use “normative” rather than “required”. That aligns the promise with the language of the standards and clears up some potential for confusion.

Laypeople look at “required portions” and decide that this must be in opposition to “optional portions”. Here is the MS wording from OSP:

To clarify, “Microsoft Necessary Claims” are those claims of Microsoft-owned or Microsoft-controlled patents that are necessary to implement only the required portions of the Covered Specification that are described in detail and not merely referenced in such Specification.

Predictably, IBM’s rep was trying to promote this confusion. Quite a lot of chutzpah considering that IBM in fact uses the same legal terminology in its covenant:

“Necessary Claims” are those patent claims that can not be avoided by any commercially reasonable, compliant implementation of the Required Portions of a Covered Specification. “Required Portions” are those portions of a specification that must be implemented to comply with such specification. If the specification prescribes discretionary extensions, Required Portions include those portions of the discretionary extensions that must be implemented to comply with such discretionary extensions.

Intel v. Via Technologies

IBM’s wording is much clearer and better, and “required portion” is indeed a common term in these licenses. However, what if Microsoft turned around and said “We didn’t define it as a required term, and now we want to charge licenses for patents”? Let’s put aside the common legal usage of required portion in licenses. Let’s also put aside the small likelihood that there could be non-junk patents in the area of document processing formats (considering the maturity of Unix Publisher Workbench, TeX and so on from the 1960s to the 1980s, not to mention ISO SGML (IS 8879:1986) and its applications since 1986 and before). And let’s put aside fraud issues, given the consistent public representations by dozens of top-level management from Microsoft.

What happens with an ambiguous license? During the session I asked if anyone knew of any case law where “Required” was discussed, having a nag in my memory. I have looked it up again and it comes up in the case Intel v. Via Technologies, 319 F.3d 1357 (Fed. Cir. 2003), which has a discussion here. In that case, the judgement was that “required” must be given the widest interpretation (to include “optional”):

Although we agree with Intel that its reading of the plain meaning of “required by” is a reasonable one, we disagree that its reading is the only reasonable one. First, the words “required by” without any clarification could mean either non-optional protocols of AGP 2.0 or electrical interfaces or protocols that are required to perform any specification “described” in AGP 2.0, including non-optional protocols for an optional specification. For example, books “required by” a school could mean books needed for (1) “required” (non-optional) classes; or (2) any class taken, including optional classes.

The word “optional” does not occur anywhere in the license agreement.

Thus, we conclude that VIA’s and Intel’s interpretations are both reasonable readings of the license agreement. The district court erred in holding that VIA’s reading of the agreement is the only reasonable one. Nevertheless, it was harmless error because, as there is ambiguity in the agreement, the district court properly granted summary judgment of noninfringement relying on contra proferentum.

When a contract is ambiguous, the principle of contra proferentum, under Delaware law, requires that the agreement be construed against the drafter who is solely responsible for its terms.

Contra proferentum has been held determinative in resolving ambiguity in a contract that, like the agreement here, is drafted by one party and offered on a “take it or leave it” basis without meaningful negotiations.

It would be interesting to know which other jurisdictions also allow contra proferentum: according to Wikipedia it also includes Europe, California and international arbitration. Here in Australia, there are multiple cases that endorse the principle in various circumstances.

If I may throw a spanner in the works, the thing that I see missing from all these licenses is that they only seem to cover the use of patents where use of the patents is unavoidable to implementing the spec. In other words, if there are multiple ways of implementing something, you have to use the way that is not covered by the patent. I think this is unacceptable, and something that MS, IBM and the others should fix. Don’t compete at this level, boys and girls, it is counter-productive: open up.

So it was a really enjoyable day, and I enjoyed catching up with Greg Stone, Matthew Cruikshank and the others during the breaks. I think Pia, David and the CyberLaw Policy Centre organizers did a really good job.

Consortium Disinfo

But outside in the world, confusion is still rampant. OASIS lawyer Andy Updegrove has never been known to say a positive thing about Microsoft nor a negative thing about IBM, and most of the time he is happy to be one link down the S-bend from Rob Weir’s mischief; however, he is really valuable when commenting on law. But it is interesting to see the level of misinformation of some of Updegrove’s readers.

A case in point. While I don’t see this issue as primarily IBM versus MS (that is one aspect but there are also open source people and industry people and governments occupying all positions: pro, neutral, con, don’t care, be fair, make it right, etc) nevertheless I think IBM’s strategy has long been that since they cannot prevent DIS 29500 being fixed and adopted, they need to shoot the messenger and blacken ISO’s name. In fact, IBM’s Bob Sutor is quite open about who IBM are really interested in, no shame in that: when asked about Open XML and ODF and ISO he replies

I think we have collectively educated and permanently changed the policies of procurement people in many organizations around the world.

Recently, IBM marketing guy Rob Weir has not had much to blog on, since according to JTC1 rules (which they are trying to get strictly enforced following their meeting in Australia last month) ballot resolution discussions are private. This rule is intended to stop hysteria and allow the participants the full range to describe options without outsiders citing proposals and options as done deals. (I.e. exactly what Rob has been doing, such as his complaint that the early issues dealt with were the trivial ones, to try to prop up the crumbling argument that there would be no changes to DIS 29500 in response to the National Body comments.) People who are interested in participating have had a year to join their national standards body’s committees and come to grips with all the issues and procedures. So Rob came up with a great spin: Microsoft is bad because, PERFIDY! they are following the rules…

Anyway, Andy’s blog on this was a real classic according to the formula. He picks up on Weir’s message of the day, and links to both Weir and MS’ Brian Jones, which is some balance. Then, in further imitation of fairness, he quotes “Pamela Jones” (Groklaw) but her article is also just a riff from Weir’s tune. Pamela manages to find some minor wording issue based on some material on the SC34 website (the FAQ is really clear on the issue, I thought): Pamela, because most people in ISO committees do not have English as their first language, it is not a good idea to try to find the worst meaning in phrasing: material on a general webpage is just general material and you are wasting people’s time by trying to read too much into it. Then, based on the spurious idea of the vote being taken at the meeting, off she goes with imagined opportunities for conspiracies and so on. Again, it is the basic strategy: FUD. Take the most lurid interpretation possible, try to discredit the process.

In politics, this kind of spin is called “inoculation”. What you do is try to get ahead of your opponent by coming in early with responses to them. For example, “Six lies the Democrats will tell you” or whatever. The intent is that then the public will hear any statements through the framework established by you. (Indeed, IBM’s Bob Sutor even talked of having a competition for this.)

Andy’s blog has a similar comical moment: after giving some of the smaller details of an ECMA press release, where ECMA rolls over on some of the most contentious issues which previously the Echo Chamber had told us would never be changed, Updegrove says

Despite the meagerness of the sampling of recommendations described in the press release, it is possible to get an idea of the degree to which Ecma and Microsoft are willing to go in order to secure a final, favorable vote.

What, Andy: no “How great, we got what we asked for!”? Instead a complaint about the press release being meagre (oh no, not enough material for a convincing spin) and then the corker that this shows “the lengths” they will go to! PERFIDY! They are refusing to act unreasonably! PERFIDY! They are giving in to user demands! So despite this utterly clear evidence that the process looks like it is working (ECMA proposes a standard, national bodies consider it and make comments, ECMA and the national bodies work out constructive solutions to them, the way that every other fast-tracked ISO standard proceeds), instead it is supposed to be bad news. It is hard for me to think why it isn’t kind of pathetic.

Anyway, the thing that grabbed my attention about this blog item was that there had been four small discussion threads. And all of them were based on wrong information.

  • The first makes a statement about maintenance, but the maintenance regime has not been decided yet. (See the SC34 Kyoto minutes 8.1.)
  • The second is some FUD on copyright. Yet the ECMA copyright is very clear.
  • The third claims that ISO requires two implementations. (Actually, now that Apple has about 10 different independent implementations of parts of OOXML released or in the works, claims about the impossibility of implementation are looking increasingly implausible.)
  • The fourth is that OOXML could not have gone through the PAS process, as ODF did. (I made a reply giving the actual JTC1 directives on the issue: Appendix M) This last claim is often based on the idea that OOXML is a really rotten spec from the technical writing POV: ODF is contrasted. However, the fact is that all specs when looked at in detail have a large number of things to improve. Look at the Japanese defect report for ODF which has finally surfaced: 98 problems ranging from trivial editorial (‘An’ not ‘A’) to the incorrect (#68) to the incomplete (#75) to the inconsistent (#88). The thing to do with errors is fix them, not augur calamity or incompetence, or score cheap points from them.

A world of confusion. The emperor’s new clothes are tremendously well-ventilated.

Kurt Cagle

The XML 2007 Conference has come and gone, with, as usual, a number of thought-provoking talks and controversies. During the evening of the first day, there was a special XForms Evening, with a number of the industry gurus in that space providing very good examples of why XForms is a compelling technology and here to stay.

In the final keynote session, though, Elliotte Rusty Harold sounded a somewhat more alarming note, indicating that while XForms does have a huge potential, there are no killer apps out there for it, and without significant support from the various players in this space it will be dead within the year.

M. David Peterson

Changeset 4436

Timestamp:
12/14/07 12:13:21 (2 hours ago)

Author:
xmlhacker

Message:
the cat is out of the bag ;-)

… and that cat’s got some *TEETH* …

M. David Peterson

I’ve known Chime and Uche for a while now. We all have. In fact, if it wasn’t for the Ogbuji family I doubt XML would, or even could, have been much more than a passing fad. Fortunately, that’s not how things played out.

Every generation has their revolutions, and each revolution has their revolutionaries. Our generation has Chime and Uche, and to be quite honest, if that’s all our generation were to ever have, it would be more than enough. Our generation is one of the lucky ones. Sometimes that’s just the way things work. And that’s certainly how it worked out for us this go round.

M. David Peterson

Two recent entries, one in the form of a blog entry from Dare Obasanjo, the other in the form of a post to the FeedSync list from Steven Lees, both in the last 24 hours,

ADO.NET Data Services (Astoria) Transforms SQL Server into an Atom Store

This is sick. With Astoria I can expose my relational database or even just a local XML file using a RESTful interface that utilizes the Atom Publishing Protocol or JSON. I am somewhat amused that one of the options is placing a RESTful interface over a SOAP Web Service. My, how times have changed…

It is pretty cool that Microsoft is the first major database vendor to bring the dream of the Atom store to fruition. I also like that one of the side effects of this is that there is now an AtomPub client library for .NET Framework.

Of course, I’m sure there will be many who will contend that GData, and therefore Google, were the first to bring the “dream” of an Atom store to fruition. (My bad: Dare stated “first major database vendor”, which as far as I know is a true and fair statement. That said, I’m leaving in my props to Joe Gregorio cuz’ he deserves both the credit and attention, regardless of the fact that he isn’t a major DB vendor either.) And to be completely honest, Joe Gregorio not only brought forward the original dream of the Atom store, but was the originating dreamer that brought AtomPub into existence, quietly building both the client and server pieces of this dream while at the same time acting as the (lead?) editor on a two man *ROCKSTAR* team, backed by some of the brightest minds in the industry to ensure that the final result was what it needed to be. But let’s try and set aside differences in perspective for now and take a look at what Steven Lees has to say,

Rick Jelliffe

Here is a quick summary of my impressions of the Kyoto meeting.

Japan is so cheap to eat and stay in hotels! As long as you avoid touristic places.

Kyoto is so beautiful.

A real changing of the guard at SC34, with a new Secretariat Manager, Convenor, and changes to the heads of WG1 and WG3. These are some of the people I really enjoyed seeing at meetings, and often quite eccentric or wonderful, so I hope they will still participate.

We now have a full-time professional Secretariat Manager. It seems she is cracking the whip to get things tightened up. For example, under the new rules I will have to be a delegate from Australia again, not independent.

DSDL is ticking along OK. We worked through some of the very last issues for some of the specs. After the horror year of 2007, we all hope things will settle down.

We are going to have a new version of Schematron. This will include the various features requested over the last few years, notably a better import mechanism, XSLT2 support, and so on. I am pretty sure I want to fold in code for ISO DSRL, ISO CRDL and ISO DTLL to the skeleton implementation, which will give a lot more capabilities. We are looking at standardizing a streaming version of Schematron as Part 6 of DSDL.

I had been tasked with trying to contact PKWARE about a possible ISO standard for ZIP. They did not reply to me, but the OOXML editor said he was in contact with them, so I expect there will be some progress there soon. That is one advantage of having the big boys at the table.

One feature of this set of meetings is the increasingly strong desire by the chairmen to prevent any wandering off into off-topic matters. This is of course because of the impending BRM, which loomed over many people’s minds (but not mine!) and which looks like being a very disciplined affair indeed.

It was great to see many new nations participate: we had two delegations from Africa, a delegation from India, more Europeans. Very often the delegations included a professional from the standards body, rather than a technical person, so I think they were familiarizing themselves with the lay of the land preparatory to the BRM. There are already more delegates registered for the BRM than can fit in the theatre provided (120) so it looks like the larger NBs will have to trim out excessive delegates. But lots of smart people will be looking at lots of issues. ECMA TC45 had their meeting the week before SC34, and it seems they have been ploughing through the issues.

M. David Peterson

or Girl. Pick whichever most closely resembles your gender and then apply this selection to the following…

Mike Linksvayer? The major political issue of today?

Music distribution companies are only one of the forces for control and censorship. The long term issue is bigger than whether private ownership of 21st-century printing presses should be permitted. The issue is whether individuals of the later 21st-century will have self-ownership.

M. David Peterson

A bit of a Bungee Labs theme as of late, and for good reason: I have about this >< much time at the moment to do not a whole lot more than eat, sleep, code, repeat, and while that doesn’t answer why I’ve bin on a Bungee Binger, as per the title of this post, any way I can find to save both time, money, and the stress of worrying about whether or not I’m going to make any given deadline is something I’m going to be paying attention to. As such, my attention has been directed towards any aspect of my developer toolbag which holds potential of providing a faster, more efficient, and more productive way to get from Join Point A to Point Cut B, and in this regard, I have some advice,

When your concerns are founded upon finding every possible way to weave into any given paragraph the key phrases and terminology used in Aspect Oriented Programming, chances are quite good you should consider taking a *NICE LONG* vacation as far away from the keyboard and computer screen as you can possibly get. And it’s for this very reason I am finding the latest offering from Bungee Labs oh so very appealing to these liquid crystallized eye balls of mine,

Rick Jelliffe

I am glad to see that Adobe’s PDF 1.7 has been accepted as an ISO standard, IS 32000:2008. It still needs to have a few hundred comments resolved and folded back into the final text, but the initial ballot was a success and I suppose early next year the spec will go online at ISO’s free site. It has gone through very fast, and I congratulate all concerned.

For my opinion on why an ISO standard for PDF is a good thing, see yesterday’s blog All interfaces by market dominators should be QA-ed, ZRAND standards!

There have already been smaller subsets of PDF available: PDF/A for archiving and PDF/X for exchange, both subsets of PDF 1.4. (The links are to pages that are really good examples for what governments and guidance organizations need to provide, to help people select between multiple standards.)

I am sure ISO PDF will help reduce that apoplexy that some people are being encouraged to have concerning OOXML, because it shows that there can be multiple standards (even for the same thing: three ISO standards for PDF alone, and counting!) as long as they don’t contradict (which has a very strict meaning in ISO usage: standard A cannot say X is a Z while standard B says X is not a Z). And it shows that proprietary technologies can be standardized. And it shows that there is a difference between (good) openness in getting good documentation and (counter-productive) openness in arbitrarily changing a standard on ideological/aesthetic lines so that it no longer reflects the existing, deployed technology. And it shows that standardization is a positive step forward for the community to manage market-dominating technologies. (I mean standardization in the sense of being published as an ISO standard, which does not imply being adopted by any nation as a required format by regulation.)

They have 205 comments. It would be interesting to see how this compares to the size of the spec, and compare it to OOXML. (I was pleased to see that some ISO PDF people measure the size of their document in total surface area of printed page frames rather than just raw page count: this is a little bit more sophisticated than dumb page count, but still only an unsound indicator for serious comparisons of standard size or complexity.) I couldn’t find a draft fast, but I read that in ISO format it takes fewer pages than the Adobe format: but taking the Adobe 1.7 of 1310 pages as a rough guide, that gives an issue rate of 1 issue per 6.4 pages, compared to the OOXML rate of about 1 issue per 8 pages (assuming about 750 unique issues for OOXML). The numbers are not precise, but they are about the same! The only difference is that the OOXML changes tend to be broader (conformance, organization) and more disruptive (since people expect XML to be readable in the most general sense, while they don’t expect this of PDF.)
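The back-of-envelope comparison above can be reproduced in a few lines. This is only a sketch of the arithmetic: the 1310 pages and 205 comments for PDF and the 750 issues for OOXML are the figures quoted above, while the 6000-page size for OOXML is an assumption (it is the rough draft size usually quoted, and it is what makes the "1 per 8 pages" figure come out).

```python
def pages_per_issue(pages: int, issues: int) -> float:
    """Average number of spec pages per ballot comment (a crude size-normalized rate)."""
    return pages / issues


# PDF: Adobe 1.7 page count used as a rough stand-in for the ISO draft.
pdf_rate = pages_per_issue(pages=1310, issues=205)

# OOXML: 750 unique issues is the estimate given above;
# 6000 pages is an assumed round figure for the draft.
ooxml_rate = pages_per_issue(pages=6000, issues=750)

print(f"PDF:   1 issue per {pdf_rate:.1f} pages")
print(f"OOXML: 1 issue per {ooxml_rate:.1f} pages")
```

As noted, page count is an unsound measure of complexity, so the two rates are only good for saying "same ballpark".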

One of the most interesting documents about how Adobe/AIIM created the draft ahead of standardization is here. It is strikingly similar to how the OOXML draft was created, but note that the national body complaints about OOXML include several concerning the use of “shall” and “should” (I raised this issue with my national body, and it was included in the Australian comments.) Conformance language is important: a standard is not really a document that is a specification suitable for a programmer to implement directly, but it is something that may be used in contracts (or called up by regulations) so it needs to be clear about what it requires and what it doesn’t require (clarity is more essential than completeness, if you know what I mean.)

ISO 32000 is based on the PDF 1.7 spec, available here. The document ISO 32000 - Summary of Changes describes how the format was made.

The 205 ballot comments and their resolutions will not be publicly available, I expect, according to the usual ISO requirements. The mechanism for participation in standards development is to seriously join in, not criticize from armchairs: openness does not mean a free-for-all. People who suggest that somehow we can have Slashdotters directing standards are not realistic.

It will be interesting to see which other market dominators sniff the wind. Standardization through ISO of market-dominating technologies is good for everyone. The technology is already entrenched, so it does not entrench things further, but it provides a better basis for substitution (good for user choice and competitors) and interoperability (good for user choice and the dominator company and peripheral developers): everyone wins. They need to do this voluntarily before regulators use closed standards as evidence in anti-trust proceedings.

I don’t see the people complaining on OOXML about proprietary technologies being standardized, the ISO fast-tracking procedure, the use of vendor consortia to largely rubber-stamp a pre-existing text, the kinds of error-rates, and the presence of actual users, vendors and stakeholders’ representatives on committees, complaining about ISO PDF. But all the things are present there. What is the difference? (Flamers: don’t sidestep by mentioning other supposed flaws in DIS 29500, that is not what I was asking, thanks.)

M. David Peterson

Just noticed that the gang over @ Bungee Labs updated their site design, and couldn’t help but be inspired by the following graphic that greeted me upon my arrival,



Now *THAT’S* how to effectively tell your story in less words than exist in one of my average sentences. Nicely done, Bungee!

Keith Fahlgren

Here’s my notes from the last day of XML Conference 2007. David has collected some of the blogging about the conference.

Keith Fahlgren

This is the continuation of blogging from XML Conference 2007. See yesterday’s post for more. There are, of course, a lot of folks blogging about the conference. Here’s my colleague Andy’s take. Elliotte Rusty Harold is providing some wonderful reading as well (and apparently did a smashing job at the XForms talk last night). For a visual sense of the conference, check out David Megginson’s photos on Flickr.

Rick Jelliffe

The trouble with standards is that there are not enough of them. There is a strong public interest in having the interface technologies of market dominators (which would include near monopolists and long-term super-profit-takers) out in the open, unencumbered, zero royalty, non-discriminatory licensed, and with the documentation QAed by an independent group which may include experts and rivals and stakeholders.

And there is indeed a great system set up for this: ISO, the International Organization for Standardization.

First, I should clear up a misunderstanding that many people fall into: a technical standard (from ISO) is not a regulation. According to ISO, it is always the adopter’s responsibility to look at the available specifications and see which ones are useful, and in which contexts.

Second, for ISO standards, multiple standards for the same area are the norm, not the exception. In fact: look at the dozens of standards for graphics formats, the multiple standards for programming languages, the multiple standards for operating systems, the multiple standards for schema languages and so on. Putting a proprietary standard through the standards mill does not prevent other rival technologies becoming standards, nor does it necessarily obsolete existing standards. Standards are not a race.

Next, I would like to mention that ISO has a very wide range of publication types, from International Standard to Technical Report to Publicly Available Specification, and it certainly may be the case that some technologies (such as obsolescent or fast-changing technologies) would be better made available using a lesser type.

My key point is that once a company’s success in an area brings it to the point of market domination (or long-term super-profit-taker) then anti-trust regulators need to ensure that their interface technologies are open enough for the usual level-playing field concerns to be addressed. It needs to be just a cost of doing business, once you reach a certain point.

Obviously this applies very directly to Microsoft. But IBM is also a market dominator (indeed, monopolist) in the mainframe game. And Google looks similar for search; Apple with the iPod; Adobe with PDF; and so on. There may be non-American companies that it applies to as well, I suppose. (And there are technologies that achieve market dominance outside a dominant vendor: PKWARE’s ZIP for example. These need to be standardized as well.)

The current standardization effort for DIS 29500 (Office Open XML) provides a good backdrop for this. For almost two decades, independent software developers have been calling on Microsoft to open up their interface and document formats: the SAMBA developers for example. In 2004, the European Union recommended to Microsoft that it should put its document formats forward (in an XML version) for international standardization. As Microsoft has done this, first through ECMA then at ISO, it has prompted a vicious campaign against the effort, led by business rival IBM but also by stakeholders allied to some open source software substitutes for Microsoft Office. (In particular, stakeholders allied to the Sun-led Open Office application and the ODF format; note however that other Open Source stakeholders, notably those allied to Novell, have welcomed the standardization effort.)

Some of the objections to DIS 29500 are in fact objections to the idea of a Microsoft-derived technology becoming a standard. In fact, several standards have come that way. The recent ISO Open Font standard, for example, is based on Microsoft and Apple’s OpenType fonts (TrueType, etc).

Other objections to DIS 29500 come from the other direction: DIS 29500 is flawed not because of what is in it, but because of what is outside its scope: media formats, macro languages, printer driver configurations, and so on. The most extreme version of this argument is to find a fault with DIS 29500 that it does not describe the (50 or so different) earlier binary formats for Office and its component products. (When this complaint is made in the same breath as saying that the 6000 page draft is too large, it does smack a little of insincerity.)

I usually respond by pointing out that a standard needs to be scoped, that a standard is a work in progress, and that it is impossible for ISO committees and National Bodies to have enough volunteers to do this work (both because experts are scarce and valuable, and because idealism is not stimulated by contact with market dominators.)

But I should have been articulating this: yes, there should be documentation for the binary formats, and indeed any format, API or protocol used for interfacing computer systems which gain market dominance. There does need to be more rather than less, and the costs are ultimately costs of decent QA and education, once the spurious controversy that has unsuccessfully attempted to derail DIS29500’s progress is over. (Of course, the lack of a tangentially related standard is no reason to reject a standard: we need to start from somewhere.)

Now many businesses are naturally coming to opening up their technologies in a similar way: look at Sun, for example, with its Open Solaris, Open SPARC, and its steps towards opening up Java. Public policy makers need to foster a procurement and regulatory environment where the winds of change for openness also blow refreshingly on market dominators.

The European Union was completely correct in asking for MS to adopt standard notations (XML) for the Office documents, and to standardize the schemas (through DIS29500), just as they were right to ask OASIS to standardize ODF (IS 26300). However, I hope this is the start of a larger movement for more: all document formats, all APIs, all protocols which have significant market domination need to be made available through one of the ISO standardization processes. All these interfaces are objects of legitimate interest for public policy for reasons of information ownership, level-playing field access, anti-trust, and even just from a procurement angle to ensure that systems are adequately documented and have had adequate attention paid to internationalization, harmonization, accessibility, and conformance testing: basic QA. And where the market dominating technology belongs to a market dominated by a single player (or cartel), then that player needs to bear the cost of the standardization effort, as a normal cost of business.

Now, my proposal here is not a total package: issues of conformance with external standards still need to be in place. After the statement “Here’s what we do” comes the natural question “Is it good enough?”, and that belongs in a separate blog.

Keith Fahlgren

Just like last year, I’ll be blogging from XML Conference 2007. Rather than imposing some editorial structure, this’ll simply be a serialization of the things I hear from various speakers in various sessions.

Kurt Cagle

A few years ago, I was briefly involved with a publishing company that was interested in packaging and producing eBooks. The challenges that we faced in trying to go from client submissions in Word, the occasional PDF and even straight text files proved to be daunting, largely because these works would in general place such a burden on editors that it was not cost-effective enough to be a viable model. Most people working with Word have only a limited understanding of, and therefore limited use for, Word styles, and the notion of even more stringent structured documents was completely foreign to them.

Rick Jelliffe

I took a day off to install Sabayon Linux 3.4. It has taken me a week to get it right, with many false steps. My initial verdict: simple things are simple and work out-of-the-box, hard things are hard and require a separate internet connection to do research. The unfortunate thing is that you have no idea what is simple and what is hard until after the event…this has all the hallmarks of being a distribution made by gamers with superfast internet connections and superfast machines, and this caused me a lot of grief and wasted time.

Pros:

  • very nice desktop (KDE with modern 3D effects if your card supports it);
  • very good out-of-the-box capabilities especially for support for different media types in Firefox and lots of drivers; this was my first experience of changing the video card on a working Linux system and having the thing work correctly after.
  • very good for people who want quite a large full featured distribution and have no internet access (it takes about 10 gig installed from a DVD!);
  • very good for old UNIXy types like me who want su and want to recompile the kernel;
  • works well with modest hardware: my PC is 8 years old, for example; during the week I upgraded my RAM to 512 Meg, and I typically don’t even get into swapping when running Eclipse, Firefox and Thunderbird.

Cons:

  • It didn’t recognize my ATI card correctly, so I had to install in text mode and fix things up by hand. So much for out-of-the-box.
  • It didn’t recognize my LG screen with 1440×900 resolution.
  • When I replaced the ATI with a new NVIDIA card, it recognized this, but the default nv driver did not provide 3D. (I had played with them on another machine: Beryl/Compiz are pretty attractive.) So I am using just the 2D desktop.
  • I downloaded the driver that NVIDIA provides, and found that there were three different web pages with different methods for installing it. I wish people would bother to write which distribution of Sabayon (or Gentoo even) they were writing about. Anyway, I couldn’t get any of these methods to work. One involved recompiling the kernel, which then wouldn’t run. I ran a repair install from the DVD, and (next day) I had a running Linux again, but I’ve ditched the nvidia driver for the generic nv driver. 2D will have to do.
  • Poor desktop admin tools compared to other distributions to help you connect into local LANs not using dynamic addresses; for example, I could not find anywhere in a graphical tool to set the DNS server location.
  • Attempting to connect up to a printer was a disaster. Its nice automatic search tool located our Ricoh 2035, and let me select the drivers for it, but then told me that it did not in fact have these drivers.
  • Sabayon uses a package manager called emerge. However, it is not an RPM-alike: it works very differently. It downloads patches, then recompiles the application. I made the mistake of doing this for Thunderbird, and it took over 5 hours (8 hours? 24 hours? who knows, I was long asleep).

Monday

I’ve previously used Mandrake Linux, then switched to Mint Linux four months ago: Mint has a lot going for it, but I never got around to configuring it happily to what I wanted, and the Upgrade Button Debacle was a bad start.

So a DVD of Sabayon was available in a newsagent, and looked interesting. I don’t think I’ve used a Gentoo-based Linux before. Sabayon is big: it complained that my 12 gig disk might not be big enough: very different to Mint’s dainty footprint. Sabayon comes with a lot of drivers built-in: one of the attractions being that there will be, I hope, less downloading and configuration.

I’d tried the DVD-boot on my laptop so I knew the DVD worked. On the laptop, the Beryl/Compiz windowing worked: lots of effects and translucency and vibrating windows. Fun but useless. Sabayon is very much aimed at gamers, I think, but that was a plus for me: I was tired of the fact that in Mint there were several media types that would not run in Firefox, and I am too busy to track down and download things.

Booting from the DVD on my desktop machine, the first thing that became clear was that it incorrectly detected my video card. I have a decade old ATI Rage Turbo Pro, which has worked fine on other Linuxes until now. The web gave the answer immediately: edit /etc/X11/xorg.conf to use the ati driver.

Next, my screen was not correct. Fair enough: it is a LG wide screen that prefers 1440×900. Again, the web got the answer very fast. Type gtf 1440 900 60 to give the correct modeline entry for /etc/X11/xorg.conf (and make sure there are no other screen resolutions at the same depth that are larger in either horizontal or vertical axes.) Great.
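For anyone repeating the gtf step, here is a minimal sketch of pulling the Modeline out of gtf's output and wrapping it in a Monitor section ready to paste into /etc/X11/xorg.conf. The helper names (run_gtf, extract_modeline, monitor_section) and the "LG-wide" identifier are made up for illustration, and the sketch assumes gtf prints a line beginning with "Modeline"; check the output on your own machine before trusting it.

```python
import subprocess


def run_gtf(width: int, height: int, refresh: int) -> str:
    """Run `gtf <width> <height> <refresh>` and return its text output (needs gtf installed)."""
    result = subprocess.run(
        ["gtf", str(width), str(height), str(refresh)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def extract_modeline(gtf_output: str) -> str:
    """Return the first line of the output that starts with 'Modeline', stripped."""
    for line in gtf_output.splitlines():
        if line.strip().lower().startswith("modeline"):
            return line.strip()
    raise ValueError("no Modeline line found in gtf output")


def monitor_section(identifier: str, modeline: str) -> str:
    """Format a Monitor section for xorg.conf containing the given modeline."""
    return (
        'Section "Monitor"\n'
        f'    Identifier "{identifier}"\n'
        f"    {modeline}\n"
        "EndSection\n"
    )


# Typical use (not executed here, since it needs gtf on the PATH):
#   print(monitor_section("LG-wide", extract_modeline(run_gtf(1440, 900, 60))))
```

The mode name inside the Modeline then still has to be listed in the Modes line of the matching Screen section, which is the part about removing larger resolutions mentioned above.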

Now that I knew the screen would be OK, I installed the OS and, after installing, edited the /etc/X11/xorg.conf to be correct. I configured the networking and started to play with Firefox, which comes as part of the distro. All up, this had taken 5 hours, but only 2 hours that required my attention: the install from the DVD is a long period.

But oh dear, Firefox was clunky and the graphics stuck. Hmmm, probably time for a new card. Go home and eat up the leftover 60/65 eggs and caviar and rocket from Monday’s dinner. I decided to install a new graphics card, which daunted me a bit, because I have never liked altering the hardware of linux boxes: on old UNIX systems it was always a breeze: just recompile the kernel. But I hoped the Sabayon Out-Of-The-Box approach would make th