Reviews Archives

Rick Jelliffe


I’ve just caught up with this document from W3C which fills in a big gap in English-language technical material. Japanese typesetting technology has been very influential in the other Ideographic countries, and they share many commonalities (e.g. Japanese ruby text and Taiwanese bopomofo.) There is a Japanese standard JIS X 4051, but it has no translation available: though parts of it, usually called the kinsoku rules, are floating around in material from vendors, particularly Adobe’s Ken Lunde and some MS material.

By and large, Chinese and Korean have different details (e.g. different characters) but the same analysis applies.

One term that the W3C draft uses but does not define is kihonhanmen; readers getting held up by this could substitute underlying grid (or text block or even constant width frame) for this.


How earth shattering could an upgrade from 0.45 to 0.46 be?

In this case there are some really neat new features. For those who are unfamiliar with the software, Inkscape is a visual svg editor. When I first tried svg I was writing the files by hand with Vim. Nothing against Vim, I still use it for coding, but it’s a lot more fun to create svg’s in Inkscape with a nice image editing interface.

I’ve been able to use it to convert bitmaps into svg’s for a while now, but with the 0.46 release we have a whole new dimension with two new tools. There’s a deformation tool you can use to mush your drawing like a squishy toy.

It’s still beta software: I managed to crash it when I tried to run a raster effect on a bitmap. There’s a whole new set of raster effects, along with a new 3D cube object.

Inkscape doesn’t do animation, but for developing svg graphics or any kind of 2D design Inkscape makes svg fun. You can view the results of thirty minutes of playing with the new 3D functionality on my picasa page.

Rick Jelliffe


The Java Community Process is the mechanism Sun set up to develop and evolve Java “in Internet time”. It brings together “a cross-section of both major stakeholders and other members of the Java community”. A group of experts make the initial draft, then “Consensus around the form and content of the draft is then built using an iterative review process that allows an ever-widening audience to review and comment on the document.”

The result is a specification, a reference (proof of concept) implementation, and a technology compatibility kit (tests).

One specification I have been interested in for a while is JSR 296, the Swing Application Framework. The JSR (Java Specification Request) was approved in May 2006. There is an implementation at Java.net.

However, I cannot figure how to find the spec. Looking at the JCP site, there is everything about the spec, but no actual link to it. Looking at the implementation site, again no actual link to the spec. This strikes me as an entirely odd way to do business. What are they trying to hide? :-) Whatever it is, they are doing an excellent job of making sure that no-one finds it.

Looking at the site, it seems JSR 296 is marked “in-progress”. That means, I suppose, that it is still a committee draft that has not been released. After 18 months? So much for Internet time!

It seems that in order to see the draft, I will have to sign up to be a JCP member. For an individual member, it is $0 which is nice, but I have to send a fax to the other side of the world and wait for them to fax back a password. Or I can fax and send hardcopy by courier.

But even then, I don’t actually know that the JSR 296 draft is available for community review. The status is given as “In progress” but there is no mention of this status on the JCP description page. Presumably the draft has been in the writing for the last 22 months. I presume a draft exists, because there is software that claims to be an implementation.

What is interesting is that this is the opposite of the ISO process. At ISO using the normal rules, it is the early drafts (working drafts, committee drafts) that are given the most exposure and can be floated around openly, and only the very final draft standards that are supposed to be controlled (to reduce interoperability problems where people write systems according to different drafts rather than the final standard, and for the standards that are published commercially by ISO and standards organizations for cost recovery.)

While I am generally in favour of committee room secrecy, to prevent intimidation and silly marketing point-scoring and to disenfranchise armchair experts, and while I can understand that drafts can change substantially so you don’t want to have old drafts floating around, openness is better. But after 22 months, and after there is an implementation, to have no actual draft casually available is not “Internet time”, is it?

M. David Peterson


Placeholder for ongoing notes from the Microsoft Technology Summit…

Rick Jelliffe


I have been pretty disappointed in the new operating system distros I have been trying out recently. In the last three to six months there has been:

  • A horrible install of a new Mac where the Expose feature caused windows to run away when I tried to click on controls near the edges of windows. It was like some kind of demented joke or game. (The user, who was previously a dedicated PC user, now loves the Mac and thinks it is much simpler.)
  • Today I tried twice to install the new service pack for MS Vista, only to have the install fail with no useful message.
  • An attempt to install a mainstream Linux on my new PC failed when it could not detect the keyboard. I had to get the new box because another install of a newer Linux from another distro was disastrous for performance on my quite old box.
  • I got too bored to continue with another mainstream Linux install, where the DVD instructed me to first burn the image to a bootable CD.

So instead I have installed a recent Solaris Developer, from the DVD of some Linux magazine in the newsagent.

This is one of the easiest installs I have had. (I could only install onto a partition on the main disk, install onto a partition on the secondary disk failed with a bogus message about user accounts. No biggie.)

The system boots up, SAMBA works fine and detects most things I want to detect. (It is interesting that it only detected one printer on our network, however Vista only detects the other one, so that is not so bad.) It has Firefox and Thunderbird, which are what I’d use anywhere, and StarOffice, which is good enough for now; I cannot really use it or OpenOffice for making presentations until Impress gets tables in v3.0. It comes with Java installed, and Netbeans, though I’ll be downloading Eclipse for compatibility with the workgroup here.

The desktop is a nice GNOME and really uncluttered and to the point.

Best of all, it feels like UNIX. Not a half-assed wannabe, or a messy child’s toyroom, the way some Linux distros seem to be. But lots of GNU goodness. I still have to see how it copes with some issues like updates (which was the only real flaw I found in Mandrake Linux, which I was happy with for a few years.)

So I really like Solaris. It seems to suit what I want and expect better than any other OS distro I have come across yet.

But it has one big problem: the screen graphics are super ugly. In fact, so repellent as to make it unusable. I have a 1440×900 LCD monitor and this is not one of the built-in types supported. No problem, I thought, I’ll just change the appropriate xorg.conf (or whatever is the equivalent) file. But I cannot see how to do it: it looks like it is hardcoded or something. So I have some other resolution, with a half inch dangling above the screen and unreachable. And the fonts are ugly and thick: even when I turn on anti-aliasing and play with the LCD settings it makes little real difference. Unless some kind reader can make a good suggestion, it just doesn’t compare to what I have been used to under Windows, Mac or even Linuxes.

I really hope I am doing something wrong, because apart from that Solaris really seems to fit the bill for me. Maybe I have to resurrect the old CRT monitor.

[Further Adventures] I tried to install a different card, only to have a hardware problem, so I switched back to the original new card. Oops, now the thing doesn’t boot. Checking though, for some reason the BIOS had switched around which of the two hard disks to boot from. I don’t understand how this could have happened. In fact, I don’t understand why it can happen either, because I thought both hard disks would be checked for booting in any case. But swapping the order of booting from disks fixes the boot problem, and I am online again.

I looked through the X windows logs, and sure enough the VESA driver only has a limited number of screen resolutions available, and 1440×900 is not one of them. Sigh… So to use Solaris I have to either go down in resolution to fit the monitors we have, or buy in a new monitor. I was finding the wider screen really useful for Eclipse, so I guess I will have to search for something else. All this is taking a frigging long time: I expect to live for three years on a single installation, so having to go through four or five large and problematic installs is wearing me down.

Rick Jelliffe


Patrick Durusau has a few more items on his website. Always worth a read for anyone interested in getting more than the party lines. Here is some of his latest TOC:

Rick Jelliffe


I’m writing this sitting in the sun looking at the pool, somewhere tropical, en route from the exhausting ISO/IEC JTC1 SC34 DIS29500 BRM meeting (hoping for my lost bags to appear and with every flight delayed by up to 12 hours). And not an acronym in sight here!

Apologies to readers; I took down the rest of the article, because it was proper for me to report back to Standards Australia first. This is quite reasonable, I think. But several sites copied the following from caches:

I’ll blog some more, but the BRM clearly has succeeded in its formal aim, which is to produce a better text. Every response by the editor was formally voted on. The big picture issues were given extra time for detailed discussion, and the NBs had opportunity to raise their highest priority issue, in turn. It would have been great to have had more time to deal with more of the middling issues: where we would have preferred some variant or augmentation of the Editor’s response to our issue or where we didn’t like his answer.

The context of this was that the meeting was productive and calm:

The BRM went pretty much the way I expected: grinding through the issues, politeness, assertiveness, corridor sessions, strange bedfellows, a lot of newbies who made up for it with articulateness, candour and brains. In substance, it was a typical ISO meeting: issues, votes, different personalities and cultures interacting, some people happy, some people pissed off about individual results, limited time, stimulation, mind-numbing alterations to resolutions, convivial dinners with fascinating techoes, late-night study sessions and early morning drafting gallops. But in accidents it was very odd indeed: not just the size of the meeting and the size of the draft and the sewerage farm of disinformation surrounding it…what is atypical is the large number of non-technical delegates and that a few delegates seemed surprised that their delegations would have to figure out a position on each issue by the end of the week (which could be “abstain - we have no position”.) It is not as if they hadn’t been told!

And after that quote was material emphasizing that there is a maintenance process to fix outstanding issues and new ones that get discovered:

There are a lot of those, and they will have to go to maintenance, which really is the big issue: will MS continue these baby steps to openness or will it go soggy once out of the spotlight, which is not unprecedented by other standards stakeholder? Even after the final vote (assuming an acceptance vote, as seems likely) governments will need to keep the pressure on Ecma to continue working with SC34 and to get these outstanding issues addressed ASAP; it is not the case that unaddressed issues need to disappear down a black hole, but SC34’s only power comes from having strong government and user backing to give this maintenance the steroids it needs: this not only means monstering MS to continue through maintenance, but also (for governments) to provide adequate resources: staffing, delegates, and long-term support for participation at standards meetings.

I have more details at What is in the new draft of OOXML?. Brian Jones has a fairly detailed Narrative of the ISO/IEC DIS 29500 BRM Meeting that is very factual. I recommend readers take a lot of the other material on the web about the BRM with a large grain of salt.

Rick Jelliffe


Eve Maler and Jeanne el Andaloussi’s out-of-print book Developing SGML DTDs: from Text to Model to Markup has just been put online I see. (Through the magic of Docbook!)

Even though it looks dated in its SGML examples, it really is about a methodology for analysing and designing schemas (especially for literature, i.e. “documents” rather than “data”) that is just as useful today. We might call SGML XML, and we might use “MIME type” or “data type” instead of “notation”, but the development issues this book addresses never went away. Anyone who wants to be an expert in XML schemas and document analysis needs to be aware of it, IMHO.

A good taster might be Learning to recognize semantic components.

Rick Jelliffe


Bruce Byfield has a nice article A Field Guide to Free Software Supporters. On his typology I’d be in between 4) Softcore advocate and 5) Mainstream advocate.

What struck me when reading it was whether pretty well the same categories could also describe people’s attitudes to Standards (and Open Standards, Open APIs, Open Systems)? Not a bad fit, with different names sometimes. In that category I guess I would be somewhere between 6) Hardcore (see All Interface Technologies by Market Dominators should be QA-ed, ZRAND Standards!) and 3) the participating idealist (because the standards issues I participate in are the ones involved in my day-to-day jobs in the markup/industrial publishing industry).

Rick Jelliffe


I was chuffed to see the ODF Alliance quoting this blog in their new Alliance Response to Ecma’s Proposed Disposition of Comments on OOXML. And they seem particularly interested in getting good results on the Standards Australia issues AU-09, AU-15 and AU-23, which are issues I submitted.

I guess they love me now! Though not enough to mention me by name: I am the only person quoted who is left nameless, merely “one XML expert”. Hmmm, “He who shall not be named”… Since Groklaw thinks that the mere linking to this blog with my name by colleagues foreshadows bad things, it is only prudent. I suppose it will have to be a secret love.

Since they quote me, I hope it is not too much to look at their response.*

Procedural Irregularities

In their early material various claims are made which bear looking at in more depth. They say there are many “documented irregularities”, yet when ISO JTC1 looked at them they found no substance. Looking at the list on Wikipedia where is the actual evidence of this villainy?:

  • Portugal: a fixed working group size caused late-applicants to have sour grapes. Actually, the Portuguese already had expanded the size of that working group. Not chairs. The problem as such is the regularity not the irregularity, it seems: Sun and IBM didn’t like the rules. (Note the Wikipedia entry is biased.)
  • Sweden: MS withdrew within hours a mistaken, inappropriate offer of support to 2 partners before the meetings and notified the Swedish body themselves before any votes. (Again the Wikipedia entry is biased: IIRC it was MS who reported it, not “it surfaced.”) Sweden ended up abstaining due to a procedural SNAFU: a double count of a vote in a meeting where another meeting could not be convened in time. So what do we have? A cock-up, transparency, the correct channels notified, no votes affected: no smoking gun (unless there is material that hasn’t come out.)
  • In the Netherlands, the MS delegate voted one way, other people voted another way: again, a case of regularity not irregularity. (The Wikipedia entry is biased here: why is that a substantial problem? Different national bodies have different rules depending on their bureaucratic culture and traditions apart from anything.)
  • In Switzerland, it seems discussions were limited to technical and editorial considerations. These are the only comments that can be considered by the BRM, as has been emphasized recently by Alex Brown, the BRM convenor. So the Swiss chairman had in fact a completely legitimate view, as far as I can see, of what is in-scope for ballot comments; that other NBs might put out-of-scope material in their ballot responses might make them feel good but those comments don’t go anywhere. (The Wikipedia article does not mention the scope of ballot comments to provide some balance.)
  • Malaysia voting abstain is typical when there is no consensus. Australia did the same; it is not an irregular procedure. If an NB submits their comments with the abstention, the comments get to the BRM and they become part of the mix, so no harm is done.
  • Cyprus joins late. The idea that one side is more remiss than the other in trying to stack SC34 is not evidenced by the numbers: they just came in different waves separated by a few months. Given that perhaps 2500 of the 3500 comments sent in by NBs are parroted comments from a mail-in campaign (i.e. not from a proper independent review) it would take a lot of chutzpah for the ODF Alliance to get too excited by this one.
  • Finally, in Norway MS asked its partners to participate. Again, no procedural irregularity at all.

I don’t know if pointing this out will have much effect. I think the point with the various bribery/corruption claims is that they have the necessary truthiness, so it doesn’t matter if none of them have any procedural irregularities.

5 Months?

The ODF Alliance say there were only 5 months to review, yet there was a full year before then during the Ecma process for participation (e.g. by ODF Alliance and Ecma member IBM). And the draft was submitted in December 2006 and the ballot was in September 2007: that is at least 9 months. (And then there are the five months until the BRM for further looking at how to resolve the issues and the issues of other NBs.)
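The month counts above are easy to check. Here is a minimal sketch that counts calendar months between the milestones; the helper name is my own, the day of the month is an assumption (only the months are given above), and February 2008 is taken as the month of the BRM.

```python
from datetime import date


def whole_months(start: date, end: date) -> int:
    """Count calendar months from start to end, ignoring the day of the month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Only the months come from the text; the day (1st) is an arbitrary assumption.
submitted = date(2006, 12, 1)      # draft submitted, December 2006
ballot = date(2007, 9, 1)          # ballot, September 2007
brm = date(2008, 2, 1)             # BRM, February 2008

review_months = whole_months(submitted, ballot)
to_brm_months = whole_months(ballot, brm)

print("submission to ballot:", review_months, "months")
print("ballot to BRM:", to_brm_months, "months")
```

Counting this way ignores partial months at either end, which is why the text says “at least”.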

And after that comes the maintenance process, whatever form it will take: certainly it will have a pretty high premium on interoperability with ODF and other standards.

6,045 Pages

I have previously dealt with why raw page count is not a very fertile metric. There is so much duplication, so much whitespace and so many diagrams that the effective size for review is much smaller. Furthermore, the assumption that any large standard will not be reviewed with an international and national division of labour is, in my experience and certainly in this case, incorrect.

3520 Comments

The trouble with this number is that people then think “3520 flaws” rather than “750 individual issues and a lot of repetition”. Too many? In my blog On error rates in drafts of standards I have a good quote from Jim Melton, the editor of SQL, who has commented on his standards frequently getting thousands of comments. For a large standard, a good number of comments is an indication of real review, and says absolutely nothing good or bad about the general quality of the standard or the technology IMHO.

Seven Dwarfs

The ODF Alliance groups its response under 7 heads:

In short, the proposal does NOT address the critical need for: a.) review time; b.) harmonization, c.) a clear name; d.) a sound standard with no (new or old) technical errors; e.) interoperability; f.) support for legacy documents; and g.) consistency of “fixes.”

Let’s have a look at each of them:

Review time

I have mentioned above that there is more review time than is often bandied about.

But the ODF Alliance argument here is that OOXML should not be standardized because of errors that were not found in DIS29500. This is a remarkably hopeful claim (perhaps a cunning plan): see falsifiability for a discussion on why it is shaky ground.

The strongest evidence would be if the (non-duplicate) flaw rates detected for DIS29500 were far in excess of the same for other standards. However, as the blog item above mentions, the numbers don’t go that way.

However, this is not to say that OOXML and ODF and PDF would not have been better submitted as Committee Drafts in the accelerated process to ISO/IEC JTC1 SC34. No-one is particularly enamored of any of the current fast-track processes.

Harmonisation

It is interesting that the ODF Alliance quotes Tim Bray that the world doesn’t need another way to express basic typesetting features. If it is so important, why didn’t ODF just adopt W3C CSS or ISO DSSSL conventions? Why did they adopt the odd automatic styles mechanism which no other standard uses? Now I think the ODF formatting conventions are fine, and automatic styles are a good idea. But there is more than one way to make an omelette, and a good solution space is good for users.

My perspective is that harmonisation (which will take multiple forms: modularity, pluralism, base sets, extensions, mappings, round-trippability, feature-matching, convergence of component vocabularies, etc, not just the simplistic common use of a common syntax) will be best achieved by continued user pressure, both on MS and the ODF side, within a forum where neither side can stymie the legitimate needs of other.

Clear name

This is actually something that I have been pushing since early last year, in discussions with other SC34 people. It is part of the general observation that many of the problems with DIS29500 are not with the technology or the technical parts but can be fixed editorially: the scoping and conformance issues are examples. My point is not that “Office Open XML” is particularly confusing or that it should not continue as a brand name (not ISO’s business!), my point is rather that it is too similar to ODF/ODA/OpenOffice to be the name of the standard. I don’t know why the standard cannot have an extra part added to its name to be more descriptive. (And indeed if the plan to split out OPC to a separate part comes off, then the name Office Open XML really applies only to the other parts, so it may not be the best collective name.)

For example, the full name of the ISO Schematron is Information technology — Document Schema Definition Languages (DSDL) — Part 3: Rule-based validation — Schematron.

But is this really a showstopper for the standard? Of course not: the brand OOXML is already out in the wild. And Alex Brown has indicated that this kind of issue might be at the bottom of the list for discussion at the BRM; it is the kind of thing where people are happy to spend days discussing, which Alex is clearly not going to allow. 120 people are not traveling from all parts of the world for a week to get the issues they have raised ignored because other people’s issues are taking a disproportionate amount of time.

Sound standard

This is where I (this blog) get quoted! The blog item was The design goals of XML.

Note the difference in approaches. My angle is “I think this is a problem, I hope it can be fixed.” Their angle is “He thinks this is a problem, therefore the whole process should be abandoned.”

I think there is a kind of bait-and-switch going on: to understand it you have to make the distinction in your mind between what a particular draft (e.g. DIS29500) says and the larger concept of what OOXML could be when fixed up (e.g. substantially the same, with the same design approaches, though different in details.) It is the difference between text and technology. Here is the ploy: first find a technical or editorial problem in the draft, then transfer this to OOXML as if it were intrinsic or necessary, then use it as evidence of the unreformability of OOXML, in which case there is no point fixing the draft since the whole thing stinks.

My POV, if anyone cares, is no different from what I wrote in 2005:

I read recently a criticism of the “Binary XML Infoset” project as polluting the stream. I believe the lesson to be learned from XML is not that “Everyone should use one format, it should be simple, it should be Unicode, it should use angle brackets” but the far more challenging “Respect-driven standards development produces really good and generally applicable results.”

Note in particular this:

when I read general, rather than technical, criticism of standards or standards bodies, I usually detect strategic sour grapes, where the organization or writer is trying to undermine a process that they cannot influence enough. XML wasn’t based on the mentality people who don’t or won’t use this are idiots but we want to add to the solution space.

All that being said, I think buried in this section is the germ of an entirely valid point: even things included for legacy reasons should be in standard notations. You have to make a more specific judgment than legacy=good (as some Ecma people are perhaps prone to) or legacy=bad (as some anti-ODF people are perhaps prone to).

For example, I have written about the integer measurement system EMU used in OOXML: this is unusual but useful and a common kind of thing to do (e.g. groff, PDF, etc). But I don’t see any reason for twips let alone half points, they are just a bunion and a carbuncle, if not vice versa. Are they showstoppers? Well, it would be really good to get gratuitous problems fixed now, rather than leaving it for maintenance. But it is a matter of best practice, not an actual error or gap.
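To see why twips and half points look redundant next to EMU, here is a minimal sketch of the unit arithmetic. It assumes the usual definitions (914,400 EMU to the inch, 72 points to the inch, a twip as a twentieth of a point); the constant and function names are my own, not anything taken from the draft.

```python
# Assumed definitions: 914,400 EMU per inch, 72 points per inch,
# a twip is 1/20 of a point, a half point is 1/2 of a point.
EMU_PER_INCH = 914_400
POINTS_PER_INCH = 72
EMU_PER_POINT = EMU_PER_INCH // POINTS_PER_INCH   # 12,700
EMU_PER_TWIP = EMU_PER_POINT // 20                # 635
EMU_PER_HALF_POINT = EMU_PER_POINT // 2           # 6,350
EMU_PER_CM = 360_000


def twips_to_emu(twips: int) -> int:
    """Convert twentieths of a point to EMU without rounding."""
    return twips * EMU_PER_TWIP


def half_points_to_emu(half_points: int) -> int:
    """Convert half points (e.g. a font size) to EMU without rounding."""
    return half_points * EMU_PER_HALF_POINT


one_inch = twips_to_emu(1440)            # 1440 twips make one inch
twelve_point = half_points_to_emu(24)    # 24 half points make 12pt

print(one_inch, twelve_point)
```

Every whole number of twips or half points maps to a whole number of EMU, so under these assumptions the single integer unit could carry both kinds of value.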

Interoperability

Interoperability is a great motherhood word. No-one is perfect.

They complain that

While the proposers “agree that it is important for the specification to support multiple types of object linking,” they suggest changing oleLink(OLE Link) to oleLink(Generic Object Connection). And, instead of referencing the specific OLE2 connection they say to use any generic ‘embedded object’.

When we look at ODF we see they have an element draw:object-ole which has a definition represents objects which only have a binary representation, almost the same thing. So the ODF Alliance want to keep the reference to OLE (and make it a normative reference, which is probably dubious but I digress). Fair enough: let’s make the spec better! But look at the use this issue is put to: the heading says “What is missing? Interoperability! Why ignore the re-use of existing standards?” but the use of existing standards is never mentioned in the text.

I suspect that the heading is a carry over from a previous draft, where the body text was changed as it was discovered that among the Editor’s Disposition of Comments are details of adding scores of references to the various standards used by OOXML (both in DIS29500 and in other proposed fixes.) But my point is that the conclusion is not supported by the evidence, and their reaction to the issues they raise is too strident and an over-reaction.

Support for legacy documents

This begins with actually quite an interesting point, and the first really new thing to consider. Should a new standard have deprecated material? Putting aside the general point that a fast-tracked standard is not a new standard but a review and rebadging of an existing external standard, the comment is that OOXML is a different case than other standards where this mechanism has been used: like C++ these standards capture a living technology in which some parts are living and others are dying, but the ODF Alliance thinks that compatibility or legacy options are only warranted when they reflect multiple previous implementations. I wonder whether the presence of compatibility options designed to handle old Word Perfect behaviours puts a spanner in the works for that argument?

From the interesting start, the material on this point rapidly descends, ultimately saying

However, from the details provided, it appears that Ecma is merely taking a subset of VML, giving it another name (DrawingML), and using it in places where VML was previously called for. What is deprecated merely re-enters through the back door.

This is quite bizarre: VML and DrawingML are in different namespaces and I have not seen anything in the Editor’s Disposition of Comments about taking subsets of VML and renaming it. I’d love to know what in particular is meant by this. DrawingML is not something new, but part of the draft (VML had almost been entirely retired, the difference is that the Editor wants to completely retire it.) In particular, there is nothing in the section they quote (Response 92) about subsetting: there is only material on the mechanics of deprecating VML, removing references to it in favour of DrawingML, and enhancing DrawingML so that it can do everything that VML did (for example, to support rich text comments); deprecating VML necessarily involves making sure that DrawingML has equivalent features, how else could it be? So the ODF Alliance comment here is completely wrong; perhaps they think they can get away with it because the Editor’s Disposition of Comments document is not generally available.

The background to all this is that France’s AFNOR in its comments asked that the standard be split up with all the core material in one part and all the deprecated functions, documented settings, VML etc in a second part. Many other NBs also asked for the standard to be split up and for OPC to be its own part. My suggestion, through Standards Australia, was to split into 9 parts for example. So ECMA’s proposal is to do both: a part for core, one for deprecated/legacy/VML material, and a part for OPC, but then to add various conformance classes for different application areas which would give the same conformance subset effect that having multiple parts would achieve. So splitting up is a straightforward and direct response to NB suggestions.

Consistency

Once the Editor’s initial Disposition of Comments document is out, then the issue of consistency rightly becomes important for reviewers. If the Editor accepts one comment with a particular fix on certain grounds, why not accept another comment with a similar fix on the same grounds? So now is exactly the time to be bringing up consistency issues. And there certainly might be inconsistent responses to different NB comments, where the NB comments are themselves incompatible.

It is the job of the BRM to work through as many of these kinds of issues as it can. The Editor can only say “Here is how I would solve this” and the BRM has to sort through the issues and contradictions. And ultimately it is the National Bodies who then decide whether the revised text of the standard passes their tests.

The ODF Alliance give two examples of horribly inconsistent responses. One is concerned with which version of schemas is normative, with the choices being suggested of either the electronic version or neither. (I hope what will happen is that the schemas will be printed as an annex in the standard, and that many of the schema fragments in the standard will be removed.) I don’t think they are very serious here: the standard will end up saying something, and that something will in all probability be whatever the BRM decides.

The other inconsistency concerns another one of the Standards Australia Issues I raised. I don’t see the contradiction here: one response concerns content-type labels, the other concerns how to locate executables. Maybe there is some deeper issue that has evaded me…I think there might be a confusion here between OOXML content types (which are expressed using MIME content type notation, and live in the [Content_Types].xml part) and relationship types (which are expressed using a URI syntax and live in the various .rels parts.)
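The distinction between the two kinds of type can be sketched in a few lines. The two XML fragments below are hand-written stand-ins for a [Content_Types].xml part and a .rels part, using the namespaces and type strings found in files written by Office 2007; treat them as illustrative assumptions rather than quotations from the draft, and the function names as my own.

```python
import xml.etree.ElementTree as ET

# Namespaces as used in Office 2007 packages (assumed here for illustration).
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

# Stand-in for [Content_Types].xml: content types use MIME notation.
CONTENT_TYPES_XML = f"""<Types xmlns="{CT_NS}">
  <Default Extension="rels"
           ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Override PartName="/word/document.xml"
            ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

# Stand-in for a .rels part: relationship types use URI syntax.
RELS_XML = f"""<Relationships xmlns="{REL_NS}">
  <Relationship Id="rId1"
                Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
                Target="word/document.xml"/>
</Relationships>"""


def content_types(xml_text: str) -> dict:
    """Map each overridden part name to its MIME-style content type."""
    root = ET.fromstring(xml_text)
    return {el.get("PartName"): el.get("ContentType")
            for el in root.findall(f"{{{CT_NS}}}Override")}


def relationship_types(xml_text: str) -> dict:
    """Map each relationship id to its URI-style relationship type."""
    root = ET.fromstring(xml_text)
    return {el.get("Id"): el.get("Type")
            for el in root.findall(f"{{{REL_NS}}}Relationship")}


for part, ctype in content_types(CONTENT_TYPES_XML).items():
    print("content type:", part, "->", ctype)
for rel_id, rtype in relationship_types(RELS_XML).items():
    print("relationship type:", rel_id, "->", rtype)
```

The content type labels what a part is; the relationship type labels the role a target plays for its source. A response about one need not contradict a response about the other.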

Again, the reason to mention all this is not to say that it is not appropriate to bring up issues like consistency in the lead up to the BRM. My problem is in using these run-of-the-mill things that can happen in any standard as evidence that we should decide to disallow the revised OOXML spec ahead of fixing it.

They write:

Can we in good faith endorse a standard that is not technically sound with conflicting recommendations on technical remedies?

But hold on, who is asking for such an endorsement? The purpose of the BRM is to fix these, so that the identified technical unsoundnesses get addressed and there are no conflicts in the editor’s instructions. Then, after these have been fixed, the National Bodies can respond by changing their ballot responses if they are satisfied.

I am sad if I may jeopardize the love of the ODF Alliance, but this document of theirs is so full of non sequiturs that I don’t see it as adding much light to the discussions. But perhaps the purpose of the document is not to join in any dialog but to try to withdraw participants from it.

[Update: I think if I make fun of poor efforts, I should also praise good efforts. After the disaster of the document above, I see the ODF Alliance has now put out another one, OOXML: Top 10 Worst Responses to the NB Comments, which is a much more respectable effort, raising reasonable issues this time, restraining itself from the dire and lazy mish-mash, and good-humoured rather than ranting, which is particularly welcome. It’s only a document format. In a previous blog I mentioned the spin technique of “inoculation” with the example of a list, but I don’t see the new ODF Alliance document as that at all, but as entirely appropriate, and the kind of thing the BRM should be discussing and that non-armchair people should be thinking about. (Of course, I do make the same proviso as with the NB comments: if you parrot a set of points provided by a campaign, you are not doing an independent review of the standard draft but you are doing a review of the pre-fab talking points! If every NB comes with its own Top 10 Worst list, that allows much more coverage and improvement than just one: otherwise when the BRM takes 10 minutes to fix these 10, there will be four days left twiddling thumbs! :-) ) So, well done ODF Alliance, I hope this is a sign of things to come.]

Kurt Cagle

Over the last couple of years, I’ve worked extensively with Firefox, and while it still has its warts (and while I believe that its days of double digit rises in adoption are probably coming to a close) overall, I’ve found that it has become, for me anyway, my de facto browser into the web and the focus of most of the web applications (and extensions) that I’ve built in the last year. For that reason alone, if nothing else, I’ve been watching closely as Firefox 3.0 approaches its final release.

The second beta version of FF3 is now out, and I have to say that overall I’m feeling quite pleased with what I’m seeing, with a few caveats. Since I do generally dig into the application daily, my focus in trying it out (and in writing this review) is less on the immediate UI and functionality changes for the typical user and more on how it’s going to affect web software development. Thus, I ask that you forgive me for not talking about the new theme (okay, though not a radical departure from the old) or other user improvements, or giving you a lot of screenshots … I want to look a little more deeply under the hood.

M. David Peterson

Just noticed that the gang over @ Bungee Labs updated their site design, and couldn’t help but be inspired by the following graphic that greeted me upon my arrival,



Now *THAT’S* how to effectively tell your story in fewer words than exist in one of my average sentences. Nicely done, Bungee!

Kurt Cagle

A few years ago, I was briefly involved with a publishing company that was interested in packaging and producing eBooks. The challenges that we faced in trying to work from client submissions in Word, the occasional PDF and even straight text files proved to be daunting, largely because these works would in general place such demands on editors that the model was not cost-effective enough to be viable. Most people working with Word have only a limited understanding of, and therefore limited use for, Word styles, and the notion of even more stringent structured documents was completely foreign to them.

Rick Jelliffe

I cannot think of a technical book that I have enjoyed more in the last decade than Yannis Haralambous’ new Fonts & Encodings from O’Reilly. It plonked on my desk this week, with a resounding bang: it has over 1,000 pages with many graphics.

The book really should be called “Fonts and their encoding” as it is not really about character sets at all, though Unicode appears throughout. It surveys the area of fonts, covering multiple platforms and systems, always wryly and clearly. Here is what I like in particular:

  • Reading this book you get the idea that you are encountering a world that would otherwise be almost closed to you: not just technical information but background and gossip. It is almost at the level of Ken Lunde’s CJKV Information Processing (perhaps the best technical book ever written for taking an inchoate mass of facts and constructing a clear and systematic survey), which is probably the highest praise I could give.
  • Haralambous’ style is delightful: Scott Horne’s translation does not attempt to lose the French accent (this is a translation of the original 2004 French Edition) but this is nothing but positive for the text. The result is a book that seems to have been written by a human not a droid. He seems to be a character like BIS’ Martin Bryan, who cannot talk for long without saying something really interesting.
  • Haralambous comes from a background of high-quality typesetting. One of the most tedious aspects of 2007 for me has been the interaction with people who know absolutely nothing about typesetting, even low quality typesetting, but who feel competent to be dogmatic on ODF and Open XML. His even-handedness and expertise are really admirable.
  • Haralambous is one of the instigators of the Omega project, which is a grafting of TeX, Unicode and OpenType (= TrueType) fonts. As such he pays decent attention to fonts from all backgrounds: the last 400 pages of the book are appendixes on bitmap fonts, TeX fonts, PostScript fonts, TrueType fonts, MetaFont, and even a little section on Bezier curves. I see the book as really timely for the next generation of platform-independent Open Source publishing applications.
  • It is interesting to see how integrated XML is to the whole book. Notably, the lengthy sections on TrueType use Just van Rossum’s TTX XML-ization. The author really seems to get XML.
  • Apart from a great 70 page section on the History of Latin Typefaces, the book includes some good material on Arabic/Indic typesetting that I had not seen before. The treatment of CJK (Chinese/Japanese/Korean) issues I didn’t care for much: but it is a big area, and I think it would be great if a new edition of Lunde’s book could be prepared: CJK processing does not involve fiddling with glyphs much, so I can understand why there would not be much treatment of it here.

I haven’t read the sections on typographic programs yet. My license to FontLab is somewhere in storage and I haven’t used it for quite a while, but from just skimming the FontLab material here it seems the book provides a lot of the information I didn’t have workable access to a decade ago. Cool!

The great thing about writing (and, hopefully, reading) a big fat survey book is that the gaps in the status quo become very evident. A decade ago, when I wrote my XML & SGML Cookbook, it became apparent that DTDs and grammars were not capable of representing many of the constraints and abstractions that document description languages needed: out of that idea eventually popped Schematron. Haralambous only briefly mentions it in this book, but his website has some papers where he describes his idea for typesetting based on textemes which comes out of his awareness of the gaps. It will be interesting to see what direction he takes there.

The only quibble I have about the book is that I would have liked to have seen more treatment of cutting edge technologies such as SIL’s work. However, the book’s strength is that it brings a modern European (in particular, a West continental European) perspective and I can understand that the line had to be drawn somewhere.

It is somewhat surprising to me to find a technical book where I think it would be more productive to have the book in my library rather than try to locate the information on the WWW. The local technical bookstores nowadays have computer sections that are full of product manuals and certification courses: finding a book that even has a sense of history and enjoyment of the subject matter is water in the desert.

I don’t know if this is an experiment by O’Reilly, to translate a book from their non-English operations, but it is really successful.

Kurt Cagle

Every so often, you come across a book that not only informs, but challenges your perceptions, leaving you seeing things in a way that you would not have before you started reading. I have a fair number of science fiction books that I’ve read over the years that left me in a major paradigm shift state after reading them (usually at about three in the morning), but it’s been rare in recent years that I’ve found a tech book that has done so. However, the book RESTful Web Services by Leonard Richardson and Sam Ruby (O’Reilly, 2007) managed to do just that.

Kurt Cagle

It was perhaps inevitable - having turned the geospatial Earth into an animated, zoomable extravaganza, Google has turned its gaze skyward. With Google Sky, the tens of thousands of Hubble-based images (as well as those of more prosaic Earth-bound telescopes) have been knitted into a seamless fabric that lets you explore the universe in myriads of ways - from zooming in on the Pinwheel nebula to charting the luminescent clouds of the Eagle hatchery.

Kurt Cagle

The Dow Jones Industrial Average (the Dow) did quite a dance today, with its peak to trough extending nearly 300 points before closing, pretty much at random, more or less where it started. I bring this up not to turn this column into an economic report about Wall Street (definitely out of bounds here, except perhaps in the discussion of Atom-based XML feeds retrieving DJ stats) but to discuss a bit about systems theory and to review a book that I think should be pretty much required reading for XML architects.

I suspect that I’ve always been something of a systems theorist, and I’ve noticed that systems theory tends to attract architects like moths to a bright light (no comment about getting burned). You can tell the systems theorists out there - they are the ones that clandestinely like to play Sim City at work, who can readily tell you what the Austrian school of economics is despite not being an economist, who were getting nervous about calving ice shelves and CO2 concentrations long before Al Gore started doing his stage show. Some of us are scientists, some are programmers, some are environmentalists or economists, but the common thread that binds us together is that we’re the ones who never stopped asking “WHY?” as kids.

Rick Jelliffe

DonationCoder.com has a very good Word Processor Review by Zaine Ridling, divided into three tiers: Major Word Processors (Open Office, Office 2007, Word Perfect), Second Tier Word Processors (AbiWord, EIOffice, etc.) and Online Word Processors (Google Docs, etc.) that is well worth reading for an idea of the capabilities of each. The final Pro and Con tables are handy.

The predictable quibble I have is that the reviewer apparently believes that application features are disconnected from save formats. So while he opens with “If ever a maxim fit, one size does not fit all applies accurately to word processors” and diligently mentions the different feature sets of the different applications, these different features never need to save any information that ODF cannot handle, it seems.

I think the best resolution is that if a document does use some features that a format cannot handle, the application should alert the user, who can then choose the appropriate format. In Office 2010, for example, a user could set ODF to be the default default, and OpenXML to be the fidelity default. I think that is one good way to reconcile the basic ODF-wasn’t-designed-for-our-feature-set issue with the we-want-ODF-as-our-default-format issue. It beats panicking “It is impossible to use ODF because it doesn’t support all these things” on the one hand (which is clearly true for many, but hopefully not for most, Office documents, presumably following one of the standard statistical patterns), or chanting “ODF gives you everything you need” on the other hand (which similarly is hopefully true for most, but certainly not all, Office documents).

It would be interesting to also include the word processors from Adobe (FrameMaker), IBM and Lotus. And it would be interesting to include validation reports where the XML-in-ZIP save formats were validated against their standard schemas, since validity is a great tool for determining whether an application is doing the right thing.

Rick Jelliffe

Bob is a really clear-thinking and enthusiastic guy, and one of the most interesting people to wine and dine with. His book Document Engineering is important for anyone who wants a better vision of where XML is leading us. I’ve just discovered the IT Conversations website, which has podcasts of various people of interest to me: Miguel de Icaza for example.

Bob’s podcast has much of interest. An idea that hadn’t registered with me before is that one of the drivers for (larger) businesses to adopt a document-engineering approach is that they need to componentize their business functions: a document doesn’t care whether it goes to Florence, Bangalore or Kinshasa. Globalization as a driver for XML: that’s a pretty strong driver.

Bob also has a blog with co-author Tim McGrath: Doc or Die.

Rick Jelliffe

People trying to figure out where they stand on the desirability of multiple overlapping standards for technologies, or who would scream if they heard the issue reduced to VHS versus Betamax one more time, might like to add the article “Why China wants its own video standard” to their reading list.

Of course, the IP and licensing issues of MPEG have long been controversial; standards that are not royalty-free are entirely dubious, especially in the modern climate. I am writing this from Delhi, India, [which (outside my window at least) has the most beautiful greens of any city I have ever seen…I had heard of Assam gardens and so on but was not prepared for how vivid things are]; but from here China’s position against technologies with royalties that only rich countries (rich manufacturers, rich consumers) can afford is not just interesting or prudent, but clearly obvious.

Rick Jelliffe

I wasn’t there, but the XTech 2007 Conference seems to have its presentations online already: fast!

Scanning through them, one made me really happy. It was Henri’s talk on the WhatWG’s HTML 5 validation efforts. Actually, “I’ve won!” flashed through my mind. It was not because the HTML5 group had started to use multiple validation languages, along the layered or progressive lines I (and the DSDL rabble) have been advocating, nor even because they were using Schematron, nor even because Henri says that Schematron (and RELAX NG) while better than XSD were not as good as they expected (thereby giving me a challenge to show how they could do it in Schematron with the correct idiom, and thereby make me appear well smart).

No, what made me happy was a little line towards the end where the issue of generating usable user messages was raised (p41). This is the most important part of Schematron, not the use of paths or assertions or phases or flags or any of the mechanics, nice though they may be. The “big idea” behind Schematron, such as it is, is that the problem of validation is just as much (indeed, more) one of communicating constraints (and therefore unmatched constraints) to users as it is about representing them to machines. Validation is not just binary, or even a set of fixed outcomes: it is about determining, locating and communicating the status of a document and its parts.

This is especially because the user experiences the document often mediated through some user interface, not as elements and attributes: so validation messages that are given in terms of the elements and attributes rather than either the information model or the user interface will just be mystifying. And especially confusing when they give messages about where the problem was found, not what caused the problem: for example when there is a missing element and the error message is in terms of “Found unexpected XXX” rather than “YYY is missing”.
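The idea can be sketched in a few lines of Python. This is not Schematron itself, just a toy in its spirit: each rule pairs a context and a test with a message written for a person, so a missing element is reported as what is missing rather than as whatever unexpected thing the parser bumped into. The `library`/`book` vocabulary, the `RULES` table and the `validate` function are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Each rule: (context path, test path relative to the context, message for a human).
# The vocabulary is hypothetical; real Schematron rules use full XPath.
RULES = [
    (".//book", "title", "Every book needs a title."),
    (".//book", "author", "Every book needs at least one author."),
]


def validate(xml_text):
    """Return the human-readable messages for every assertion that failed."""
    root = ET.fromstring(xml_text)
    messages = []
    for context, test, message in RULES:
        for node in root.findall(context):
            # The assertion fails when the expected child is absent.
            if node.find(test) is None:
                messages.append(message)
    return messages


doc = "<library><book><title>Fonts</title></book></library>"
for line in validate(doc):
    print(line)
```

The point of the design is that the message is authored alongside the constraint, in the vocabulary of the information model, instead of being generated from the grammar after the fact.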

I am a bit of a broken record on this, but I think a relentless emphasis on the human user is really important for standards: XML succeeded by providing not only simplicity but native-language markup.

Simon St. Laurent

I’m here at the Web 2.0 Expo, a computer book editor surrounded by all kinds of possibilities for web-related books, articles, PDFs - pretty much everything here is publishable, and would interest someone. At the same time, though, there’s been a consistent message here: everyone out there knows what they want better than you know what they want. So….

M. David Peterson

Last Sunday the power supply on my *MUCH* beloved DevBox finally gave up the ghost.

[Photo: Death of the DevBox (Picasa Web Albums, xmlhacker, Dead DevBox, DSC00536.JPG)]

As per the above photo, if not obvious, that’s a power supply half the size of my DevBox hanging off to the side. The machine itself is completely custom, right down to the screws that keep the sub-compact power supply and cooling system snugly fit inside. Finding a replacement is no easy task, and it was becoming more and more obvious that my last-minute hack of desperation (ripping a power supply out of a nearby tower, pulling off the side of the DevBox, and plugging it into the motherboard) was not safe and, as such, not something that I could expect to last for very much longer. Couple this with the fact that in its current state it was no longer a “portable” workstation (coupled with a flat-screen monitor and a reasonably sized ergonomic keyboard, you might be surprised at just how portable such a workstation can be) and it became all too obvious:

It was time for Timmy’s well-deserved retirement. (< Yes, his name is Timmy. Long story… Don’t ask. ;-)

Anyone who knows me, knows that I have never claimed to be a Mac FanBoy. As per the intro to the photo collage of my first Mac purchase a year ago October,

Rick Jelliffe

Geekfodder!

Rob Cameron, who is a professor at Simon Fraser University, has released u8u16 in open source beta, a really exciting library which implements an “iconv” like transcoder (i.e. it converts data from one character set and encoding to another), and which uses the SIMD instructions that modern CPUs have.

I think I was the first person to write something on this technique, certainly on the Internet, in my blog item Using C++ Intrinsic Functions for Pipelined Text Processing a couple of years ago, but only because the idea was too obvious to people involved with DSP to write about, I gather: of course you can use intrinsic functions for text processing! My code just used C++ intrinsics as an optimization on top of C++ code. But Cameron takes it to another level: his code abstracts out the features of the most common SIMD devices so that his algorithms can be arranged to work on this abstraction and compile to a wide range of target processors, and he can dispense with the code. He reports 4 to 25 times speed increases, depending on the data, which is very promising.
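For anyone unsure what a UTF-8 to UTF-16 transcoder actually has to do, here is a plain scalar sketch in Python. It is emphatically not Cameron’s code and uses no SIMD at all; it just spells out the per-byte work that a SIMD implementation would try to do on many bytes at once. The function name `utf8_to_utf16_units` is made up, and the sketch only checks lead and continuation bytes: it does not reject overlong forms or encoded surrogates, as a real transcoder must.

```python
def utf8_to_utf16_units(data):
    """Convert UTF-8 bytes to a list of UTF-16 code units (as integers)."""
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        # Classify the lead byte: how long is the sequence, and which
        # payload bits does the lead byte itself carry?
        if b < 0x80:
            cp, n = b, 1
        elif b >> 5 == 0b110:
            cp, n = b & 0x1F, 2
        elif b >> 4 == 0b1110:
            cp, n = b & 0x0F, 3
        elif b >> 3 == 0b11110:
            cp, n = b & 0x07, 4
        else:
            raise ValueError(f"invalid lead byte at offset {i}")
        if i + n > len(data):
            raise ValueError(f"truncated sequence at offset {i}")
        # Each continuation byte (10xxxxxx) contributes six more bits.
        for k in range(1, n):
            cont = data[i + k]
            if cont >> 6 != 0b10:
                raise ValueError(f"invalid continuation byte at offset {i + k}")
            cp = (cp << 6) | (cont & 0x3F)
        i += n
        # Code points beyond the BMP become a surrogate pair in UTF-16.
        if cp >= 0x10000:
            cp -= 0x10000
            units.append(0xD800 | (cp >> 10))
            units.append(0xDC00 | (cp & 0x3FF))
        else:
            units.append(cp)
    return units


sample = "a\u00e9\u20ac\U0001F600".encode("utf-8")
print([hex(u) for u in utf8_to_utf16_units(sample)])
```

Every step here is a branch on one byte, which is exactly the kind of data-dependent control flow that makes scalar transcoding slow and makes a data-parallel formulation attractive.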

I would love to see an XML parser that combines Cameron’s SIMD work with the optimizations from IBM’s XML Screamer, which seem to increase the speed of Java processing two- or three-fold. Cameron’s work is important because it gives a working abstraction that can inform decision-making on building SIMD-using capabilities into Java’s text processing.

M. David Peterson

Update: The first part of “Week 1 : The Zune Experience” (with more to follow later today) is now available @ http://dev.aol.com/blog/mdavidpeterson/2007/02/26/week-1-the-zune-experience

[Original Post]

So as I blogged about last Thursday, I received the Zune I was awarded for being one of the first 10 folks to create and publish a VHD-based instance of their rPath Linux-based project. In the 10 days since, I’ve realized a couple of things,

1) “WOW! You think maybe you could turn up the quality rating the next time you post a picture of yourself so you don’t look like a 14-year-old going through puberty?” Or is it just the angle I’m looking at it again, this time from a different monitor?

Well, regardless, my apologies if I scared you, your children, loved ones, or possibly any of your pets due to concerns over catching “Whatever the hell that is on his face! Beth, get some rubbing alcohol! John, *DON’T* touch the screen until we disinfect it!”

Yikes!

So, on to the next item on the list,

2) Zune ROCKS!!!

As mentioned at the bottom of that same linked post,

M. David Peterson

So much to talk about, so little time, but none-the-less, let’s get this party started ;)

Amplee, IronPython, ASP.NET, WSGI, AtomicXML, and Xameleon Update

[Amplee@SWiK.net]

So both Sylvain and I have been jamming away at the integration of Amplee, IronPython, ASP.NET, WSGI, AtomicXML, and Xameleon. Attempting to merge together such a cross-section of various technologies, as you can imagine, has been interesting. None-the-less, we have things working pretty