Here are my notes from the last day of XML Conference 2007. David has collected some of the blogging about the conference.
Here’s a random photo I found last night that was quite funny (though not from the conference).
Joel Amoussou: RESTful IDEAS
Why do we need the framework IDEAS?
This work grew out of aircraft manufacturers moving their content (manuals) from paper and DVDs to the web. The existing technologies were ATA iSpec 2200, CALS (military stuff), and PDF/proprietary formats. S1000D was developed to provide a more modular XML vocabulary to help move away from huge SGML files.
Here’s one use case driving the development of IDEAS: As a mechanic, I’d like to be notified when a piece of content about a plane I work on is updated. Ideally, an aggregator or (Open)search client would allow me to access content from the airframe manufacturer, the engine manufacturer, the FAA, and airline policy and procedure manuals.
The move in aircraft manufacturing toward distributed manufacturing has posed huge communication problems (see the delays of all the new Boeing and Airbus planes). If the subcontractors (clients) and the airframer (a server) communicated via AtomPub, wouldn’t they be able to communicate more easily? It’d be much cheaper than earlier, more complex systems (using the standard SOA stack).
Instead of SOA (SOAP, XML Schema, WSDL, UDDI, WS-*), move to ROA (RelaxNG, REST, Atom, AtomPub, and OpenSearch). [I’ve never heard “Resource oriented architecture”/ROA before.]
AtomPub and OpenSearch
[Examples of AtomPub implementation details surrounding S1000D. As always seems to happen, I’m pleased to see AtomPub working well for an industry I know very little about.]
So, what are the steps in implementing a nice S1000D service?
- Generate and serve Atom feeds from your content storage database (perhaps using XSLT, ROME, or Abdera).
- Turn your search engine into an OpenSearch provider (see Lucene’s web service)
- Integrate a feed reader into your publishing engine (most web browsers have this already).
- Add OpenSearch to the client’s browser
- Configure a feed aggregator (for content coming from external sources), perhaps using the PlanetTool
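The OpenSearch step in the list above mostly comes down to publishing a small description document that tells clients how to query your search engine. Here’s a minimal sketch in Python of what generating one might look like. The function name, the short name, and the example.com URL template are all hypothetical, and the talk didn’t show this code; only the element names and namespace come from the OpenSearch 1.1 description format.

```python
import xml.etree.ElementTree as ET

# Namespace of OpenSearch 1.1 description documents.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def build_opensearch_description(short_name, description, url_template):
    """Return an OpenSearch description document as a string.

    Hypothetical helper: the arguments are whatever your own search
    service needs; nothing here is specific to S1000D.
    """
    # OpenSearch 1.1 limits ShortName to 16 characters.
    if len(short_name) > 16:
        raise ValueError("ShortName must be 16 characters or fewer")
    # Sketch-level check (an assumption of this example): require the
    # template to have a slot for the user's query.
    if "{searchTerms}" not in url_template:
        raise ValueError("URL template needs a {searchTerms} placeholder")

    ET.register_namespace("", OPENSEARCH_NS)
    root = ET.Element(f"{{{OPENSEARCH_NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}Description").text = description
    # Results come back as an Atom feed, matching the rest of the stack.
    ET.SubElement(
        root,
        f"{{{OPENSEARCH_NS}}}Url",
        {"type": "application/atom+xml", "template": url_template},
    )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_opensearch_description(
        "Manual search",
        "Search maintenance manuals (hypothetical example)",
        "http://example.com/search?q={searchTerms}&format=atom",
    ))
```

Serving that document from a well-known URL and linking to it from your pages is what lets a browser offer to add the search engine.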
[Demo of how to turn an S1000D publishing module (with references to DITA) into an Atom feed using XSLT, a nice touch. If you hate XSLT, use ROME or Abdera.]
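The demo used XSLT, but the same feed-generation step can be sketched in a few lines of Python with the standard library. This is only an illustration under my own assumptions: the record fields (`id`, `title`, `updated`, `url`), the function names, and the example.com data are hypothetical stand-ins for whatever your content store returns, not anything from S1000D or the talk.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def _el(parent, name, text=None, **attrs):
    """Append a child element in the Atom namespace and return it."""
    child = ET.SubElement(parent, f"{{{ATOM_NS}}}{name}", attrs)
    child.text = text
    return child


def build_atom_feed(feed_id, title, author, modules):
    """Serialize content records as an Atom feed, newest entry first.

    `modules` is a list of dicts with hypothetical keys: id, title,
    updated (a timezone-aware datetime), and url.
    """
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    _el(feed, "id", feed_id)
    _el(feed, "title", title)
    # The feed's own timestamp is that of its most recent entry
    # (or "now" if there are no entries yet).
    latest = max(
        (m["updated"] for m in modules),
        default=datetime.now(timezone.utc),
    )
    _el(feed, "updated", latest.isoformat())
    _el(_el(feed, "author"), "name", author)

    for m in sorted(modules, key=lambda m: m["updated"], reverse=True):
        entry = _el(feed, "entry")
        _el(entry, "id", m["id"])
        _el(entry, "title", m["title"])
        _el(entry, "updated", m["updated"].isoformat())
        _el(entry, "link", rel="alternate", href=m["url"])
    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    # Made-up records standing in for rows from a content database.
    records = [
        {
            "id": "urn:example:dm-0001",
            "title": "Landing gear inspection (revised)",
            "updated": datetime(2007, 12, 5, 9, 30, tzinfo=timezone.utc),
            "url": "http://example.com/dm/0001",
        },
    ]
    print(build_atom_feed("urn:example:feed", "Manual updates",
                          "Example Airframer", records))
```

A mechanic’s feed reader subscribed to the resulting URL would then see an entry whenever a record’s `updated` value changes, which is the use case described earlier.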
Sidewinder is a recently open-sourced framework for building desktop applications. It follows the trend of (imperfect) applications running in the web browser and web-connected gadgets running on the desktop (Dashboard widgets). These web applications are interesting not because they’re slick (they’re not as slick as a comparable desktop app), but because they’re solving real problems and keeping their data “in the cloud”, providing collaboration and mobility. Those advantages totally outweigh the lack of features in the online versions. Gadgets, on the other hand, don’t really solve a problem but could. The (known) limitations of these new applications are the limits of the browser, but that simply reflects the difficulty of developing desktop applications.
Some goals of Sidewinder:
- Turn any document into a desktop app
- Give web apps access to the same features that desktop apps have
- Build “internet-facing” desktop apps faster
- Encourage reuse: a web page becomes an app
- Write everything using standards: XHTML+XForms+RDFa+SVG
[Demo of opening a page off of the web using Sidewinder, crippled by broken hotel wifi, and a couple more demos that flop without an internet connection. All of a sudden the wifi comes back up, which is wonderfully kind to the presenter. There’s an iPhone application on the desktop via the web. All of it is written using wxWidgets.]
David Megginson presents a few notes before the closing plenary. Coming conferences of interest: XTech in May in Ireland (“The web on the move”), whose CFP just opened; Balisage: The Markup Conference in August in Montreal; and XML Conference 2008, which will be in the DC area (technically VA) next December.
Jason Hunter: You’re Darn Right XML has a Future on the Web
[Jason is doing the keynote due to popular demand, apparently.] This is focused on an answer to the question posed in the opening keynote: Does XML have a future on the web? Jason’s answer? Yes. After developing JDOM and being a part of the Java community in the 90s and early 00s, he’s been knee-deep in XQuery for 5 years now. His most recent work has centered around MarkMail, which Jason thinks is one of the most XML-centric applications on the web. It’s built in XML from the very bottom to the very top (XML database, application written in XQuery, delivered in XHTML). But is this simply having a golden XQuery hammer making everything seem like an XML nail? Well, email provides an interesting complexity, and couldn’t really be stored using JSON. The content of the email is a document, but the headers are really data, and this pairing bridges the big gap in the XML community between the document-focused folks and the data folks.
Does the XML hold the important thing, or is the XML itself the important thing?
OK, this email content is mixed, so how do we model this? An object-oriented language makes it weird, as does a relational system. XML content, by nature, is inherently textual, so it demands rich text search, it is hierarchical, so structure is important, yet irregular (ow), and works most easily in an XML-centered language (*cough* XQuery).
- The letters of Dolley Madison Digital Edition is “MarkMail, 1800.”
- LearnAlberta.ca: “I’m in 5th grade and want to learn about tigers. What can you do for me?”
- The New England Journal of Medicine (needing content metadata, “press view”, with all of the pictures and slides pulled out with a little context)
- O’Reilly Labs code search (compare “cat:perl shift” in O’Reilly books versus “shift” in Google or “perl shift”). This is an example of “answers not links.”
- Congressional Quarterly, which provides history and context for what new bills change
- Oxford African American Studies Center, which repackages the huge corpus of their content for a particular niche for less money
- Elsevier’s PathCONSULT, which gives doctors the ability to search for diagnostic help
There are a lot of public emails out there in the world that have a tremendous amount of good information in them. So, how do you build an email archive service? You clearly need parts of a search engine and parts of a relational store. Actually, this half-and-half approach is the worst of all worlds. It turns out that it’s very easy to write an email to XML converter.
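To get a feel for how little code an email-to-XML converter needs, here’s a minimal sketch using Python’s standard `email` package. This is my own illustration, not MarkMail’s converter: the element names (`message`, `headers`, `header`, `body`, `para`) and the blank-line paragraph splitting are assumptions made for the example. It does show the data/document split, though: headers become attribute-keyed data, and the body becomes a sequence of text paragraphs.

```python
import xml.etree.ElementTree as ET
from email import message_from_string
from email.policy import default as default_policy


def email_to_xml(raw_message):
    """Convert a raw RFC 822-style message string into an XML element.

    Hypothetical layout: <message><headers><header name="..."/>...
    </headers><body><para>...</para>...</body></message>
    """
    msg = message_from_string(raw_message, policy=default_policy)
    root = ET.Element("message")

    # Headers are the "data" half: one element per header, keyed by name.
    headers = ET.SubElement(root, "headers")
    for name, value in msg.items():
        ET.SubElement(headers, "header", {"name": name}).text = str(value)

    # The body is the "document" half. Only the plain-text part is used
    # here; attachments and HTML parts are ignored in this sketch.
    body = ET.SubElement(root, "body")
    part = msg.get_body(preferencelist=("plain",))
    text = part.get_content() if part is not None else ""
    text = text.replace("\r\n", "\n")
    for para in text.split("\n\n"):
        if para.strip():
            ET.SubElement(body, "para").text = para.strip()
    return root


if __name__ == "__main__":
    raw = (
        "From: alice@example.com\n"
        "To: list@example.org\n"
        "Subject: Mixed content\n"
        "\n"
        "Headers are data.\n"
        "\n"
        "The body is a document.\n"
    )
    print(ET.tostring(email_to_xml(raw), encoding="unicode"))
```

A real archive would also need to handle threading, quoting, character sets, and attachments, which is where most of the actual work would go.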
[Demo of MarkMail, showing the analytics of “xml” (up over time, though xml-dev is down). “Michael Kay is the number one human sending emails about xml.” Ah, xml-dev.markmail.org. An “I hate” search: who’s negative? Simon St. Laurent says “I hate” a lot (laughter). A search on PowerPoint attachments (ext:ppt), which are also indexed and searched. Oooh, an inline view of a PowerPoint in MarkMail!]
Architecturally, it’s still MVC, though it’s all XML. XML is the Model, XQuery is the Controller, and XHTML is the View. This is a bit cooler though, because the View, when given a node, can ask the system it’s part of for the context it needs, rather than the Controller having to pass in everything the View might need to do its work. This is just an XML-oriented update of Perl CGI: XQuery as CGI (“no impedance mismatch”).
The last big thought: a Java programmer’s view of XML evolution [read from bottom to top]:
- XML as Model XQuery (MarkLogic)
- XML as Content SOAP (Axis)
- XML as Wrapper XSLT, XPath (Saxon, Cocoon)
- XML as Config XML Parsers
- XML 1.0