Java needs to start over with a modern media library to stay relevant in multimedia, and possibly in desktop applications as a whole. But what’s needed and who’s going to pay for it? This is Part III.

In Part I of this “Rebooting Java Media” blog series, I made the case that in the Web 2.0 era, the “read/write web”, it’s inevitable that media frameworks will need to be optimized for creative tasks, something that Flash has taken a huge lead at providing. I also rhetorically asked if this really needed to be Java’s fight. I said it did, for a number of reasons. First, Java’s suitability for complex and deep tasks makes it well suited for difficult problem domains like media. Second, with web apps and Ajax filling many roles Java might have been meant for, and with Flash holding such a huge and growing lead as a next-generation client technology, it raises the question: if Java isn’t going to be relevant for media, then just what kinds of desktop apps is it going to be suited for?

Part II looked at the existing Java media libraries, both in terms of their existing capabilities and their suitability for building the next generation of Java media. All of them came up lacking. Only one, QuickTime for Java, has strong support for creative tasks and good abstractions, but it is limited to two platforms and is well on its way to deprecation. Java Media Framework offers shallow functionality across a number of now-obsolete formats, but offers little if any production capacity for the Web 2.0 user, and seven years of absolute neglect have eliminated any credibility it might have claimed. IBM’s MPEG-4 Toolkit is also playback-only, but is interesting for supporting the popular MPEG-4 and MP3 formats and for offering high performance in an all-Java library. Unfortunately, it is also proprietary and expensive. A number of open-source Java wrappers to native frameworks (some built as JMF extensions) are also noteworthy for extending the number of playable formats, but generally offer little creative capability other than possibly transcoding between formats.

So that leaves Java without a library that can do the kinds of things that Flash is already doing on the web. How advanced is Flash, and how common are the kinds of creative tasks we’re talking about? Consider this: multi-track audio editing is now a children’s game on the PBS Kids website. Or what about the recent holiday season’s gotta-see-it link, the tween-based, video-compositing goofiness of Elf Yourself? This is where Flash, and more importantly user expectations, are today. The only one of the Java media frameworks mentioned in Part II that is suited for these tasks is QuickTime for Java, and it’s on its way to the Server in the Sky. Whatever developers and users are doing with online media in a couple of years, it’s a safe bet they won’t be using Java Media Framework.

If Java wants to get in this game, how can it do so? Let’s throw out one straightforward option: pay Adobe a big stack of money to license Flash and develop a Flash player in Java. After all, if you want to defeat your enemy, sing his song. Would it work? This would re-legitimize Java as a media platform, and could be good for Adobe in spreading Flash to more devices (particularly if it can be done in ME on the small device, where Java has a distinct advantage). Also, as GPL Java makes the platform more palatable to Linux users, a Java-based Flash could bring modern Flash capabilities to Linux, where Flash upkeep has lagged. But then again, Linux support is a problem Adobe could solve for itself if it were so inclined. And for Windows and Mac, Adobe doesn’t exactly have a problem distributing Flash or getting users to upgrade to the latest version, so the claim that a Java implementation of Flash would improve Flash’s distribution may fall on deaf ears.

But hey, it was worth a shot.

So what’s the prospect for developing a new Java media library? What does it need to do? As I’ve said in the preceding parts, think “rip, mix, burn”, or more accurately, “capture, edit, export.” These capabilities are a minimum. Specifically, the API needs to account for:

  • Representing media in a stopped state - this is the amazingly missing piece in JMF. You can’t edit in place or work with metadata because the only “interesting” state in JMF is “playing”. In fact, the concept of getting a player or processor from a “data source” makes the nature of that source completely opaque. Good for one-line demo apps, bad for anything more involved.

  • Capture - it’s a shame there isn’t a standard for microphones and video cameras, like there is for scanners with TWAIN. Maybe there’s room here for a standard of that ilk, allowing, say, an application to register for callbacks whenever the device has data ready. But what might be more important is realizing that most of these devices currently use a USB interface, so the most important step to achieve this would be to get an implementation of JSR-80, the Java USB API, into core Java. With this, microphone, camera, and webcam drivers could be written in Java, making devices work with more platforms than is the case today, where many devices are Windows-only solely for reasons of software support. USB is already designed to work with these devices, defining standard device classes for audio and video; it’s time for Java to catch up and realize that USB isn’t a synonym for “mass storage device”.

  • Multiple tracks, Z-ordering, and tweens - it all sounds esoteric, but Flash is doing it today, and the power of these ideas is part of the reason Flash is winning. For production reasons, it’s also essential to support media references, which allow you to work with pointers to media at production time, and only copy the needed parts of media sources into the final file at export or “flattening” time. Learned from experience: unnecessary 50 MB file copies really add up, so use pointers when you can.
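
To make the first and third requirements a little more concrete, here is a rough sketch in Java of what a movie "at rest" might look like. Everything in it is hypothetical: the names MediaReference, Track, Movie and Tween, the file names, and the millisecond time base are my own assumptions for illustration, not any existing library's API. The point is only that a movie can be built, inspected, and edited as a data structure of pointers, with no player involved and no media copied until export.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Demo: build and inspect a movie without ever "playing" it. */
final class StoppedMovieSketch {
    public static void main(String[] args) {
        Movie movie = new Movie();
        movie.addTrack(1).append(new MediaReference("title.mov", 0, 3_000));
        movie.addTrack(0)
             .append(new MediaReference("interview.mov", 5_000, 10_000))
             .append(new MediaReference("broll.mov", 0, 4_000));
        Tween fadeIn = new Tween(0.0, 1.0, 0, 2_000);

        System.out.println("duration ms: " + movie.durationMillis());
        System.out.println("spans to copy at export: " + movie.flattenPlan().size());
        System.out.println("title opacity at 500 ms: " + fadeIn.valueAt(500));
    }
}

/** Hypothetical pointer to a span of source media; creating one copies no media data. */
final class MediaReference {
    final String sourcePath;
    final long startMillis;
    final long durationMillis;

    MediaReference(String sourcePath, long startMillis, long durationMillis) {
        if (startMillis < 0 || durationMillis <= 0) {
            throw new IllegalArgumentException("span must start at >= 0 and be non-empty");
        }
        this.sourcePath = sourcePath;
        this.startMillis = startMillis;
        this.durationMillis = durationMillis;
    }
}

/** One track: an ordered list of references plus a Z-order layer for compositing. */
final class Track {
    final int layer; // higher layers are drawn on top
    private final List<MediaReference> segments = new ArrayList<>();

    Track(int layer) { this.layer = layer; }

    Track append(MediaReference ref) { segments.add(ref); return this; }

    List<MediaReference> segments() { return Collections.unmodifiableList(segments); }

    long durationMillis() {
        long total = 0;
        for (MediaReference ref : segments) total += ref.durationMillis;
        return total;
    }
}

/** A movie that exists as editable data, independent of any playback state. */
final class Movie {
    private final List<Track> tracks = new ArrayList<>();

    Track addTrack(int layer) {
        Track track = new Track(layer);
        tracks.add(track);
        return track;
    }

    /** The movie is as long as its longest track. */
    long durationMillis() {
        long max = 0;
        for (Track track : tracks) max = Math.max(max, track.durationMillis());
        return max;
    }

    /** Tracks sorted bottom-to-top, the order a compositor would draw them in. */
    List<Track> tracksInDrawOrder() {
        List<Track> sorted = new ArrayList<>(tracks);
        sorted.sort(Comparator.comparingInt(t -> t.layer));
        return sorted;
    }

    /** The spans an export ("flatten") step would need to copy; nothing is copied here. */
    List<MediaReference> flattenPlan() {
        List<MediaReference> plan = new ArrayList<>();
        for (Track track : tracksInDrawOrder()) plan.addAll(track.segments());
        return plan;
    }
}

/** Linear tween of one numeric property (opacity, position, ...) over a time range. */
final class Tween {
    final double from, to;
    final long startMillis, endMillis;

    Tween(double from, double to, long startMillis, long endMillis) {
        if (endMillis <= startMillis) throw new IllegalArgumentException("empty time range");
        this.from = from;
        this.to = to;
        this.startMillis = startMillis;
        this.endMillis = endMillis;
    }

    /** Value at the given time, clamped to the endpoints outside the range. */
    double valueAt(long timeMillis) {
        if (timeMillis <= startMillis) return from;
        if (timeMillis >= endMillis) return to;
        double fraction = (double) (timeMillis - startMillis) / (endMillis - startMillis);
        return from + fraction * (to - from);
    }
}
```

A real library would obviously need far more (audio mixing, metadata, non-linear tweens), but even this toy shows why references matter: trimming or reordering a clip only changes a few numbers, and the expensive copy happens once, at flattening time.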

JMF doesn’t make the cut here, and I’ve argued its worst failing was not that it supported too few codecs and formats, but that it supported too many, offering shallow functionality (pretty much just playback) for all of them. Flash only supports a few proprietary codecs, and that hasn’t exactly hurt it. In fact, the right answer might be to internally support one format, and provide deep functionality in that format. This, after all, is QuickTime’s approach: everything you open in QuickTime is imported into a QuickTime movie in memory, allowing the same features to work consistently on whatever you open (the downside is that you can’t, for example, do round-trip editing on foreign formats like MP3s… metadata is lost in the import). I imagine that Microsoft’s libraries probably do the same thing by treating Windows Media as a sort of “favorite son” format (I don’t know for sure because I’ve never been particularly interested in developing Windows software… fact checks welcome). Ultimately, it’s the only approach that makes sense for supporting capture and editing. Otherwise, you might have to write different capture and editing code for each supported format. It makes vastly more sense to work with one known format at production time and export to other formats afterwards.
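
Here is a rough sketch of that "one native format" shape in Java. Again, every name in it (NativeMovie, FormatGateway, the importer and exporter lambdas) is made up for illustration, and the importers and exporters are stand-ins that don't decode or encode anything; the exporter just returns a descriptive string where real code would produce bytes. What it tries to show is the architecture: foreign formats are touched only at the edges, and editing code is written exactly once.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Demo: import a foreign file, edit in the native form, export to another format. */
final class NativeFormatSketch {
    public static void main(String[] args) {
        FormatGateway gateway = new FormatGateway();
        // Stand-in importer and exporter; real ones would decode and encode media.
        gateway.registerImporter("mp3", fileName -> new NativeMovie(180_000));
        gateway.registerExporter("mp4", movie -> "mp4:" + movie.durationMillis);

        NativeMovie clip = gateway.open("song.mp3").trim(30_000, 60_000);
        System.out.println(gateway.export(clip, "mp4"));
    }
}

/** The single in-memory representation that every edit operates on. */
final class NativeMovie {
    final long durationMillis;

    NativeMovie(long durationMillis) { this.durationMillis = durationMillis; }

    /** Editing is written once, against the native format only. */
    NativeMovie trim(long startMillis, long endMillis) {
        if (startMillis < 0 || endMillis > durationMillis || startMillis >= endMillis) {
            throw new IllegalArgumentException("trim range outside movie");
        }
        return new NativeMovie(endMillis - startMillis);
    }
}

/** Foreign formats are handled only at the edges: import on open, export on save. */
final class FormatGateway {
    private final Map<String, Function<String, NativeMovie>> importers = new HashMap<>();
    private final Map<String, Function<NativeMovie, String>> exporters = new HashMap<>();

    void registerImporter(String extension, Function<String, NativeMovie> importer) {
        importers.put(extension.toLowerCase(Locale.ROOT), importer);
    }

    void registerExporter(String extension, Function<NativeMovie, String> exporter) {
        exporters.put(extension.toLowerCase(Locale.ROOT), exporter);
    }

    /** Picks an importer by file extension and converts to the native form. */
    NativeMovie open(String fileName) {
        Function<String, NativeMovie> importer = importers.get(extensionOf(fileName));
        if (importer == null) {
            throw new IllegalArgumentException("no importer for " + fileName);
        }
        return importer.apply(fileName);
    }

    /** Converts the native form to a foreign format only at export time. */
    String export(NativeMovie movie, String extension) {
        Function<NativeMovie, String> exporter = exporters.get(extension.toLowerCase(Locale.ROOT));
        if (exporter == null) {
            throw new IllegalArgumentException("no exporter for " + extension);
        }
        return exporter.apply(movie);
    }

    private static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

Notice that trim() knows nothing about MP3 or MP4. Supporting another format means adding one importer or exporter, not another copy of the editing code. The cost is the one mentioned above: whatever the importer doesn't carry into the native form (MP3 metadata, say) is gone by the time you export.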

But what format? Should a new one be devised? Why would you do that, when MPEG-4 is already available? MPEG-4 has the benefit of being widely supported, remarkably flexible, and developed by the pre-eminent standards body in the field. By building the library to work with MPEG-4 as its native format, Java will gain immediate compatibility with a massive range of content and content providers, and will be ideally positioned to be the platform for generating and developing that content.

But, some of you are already typing, MPEG-4 is patent-encumbered! We should do everything in Ogg.

Sigh. I was all prepared to fight this fight — to point out how by and large, neither the consumer nor professional media industries have a problem with patented technologies (indeed, pros often adopt proprietary technologies to the point of genericizing their names: “chyron”, “ultimatte”, “teleprompter”, “avid”, “betacam”, etc.) and have heard nothing from open-source zealots that actually matters to them. I was ready to counter that MPEG’s patents are licensed in a reasonable and non-discriminatory fashion. I was even ready to deal with the noxious AT&T lawsuit claiming infringement — we all know the patent system is so broken that a claim is no proof of genuine infringement, nor the lack of claim against Vorbis and Theora a proof that they’re patent-free.

Fortunately, I don’t have to fight this fight, because an open-source authority stepped up just last week. In World Domination 201, Eric S. Raymond (henceforth the usual “ESR”) and Rob Landley recognize the effect that a lack of media support has had on Linux adoption and the positioning of Linux in the race to the 64-bit era. They have an entire section on “Facing the Music”, about how to deal with patent-encumbered codecs and formats and their non-support in the Linux community, specifically, the fact that your iTunes movies and songs won’t and can’t work on Linux. They write… seriously, this is ESR and not me… “There is lots of audio and video content out there in proprietary formats that ordinary computer users want to reach. ‘But you should demand it in open-source formats’ is not an answer, it’s an invitation to be written off as irrelevant.”

What they call for is the creation of a company to license multimedia codecs and formats — they single out the MPEG codecs MP3 and H.264 as particularly desirable because their controlling entities only seek to maximize royalty revenue and have no “strategic” aims — and make them available to the Linux community. “The only way to get such a package of licenses is to have some entity put it together, and stand behind it with money and lawyers.”

Granted, I think the authors overlook the growing importance of media creation, not just playback, and I worry about oversimplifying the issue of playing media from iTunes. While the H.264 codec may not be a strategic bludgeon, Apple’s FairPlay wrapper is, and I’ll bet that ESR isn’t aware that third-party software can’t play iTunes movies, even by using Apple’s QuickTime libraries. Still, it’s a huge step forward to acknowledge that until now, the open source community has been talking in a language that the media community is neither able nor willing to hear, and has done so to the community’s great harm.

Please feel free to jump in anytime you see an analogy to Java media. While for years an RFE to bring the Ogg formats to JMF has retained a hammerlock among the Top 25 RFEs, the status of Java media has dwindled to the point of pretty much being a joke. Adding a little-used format to JMF isn’t going to change anything. If Eric S. Raymond is ready to get real about supporting end-users with useful multimedia libraries, then the Java community can and should too.

In fact… with the GPL’ing of Java, the Java Trap will be a thing of the past. And if Java were to solve its media problem, and be under a license palatable to the Linux community, then it could solve the Linux community’s problem too. This could also be a major driver to actually get Linux users to install Java (or, as we’ll soon be able to type, apt-get java).

So, appealing as all this is, who the hell is going to do it, and pay for it?

With all grand plans for Java, the default answer is usually Sun. In this case, I think that’s wrong. Sun’s history with JMF — a lousy design, a crazy selection of partners, and seven years of indifference — is proof enough that this is not their game. Sun has no talent for media, no passion for media, and no relevant patents. Their attention is elsewhere, seemingly in pitting Java in an entropic arms race with C# to see who can get to the heat death of the Universe first.

I’ve praised the deep thinking in QuickTime that has enabled that library to flourish for the better part of two decades, so is this a job for Apple? Surely not. They’re a hardware company, not a charity, and their support of Java presumably ends where Mac OS X and the iPod do. I wouldn’t want to try to convince Steve Jobs to write a QuickTime rival for other companies’ devices, and I bet you wouldn’t either.

Adobe? No, the point is to keep desktop Java viable, and that means competing with Flash. Scratch Real, which clings to its proprietary codecs and server/encoder revenue; nothing about this proposal does it any good.

What about the open-source community? Daniel Steinberg pointed out after reading Part I that one could simply say “hey, Java’s open source now, so you go write it.” While finding talented multimedia developers is no piece of cake, the real deal-killer is dealing with the licensing of codecs and other patented multimedia necessities. ESR says you need “money and lawyers” to back up such a library, and he’s surely right. Maybe this could be handled through a non-profit foundation rather than a for-profit company, but it still involves more money than code, and that’s not easy to get.

Speaking of money, though, how about Google? It sounds crazy, but a few of their recent moves are consistent with this proposal. By buying YouTube (after making an effort with their own Google Video), they’ve clearly shown an awareness of the importance of media, or at least the value of selling ads next to a player window. With audio and video increasingly prevalent on the web, a strong embrace of multimedia technologies is consistent with the company’s mission to capture and organize the world’s information. But to develop and give away a multimedia library? Well, we’ve seen Google get into library development with the Google Web Toolkit, and apps like Google Docs and Spreadsheets suggest a desire to help people create and format the data that Google will then organize. So, extending that by analogy into the realm of multimedia production and distribution… well, it’s at least possible that Google could be interested, if not exactly likely.

Am I dreaming? I’m a realist. At the end of the day, I see the term “Specializing in Java Media” on my business card and wonder if that’s too limiting, if I’ve put myself on a sinking (or well and truly sunk) ship. Java has a lot to offer the world in its cross-platform story, its handling of complex problem domains, and its large and enthusiastic developer community.

But we can’t do it alone. And that’s why I’ve got books on Flash checked out to my Safari bookshelf and a Flash trial with another week left on it, along with a QTKit project to start rewiring my neurons to think in Objective-C. Just in case the dream doesn’t come true.