During my upcoming presentation at ETel, Voice and the Web: The New Terrain, I'll be examining how the global telephone network evolved from a completely closed system, and where we're headed once that network finally becomes available to application developers everywhere.
In the course of putting together the presentation, I asked myself why so much of the Web 2.0 hoopla isn't about voice.
We're telecom innovators. We think about people and communications and technology a lot. And we look at Myspace and can't help but wonder how all that happened without us. Put another way, just how did social computing get so social without voice?
First, let's check the observation. Tens of millions of messages, perhaps, pass through Myspace daily. Those messages are text, images, or both. But not voice. And yet voice seems so obvious. Friend online? Click here to ring both your phones. But no.
On flickr we find photos from everywhere in the world. And looking at everybody's stuff even turns out to be fun and engaging. And we can see exactly who took what, and why. But click here to ring the photographer's phone? Again, no. No voice.
And Craigslist? Do people call each other when they use humanity's largest watercooler to sell a sofa? In fact, they frequently do. This one's interesting. Clicking through the want ads and personals turns up a surprising number of phone numbers, frequently lightly scrambled—"4* 15 # three two six 1805 for more info"—to throw off the spammers. More phone numbers, in fact, than we might expect. So Craigslist allows for the power of voice but, crucially, doesn't do anything to actively promote voice between users. Why not?
The technology exists today to pass out one-day, three-day, or seven-day disposable telephone numbers to anybody buying that sofa or looking for a date. And away would go the spambots, forever. But, no. No voice.
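As a rough illustration of how little logic a disposable number needs, here is a minimal sketch in Python. It only models the expiry rule: an alias number forwards to the real phone until its lifetime runs out, then goes dead. The names `DisposableNumber` and `issue_number`, the field names, and the sample 555 numbers are all hypothetical. Provisioning the alias on the telephone network is assumed to happen elsewhere and is not shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# The one-, three-, and seven-day lifetimes mentioned in the text.
ALLOWED_LIFETIMES_DAYS = (1, 3, 7)


@dataclass(frozen=True)
class DisposableNumber:
    """Hypothetical record tying a temporary alias to a private number."""
    alias: str            # number published in the listing
    real_number: str      # the poster's actual phone, never published
    expires_at: datetime  # moment the alias stops forwarding

    def route(self, now: datetime) -> Optional[str]:
        """Return the real number while the alias is live, otherwise None."""
        return self.real_number if now < self.expires_at else None


def issue_number(alias: str, real_number: str, days: int,
                 now: datetime) -> DisposableNumber:
    """Create an alias that forwards calls for the given number of days."""
    if days not in ALLOWED_LIFETIMES_DAYS:
        raise ValueError(f"lifetime must be one of {ALLOWED_LIFETIMES_DAYS} days")
    return DisposableNumber(alias, real_number, now + timedelta(days=days))


if __name__ == "__main__":
    posted = datetime(2024, 1, 1, 9, 0)
    number = issue_number("+1-415-555-0100", "+1-415-555-0199", days=3, now=posted)
    print(number.route(posted + timedelta(days=2)))  # alias still forwards
    print(number.route(posted + timedelta(days=4)))  # alias has expired
```

Once the alias expires, a harvested number is worthless to a spammer, which is the whole point of making it disposable.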
Why? Doesn't the social web realize that people talk?
eBay is our current best counterexample to the voiceless web. eBay believes in the power of voice. So much so, in fact, that it bought Skype for billions of dollars.
So, on the one hand we have Myspace and Craigslist, currently the first and seventh largest websites on the planet, whose planners and designers either don't know they can bring voice to their users, or don't care. And, on the other hand, we have eBay, probably the world's largest online buyers' community, spending billions to bring Skype to users who could have been Skyping all along, if only they had cared. Both parts of this equation are bizarre: a complete lack of interest in voice on one side, together with an obvious over-response on the other.
Part of the problem may be that voice doesn't actually make sense in all of the social contexts that we, as telecom innovators, might hope. Maybe flickr is a case in point. If browsing the world's photos means that we're looking mostly at photos taken by people we've never met, from different time zones, maybe voice just isn't the right way to reach out and make an introduction.
Another part may be fear of integration. Up until very recently, if you found yourself hosting a well-trafficked site with a large user base, it wasn't at all clear how you could offer up voice to your users, even if you wanted to. This may be what's going on with Myspace. The site's obvious interaction gaffes are the stuff of legend in the usability community. That could point to any number of things, of course, but one likely culprit may just be the risk of integrating anything at all while the site is growing that rapidly.
And then, there may be a genuine lack of interest on the part of some of the most successful of the social sites. I can't be certain, but I suspect this to be the case with Craigslist. Perhaps Craig himself doesn't care. Or perhaps nobody's approached him. Or perhaps it simply isn't clear enough yet that voice is a genuine possibility on the Web.
Where voice simply isn't the right tool for the job (flickr, perhaps), we can stop asking questions. But where voice is merely disadvantaged, either through lack of interest or because of integration difficulties, we owe it to ourselves to look past these proximal causes and go at least one layer deeper.
Consider, to start with, that voice, at least traditionally, has cost money. The public network didn't come about for free. Then compare those decades of centralized state planning and control to the free, drop-in Web components—think shopping carts and comment boards, as well as Google Maps and Feedburner-type web services. It's easy to see why voice may not be the first thing that springs to the minds of talented Web developers everywhere. VoIP may, of course, turn "costs money" into a type of "free," but then we run into the fact that whatever the outcome of the religious war on the uptake of VoIP handsets, what users really love is wireless, which puts us squarely back in the "costs money" domain of the PSTN.
Of course, costing money isn't the end of successful innovation. But it probably doesn't help that the web has evolved as an almost exclusively transaction-driven economy. Click here. For a search, for an API call, for an image, an article, or a book. It doesn't matter what it is—on the Web, it's the outcome of a mostly stateless, mostly timeless transaction.
But voice? Voice has always been about minutes: unlimited local calling, nights or weekends notwithstanding. It might help us to ask how we can bill for voice the way the Web expects. That is, as a billed transaction rather than a bunch of minutes.
And last, and probably most fruitfully, we can tackle the question of integration and just how hard it is to use voice on the Web. Just how hard should it be for a web developer to start or stop a telephone call? That's something we've been tackling at Jaduka. At ETel I'll be talking more about our API, which will give developers direct access to, and all the inherent benefits of, the world's highest-quality, ubiquitous public switched telephone network (PSTN).
Client-side installs are a barrier to innovation, not a help. Web developers and users alike hate Flash and Java downloads, and it seems unlikely Skype will change these feelings in any significant way. So why shouldn't control of voice on the Web look and act just like everything else on the Web—that is, like a Web service?
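To make that question concrete, here is a minimal sketch in Python of what starting and stopping a call might look like if voice were exposed as an ordinary HTTP web service. Everything in it is an assumption made for illustration: the base URL `voice.example.com`, the `/calls` resource, the JSON field names, and the bearer-token header are hypothetical and are not drawn from Jaduka's API or any other real service. The functions only build the requests; nothing is sent over the network.

```python
import json
from urllib.request import Request

# Hypothetical endpoint, used purely for illustration.
BASE_URL = "https://voice.example.com/v1"


def start_call_request(api_key: str, caller: str, callee: str) -> Request:
    """Build a POST asking the (hypothetical) service to ring both phones."""
    body = json.dumps({"from": caller, "to": callee}).encode("utf-8")
    return Request(
        f"{BASE_URL}/calls",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def stop_call_request(api_key: str, call_id: str) -> Request:
    """Build a DELETE asking the (hypothetical) service to hang up a call."""
    return Request(
        f"{BASE_URL}/calls/{call_id}",
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )


if __name__ == "__main__":
    start = start_call_request("demo-key", "+14155550100", "+14155550101")
    print(start.get_method(), start.full_url)
    stop = stop_call_request("demo-key", "call-123")
    print(stop.get_method(), stop.full_url)
```

Under these assumptions, a call is just another resource: the developer creates it with one request and removes it with another, with no client-side install on either end.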
So I ask again: how did social computing get so social without voice? Maybe part of the social web doesn't need us. But clearly other parts of the social web will. Whatever the case, it will be up to us to package our services, and to bill for them, in ways that web developers everywhere understand, appreciate, and will explore.
Catch my talk at ETel on Wednesday, February 28, from 4:15pm to 4:30pm, in Salon ABCDE.
Trevor Baca is VP of software engineering at Jaduka and oversees software engineering, real-time systems engineering, telephony services development, information architecture, usability, and user-experience engineering teams.
Copyright © 2009 O'Reilly Media, Inc.