Trend #4: Distributed Computing
Koman: What about distributed computing over networks?
Rashid: I think there we're really seeing a revolution. I'm extremely excited about what's happening with XML and the notion that we can now create the self-describing databases of information and exchange data that is self-describing. That we can put on the Net descriptions of the interfaces to servers so that you can literally program against a server without initially knowing anything about it. But by simply pointing your development environment to the data that exists out there, the directory information, it pulls down the description of the description or the schema of the database of the server that you're talking to, and you can pull that into your development environment and just work with it right there. That is just a tremendous new ability. We've been building distributed computing systems for a long time and people have worked around these kinds of issues in the past. Now we're actually starting to see real commercial enterprises doing it, and large databases being put online in this form.
One of the things we did, to start my research group a few years ago, is we put one of the very first terabyte databases out, which was the TerraServer, a database of imagery of the surface of the Earth. Over the years we added topographical information, lots of data about the areas that we have images of. We put in map information, and so forth.
Last year the researcher that originally put the site together took that site and turned it into a Web service, making it available through XML and through SOAP and all of the various protocols that allow you to sort of talk to this database as a programmatic component. Almost immediately we saw people using it. It's been used in courses that people are teaching about writing distributed applications. The USDA is using it to build applications to help farmers do soil analysis. And it's very cool just to see that happen all of a sudden.
We're now working with the National Science Foundation and Caltech and Johns Hopkins University on a national virtual observatory. We're basically putting online a database looking out towards the sky, and making available to astronomers this type of information where you can both get the images but you can also program against it and access the information in a programmatic way.
And that's really exciting for everybody because it means that suddenly we can think about having in some sense on the Internet the world's best telescope that's always on -- it's just accumulating data from real telescopes, but it's there whenever anybody wants to find a bit of astronomical data. That's something that many people are working on together in the astronomy community and we're part of that.
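Editor's note: the idea of programming against a server from its published description can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The service description below is a hypothetical, simplified format (not real WSDL and not the TerraServer interface), and the names `ImageryService`, `GetTile`, `GetPlaceName`, `discover`, and `build_request` are invented for the example. The client reads the description, learns which operations and parameters exist, and builds a SOAP-style request without any of that being hard-coded.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified service description (an assumption for illustration;
# real services of the period published WSDL, which is considerably richer).
DESCRIPTION = """
<service name="ImageryService">
  <operation name="GetTile">
    <param name="latitude" type="float"/>
    <param name="longitude" type="float"/>
    <param name="scale" type="int"/>
  </operation>
  <operation name="GetPlaceName">
    <param name="query" type="string"/>
  </operation>
</service>
"""

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

# Map the type names used in the description to Python converters.
CASTS = {"float": float, "int": int, "string": str}


def discover(description_xml):
    """Return {operation name: [(param name, type name), ...]} read from the description."""
    root = ET.fromstring(description_xml)
    return {
        op.get("name"): [(p.get("name"), p.get("type")) for p in op.findall("param")]
        for op in root.findall("operation")
    }


def build_request(operations, op_name, **args):
    """Check the arguments against the discovered signature and build a SOAP-style envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, op_name)
    for name, type_name in operations[op_name]:
        # Raises KeyError for a missing argument, ValueError for a bad value.
        value = CASTS[type_name](args[name])
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# The client knows nothing in advance; everything comes from the description.
operations = discover(DESCRIPTION)
request_xml = build_request(operations, "GetTile", latitude=47.6, longitude=-122.3, scale=10)
```

In a real deployment the description would be fetched over the network and the envelope posted to the service endpoint; both steps are left out here so the sketch stays self-contained.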
Koman: So it seems like an additional theme is using the Internet to pull together really massive databases you can program against.
Rashid: The way I look at it is, a lot of the very early Internet systems were in fact distributed systems where you had protocols for communicating back and forth between computers for the purposes of writing software, not just for people to sort of visualize information and bring up Web pages and things of that sort. And now I think we're finally reaching a point where those kinds of ideas are really going to see fruit. We're going to see people be able to build large-scale distributed applications using the tremendous resources that exist on the Internet as a database, if you can think of it that way, for those computations.
Koman: What do you make of the peer-to-peer networking paradigm?
Rashid: Well, again -- I hate to say this, back in the old days when I was growing up ... ah, kids never really believe those stories anyway, but -- when I was doing my thesis work, we were doing peer-to-peer networking. This is an old idea. It's been around for a long time. It's called distributed computing.
I mean, we went through this period, I think, where client-server became kind of a very special case that everybody was handling particularly well for the Internet, and in corporations and enterprises as well. If you go back historically it was all peer-to-peer communication. So I think what we're seeing now is really, people getting back to that, and it takes many forms. Sometimes people talk about file sharing, that's a kind of peer-to-peer. But I think the most interesting is where you really start talking about building -- literally building systems that are survivable, that are fault-tolerant because the information's distributed, and because the applications themselves are distributed.
Koman: So do you think client-server doesn't necessarily have to be around forever?
Rashid: Well, I think, well, client-server will be around forever, because if nothing else it's a special case of peer-to-peer. It's just where you've got a big peer and a lot of little peers. I think the reality there is that we will move toward more distributed implementations. It's already the case if you look at big Web sites that they're not typically implemented by having one computer that your client actually talks to. Instead, they're often implemented now, both for purposes of scalability and for purposes of fault tolerance, as a distributed-computing component. You may think of it as being a single thing, a single server, but that's not actually how they're implemented.
And I think increasingly that's just going to be pushed out to the leaf nodes, if you want to think of it that way, in the network. Where it's not just about computing in the center of the network, but it's also going to be about computing in the leaf nodes, the individual personal computers and devices that plug in. And I really think it will be much more of a truly distributed system, where you'll increasingly not be able to tell where a particular computation was performed. And probably won't care. And, in fact, you'll be very happy about the fact that if I go from one device to another, the same computations appear to be performed, even though I don't really know exactly where or how.
If I've got a small personal device that's doing something for me I don't really want to know, is the device doing it or is something, somewhere in the network this information is being processed and I'm simply watching it or looking at it?
And likewise on my PC -- the distinction shouldn't be that important. It should just be a question of how do we provide computing in a most effective way for users that makes the best use of the network, that gives them the best level of reliability, of scalability, and that protects their information in the best possible way?
Koman: Uh-huh. So classes of software like the SETI@Home module, for example, you would see more of that.
Rashid: Yeah, you're definitely going to see more of that. I think increasingly you're going to see people taking advantage of the fact that there's an enormous amount of computing capacity out there and beginning to think about it as one big, gigantic, world supercomputer.
Maybe not exactly like those science fiction stories ... I mean I love science fiction, but I sometimes do cringe when they show these people kind of flying through cyberspace in these kind of weird, psychedelic outfits and so forth. It may not be quite like that, but certainly you're going to see computation distributed in a way that people dreamed about back when I was earning my stripes. Now you're really seeing it begin to happen.
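Editor's note: the pattern behind volunteer-computing projects such as SETI@Home (split a large job into independent work units, farm them out, combine the partial results) can be sketched as follows. This is a toy under stated assumptions, not how SETI@Home itself is implemented. Local threads stand in for remote machines, and the names `make_work_units`, `analyze`, and `run` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_work_units(samples, unit_size):
    """Split a large data set into independent chunks ("work units")."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]


def analyze(unit):
    """Stand-in for the per-unit computation: report the strongest sample in the unit."""
    return max(unit)


def run(samples, unit_size=4, workers=3):
    """Distribute work units to a pool of workers and combine their partial results."""
    units = make_work_units(samples, unit_size)
    # Threads stand in for volunteer machines; a real system would ship each
    # unit over the network and cope with slow or vanished participants.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(analyze, units))
    # The combining step runs at the coordinator.
    return max(partial_results)


signal = [3, 9, 1, 4, 12, 7, 2, 8, 5, 11]
strongest = run(signal)
```

Because each unit is analyzed independently, the caller gets the same answer whichever worker handled which unit, which is the property Rashid describes: you cannot tell, and do not need to care, where a particular computation was performed.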
Koman: So, just to stay on the science fiction theme for a second, on Star Trek there was this one supercomputer on board and you just spoke out and it would do your command. It was sort of like a really mega-centralized computer.
Rashid: Yeah, well that evolved through the various Star Trek shows over the years. It's always interesting to see how the technology of the day seems to influence the science fiction of the day.
Koman: Right. That was sort of the Sixties ...
Rashid: And by the time you got to things like Star Trek: The Next Generation, they were already having nanotechnology, nanobots, and distributed intelligence. It's just interesting to see how things changed as time went on.
The original Star Trek series, I think, reflected people's notions of computing of that particular day which was large, big, single computers that were so massive and so expensive that you couldn't have more than one of them. Although they did introduce the concept of the tablet PC, I think. They all had these sort of computing tablets they would draw on or make their notes on or talk through. And we're only now getting around to actually building those things.
Koman: Well, I guess this was a tangent.
Rashid: It's a great tangent. I mean I was a big Star Trek fan. Whenever a new Star Trek movie would come out I'd always take the people that worked for me out to see it, so I have a personal financial stake in the quality of each Star Trek movie.
Koman: A lot of disappointments. So, will we see a Holodeck in our lifetimes?
Rashid: Oh gosh, when are you going to see a Holodeck? Well, ah, the Holodeck was very complicated in Star Trek. There were many different technologies associated with that. So it's going to be hard to produce exactly what they have.
When are we going to be able to create something that looks on a computer screen as though it's completely real? I think that's probably within the next five years, maybe 10. I might be off, but we're moving very rapidly in that direction.