Wireless DevCenter    
 Published on Wireless DevCenter (http://www.oreillynet.com/wireless/)
 http://www.oreillynet.com/pub/a/wireless/2001/09/28/relay.html


SMS Relay -- An Idea for Fault-Tolerant Communications

by Brian McConnell
09/28/2001

This article examines the task of creating a wireless communication system that can survive a catastrophic failure, and still provide basic communication services to its users. Specifically, the author suggests that the mobile carriers borrow a page from Gnutella and enable mobile phones to relay their short text messages to one another until they reach land lines.

While wireless communication systems appear to be inherently resistant to damage in natural or man-made disasters, they are not virtual systems that reside entirely in the ether. Like landline networks, they depend on physical facilities: short- and long-haul interconnections to the rest of the telephone network, antennas, and switching facilities. Although the last mile may be a wireless connection, much of the time calls are traveling through terrestrial facilities that can be damaged or knocked offline.

The recent events in New York and Washington, D.C. exposed two key weaknesses in wireless communication systems. First, unlike two-way radios, cellular telephones depend on centralized facilities to function. Knock those facilities offline or swamp them with high volumes of call traffic, and the handsets become little more than battery-operated paperweights. Even if the base stations are functioning, the network can be overwhelmed when a large number of users try to use the system simultaneously. Most telephone networks are designed with the assumption that only a fraction of the total number of users will attempt to place calls at any given moment. This assumption fails in an emergency situation such as a natural disaster or the September 11 attacks.

Second -- and this factor is unique to the United States -- there is no uniform standard for digital cellular networks. The U.S. cellular telephone system is a balkanized patchwork of several different networks, each using different and incompatible technologies. As a result, a handset designed to work on one carrier's network can only talk to that type of network. In other regions of the world, carriers commonly use the GSM standard, which enables phones to automatically roam across different carrier networks, even within the same city. Because of the balkanized nature of the U.S. system, a bad situation was made even worse because carriers could not pick up the slack for each other. If the government had not abdicated its responsibility and had forced carriers to use a common standard (just as it has done for TV and radio broadcasters), subscribers could have automatically placed calls through alternate carriers if their primary carrier was knocked offline.

This wasn't a problem when cellular telephones were viewed as a luxury item or an expensive business tool. The loss of service, while inconvenient, was not viewed as a critical event. But in recent years, cellular telephones have become an essential tool, and in emergency situations, a vital lifeline. Cellular dialtone is no longer a luxury; it is just as important as other essential utilities. This trend will only accelerate in the wake of recent events.

Every phone a relay point

Fortunately, the cellular carriers and handset manufacturers already have the tools at hand to create a nearly indestructible wireless communication network. This network can be overlaid onto existing systems and requires no new technology, only updated handset software and, at most, minor hardware changes.

Nearly every digital cellular phone sold today is a computer. It may look like a phone, but it is really a computer designed for making and receiving voice calls and text messages. By making minor hardware changes and updating the software used in cellular phones, vendors could produce phones that operate in a limited capacity even when the cellular network is rendered inoperable.

The key to doing this is to redesign the way cellular phones handle text messaging (also known as SMS, or short messaging service). This service enables phone users to send and receive short text messages via their phones. Essentially a cross between instant messaging and email, it's not as good as a live phone call, but it's better than nothing. In the current system, the phone needs to be able to communicate with a base station to send and receive text messages, just as it does with a voice call.

If you alter the phones so that each handset can itself relay text messages on behalf of other phones, you can build a text-messaging network that can continue operating even in the face of a severe and widespread outage. The trick is to apply techniques used in peer-to-peer computing (and in Gnutella in particular) to create a parallel system that can relay text messages without central coordination. In this system, every phone or pager could act as a relay point for forwarding short text messages. This would enable a user who is out of range from a functioning base station to bounce his or her message off of one or more other telephones that act as intermediate relay agents. This system can also be implemented in a way that is automatic and invisible to users.

While this technique is not appropriate for voice phone calls (the quality of service would be too unpredictable), it could work quite well for text messages. Text messages do not need to be delivered instantly. A delay of several seconds, while intolerable for voice, is barely perceptible for a text message. It is much easier to build a store-and-forward system than it is to support real-time full duplex voice communication.

Text messages also require much less bandwidth than a phone call, so even a severely compromised network can carry a large volume of text messages. A text message occupies one kilobyte or less, whereas a compressed digital phone call requires roughly 8 to 10 Kbps in each direction. A one-minute phone call therefore moves roughly 120 to 150 Kbytes of data, compared to a single kilobyte or less for a text message.
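
The comparison can be worked through as a rough calculation. The sketch below takes the call rate as 10 Kbps in each direction and the message size as one kilobyte, both of which are the article's estimates rather than measured values; a codec nearer 8 Kbps would give about 120 Kbytes per minute instead of 150.

```python
# Rough data-volume comparison. All figures are estimates, not measurements.
CALL_RATE_BITS_PER_SEC = 10_000   # assumed: about 10 Kbps in each direction
CALL_SECONDS = 60                 # a one-minute call
TEXT_MESSAGE_BYTES = 1_000        # assumed upper bound: one kilobyte per message


def call_bytes(rate_bps, seconds, directions=2):
    """Bytes moved by a call at a constant bit rate in each direction."""
    return rate_bps * seconds * directions // 8


one_minute_call = call_bytes(CALL_RATE_BITS_PER_SEC, CALL_SECONDS)
messages_per_call_minute = one_minute_call // TEXT_MESSAGE_BYTES

print(one_minute_call, "bytes per one-minute call")
print(messages_per_call_minute, "text messages fit in the same volume")
```

Under these assumptions, the capacity consumed by one minute of conversation could instead carry on the order of 150 one-kilobyte text messages.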

The goal in creating such a system is to build a network that can fail gracefully, and to clear the airwaves for emergency phone calls by providing a reliable way to send non-urgent messages during widespread outages and periods of high system demand.

Peer-to-peer messaging

This basic scheme has already been widely used in PC file sharing applications, as well as PC-based instant messaging services. One of the best examples to date is the Gnutella file-sharing system. Many of the techniques used in this system can be directly applied to the task of creating an ultra-reliable text-messaging system for cellular phones and pagers.

Gnutella is a completely decentralized, or peer-to-peer, file-sharing system. Unlike Napster, there is no centralized server that acts as a broker in processing search requests, matching users with each other. Gnutella clients automatically seek out other Gnutella clients elsewhere on the Internet. Each time a Gnutella client discovers another client, it asks that computer for a list of the other clients it has already discovered. It then repeats this process for each of the computers in this list, and does so ad infinitum. Within a matter of minutes, the program typically discovers thousands of other computers that are running the Gnutella system, and can communicate with any one of them.
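
The discovery process described above amounts to a breadth-first crawl. The following is a minimal sketch of that idea only, not of the actual Gnutella wire protocol: the function `discover_peers`, the callback `ask_for_peers`, and the in-memory `FAKE_NETWORK` table are all hypothetical names invented for illustration.

```python
from collections import deque


def discover_peers(start, ask_for_peers, limit=1000):
    """Breadth-first crawl: ask each newly found peer for the peers it knows.

    ask_for_peers(peer) is a caller-supplied function returning a list of
    peer names. The crawl stops once `limit` peers are known.
    """
    known = {start}
    pending = deque([start])
    while pending:
        peer = pending.popleft()
        for other in ask_for_peers(peer):
            if other not in known:
                known.add(other)
                pending.append(other)
                if len(known) >= limit:
                    return known
    return known


# Hypothetical stand-in for the network: peer -> peers it reports knowing.
FAKE_NETWORK = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}

found = discover_peers("a", lambda peer: FAKE_NETWORK.get(peer, []))
print(sorted(found))
```

Starting from a single known peer, the crawl reaches every peer that is connected to it through any chain of introductions, which is why no central directory is needed.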

This design makes the system very reliable and difficult to disable. Gnutella's designers were less concerned with creating a network that could withstand a physical attack than they were with creating a system that could withstand legal attacks from censors and attorneys. The decentralized nature of the system makes it impossible to shut the system down by disabling a single node or small group of nodes.

When applied to wireless text messaging, this technique can be used to build a network that automatically routes around outages by enabling messages to bypass terrestrial facilities in the affected areas. There are three keys to building a system like this.

Automatic discovery and message delivery

A telephone or pager in this system will normally send and receive messages through the nearest base station. When the phone loses its ability to communicate with a base station, either because it is out of range or because the network has failed, it switches to its backup mode.

Whenever the device is turned on, it will listen for short messages from other devices capable of relaying messages and, in turn, the devices they are able to communicate with. The device uses this information to learn how messages can be routed to a base station, using other phones or pagers as intermediate relay points when direct communication is not possible. The device does this silently and automatically.

A typical status message might look something like the following example:

<route>
<id>4155552222</id>
<dnstream>4155551234,4155552345,4155553456</dnstream>
<upstream>4155559876</upstream>
</route>

In this example, the phone 415-555-2222 is telling other devices that it can relay messages to 415-555-1234, 415-555-2345, 415-555-3456 and 415-555-9876. It is also saying that it can relay messages upstream, to other devices and the outside world, via 415-555-9876.
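
A receiving device would need to turn such a status message into a list of reachable numbers. The sketch below shows one way this might be done with Python's standard XML parser. The element names come from the example above; treating each element as a comma-separated list, and the function name `parse_route`, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The status message from the example above.
STATUS = """<route>
<id>4155552222</id>
<dnstream>4155551234,4155552345,4155553456</dnstream>
<upstream>4155559876</upstream>
</route>"""


def parse_route(xml_text):
    """Return (device id, downstream ids, upstream ids) from a status message."""
    root = ET.fromstring(xml_text)

    def ids(tag):
        # Assumption: each element holds a comma-separated list of numbers.
        text = root.findtext(tag, default="")
        return [part.strip() for part in text.split(",") if part.strip()]

    device_id = root.findtext("id", default="").strip()
    return device_id, ids("dnstream"), ids("upstream")


device, downstream, upstream = parse_route(STATUS)
reachable = downstream + upstream  # every number this device offers to relay to
print(device, reachable)
```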

When the user travels out of range, or the network goes dark, the device will attempt to relay messages via other handheld devices instead of trying to communicate with a base station directly. Messages can make many hops between endpoints; for example, hopping across a half dozen cellphones in their journey to the nearest operating base station. This will be invisible to the user. Even if a message traverses many devices in its trip, the cumulative delay will be barely noticeable, typically a few seconds.

Likewise, the terrestrial network can use the same technique to discover how to reach users who cannot communicate directly with a base station. Just as handheld devices will listen for status messages about who can communicate with whom, so too will the base stations. With this information, the terrestrial side of the network can maintain a constantly-updated table of the best routes to individual users, including routes that involve bouncing messages off of several handheld devices en route.
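
Once a device or base station has collected status messages, choosing a route is a shortest-path search over the "who can hear whom" table. The sketch below uses a plain breadth-first search to find the fewest-hop path. It is one possible approach, not a prescribed routing algorithm; the table contents, the number 4155550000, and the function name are hypothetical.

```python
from collections import deque


def route_to_base_station(origin, neighbors, in_coverage):
    """Return the fewest-hop path from origin to any device that can reach a
    base station directly, or None if no such path is known.

    neighbors maps a device to the devices it can talk to directly;
    in_coverage is the set of devices currently in range of a base station.
    """
    paths = deque([[origin]])
    seen = {origin}
    while paths:
        path = paths.popleft()
        device = path[-1]
        if device in in_coverage:
            return path
        for nxt in neighbors.get(device, ()):
            if nxt not in seen:
                seen.add(nxt)
                paths.append(path + [nxt])
    return None


# Hypothetical table built from overheard status messages.
NEIGHBORS = {
    "4155552222": ["4155551234", "4155559876"],
    "4155551234": ["4155552222"],
    "4155559876": ["4155552222", "4155550000"],
    "4155550000": ["4155559876"],
}
IN_COVERAGE = {"4155550000"}  # assumed: only this phone can reach a base station

route = route_to_base_station("4155551234", NEIGHBORS, IN_COVERAGE)
print(route)
```

The same search run from the network side, starting at a base station and ending at the target handset, would give the downstream route.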

When relaying messages on behalf of other users, a phone or pager will be playing the electronic equivalent of hot potato. Upon receiving a message that needs to be relayed, the device will be tasked with finding the nearest base station or another device that has an upstream connection as quickly as possible. In a worst-case scenario, the device will store and queue the message for delivery as soon as a path opens up. All of this will happen automatically, so a user might be unaware that his phone is being used to relay messages to and from people in a blacked out area.
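
The "hot potato" behavior can be sketched as a small queue: pass a message on immediately when a path exists, otherwise hold it until one opens. The `Relay` class and its method names below are hypothetical, and the `delivered` list merely stands in for handing a message to the next hop.

```python
from collections import deque


class Relay:
    """Hypothetical handset-side relay: forward at once if possible, else queue."""

    def __init__(self):
        self.queue = deque()    # messages waiting for a path
        self.delivered = []     # stand-in for "handed to the next hop"
        self.has_path = False   # True when a base station or upstream peer is reachable

    def accept(self, message):
        """Take a message from another device and forward or queue it."""
        if self.has_path:
            self.delivered.append(message)
        else:
            self.queue.append(message)

    def path_opened(self):
        """Call when an upstream route appears; drains the queue in arrival order."""
        self.has_path = True
        while self.queue:
            self.delivered.append(self.queue.popleft())

    def path_lost(self):
        """Call when the upstream route disappears; later messages are queued."""
        self.has_path = False


relay = Relay()
relay.accept("m1")      # no path yet, so this is queued
relay.accept("m2")
relay.path_opened()     # queued messages go out first, oldest first
relay.accept("m3")      # forwarded immediately
print(relay.delivered)
```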

Store and relay

What is interesting about this approach is that, in addition to making efficient use of scarce resources (base stations), it is also capable of operating in a worst-case scenario, where a large region is completely isolated by a service outage; for example, by a large hurricane. In such a scenario, hundreds of thousands of people could be cut off due to widespread damage to terrestrial facilities. There would be no network to talk to across a large area, and what facilities were still online would be swamped with traffic.

Users who needed to communicate with other people within this blackout area would be able to do so without ever touching the terrestrial network, although it might take some time for a message to reach its destination via a "scenic route." Users who needed to communicate with people outside the blackout area could also do so, as their messages would be passed along, in bucket brigade fashion, until they eventually found a point of entry to the outside world (not unlike the characters in The Matrix).

One can even envision a situation where a message literally hitches a ride with another user. Imagine that someone is driving through a disaster area. There is no network service whatsoever in the affected area. Several people have attempted to send messages out of the area. Their phones discover this passerby's phone and push copies of their queued messages onto it. A couple of hours later, this traveler finds his way out of the affected area. His phone has queued copies of several dozen messages for delivery and faithfully forwards them now that it can talk to the network again. It's certainly not "instant messaging" by any stretch of the imagination, but it's better for the message to arrive late than never. In this extreme situation, queued messages literally walk or drive out of the affected area until they can find a point of entry to the terrestrial network.
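
Because stranded phones push copies of their messages, the same message may reach the network by more than one carrier, so some form of duplicate suppression would be needed. The sketch below assumes each message carries a unique `id` field, which the article does not specify; the function names and the capacity limit are likewise hypothetical.

```python
def push_copies(sender_queue, carrier_queue, capacity=50):
    """Copy queued messages onto a passing phone, skipping ones it already holds."""
    held = {msg["id"] for msg in carrier_queue}
    for msg in sender_queue:
        if len(carrier_queue) >= capacity:
            break  # the carrier's store is full
        if msg["id"] not in held:
            carrier_queue.append(dict(msg))  # a copy; the sender keeps its own
            held.add(msg["id"])


def deliver(carrier_queue, already_delivered):
    """On regaining coverage, hand over each message the network has not yet seen."""
    sent = []
    for msg in carrier_queue:
        if msg["id"] not in already_delivered:
            already_delivered.add(msg["id"])
            sent.append(msg)
    carrier_queue.clear()
    return sent


# Two stranded phones; the second is also relaying a copy of message "a1".
stranded_a = [{"id": "a1", "text": "I'm OK"}, {"id": "a2", "text": "Road closed"}]
stranded_b = [{"id": "b1", "text": "Need water"}, {"id": "a1", "text": "I'm OK"}]

carrier = []
push_copies(stranded_a, carrier)
push_copies(stranded_b, carrier)

network_seen = set()
sent = deliver(carrier, network_seen)
print([msg["id"] for msg in sent])
```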

Other advantages

This type of system is not intended to replace existing cellular and paging networks, but rather to serve as a backup during service outages and peak demand periods. This technique may be particularly useful for cellular networks, as part of a strategy to clear non-emergency voice and data traffic from the primary network when necessary, and to provide reliable text-only communication in the event of a widespread outage or emergency situation.

In addition, such a system could also prove useful outside of disasters and wartime. A decentralized text messaging network could reduce the need for terrestrial facilities, or at least be used to queue and reroute messages around isolated equipment failures (a common event, compared to large-scale outages). This approach could also be used to create location-based messaging services, since phones and pagers could infer the distance to nearby users based on signal strength, number of hops, and so on. Besides improving overall reliability, it should encourage the use of text messaging, which is becoming an important source of revenue for carriers, especially among younger users.

Most interesting, perhaps, is the minimal cost of creating such a system, compared to other telecom infrastructure projects. All of the infrastructure hardware needed to provide this capability is already in place. Creating such a distributed messaging system is primarily a matter of defining standards and writing new software based on those standards. It should not require a massive investment in new facilities, such as switches. The biggest obstacle to creating such a system is bureaucracy, both within the carriers and in standards agencies.

In fact, something very similar to this system already exists. Cybiko has already created a peer-to-peer instant-messaging system and PDA aimed at the youth market. This device operates independently of cellular telephone networks, and uses the same communication frequencies as cordless telephones. The system allows for line-of-sight transmission, and for one-hop message forwarding. Although it does not interact with cellular networks, and does not have a Gnutella-like client discovery mechanism, the system offers a glimpse of what the future holds, since it would be fairly straightforward to meld something like Cybiko with the text/instant messaging services offered by cellular and paging service providers.

If used in conjunction with other low-cost strategies for controlling the use of the telephone network (such as forcing cellphones to block redialing attempts to non-emergency numbers, or encouraging users to send voicemail or voice email in lieu of live calls), carriers can build wireless communication systems that can stand up to the worst sort of abuse from Mother Nature, enemies, and their own customers. Hopefully we won't be needing this any time soon, but you never know when the next natural or man-made disaster will arise.

It's something to ponder when you live two miles from the San Andreas fault.

Brian McConnell is an inventor, author, and serial telecom entrepreneur. He has founded three telecom startups since moving to California. The most recent, Open Communication Systems, designs cutting-edge telecom applications based on open standards telephony technology.


Copyright © 2007 O'Reilly Media, Inc.