O'Reilly Network    
 Published on O'Reilly Network (http://www.oreillynet.com/)

Practical VoIP Using VOCAL

An Introduction to VoIP and VOCAL

by Luan Dang, Cullen Jennings and David G. Kelly, authors of Practical VoIP Using VOCAL

For years, software has been available for making "free" long-distance calls between workstations over the Internet. The early versions of this software provided poor quality, but users were willing to suffer packet loss, jitter, and latency in return for bypassing normal long-distance toll charges. Today, users can choose from a large variety of Voice over IP (VoIP) software packages. Improvements in bandwidth and the processing speeds of home PCs have enabled practical conversations through VoIP devices.

In North America and Europe, the novelty of "free" long-distance calling has faded as the major telecom carriers have reduced their toll charges. For example, in the United States, cell phone providers are presently bundling free long-distance calling with their regular services. By itself, a perceived advantage in long-distance charges will not be enough to convince individuals and organizations in these countries to replace their traditional phone systems with IP-based systems. Pundits currently argue that before IP phone systems become widely accepted in developed markets, they will need advanced, integrated features of the type that are practically impossible to implement in traditional private branch exchanges (PBXs) and central offices (COs).

In other regions of the globe, the intersection of costs and features will play an important role in the adoption of VoIP. Some areas suffer from both crippling poverty and extravagant import duties imposed on communication equipment. In these countries, there are service providers who will do everything possible to bring low-cost systems into their market space. In other areas, landlines may be scarce, but capital and technical expertise are available to build new phone systems. Many of these new ventures are looking toward VoIP as a flexible solution that enables deploying phone services to millions of subscribers quickly.

The nature of VoIP, also known as packet telephony, permits the type of advanced features that will win over new users. As anyone who uses the World Wide Web knows, packets running over an IP network can deliver text, pictures, and audio and video content. The PC is becoming a redundant tool for Internet access with the advent of personal digital assistants (PDAs), cell phones, and other portable devices that provide access to email and other web content. Unlike the public switched telephone network (PSTN), the Internet is decentralized and permits smart endpoints. Someday, the concept of making a phone call may be made obsolete by the concept of simply being in touch with people through a variety of smart IP-based devices.

Advances in packet telephony could also lead to new forms of virtual offices that would seem alien to our current telecommuting practices. Our descendants could know an enhanced mode of long-distance communication in which body language, along with the other 90% of communication that is lost over audio-only devices, is transmitted and received intact. This could make face-to-face meetings a rare novelty. What this might do to city planning, traffic jams, and, indeed, our lifestyles is a worthy subject for another book.


VOCAL (the Vovida Open Communication Application Library) is an open source software project that provides call control, routing, media, policy, billing information, and provisioning on a system that can range from a single box in a lab with a few test phones to a large, multi-host, carrier-grade network supporting hundreds of thousands of users. VOCAL is freely available from the Cisco Systems-sponsored Vovida.org community Web site.

When we started designing VOCAL, we had three primary goals in mind:

System Architecture

A distributed architecture suited our aim to open source VOCAL because it provided components that developers from the community could build upon or build into their own projects. To scale the system, we gave one type of server, which became known as the Marshal server, the task of being the single point of contact for subscribers, and we allowed duplicates of this server to be added as the subscriber population grew. Our original idea was to achieve load balancing by assigning each additional Marshal server a specific population of subscribers. Our original plan also called for a multi-host system with redundancy for all call control servers to avoid a single point of failure.
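
To make the load-balancing idea concrete, here is a minimal Python sketch of assigning each Marshal server its own population of subscribers. It is not taken from VOCAL: the host names, the numeric subscriber ranges, and the function name marshal_for are all assumptions chosen for illustration.

```python
from bisect import bisect_right

# Hypothetical static partition: each Marshal serves a contiguous range of
# subscriber numbers. Host names and ranges are illustrative, not from VOCAL.
MARSHAL_RANGES = [
    (1000, "marshal-1.example.com"),  # serves 1000-1999
    (2000, "marshal-2.example.com"),  # serves 2000-2999
    (3000, "marshal-3.example.com"),  # serves 3000 and up
]


def marshal_for(subscriber: int) -> str:
    """Return the Marshal host responsible for a subscriber number."""
    starts = [start for start, _ in MARSHAL_RANGES]
    # Find the last range whose starting number is <= subscriber.
    index = bisect_right(starts, subscriber) - 1
    if index < 0:
        raise ValueError(f"no Marshal serves subscriber {subscriber}")
    return MARSHAL_RANGES[index][1]


if __name__ == "__main__":
    print(marshal_for(2150))
```

Under this scheme, growing the subscriber base means appending a new range and host to the table, which leaves existing subscribers on the servers they already use.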

Data Types

In its most basic form, a Voice over IP system is a set of data combined with the capacity to process calls. There is persistent data, such as the provisioning databases of users and server configurations, as well as the dynamic data that is potentially different for each call. The call control servers handle the following types of data:

Registration: When a user connects to a service provider, the system needs to add the user's address to a list of active endpoints.

Security: The system requires a perimeter to allow qualified users in and keep intruders out. Security is multifaceted and includes, for example:

Features: Phone users are used to working with a variety of features, including voice mail, call forwarding, and call blocking. VoIP has long been touted as a possible source for advanced features that are impossible to implement in the PSTN.

Billing: Although this feature is not important for a test or hobby system, commercial VoIP applications that require billing are growing in size and population.

Policy: How do you work with other VoIP networks? How do you share billing and allow access to calls coming from known or unknown systems? These issues are handled by the Policy server.
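
Of the data types above, registration is the easiest to picture in code. The following Python sketch keeps a list of active endpoints in memory and lets each entry expire. The class name, method names, and default lifetime are assumptions made for this example; they do not describe how VOCAL stores registrations.

```python
import time


class RegistrationTable:
    """Hypothetical in-memory list of active endpoints, keyed by user address."""

    def __init__(self):
        # user address -> (contact address, expiry timestamp in seconds)
        self._bindings = {}

    def register(self, user, contact, expires=3600, now=None):
        """Record where a user can currently be reached."""
        now = time.time() if now is None else now
        self._bindings[user] = (contact, now + expires)

    def lookup(self, user, now=None):
        """Return the user's contact address, or None if absent or expired."""
        now = time.time() if now is None else now
        binding = self._bindings.get(user)
        if binding is None or binding[1] <= now:
            # Drop stale entries so the list only holds active endpoints.
            self._bindings.pop(user, None)
            return None
        return binding[0]


if __name__ == "__main__":
    table = RegistrationTable()
    table.register("sip:alice@example.com", "sip:alice@192.0.2.10:5060")
    print(table.lookup("sip:alice@example.com"))
```

The optional now argument exists only so that expiry can be exercised without waiting for real time to pass.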

Having set forth our goals and a generic architecture, the next phase in our planning was evaluating the different protocol stacks available to us.

VoIP Protocol Stacks

In 1997, the only VoIP protocol with any following was H.323, a specification created by the International Telecommunication Union (ITU) for the transport of call signaling over networks. By 1999, when we started Vovida Networks, there were two new options: Media Gateway Control Protocol (MGCP) and Session Initiation Protocol (SIP).

Then, as now, each protocol had its own set of advantages. H.323, being the first widely available VoIP protocol, enjoyed a head start as developers implemented it in toll-bypass systems as well as in PC-to-phone and video-conferencing applications. The best-known H.323 application was Microsoft NetMeeting.

MGCP is well suited for centralized systems that work with dumb endpoints, such as analog phones. The most celebrated use of MGCP is for high-capacity gateways designed to work with traditional telecom equipment. There is also momentum building for a replacement to MGCP called MEGACO/H.248.

SIP is an easy-to-use protocol that enables developers to push the intelligence to the edge of the networks, implement a distributed architecture, and create advanced features.

We chose to base VOCAL on SIP because it suited our needs for rapid development, and we liked its similarities to Hypertext Transfer Protocol (HTTP, RFC 2616) and Simple Mail Transfer Protocol (SMTP, RFC 2821). At the same time, we provided translating endpoints to help us include H.323 and MGCP developers in our community.
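
The resemblance to HTTP is easiest to see in the messages themselves: a SIP request is plain text with a start line, a set of headers, and a blank line before the body. The Python sketch below assembles a bare INVITE and splits its start line the way an HTTP request line would be split. The addresses, tag, and branch value are placeholders, the function names are our own, and a real INVITE would normally carry a session description in its body, which is omitted here.

```python
def build_invite(caller, callee, call_id, host):
    """Assemble a minimal, body-less SIP INVITE as text (illustrative only)."""
    lines = [
        f"INVITE {callee} SIP/2.0",                         # start line, as in HTTP
        f"Via: SIP/2.0/UDP {host};branch=z9hG4bK-example",  # placeholder branch
        "Max-Forwards: 70",
        f"From: <{caller}>;tag=1234",                       # placeholder tag
        f"To: <{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <{caller}>",
        "Content-Length: 0",
    ]
    # Headers end with an empty line, exactly as in HTTP and SMTP.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_start_line(message):
    """Split the request line into (method, request URI, version)."""
    method, uri, version = message.split("\r\n", 1)[0].split(" ", 2)
    return method, uri, version


if __name__ == "__main__":
    msg = build_invite("sip:alice@example.com", "sip:bob@example.com",
                       "abc123@pc.example.com", "pc.example.com")
    print(msg)
    print(parse_start_line(msg))
```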

Looking at the different organizations present at recent trade shows such as Voice on the Net (VON), we have seen more and more implementations of VoIP using SIP. One example is Microsoft announcing its decision to drop its H.323-based NetMeeting product in favor of Messenger, a SIP application that integrates voice, video, application sharing, and instant messaging and runs on Microsoft's operating system. Also, 3G Wireless, the new cellular phone standard from the ITU, has chosen SIP as its VoIP protocol.

Having chosen SIP, let's look at how the standard describes the roles that different server types play within call processing and then how we implemented our requirements into a SIP-based system.

SIP Architecture Components

RFC 3261 describes the components that are required to develop a SIP-based network. In many implementations, some of these components are combined into the same software modules. As you might suspect, there are also many different ways to achieve the same results. Some implementations may duplicate some components to enable more options for interoperability with other systems.

SIP user agents

RFC 3261 defines the telephony devices as user agents (UAs), which are combinations of user agent clients (UACs) and user agent servers (UASs). The UAC is the only entity on a SIP-based network that is permitted to create an original request. The UAS is one of many server types that are capable of receiving requests and sending back responses. Normally, UAs are discussed without any distinction made between their UAC and UAS components.

SIP UAs can be implemented in hardware such as IP phones and gateways or in software such as softphones running on the user's computer. It is possible for two user agents to make SIP calls to each other with no other software components.
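
The split between the two roles can be sketched in a few lines of Python. In this toy model, one object plays both parts: its create_request method stands in for the UAC and its handle_request method stands in for the UAS. Messages are plain dictionaries rather than real SIP text, nothing travels over a network, and every name is an assumption made for illustration.

```python
class UserAgent:
    """Toy user agent that combines a UAC role and a UAS role."""

    def __init__(self, address):
        self.address = address
        self.busy = False

    def create_request(self, method, target):
        """UAC role: originate a new request."""
        return {"method": method, "uri": target, "from": self.address}

    def handle_request(self, request):
        """UAS role: receive a request and send back a response."""
        if request["uri"] != self.address:
            return {"status": 404, "reason": "Not Found"}
        if request["method"] == "INVITE" and self.busy:
            return {"status": 486, "reason": "Busy Here"}
        return {"status": 200, "reason": "OK"}


if __name__ == "__main__":
    alice = UserAgent("sip:alice@example.com")
    bob = UserAgent("sip:bob@example.com")
    # Alice's UAC side calls Bob's UAS side directly, with no server between.
    invite = alice.create_request("INVITE", bob.address)
    print(bob.handle_request(invite))
```

Because each object carries both roles, either party can place the call, which is what allows two user agents to reach each other without any other component.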

SIP servers

Even though the UA contains a server component, when most developers talk about SIP servers, they are referring to server roles usually played by centralized hosts on a distributed network. The four types of SIP servers that are discussed in the RFC are Location, Proxy, Redirect, and Registrar servers.

In VOCAL, the SIP Location, Redirect, and Registrar servers are combined together into a single server called the VOCAL Redirect server. SIP servers can provide a security function by authenticating users before permitting their messages to flow through the network. Frequently, all four server types are included in one implementation. Proxies can also provide features such as Call Forward No Answer (CFNA).
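
As an illustration of a proxy-provided feature, the following Python sketch reduces Call Forward No Answer to its decision logic. The ring callable is a hypothetical stand-in for everything a real proxy would do to alert a target and wait for an answer; the function name, the 20-second default, and the addresses are likewise assumptions and do not reflect VOCAL's implementation.

```python
def route_with_cfna(ring, primary, forward_to, timeout=20):
    """Try the primary target, then the forwarding target if nobody answers.

    ring(target, timeout) is a hypothetical callable that returns True when
    the target answers within timeout seconds. Returns the address that
    answered, or None if neither target did.
    """
    if ring(primary, timeout):
        return primary
    if forward_to is not None and ring(forward_to, timeout):
        return forward_to
    return None


if __name__ == "__main__":
    # Simulated network: only the voice mail address ever answers.
    answering = {"sip:bob-voicemail@example.com"}

    def ring(target, timeout):
        return target in answering

    print(route_with_cfna(ring, "sip:bob@example.com",
                          "sip:bob-voicemail@example.com"))
```

Keeping this logic in the proxy means the caller's and callee's user agents need no knowledge of the forwarding rule.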

All the VOCAL components are revisited in detail in our book Practical VoIP Using VOCAL.

Luan Dang is Director of Software Development at Cisco Systems.

Cullen Jennings is Senior Manager of Voice Signaling Software at Cisco Systems.

David G. Kelly is an experienced technical writer who joined Cisco as part of the Vovida Networks acquisition in 2000.

Copyright © 2009 O'Reilly Media, Inc.