An Introduction to VoIP and VOCAL

Data Types

In its most basic form, a Voice over IP system is a set of data combined with the capacity to process calls. There is persistent data, such as the provisioning databases of users and server configurations, as well as the dynamic data that is potentially different for each call. The call control servers handle the following types of data:

Registration: When a user connects to his or her service provider, the system needs to add the user's address to a list of active endpoints.

Security: The system requires a perimeter to allow qualified users in and keep intruders out. Security is multifaceted and includes, for example:

  • Authentication: The system needs to ensure that the connecting users are who they say they are, that the contents of the message have not been modified, and that no one else could have sent the same message.
  • Call admission: The system must determine the types of calling that qualified subscribers are permitted to use.
  • Routing: One server needs to know who the call is for: whether the called party is a local subscriber, where to send off-network calls, and how to route the call through the system with respect to features and final destination.
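The call admission and routing decisions listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `route_call`, the subscriber table, the call-type labels `"local"` and `"off-network"`, and the gateway address are all hypothetical names, not part of VOCAL or of any SIP specification. Authentication is reduced here to a simple table lookup.

```python
def route_call(caller, called, subscribers, off_network_gateway):
    """Return (decision, next_hop) for a call attempt.

    subscribers maps each known user address to the set of call types
    that user may place. All names here are hypothetical.
    """
    permitted = subscribers.get(caller)
    if permitted is None:
        # Unknown caller: stands in for a failed authentication check.
        return ("reject", None)

    # Routing question 1: is the called party a local subscriber?
    call_type = "local" if called in subscribers else "off-network"

    # Call admission: is this subscriber permitted to place this type of call?
    if call_type not in permitted:
        return ("reject", None)

    # Routing question 2: where does the call go next?
    next_hop = called if call_type == "local" else off_network_gateway
    return ("forward", next_hop)


# Hypothetical provisioning data.
SUBSCRIBERS = {
    "sip:alice@example.org": {"local", "off-network"},
    "sip:bob@example.org": {"local"},
}
GATEWAY = "sip:gateway.example.org"

print(route_call("sip:alice@example.org", "sip:bob@example.org", SUBSCRIBERS, GATEWAY))
print(route_call("sip:bob@example.org", "sip:+15551234@example.net", SUBSCRIBERS, GATEWAY))
```

A real system would also take features such as call forwarding into account before choosing the next hop; that step is left out to keep the sketch short.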

Features: Phone users are used to working with a variety of features, including voice mail, call forwarding, and call blocking. VoIP has long been touted as a possible source for advanced features that are impossible to implement in the PSTN.

Billing: Although this feature is not important for a test or hobby system, commercial VoIP applications that require billing are growing in size and number.

Policy: How do you work with other VoIP networks? How do you share billing and allow access to calls coming from known or unknown systems? These issues are handled by the Policy server.

Having set forth our goals and a generic architecture, the next phase in our planning was evaluating the different protocol stacks available to us.

VoIP Protocol Stacks

In 1997, the only VoIP protocol with any following was H.323, a specification created by the International Telecommunication Union (ITU) for the transport of call signaling over networks. By 1999, when we started Vovida Networks, there were two new options: Media Gateway Control Protocol (MGCP) and SIP.

Then, as now, each protocol had its own set of advantages. H.323, being the first widely available VoIP protocol, enjoyed a head start as developers implemented it in toll-bypass systems as well as in PC-to-phone and video-conferencing applications. The best-known H.323 application was Microsoft NetMeeting.

MGCP is well suited for centralized systems that work with dumb endpoints, such as analog phones. The most celebrated use of MGCP is for high-capacity gateways designed to work with traditional telecom equipment. There is also momentum building for a replacement to MGCP called MEGACO/H.248.

SIP is an easy-to-use protocol that enables developers to push the intelligence to the edge of the networks, implement a distributed architecture, and create advanced features.

We chose to base VOCAL on SIP because it suited our needs for rapid development, and we liked its similarities to Hypertext Transfer Protocol (HTTP, RFC 2616) and Simple Mail Transfer Protocol (SMTP, RFC 2821). At the same time, we provided translating endpoints to help us include H.323 and MGCP developers in our community.
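The similarity to HTTP is easiest to see in the message format: a SIP request is plain text with a start line, a set of `Name: value` headers, a blank line, and an optional body. The following minimal Python sketch splits such a message into its parts. The function name `parse_sip_request` and the example message (addresses, tags, and Call-ID) are assumptions made for illustration. The sketch also ignores details a real parser must handle, such as repeated headers, compact header names, and line folding.

```python
def parse_sip_request(raw):
    """Split a SIP request into (method, request_uri, version, headers, body).

    Simplification: if a header name appears more than once, only the
    last value is kept.
    """
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # Start line, e.g. "INVITE sip:bob@example.com SIP/2.0"
    method, request_uri, version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, request_uri, version, headers, body


# Hypothetical example message; all addresses and identifiers are made up.
EXAMPLE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc1.example.org;branch=z9hG4bK776asdhds\r\n"
    "From: Alice <sip:alice@example.org>;tag=1928301774\r\n"
    "To: Bob <sip:bob@example.com>\r\n"
    "Call-ID: a84b4c76e66710@pc1.example.org\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

method, uri, version, headers, body = parse_sip_request(EXAMPLE)
print(method, uri, version)
print(headers["cseq"])
```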

Looking at the different organizations present at recent trade shows such as Voice on the Net (VON), we have seen more and more implementations of VoIP using SIP. One example is Microsoft's announcement that it will drop its H.323-based NetMeeting product in favor of Messenger, a SIP application that integrates voice, video, application sharing, and instant messaging and runs on Microsoft's operating system. Also, 3G Wireless, the new cellular phone standard from the ITU, has chosen SIP as its VoIP protocol.

Having chosen SIP, let's look at how the standard describes the roles that different server types play within call processing and then how we implemented our requirements into a SIP-based system.

SIP Architecture Components

RFC 3261 describes the components that are required to develop a SIP-based network. In many implementations, some of these components are combined into the same software modules. As you might suspect, there are also many different ways to achieve the same results. Some implementations may duplicate some components to enable more options for interoperability with other systems.

SIP user agents

RFC 3261 defines the telephony devices as user agents (UAs), which are combinations of user agent clients (UACs) and user agent servers (UASs). The UAC is the only entity on a SIP-based network that is permitted to create an original request. The UAS is one of many server types that are capable of receiving requests and sending back responses. Normally, UAs are discussed without any distinction made between their UAC and UAS components.

SIP UAs can be implemented in hardware such as IP phones and gateways or in software such as softphones running on the user's computer. It is possible for two user agents to make SIP calls to each other with no other software components.
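The two halves of a user agent, and the fact that two UAs can talk to each other directly, can be illustrated with a small in-memory Python model. This is a sketch under stated assumptions, not a SIP stack. The class `UserAgent`, its method names, the dictionary message layout, and the `ua.invalid` Call-ID suffix are all hypothetical, and no network transport is involved. The one protocol detail the sketch relies on is that a response carries the From, To, Call-ID, and CSeq values of the request it answers. A real UAS also copies Via and adds a tag to the To header, both of which are omitted here.

```python
class UserAgent:
    """Toy user agent with a client side (UAC) and a server side (UAS)."""

    def __init__(self, address):
        self.address = address
        self._cseq = 0

    def create_request(self, method, target):
        """UAC role: originate a new request toward target."""
        self._cseq += 1
        return {
            "method": method,
            "uri": target,
            "headers": {
                "From": f"<{self.address}>",
                "To": f"<{target}>",
                "Call-ID": f"call-{self._cseq}@ua.invalid",
                "CSeq": f"{self._cseq} {method}",
            },
        }

    def handle_request(self, request):
        """UAS role: receive a request and send back a response."""
        if request["uri"] == self.address:
            status, reason = 200, "OK"
        else:
            status, reason = 404, "Not Found"
        # The response echoes the identifying headers of the request.
        return {"status": status, "reason": reason,
                "headers": dict(request["headers"])}


# Two user agents exchanging a request directly, with no server in between.
alice = UserAgent("sip:alice@example.org")
bob = UserAgent("sip:bob@example.com")

request = alice.create_request("OPTIONS", bob.address)   # Alice acts as UAC
response = bob.handle_request(request)                    # Bob acts as UAS
print(response["status"], response["reason"], response["headers"]["CSeq"])
```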

SIP servers

Even though the UA contains a server component, when most developers talk about SIP servers, they are referring to server roles usually played by centralized hosts on a distributed network. Here is a description of the four types of SIP servers that are discussed in the RFC:

  • Location server
    Used by a Redirect server or a Proxy to obtain information about a called party's possible location.

  • Proxy server
    Also referred to as a Proxy. An intermediary program that acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally or transferred to other servers. A proxy interprets and, if necessary, rewrites a request message before forwarding it.

  • Redirect server
    An entity that accepts a SIP request, maps the address into zero or more new addresses, and returns these addresses to the client. Unlike a Proxy, it cannot accept calls but can generate SIP responses that instruct the UAC to contact another SIP entity.

  • Registrar server
    A server that accepts REGISTER requests. A registrar is typically colocated with a Proxy or Redirect server and may offer location services. The Registrar saves information about where a party can be found.

In VOCAL, the SIP Location, Redirect, and Registrar servers are combined into a single server called the VOCAL Redirect server. SIP servers can provide a security function by authenticating users before permitting their messages to flow through the network. Frequently, all four server types are included in one implementation. Proxies can also provide features such as Call Forward No Answer (CFNA).
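To make the division of labor among the Registrar, Location, and Redirect roles concrete, here is a minimal Python sketch that keeps all three in one process. It is not VOCAL's implementation. The class `LocationStore` and the functions `handle_register` and `handle_redirect` are hypothetical names, and registration expiry and authentication are left out. The status codes used (200 OK, 302 Moved Temporarily, 404 Not Found) are standard SIP response codes.

```python
class LocationStore:
    """Location service: maps a user's public address to contact addresses."""

    def __init__(self):
        self._bindings = {}

    def add(self, address, contact):
        contacts = self._bindings.setdefault(address, [])
        if contact not in contacts:
            contacts.append(contact)

    def lookup(self, address):
        # Return a copy so callers cannot modify the stored bindings.
        return list(self._bindings.get(address, []))


def handle_register(store, address, contact):
    """Registrar role: save where a party can be found."""
    store.add(address, contact)
    return 200, "OK", []


def handle_redirect(store, address):
    """Redirect role: map the address to zero or more new addresses."""
    contacts = store.lookup(address)
    if contacts:
        return 302, "Moved Temporarily", contacts
    return 404, "Not Found", []


# Hypothetical addresses for illustration.
store = LocationStore()
handle_register(store, "sip:bob@example.com", "sip:bob@192.0.2.10:5060")
handle_register(store, "sip:bob@example.com", "sip:bob@192.0.2.44:5060")

print(handle_redirect(store, "sip:bob@example.com"))
print(handle_redirect(store, "sip:carol@example.com"))
```

The caller that receives the 302 result is expected to retry the request at one of the returned contacts, which is the behavior the Redirect server description above attributes to the UAC.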

All the VOCAL components are revisited in detail in our book Practical VoIP using VOCAL.

Luan Dang is Director of Software Development at Cisco Systems.

Cullen Jennings is Senior Manager of Voice Signaling Software at Cisco Systems.

David G. Kelly is an experienced technical writer who joined Cisco as part of the Vovida Networks acquisition in 2000.
