Fast Prototyping of Telephony Applications with YATE
by Maciek Kaminski, 06/27/2006
So you have an idea for a novel, non-trivial telephony application? If it is truly novel, you will probably have to extend your favorite open source PBX to try it. When implementing your idea, if you are really lucky and/or a perfect coder, you'll get it right on your first try. However, the rest of us will have to go through a series of prototypes.
At the moment, the most common way of developing non-trivial open source telephony extensions is to code them directly in C/C++. Transmitting, receiving, and processing voice data generates thousands of events per second, and it is often most efficient to implement such routines in a low-level language such as C/C++.
However, when prototyping, it is often more desirable to work in a higher-level language. Experiments are more easily implemented in agile languages with extensive standard libraries that integrate seamlessly with databases, email, HTTP, etc. Also, for most applications, the limiting factor is user interaction, so in many situations the application only needs to be as fast as a human.
In this article, I will present the YATE project (Yet Another Telephony Engine). YATE's API boundaries separate the parts of a telephony application that have to be "fast" from those that have to be just "fast enough." As a result, YATE lets developers write scripts in higher-level languages while the performance-critical work stays in native code, without sacrificing too much efficiency.
Architecture
The YATE architecture owes a lot to the concept of a microkernel. Its core provides a minimal number of concepts and functions, delegating implementation of other functionality to modules. To communicate with each other, modules exchange messages.
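To make the microkernel idea concrete, here is a minimal Python sketch of a core that does nothing except store handlers and pass messages to them, with all behavior living in modules. It is not YATE's API: the names `Core`, `install`, `dispatch`, `AuditModule`, `LookupModule`, and the message name `example.lookup` are all hypothetical, and the numeric handler priority is an assumption of this sketch.

```python
from typing import Callable, Dict, List, Tuple

# A handler receives the message parameters and returns True if it
# has handled the message.
Handler = Callable[[Dict[str, str]], bool]


class Core:
    """Minimal core: it stores handlers and passes messages to them."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Tuple[int, Handler]]] = {}

    def install(self, name: str, handler: Handler, priority: int = 100) -> None:
        entries = self._handlers.setdefault(name, [])
        entries.append((priority, handler))
        # Assumption of this sketch: a lower priority value runs first.
        entries.sort(key=lambda entry: entry[0])

    def dispatch(self, name: str, params: Dict[str, str]) -> bool:
        # Offer the message to each handler until one claims it.
        for _priority, handler in self._handlers.get(name, []):
            if handler(params):
                return True
        return False


class AuditModule:
    """Hypothetical module that observes messages but never claims them."""

    def __init__(self, core: Core) -> None:
        self.seen: List[str] = []
        core.install("example.lookup", self.on_lookup, priority=10)

    def on_lookup(self, params: Dict[str, str]) -> bool:
        self.seen.append(params.get("key", ""))
        return False  # let later handlers see the message too


class LookupModule:
    """Hypothetical module that answers the message it understands."""

    def __init__(self, core: Core) -> None:
        core.install("example.lookup", self.on_lookup)

    def on_lookup(self, params: Dict[str, str]) -> bool:
        if params.get("key") != "answer":
            return False
        params["result"] = "42"
        return True
```

Neither module knows about the other. They share only a message name and its parameters, which is what allows a module to be swapped out or reimplemented in another language.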
A message contains four pieces of data (in order): a type, a list of character attributes, a return value, and one binary attribute. The binary attribute is used to carry a CallEndPoint, a concept that facilitates the manipulation of "media wiring." A CallEndPoint is simply a set of DataEndPoints (at least one for each kind of media: audio, video, etc.), each of which can be connected to and disconnected from another (see Figure 1). A DataEndPoint comprises an incoming DataConsumer and an outgoing DataSource. When two CallEndPoints are connected, their corresponding DataEndPoints are connected, which means that the DataConsumer of DataEndPoint A gets connected to the DataSource of DataEndPoint B, and vice versa. If the source and consumer formats do not match, translators are inserted between them automatically.

Figure 1. CallEndPoints
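The wiring described above can be sketched as a small data model. This is an illustration in Python rather than YATE's C++ classes: the attribute and method names (`params`, `retvalue`, `user_data`, `attach`, `connect`, `needs_translator`) and the format strings in the usage note are assumptions made for the example. Instead of inserting a real translator, the sketch only flags where one would be needed.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DataSource:
    format: str


@dataclass
class DataConsumer:
    format: str
    source: Optional[DataSource] = None  # the source currently feeding us
    needs_translator: bool = False       # True when the formats differ

    def attach(self, source: DataSource) -> None:
        self.source = source
        self.needs_translator = source.format != self.format

    def detach(self) -> None:
        self.source = None
        self.needs_translator = False


class DataEndPoint:
    """One media stream: an incoming consumer plus an outgoing source."""

    def __init__(self, fmt: str) -> None:
        self.consumer = DataConsumer(fmt)
        self.source = DataSource(fmt)


class CallEndPoint:
    """A set of DataEndPoints keyed by media kind ("audio", "video", ...)."""

    def __init__(self, formats: Dict[str, str]) -> None:
        self.data = {media: DataEndPoint(fmt) for media, fmt in formats.items()}
        self.peer: Optional["CallEndPoint"] = None

    def connect(self, other: "CallEndPoint") -> None:
        # Drop any existing wiring before creating the new one.
        self.disconnect()
        other.disconnect()
        self.peer, other.peer = other, self
        # Cross-connect every media kind both sides have.
        for media in self.data.keys() & other.data.keys():
            a, b = self.data[media], other.data[media]
            a.consumer.attach(b.source)  # A receives what B sends
            b.consumer.attach(a.source)  # B receives what A sends

    def disconnect(self) -> None:
        peer = self.peer
        if peer is None:
            return
        for endpoint in list(self.data.values()) + list(peer.data.values()):
            endpoint.consumer.detach()
        self.peer = peer.peer = None


@dataclass
class Message:
    """The four parts of a message, in the order given in the text."""
    name: str                                              # type
    params: Dict[str, str] = field(default_factory=dict)   # character attributes
    retvalue: str = ""                                     # return value
    user_data: Optional[CallEndPoint] = None               # binary attribute
```

For example, connecting `CallEndPoint({"audio": "alaw"})` to `CallEndPoint({"audio": "mulaw"})` leaves each audio consumer attached to the other side's source, with `needs_translator` set on both because the formats differ.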
Message Flows
Since this article is venturing into some of the more abstract parts of YATE, it may help to look at a concrete example where modules exchange messages in order to set up a call (Figure 2):

Figure 2. Message flow
What is going on here? There are two channel modules, SIP and ZAP, as well as a routing module. All three modules cooperate to handle an incoming call:
- An incoming call comes into SipChannel (1). To determine where to direct it, SipChannel sends a call.route message to RoutingModule (2).
- RoutingModule handles call.route (3) by mapping the called attribute (1234) to a call target (zap/1).
- Now SipChannel creates a CallEndPoint for the incoming call and adds it to a new call.execute message, which it sends to ZapChannel (4).
- ZapChannel creates a new CallEndPoint and connects it to the CallEndPoint previously created by SipChannel. It then returns the message (5) and tries to call the destination. While waiting for the destination to answer, it may send call.ringing (6).
- When the callee answers the call, ZapChannel sends a call.answered message (7) to SipChannel.
Once the call is set up, media data flows between sources and consumers. DTMF events may be sent in both directions (8)(9). When one of the participating channels detects a hangup (10), it disconnects its CallEndPoint. That results in a chan.disconnected and eventually a call.hangup message being sent by both channels (not pictured).
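The call setup steps above can be traced with a toy simulation. The sketch below is not YATE code: the `Bus` class, the one-handler-per-message dispatch, the hard-coded routing table, and the parameter names `id` and `callto` are assumptions made for illustration (`called`, `1234`, and `zap/1` come from the example). CallEndPoint creation, media, DTMF, and hangup are left out.

```python
from typing import Callable, Dict, List


class Message:
    """A named message with string parameters and a return value."""

    def __init__(self, name: str, **params: str) -> None:
        self.name = name
        self.params: Dict[str, str] = dict(params)
        self.retvalue = ""


class Bus:
    """Toy stand-in for the engine: one handler per message name."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[[Message], bool]] = {}
        self.pending: List[Message] = []  # messages to be sent later
        self.trace: List[str] = []        # message names in dispatch order

    def dispatch(self, msg: Message) -> bool:
        self.trace.append(msg.name)
        handler = self.handlers.get(msg.name)
        return handler(msg) if handler else False

    def run_pending(self) -> None:
        while self.pending:
            self.dispatch(self.pending.pop(0))


def make_bus() -> Bus:
    bus = Bus()

    def route(msg: Message) -> bool:
        # RoutingModule: map the called number to a call target.
        target = {"1234": "zap/1"}.get(msg.params.get("called", ""))
        if target is None:
            return False
        msg.retvalue = target
        return True

    def execute(msg: Message) -> bool:
        # ZapChannel: accept zap/ targets, then report progress later.
        callto = msg.params.get("callto", "")
        if not callto.startswith("zap/"):
            return False
        # A real channel would create and connect a CallEndPoint here.
        bus.pending.append(Message("call.ringing", id=callto))
        bus.pending.append(Message("call.answered", id=callto))
        return True

    bus.handlers["call.route"] = route
    bus.handlers["call.execute"] = execute
    return bus


def incoming_sip_call(bus: Bus, called: str) -> bool:
    """SipChannel side: route the call, then execute it."""
    route = Message("call.route", id="sip/1", called=called)
    if not bus.dispatch(route):
        return False  # no route found
    execute = Message("call.execute", id="sip/1", callto=route.retvalue)
    if not bus.dispatch(execute):
        return False
    bus.run_pending()  # ringing and answered arrive after execute returns
    return True
```

Calling `incoming_sip_call(bus, "1234")` on a fresh `make_bus()` leaves `bus.trace` holding `call.route`, `call.execute`, `call.ringing`, and `call.answered`, in that order. An unknown number stops after `call.route`.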
Modules
Besides SIP and ZAP, YATE has a few other modules that provide either VoIP or ISDN channels: h323, iax, and wanpipe. There are also pseudo channel modules that provide additional functions. For example, wave plays and records wave files; tone plays tones; moh plays "music on hold"; festival does text-to-speech (TTS) with the festival speech synthesis system; conference is used for conference calls; etc.
Apart from channel modules that handle media traffic, there are modules that handle other telephony tasks, such as routing, call detail records (cdr), and user authorization.
One module that makes YATE a great prototyping framework is extmodule. It implements the YATE external protocol, a text-based protocol that allows message exchange via TCP. I won't go into the protocol details here, as they are beyond the scope of this article, but there is an official YATE wiki page dedicated to extmodule. However, I will present examples written in Python to show how flexible YATE scripting via the external protocol can be.



