People are right to worry about SPIT (spam over Internet telephony). Once it becomes possible for people to make SIP calls from one network to another without a pre-existing peering relationship, it becomes possible for malicious users to start flooding those networks with automated calls.

There is, however, a simple solution that allows VoIP network providers to strike a reasonable compromise between openness (e.g. the ability for anybody to dial user@voipprovider.com, just as they might send an email via SMTP), and reasonable security measures to thwart automatic dialing.

One simple trick that providers can implement is to force callers to respond to a voice prompt like “To complete this call, dial 1 (random noise) 2 (random noise) 5 (random noise).” The goal is to exploit the limitations of automated speech recognition so that a bot cannot get past this IVR challenge question. The IVR plays a slightly different sentence each time, so it’s not obvious where the spoken digits begin, and it intermixes the spoken digits with background noise that will confuse a computer. It’s the same basic idea as prompting a user to transcribe distorted text.
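As a rough illustration, the challenge described above could be assembled along these lines. This is only a sketch: the prompt wordings, the noise clip file names, and the functions `make_challenge` and `check_response` are all hypothetical, and a real IVR would still need to turn the resulting playlist into audio and collect the caller's keypad (DTMF) input.

```python
import secrets

# Hypothetical prompt wordings and noise clip names (assumptions for this
# sketch); a real IVR would map them to recorded or synthesized audio.
PREAMBLES = [
    "To complete this call, dial",
    "Before we connect you, please press",
    "To reach this user, enter the digits",
]
NOISE_CLIPS = ["noise_cafe.wav", "noise_static.wav", "noise_traffic.wav"]


def make_challenge(num_digits=3):
    """Return (expected_digits, playlist) for one voice captcha attempt."""
    # Pick the digits the caller must key in.
    digits = "".join(secrets.choice("0123456789") for _ in range(num_digits))

    # Vary the opening sentence so the digits do not start at a fixed offset.
    playlist = [("say", secrets.choice(PREAMBLES))]

    # Follow each spoken digit with a randomly chosen noise clip.
    for digit in digits:
        playlist.append(("say", digit))
        playlist.append(("noise", secrets.choice(NOISE_CLIPS)))
    return digits, playlist


def check_response(expected_digits, dtmf_input):
    """Compare the caller's keypad input against the expected digits."""
    return dtmf_input.strip() == expected_digits
```

The `secrets` module is used instead of `random` so that the digits are not predictable from earlier challenges.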

Once the caller passes this voice captcha test, that user’s endpoint can be added to a whitelist so that subsequent calls can be processed automatically.
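The whitelist step might look something like the sketch below. The class `CallerWhitelist`, the function `handle_call`, and the idea of expiring entries after a fixed time are assumptions made for illustration, as is keying the list on a caller identifier string such as a SIP URI. How to identify an endpoint reliably is left open here.

```python
import time


class CallerWhitelist:
    """In-memory whitelist keyed by a caller identifier, with expiry."""

    def __init__(self, ttl_seconds=30 * 24 * 3600):
        # The 30-day default lifetime is an arbitrary choice for this sketch.
        self.ttl_seconds = ttl_seconds
        self._expires_at = {}

    def add(self, caller_id, now=None):
        now = time.time() if now is None else now
        self._expires_at[caller_id] = now + self.ttl_seconds

    def is_allowed(self, caller_id, now=None):
        now = time.time() if now is None else now
        expiry = self._expires_at.get(caller_id)
        return expiry is not None and now < expiry


def handle_call(whitelist, caller_id, run_captcha):
    """Return 'connect' or 'reject' for an incoming call.

    run_captcha is a callable that plays the voice challenge to the
    caller and returns True if the caller keyed in the right digits.
    """
    if whitelist.is_allowed(caller_id):
        return "connect"  # known caller: skip the challenge
    if run_captcha(caller_id):
        whitelist.add(caller_id)  # remember the caller for next time
        return "connect"
    return "reject"
```

With this shape, a first call from `sip:alice@example.com` goes through the challenge, and later calls from the same identifier connect directly until the entry expires.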

While this will not prevent robot dialers from hogging capacity on the IVR systems that answer these calls, it is a good strategy for keeping SPIT calls from getting through to live users or their voice mailboxes. This isn’t a cure-all in and of itself, and should be used in combination with other techniques, such as building whitelists of VoIP networks that peer with each other, automatically identifying suspicious calling patterns so they can be blocked at the firewall, and so forth.