The Phone Network as an Application Platform
By Kevin Werbach, Mon Jan 08 00:00:00 GMT 2001

The Web and telephony worlds are converging... but not in the way you might think. The phone network is becoming programmable, and may change the way you think about the Internet.


With emerging voice application platforms, it will soon be as easy to create a phone-based service as it is to develop a Web page. This will make possible an array of new services, especially for wireless users.

The Value of Development Platforms

Why has the Web grown so quickly? Most of the attention has focused on how easy it is to get on the Internet and to access information through the Web. But equally important, if not more so, is how easy the Web makes it to publish content and dynamic applications. HTML, the language of the Web, is easy enough for even non-programmers to understand.

Thanks to simple graphical tools and hosting sites such as GeoCities (now part of Yahoo!), almost anyone can build a Web page and make it available to a global audience. Even more sophisticated sites such as online stores require only a small amount of programming expertise. Web hosting providers handle the physical provisioning of servers and their associated Internet connectivity, freeing developers to focus on the services they want to deploy.

As a result, thousands of companies and millions of individuals have been able to create their own unique Web presences.

Programmability allows experimentation. The innovators who created companies such as eBay and Amazon.com didn't need to convince any central authority that their ideas were viable businesses. They simply launched sites on the Web, and revised them over time to enhance the user experience.

In contrast, the telephone network has historically been a closed environment. Only operators or third parties with significant technical and financial resources can create custom services. Features such as call waiting and call forwarding are hard-wired into central switches, accessible only to carriers.

Businesses that want functions such as voice mail and speed dialing must either purchase expensive premises-based PBX equipment or order Centrex services offered by their carrier. Computer-telephony integration software and hardware from companies such as Dialogic is expensive, difficult to configure and not fully standardized.

As a result, new services have been slow to appear. Except for the largest enterprises, which can afford in-house telecom departments, it's a one-size-fits-all world.

Because it has never been feasible for end-users to develop and deploy phone-based services, we tend to ignore the potential of telephone networks as application platforms. This is a mistake. There are well over a billion telephones worldwide, with more than 90-percent household penetration in most Western countries. Add to that several hundred million mobile phones, projected to grow to 1.4 billion by 2004.

All these are network-connected by definition, and all are capable of taking speech as an input. Beside this, the world's 200 million or so Internet-connected PCs seem almost insignificant. Moreover, a mobile phone provides connectivity wherever its owner goes, while a PC is tied to one location.

Given all these factors, there is a massive opportunity in bringing the programmability of the Web to the world of telephony. Several startups, most prominently TellMe and Voxeo, are now pursuing this opportunity. Consumer-oriented "voice portals" may be generating more buzz, but these infrastructure solutions will be more important in the long run.

The Platform Opportunity

Voice portals such as TellMe, BeVocal, AOL by Phone (formerly Quack) and Hey Anita provide information such as stock quotes and sports scores to any telephone user through speech-recognition technology.

Behind the scenes, though, these services look much more like Websites than telephone-based systems. And therein lies the opportunity. Voice-portal applications, like any Website, are straightforward to update or modify.

TellMe's founders, who came from Netscape and Microsoft, understood from the beginning that creating a platform and bringing in third-party developers would create a much more powerful business than an end-user voice portal alone. They created a set of tools for outsiders to create applications that are hosted on the TellMe service.

TellMe itself does the heavy lifting of ordering and provisioning local phone lines, connecting those circuits to computer-telephony integration equipment in data centers, managing traffic on its network and delivering a consistent voice-based user interface. In return, developers get the ability to create their own personalized applications and make them accessible to anyone with a telephone.

TellMe also attracts users to its consumer voice portal service, which helps pay for the telephony infrastructure and gives third-party developers a built-in audience.

Though TellMe has been the most active in developing a platform alongside its consumer service, others such as BeVocal are also working in this direction. Meanwhile, a startup called Voxeo is quietly developing a pure-play voice infrastructure service, making the development platform its primary offering.

For these companies, the allure of the platform is partly economic. In contrast to consumer voice portals, which are generally free and sponsored by advertising, development and hosting services can generate recurring monthly fees that grow with usage.

Building the Perfect App

What is a phone application anyway?

Let's say you're a small business, such as a doctor's office or a hair salon, and you want to take appointment requests 24 hours a day. Paying receptionists to answer the phone all night is cost-prohibitive. Outsourcing to a call center would also be expensive, and wouldn't get the information into your scheduling application. You're basically stuck with an answering machine or voice mail. If you have a Website, you can offer online scheduling with relatively little development effort, but that only works for customers who are online.

Now imagine you could create a phone-based application as easily as a Web-based one. Customers could call in and connect directly to your scheduling application via automated speech recognition. If you wanted to add additional features, such as announcements or a company directory, you could do so with minimal additional effort.

Large companies already have this capability. United Airlines, Home Shopping Network and seven of the top ten US retail brokerage firms use phone-based speech recognition to cut the costs of human operators in their call centers. The new development is that almost anyone will be able to do so.

And the scheduling example is just one possibility. Unified messaging services such as OneBox (now part of OpenWave, formerly Phone.com), which allow email, voice mail and fax messages to be accessed through a single interface, have been gaining popularity.

Other startups such as Etrieve make it possible to hear your email messages read over the phone from anywhere. The companies that developed these services had to engage in extensive (and expensive) custom development, and though the resulting services are useful for businesspeople, they are relatively inflexible.

With a voice platform such as TellMe or Voxeo, a unified messaging system is simple to create. Both of these companies offer drag-and-drop graphical tools that pull together the components of a voice application and generate the underlying XML code automatically. Standalone unified messaging services may still have a place where outsourcing and scalability are important, but the market for companies creating applications that meet their own unique needs will ultimately be bigger. Just look at the size of the market for Web application servers from companies such as BEA Systems, Microsoft, IBM, Allaire and Sun.

VoiceXML and CallXML: Speech Gets Standardized

The final pieces of the voice application puzzle are standards. The Web grew up around HTML, dynamic business-to-business services are coalescing around the extensible markup language (XML) and wireless data services are being built on WAP and iMode. In all these areas, proprietary approaches developed first, and in many cases are still in use, but standards catalyzed the rapid expansion of the market.

The two primary standards for voice-based services are VoiceXML and CallXML. The VoiceXML standards effort began in early 1999, and version 1.0 of the specification was issued in March 2000. VoiceXML brought together IBM's SpeechML, Motorola's VoxML and the Lucent and AT&T markup languages based on the PhoneWeb project at Bell Labs. The group has successfully expanded beyond those four companies to encompass virtually all the important players in speech-based services among its 140+ members. Leading voice portals such as TellMe use VoiceXML extensively.

VoiceXML is a specialized language built on top of XML. It allows voice applications to be built out of "documents" that define dialogues using standard markup tags. This approach will make it easier to create new applications, and will allow applications and users to migrate across different companies' platforms.
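
To make the idea of a dialogue "document" concrete, here is a minimal sketch in Python that builds one for the appointment example. The element names (vxml, form, field, prompt, grammar, filled, submit) follow VoiceXML 1.0 as commonly described. The function name, the submit URL and the inline grammar text are assumptions made for illustration, and grammar formats in particular vary from platform to platform.

```python
import xml.etree.ElementTree as ET


def build_appointment_dialog(submit_url):
    """Build a small VoiceXML-style document that asks the caller for a day.

    submit_url is a hypothetical address for the scheduling application.
    """
    vxml = ET.Element("vxml", {"version": "1.0"})
    form = ET.SubElement(vxml, "form", {"id": "appointment"})

    # A field is one question: a prompt to speak plus a grammar of valid answers.
    field = ET.SubElement(form, "field", {"name": "day"})
    ET.SubElement(field, "prompt").text = "What day would you like to come in?"
    # Assumed inline grammar syntax; real platforms define their own formats.
    ET.SubElement(field, "grammar").text = (
        "monday | tuesday | wednesday | thursday | friday"
    )

    # Once the caller has answered, pass the value to the scheduling application.
    filled = ET.SubElement(field, "filled")
    ET.SubElement(filled, "submit", {"next": submit_url, "namelist": "day"})
    return vxml


def field_names(vxml):
    """List the questions (fields) a dialogue document asks, in order."""
    return [f.get("name") for f in vxml.iter("field")]


doc = build_appointment_dialog("http://example.com/schedule")
print(ET.tostring(doc, encoding="unicode"))
```

Because the dialogue is ordinary markup, adding a second question is a matter of adding another field element. The telephony equipment that answers the call is not touched.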

Numerous companies are developing VoiceXML servers and browsers, though the familiar questions of when new features constitute valuable extensions and when they represent deviations from the standard have already begun to arise.

CallXML, which Voxeo is championing, addresses a different set of issues from VoiceXML. VoiceXML covers the "content" of voice services, such as the menus, user commands, files to play under certain circumstances and application logic. CallXML focuses on call control, such as routing information, ringing a particular number, conferencing, and so forth.
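
The division of labor described above can be sketched as two small Python classes. This is not CallXML or VoiceXML syntax. The class names, method names and phone numbers are all hypothetical, chosen only to show call control (routing, conferencing) kept apart from dialogue content (prompts and responses to caller commands).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CallControl:
    """Call-control layer (hypothetical): decides where a call goes."""
    routes: Dict[str, str]  # dialed number -> name of the dialogue to run
    log: List[str] = field(default_factory=list)

    def route(self, dialed):
        # Unknown numbers fall back to a dialogue named "default".
        dialogue = self.routes.get(dialed, "default")
        self.log.append(f"answer {dialed} -> {dialogue}")
        return dialogue

    def conference(self, numbers):
        self.log.append("conference " + ", ".join(numbers))


@dataclass
class Dialogue:
    """Content layer (hypothetical): what to say for each caller command."""
    prompt: str
    choices: Dict[str, str]  # spoken command -> response to play

    def respond(self, command):
        fallback = "Sorry, I did not catch that. " + self.prompt
        return self.choices.get(command.lower(), fallback)


control = CallControl(routes={"5550100": "salon"})
dialogues = {
    "salon": Dialogue("Say hours or booking.",
                      {"hours": "We are open nine to five."}),
    "default": Dialogue("Please hold.", {}),
}

name = control.route("5550100")
reply = dialogues[name].respond("Hours")
print(name, "|", reply)
```

Keeping the two layers separate means a developer can change what callers hear without changing how calls are routed, and the reverse.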

Life in a Voice Application World

Voice application platforms will allow developers to innovate much more rapidly than they can today. This will be particularly important in the wireless world, where personalized services and the ability to interact with local applications and devices are fundamental.

Internet service providers (ISPs) and Web-based portal sites don't control what goes on the Web, though they have some ability to direct users to their own or partners' offerings. Similarly, the PalmPilot took off because Palm encouraged third-party developers to create software for the platform. Yet operators today tightly control most wireless data services.

Opening up the phone as an application platform will free users from the tyranny of wireless carriers, because phone-based services will be able to combine speech, touchtone and wireless data mechanisms into new kinds of applications.

VoiceXML and CallXML are based on XML, the same markup language that is becoming increasingly common as the foundation of Web-based applications and services. Over time, there will be crossover between Web and phone-centric applications.

With Internet telephony technologies allowing voice calls to be routed over Internet protocol (IP) networks, the closed model of traditional telephony will gradually become a thing of the past.

Kevin Werbach is the Editor of Release 1.0, an influential monthly report that covers the converging worlds of technology, communications and the Internet. He also co-organizes the annual PC Forum and High Tech Forum conferences for technology industry executives.

Kevin is known worldwide as a leading thinker on topics such as the future of e-business, network architecture, convergence and technology policy. An active participant in online communities for over fifteen years, he is particularly interested in the complex ways that new technologies intersect with markets and society.