AT&T's Watson APIs Let Apps Recognize Speech
Apr 20, 2012 2:11 PM PT
AT&T could shake up the voice-recognition market when it launches several application programming interfaces (APIs) for its Watson speech recognition program in June.
Developers will be able to use these APIs to create new apps and services with voice recognition and transcription capabilities.
AT&T will also release a software development kit (SDK) that lets devs build programs that capture spoken words and send them over the network for transcription.
Together, the APIs and SDK will make it easy for app devs to add voice recognition capabilities to their apps.
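The capture-then-transcribe flow the SDK is described as supporting can be sketched in a few lines. AT&T had not published the actual API details at press time, so the endpoint URL, header names, and response shape below are illustrative assumptions only, not the real Watson interface:

```python
from urllib.request import Request

# Hypothetical endpoint -- a stand-in for whatever URL the real
# Watson speech APIs will expose.
SPEECH_API_URL = "https://api.example.com/speech/v1/recognize"

def build_transcription_request(audio_bytes, api_key, content_type="audio/wav"):
    """Package captured audio into an HTTP POST for server-side
    transcription -- the capture-and-send pattern described above.
    The auth scheme and content types here are assumptions."""
    return Request(
        SPEECH_API_URL,
        data=audio_bytes,
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": content_type,
        },
        method="POST",
    )
```

The app would then submit the request (for example, with `urllib.request.urlopen`) and parse a transcript out of the response body; the point is that the heavy recognition work happens on AT&T's servers, not on the handset.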
"This hopefully opens the door to let your two-man garage shop guy who wants to leverage speech recognition apps do so without having to license large amounts of speech recognition technology or work with a handset OEM," Michael Morgan, a senior analyst at ABI Research, told TechNewsWorld. "Usually this stuff gets funneled to much larger development programs."
What Is AT&T Watson?
AT&T Watson integrates speech recognition and other speech technologies. It has tools for tuning recognition, adapting language and acoustic models, and adding custom extensions.
The technology has nothing to do with the IBM Watson supercomputer, which can answer questions posed in natural language.
Natural languages are those humans have developed to communicate with one another, as opposed to constructed or formal languages such as programming languages or mathematical logic.
Where in the World Is AT&T Watson Used?
AT&T Watson technology is already deployed in several of the company's products. These include its Translator app for Android and Apple's iOS, and its YPMobile app, which offers mobile voice directory search.
The Translator app, which is offered free, converts spoken or written input into another language. It recognizes English, Spanish, French, German, Italian, Chinese and Japanese.
What's Coming in June
In June, AT&T will release Watson APIs for Web search, local business search, question and answer (think Siri), voicemail-to-text conversion, SMS, dictation, and its U-Verse bundle of Internet access, TV and phone services.
App devs will find the APIs helpful because "to be able to talk to your devices in natural language requires some pretty heavy processing that's so complex that you need to create different lexicons and ontologies for different areas," ABI's Morgan said. "Doing command and control like telling your device to open an app requires a different lexicon from asking a question."
As a large organization, AT&T can "create a whole new lexicon and have people access that immediately and the lexicon can train itself," ABI's Morgan pointed out. It's "going to build new lexicons so, if people have a sport app, for example, they're going to have a ready-made lexicon for them to go," Morgan said. There is one drawback, however: the barrier to entry into speech-enabled apps will be lowered only for devs who work with AT&T.
How About Siri and ICS?
Both iOS and Android Ice Cream Sandwich (ICS) already implement voice recognition, but AT&T Watson may take voice recognition capabilities further.
"Apple is just focusing on making Siri a personal assistant, but it doesn't let you open Facebook or 'Angry Birds' now," ABI's Morgan said. "Android is tuned to search."
AT&T "is going to create a rainbow beyond search," Morgan suggested. It will offer ready-made lexicons so app devs can pick the ones they want, and "now your little feature phone with a crappy calendar app will be able to take voice instructions."
Watson "was designed to be applied to a variety of telephony-related projects," Rob Enderle, principal analyst at the Enderle Group, told TechNewsWorld.
However, Morgan doesn't see how AT&T Watson will be a product differentiator or how the wireless giant will make money off the platform.
AT&T did not respond to our requests for comment for this story.