Speech Recognition Software Slowly Making Progress

They are the bane of many consumers’ existence, but the boon to many corporate executives’ business plans: Speech recognition systems can make consumers’ blood boil as they unsuccessfully try to respond to prompts, such as “Please enter your account number,” but the technology can also help executives reduce customer service costs, even more effectively than transferring a call center from Des Moines, Iowa, to Calcutta, India.

Recent technical advances have made speech recognition more customer-friendly, yet the technology is making only slow, steady inroads into the corporate marketplace and has so far had a limited effect on sales. Speech recognition systems, which have been available for more than a decade, are often used to replace touch-tone dialing options, in which users rely on telephone keypads to enter information.

Conserving Staff Time

The goal of both of these systems is to offload repetitive tasks from customer service representatives, who represent a recurring expense, to devices that have a one-time charge. When combined with speech-to-text and/or text-to-speech technology, speech recognition systems can automate the entire customer service process.

“Because of the growing emphasis on customer service recently, many companies have become interested in speech recognition systems,” said Daniel Hong, industry analyst at market research firm Datamonitor PLC.

Traditionally, the speech recognition market has been hampered by low recognition rates, proprietary devices and high prices. Vendors have been trying to address these problems, particularly by improving recognition rates. “More powerful processors and refined algorithms have helped dramatically improve speech recognition rates,” said Steve Cramoysan, a principal analyst with Gartner Group Inc. In a growing number of cases, these systems are able to understand even the most pronounced local dialects, such as a New York twang or a Southern drawl.

Limiting User Options

Art Schoeller, senior analyst at market research firm the Yankee Group, said that even when the system can recognize what the caller is saying, other problems can arise.

“Companies have to limit the number of possible responses that users can say,” he explained. “Speech recognition systems are not sophisticated enough to wade through the wide variety of reasons why a customer may call for help.” As a result, most companies limit use of these systems to closed questions, which offer users only a few possible answers, such as “Are you calling from your home telephone?” rather than open-ended questions like “How can we help you?” which can be answered in many ways.
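The closed-question approach Schoeller describes can be sketched in a few lines of Python. This is only an illustration, not how any vendor's product works: the vocabulary in YES_NO_GRAMMAR and the function name match_closed_response are hypothetical, and the sketch assumes the recognizer has already turned the caller's speech into text.

```python
import re

# Hypothetical vocabulary for the closed question
# "Are you calling from your home telephone?"
YES_NO_GRAMMAR = {
    "yes": {"yes", "yeah", "yep", "correct"},
    "no": {"no", "nope", "negative"},
}

def match_closed_response(transcript, grammar):
    """Return the canonical answer if the transcript contains words from
    exactly one allowed answer; otherwise return None."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    matches = [answer for answer, phrases in grammar.items() if words & phrases]
    # Zero matches or conflicting matches both count as "not understood".
    return matches[0] if len(matches) == 1 else None
```

A reply such as "Yeah, I am" maps to "yes", while an open-ended reply such as "I need help with my bill" falls outside the vocabulary and returns None, at which point a real system would typically re-prompt or hand the call to an agent.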

The need to limit responses means that companies have to spend a great deal of time designing speech recognition applications. Firms need to outline internal business flows, determine which tasks these systems can handle, and then integrate them into their existing IT infrastructures.

As a result, buyers pay vendors a lot of money for professional services teams that help with the design and deployment of speech recognition applications. “Users spend up to six dollars for services and add-ons for every one dollar that they spend on a speech recognition system,” Gartner Group’s Cramoysan told TechNewsWorld.
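To put Cramoysan's ratio in concrete terms, the short Python sketch below works out the upper bound on a project's total cost. The function name and the $100,000 system price are hypothetical and chosen only for illustration; the six-to-one ratio is the "up to" figure quoted above, so actual spending may be lower.

```python
def project_cost_upper_bound(system_price, services_ratio=6.0):
    """Return (services, total, system_share) for a deployment, treating
    services_ratio as the maximum services spend per dollar of system spend."""
    services = system_price * services_ratio
    total = system_price + services
    system_share = system_price / total
    return services, total, system_share

# Hypothetical example: a system with a $100,000 price tag.
services, total, share = project_cost_upper_bound(100_000)
```

At the six-to-one ceiling, the system itself accounts for only one-seventh of total spending, or roughly 14 percent.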

Recently, vendors have been trying to simplify the process by adding vertical market templates, such as one for financial services, on top of their systems. “The vendors have been talking a lot about the templates, but I’m not sure how much value they offer to users,” said Yankee Group’s Schoeller. “Every company has different business processes, so in the end, they all end up tailoring these systems to correspond to their workflows.”

V for Standards Victory

Another historical problem has been system design. The proprietary approaches that vendors relied on made it difficult for users to build and maintain these systems. Not only were corporations locked into buying hardware from specific manufacturers, but they also had to buy vendor-specific speech recognition application development tools.

The VoiceXML (VXML) specification, which has gone through two releases, was designed to extend the benefits of XML programming to the speech recognition market. “By simplifying application development, VXML has had a significant positive impact on the speech recognition market,” stated Gartner Group’s Cramoysan.

While the standard has been helpful, it is not a panacea. Standards often offer users base functionality that vendors enhance, but each improvement makes it more difficult for users to mix and match different suppliers’ wares. Microsoft has also muddied the standards area a bit. Rather than throw its weight completely behind VXML, the company has been promoting another standard, Speech Application Language Tags (SALT), which features various extensions to markup languages, including HTML and XML.

Cisco Systems and Intel are solidly in Microsoft’s corner, but most of the traditional speech recognition vendors, such as IBM, Nuance Communications, and ScanSoft, have not paid much attention to it.

As vendors have slowly adopted various standards, product development has become simpler for them and product pricing has dropped for users. “Speech recognition product and service pricing dropped by as much as 30 percent in 2004,” Datamonitor’s Hong told TechNewsWorld.

Beyond Niche Status

Improved technology and lower pricing have made speech recognition products appealing to more companies. The technology has been popular in the airline, financial services, and telecommunications industries, but recently it has been gaining interest in areas such as publishing and hardware and software support as well. In a few instances, companies are now using it for internal communications, such as field service: employees who finish a job can change a trouble ticket from open to closed.

But the growth has been slow and steady rather than quick and dramatic. One reason is that many companies are pushing their customers to Web-based customer service, which can be easier to implement than speech recognition systems.

The upshot is that speech recognition’s future is bright but not luminous. “Companies will continue to replace touch-tone customer service systems with speech recognition,” concluded Yankee Group’s Schoeller. “That will result in respectable growth rates, 5 to 10 percent per year, but not the astronomical 20 percent or more increases that some envisioned a few years ago.”
