Skype Begins Dismantling the Language Barrier
Dec 16, 2014 1:44 PM PT
Microsoft on Monday announced the first phase of its Skype Translator preview program, which initially will facilitate conversations between English and Spanish speakers. The translator will convert spoken words both ways.
It also will translate instant messages in 40 languages. Translations occur in near-real time.
Participants must run Windows 8.1 or Windows 10 Technical Preview on a desktop or tablet. You can sign up for the preview here.
Children in two schools -- Peterson School in Mexico City and Stafford Elementary School in Tacoma, Washington -- have tried out Skype Translator, Microsoft said.
How Skype Translator Works
Skype Translator's automatic speech recognition uses a deep neural network to analyze what was said in a conversation, comparing the spoken words against snippets of millions of previously recorded samples.
The audio is then converted into a sequence of words in text that appear to match what was said.
Next, the system corrects the speech, removing disfluencies such as "um" or "ah" and repetitions. It then picks the most likely word from a list of words that sound like what was said.
The selected words are translated into another language, then converted to sound.
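The cleanup and selection steps described above can be sketched in Python. This is a minimal illustration, not Microsoft's implementation: the filler list, the function names, and the candidate scores are all assumptions made for the example.

```python
# Hypothetical filler list; the article names only "um" and "ah".
FILLERS = {"um", "ah"}


def remove_disfluencies(words):
    """Drop filler words and immediate word repetitions."""
    cleaned = []
    for word in words:
        lower = word.lower()
        if lower in FILLERS:
            continue  # skip fillers such as "um" or "ah"
        if cleaned and cleaned[-1].lower() == lower:
            continue  # skip a word that repeats the one just kept
        cleaned.append(word)
    return cleaned


def pick_most_likely(candidates):
    """Return the word with the highest score.

    candidates is a list of (word, score) pairs for words that sound
    alike; the scores are assumed to come from the recognizer.
    """
    return max(candidates, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    print(remove_disfluencies("I um I I want ah to to go".split()))
    print(pick_most_likely([("their", 0.2), ("there", 0.7), ("they're", 0.1)]))
```

A production system would score whole word sequences in context rather than single words in isolation; the per-word maximum here only shows the idea of choosing among sound-alike candidates.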
Technical Underpinnings of the System
In 2011, researchers at Microsoft came up with a new, context-dependent model for large vocabulary speech recognition that they published in 2012.
The architecture is a hybrid of deep neural network (DNN) and hidden Markov model (HMM) technology in which the DNN models senones, the tied states of context-dependent phones, reducing errors by 16 percent.
The team also used general-purpose graphics processing units to train and decode speech.
The researchers built giant artificial neural networks to train their speech recognition system. One of them contains more than 66 million interneural connections and is the largest ever created for speech recognition.
Training took about 20 days and consisted of creating a new, slightly more refined model every few hours.
When tested against Switchboard, a phone-call transcription benchmark, the system achieved a word-error rate of 18.5 percent. That is 33 percent better than current state-of-the-art conventional systems.
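Word-error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below computes it with a standard edit-distance calculation. The function names are illustrative, and the second helper assumes the 33 percent figure is a relative reduction in error rate, which the article does not spell out.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


def implied_baseline(wer_percent, relative_reduction):
    """Baseline error rate implied by a relative reduction (assumption)."""
    return wer_percent / (1 - relative_reduction)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    print(round(implied_baseline(18.5, 0.33), 1))
```

Under that assumption, an 18.5 percent error rate that is 33 percent better would correspond to a conventional baseline of roughly 27.6 percent.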
Fly 'en cuero'
Skype Translator's accuracy is a major concern, as blunders can have unintended consequences.
"There will be translation mistakes," warned Mark Ballam, head of the Center for International Business Education and Research (CIBER) at San Diego State University. "How could someone detect or catch it if a mistake is made in the translation?"
Much of a conversation consists of body language and facial expressions, and "nonverbal communications would not be translated using this type of service," he told TechNewsWorld.
"I use Skype, but I don't want to rely on a computer for the nuances in a conversation," said Seth Kaplowitz, professor of international business at San Diego State University College of Business Administration.
Artificial intelligence is getting better, but "I think having a human translator would let you find out how accurate your stuff is," Kaplowitz told TechNewsWorld.
Possible Uses in Business
There are possible uses for Skype Translator, despite its shortcomings.
An instant translation service such as this "is interesting not only for consumers, but for international business users who want to have high-level chats," Alaa Saayed, unified communications industry analyst at Frost & Sullivan, told TechNewsWorld.
Some detailed discussions, such as financial transactions, "may still require the use of a single language or the use of a [human] translator," observed Jim McGregor, principal analyst at Tirias Research.
"The real benefit is not to management, but to everyone else -- from engineers to purchasing and accounting," he told TechNewsWorld.
"This could be very useful for engineers all over the world to communicate with each other and assist each other with design challenges," McGregor continued. "In the tech industry, most business can be conducted with senior managers in English, but many other functions do not have the same requirement for English."