SAN FRANCISCO, May 17 - Google has introduced an experimental speech translation system that can deliver translated speech in the speaker’s own voice.
Most speech translation software today, including Google Translate, uses a three-step process to convert speech in one language into speech in another: automatic speech recognition, machine translation, and text-to-speech synthesis. While this multipart system is effective, Google is developing a simpler system that translates speech directly into speech in the target language.
Google calls this method Translatotron. It has the potential to offer faster inference, reduce the errors that compound between the recognition and translation steps, and retain the voice of the original speaker.
The steps that involve text are omitted entirely. Instead, the system works on spectrograms, visual representations of how the frequency content of speech changes over time, mapping a spectrogram of the input speech directly to a spectrogram of the translated speech. While Translatotron’s translation quality lags behind that of the conventional system with its internal text conversion, it demonstrates “the feasibility of the end-to-end direct speech-to-speech translation.”
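For readers unfamiliar with spectrograms, the short Python sketch below shows one common way to compute one: slice the audio into overlapping frames and measure the energy at each frequency in every frame. It is an illustration only. The article does not describe how Translatotron prepares its input, and the function name `spectrogram`, the frame size, the hop length and the test tone are all assumptions chosen for this example.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per overlapping frame.

    Hypothetical helper for illustration; not taken from Translatotron.
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Taper the frame with a Hann window to limit spectral leakage.
        windowed = [
            x * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, x in enumerate(frame)
        ]
        # Plain discrete Fourier transform, keeping non-negative frequencies.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


# Toy input (an assumption): a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]

spec = spectrogram(tone)
# Find the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), "frames;", "strongest frequency near", peak_hz, "Hz")
```

Each row of the result describes one instant of audio and each column one frequency band, which is why a spectrogram can be drawn as an image. Real speech systems typically use larger frames and a fast Fourier transform rather than the slow direct sum used here.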
The system can be further enhanced with an optional speaker encoder which, based on the vocal characteristics of a brief example utterance, can create translated speech that sounds like the original speaker, making the translation more natural and human.
Google describes Translatotron as the first end-to-end model that can translate speech in one language directly into speech in another, and as a “starting point” for demonstrating how the future of language translation could look. - AFP-Relaxnews