FEATURE-Google seeks world of instant translations

Tue Mar 27, 2007 7:03pm EDT
 

By Adam Tanner

MOUNTAIN VIEW, Calif., March 28 (Reuters) - In Google Inc.'s (GOOG.O) vision of the future, people will be able to translate documents instantly into the world's main languages, with machine logic, not expert linguists, leading the way.

Google's approach, called statistical machine translation, differs from past efforts in that it forgoes language experts who program grammatical rules and dictionaries into computers.

Instead, Google's engineers feed the computer documents that humans have already translated from one language into another, then rely on it to discern patterns for future translations.
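As a rough illustration of what discerning patterns from already-translated text can mean, the sketch below counts which words appear together across a tiny set of sentence pairs and picks the best match for each source word. The three sentence pairs, the function name learn_table and the use of a Dice score are all assumptions made for this example; the article does not describe Google's actual algorithm, and production systems rely on far larger corpora and more elaborate alignment models.

```python
from collections import Counter
from itertools import product

# Hypothetical toy parallel corpus (German, English); not real training data.
PARALLEL = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]

def learn_table(pairs):
    """Map each source word to the target word it co-occurs with most strongly."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)                 # sentences containing each source word
        tgt_count.update(t_words)                 # sentences containing each target word
        joint.update(product(s_words, t_words))   # sentence pairs containing both words

    def dice(s, t):
        # Dice score: high when the two words nearly always appear together.
        return 2 * joint[(s, t)] / (src_count[s] + tgt_count[t])

    return {s: max(tgt_count, key=lambda t: dice(s, t)) for s in src_count}

table = learn_table(PARALLEL)
print(table)
```

No dictionary or grammar rule appears anywhere in the sketch; the pairing of "haus" with "house" comes only from counting, which is the contrast with rule-based systems that the article draws.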

While the quality is not perfect, it is an improvement on previous efforts at machine translation, said Franz Och, 35, a German who heads Google's translation effort at its Mountain View headquarters south of San Francisco.

"Some people that are in machine translations for a long time and then see our Arabic-English output, then they say, that's amazing, that's a breakthrough," said Och.

"And then other people who have never seen what machine translation was ... they read through the sentence and they say, the first mistake here in line five -- it doesn't seem to work because there is a mistake there."

But for some tasks, a mostly correct translation may be good enough.

Speaking over lunch this week in a Google cafeteria famed for offering free, healthy food, Och showed a translation of an Arabic Web news site into easily digestible English.

Two Google workers speaking Russian at a nearby table said, however, that a translation of a news site from English into their native tongue was understandable but a bit awkward.

FEEDING THE MACHINE

Och, who speaks German, English and some Italian, feeds hundreds of millions of words from parallel texts in language pairs such as Arabic and English into the computer, using United Nations and European Union documents as key sources.

Languages without considerable translated texts, such as some African languages, face greater obstacles.

"The more data we feed into the system, the better it gets," said Och, who moved to the United States from Germany in 2002.

The program applies statistical analysis, an approach he hopes will avoid diplomatic faux pas, such as when Russian leader Vladimir Putin's translator miffed then-German Chancellor Gerhard Schroeder by calling him the German "Fuehrer." The word is verboten in that context because of its association with Adolf Hitler.

"I would hope that the language model would say, well, Fuehrer Gerhard Schroeder is ... very rare but Bundeskanzler Gerhard Schroeder is probably 100 times more frequent than Fuehrer and then it would make the right decision," Och said.
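The frequency comparison Och describes can be sketched in a few lines. The counts below are hypothetical, chosen only to mirror his "100 times more frequent" remark, and the function names are invented for this example; a real language model would score word sequences learned from a very large body of text rather than look up whole phrases in a table.

```python
# Hypothetical phrase counts, chosen only to mirror the "100 times more
# frequent" ratio in Och's quote; they are not real corpus statistics.
PHRASE_COUNTS = {
    "Bundeskanzler Gerhard Schroeder": 1000,
    "Fuehrer Gerhard Schroeder": 10,
}

def phrase_probability(phrase, counts):
    """Relative frequency with crude add-one smoothing, so unseen phrases score above zero."""
    total = sum(counts.values())
    return (counts.get(phrase, 0) + 1) / (total + len(counts))

def pick_translation(candidates, counts):
    """Return the candidate phrase that the counts make most probable."""
    return max(candidates, key=lambda p: phrase_probability(p, counts))

candidates = ["Fuehrer Gerhard Schroeder", "Bundeskanzler Gerhard Schroeder"]
best = pick_translation(candidates, PHRASE_COUNTS)
print(best)
```

Under these assumed counts the rarer phrase loses simply because it is rarer, which is the behavior Och says he hopes the model would show.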

 
