Some interesting facts about Machine Translation

July 11, 2011

Before I start, I would like to give a brief definition of Machine Translation as stated in Webster's Dictionary: “Machine translation, sometimes referred to by the abbreviation MT, is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another.”

One of the first machine translation demonstrations took place in 1954 in the so-called Georgetown-IBM experiment, in which over sixty Russian sentences were translated into English fully automatically. The experiment was a huge success for its time; however, the limits of the approach became very clear when the Russian term for hydraulic ram was translated as “water goat”.

In 1959, IBM created MT software called Mark I, and by 1963 the Mark II had already been developed, providing word-for-word translations from Russian at a rate of about 5,000 words per hour. By 1971, MT software had been developed on the IBM 360/67 computer that translated between 80,000 and 100,000 words per hour from English to Vietnamese!

Today, a vast number of software programs on the market provide machine translation. Some of them work online, such as the SYSTRAN system, which used to power both Google Translate and AltaVista’s Babel Fish. Google had been using SYSTRAN for several years when, in 2007, it switched to a statistical translation method. CANDIDE from IBM was the first statistical machine translation software.

A human translator can translate somewhere between 1,500 and 3,000 words a day, while average MT software can translate up to 4,000 words a minute. In addition, MT software can store translated documents and reuse phrases that have already been translated. However, no MT software produces output that qualifies as a “perfect” translation, and this is why we still depend on human translators.
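
For readers curious how the reuse of already translated phrases might look in practice, here is a minimal sketch in Python. It is not taken from any particular MT product: the `TranslationMemory` class, the `translate` helper and the `fallback` function are all hypothetical names invented for this illustration, and the matching is a simple exact match after tidying up whitespace and case.

```python
import re


class TranslationMemory:
    """Toy store of previously translated segments (exact match only)."""

    def __init__(self):
        self._segments = {}

    @staticmethod
    def _normalize(text):
        # Collapse whitespace and ignore case so trivial variations still match.
        return re.sub(r"\s+", " ", text.strip()).lower()

    def store(self, source, target):
        """Remember that `source` was translated as `target`."""
        self._segments[self._normalize(source)] = target

    def lookup(self, source):
        """Return the stored translation, or None if the segment is new."""
        return self._segments.get(self._normalize(source))


def translate(segment, memory, fallback):
    """Reuse a stored translation if there is one; otherwise call `fallback`.

    `fallback` stands in for whatever actually translates new text
    (an MT engine or a human translator); it is an assumption of this sketch.
    Returns a pair: (translation, whether it came from the memory).
    """
    cached = memory.lookup(segment)
    if cached is not None:
        return cached, True
    result = fallback(segment)
    memory.store(segment, result)
    return result, False
```

With this sketch, the first time a sentence is seen it goes to `fallback` and the result is stored; the next time the same sentence appears, even with different spacing or capitalization, the stored translation is returned without translating it again. Real products typically go further and also reuse close, but not identical, matches.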





