Universal concept-based dictionary aims to improve machine translation

Free online services such as Google Translate and Babel Fish are a godsend for those who wish to shop on foreign websites, browse news stories in their language of origin, or get a little extra help with their French homework. If you don’t understand the foreign language at all, an automated translation at least gives a rough sense of what a sentence or email means. However, if you need a more detailed and accurate translation, for example if you are a newspaper publisher seeking foreign readers or a public health expert wanting to educate speakers of another language, human translation remains the only option.

Unedited, low-cost machine translation can be excellent for translating low-value text and providing the general idea to people who only expect the “gist.” For texts of greater value and for audiences with higher expectations, professional human translation helps companies avoid translation blunders and their costly consequences. In 2010, the use of Google Translate was responsible for a legal mishap: “a Russian trucker (in the Netherlands) involved in a bar brawl was released because the (court) summons he received was poorly translated from Dutch into Russian using Google Translate,” reported the Dutch-English news blog 24oranges. Instead of reading “you are to appear in court on 3 August 2010,” as it should have, the summons said something more like “you have to avoid being in court on 3 August 2010.”

Automated machine translation is notorious for its mistranslations. Asked to translate “spring in her step” into French, for example, Google chooses printemps – the season – for “spring”. Similar examples abound. The inability of computers to deal with homonyms – words that are spelled the same but have different meanings – is just one reason why machine translations are often so garbled.

Martin Benjamin proposes to change that. An anthropologist-turned-lexicographer, he is launching a new method of automated translation that asks the user to select from possible meanings based on concepts. Until now, machine translation has largely been a statistical business: computers learn to translate by searching for correlations in texts that have been translated by humans. Benjamin believes it’s time to put humans back in the loop.

His new project, launched this week, is called Kamusi, a multilingual dictionary that could, with some serious funding, contain all the world’s languages. What sets Kamusi apart from other online dictionaries is that it’s built around concepts as well as words. Kamusi can avoid the possible mistranslation of “spring in her step” by recognising that “spring” is associated with multiple concepts and prompting the user to say which is relevant.
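The concept-first idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Kamusi’s actual data model: the class names `Concept` and `ConceptDictionary`, the concept ids, and the three sample entries are all hypothetical. Each concept stores one word per language, and a lookup that matches more than one concept hands the choice to a `choose` callback, which stands in for prompting the user.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One language-independent meaning (hypothetical structure)."""
    concept_id: str
    gloss: str                                 # short description shown to the user
    words: dict = field(default_factory=dict)  # language code -> word


class ConceptDictionary:
    """Toy dictionary keyed by concept rather than by word."""

    def __init__(self):
        self._concepts = {}  # concept id -> Concept
        self._index = {}     # (language, word) -> list of concept ids

    def add(self, concept):
        self._concepts[concept.concept_id] = concept
        for lang, word in concept.words.items():
            key = (lang, word.lower())
            self._index.setdefault(key, []).append(concept.concept_id)

    def candidates(self, word, lang):
        """Return every concept the word can express in the given language."""
        ids = self._index.get((lang, word.lower()), [])
        return [self._concepts[cid] for cid in ids]

    def translate(self, word, source, target, choose):
        """Translate via a concept; call `choose` only when the word is ambiguous."""
        options = self.candidates(word, source)
        if not options:
            return None
        concept = options[0] if len(options) == 1 else choose(options)
        return concept.words.get(target)


# Illustrative entries only; a real dictionary would hold far more senses.
demo = ConceptDictionary()
demo.add(Concept("spring-season", "season between winter and summer",
                 {"en": "spring", "fr": "printemps"}))
demo.add(Concept("spring-coil", "coiled elastic device",
                 {"en": "spring", "fr": "ressort"}))
demo.add(Concept("spring-water", "place where water rises from the ground",
                 {"en": "spring", "fr": "source"}))


def pick_by_gloss(keyword):
    """Stand-in for a user prompt: pick the concept whose gloss contains keyword."""
    def choose(options):
        return next(c for c in options if keyword in c.gloss)
    return choose


print(demo.translate("spring", "en", "fr", pick_by_gloss("coiled")))
```

In an interactive tool, `choose` would list each gloss and wait for the user’s selection. A callback is used here only so the sketch runs without input.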

Google’s reason for opting for an algorithmic approach to its translation program is simple: once it’s up and running, it’s cheap and fast. For Kamusi to become viable, on the other hand, Benjamin needs bilingual speakers to add words to his dictionary, and by comparison this is going to be very slow and expensive.

So far Kamusi has relied on volunteers and a grant from the US National Endowment for the Humanities. Benjamin hopes that speakers of minority languages will be motivated to add more terms for free, since they will gain the ability to translate their language into those that are already represented in the dictionary. Benjamin is also counting on some top-down support: for instance, companies that do business in Africa, a continent that is poorly served by existing dictionaries, might be motivated to pay for large numbers of local words to be added to Kamusi. The present demonstration version contains 100 words from 15 languages, including English, Swahili and Japanese.

It won’t be easy to build up Kamusi. Totting up wages and other expenses, Benjamin estimates it will cost around $5 to add each new concept. At that rate, representing 10,000 concepts in 100 languages (one million entries) would require $5 million of funding. However, if he succeeds in financing Kamusi, the old sci-fi dream of a machine that can create perfect translations of any text, in any language, at the press of a button will be one step closer to reality.
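The funding figure can be checked with a quick calculation. The sketch below assumes the $5 estimate applies to each concept in each language, which is the reading that makes the article’s total work out. The function name `funding_needed` is made up for illustration.

```python
def funding_needed(concepts, languages, cost_per_entry_usd=5):
    """Estimated cost, assuming every concept is added once per language."""
    return concepts * languages * cost_per_entry_usd


total = funding_needed(10_000, 100)
print(f"${total:,}")  # $5,000,000
```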

Sources include New Scientist, deseretnews.com, searchenginepeople.com


TJC Global offers a comprehensive set of translation services to enable you to stay competitive in the global market. As a translation and interpreting agency with over twenty-five years’ experience, TJC offers highly qualified and experienced professional translators who can assist your business with whatever translation or interpreting services you may require. They specialise in Technical, Medical and Pharmaceutical, Legal, and Business Services translation, as well as many more specialist fields such as Renewable Energy, Global Issues and Life Sciences.

Members of: ATC, ITI, Proz

