

Foreign Language Data - Transliteration


matchIT supports foreign language data out of the box, including the foreign character sets listed below:


matchIT has been used successfully in around two dozen non-English-speaking countries, and many UK- and US-based companies use it for non-English data from many countries, covering several European languages. Although the standard lexicons that matchIT uses already include foreign names and words, results can be further improved by adding to them: names such as Johann, name equivalents such as Johann and Hans, and business words and equivalents such as GmbH, AG, Jewelerie and Bijouterie. Reference datasets of words for foreign languages can easily be loaded into matchIT’s lexicons.
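To illustrate why adding equivalents helps matching, here is a minimal sketch in Python. It is not matchIT's lexicon format or API; the `EQUIVALENTS` table and both function names are hypothetical, and the entries simply reuse the examples above. Each variant maps to one canonical form, so two records that differ only by an equivalent compare as equal.

```python
# Hypothetical equivalents table (NOT matchIT's lexicon format).
# Each variant maps to a single canonical form chosen for illustration.
EQUIVALENTS = {
    "hans": "johann",           # name equivalent from the article
    "bijouterie": "jewelerie",  # business-word equivalent from the article
}


def canonical_tokens(text, lexicon=EQUIVALENTS):
    """Lower-case the text, split on whitespace, and replace each token
    with its canonical form when the lexicon has an entry for it."""
    return [lexicon.get(token, token) for token in text.lower().split()]


def same_after_lookup(a, b, lexicon=EQUIVALENTS):
    """Return True when both strings yield the same canonical tokens."""
    return canonical_tokens(a, lexicon) == canonical_tokens(b, lexicon)


if __name__ == "__main__":
    print(same_after_lookup("Hans Schmidt", "Johann Schmidt"))
    print(same_after_lookup("Hans Schmidt", "Karl Schmidt"))
```

Without the "hans" entry the first comparison would fail, which is the kind of missed match that extending a lexicon is meant to prevent.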

The phonetic algorithm was designed specifically for English names but has proven to work well with Spanish, German and other names that occur commonly in the US. It was also designed with foreign language versions in mind (i.e. for data collected in countries where foreign languages are spoken). These versions could quite easily be developed, or third-party country-specific algorithms incorporated, according to demand.


Transliteration is not the same as translation, in which words are converted from one language to another; when transliterating, it’s the characters themselves that are converted from one alphabet to another.  For example, the Chinese character 昌 means “prosperous” and is pronounced “chang”, and the Chinese character 李 means “plum” and is pronounced “li”. Transliteration converts the Chinese name 昌李 into “chang li” (translation would convert this to “prosperous plum”).

*Transliteration is currently not available through our desktop product.
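The difference between transliteration and translation in the Chinese name example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two lookup tables are hypothetical, contain just the two characters from the example, and do not represent how matchIT performs transliteration.

```python
# Hypothetical lookup tables containing only the two example characters.
PRONUNCIATION = {"昌": "chang", "李": "li"}   # used for transliteration
MEANING = {"昌": "prosperous", "李": "plum"}  # used for translation


def convert(text, table):
    """Replace each character using the table and join results with spaces.
    Characters missing from the table are passed through unchanged."""
    return " ".join(table.get(ch, ch) for ch in text)


def transliterate(text):
    """Convert characters to their sound in the Latin alphabet."""
    return convert(text, PRONUNCIATION)


def translate(text):
    """Convert characters to their meaning in English."""
    return convert(text, MEANING)


if __name__ == "__main__":
    print(transliterate("昌李"))  # chang li
    print(translate("昌李"))      # prosperous plum
```

Both functions walk the same input character by character; only the table differs, which is the point of the distinction: transliteration preserves the sound of a name, while translation replaces it with its meaning.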





