Tue Dec 11, 2012 6:54am EST



European Community research project to enrich multilingual terminologies in biomedicine

A key factor in successful text mining is the use of comprehensive terminologies which capture the different ways concepts can be expressed. However, although extensive terminologies exist for English, they are less common for other languages.

According to Wikipedia, a mantra is “a group of sounds, syllables or words capable of creating transformation”. MANTRA is also the highly apposite acronym of a new European Community-funded research project: Multilingual Annotation of Named entities and Terminological Resource Acquisition. Linguamatics is pleased to announce its participation as a commercial partner.

The objective of MANTRA is to enrich multilingual terminologies in the biomedical domain by exploiting parallel corpora in several languages. For example, if claims 4, 5 and 6 of an English patent are known to refer to Branching Enzyme, it should be possible to discover the previously unknown German synonym Verzweigungsenzym from claims 4, 5 and 6 of the German translation. The new synonym can then be used in analysing other document sets. A terminology in one language can be mined together with the same documents in other languages to provide enriched terminologies in those other languages.
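The idea can be illustrated with a minimal Python sketch. It is not the MANTRA project's method or any part of I2E: it assumes the claims are already aligned one-to-one across languages, uses a simple Dice co-occurrence score, and runs on invented toy sentences. The names `tokenize`, `candidate_translations` and `pairs` are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    tokens = (tok.strip(".,;:()").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def candidate_translations(source_term, aligned_pairs, top_n=3):
    """Rank target-language tokens by Dice association with source_term.

    aligned_pairs is a list of (source_segment, target_segment) tuples,
    for example the same numbered claim in two versions of a patent.
    """
    term = source_term.lower()
    with_term = Counter()  # target token -> aligned segments whose source has the term
    overall = Counter()    # target token -> all target segments containing it
    n_term_segments = 0
    for src, tgt in aligned_pairs:
        tgt_tokens = set(tokenize(tgt))
        overall.update(tgt_tokens)
        if term in src.lower():
            n_term_segments += 1
            with_term.update(tgt_tokens)
    # Dice coefficient: 2 * joint count / (source-term count + target-token count)
    scores = {
        tok: 2 * with_term[tok] / (n_term_segments + overall[tok])
        for tok in with_term
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Invented toy data standing in for aligned English/German patent claims.
pairs = [
    ("A plant cell expressing a branching enzyme.",
     "Eine Pflanzenzelle, die ein Verzweigungsenzym exprimiert."),
    ("The branching enzyme is derived from potato.",
     "Das Verzweigungsenzym stammt aus der Kartoffel."),
    ("The starch is isolated from the plant cell.",
     "Die Stärke wird aus der Pflanzenzelle isoliert."),
    ("The branching enzyme modifies the starch.",
     "Das Verzweigungsenzym modifiziert die Stärke."),
]

ranked = candidate_translations("branching enzyme", pairs)
print(ranked)
```

On this toy data the top-ranked candidate is the lowercased token verzweigungsenzym, because it appears in exactly those German segments whose English counterpart mentions the known term. A realistic system would also need to handle inflected forms, multi-word terms and imperfect alignment, none of which this sketch attempts.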

Participants in the two-year project include the University of Zurich (CH), the Friedrich-Schiller University of Jena (DE), the Erasmus Medical Centre Rotterdam (NL), Averbis GmbH (DE) and the EMBL-EBI (European Bioinformatics Institute) (DE).

Linguamatics will contribute both to the research phases of the project and to showcasing the benefits that can be obtained using its state-of-the-art text mining engine, I2E.

About Linguamatics and the I2E Text Mining Platform

Linguamatics is the world leader in deploying innovative natural language processing (NLP)-based text mining for high-value knowledge discovery and decision support. Linguamatics I2E is used by top commercial, academic and government organizations, including nine of the top ten global pharmaceutical companies. I2E can be used to mine a wide variety of text resources, such as scientific literature, patents, clinical trials data, news feeds and proprietary content. It is available as an in-house enterprise system, as a managed service and as software-as-a-service (SaaS) in the cloud.

Typical applications include:

  • Mapping gene-disease relationships and identifying potentially novel therapeutic targets
  • Biomarker discovery
  • Drug repurposing
  • Drug safety
  • Patent analysis
  • Clinical trial site selection and study design
  • Mining electronic medical records to improve prediction of health outcomes
  • Translational medicine
  • Competitive intelligence
  • Social media mining
  • Sentiment analysis

For more information, see and


About MANTRA

MANTRA will provide multilingual terminologies and semantically annotated multilingual documents, e.g., patent texts, to improve the accessibility of scientific information from multilingual documents. The project will exploit these multilingual document sets to harvest terms and concept representations in different languages in order to augment currently available terminological resources such as MeSH.

For more information, see

Sue Ziobro
Head of Marketing
+44 1223 651 910