Cross-Language Information Retrieval

Project information

LHNCBC is no longer conducting active research on this project. Information is presented here for historical purposes.

From 2002 to 2006, this project focused on expanding user access to NLM's biomedical information resources (such as ClinicalTrials.gov) by supporting languages other than English, such as Spanish. The Unified Medical Language System (UMLS) is an extensive source of biomedical knowledge developed and maintained by NLM. Our approach to adapting the UMLS for multilingual applications, especially information retrieval, was applied mainly to ClinicalTrials.gov, so that Spanish queries would retrieve relevant trials from the ClinicalTrials.gov repository. Several Spanish-language ClinicalTrials.gov prototypes were also developed in house and presented in conference papers.

Our cross-language information retrieval approach had three main components:

  • expanding the UMLS by adding relevant entries in other languages
  • combining multiple linguistic, machine-translation, and statistical approaches to facilitate information retrieval
  • adapting the UMLS lexical tools (e.g., LVG) for other languages

Publications/Tools:
Divita G, Rosemblat G, Browne AC. Building a medical Spanish lexicon. AMIA Annu Symp Proc. 2007 Oct 11:941.
Zeng-Treitler Q, Kim H, Rosemblat G, Kesselman A. Can multilingual machine translation help make medical record content more comprehensible to patients? Stud Health Technol Inform. 2010;160(Pt 1):73-7.
Rosemblat G, Graham L, Tse T. Extractive Summarization in Clinical Trials Protocol Summaries: A Case Study. Proc IICAI-2007, pp. 1824-1837.
Rosemblat G, Tse T. User Study of a Spanish-language ClinicalTrials.gov Prototype System. AMIA Annu Symp Proc. 2006:659-63.
Rosemblat G, Graham L. Cross-Language Search in a Monolingual Health Information System: Flexible Designs and Lexical Processes. Proc ISKO, pp. 173-182, Vienna, Austria, July 2006.
Rosemblat G, Graham L. A Pragmatic Approach to Summary Extraction in Clinical Trials. Proc HLT-NAACL 06, pp. 124-125, New York City, June 2006.
Rosemblat G, Tse T, Gemoets D, Gillen JE, Ide NC. Supporting Access to Consumer Health Information Across Languages. Proc ICML-9, September 20-23, 2005; Salvador, Bahia, Brazil.
Rosemblat G, Tse T, Gemoets D. Adapting A Monolingual Consumer Health System for Cross-Language Information Retrieval. Advances in Knowledge Organization, Proceedings of the Eighth International ISKO Conference. 2004 July;9:315-321.
Rosemblat G, Gemoets D, Browne AC, Tse T. Machine Translation-Supported Cross Language Information Retrieval for a Consumer Health Resource. AMIA Annu Symp Proc. 2003:564-8.