Natural Language Processing

LHNCBC's NLP R&D improves search and retrieval and facilitates discovery through advances in analyzing biomedical texts, graphical presentation of results, and multi-language search.

Projects

  • Automated Indexing Research

    The Indexing Initiative (II) project investigates language-based and machine learning methods for the automatic selection of subject headings for use in both semi-automated and fully automated indexing environments at NLM. Its major goal is to facilitate the retrieval of biomedical information from textual databases such as MEDLINE.

  • BabelMeSH and PICO Linguist

    BabelMeSH and PICO (Patient, Intervention, Comparison, and Outcome) Linguist are multi-language tools for searching MEDLINE/PubMed. Thirteen languages, including character-based languages, are supported. Recent enhancements include the ability to write a query in more than one language and to retrieve citations in more than one language.

  • De-Identification Tools

    Computational de-identification seeks to remove all identifiers from narrative clinical text, producing de-identified documents that can be used in research while protecting patient privacy.

  • Lexical Systems & Tools

    LHNCBC's Lexical Systems Group develops and maintains the SPECIALIST Lexicon and the tools that support and exploit it. The SPECIALIST Lexicon and NLP Tools are at the center of NLM's natural language research, providing a foundation for all our natural language processing efforts.

  • Semantic Knowledge Representation

    The Semantic Knowledge Representation project conducts basic research in symbolic natural language processing based on the UMLS knowledge sources. A core resource is the SemRep program, which extracts semantic predications from text. SemRep was originally developed for biomedical research.
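
As a rough illustration of what a semantic predication is, it can be pictured as a subject-predicate-object triple. The sketch below is a toy under stated assumptions: the `Predication` class, its field names, the `format_predication` helper, and the hand-written example are all hypothetical and do not reflect SemRep's actual output format or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predication:
    """Hypothetical container for one subject-predicate-object triple."""
    subject: str    # concept acting as the subject argument
    predicate: str  # relation linking the two arguments
    object: str     # concept acting as the object argument


def format_predication(p: Predication) -> str:
    """Render a predication as 'SUBJECT PREDICATE OBJECT'."""
    return f"{p.subject} {p.predicate} {p.object}"


# Hand-written example, not produced by SemRep: a sentence such as
# "Aspirin is used to treat headache" could be summarized as this triple.
example = Predication(subject="Aspirin", predicate="TREATS", object="Headache")
print(format_predication(example))
```

Storing predications as immutable triples like this makes them easy to deduplicate, count, or load into a graph, which is one common way such extracted relations are used downstream.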