Biomedical text summarization to support genetic database curation: using Semantic MEDLINE to create a secondary database of genetic information.

Workman TE, Fiszman M, Hurdle JF, Rindflesch TC
J Med Libr Assoc. 2010 Oct;98(4):273-81. doi: 10.3163/1536-5050.98.4.003.
Abstract: 

OBJECTIVE

This paper examines the development and evaluation of an automatic summarization system in the domain of molecular genetics. The system is a potential component of an advanced biomedical information management application called Semantic MEDLINE and could assist librarians in developing secondary databases of genetic information extracted from the primary literature.

METHODS

An existing summarization system was modified for identifying biomedical text relevant to the genetic etiology of disease. The summarization system was evaluated on the task of identifying data describing genes associated with bladder cancer in MEDLINE citations. A gold standard was produced using records from Genetics Home Reference and Online Mendelian Inheritance in Man. Genes in text found by the system were compared to the gold standard. Recall, precision, and F-measure were calculated.
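
As an illustration of the evaluation step, the comparison of system output against a gold standard can be sketched as a set comparison. This is a minimal sketch that assumes genes are matched by exact symbol; the function name `evaluate` and the gene lists are hypothetical placeholders, not the study's data or its actual matching procedure.

```python
def evaluate(found_genes, gold_genes):
    """Return (recall, precision, f_measure) for genes found vs. a gold standard.

    Assumption: a gene counts as correct only if its symbol appears
    verbatim in the gold standard (exact string match).
    """
    found = set(found_genes)
    gold = set(gold_genes)
    true_positives = len(found & gold)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(gold) if gold else 0.0

    # F-measure is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure


# Toy example only: these lists are invented for illustration.
gold_standard = ["FGFR3", "HRAS", "RB1", "TP53"]
system_output = ["FGFR3", "TP53", "GENE_X"]

recall, precision, f_measure = evaluate(system_output, gold_standard)
print(f"recall={recall:.2f} precision={precision:.2f} F={f_measure:.2f}")
```

In this toy case two of the three genes found are in the gold standard, and two of the four gold-standard genes are found, so precision is higher than recall.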

RESULTS

The system achieved a recall of 46% and a precision of 88% (F-measure = 0.61) when Gene References into Function (GeneRIFs) were taken into account.

CONCLUSION

The new summarization schema for genetic etiology has potential as a component in Semantic MEDLINE to support the work of data curators.