Blooper Detector
MEDLINE records contain MeSH indexing terms assigned by human indexers. Indexers may be aided by the MTI (Medical Text Indexer), an automated system that recommends a list of MeSH indexing terms from which indexers may select at their workstations. Sometimes MTI recommendations are grossly erroneous. For example, recommendations for the MEDLINE record (PMID 11748928) titled, "Viral Interleukin 6 stimulates human peripheral blood B cells that are unresponsive to human interleukin 6." include the MeSH term "Coma" due to "unresponsive" in the title. This term is clearly inappropriate for indexing unresponsive cells. Occasionally, indexers themselves assign terms erroneously; for example, the MEDLINE record (PMID 9809206) titled, "Modeling Escherichia coli. The concept of competitive coherence." was mis-indexed with the term Competitive Behavior (which MeSH reserves for human and animal behavior). Such erroneous terms are sometimes called "bloopers." The goal of our research is to develop a Blooper Detector that can automatically detect bloopers using JDI (Journal Descriptor Indexing) to identify them as outliers, in contrast to the more reasonable recommendations returned by MTI.
Each line of MTI output is one recommendation. A line consists of 8 fields, delimited by '|' as in the examples further down, with the contents shown in the following table:
Field | Content | Notes |
---|---|---|
1 | PMID | PubMed-assigned unique identifier. If free text, this is "0". |
2 | Term | MeSH term. A leading '*' means the term comes from the Title section. |
3 | CUI | Concept Unique Identifier for the MeSH term |
4 | Score | MTI score |
5 | Type | MH: MeSH Heading; HM: Heading Mapped to; ET: Entry Term; NM: Supplemental Concept; SH: MeSH SubHeading; CT: MeSH CheckTag |
6 | Misc | If ET, this explains the replacement. If not, blank. |
7 | Location | If from MMI: TI: Title; AB: Abstract; TI;AB: Title and Abstract |
8 | Path(s) | MM: MetaMap's MMI; RC: PubMed Related Citations; TG: John Wilbur's Trigram Method |
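As an illustration of the field layout above, the following minimal Python sketch splits one MTI output line into its 8 fields. It is not part of MTI or the Blooper Detector; the names `MtiRecommendation` and `parse_mti_line` are hypothetical. It assumes that fields are separated by '|', that no field contains a literal '|', that the score is an integer, and that multiple locations or paths are joined by ';' (as in "TI;AB").

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MtiRecommendation:
    """One MTI recommendation line (hypothetical container, for illustration)."""
    pmid: str            # field 1: PMID, "0" for free text
    term: str            # field 2: term with any leading '*' removed
    starred: bool        # True if field 2 started with '*'
    cui: str             # field 3: Concept Unique Identifier
    score: int           # field 4: MTI score (assumed to be an integer)
    type: str            # field 5: MH, HM, ET, NM, SH or CT
    misc: str            # field 6: replacement explanation, or blank
    location: List[str]  # field 7: e.g. ["TI", "AB"]
    paths: List[str]     # field 8: e.g. ["MM", "RC"]


def parse_mti_line(line: str) -> MtiRecommendation:
    """Split a '|'-delimited MTI line into its 8 fields."""
    fields = line.rstrip("\r\n").split("|")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    pmid, term, cui, score, rec_type, misc, location, paths = fields
    return MtiRecommendation(
        pmid=pmid,
        term=term.lstrip("*"),
        starred=term.startswith("*"),
        cui=cui,
        score=int(score),
        type=rec_type,
        misc=misc,
        # Drop empty pieces so a blank field yields an empty list.
        location=[p for p in location.split(";") if p],
        paths=[p for p in paths.split(";") if p],
    )


if __name__ == "__main__":
    rec = parse_mti_line(
        '17313486|*B-Cells|C0004561|21420|ET|'
        'Entry Term Replacement for "B-Lymphocytes"|TI;AB|MM;RC'
    )
    print(rec)
```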
The Blooper Detector selects the MeSH recommendation to be evaluated from each line. For example:
From the line
17313486|*Stupor|C0085628|23580|MH|RtM via: unresponsive behavior|TI|MM
the recommendation to be evaluated is Stupor.
From the line
17313486|*B-Cells|C0004561|21420|ET|Entry Term Replacement for "B-Lymphocytes"|TI;AB|MM;RC
the recommendation to be evaluated is B-Lymphocytes.
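The two examples above can be reproduced with a short Python sketch. The selection rule used here is inferred from those two examples only and may not cover every case the Blooper Detector handles: for an ET line, the term is taken from the quoted text in the Misc field; otherwise it is the Term field with any leading '*' removed. The function name `term_to_evaluate` is hypothetical.

```python
import re


def term_to_evaluate(line: str) -> str:
    """Return the MeSH recommendation to evaluate for one MTI output line.

    Assumed rule (inferred from the two examples in the text):
      - Type ET: use the quoted term in the Misc field.
      - Any other type: use the Term field without a leading '*'.
    """
    fields = line.strip().split("|")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    term, rec_type, misc = fields[1], fields[4], fields[5]

    if rec_type == "ET":
        # Misc looks like: Entry Term Replacement for "B-Lymphocytes"
        match = re.search(r'"([^"]+)"', misc)
        if match:
            return match.group(1)
    # Fall back to the Term field, minus the leading '*' marker.
    return term.lstrip("*")


if __name__ == "__main__":
    lines = [
        "17313486|*Stupor|C0085628|23580|MH|RtM via: unresponsive behavior|TI|MM",
        '17313486|*B-Cells|C0004561|21420|ET|'
        'Entry Term Replacement for "B-Lymphocytes"|TI;AB|MM;RC',
    ]
    for line in lines:
        print(term_to_evaluate(line))
```

Falling back to the Term field when an ET line has no quoted text in Misc is a design choice of this sketch, not documented behavior.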
Options | Text Input | Text JDI Option | Recommended Term Input | Recommended Term JDI Option |
---|---|---|---|---|
TIAB_SR_SR | TIAB | | MeSH Term | |
TIAB_nSR_nSR | TIAB | | MeSH Term | |
TIAB_SR_nSR | TIAB | | MeSH Term | |
TIAB_SR_MH | TIAB | | MeSH Term | |
TIAB_nSR_MH | TIAB | | MeSH Term | |
TIABMH_SR_SR | TIABMH | | MeSH Term | |
TIABMH_nSR_nSR | TIABMH | | MeSH Term | |
TIABMH_SR_nSR | TIABMH | | MeSH Term | |
TIABMH_SR_MH | TIABMH | | MeSH Term | |
TIABMH_nSR_MH | TIABMH | | MeSH Term | |