LexBuild

Lexicon Pre-Release Test

Introduction

The LEXICON is the basis of the annual UMLS Metathesaurus release. After the Lexicon is frozen, several tests need to be performed before the official release.

I. Cross-Reference test:

This test checks all cross-reference EUIs among acronyms, abbreviations, and nominalizations. The check can be performed in the LexBuild Web system:

  • Post-Proc -> Cross-Ref -> "Identify" -> Fix

It can also be performed by manually running the following scripts:

  • $LB_DIR/Tools/WebScript/IdentifyProblem 1 foo
  • $LB_DIR/Tools/WebScript/AutoFixAndIdentify true true true true true true true /home/lu/www/Tomcat/tomcat/webapps/WebLexBuild 1 foo

    A manual fix may be required after the auto-fix.
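The two manual steps above can be wrapped so that the second script runs only if the first one succeeds. This is a minimal sketch, not part of LexBuild: the function name run_cross_ref_check is hypothetical, the arguments are copied verbatim from the commands above without interpreting them, and it assumes both scripts return a non-zero exit status on failure.

```shell
# Hypothetical wrapper: run the cross-reference scripts in order.
run_cross_ref_check() {
    # Refuse to run if LB_DIR is unset or empty.
    if [ -z "${LB_DIR:-}" ]; then
        echo "LB_DIR is not set" >&2
        return 2
    fi
    scripts="$LB_DIR/Tools/WebScript"
    # Step 1: identify problems (arguments as given in this document).
    "$scripts/IdentifyProblem" 1 foo || return 1
    # Step 2: auto-fix, then identify again (arguments as given in this document).
    "$scripts/AutoFixAndIdentify" true true true true true true true \
        /home/lu/www/Tomcat/tomcat/webapps/WebLexBuild 1 foo || return 1
}
```

After defining the function in a shell where LB_DIR is set, call run_cross_ref_check; a non-zero return means one of the scripts failed or LB_DIR was missing.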

II. Check Irreg:

This program checks the base form in Irreg variants for records with spelling variants:

  • $LB_DIR/Tools/PostProcessing/CheckIrreg

Manual fixes are needed based on the output at
$LB_DIR/data/WebApp/Outputs/PostProc/irreg.data
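To see quickly how much manual work remains after running the check, the output file can be summarized. The sketch below is illustrative only: count_flagged is a hypothetical helper, and it assumes the output file lists one flagged record per non-blank line, which this document does not confirm.

```shell
# Hypothetical helper: count non-blank lines in a PostProc output file.
# Assumes one flagged record per line (not confirmed by this document).
count_flagged() {
    if [ ! -f "$1" ]; then
        echo "missing output file: $1" >&2
        return 2
    fi
    # grep -c prints how many lines contain a non-whitespace character;
    # "|| true" keeps the return status 0 when the count is 0.
    grep -c '[^[:space:]]' "$1" || true
}

# Usage, after running $LB_DIR/Tools/PostProcessing/CheckIrreg:
#   count_flagged "$LB_DIR/data/WebApp/Outputs/PostProc/irreg.data"
```

Under that assumption, a count of 0 would suggest that no records are waiting for a manual fix.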

III. Check Trademark:

This program identifies all records with:

  • annotation=trademark
  • annotation=trade name
  • annotation=trademarked ...

The problem originated in the old system, which had no field for trademarks, so the annotation field was used instead. All of these old records need to be updated. This is a one-time cleanup, and the problem should not recur once it is resolved.

  • $LB_DIR/Tools/PostProcessing/CheckTradeMark

Manual fixes are needed based on the output at
$LB_DIR/data/WebApp/Outputs/PostProc/tradeMark.data
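The kind of match this check looks for can be illustrated with a plain grep. This sketch is not how CheckTradeMark is implemented: find_trademark_annotations is a hypothetical helper, and it assumes a text file in which each annotation appears on its own line as annotation=..., which may not match the actual LexBuild data layout.

```shell
# Hypothetical helper: list lines whose annotation looks like a trademark note.
# Assumes a plain-text file with one "annotation=..." field per line.
find_trademark_annotations() {
    # -n prints line numbers; -E enables the (a|b) alternation.
    # The "trademark" alternative also matches "trademarked ...".
    grep -nE 'annotation=(trademark|trade name)' "$1" || true
}

# Usage with a hypothetical export file:
#   find_trademark_annotations records.txt
```

Each output line has the form line-number:matched-line, which makes it easier to locate the record that needs a manual fix.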

After all data are fixed in LB, generate the LEXICON again and make it the official frozen LEXICON (by 7/31).