Sub-Term Mapping Tools

Frequently Asked Questions

(Please read before asking a question)

  • How can I ask a question?
    See Contact Us

  • Why can't I install STMT successfully?
    One of the most common mistakes is running the STMT installation script from the wrong directory. Make sure to run it from the top directory of ${STMT_DIR}. Please refer to the installation instructions for details.

  • Can I install STMT on a Mac (a platform other than Linux and MS Windows)?
    Yes, but it requires manual installation.

  • Do I need to install STMT if I just want to use the Java APIs to get sub-terms?
    No. You don't need to install STMT if you just want to get sub-terms from your own corpus file. All you need is stmt${YEAR}api.jar or stmt${YEAR}dist.jar. If you want to use a preloaded corpus (Lexicon or UMLS-Core synonyms) or run the command line tools, then you will need to install STMT. Please refer to STMT files usage for details.

  • How many different tools are included in STMT? What are they?
    Five. Please see the STMT user documents for details.
    Tools  Full Name                 Usage
    lsf    LexItem Sub-Term Finder   Find a term, subterms, longest prefix subterm in Lexicon
    mt     Mapping Tool              Map a term to CUIs, EUIs, (recursive) synonyms, and preferred term
    nt     Normalization Tool        Includes LexNorm, SynonymNorm, and LvgNorm
    smt    Synonym Mapping Tool      Find CUIs|preferred terms by substituting synonym for subterms
    stmt   Sub-Term Mapping Tool     Generic sub-term tool to find subterms, prefix subterms, longest prefix subterms, and all permutations of synonym substitutions

  • How do I set up STMT with a different data release?
    Please refer to user documents - data version setup for details.

  • Which file should I configure as the corpus in the configuration file: SYNONYM_FILE or CORPUS_FILE?
    It depends on your application:
    • If you only deal with sub-terms or prefixes: use CORPUS_FILE.
    • If you need to find the synonym substitutions for sub-terms: use SYNONYM_FILE.
    Please note that all terms in the corpus (from either file) should be normalized. Also, STMT ignores CORPUS_FILE when SYNONYM_FILE is specified; in that case the first field (key term) of each SYNONYM_FILE entry is used as the corpus. Please refer to configuration setup for details.
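
    The precedence rule above can be sketched in a few lines of Java. This is only an illustration, not STMT's own code: it assumes the configuration can be read as KEY=value pairs with java.util.Properties, and that fields in the synonym file are separated by '|'. The class name, method names, and file names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class CorpusSource {
    /**
     * Returns the file that supplies the corpus terms:
     * SYNONYM_FILE if it is set, otherwise CORPUS_FILE, otherwise null.
     */
    static String effectiveCorpusFile(Properties config) {
        String synonymFile = config.getProperty("SYNONYM_FILE");
        if (synonymFile != null && !synonymFile.trim().isEmpty()) {
            return synonymFile.trim(); // CORPUS_FILE is ignored in this case
        }
        String corpusFile = config.getProperty("CORPUS_FILE");
        if (corpusFile != null && !corpusFile.trim().isEmpty()) {
            return corpusFile.trim();
        }
        return null;
    }

    /**
     * Extracts the first field (key term) of each non-blank synonym line.
     * The '|' field separator is an assumption made for this sketch.
     */
    static List<String> keyTerms(List<String> synonymLines) {
        List<String> terms = new ArrayList<>();
        for (String line : synonymLines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            terms.add(line.split("\\|", -1)[0].trim());
        }
        return terms;
    }

    public static void main(String[] args) throws IOException {
        Properties config = new Properties();
        // Hypothetical file names, for illustration only.
        config.load(new StringReader(
            "CORPUS_FILE=my.corpus.txt\nSYNONYM_FILE=my.synonyms.txt\n"));
        System.out.println(effectiveCorpusFile(config));
        System.out.println(keyTerms(Arrays.asList("dog|canine", "cat|feline")));
    }
}
```

    With both keys set, the sketch picks the synonym file and takes only the key terms from it, which mirrors the behavior described above.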

  • How do I use a different version of LVG in SMT?

  • How do I use a different version of the Lexicon or UMLS-Metathesaurus in SMT to find concepts?
    • Update HSqlDb
      • Download the desired version of the HSqlDb data (HSqlDb files)
      • Unpack the files and put them under ${STMT}/data/
    • Update Lvg
      The version of Lexical Tools should be kept in sync with the data; see the question above on updating Lexical Tools
    • Update configuration file
      Update DB_NAME in the smt configuration file, smt.properties

    It is a good idea to save the settings for each version in its own smt configuration file (smt.properties.X) and to use the -X option to run smt with the configuration file you want.
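
    As a rough sketch of the last step, the Java snippet below copies an existing smt.properties to a versioned sibling file and changes only DB_NAME. It assumes the file is in a KEY=value format that java.util.Properties can read; the class name, method name, and any suffix or DB_NAME value passed in are hypothetical. Note that Properties.store does not preserve comments or key order, so editing a copy by hand may be preferable if the original file is heavily commented.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class VersionedSmtConfig {
    /**
     * Reads baseConfig, sets DB_NAME to dbName, and writes the result to
     * "<baseConfig file name>.<versionSuffix>" in the same directory.
     * Returns the path of the file that was written.
     */
    static Path writeVersionedConfig(Path baseConfig, String versionSuffix, String dbName)
            throws IOException {
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(baseConfig, StandardCharsets.UTF_8)) {
            props.load(in);
        }
        props.setProperty("DB_NAME", dbName);

        Path target = baseConfig.resolveSibling(baseConfig.getFileName() + "." + versionSuffix);
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            // The second argument is written as a header comment.
            props.store(out, "smt configuration for data version " + versionSuffix);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: VersionedSmtConfig <smt.properties> <suffix> <dbName>");
            return;
        }
        Path written = writeVersionedConfig(Paths.get(args[0]), args[1], args[2]);
        System.out.println("Wrote " + written);
    }
}
```

    The original smt.properties is left untouched, so each data version keeps its own file.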

  • How do I customize my own synonym file?