Text Categorization

MeSH Subheadings

MeSH (Medical Subject Headings) is the National Library of Medicine's controlled vocabulary thesaurus. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. MeSH contains more than 23,000 descriptors, arranged both alphabetically and hierarchically. MH is the tag for MeSH terms, which describe the subject of each journal article in MEDLINE.

MeSH Subheadings are used with MeSH terms to describe a particular aspect of a subject more completely. For example, the drug therapy of asthma is displayed as asthma/drug therapy. Subheadings are also referred to as qualifiers. There are 83 topical qualifiers (MeSH Subheadings) used for indexing and cataloging in conjunction with descriptors (MeSH terms). A MeSH Subheading can be represented by its full name, its short name, or its two-letter abbreviation. Some Subheadings have spelling variants of their full names.
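The descriptor/qualifier notation above can be taken apart with a few lines of Java. The following is a minimal sketch only: the class name MeshHeading and its parse method are hypothetical and are not part of the API described on this page, and the sketch assumes at most one qualifier, separated from the descriptor by the first slash.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of the documented SubHeadings API.
final class MeshHeading {
    final String descriptor;           // e.g. "asthma"
    final Optional<String> qualifier;  // e.g. "drug therapy", empty if none given

    private MeshHeading(String descriptor, Optional<String> qualifier) {
        this.descriptor = descriptor;
        this.qualifier = qualifier;
    }

    // Assumption: the text holds at most one qualifier, after the first '/'.
    static MeshHeading parse(String text) {
        int slash = text.indexOf('/');
        if (slash < 0) {
            // No slash: the whole text is the descriptor.
            return new MeshHeading(text.trim(), Optional.empty());
        }
        String descriptor = text.substring(0, slash).trim();
        String qualifier = text.substring(slash + 1).trim();
        return new MeshHeading(descriptor, Optional.of(qualifier));
    }
}
```

With this sketch, MeshHeading.parse("asthma/drug therapy") yields the descriptor "asthma" and the qualifier "drug therapy", while a plain "asthma" yields an empty qualifier.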

  • Description:

    This Java API reads MeSH Subheadings from a file and loads them into a Java object. The object converts between the full names, short names, and abbreviations of Subheadings.

  • API Usage:
    • SubHeadings(String inFile)
    • SubHeadings(String inFile, boolean verbose)

  • Inputs:

  • Algorithm:
    • Process:
      • Read the input file and store the Subheadings in a SubHeadings Java object.
      • shs.txt holds 83 Subheadings (138 lines, because some full names have spelling variants), one record per line with three fields in this order:
        Abbreviation, Short Name, Full Name
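
The load-and-convert process above might be sketched as follows. This is an illustration under stated assumptions, not the actual SubHeadings implementation: the class name SubHeadingTableSketch and all of its methods are hypothetical, and the field delimiter is assumed to be a vertical bar ("|") because the real delimiter of shs.txt is not shown here. The sketch keeps one short name per abbreviation and collects every full-name spelling variant under it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; the real SubHeadings class may be organized differently.
final class SubHeadingTableSketch {
    private final Map<String, String> abbrToShort = new LinkedHashMap<>();
    private final Map<String, List<String>> abbrToFull = new LinkedHashMap<>();
    private final Map<String, String> nameToAbbr = new LinkedHashMap<>();

    // Builds the table from lines of the assumed form "Abbreviation|Short Name|Full Name".
    static SubHeadingTableSketch fromLines(List<String> lines) {
        SubHeadingTableSketch table = new SubHeadingTableSketch();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split("\\|", -1); // ASSUMED delimiter
            if (fields.length != 3) {
                throw new IllegalArgumentException("Expected 3 fields: " + line);
            }
            String abbr = normalizeAbbr(fields[0]);
            String shortName = fields[1].trim();
            String fullName = fields[2].trim();
            table.abbrToShort.putIfAbsent(abbr, shortName);
            // Several lines may share one abbreviation (spelling variants).
            table.abbrToFull.computeIfAbsent(abbr, k -> new ArrayList<>()).add(fullName);
            table.nameToAbbr.put(nameKey(shortName), abbr);
            table.nameToAbbr.put(nameKey(fullName), abbr);
        }
        return table;
    }

    private static String normalizeAbbr(String abbr) {
        return abbr.trim().toUpperCase(Locale.ROOT);
    }

    private static String nameKey(String name) {
        return name.trim().toLowerCase(Locale.ROOT);
    }

    // Short or full name (case-insensitive) to abbreviation; null if unknown.
    String abbreviationOf(String name) {
        return nameToAbbr.get(nameKey(name));
    }

    // Abbreviation to short name; null if unknown.
    String shortNameOf(String abbr) {
        return abbrToShort.get(normalizeAbbr(abbr));
    }

    // Abbreviation to all full-name variants; empty list if unknown.
    List<String> fullNamesOf(String abbr) {
        return abbrToFull.getOrDefault(normalizeAbbr(abbr), Collections.emptyList());
    }

    // Number of distinct Subheadings (abbreviations) loaded.
    int size() {
        return abbrToShort.size();
    }
}
```

To load from a file, the lines could be read with java.nio.file.Files.readAllLines(Paths.get(inFile)) and passed to fromLines. Keying the lookups on a normalized abbreviation lets the number of distinct Subheadings (83 in shs.txt) differ from the number of input lines (138), since each spelling variant only adds another full name to an existing entry.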