
Lexical Tools

Sort Words by Order

  • Short Description: Sort the words of the input term in ascending order.

  • Full Description:

    This flow sorts the words of a term in ascending ASCII order (not dictionary order) and strips punctuation. It is useful for terms and vocabularies in which terms may be inverted and, if inverted, may or may not be inverted around commas. For example, "lung cancer", "Cancer, Lung", and "Cancer Lung" all refer to the same term. Word order sort changes all three examples to "cancer lung", apart from letter case, which the Java version keeps as in the input.
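
To illustrate what "ASCII order (not dictionary order)" means in practice, here is a minimal Java sketch. It is not taken from the lvg source; the class name `AsciiOrderDemo`, its methods, and the sample words are hypothetical. It assumes plain ASCII input, for which Java's natural `String` ordering matches ASCII order, so every uppercase letter sorts before every lowercase letter.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of lvg.
final class AsciiOrderDemo {
    // Natural String ordering compares UTF-16 code units; for plain ASCII
    // text this is ASCII order, so "Cell" sorts before "carcinoma".
    static String[] asciiSort(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Case-insensitive ordering, closer to what a dictionary would use.
    static String[] dictionarySort(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // prints "Cell carcinoma small"
        System.out.println(String.join(" ", asciiSort("small", "Cell", "carcinoma")));
        // prints "carcinoma Cell small"
        System.out.println(String.join(" ", dictionarySort("small", "Cell", "carcinoma")));
    }
}
```

Because the sort is case sensitive, two spellings of a term that differ only in case can come out in different word orders.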

    This flow has no effect on the -m option: "none" is added at the end of the output.

  • Difference: The Java version keeps the original case of each word from the input term after word sorting.

  • Features:
    1. Replace punctuation with spaces.
    2. Sort words in ascending ASCII order.


  • Symbol: w

  • Examples:
    
    shell> lvg -f:w
    Cancer, Lung
    Cancer, Lung|Cancer Lung|2047|16777215|w|1|
    
    Lung Cancer
    Lung Cancer|Cancer Lung|2047|16777215|w|1|
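
The output lines above are pipe-delimited records. The following Java sketch splits one such line into its fields. The class and method names are hypothetical. From the examples alone, the first field is the input term, the second is the sorted output, and the fifth matches the flow symbol; the meaning of the remaining fields is not stated in this section, so the sketch leaves them unnamed.

```java
// Hypothetical helper, not part of lvg.
final class LvgRecordSketch {
    // Split a pipe-delimited output line into its fields.
    // String.split drops trailing empty strings, so the final "|" on
    // each line does not produce an extra empty field.
    static String[] fields(String line) {
        return line.split("\\|");
    }

    public static void main(String[] args) {
        String[] f = fields("Cancer, Lung|Cancer Lung|2047|16777215|w|1|");
        System.out.println("input:  " + f[0]); // Cancer, Lung
        System.out.println("output: " + f[1]); // Cancer Lung
        System.out.println("flow:   " + f[4]); // w
    }
}
```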
    

  • Implementation Logic:
    1. Tokenize the input term into words.
    2. Replace punctuation with spaces.
    3. Sort the words in ascending ASCII (case-sensitive) order.
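
The steps above can be sketched in Java as follows. This is an illustrative approximation, not the code in ToSortWordsByOrder.java: the class name `SortWordsSketch` is hypothetical, the sketch replaces punctuation before tokenizing for simplicity, and it assumes that "punctuation" means the characters matched by Java's `\p{Punct}` class, which may differ from the set lvg actually uses.

```java
import java.util.Arrays;

// Hypothetical sketch, not the lvg implementation.
final class SortWordsSketch {
    static String sortWords(String term) {
        // Replace each punctuation character with a space
        // (assumption: punctuation = Java's \p{Punct} class).
        String cleaned = term.replaceAll("\\p{Punct}", " ").trim();
        if (cleaned.isEmpty()) {
            return "";
        }
        // Tokenize on runs of whitespace.
        String[] words = cleaned.split("\\s+");
        // Natural String order is case sensitive and matches ASCII order
        // for ASCII input; the original case of each word is kept.
        Arrays.sort(words);
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        System.out.println(sortWords("Cancer, Lung")); // Cancer Lung
        System.out.println(sortWords("Lung Cancer"));  // Cancer Lung
        System.out.println(sortWords("lung cancer"));  // cancer lung
    }
}
```

The first two calls reproduce the output terms shown in the Examples section.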

  • Source Code: ToSortWordsByOrder.java

  • Hierarchy: Object -> Transformation -> ToSortWordsByOrder