Text Categorization

What is new?

The 2011 version of the Text Categorization tool is the 5th official public release. It is written in pure Java and can handle UTF-8 text. Below are some specifications of this tool.

System

  • Upgraded to Java 1.6.0.21
  • Upgraded to HSqlDb 2.0.0
  • Provides scripts for command line tools

Data

  • Used MEDLINE.2011 for citations created in 2008, 2009, and 2010
  • Used Metathesaurus.2010AB
  • Used lsi2011.xml
  • Used the latest data set for JDI, STI, and STRI
  • Updated the default value of Mac. normalized count
  • Compatible with the following data sets:
    • tcData.2010
    • tcData.2009
    • tcData.2008
    • tcData.2007

Features

  • Added new features in StWsd to accept ST abbreviations and TUIs as ST candidates