
Text Categorization

StJds (StJd table)


    A table (file) that stores the JD scores for each ST, computed from the testing data set (4000 MEDLINE journals, 1.3M records). The scores are based on:

    • word frequency count
    • document count for word
    The file is pre-generated. At run time it is read in, loaded into RAM, and then used to perform real-time ST indexing on text. Currently, the table is generated from Susanne's data. New programs are to be developed to generate the data from scratch (from a new test set).

  • Description:

    This Java class reads the St-Jd table from a file and loads it into a Java object. The object provides basic methods to set and get the JD scores for a specified ST.

  • Inputs: Each record in the file, stJdTable.txt, contains the following fields, in order:
    • ST
    • JD index
    • Word scores
    • Document scores
    • JD Id
    • JD Full Name

  • Java Files:
    • StJd.java
    • StJds.java

  • Algorithm:
    • Read in the file and save the St-Jd scores into a Java object, StJds.
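
The load step described above could look roughly like the following. This is a minimal sketch, not the actual StJd.java or StJds.java: the class name StJdTableSketch, its methods, and the Entry fields are hypothetical, and the code assumes one record per line with six fields separated by "|" (the real delimiter is not stated here). The sample line in main is made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of an in-memory St-Jd table (not the real StJds class). */
class StJdTableSketch {

    /** One record: the scores of a single JD for a single ST. */
    static final class Entry {
        final int jdIndex;
        final double wordScore;   // score based on word frequency count
        final double docScore;    // score based on document count for word
        final String jdId;
        final String jdName;

        Entry(int jdIndex, double wordScore, double docScore,
              String jdId, String jdName) {
            this.jdIndex = jdIndex;
            this.wordScore = wordScore;
            this.docScore = docScore;
            this.jdId = jdId;
            this.jdName = jdName;
        }
    }

    // ST -> all JD entries read for that ST, in file order
    private final Map<String, List<Entry>> bySt = new HashMap<>();

    /** Stores one JD entry under the given ST. */
    void add(String st, Entry entry) {
        bySt.computeIfAbsent(st, k -> new ArrayList<>()).add(entry);
    }

    /** Returns the JD entries for an ST, or an empty list if the ST is unknown. */
    List<Entry> get(String st) {
        return bySt.getOrDefault(st, Collections.<Entry>emptyList());
    }

    /** Parses one record. Assumption: six fields separated by '|'. */
    void addLine(String line) {
        String[] f = line.split("\\|", -1);
        if (f.length < 6) {
            throw new IllegalArgumentException("Expected 6 fields: " + line);
        }
        add(f[0].trim(), new Entry(
                Integer.parseInt(f[1].trim()),
                Double.parseDouble(f[2].trim()),
                Double.parseDouble(f[3].trim()),
                f[4].trim(),
                f[5].trim()));
    }

    /** Reads the whole file into memory, skipping blank lines. */
    static StJdTableSketch load(Path file) throws IOException {
        StJdTableSketch table = new StJdTableSketch();
        try (BufferedReader reader =
                     Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    table.addLine(line);
                }
            }
        }
        return table;
    }

    public static void main(String[] args) {
        StJdTableSketch table = new StJdTableSketch();
        // Made-up record, for illustration only.
        table.addLine("T001|1|0.25|0.5|JD001|Example Descriptor");
        for (Entry e : table.get("T001")) {
            System.out.println(e.jdName + " " + e.wordScore + " " + e.docScore);
        }
    }
}
```

Keeping the entries in a map keyed by ST means a lookup during real-time indexing does not rescan the file. With an actual file, StJdTableSketch.load(Paths.get("stJdTable.txt")) would replace the hand-built table in main, provided the delimiter assumption holds.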