Text Categorization

Pre-Process: Jdid-Dc-NFactor

  • Description:
    This file lists the Jdid, Dc (document count), and NFactor of every JD (Journal Descriptor) in the training set (MEDLINE).

  • Input:

  • Java Files & Algorithm:
    • GenerateJdDcNFactor.java
    • Read in Journal Descriptors from jds.txt
    • Calculate Dc (document count) for each JD
      • Read UI, JID, JDs from UiJidJds.${NUM}.txt
      • Update document count for all JDs
    • Calculate NFactor
      • Calculate total and average document count for all JDs
      • avg. = total Dc of all JDs / number of JDs
      • Assign NFactor for all JDs
        • NFactor = avg./jdDc (if jdDc > avg.)
        • NFactor = 1.0 (otherwise)
    • Print out Jdid-Dc-NFactor
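
The steps above can be sketched in Java roughly as follows. This is a minimal illustration, not the actual source of GenerateJdDcNFactor.java: the class name JdDcNFactorSketch, the methods countDocs and nFactors, the in-memory sample data, and the pipe-delimited print format are all assumptions made for the example. It also assumes that each document record lists a given JD at most once, and it does not read jds.txt or the UiJidJds files.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the Dc / NFactor calculation described above.
class JdDcNFactorSketch {

    // Dc step: count how many documents carry each JD.
    // Each inner list holds the JDs of one document (one UI/JID record).
    static Map<String, Integer> countDocs(List<String> jds, List<List<String>> docJds) {
        Map<String, Integer> dc = new LinkedHashMap<>();
        for (String jd : jds) {
            dc.put(jd, 0);                      // every known JD starts at 0
        }
        for (List<String> doc : docJds) {
            for (String jd : doc) {
                if (dc.containsKey(jd)) {       // ignore JDs not in the JD list
                    dc.merge(jd, 1, Integer::sum);
                }
            }
        }
        return dc;
    }

    // NFactor step: avg = total Dc / number of JDs;
    // NFactor = avg / jdDc if jdDc > avg, otherwise 1.0.
    static Map<String, Double> nFactors(Map<String, Integer> dc) {
        long total = 0;
        for (int count : dc.values()) {
            total += count;
        }
        double avg = dc.isEmpty() ? 0.0 : (double) total / dc.size();

        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : dc.entrySet()) {
            int jdDc = e.getValue();
            result.put(e.getKey(), jdDc > avg ? avg / jdDc : 1.0);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: three JDs and five documents.
        List<String> jds = List.of("A", "B", "C");
        List<List<String>> docs = List.of(
            List.of("A"), List.of("A"), List.of("A", "B"), List.of("A"), List.of("C"));

        Map<String, Integer> dc = countDocs(jds, docs);
        Map<String, Double> nf = nFactors(dc);

        // Print one line per JD; the field layout here is illustrative only.
        for (String jd : jds) {
            System.out.println(
                String.format(Locale.ROOT, "%s|%d|%.4f", jd, dc.get(jd), nf.get(jd)));
        }
    }
}
```

With the sample data, the document counts are A = 4, B = 1, C = 1, so the average is 6 / 3 = 2.0. Only A exceeds the average and receives NFactor = 2.0 / 4 = 0.5; B and C keep NFactor = 1.0. The effect is that JDs with above-average document counts are scaled down, while all others are left unchanged.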

  • Output file: