Step | Description | Inputs | Outputs | Notes
|
---|
MEDLINE Unigram Spectrum Analysis
|
---|
1 | Group raw unigram by core-term.lc
| - ./Medline/unigram.${YEAR}
| - ./Medline/unigram.${YEAR}.core.lc
- ./Medline/unigram.${YEAR}.core.lc.detail
|
|
2 | Get MEDLINE unigram WC Frequency Spectrum
NGramUtil.GetBasicHistogram
| - ./Medline/unigram.${YEAR}.core.lc
| - ./Medline/unigram.${YEAR}.core.lc.his.csv
| - Used as input data for Excel diagram
|
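A minimal sketch of steps 1 and 2, not the actual NGramUtil implementation. It assumes that "core-term.lc" means stripping surrounding punctuation and lowercasing, and that the frequency spectrum counts how many distinct terms share each WC value. The helper names (core_term_lc, group_by_core_term, frequency_spectrum) and the sample records are hypothetical; the real file layout of unigram.${YEAR} is not assumed here, so the input is a plain list of (term, WC) pairs.

```python
from collections import Counter
import string


def core_term_lc(term):
    """Hypothetical normalization: strip surrounding punctuation and whitespace, then lowercase."""
    return term.strip(string.punctuation + string.whitespace).lower()


def group_by_core_term(records):
    """Sum the word counts of all raw unigrams that share one normalized key."""
    grouped = Counter()
    for term, wc in records:
        key = core_term_lc(term)
        if key:  # skip entries that are pure punctuation
            grouped[key] += wc
    return grouped


def frequency_spectrum(grouped):
    """Map each word count to the number of distinct terms that have that count."""
    return Counter(grouped.values())


# Made-up sample data, for illustration only.
records = [
    ("Cell", 5), ("cell", 3), ("(cell)", 2),
    ("gene", 4), ("Gene.", 1),
    ("rna", 5),
    ("dna", 1),
]

grouped = group_by_core_term(records)
spectrum = frequency_spectrum(grouped)

# Emit "WC,number of terms" rows, similar in spirit to a .his.csv file.
for wc in sorted(spectrum):
    print(f"{wc},{spectrum[wc]}")
```

With the sample data, "Cell", "cell", and "(cell)" collapse into one key with WC 10, so the spectrum has one term at WC 10, two terms at WC 5, and one term at WC 1.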
Lexicon Word Spectrum Analysis
|
---|
10 | Get Lexicon single word frequency spectrum
LexWords.GetLexWordFreSpectrum
- TYPE 0: all, 1: SW (single word), 2: MW (multiword)
| - ${IN_DIR}inflVars.data
- ./Medline/unigram.${YEAR}.core.lc
| - ./LexSpec/sWord.b.csv (Lexicon words in MEDLINE or not)
- ./LexSpec/sWord.l.csv (Lexicon words with MEDLINE WC)
- ./LexSpec/sWord.rpt
- ./LexSpec/sWord.sum
|
|
11 | Group distilled n-gram set by core-term.lc
NGramUtil.GroupByCoreTerm
| - ${NGRAM_DIR}nGrams/distilledNGram.${YEAR}
| - ${NGRAM_DIR}nGrams/distilledNGram.${YEAR}.core.lc
- ${NGRAM_DIR}nGrams/distilledNGram.${YEAR}.core.lc.detail
| - Same as step-11 in 06.NGramUtil
|
12 | Get all words frequency spectrum
LexWords.GetLexWordFreSpectrum
- TYPE 0: all, 1: SW, 2: MW
| - ${IN_DIR}inflVars.data
- ${NGRAM_DIR}nGrams/distilledNGram.${YEAR}.core.lc
| - ./LexSpec/aWord.b.csv (Lexicon words in MEDLINE or not)
- ./LexSpec/aWord.l.csv (Lexicon words with MEDLINE WC)
- ./LexSpec/aWord.rpt
- ./LexSpec/aWord.sum
|
|
13 | Get multiwords frequency spectrum
LexWords.GetLexWordFreSpectrum
- TYPE 0: all, 1: SW, 2: MW
| - ${IN_DIR}inflVars.data
- ${NGRAM_DIR}nGrams/distilledNGram.${YEAR}.core.lc
| - ./LexSpec/mWord.b.csv (Lexicon words in MEDLINE or not)
- ./LexSpec/mWord.l.csv (Lexicon words with MEDLINE WC)
- ./LexSpec/mWord.rpt
- ./LexSpec/mWord.sum
|
|
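The idea behind steps 10, 12, and 13 can be sketched as a lookup of each Lexicon word in a table of corpus word counts, filtered by TYPE. This is an illustration under stated assumptions, not the LexWords.GetLexWordFreSpectrum code: a single word is assumed to be a term without whitespace, matching is assumed to be on the lowercased form, and the function names and sample data are hypothetical.

```python
def select_words(lexicon_words, word_type=0):
    """Filter by TYPE: 0 = all, 1 = single words only, 2 = multiwords only."""
    if word_type == 1:
        return [w for w in lexicon_words if len(w.split()) == 1]
    if word_type == 2:
        return [w for w in lexicon_words if len(w.split()) > 1]
    return list(lexicon_words)


def lexicon_word_counts(lexicon_words, corpus_wc, word_type=0):
    """Return (word, found_in_corpus, wc) rows; wc is 0 when the word is absent."""
    rows = []
    for word in select_words(lexicon_words, word_type):
        key = word.lower()
        rows.append((word, key in corpus_wc, corpus_wc.get(key, 0)))
    return rows


def coverage(rows):
    """Fraction of the selected Lexicon words that occur in the corpus."""
    if not rows:
        return 0.0
    return sum(1 for _, found, _ in rows if found) / len(rows)


# Made-up sample data, for illustration only.
lexicon = ["cell", "heart attack", "xylophone", "Gene"]
corpus_wc = {"cell": 10, "gene": 5, "heart attack": 2}

all_rows = lexicon_word_counts(lexicon, corpus_wc, word_type=0)
sw_rows = lexicon_word_counts(lexicon, corpus_wc, word_type=1)
mw_rows = lexicon_word_counts(lexicon, corpus_wc, word_type=2)
```

In this sketch the found/not-found flag plays the role of the *.b.csv output and the word count plays the role of the *.l.csv output; how the real tool lays out those files is not assumed.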
Lexicon Word Histogram Analysis (Used in AMIA Paper)
|
---|
20 | Get normTerm.lc from inflVars
| - ${IN_DIR}inflVars.data.f1
| - ./LexHist/inflVars.data.f1.core.lc
| - Get the norm-term.lc from inflVars
|
21 | Split single words and multiwords from Lexicon inflVars
LexWords.SplitSingleMultiWords
| - ./LexHist/inflVars.data.f1.core
| - inflVars.data.f1.core.mw
- inflVars.data.f1.core.sw
| - Same as step-11 in 06.NGramUtil
|
22 | Add WC to Lexicon single word
NGramUtil.AddWcToCoreTerm
| - ./LexHist/inflVars.data.f1.core.sw
- ${NGRAM_DIR}nGrams/nGramSet.${YEAR}.30.core.lc
| - ./LexHist/inflVars.data.f1.core.sw.wc
|
|
23 | Add WC to Lexicon multiword |