Configuration Setup
CSpell Java lets users choose among setup options through a configuration file. The default configuration file is ${CSPELL_DIR}/data/Config/cSpell.properties. The variables used in the configuration file are listed in the following tables; their default values are the empirically determined best values. "Relative path" refers to a path relative to the cSpell top directory, ${CSPELL_DIR}.
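
As a rough illustration of how such a file could be consumed, the sketch below parses key/value text with `java.util.Properties` and resolves one relative-path variable against `CS_DIR`. It assumes the file follows standard Java properties syntax, which this page does not confirm. The class name `CSpellConfigSketch` and the sample values are hypothetical placeholders, not CSpell's own API or shipped defaults.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Hypothetical sketch: read CSpell-style settings and resolve a relative path. */
final class CSpellConfigSketch {

    /** Parses properties text; a real caller might pass Files.newBufferedReader(path). */
    static Properties load(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Resolves the relative path stored under key against the CS_DIR value. */
    static Path resolve(Properties props, String key) {
        Path top = Paths.get(props.getProperty("CS_DIR"));
        return top.resolve(props.getProperty(key)).normalize();
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only; not the shipped defaults.
        String text = "CS_DIR=/opt/cspell\n"
                    + "CS_FREQUENCY_FILE=data/Frequency/example.data\n";
        Properties props = load(new StringReader(text));
        System.out.println(resolve(props, "CS_FREQUENCY_FILE"));
    }
}
```

Keeping file locations relative to `CS_DIR` means that only one variable has to change when the installation is moved.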
I. Configuration Variables
Directories and Files (13)

Variable Names | Descriptions | Variable Values (Default) |
---|---|---|
CS_DIR | The absolute path of the CSpell directory | |
CS_INFORMAL_EXP_FILE | The relative path of the informal expression file | |
CS_CHECK_DIC_FILES | The relative path of the check dictionary file | |
CS_SUGGEST_DIC_FILES | The relative path of the suggestion dictionary file | |
CS_SPLIT_WORD_DIC_FILES | The relative path of the split word dictionary file | |
CS_MW_DIC_FILE | The relative path of the multiword dictionary file | |
CS_UNIT_DIC_FILE | The relative path of the units file | |
CS_SV_DIC_FILE | The relative path of the spelling variants dictionary file | |
CS_AA_DIC_FILE | The relative path of the abbreviation/acronym dictionary file | |
CS_PN_DIC_FILE | The relative path of the proper noun dictionary file | |
CS_FREQUENCY_FILE | The relative path of the word frequency file | |
CS_W2V_IM_FILE | The relative path of the word2Vec CBOW input matrix file | |
CS_W2V_OM_FILE | The relative path of the word2Vec CBOW output matrix file | |

Modes Setup (2)

Variable Names | Descriptions | Variable Values (Default) |
---|---|---|
CS_FUNC_MODE | Functional mode | |
CS_RANK_MODE | Ranking mode for non-word, 1-to-1 and Split | |

Detector Variables (5)

Variable Names | Descriptions | Variable Values (Default) |
---|---|---|
CS_MAX_LEGIT_TOKEN_LENGTH | The maximum length of a legit token for spelling detection and correction. | |
CS_DETECTOR_RW_SPLIT_WORD_MIN_LENGTH | The minimum length for real-word split detection. | |
CS_DETECTOR_RW_SPLIT_WORD_MIN_WC | The minimum word count (frequency) for real-word split detection. | |
CS_DETECTOR_RW_1TO1_WORD_MIN_LENGTH | The minimum length for real-word 1-to-1 detection. | |
CS_DETECTOR_RW_1TO1_WORD_MIN_WC | The minimum word count for real-word 1-to-1 detection. | |
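
The detector thresholds can be pictured as a simple gate in front of real-word split detection. The sketch below is only one plausible reading of the descriptions above. It assumes that all bounds are inclusive and that a token must pass both the length and the word-count checks. The class `DetectorGateSketch` and the numbers passed to it are hypothetical, not CSpell's implementation or defaults.

```java
/** Hypothetical sketch: gate real-word split detection on configured thresholds. */
final class DetectorGateSketch {
    private final int maxLegitTokenLength; // CS_MAX_LEGIT_TOKEN_LENGTH
    private final int rwSplitMinLength;    // CS_DETECTOR_RW_SPLIT_WORD_MIN_LENGTH
    private final long rwSplitMinWc;       // CS_DETECTOR_RW_SPLIT_WORD_MIN_WC

    DetectorGateSketch(int maxLegitTokenLength, int rwSplitMinLength, long rwSplitMinWc) {
        this.maxLegitTokenLength = maxLegitTokenLength;
        this.rwSplitMinLength = rwSplitMinLength;
        this.rwSplitMinWc = rwSplitMinWc;
    }

    /** Tokens longer than the maximum are skipped (inclusive bound is an assumption). */
    boolean isLegitLength(String token) {
        return token.length() <= maxLegitTokenLength;
    }

    /** A token qualifies only if it is long enough and frequent enough. */
    boolean isRwSplitCandidate(String token, long wordCount) {
        return isLegitLength(token)
            && token.length() >= rwSplitMinLength
            && wordCount >= rwSplitMinWc;
    }

    public static void main(String[] args) {
        // Illustrative thresholds only; not the shipped defaults.
        DetectorGateSketch gate = new DetectorGateSketch(30, 4, 200);
        System.out.println(gate.isRwSplitCandidate("something", 1500)); // true
        System.out.println(gate.isRwSplitCandidate("at", 90000));       // false: too short
    }
}
```

The same pattern would apply to the 1-to-1 thresholds, with `CS_DETECTOR_RW_1TO1_WORD_MIN_LENGTH` and `CS_DETECTOR_RW_1TO1_WORD_MIN_WC` in place of the split values.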

Score Variables (3)

Variable Names | Descriptions | Variable Values (Default) |
---|---|---|
CS_ORTHO_SCORE_ED_DIST_FAC | Weighting factor of edit distance for orthographic score. | |
CS_ORTHO_SCORE_PHONETIC_FAC | Weighting factor of phonetic for orthographic score. | |
CS_ORTHO_SCORE_OVERLAP_FAC | Weighting factor of overlap for orthographic score. | |
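
One way to read the three weighting factors is as coefficients of a linear combination. The sketch below assumes exactly that, and takes the three component scores as precomputed inputs. This page does not state the actual formula, so the combination, the class name `OrthoScoreSketch`, and the sample numbers are all assumptions made for illustration.

```java
/** Hypothetical sketch: combine three component scores with configurable weights. */
final class OrthoScoreSketch {

    /**
     * Weighted sum of the component scores.
     * wEd, wPh and wOv stand in for CS_ORTHO_SCORE_ED_DIST_FAC,
     * CS_ORTHO_SCORE_PHONETIC_FAC and CS_ORTHO_SCORE_OVERLAP_FAC.
     */
    static double score(double edDistScore, double phoneticScore, double overlapScore,
                        double wEd, double wPh, double wOv) {
        return wEd * edDistScore + wPh * phoneticScore + wOv * overlapScore;
    }

    public static void main(String[] args) {
        // Illustrative inputs and weights only; not the shipped defaults.
        System.out.println(score(1.0, 0.5, 0.0, 1.0, 2.0, 3.0)); // 2.0
    }
}
```

Under this reading, raising one factor makes candidates that do well on that component rank higher relative to the others.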

Context Setup Variables (7)

Variable Names | Descriptions | Variable Values (Default) |
---|---|---|
CS_W2V_SKIP_WORD | A Boolean flag for skipping context words that have no word2Vec score. | |
CS_NW_1TO1_CONTEXT_RADIUS | Context radius for non-word 1-to-1. | |
CS_NW_SPLIT_CONTEXT_RADIUS | Context radius for non-word split. | |
CS_NW_MERGE_CONTEXT_RADIUS | Context radius for non-word merge. | |
CS_RW_1TO1_CONTEXT_RADIUS | Context radius for real-word 1-to-1. | |
CS_RW_SPLIT_CONTEXT_RADIUS | Context radius for real-word split. | |
CS_RW_MERGE_CONTEXT_RADIUS | Context radius for real-word merge. | |
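
To make the interplay between a context radius and the skip flag concrete, the sketch below collects up to `radius` words on each side of a target token. It assumes that, when the flag is set, words without a word2Vec score do not count toward the radius, so the window reaches further out. That interpretation, the class `ContextWindowSketch`, and the `hasVector` predicate are hypothetical and not taken from the CSpell source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch: gather context words around a target token. */
final class ContextWindowSketch {

    /** Returns up to radius words on each side of tokens[index], in text order. */
    static List<String> context(List<String> tokens, int index, int radius,
                                boolean skipUnknown, Predicate<String> hasVector) {
        List<String> left = new ArrayList<>();
        for (int i = index - 1; i >= 0 && left.size() < radius; i--) {
            String t = tokens.get(i);
            if (skipUnknown && !hasVector.test(t)) continue; // does not use up radius
            left.add(0, t); // prepend to keep text order
        }
        List<String> right = new ArrayList<>();
        for (int i = index + 1; i < tokens.size() && right.size() < radius; i++) {
            String t = tokens.get(i);
            if (skipUnknown && !hasVector.test(t)) continue;
            right.add(t);
        }
        List<String> out = new ArrayList<>(left);
        out.addAll(right);
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("he", "was", "dianosed", "with", "xyzq", "diabetes");
        // Toy vocabulary standing in for words that have a word2Vec vector.
        Set<String> vocab = new HashSet<>(Arrays.asList("he", "was", "with", "diabetes"));
        System.out.println(context(tokens, 2, 2, true, vocab::contains));  // [he, was, with, diabetes]
        System.out.println(context(tokens, 2, 2, false, vocab::contains)); // [he, was, with, xyzq]
    }
}
```

Separate radii per correction type, as in the table, would simply mean passing a different `radius` for each of the non-word and real-word 1-to-1, split, and merge cases.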
II. Syntax
III. File Location
Note: The CSpell installation program generates ${CSPELL_DIR}/data/Config/cSpell.properties automatically (from ${CSPELL_DIR}/data/Config/cSpell.properties.TEMPLATE) according to the options the user chose during installation.