Configuration Setup
CSpell Java lets users choose among different setup options through a configuration file. The default configuration file is ${CSPELL_DIR}/data/Config/cSpell.properties. The default values of the variables in the configuration file are the empirically best values and are listed in the following tables. "Relative path" refers to a path relative to the CSpell top directory, ${CSPELL_DIR}.
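As a rough illustration of how such a file could be located and read, the sketch below builds the default path from a top directory and parses key/value pairs with `java.util.Properties`. It assumes the file uses standard Java properties syntax, which the text above does not confirm. The class name `ConfigSketch`, the sample directory, and the sample value are hypothetical, and this is not CSpell's own loader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Illustrative only: not CSpell's own configuration loader.
final class ConfigSketch {
    // Builds the default configuration path from a given CSpell top directory.
    static Path defaultConfigPath(String cspellDir) {
        return Paths.get(cspellDir, "data", "Config", "cSpell.properties");
    }

    // Reads key=value pairs, assuming standard java.util.Properties syntax.
    static Properties load(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Placeholder content: the value 30 is invented, not a real default.
        String sample = "# sample configuration\nCS_MAX_LEGIT_TOKEN_LENGTH = 30\n";
        Properties props = load(new StringReader(sample));
        int maxLen = Integer.parseInt(props.getProperty("CS_MAX_LEGIT_TOKEN_LENGTH").trim());
        System.out.println(defaultConfigPath("/opt/cSpell") + " -> " + maxLen);
    }
}
```

Passing a `Reader` rather than a file name keeps the sketch testable without a CSpell installation; a real caller would open the file returned by `defaultConfigPath`.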
I. Configuration Variables
Directories and Files (13)

Variable Names | Descriptions | Variable Values (Default) |
---|---|---|
CS_DIR | the absolute path of the CSpell directory | |
CS_INFORMAL_EXP_FILE | the relative path of the informal expression file | |
CS_CHECK_DIC_FILES | the relative path of the check dictionary file | |
CS_SUGGEST_DIC_FILES | the relative path of the suggestion dictionary file | |
CS_SPLIT_WORD_DIC_FILES | the relative path of the split word dictionary file | |
CS_MW_DIC_FILE | the relative path of the multiword dictionary file | |
CS_UNIT_DIC_FILE | the relative path of the units file | |
CS_SV_DIC_FILE | the relative path of the spelling variants dictionary file | |
CS_AA_DIC_FILE | the relative path of the abbreviation/acronym dictionary file | |
CS_PN_DIC_FILE | the relative path of the proper noun dictionary file | |
CS_FREQUENCY_FILE | the relative path of the word frequency file | |
CS_W2V_IM_FILE | the relative path of the word2Vec CBOW input matrix file | |
CS_W2V_OM_FILE | the relative path of the word2Vec CBOW output matrix file | |
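Since all file variables except CS_DIR hold relative paths, a consumer has to join each of them onto the top directory. The following is a minimal sketch of that step using `java.nio.file.Path`. The class name `PathSketch`, the directory `/opt/cSpell`, and the file name `data/Dictionary/example.data` are hypothetical placeholders, not actual CSpell defaults.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative only: shows how a relative configuration value could be resolved.
final class PathSketch {
    // Joins a relative value onto the top directory. Path.resolve returns an
    // absolute argument unchanged, so absolute values would also pass through.
    static Path resolve(String csDir, String relativeValue) {
        return Paths.get(csDir).resolve(relativeValue.trim()).normalize();
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for CS_DIR and one of the *_FILE variables.
        String csDir = "/opt/cSpell";
        String frequencyFile = "data/Dictionary/example.data";
        System.out.println(resolve(csDir, frequencyFile));
    }
}
```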
Modes Setup (2)

Variable Names | Descriptions | Variable Values (Default) |
---|---|---|
CS_FUNC_MODE | Functional mode | |
CS_RANK_MODE | Ranking mode for non-word, 1-to-1 and Split | |
Detector Variables (5)

Variable Names | Descriptions | Variable Values (Default) |
---|---|---|
CS_MAX_LEGIT_TOKEN_LENGTH | The maximum length of a legit token for spelling detection and correction. | |
CS_DETECTOR_RW_SPLIT_WORD_MIN_LENGTH | The minimum length for real-word split detection. | |
CS_DETECTOR_RW_SPLIT_WORD_MIN_WC | The minimum word count (frequency) for real-word split detection. | |
CS_DETECTOR_RW_1TO1_WORD_MIN_LENGTH | The minimum length for real-word 1-to-1 detection. | |
CS_DETECTOR_RW_1TO1_WORD_MIN_WC | The minimum word count for real-word 1-to-1 detection. | |
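The detector variables read as simple thresholds on token length and word count. The sketch below shows one way such thresholds could gate a token before real-word detection. It is an assumption that the bounds are inclusive and that all three checks are combined with a logical AND; the class `DetectorSketch`, the method name, and the numbers used are hypothetical.

```java
// Illustrative only: one possible reading of the detector thresholds.
final class DetectorSketch {
    // Returns true when the token is short enough to be legit, long enough for
    // detection, and frequent enough in the word-count data (bounds assumed inclusive).
    static boolean passesThresholds(String token, long wordCount,
                                    int maxLegitLength, int minLength, long minWordCount) {
        int length = token.length();
        return length <= maxLegitLength
                && length >= minLength
                && wordCount >= minWordCount;
    }

    public static void main(String[] args) {
        // Placeholder thresholds, not CSpell's defaults.
        System.out.println(passesThresholds("diabetes", 120L, 30, 2, 65L));
        System.out.println(passesThresholds("a", 120L, 30, 2, 65L));
    }
}
```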
Score Variables (3)

Variable Names | Descriptions | Variable Values (Default) |
---|---|---|
CS_ORTHO_SCORE_ED_DIST_FAC | Weighting factor of the edit distance component of the orthographic score. | |
CS_ORTHO_SCORE_PHONETIC_FAC | Weighting factor of the phonetic component of the orthographic score. | |
CS_ORTHO_SCORE_OVERLAP_FAC | Weighting factor of the overlap component of the orthographic score. | |
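To make the role of the three weighting factors concrete, the sketch below combines three component scores into one orthographic score. The weighted-sum form is an assumption made for illustration; the text above only states that each factor weights its component. The class `OrthoScoreSketch` and all numeric values are hypothetical.

```java
// Illustrative only: assumes the orthographic score is a weighted sum of its components.
final class OrthoScoreSketch {
    // Each component score is taken as given; the three factors correspond to
    // CS_ORTHO_SCORE_ED_DIST_FAC, CS_ORTHO_SCORE_PHONETIC_FAC and CS_ORTHO_SCORE_OVERLAP_FAC.
    static double orthoScore(double editDistScore, double phoneticScore, double overlapScore,
                             double edDistFac, double phoneticFac, double overlapFac) {
        return edDistFac * editDistScore
                + phoneticFac * phoneticScore
                + overlapFac * overlapScore;
    }

    public static void main(String[] args) {
        // Placeholder component scores and weights, not CSpell's defaults.
        System.out.println(orthoScore(0.8, 1.0, 0.5, 1.0, 0.5, 0.5));
    }
}
```

Under this reading, raising one factor relative to the others makes that component dominate the ranking of correction candidates.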
Context Setup Variables (7)

Variable Names | Descriptions | Variable Values (Default) |
---|---|---|
CS_W2V_SKIP_WORD | A Boolean flag for skipping context words that have no word2Vec score. | |
CS_NW_1TO1_CONTEXT_RADIUS | Context radius for non-word 1-to-1. | |
CS_NW_SPLIT_CONTEXT_RADIUS | Context radius for non-word split. | |
CS_NW_MERGE_CONTEXT_RADIUS | Context radius for non-word merge. | |
CS_RW_1TO1_CONTEXT_RADIUS | Context radius for real-word 1-to-1. | |
CS_RW_SPLIT_CONTEXT_RADIUS | Context radius for real-word split. | |
CS_RW_MERGE_CONTEXT_RADIUS | Context radius for real-word merge. | |
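A context radius is easiest to picture as a window of tokens around the target word. The sketch below collects the tokens within a given radius on each side of a target index, excluding the target itself. It assumes the radius counts tokens on each side, which the descriptions above do not spell out; the class `ContextSketch` and the sample sentence are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: assumes "radius" means the number of tokens taken on each side.
final class ContextSketch {
    // Returns up to `radius` tokens before and after the target, clipped at the text edges.
    static List<String> context(List<String> tokens, int index, int radius) {
        int from = Math.max(0, index - radius);
        int to = Math.min(tokens.size() - 1, index + radius);
        List<String> window = new ArrayList<>();
        for (int i = from; i <= to; i++) {
            if (i != index) {
                window.add(tokens.get(i));
            }
        }
        return window;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("he", "was", "dianosed", "with", "diabetes");
        // Context of the misspelled token at index 2 with a radius of 1.
        System.out.println(context(tokens, 2, 1));
    }
}
```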
II. Syntax
III. File Location
Notes: The CSpell installation program generates ${CSPELL_DIR}/data/Config/cSpell.properties automatically (from ${CSPELL_DIR}/data/Config/cSpell.properties.TEMPLATE) according to the options users choose during installation.