Configuration Setup
CSpell Java lets users choose among different setup options through a configuration file. The default configuration file is ${CSPELL_DIR}/data/Config/cSpell.properties. The default values of the configuration variables are the empirically best values and are listed in the following tables. "Relative path" refers to a path relative to the CSpell top directory, ${CSPELL_DIR}.
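
As a rough illustration of how such a file could be read, the sketch below parses key=value pairs with the standard java.util.Properties class. It assumes the configuration file follows the usual Java properties syntax, which this page does not spell out. The class name CSpellConfigSketch, the helper methods, and the sample values are hypothetical and are not CSpell defaults.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class CSpellConfigSketch {
    /** Parses key=value pairs, assuming standard java.util.Properties syntax. */
    public static Properties parse(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns an integer variable, or the fallback if the key is missing or blank. */
    public static int getInt(Properties props, String key, int fallback) {
        String raw = props.getProperty(key);
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only, not CSpell's defaults.
        String sample = "CS_DIR=/opt/cspell/\nCS_NW_1TO1_CONTEXT_RADIUS=2\n";
        Properties props = parse(new StringReader(sample));
        System.out.println(props.getProperty("CS_DIR"));
        System.out.println(getInt(props, "CS_NW_1TO1_CONTEXT_RADIUS", 0));
    }
}
```

To read the real file, a java.io.FileReader opened on ${CSPELL_DIR}/data/Config/cSpell.properties could be passed to parse in place of the StringReader.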
I. Configuration Variables

Directories and Files (13)

| Variable Name | Description | Default Value |
|---|---|---|
| CS_DIR | The absolute path of the CSpell directory | |
| CS_INFORMAL_EXP_FILE | The relative path of the informal expression file | |
| CS_CHECK_DIC_FILES | The relative path of the check dictionary file | |
| CS_SUGGEST_DIC_FILES | The relative path of the suggestion dictionary file | |
| CS_SPLIT_WORD_DIC_FILES | The relative path of the split word dictionary file | |
| CS_MW_DIC_FILE | The relative path of the multiword dictionary file | |
| CS_UNIT_DIC_FILE | The relative path of the units file | |
| CS_SV_DIC_FILE | The relative path of the spelling variants dictionary file | |
| CS_AA_DIC_FILE | The relative path of the abbreviation/acronym dictionary file | |
| CS_PN_DIC_FILE | The relative path of the proper noun dictionary file | |
| CS_FREQUENCY_FILE | The relative path of the word frequency file | |
| CS_W2V_IM_FILE | The relative path of the word2Vec CBOW input matrix file | |
| CS_W2V_OM_FILE | The relative path of the word2Vec CBOW output matrix file | |

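
Since every file variable above except CS_DIR holds a path relative to the CSpell top directory, a caller would typically join the two before opening a file. The sketch below shows one way to do that with java.nio.file. The class name CSpellPathSketch and the example paths are hypothetical, and how CSpell itself joins the two values is not stated here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class CSpellPathSketch {
    /**
     * Joins the CS_DIR value with a relative path from the configuration.
     * Path.resolve returns the second argument unchanged if it is already absolute.
     */
    public static Path resolve(String csDir, String relativePath) {
        return Paths.get(csDir).resolve(relativePath).normalize();
    }

    public static void main(String[] args) {
        // Hypothetical values; real ones come from cSpell.properties.
        String csDir = "/opt/cspell/";
        String relative = "data/example/words.dic";
        System.out.println(resolve(csDir, relative));
    }
}
```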

Modes Setup (2)

| Variable Name | Description | Default Value |
|---|---|---|
| CS_FUNC_MODE | Functional mode | |
| CS_RANK_MODE | Ranking mode for non-word, 1-to-1 and Split | |


Detector Variables (5)

| Variable Name | Description | Default Value |
|---|---|---|
| CS_MAX_LEGIT_TOKEN_LENGTH | The maximum length of a legitimate token for spelling detection and correction. | |
| CS_DETECTOR_RW_SPLIT_WORD_MIN_LENGTH | The minimum length for real-word split detection. | |
| CS_DETECTOR_RW_SPLIT_WORD_MIN_WC | The minimum word count (frequency) for real-word split detection. | |
| CS_DETECTOR_RW_1TO1_WORD_MIN_LENGTH | The minimum length for real-word 1-to-1 detection. | |
| CS_DETECTOR_RW_1TO1_WORD_MIN_WC | The minimum word count (frequency) for real-word 1-to-1 detection. | |


Score Variables (3)

| Variable Name | Description | Default Value |
|---|---|---|
| CS_ORTHO_SCORE_ED_DIST_FAC | Weighting factor of the edit distance component in the orthographic score. | |
| CS_ORTHO_SCORE_PHONETIC_FAC | Weighting factor of the phonetic component in the orthographic score. | |
| CS_ORTHO_SCORE_OVERLAP_FAC | Weighting factor of the overlap component in the orthographic score. | |

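
One plausible reading of the three weighting factors is a weighted sum of the component scores. The sketch below illustrates that reading only: the combination rule, the class name OrthoScoreSketch, and all numbers are assumptions, not CSpell's documented formula or defaults.

```java
public class OrthoScoreSketch {
    /**
     * Combines three component scores as a weighted sum (an assumed rule).
     * The three weights play the role of CS_ORTHO_SCORE_ED_DIST_FAC,
     * CS_ORTHO_SCORE_PHONETIC_FAC and CS_ORTHO_SCORE_OVERLAP_FAC.
     */
    public static double score(double editDist, double phonetic, double overlap,
                               double wEditDist, double wPhonetic, double wOverlap) {
        return wEditDist * editDist + wPhonetic * phonetic + wOverlap * overlap;
    }

    public static void main(String[] args) {
        // Placeholder component scores and weights, for illustration only.
        System.out.println(score(0.8, 0.4, 0.4, 1.0, 0.5, 0.25));
    }
}
```

Under this reading, raising one factor relative to the others makes that component count more when candidates are ranked.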

Context Setup Variables (7)

| Variable Name | Description | Default Value |
|---|---|---|
| CS_W2V_SKIP_WORD | A Boolean flag for skipping context words that have no word2Vec score. | |
| CS_NW_1TO1_CONTEXT_RADIUS | Context radius for non-word 1-to-1. | |
| CS_NW_SPLIT_CONTEXT_RADIUS | Context radius for non-word split. | |
| CS_NW_MERGE_CONTEXT_RADIUS | Context radius for non-word merge. | |
| CS_RW_1TO1_CONTEXT_RADIUS | Context radius for real-word 1-to-1. | |
| CS_RW_SPLIT_CONTEXT_RADIUS | Context radius for real-word split. | |
| CS_RW_MERGE_CONTEXT_RADIUS | Context radius for real-word merge. | |

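
To make the idea of a context radius concrete, the sketch below collects the tokens that lie within a given number of positions on either side of a target token. It assumes the radius counts tokens symmetrically and excludes the target itself. The class name ContextWindowSketch is hypothetical, and CSpell's actual windowing, including how CS_W2V_SKIP_WORD affects it, may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContextWindowSketch {
    /**
     * Returns the tokens within `radius` positions of the token at `target`,
     * excluding the target itself and clipping at the ends of the list.
     */
    public static List<String> window(List<String> tokens, int target, int radius) {
        List<String> context = new ArrayList<>();
        int start = Math.max(0, target - radius);
        int end = Math.min(tokens.size() - 1, target + radius);
        for (int i = start; i <= end; i++) {
            if (i != target) {
                context.add(tokens.get(i));
            }
        }
        return context;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("he", "was", "dianosed", "with", "diabetes");
        // Context of the misspelled token at index 2, with a radius of 1.
        System.out.println(window(tokens, 2, 1));
    }
}
```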
II. Syntax
III. File Location
Notes: The CSpell installation program automatically generates ${CSPELL_DIR}/data/Config/cSpell.properties (from ${CSPELL_DIR}/data/Config/cSpell.properties.TEMPLATE) according to the options users choose during installation.