Detectors
Detectors identify spelling errors, both non-word and real-word. Each type of correction requires its own detector. For example, the detector for non-word correction checks whether a token is a non-word error (i.e., a word not in the dictionary), while the detector for real-word correction checks whether a token is a real-word error (a valid word that is not the intended one). This page uses the detector for non-word (1-to-1) spelling correction to illustrate the concept of a detector. Please refer to each process for details on the other types of detectors.
The non-word 1-to-1 detector checks whether a token is a spelling error. A token is valid (does not need to be corrected) if it is known to the (checking) dictionary or is a spelling error exception. These are described as follows:
I. Dictionary
II. Algorithm
III. Exception Examples
Input | Notes |
---|---|
year-long | spelling variant |
dont's | possessive |
123 | digit |
123.456 | digit |
_ | punctuation |
12-35-00 | digit and punctuation |
12.35.00 | digit and punctuation |
clinicaltrials.gov | url |
http://www.yahoo.com?test=1%20try%20abc | url |
123@gmail.com | email |
-0.25mm | measurement |
30mg/50kg | measurement |
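
The check described above (a token is valid if it is in the dictionary or matches an exception) can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the `DICTIONARY` contents, the regular expressions, and the function names `exception_type` and `is_nonword_error` are all assumptions made for this example, and only the pattern-like rows of the table (digit, punctuation, url, email, measurement) are covered. Spelling variants and possessives would need dictionary or morphology support and are left out.

```python
import re

# Hypothetical checking dictionary; a real system would load a full lexicon.
DICTIONARY = {"patient", "diabetes"}

# Illustrative exception patterns (assumptions, not the tool's actual rules).
# Order matters: the first pattern that matches the whole token wins.
EXCEPTION_PATTERNS = {
    "digit": re.compile(r"[+-]?\d+(\.\d+)?"),
    "punctuation": re.compile(r"(?:[^\w\s]|_)+"),
    "digit and punctuation": re.compile(r"\d+([-./:]\d+)+"),
    "url": re.compile(
        r"https?://\S+|[\w-]+(\.[\w-]+)*\.(gov|com|org|edu|net)(/\S*)?",
        re.IGNORECASE,
    ),
    "email": re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+"),
    "measurement": re.compile(
        r"[+-]?\d+(\.\d+)?[a-zA-Z]+(/\d+(\.\d+)?[a-zA-Z]+)?"
    ),
}


def exception_type(token):
    """Return the name of the first exception pattern the token matches, or None."""
    for name, pattern in EXCEPTION_PATTERNS.items():
        if pattern.fullmatch(token):
            return name
    return None


def is_nonword_error(token):
    """A token is flagged only if it is neither in the dictionary nor an exception."""
    if token.lower() in DICTIONARY:
        return False
    return exception_type(token) is None


if __name__ == "__main__":
    for tok in ["123", "12-35-00", "clinicaltrials.gov", "-0.25mm", "diabetis"]:
        print(tok, exception_type(tok), is_nonword_error(tok))
```

Using whole-token matching (`fullmatch`) keeps a token such as `-0.25mm` from being classified as a plain digit just because it starts with one; only tokens that fail every check are passed on as candidate non-word errors.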