Syntactic Uninvert
- Short Description:
Syntactically uninverts the input phrase.
- Full Description:
This flow strips out non-information words (such as NOS, NEC)
and then uninverts the phrase around commas if the word after a comma is not a conjunction word (such as about, beneath, where, yet).
Non-information words are defined in file "data/misc/nonInfoWords.data".
Conjunction words are defined in file "data/misc/conjunctionWord.data".
Both non-information words and conjunction words are configurable.
No effect on the -m option. "none" is added at the end of the output.
- Difference:
- The Java version drops the comma if it is left at the end of the phrase after the non-information words are stripped, as in "Kyphosis, NOS".
- The Java version does not uninvert the phrase if a conjunction word shows up after a comma at any position in the phrase, while the "C" version skips uninversion only if a conjunction word shows up after the first comma.
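The second difference can be illustrated with a minimal sketch. It is not taken from either implementation: the class name, the method names, and the sample conjunction list are hypothetical, and "the word after a comma" is assumed to mean the first whitespace-delimited word of the following segment.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the actual LVG source.
class ConjunctionCheckSketch {
    // Assumed sample entries; the real list is read from
    // data/misc/conjunctionWord.data.
    static final Set<String> CONJUNCTION_WORDS =
        new HashSet<>(Arrays.asList("about", "beneath", "where", "yet", "or"));

    // True if the first word of the segment is a conjunction word.
    static boolean startsWithConjunction(String segment) {
        String firstWord = segment.trim().split("\\s+")[0].toLowerCase();
        return CONJUNCTION_WORDS.contains(firstWord);
    }

    // Java-style check as described above: a conjunction word
    // after ANY comma blocks uninversion.
    static boolean conjunctionAfterAnyComma(String phrase) {
        String[] segments = phrase.split(",");
        for (int i = 1; i < segments.length; i++) {
            if (startsWithConjunction(segments[i])) {
                return true;
            }
        }
        return false;
    }

    // "C"-style check as described above: only the word after
    // the FIRST comma is tested.
    static boolean conjunctionAfterFirstComma(String phrase) {
        String[] segments = phrase.split(",");
        return segments.length > 1 && startsWithConjunction(segments[1]);
    }

    public static void main(String[] args) {
        String phrase = "Sedative, hypnotic, or anxiolytic amnestic disorder";
        System.out.println(conjunctionAfterAnyComma(phrase));
        System.out.println(conjunctionAfterFirstComma(phrase));
    }
}
```

Under these assumptions, "Sedative, hypnotic, or anxiolytic amnestic disorder" is blocked by the any-comma check because "or" follows the second comma, whereas the first-comma check sees only "hypnotic" and would not block it.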
- Features:
- Strip non-information words.
- Tokenize phrase by using commas as delimiters.
- Check whether the word after each comma is a conjunction word.
- Uninvert the input phrase around commas if the above condition is false.
- Symbol:
S
- Examples:
shell> lvg -f:S
Angioplasty, Transluminal, Percutaneous Coronary
Angioplasty, Transluminal, Percutaneous Coronary|Percutaneous Coronary Transluminal Angioplasty|2047|16777215|S|1|
Kyphosis, NOS
Kyphosis, NOS|Kyphosis|2047|16777215|S|1|
Sedative, hypnotic, or anxiolytic amnestic disorder
Sedative, hypnotic, or anxiolytic amnestic disorder|Sedative, hypnotic, or anxiolytic amnestic disorder|2047|16777215|S|1|
More examples
- Implementation Logic:
- Strip non-information words.
- Tokenize phrases by using commas as delimiters.
- Compose the tokens in reverse order if the word after a comma is not a conjunction word.
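The steps above can be sketched roughly as follows. This is an illustrative approximation, not the contents of ToSyntacticUninvert.java: the class and method names are hypothetical, the word lists are hard-coded samples rather than loaded from the data files, "or" is assumed to be in the conjunction list because of the third example, and the conjunction test is assumed to look at the first word after each comma.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch of the flow; not the actual LVG source.
class SyntacticUninvertSketch {
    // Assumed sample entries; the real lists are configurable data files.
    static final Set<String> NON_INFO_WORDS =
        new HashSet<>(Arrays.asList("NOS", "NEC"));
    static final Set<String> CONJUNCTION_WORDS =
        new HashSet<>(Arrays.asList("about", "beneath", "where", "yet", "or"));

    // Step 1: remove non-information words, then drop any trailing comma.
    static String stripNonInfoWords(String phrase) {
        String result = phrase;
        for (String word : NON_INFO_WORDS) {
            result = result.replaceAll("(?i)\\b" + Pattern.quote(word) + "\\b", "");
        }
        result = result.replaceAll("\\s+", " ").trim();
        while (result.endsWith(",")) {
            result = result.substring(0, result.length() - 1).trim();
        }
        return result;
    }

    static String uninvert(String phrase) {
        String stripped = stripNonInfoWords(phrase);

        // Step 2: tokenize using commas as delimiters.
        List<String> tokens = new ArrayList<>();
        for (String part : stripped.split(",")) {
            String token = part.trim();
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }

        // Step 3: leave the phrase as is if any token after a comma
        // starts with a conjunction word.
        for (int i = 1; i < tokens.size(); i++) {
            String firstWord = tokens.get(i).split("\\s+")[0].toLowerCase();
            if (CONJUNCTION_WORDS.contains(firstWord)) {
                return stripped;
            }
        }

        // Otherwise compose the tokens in reverse order.
        Collections.reverse(tokens);
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        System.out.println(uninvert("Angioplasty, Transluminal, Percutaneous Coronary"));
        System.out.println(uninvert("Kyphosis, NOS"));
        System.out.println(uninvert("Sedative, hypnotic, or anxiolytic amnestic disorder"));
    }
}
```

With these sample lists, the sketch reproduces the output terms of the three examples shown earlier; it makes no attempt to produce the other output fields.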
- Source Code: ToSyntacticUninvert.java
- Hierarchy: Object -> Transformation -> ToSyntacticUninvert