Standardization of Derivational Pairs
All dPairs in Lexical Tools are bidirectional, regardless of which member is the source. In other words, if D-1 is a derivation of D-2, then D-2 must be a derivation of D-1, where D-1 is derivation 1 and D-2 is derivation 2. Note that both D-1 and D-2 must be base (uninflected) forms. It follows that the two dPairs below are identical:
D-1|CAT-1|EUI-1|D-2|CAT-2|EUI-2
D-2|CAT-2|EUI-2|D-1|CAT-1|EUI-1
Both of these dPairs might exist in the derivation file (database table), which would produce duplicates when derivations are generated. A standardized form is defined as follows to resolve this issue.
- zeroD & suffixD:
A standardized dPair D-1|CAT-1|EUI-1|D-2|CAT-2|EUI-2
is defined as:
- D-1 < D-2, alphabetically
- CAT-1 < CAT-2, alphabetically (if D-1 == D-2)
- EUI-1 < EUI-2, alphabetically (if D-1 == D-2 and CAT-1 == CAT-2)
- prefixD:
A standardized dPair D-1|CAT-1|EUI-1|D-2|CAT-2|EUI-2
is defined as:
- D-1 = prefix + base
- D-2 = base
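The ordering rules above can be sketched in Python as follows. This is only a minimal illustration, not the Lexical Tools implementation: the Entry type, the function names, and the sample EUIs are hypothetical. It assumes "alphabetically" means Python's default string comparison (by code point, case-sensitive), and that for prefixD the prefixed term literally ends with the base.

```python
from typing import NamedTuple, Tuple


class Entry(NamedTuple):
    """One side of a dPair (hypothetical helper type)."""
    term: str  # D: base (uninflected) form
    cat: str   # CAT: syntactic category
    eui: str   # EUI


def standardize_zero_or_suffix(a: Entry, b: Entry) -> Tuple[Entry, Entry]:
    # Tuples compare field by field: term first, then cat, then eui,
    # which mirrors the three ordering rules for zeroD and suffixD.
    return (a, b) if a <= b else (b, a)


def standardize_prefix(a: Entry, b: Entry) -> Tuple[Entry, Entry]:
    # Assumption: the prefixed term is strictly longer and ends with the base.
    if len(a.term) > len(b.term) and a.term.endswith(b.term):
        return (a, b)
    if len(b.term) > len(a.term) and b.term.endswith(a.term):
        return (b, a)
    raise ValueError("neither term is prefix + base of the other")


def to_line(pair: Tuple[Entry, Entry]) -> str:
    # Concatenating two named tuples gives the six fields in file order.
    return "|".join(pair[0] + pair[1])


# Made-up sample data; the EUIs are placeholders, not real identifiers.
happy = Entry("happy", "adj", "E0000001")
happiness = Entry("happiness", "noun", "E0000002")

# Both input directions yield the same standardized line.
assert to_line(standardize_zero_or_suffix(happy, happiness)) == \
    to_line(standardize_zero_or_suffix(happiness, happy))
```

Because either input direction maps to the same line, duplicates can then be removed with an ordinary set or a sort-and-unique pass over the standardized lines.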
Below are some notes on the process related to this issue:
- Standardization is applied when the raw dPair file is generated (std-raw)
- The manually tagged dPair file may list pairs in either direction (tag)
- All tagged dPairs are standardized when they are added to the meta file (meta)
- All resulting dPairs are standardized (*.${YEAR})
- Derivations in both directions are generated when dPairs are retrieved from the Lexical Tools database
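As a rough illustration of the last note, a stored standardized line can be expanded into both directions at retrieval time by swapping its two halves. The function name and the sample values below are hypothetical; the actual retrieval code in Lexical Tools may work differently.

```python
from typing import List


def both_directions(line: str) -> List[str]:
    """Return the stored dPair line plus its reversed counterpart."""
    fields = line.split("|")
    if len(fields) != 6:
        raise ValueError("expected 6 pipe-separated fields: " + line)
    # Swap the (D, CAT, EUI) triple of each side.
    reverse = "|".join(fields[3:] + fields[:3])
    return [line, reverse]


# Made-up sample line; the EUIs are placeholders, not real identifiers.
for derivation in both_directions("happiness|noun|E0000002|happy|adj|E0000001"):
    print(derivation)
```

Storing only the standardized direction and deriving the reverse on demand keeps the file free of duplicates while still answering queries from either side of the pair.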