Lexical Tools

Annual Release - Data from Lvg

The following files are generated from Lvg. The Lexicon files described in the previous section must be generated and loaded into the Lvg database first. The operations are detailed as follows:

  • canonical.data
    • > cd ~/Projects/LVG/Tools/CanonGenerator
    • Generate atoms.data, validate it with OCCS (with ls -al), and copy it to ./data/200X/dataOrg/
    • Run ./bin/0.0.ModifyAtoms to modify the atoms file
    • Run ./bin/RunCanonAll to generate ./data/${YEAR}/data/canonical.data
      • Make sure to update ${LVG_DIR} in ${LVG_DIR}/data/config/lvg.properties
    • Copy the above file to $LVG_Components/PreDataBase/data/${YEAR}/data/canonical.data

  • antiNorm.data
    • Run GenerateAntiNorm to generate ./data/200X/data/antiNorm.data
    • The format of the above file is: normalized form|inflected form|Category|Inflection|Eui
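
The five-field record format above can be illustrated with a small parsing sketch. This is only an illustration, not part of the Lvg tools: the names AntiNormRecord and parse_anti_norm_line are hypothetical, the sample values are made up, and it assumes each line has exactly five pipe-separated fields with no trailing separator.

```python
from typing import NamedTuple


class AntiNormRecord(NamedTuple):
    """One antiNorm.data record (hypothetical helper type)."""
    normalized: str   # normalized form
    inflected: str    # inflected form
    category: str     # Category, kept as text
    inflection: str   # Inflection, kept as text
    eui: str          # Eui


def parse_anti_norm_line(line: str) -> AntiNormRecord:
    """Split one line into its five fields (assumes no trailing '|')."""
    fields = line.rstrip("\n").split("|")
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    return AntiNormRecord(*fields)


# Made-up sample line, for illustration only.
record = parse_anti_norm_line("run|running|128|16|E0000001\n")
print(record.normalized, record.inflected, record.eui)
```

A check like this could be run over the generated file to catch malformed lines before the data is loaded into the database.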

  • fruitful.data
    • Generate the file $LVG_Components/PreDataBase/data/200X/data/inflLc.data by lowercasing the inflected forms and removing duplicates.
    • Generate fruitful variants (-f:Gn -m) from the above file and write the output to $LVG_Components/PreDataBase/data/200X/data/fruitful.data
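
The lowercasing and de-duplication step for inflLc.data might look roughly like the sketch below. It is one reading of that step, not the actual script: the function name lowercase_unique is hypothetical, and it assumes the input is pipe-separated with the inflected form in the first field (adjust field_index if the real input differs).

```python
def lowercase_unique(lines, field_index=0, sep="|"):
    """Return the sorted, de-duplicated lowercase values of one field."""
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        seen.add(line.split(sep)[field_index].lower())
    return sorted(seen)


# Made-up input lines, for illustration only.
sample = ["Run|x\n", "run|y\n", "Apple|z\n"]
for term in lowercase_unique(sample):
    print(term)
```

Sorting the result keeps the output stable between runs, which makes it easier to compare files from one release to the next.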

The above 3 files can be generated by the following commands:

  • > cd $LVG_Components/CanonGenerator/bin/1.RunCanonAll
  • > cd $LVG_Components/PreDataBase/bin/5.Generate2Files

After the above 3 files are properly generated, follow the steps below:

  • Copy the above files to "$LVG_DIR/data/tables/" (./bin/Move2Files)
  • Run Analyze* to check the maximum sizes of all fields
    • java AnalyzeCanon
    • java AnalyzeFruitful
    • java AnalyzeAntiNorm
  • Load the data into the HSqlDb and MySql databases
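
The idea behind the Analyze* step, finding the longest value in each field so that database columns can be sized accordingly, can be sketched as follows. This is not the AnalyzeCanon, AnalyzeFruitful, or AnalyzeAntiNorm code; the function name max_field_sizes is hypothetical, and it assumes pipe-separated text and measures length in characters rather than bytes.

```python
def max_field_sizes(lines, sep="|"):
    """Return the maximum character length seen in each field position."""
    sizes = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        for i, field in enumerate(line.split(sep)):
            if i == len(sizes):
                sizes.append(0)  # first time this field position is seen
            sizes[i] = max(sizes[i], len(field))
    return sizes


# Made-up input lines, for illustration only.
print(max_field_sizes(["a|bbb\n", "cc|d|eeee\n"]))  # prints [2, 3, 4]
```

If a reported maximum exceeds the column width defined in the database schema, the schema would need to be widened before loading.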