LexBuild

LexBuild - Auto Task features

LexBuild includes several tasks that are performed automatically on a daily basis. They are implemented with scripts and crontab, and are described below:

  • Auto approval
    • Background:
      Because of a staff shortage and the high quality of our linguists' work, the auto-approval feature was added in 2018 to replace manual approval of submitted lexRecords that are created, modified, or deleted (with the signature "Auto-approved"). This feature can be removed and reverted to the manual approval process as needed.
    • Processes:
      A crontab job automatically approves all submitted records. A re-index of gSpell (for close match) follows to ensure all approved terms are indexed. Finally, the tables are backed up automatically.
    • crontab entry:
      30 3 * * 1-5 /usr/local/Lsg/Projects/LB/LexBuild/Tools/LoadDb/AutoApprove 1 /usr/local/Lsg/Projects/LB/LexBuild/Tools/LoadDb/AutoApprove.log
    • Implementation:
      • scripts: ${LB_DIR}/Tools/LoadDb/AutoApprove
      • source code: ${LEXBUILD_SOURCES}/PostProc/AutoApprove.java

      • Use ${LB_DIR}/data.${HOST_NAME}/WebApp/Config/cg1.cfg for Db connection
      • Use Auto-approved for the approval signature
      • Use today as the approval date
      • Get all records (to be approved) from DB table LEX_RECORD_TEMP
      • Go through the records and auto-approve each one:
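The approval steps above can be pictured with a small shell sketch. It is only an illustration, not the AutoApprove.java implementation: the function name approve_records, the sample EUIs, the pipe-separated output, and the date format are all assumptions, and a plain text stream stands in for the LEX_RECORD_TEMP table.

```shell
#!/bin/sh
# Illustrative stand-in for the auto-approval loop (not the real Java code).
# Reads one record id (EUI) per line from stdin and prints
# "eui|signature|approval date" for each record it approves.
approve_records() {
    signature="Auto-approved"      # approval signature named in the text
    today=$(date +%Y-%m-%d)        # approval date = today (format is an assumption)
    while IFS= read -r eui; do
        # Skip blank lines so only real records are approved.
        [ -n "$eui" ] || continue
        printf '%s|%s|%s\n' "$eui" "$signature" "$today"
    done
}

# Hypothetical pending records standing in for rows of LEX_RECORD_TEMP.
printf 'E0000001\nE0000002\n' | approve_records
```

Each output line carries the fixed signature and the current date, which mirrors the two "Use ..." settings listed above.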
  • Auto generate
    • Background: The Lexicon and inflection variants (for both approved and to-be-approved lexRecords) are generated nightly from the database. These files are used for the gSpell re-index.
    • crontab entries:
      30 5 * * 1-5 /usr/local/Lsg/Projects/LB/LexBuild/Tools/LoadDb/GenScript < /usr/local/Lsg/Projects/LB/LexBuild/Tools/LoadDb/GenLexicon.option
      35 5 * * 1-5 /usr/local/Lsg/Projects/LB/LexBuild/Tools/LoadDb/GenScript < /usr/local/Lsg/Projects/LB/LexBuild/Tools/LoadDb/GenInflVars.option
    • Implementation:
      • script: ${LB_DIR}/Tools/LoadDb/GenScript
      • source code: ${LEXBUILD_SOURCES}/JavaDb/GenerateLexicon.java
      • generate lexRecords from the DB, ordered by eui

        Outputs: ${LB_DIR}/data.${HOST_NAME}/WebApp/Outputs/Lexicon/LEXICON

      • source code: ${LEXBUILD_SOURCES}/JavaDb/GenerateInflVars.java
      • generate inflVars from the DB, ordered by eui

        Outputs: ${LB_DIR}/data.${HOST_NAME}/WebApp/Outputs/Lexicon/InflVars

        Outputs: ${LB_DIR}/data.${HOST_NAME}/WebApp/Outputs/Lexicon/InflVarsTemp
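Both crontab entries for this task redirect an option file into GenScript with "<". A minimal sketch of that pattern follows, assuming the option file simply holds the answers a user would otherwise type at the script's prompts. The function gen_script and the option file contents here are hypothetical; the real GenScript and *.option files may differ.

```shell
#!/bin/sh
# Sketch: a prompt-driven script can run unattended under cron when an
# option file is redirected into its standard input.
# gen_script and the option file contents are hypothetical stand-ins.
gen_script() {
    # Each read consumes one line of stdin, as if a user had typed it.
    read -r choice
    read -r out_file
    echo "selected option $choice, writing to $out_file"
}

# Build a throwaway option file with one answer per line.
option_file=$(mktemp)
printf '1\nLEXICON\n' > "$option_file"

# Same shape as the crontab entries: script < option file.
gen_script < "$option_file"
rm -f "$option_file"
```

Keeping one option file per output (Lexicon, InflVars) lets the same script serve both crontab entries.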

  • Auto backup
    • Background:
    • crontab entry:
      40 5 * * 1-5 /usr/local/Lsg/Projects/LB/LexBuild/Tools/LoadDb/DbScript < /usr/local/Lsg/Projects/LB/LexBuild/Tools/LoadDb/DbBackup.option
    • Implementation:
      • script: ${LB_DIR}/Tools/LoadDb/DbScript
      • source code: ${LEXBUILD_SOURCES}/JavaDb/dbTablesBackup.java
      • generate backup files of the DB tables from the DB

        Outputs: ${LB_DIR}/data.${HOST_NAME}/WebApp/Outputs/Tables/*

  • Auto re-index in gSpell
    • Background:
      gSpell is used for close match. Newly added lexRecords need to be added to the gSpell dictionary. Re-indexing takes a long time, so it is run nightly.
    • crontab entry:
      1 6 * * 1-5 /usr/local/Lsg/Projects/LB/LexBuild/Tools/WebScript/ReIndexDic 1 /usr/local/Lsg/Projects/LB/LexBuild/Tools/WebScript/ReIndexDic.log
    • Implementation:
      • scripts: ${LEXBUILD_TOOLS}/WebScript/ReIndexDic

      • Update status to 1 ReIndexing: in gSpellStatus.txt
      • Generate InflVars.all and InflVarsTemp (in ${LB_DIR}/data.${HOST_NAME}/WebApp/Outputs/Lexicon)
      • Unify InflVars.all and InflVarsTemp to InflVars.uSort and InflVarsTemp.uSort, respectively
      • ReIndex gSpell dictionaries (LexiconLb and LexiconLbTemp)
      • Update status to 2 Ready for reload: in gSpellStatus.txt
        The updated indexed dictionary will be reloaded into gSpell when the next user logs in

      • The gSpell real-time reload function has issues; use crontab to restart Tomcat instead.
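The step order of the re-index can be sketched in shell as follows. This is a rough illustration run in a scratch directory, not the ReIndexDic script itself. The sample words, the exact status strings written to gSpellStatus.txt, and the use of "sort -u" for the unify step are assumptions, and the actual gSpell re-index command is left as a comment.

```shell
#!/bin/sh
# Sketch of the ReIndexDic step order, run in a scratch directory.
# File contents, status strings, and the unify command are assumptions.
work_dir=$(mktemp -d)
trap 'rm -rf "$work_dir"' EXIT
status_file="$work_dir/gSpellStatus.txt"

# Step 1: mark the dictionaries as being re-indexed.
echo "1 ReIndexing" > "$status_file"

# Step 2: stand-in data for the generated inflection variant files.
printf 'cat\ncats\ncat\n' > "$work_dir/InflVars.all"
printf 'dog\ndog\ndogs\n' > "$work_dir/InflVarsTemp"

# Step 3: unify each file by sorting and dropping duplicate lines.
sort -u "$work_dir/InflVars.all" > "$work_dir/InflVars.uSort"
sort -u "$work_dir/InflVarsTemp" > "$work_dir/InflVarsTemp.uSort"

# Step 4: the real gSpell re-index of LexiconLb and LexiconLbTemp would run here.

# Step 5: mark the dictionaries as ready to be reloaded.
echo "2 Ready for reload" > "$status_file"

cat "$work_dir/InflVars.uSort"
cat "$status_file"
```

Writing the status before and after the long-running step lets anything that reads the status file tell a re-index in progress apart from a finished one.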

  • crontab
    • Edit crontab: shell> crontab -e
    • List crontab: shell> crontab -l
    • Remove crontab: shell> crontab -r
    • Install crontab from a file: shell> crontab file

    • log: /var/log/cronr
    • format: six fields per entry; use * as a wildcard
      1. min (0-59)
      2. hour (0-23)
      3. day/month (1-31)
      4. month (1-12)
      5. day/week (0-6)
      6. command
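To make the six-field format concrete, the following sketch splits one crontab entry into its labeled fields. The helper describe_cron_line is hypothetical and not part of LexBuild; the sample entry is the auto backup entry listed earlier. In standard cron, a day/week value of 1-5 means Monday through Friday.

```shell
#!/bin/sh
# Split one crontab entry into its five time fields plus the command.
# describe_cron_line is a hypothetical helper, not part of LexBuild.
describe_cron_line() {
    # Disable globbing so the literal * fields are not expanded to file names.
    set -f
    # Word-split the entry on whitespace into positional parameters.
    set -- $1
    set +f
    if [ "$#" -lt 6 ]; then
        echo "error: expected 5 time fields and a command" >&2
        return 1
    fi
    echo "min=$1"
    echo "hour=$2"
    echo "day_of_month=$3"
    echo "month=$4"
    echo "day_of_week=$5"
    # Everything after the fifth field is the command line.
    shift 5
    echo "command=$*"
}

describe_cron_line '40 5 * * 1-5 /usr/local/Lsg/Projects/LB/LexBuild/Tools/LoadDb/DbScript < /usr/local/Lsg/Projects/LB/LexBuild/Tools/LoadDb/DbBackup.option'
```

For this entry the sketch reports minute 40, hour 5, any day of the month, any month, and days 1-5 of the week, followed by the command with its input redirection.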

  • Location for Related Files
    • Db config file:
      • ${LB_DIR}/data.${HOST_NAME}/WebApp/Config/cg1.cfg
    • gSpell Dictionary:
      ${APPLICATION}/GSpell/gspell_V0.0.40.${HOST_NAME}/nls/gspell/dictionaries:
      • LexiconLb
      • LexiconLbTemp