Lexical Tools

Annual Release Procedures

This page describes the annual release procedure for the Lexical Tools package with a new set of lexicon data.

I. lvg${YEAR} baseline (without new data)

  1. Prepare lvg${YEAR} baseline
    • Go to ${LHC_GIT}/lvg (this is supposed to be the latest version)
    • shell> git checkout develop
    • Change ${PREV_YEAR} to ${YEAR} in:
      • pom.xml files under ${LVG}
      • build.xml in ${LVG}, ${LVG}/examples, and ${LVG}/install
    • Change ${PREV_YEAR} to ${YEAR} in $LVG/overview.html
    • Update $LVG/data/config/lvg.properties, lvg.properties.TEMPLATE
      • ${PREV_YEAR} to ${YEAR}
      • LVG_DIR

    • Build with shell> ant release
      • should build successfully without issues
      • generates ./lib/lvg${YEAR}api.jar
      • generates ./lib/lvg${YEAR}dist.jar
      • generates ./lib/lvg${YEAR}lite.jar
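
The year substitution in the steps above can be scripted. The following is a minimal sketch, not part of the LVG tooling: the function name bump_year is hypothetical, and it assumes that replacing every occurrence of the previous year string is acceptable for the files passed in.

```shell
# Hypothetical helper (not part of the LVG tooling): replace every
# occurrence of the previous year with the new year in the given files.
# Usage: bump_year PREV_YEAR YEAR file...
bump_year() {
    prev=$1
    year=$2
    shift 2
    for f in "$@"; do
        # Write to a temporary file first so a failed sed never truncates $f;
        # copying back with cat keeps the original file permissions.
        sed "s/$prev/$year/g" "$f" > "$f.tmp" && cat "$f.tmp" > "$f"
        rm -f "$f.tmp"
    done
}
```

A call might look like bump_year 2024 2025 pom.xml build.xml examples/build.xml install/build.xml. Because the replacement is blunt (it also touches dates or copyright lines that contain the old year), reviewing the result with git diff before building is advisable.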

  2. Update Java source code
    • Modify the prolog of the Java files to:
      • Remove all LEX-XX entries from the history tag
      • Change ${PREV_YEAR} to ${YEAR} in the version tag
      		shell> cd ${LHC_GIT}/lvg-baselinecode/BaselineCode/bin
      		shell> ModifyLvgJavaCode
      		> YYYY (${YEAR})
      		> 1
      		> y
      		

      => move java.new to java

      		shell> cd ${LHC_GIT}/lvg/src/main
      		shell> rm -rf java
      		shell> mv java.new java
      		
      		shell> cd ${LHC_GIT}/lvg/src/main
      		shell> fgrep "LEX-" ./java/*/*/*/*/*/*/*.java
      		=> make sure no "LEX-" remains in the source code
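
The fgrep glob above only matches files at one fixed directory depth. As a hedged alternative, the sketch below searches every .java file at any depth; the function name check_no_lex_tags is hypothetical and not part of the LVG tooling.

```shell
# Hypothetical helper: list every .java file under a directory that still
# contains the literal string "LEX-", at any package depth.
# Returns 0 only when no such file exists.
check_no_lex_tags() {
    src_dir=$1
    # "|| true": find reports failure when grep finds nothing, which is the
    # good case here, so the exit status of the pipeline is ignored.
    hits=$(find "$src_dir" -type f -name '*.java' \
               -exec grep -lF -e 'LEX-' {} +) || true
    if [ -n "$hits" ]; then
        printf '%s\n' "$hits"
        return 1
    fi
    return 0
}
```

Run from ${LHC_GIT}/lvg/src/main, check_no_lex_tags ./java prints the offending files, if any, and succeeds silently otherwise.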
      		

  3. Update Database for baseline

    => reload the old data into the database and update DB_NAME to ${YEAR}
    Please refer to the Update DB section

    • Update ${YEAR} in $LVG_DIR/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/HSqlDb/db.cfg
    • ${LVG_DIR}/loadDb/bin/2.LoadDb ${YEAR}
    • add readonly=true in $LVG_DIR/data/HSqlDb/lvg${YEAR}.properties
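
Adding readonly=true by hand is easy to forget or to do twice. The sketch below makes the step repeatable; ensure_readonly is a hypothetical name, and the sketch assumes the properties file holds one key=value pair per line.

```shell
# Hypothetical helper: make sure a properties file contains the line
# "readonly=true", rewriting any existing readonly= entry instead of
# appending a second one.
ensure_readonly() {
    props=$1
    if grep -q '^readonly=' "$props"; then
        sed 's/^readonly=.*/readonly=true/' "$props" > "$props.tmp" \
            && cat "$props.tmp" > "$props"
        rm -f "$props.tmp"
    else
        printf 'readonly=true\n' >> "$props"
    fi
}
```

For this step the call would be ensure_readonly $LVG_DIR/data/HSqlDb/lvg${YEAR}.properties; running it a second time leaves the file unchanged.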

  4. Update Installation program for baseline

    => update installation programs:
    Please refer to the Update Installation Program section

    • Update ${YEAR} in installation script under ${LVG}/install/bin/
    • Update ${YEAR} in ${LVG_DIR}/install/sources/gov/nih/nlm/nls/lvg/install/Setup/Param.java

  5. Build/install the baseline

    => build and test to make sure the result is the same as the last release

    		> cd ${LVG}
    		> ant clean
    		> ant release
    		> packLvgFull ${YEAR}
    		or
    		> packLvgFull ${YEAR} TRUE FALSE TRUE FALSE
    		> install to ${PROJECT}
    		> The installation should be successful
    		

  6. Test Baseline: build the unit test suite

    Please see Regression test on Flows Unit pages. The unit test results should be the same as last year's because neither functions nor data have changed at this point. The only differences are:

    • ci.out (on ${LVG_DIR} and ${DB_NAME})
    • v.out (on ${YEAR})
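
The comparison against last year's results can be automated. This is a minimal sketch under stated assumptions: the unit test outputs of each year sit as plain files in one flat directory, and compare_baseline is a hypothetical name, not an existing LVG script.

```shell
# Hypothetical helper: compare two directories of unit-test outputs file by
# file, skipping ci.out and v.out, which are expected to differ.
# Prints the name of each differing or missing file; returns 1 if any.
compare_baseline() {
    old_dir=$1
    new_dir=$2
    status=0
    for old in "$old_dir"/*; do
        name=$(basename "$old")
        case $name in
            ci.out|v.out) continue ;;
        esac
        # cmp -s is silent and fails for both differing and missing files.
        if ! cmp -s "$old" "$new_dir/$name"; then
            printf '%s\n' "$name"
            status=1
        fi
    done
    return $status
}
```

An empty output with exit status 0 means the baseline matches last year's release apart from the two expected files.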

=> The baseline (without feature/package updates) is complete after the above 6 steps!

Tag this baseline without new features (git tag v.${YEAR}.1.baseline)

II. Upgrade all software components

  1. Update DB
    • Update HyperSonic JDBC Driver:
      • Download the latest version:
        to ${DEV_DIR}/Database/HSqlDb/hsqldb.N.N.N/.. and unzip, then copy
        • ./doc/hsqldb_lic.txt
        • ./doc/hypersonic_lic.txt
        • hsqldb-2.7.2-jdk8.jar

        to ${LVG}/lib/jdbcDrivers/HSqlDb/.
    • Compile:
      • Update hsqldb version in ${LVG}/pom.xml

      • > cd ${LVG_DIR}/loadDb
      • > ant clean
      • > ant cleanJar
      • > ant

        For mvn:

      • in 2024, use hsqlDb.2.7.2-jdk8
        		<artifactId> hsqldb </artifactId>
        		<version> 2.7.2 </version>
        		<classifier> jdk8 </classifier>
        		
      • settings.xml (global and local) was also modified to add lhc-nexus as a repository (because of limited internet connectivity on subnet 106).
    • Load previous data into latest HyperSonic Sql Db:
      • > vi ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/HSqlDb/db.cfg
        • DB_NAME = lvg${YEAR}
        • LVG_DIR = /nfsvol/lex/Lu/LHC_Git/lvg
      • > mv ${LVG_DIR}/data/HSqlDb ${LVG_DIR}/data/HSqlDb.${PREV_YEAR}
      • > cd ${LVG_DIR}/loadDb/bin
      • > ./2.LoadDb ${YEAR}
        => The source code might need to be modified (SQL state and vendor error code) to handle dropping a non-existent table if HSqlDb is upgraded to a new SQL/JDBC standard
        		DbBase.ExecuteDdl(String query)
        		...
        		if((e.getSQLState().equals("25006"))
        		&& (e.getErrorCode() == -3706))
        		...
        		
      • add readonly=true in ${LVG_DIR}/data/HSqlDb/lvg${YEAR}.properties
        => This is required for HSqlDb to support multi-threaded access
    • Test:
      • > cd ${LVG_DIR}/loadDb/bin
      • > ./3.TestDb ${YEAR}
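
The db.cfg edit in this section (DB_NAME and LVG_DIR) can be scripted instead of done in vi. The sketch below assumes db.cfg uses the "KEY = value" layout shown above, one entry per line; set_cfg_value is a hypothetical name.

```shell
# Hypothetical helper: set "KEY = value" in a config file that uses the
# "KEY = value" layout, leaving all other lines untouched.
# Limitation: the value must not contain "|" or "&".
set_cfg_value() {
    cfg=$1
    key=$2
    value=$3
    # "|" is the sed delimiter so that path values containing "/" are safe.
    sed "s|^$key *=.*|$key = $value|" "$cfg" > "$cfg.tmp" \
        && cat "$cfg.tmp" > "$cfg"
    rm -f "$cfg.tmp"
}
```

For example, set_cfg_value db.cfg DB_NAME lvg${YEAR} updates the database name before 2.LoadDb is run.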

  2. Update ICU4J
    • Download the latest version of icu4j-xx.jar
    • for Ant: put icu4j-xx.jar to ${LVG}/lib/unicode/
    • for mvn: modify pom.xml
    • Update icu.jar in ${LVG}/build.xml
    • Update icu version in ${LVG}/pom.xml

  3. Update JRE
    • Download the latest version of Java (JRE), or an older archived version, to ${LVG_DIR}/bin/jreDist/:
      • jre-XuXX-linux-x64.tar.gz (64-bit)
      • jre-XuXX-windows-x64.exe (64-bit)
        => Need to download from 64-bit PC, then change .com to .exe

    • Install the latest JDK to /usr/local/Applications/Java/
      • set JAVA_HOME in ~/.cshrc

      • /usr/bin/java -> link to the real java
      • /usr/bin/javac -> link to the real javac

      • /etc/alternatives/java -> link to the real java
      • /etc/alternatives/javac -> link to the real javac

      • source ~/.cshrc
    • update ${jdk.dir} in ./build.xml and */build.xml (for javac and javadoc)

  4. Update Installation Program
    • Change program parameters:
      • > cd ${LVG_DIR}/install/sources/gov/nih/nlm/nls/lvg/install/Setup
      • Update all data members in Param.java
        • VERSION
        • JRE_DIR
      • Update all scripts in install/bin
        • Update lvg${YEAR}
        • Update Java version
    • Compile:
      • > cd ${LVG_DIR}/install
      • > ant clean
      • > ant cleanJar
      • > ant

      • > cd ${LVG_DIR}
      • > ant dist
      • > ant release
    • Test:
      • Quick Test:
        • > cd $LVG_DIR
        • > ./install/bin/install_test.sh

        • Change JAVA=java for all *_test
        • Change name for all *_test to *
          • > mv lgt_test lgt
          • > mv luiNorm_test luiNorm
          • > mv lvg_test lvg
          • > mv norm_test norm
          • > mv toAscii_test toAscii
          • > mv wordInd_test wordInd
          • > mv fields_test fields
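
The seven mv commands above follow one pattern, so they can be expressed as a loop. This is a sketch only: promote_test_scripts is a hypothetical name, and the directory holding the *_test scripts is passed in as an argument because the procedure does not state it.

```shell
# Hypothetical helper: strip the "_test" suffix from every *_test script in
# a directory (lgt_test -> lgt, norm_test -> norm, ...).
promote_test_scripts() {
    bin_dir=$1
    for f in "$bin_dir"/*_test; do
        [ -e "$f" ] || continue      # the glob matched nothing
        mv "$f" "${f%_test}"
    done
}
```

The loop overwrites an existing script of the same name, exactly as the manual mv commands do.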

      • Full Test: Pack and perform unit test again to complete the baseline
        		> cd ${LVG}
        		> packLvgLite ${YEAR}
        		  => must run this to remove old lvg${YEAR}lite and start everything from scratch
        		  => test lvg lite, see LVG lite Test 
        		> ant clean
        		> ant release
        		  => generate ./lib/lvg${YEAR}api.jar and ./lib/lvg${YEAR}dist.jar
        		> packLvgFull ${YEAR}
        		  => packLvgFull ${YEAR} TRUE FALSE TRUE FALSE
        		  => generate ./lvg${YEAR}.tgz (use jar from ant)
        		

        and

        		> make build (test on maven build)
        		  => generate ./target/lvg-${YEAR}.0-SNAPSHOT.jar and ./target/lvg-${YEAR}.0-SNAPSHOT-shaded.jar
        		> packLvgFull ${YEAR} ${PACK} ${UPLOAD} ${SNAPSHOT} ${MVN_JAR}
        		  => packLvgFull ${YEAR} TRUE FALSE TRUE TRUE
        		  => pack, not upload, use snapshot built, use mvn jar
        		  => generate lvg${YEAR}.tgz (use jar from mvn)

        		> install to ${PROJECT}
        		> perform Unit tests for the above 2 lvg${YEAR}.tgz
        		  => Update config file: lvg.properties.hsql
        		  => make sure the results are the same, with no diff between the above two packages
        		> Tag this baseline with upgraded software (git tag -a v.${YEAR}.2.baseline)

III. Update functions: Complete SCRs for ${YEAR} release

  • Update ${YEAR} in ${LVG_SRC_DIR}/Lib/GlobalBehavior.java
  • Complete all SCRs

======================================================
This should be done before freezing the Lexicon (July).
======================================================

IV. Update data: Integrate with New Lexicon Data

  • Use LexBuild to generate new LEXICON
  • Generate the derivations (zeroD, prefixD, suffixD): 3~4 weeks work (with new tagged files and annual updates)
  • Generate the lexSynonyms (CUI, EUI, NLP): 1~2 weeks work (after receiving tagged files)
  • Generate the lexAntonyms (LEX, SD, PD, CC, SN): 1~2 weeks work (after receiving tagged files)

  • LVG-PreDataBase: Procedures to prepare all data files for Lexical Tools
    You may also prepare all these tables manually; see below for details:

  • Update SD-Rules - ${LVG_DIR}/data/rules/dm.rul (2 weeks work!):
    • This step may be simplified if no new SD-Rule is evaluated/added. The same SD-Rule set from the previous release can be used with updated exceptions.
    • If new SD-Rules are evaluated, please follow SD-Rules Evaluation/Optimization to get the optimized SD-Rules (must be done if there are new SD-Rules)
    • Theoretically, the optimized SD-Rules set should be updated annually
    • Update the html links:
      • ${LVG_DERIVATIONS}/derivations/SD-Rules-Opti/Ex-current
      • ${LVG_DERIVATIONS}/derivations/SD-Rules-Opti/optiSet.html
    • Re-generate SD-Rule trie for Lvg after updating the optimized set
      • Input files:
        • dir: ${SUFFIX_D}/data/${YEAR}
        • ./data/${YEAR}/data/suffixD.no.data (for exceptions)
        • ./data/${YEAR}/dataR/SdRulesOptimum/${RULE}/sdRules.stats.out.xx.x.opti (for optimized SD-Rule set)
          => link the result from optimized set to ./data/${YEAR}/dataR/sdRules.stats.out
          Use the previous optimized set if no new Sd-Rules are evaluated
          • 2018 used 2017 (no new SD-Rules were evaluated)
          • 2019 used 2017 (no new SD-Rules were evaluated)
          • 2020 used new SD-Rules (./SdRulesOptimum/16.ility-le/sdRules.stats.out.16.1.opt)
          • 2021 used new SD-Rules (./SdRulesOptimum/35.ity-y/sdRules.stats.out.35.3.opt)
          • 2022 used 2021 (no new SD-Rules were evaluated)
          • 2023 used 2022 (no new SD-Rules were evaluated)
          • 2024 used new SD-Rules (./SdRulesOptimum/37.ity-y/sdRules.stats.out.37.1.opt)

      • shell>cd ${SUFFIX_D}/bin
        shell>GetSdRule ${YEAR}
        8
        ${GOOD_RULE_NO} (105)
      • Output: ${SUFFIX_D}/data/${YEAR}/dataR/dm.rul.${YEAR}.${GOOD_RULE_NO}
      • Copy to ${LVG}/data/rules/dm.rul
        => This is the rule file that Lexical Tools loads into its trie mechanism
        => To test it, the stem must be >= 4
        shell> ./bin/lvg -f:d -kd:3 -p -m
        > 1234se
        > ...
        > 1234se|1234zation|128|1|d|1|RULE|se$|verb|base|zation$|noun|base| <= ensure this is in the output
        > ...
        => use -kd:3 to include results from SD-Rules
        => 1234 is the stem (>= 4)

        shell>cp -p ${DERIVATION_DIR}/3.suffixD/data/${YEAR}/dataR/dm.rul.${YEAR}.${GOOD_RULE_NO} ${LVG_DIR}/data/rules/dm.rul.${YEAR}.${GOOD_RULE_NO}
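
The manual check of the SD-Rule trie (looking for the 1234se to 1234zation line in the lvg output) can be turned into a scripted check. The sketch below only inspects text on standard input; has_sd_rule_result is a hypothetical name, and whether lvg accepts its input term through a pipe is an assumption the procedure does not confirm.

```shell
# Hypothetical helper: read lvg output on stdin and succeed only if it
# contains a line that starts with "1234se|1234zation|" and carries the
# "|RULE|" marker, i.e. a derivation produced by an SD-Rule.
has_sd_rule_result() {
    grep -qE '^1234se\|1234zation\|.*\|RULE\|'
}
```

If lvg does read from a pipe, the check could be run as echo 1234se | ./bin/lvg -f:d -kd:3 -p -m | has_sd_rule_result; otherwise the interactive output can be saved to a file and fed to the function.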

V. Update other software components

  • Update $LVG_DIR/sources/gov/nih/nlm/nls/lvg/Tools/GuiTool
  • Update $LVG_DIR/examples
    Example test: Run all programs under ${LVG_DIR}/examples
    • ${LVG_DIR}/examples/bin
      • Go through ReadMe.txt and run all examples to make sure they work
      • Update ${YEAR} in testExample script
      • Add API examples if new tool APIs are added

  • Update lvg web site and documents:
    • apiDoc: should be done in the build process
    • Web Pages:
      • ${LVG_WEB}/web/index.html
      • ${LVG_WEB}/web/download.html (to be finalized later)
      • ${LVG_WEB}/web/faq.html (to be finalized later)
      • ${LVG_WEB}/web/release/index.html
      • ${LVG_WEB}/web/release/${YEAR}.html (to be finalized later)
    • userDoc (${LVG_WEB}/docs/userDoc):
      • ${USER_DOC}/install/releaseNotes.html
      • ${USER_DOC}/install/install.html
      • ${USER_DOC}/install/config.html
      • ${USER_DOC}/install/mySql.html
      • ${USER_DOC}/install/installManual.html
      • ${USER_DOC}/install/repository.html (to be finalized later)

      ================================================================

      => This is good enough for the internal release to OCCS (if the schedule is tight)

      => Unit tests from the following section should be performed before the OCCS internal release

      ================================================================

      The items below can be done after the internal release but must be completed before the official public release. In the past, however, we have tried to complete them before the internal release.

    • designDoc:
      • ${LVG_WEB}/docs/designDoc/LifeCycle/deploy/release/index.html (this page: updates new processes)
      • ${LEXICON_WEB}/docs/designDoc/UDF/derivations (updated in the Lexicon web site from 2015 on, such as dGrowth.html)
      • ${LEXICON_WEB}/docs/designDoc/UDF/synonyms (updated in the Lexicon web site, such as sGrowth.html)
      • ${LEXICON_WEB}/docs/designDoc/UDF/antonyms (updated in the Lexicon web site, such as aGrowth.html)

VI. Test & More Documents

  • Tests:
    Test Type               Descriptions
    unit test               test all flows and options
    lite test               test lvg${YEAR}lite.tgz
    ASCII test              Check non-ASCII on norm and luiNorm
    GUI Lexical Tool test   Test Lgt - Lvg GUI tools
    Performance test        Test performance on norm and luiNorm
    Platform Test           release test on different platforms
    Other tests             Other tests not listed in this table
  • Documents:

    • Unit Test Documents:
      => ${TEST}/LVG/UnitTest/bin/
      • Update test results to Web site Doc
      • Source: ${TEST}/LVG/UnitTest/data/${YEAR}
      • Target: ${LHC_GIT}/lvg-testexamples/data/Html.${YEAR}
      • Deploy: ${WWW_LVG}/htdocs/docs/designDoc/LifeCycle/test
      • Process:
        • must install LVG to ${PROJECT}
        • Must complete unit tests
        • shell> cd ${LVG}/PostProc/TestExamples/bin
          Command                                      Description
          shell> 1.GenerateExampleHtmlFiles ${YEAR}    convert unit test results to Html
          shell> 2.DeployExampleHtmlFiles ${YEAR}      deploy unit test Html file to ${WWW_LVG}

        • Manually update flow and option web pages
          • flow examples: ${LVG_WEB}/docs/designDoc/UDF/flow/*.html
          • option examples: ${LVG_WEB}/docs/designDoc/LifeCycle/requirement/lvgOptions/*.html

        • ASCII Test Documents
          • Update Unicode Examples to Web site Doc
          • Source: ${LHC_GIT}/lvg-unicodetables/data/Html.${YEAR}
          • Target: ${WWW_LVG}/htdocs/docs/designDoc/UDF/unicode
            • DefaultTables
            • MapTables
          • Processes:
            • => Need to update ./lib/icu4j.jar and lvgDist.jar and install LVG to ${PROJECT}
            • shell> cd /${LHC_GIT}/lvg-unicodetables/bin
              Command & options                            Description
              shell> 0.VerifyFiles ${YEAR}                 check ${LVG}/data/Unicode?*.data
              shell> 1.GenerateDefaultHtmlFiles ${YEAR}    generate default html files
              shell> 2.GetUnicodeCoreNormResult ${YEAR}    generate core norm html file
              shell> 3.NormResults ${YEAR}                 generate norm html files
              shell> 4.DeployHtmlFiles ${YEAR}             deploy html files to ${WWW_LVG}
              4

    ============================================================

    The above tests should be completed for the internal release to OCCS. This is the best-case scenario for internal testing. The following steps are needed for the final official public release.

    ============================================================

  • Performance test on norm and luiNorm:
  • replace indents with 4 spaces in the source code
    Change "\t" to "    " for all Java code under $LVG_DIR/sources/gov/nih/nlm/nls/lvg/
    shell> cd ${GIT}/lvg-baselinecode/bin/ModifyLvgJavaCode
    ${YEAR}
    2
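
For reference, the tab replacement described above can be sketched in plain shell. This is not the ModifyLvgJavaCode script itself; tabs_to_spaces is a hypothetical name, and the sketch assumes a literal replacement of each tab by four spaces is what is wanted.

```shell
# Hypothetical helper: replace every tab character with four spaces in all
# .java files under a directory. This is a literal replacement, not a
# tab-stop-aware expansion like expand(1).
tabs_to_spaces() {
    src_dir=$1
    tab=$(printf '\t')               # portable way to get a literal tab
    find "$src_dir" -type f -name '*.java' | while IFS= read -r f; do
        sed "s/$tab/    /g" "$f" > "$f.tmp" && cat "$f.tmp" > "$f"
        rm -f "$f.tmp"
    done
}
```

Afterwards, grep -rl "$(printf '\t')" on the same directory should list no .java files.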
    

  • Final check of all files in the repository (go through them manually)

  • Examples Test
    Update flow and option web pages (manually)
    • flows: ${LVG_WEB}/docs/designDoc/UDF/flow/*.html (update flow examples)
    • options: ${LVG_WEB}/docs/designDoc/LifeCycle/requirement/lvgOptions/*.html (update option examples)

VII. Lvg Compile & Pack

  • > cd $LVG_DIR
  • > ant clean
  • > ant
  • > ant release

    After 2020, CI/CD (maven build) is applied for the official release. The Ant build is still used for development because the process is simpler.

  • pack: gtar -czvf lvg${YEAR}.tgz lvg${YEAR}
  • unpack: gtar -xzvf lvg${YEAR}.tgz
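
The pack step can be combined with a quick integrity check of the archive. A minimal sketch, assuming GNU or BSD tar is available as "tar" (the procedure above calls it gtar); pack_release is a hypothetical name.

```shell
# Hypothetical helper: pack a release directory into <dir>.tgz next to it
# and verify that the resulting archive can be listed.
pack_release() {
    release_dir=$1                       # e.g. /path/to/lvg2025
    parent=$(dirname "$release_dir")
    name=$(basename "$release_dir")
    # -C keeps the paths inside the archive relative (lvg2025/...).
    tar -czf "$parent/$name.tgz" -C "$parent" "$name" \
        && tar -tzf "$parent/$name.tgz" > /dev/null
}
```

The function returns a non-zero status if either the pack or the listing fails, so it can gate later release steps.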

VIII. Pack LVG programs to LEX/LEX_PGMS and LEX.tgz in AA Release

IX. Update Web Tools

  • Copy $WEB_LVG/${PREV_YEAR} to $WEB_LVG/${YEAR} ...