The SPECIALIST Lexicon

Lexicon Web Site Annual Update Procedure

This page describes the annual update procedure for the SPECIALIST Lexicon web site.

I. Baseline

  • Check out from LHC-Git: wwwlexicon
  • Dir: ${LHC_GIT}/wwwlexicon-p
  • Use only the master branch
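
The branch rule above can be checked before any edits are made. The following is a minimal sketch, not part of the original procedure: require_master is a hypothetical helper name, and the working-copy directory is supplied by the caller.

```shell
# Sketch: confirm a working copy is on the master branch before editing.
# require_master is a hypothetical helper, not an existing project script.
require_master() {
  local repo_dir="$1"
  local branch
  # symbolic-ref prints the short branch name; it fails on a detached HEAD
  # or when the directory is not a git working copy.
  branch="$(git -C "$repo_dir" symbolic-ref --short -q HEAD 2>/dev/null)" || {
    echo "cannot determine branch for: $repo_dir" >&2
    return 2
  }
  if [ "$branch" != "master" ]; then
    echo "on branch '$branch', expected 'master'" >&2
    return 1
  fi
  echo "OK: $repo_dir is on master"
}
```

Possible usage, assuming ${LHC_GIT} is set as in the Dir entry above: require_master "${LHC_GIT}/wwwlexicon-p"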

II. Contents Update

  • Update Statistics - both Lexicon growth and wordCount growth
    • Dir: ${WWW_LEXICON}/htdocs/docs/designDoc/UDF/statistics/
    • Update lexiconGrowth.jpg (from PC, MS-Excel)
      • In PC, copy ${WORK}/${YEAR}/Projects/Lexicon/Stats/growth.${YEAR}.xls from previous year
      • Open and update growth.${YEAR}.xls
      • Add records and forms from ./outputs/statistics.txt to growth.${YEAR}; the differences are auto-calculated
      • Add ${YEAR} data/Year to the chart
        • Move cursor to the chart
        • Right Click and select "Select Data..."
        • Edit Legend Entries (Year, Lexical Items, Inflected Forms) to select the data range, including the 1st row (Growth!$A$1:$C$3X)
          => Click on Legend Entries (Series): Year, Lexical Items, Inflected Forms -> click Edit and change Series Values (30 -> 31)
        • Make a similar edit to the Horizontal Axis labels.
          => Click on Horizontal Axis Labels -> click Edit and change the Axis label range (30 -> 31), or use the cursor to select column A (excluding the 1st row) so that it covers the right year values (Growth!$A$2:$A$31); see the next step

      • Edit Horizontal Axis (Year) Labels
        • Click "Year" on the dialog window
        • Click Edit button (on the right side)
        • Select range: (Growth!$A$2:$A$25), select year range by cursor/shift
      • Save the image
        • Copy to PowerPoint
        • Use Save as Picture and save in JPG format

          or

        • Open Paint
        • Copy the chart from Excel and paste to Paint
        • Save as jpg (${YEAR}LexiconGrowth), include frame
      • Add ${UDF}/statistics/${YEAR}.html
      • Update ${UDF}/statistics/growth.html

    • Similar steps for word stats
      • growth.${YEAR}.xls
      • Run LMW vs LSW stats
        • shell>mkdir ${LMW}/data/${YEAR}/inData
        • shell>cd ${LMW}/data/${YEAR}/inData
        • shell>ln -sf ${LEXICON}/data/${YEAR}/tables.release/inflVars.data inflVars.data
        • shell>cd ${LMW}/bin
        • shell>01.ElementWords ${YEAR}
          7
      • Update WordCount in growth.${YEAR}.xls with data from ${LMW}/${YEAR}/outData/01.ElementWord/LexStats.data
      • Add ${YEAR} data/Year to the chart
        • Move cursor to the chart
        • Right Click and select "Select Data..."
        • Select Year, then click Edit and update the series values
        • Repeat the above steps for Single Words and Multiwords
        • Save image as jpg in Paint
      • Add ${UDF}/wordStats/${YEAR}.html
      • Update ${UDF}/wordStats/growth.html
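
The directory and link setup in the "Run LMW vs LSW stats" steps above could be wrapped as follows. This is a sketch under stated assumptions: prepare_lmw_input is a hypothetical helper, and the LMW and LEXICON locations are passed as arguments instead of being read from the environment. It does not run 01.ElementWords itself.

```shell
# Sketch: prepare the LMW input directory for one release year.
# prepare_lmw_input is a hypothetical helper; its arguments mirror the
# ${LMW}, ${LEXICON} and ${YEAR} variables used in the steps above.
prepare_lmw_input() {
  local lmw="$1" lexicon="$2" year="$3"
  local src="${lexicon}/data/${year}/tables.release/inflVars.data"
  local in_dir="${lmw}/data/${year}/inData"

  # Stop early if the release table is missing, rather than
  # creating a dangling symbolic link.
  if [ ! -f "$src" ]; then
    echo "missing source file: $src" >&2
    return 1
  fi

  # -p creates missing parent directories and makes the step repeatable.
  mkdir -p "$in_dir" || return 1
  # -s makes a symbolic link; -f replaces an existing link of the same name.
  ln -sf "$src" "${in_dir}/inflVars.data"
}
```

Possible usage: prepare_lmw_input "${LMW}" "${LEXICON}" "${YEAR}", followed by the 01.ElementWords step from ${LMW}/bin.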

    • Update POS distribution
      • use data in ${LEXICON}/data/${YEAR}/outputs/statistics.txt
      • POS-${YEAR}
      • Update POS.
      • Usually, only the top 4 POS change (Noun, Adj, Verb, Adv)

      • Update ${UDF}/statistics/posDist.html
  • Routinely update the web site as follows:
  • Update download
    • Dir: ${WWW_LEXICON}/htdocs/web/release/
    • Add ${YEAR}.html
    • Update index.html
  • Update ${YEAR} on the following pages:
    • ./web/index.html (home page)
  • Update Lexicon release
    • Copy from ${BACKUP}/Releases/UMLS/${YEAR}_AA_release to ${WWW_LEXICON}/htdocs/release/
      • LEX_DOC
        • ASCII
        • DOCS
        • LEX_DB
        • LEX_PGMS
        • LRxxx
        • MISC
        • NUMBERS
        • XML
      • LEX.tgz
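
After the release copy, a quick presence check may help catch an incomplete copy. The sketch below is illustrative only: check_release_copy is a hypothetical helper, and it assumes the subdirectories sit under LEX_DOC as the indentation of the list above suggests. LRxxx is a placeholder in that list, so it is not checked.

```shell
# Sketch: report release entries that are missing after the copy.
# check_release_copy is a hypothetical helper. The nesting under LEX_DOC
# is an assumption read from the list indentation; LRxxx is skipped
# because it is a placeholder name in the list.
check_release_copy() {
  local release_dir="$1"
  local missing=0 entry
  for entry in LEX_DOC/ASCII LEX_DOC/DOCS LEX_DOC/LEX_DB LEX_DOC/LEX_PGMS \
               LEX_DOC/MISC LEX_DOC/NUMBERS LEX_DOC/XML LEX.tgz; do
    if [ ! -e "${release_dir}/${entry}" ]; then
      echo "missing: ${entry}"
      missing=1
    fi
  done
  # Exit status 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

Possible usage: check_release_copy "${WWW_LEXICON}/htdocs/release"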

III. Ci-Cd Pre-Processes: upload data

Upload data to LHC-Nexus, then ask the Ci-Cd person (Anton) to upload it to LHC-Download

  • Dir: ${LHC_GIT}/wwwlexicon-p/
  • File: uploadFiles
    • make uploadRelease
      upload release files: LEX.tgz and LEX_DOC.tgz
    • make uploadAll
      upload release files: LEX.tgz, LEX_DOC.tgz and docs files (*.tgz)
  • Upload files to Nexus and lhncbc:
    • Upload files from development machine to Nexus
      • Source Dir: ./htdocs/release
      • Target Dir: https://lhc-nexus.nlm.nih.gov/#browse/browse:lhc-lexicon-raw -> www -> lexicon -> ${YEAR}
      • Processes:
        uploadFiles
      • Files
        • ./docs/LEX.tgz
        • ./docs/LEX_DOC.tgz

        • ./release/java8_data.tgz
        • ./release/mwData.tgz
        • ./release/mwConsumerData.tgz
        • ./release/baseOrder.tgz
      • Delete Files (these files are ignored by git)
        • htdocs/release/LEX.tgz
        • htdocs/release/LEX_DOC.tgz

        • htdocs/docs/designDoc/UDF/java8/java8_data.tgz
        • htdocs/docs/designDoc/UDF/multiwords/mwData.tgz
        • htdocs/docs/designDoc/UDF/multiwords/mwConsumerData.tgz
        • htdocs/docs/designDoc/UDF/lexRecord/content/baseOrder.tgz
    • Upload files from Nexus to LHC download site:
      • Source: https://lhc-nexus.nlm.nih.gov/#browse/browse:lhc-lexicon-raw -> www -> lexicon -> ${YEAR}/release
      • Target: https://data.lhncbc.nlm.nih.gov/public/lsg/lexicon/${YEAR}/release
      • Files:
        • ./LEX.tgz
        • ./LEX_DOC.tgz (expand it, then delete the tgz file)
      • File format type:
        • *.pdf: application/pdf
        • *.tgz: binary/octet-stream
        • others: text/plain
      • Processes:
        • Ask Anton to manually upload.
    • Download and verify files from web site URL:
      • downloadFiles ${YEAR} TRUE
      • unzipFiles ${YEAR} TRUE
      • target dir: ${HOME}/ci-cd-data/lexicon.${YEAR}
      • Check the md5sum for all files
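
The md5sum check above could be done by comparing each uploaded source file with its downloaded copy. This is a minimal sketch and not the project's own tooling: compare_md5 is a hypothetical helper, it assumes GNU md5sum is installed, and it only looks at files directly inside the two directories (no recursion).

```shell
# Sketch: compare md5 checksums of same-named files in two directories.
# compare_md5 is a hypothetical helper; it assumes GNU coreutils md5sum.
compare_md5() {
  local src_dir="$1" dl_dir="$2"
  local status=0 f name src_sum dl_sum
  for f in "$src_dir"/*; do
    [ -f "$f" ] || continue          # skip subdirectories and empty globs
    name="$(basename "$f")"
    if [ ! -f "$dl_dir/$name" ]; then
      echo "MISSING $name"
      status=1
      continue
    fi
    # Reading from stdin keeps the file name out of the md5sum output,
    # so only the hash field needs to be cut out.
    src_sum="$(md5sum < "$f" | cut -d' ' -f1)"
    dl_sum="$(md5sum < "$dl_dir/$name" | cut -d' ' -f1)"
    if [ "$src_sum" = "$dl_sum" ]; then
      echo "OK $name"
    else
      echo "MISMATCH $name"
      status=1
    fi
  done
  # Exit status 0 only when every source file has a matching copy.
  return "$status"
}
```

Possible usage: compare_md5 ./htdocs/release "${HOME}/ci-cd-data/lexicon.${YEAR}", adjusting the second path to wherever unzipFiles places the files.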

IV. Ci-Cd Processes: Development - Test - Stage - Deploy

  • Develop: develop on the local machine lexdev (${GIT}/wwwlexicon-p)
    		shell> git add -A
    		shell> git commit -m "LEX_..."
    		shell> git push
    		
  • Test: use a git tag to trigger the Ci-Cd pipeline and deploy to the Test site
    		shell> git tag v.${YEAR}.stage.##
    		shell> git push origin tag v.${YEAR}.stage.##
    		or
    		shell> git push --tags (push all tags to origin)
    		
  • Stage: Use the Git Ci-Cd pipeline and manually send to the Stage site by pressing the Deploy stage button
  • Deploy: Contact LHC-Portal staff by email or JIRA ticket
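
Mistyped tag names are easy to produce by hand, so the tag used in the Test step could be built by a small function. This is a sketch under an assumption: the ## in the tag pattern is taken to be a two-digit, zero-padded sequence number, which the procedure does not state explicitly. stage_tag is a hypothetical helper and only prints the name; it does not call git.

```shell
# Sketch: build a tag name of the form v.<YEAR>.stage.<NN>.
# stage_tag is a hypothetical helper. Treating ## as a two-digit,
# zero-padded sequence number is an assumption.
stage_tag() {
  local year="$1" seq="$2"
  case "$year" in
    [0-9][0-9][0-9][0-9]) ;;
    *) echo "year must be four digits: $year" >&2; return 1 ;;
  esac
  case "$seq" in
    ''|*[!0-9]*) echo "sequence must be a number: $seq" >&2; return 1 ;;
  esac
  # Strip leading zeros so printf does not read 08 or 09 as invalid octal.
  seq="$(printf '%s' "$seq" | sed 's/^0*//')"
  [ -n "$seq" ] || seq=0
  printf 'v.%s.stage.%02d\n' "$year" "$seq"
}

# Possible usage with the commands from the Test step:
#   tag="$(stage_tag "${YEAR}" 1)" && git tag "$tag" && git push origin tag "$tag"
```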