Visual Tagging Tool

Tagging tool

Tagging is necessary in most NLP projects. Part-of-speech tagging (POS tagging or POST), also called grammatical tagging or word-category disambiguation, is the process of marking up each word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context. Developing a tagger (tagging tool) involves the following steps:

  • Define tags
  • Establish a gold standard training data set
    • Hand tagging by experts (requires a tool to ease the tagging process)
  • Develop an automatic tagging system (tagger)
    • Integrate multiple development processes (requires a standard file format for tagged records exchanged between processes)
  • Evaluation
    Compare the tagging results to the gold standard set using the following measures (requires a standard file format for tagged records):
    • Specificity
    • Sensitivity
    • Precision
  • Refine
    Use the evaluation results to refine the tagging algorithm/system (requires an integrated test suite, a testing tool, and a standard file format for tagged records)
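
The evaluation step above can be illustrated with a minimal Python sketch. It is not part of VTT and does not read the VTT file format. It assumes the gold standard and the tagger output have already been loaded as two equal-length lists of tags, one per token; the function name evaluate_tag and the sample tags are hypothetical. Each measure is computed for a single tag, treating that tag as the positive class.

```python
from collections import namedtuple

# Hypothetical container for the three measures named in the Evaluation step.
Scores = namedtuple("Scores", "sensitivity specificity precision")


def evaluate_tag(gold, predicted, tag):
    """Score one tag by comparing tagger output to the gold standard, token by token."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp = fp = tn = fn = 0
    for g, p in zip(gold, predicted):
        if p == tag and g == tag:
            tp += 1  # tagger and gold standard agree on the tag
        elif p == tag:
            fp += 1  # tagger assigned the tag, gold standard did not
        elif g == tag:
            fn += 1  # tagger missed a token the gold standard tagged
        else:
            tn += 1  # neither assigned the tag

    def ratio(numerator, denominator):
        # None signals an undefined measure (no tokens in the denominator).
        return numerator / denominator if denominator else None

    return Scores(
        sensitivity=ratio(tp, tp + fn),  # share of gold-tagged tokens found
        specificity=ratio(tn, tn + fp),  # share of other tokens left alone
        precision=ratio(tp, tp + fp),    # share of assigned tags that are right
    )


# Assumed sample data: five tokens, tagged by experts and by a tagger.
gold = ["DET", "NOUN", "VERB", "DET", "NOUN"]
predicted = ["DET", "NOUN", "NOUN", "DET", "VERB"]
noun_scores = evaluate_tag(gold, predicted, "NOUN")
print(noun_scores)
```

Scoring one tag at a time keeps the three measures well defined for a multi-tag task; an overall figure could be obtained by averaging the per-tag scores, which is a design choice rather than something the steps above prescribe.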

The VTT package includes all the features needed across the entire development life cycle of a tagging system:

  • Provides a tool for hand tagging
  • Provides a standard file format for tagged records, so that human hand tagging and the automatic tagging system can be integrated
  • Provides an evaluation tool that can be integrated into a test suite for refining the tagging algorithm while developing a tagging system