Lexical Tools

WordInd System Options

WordInd breaks a string into a unique list of lowercased "words". What counts as a "word" depends on how the string is tokenized. A word is currently defined as any token that consists only of alphanumeric characters and has a length of at least 1.

A token is a run of one or more non-whitespace, non-punctuation characters, as defined within the ISO Latin-1 character set and classified as such by Java. WordInd discards anything that is not a word.
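
The definition above can be approximated in a few lines. This is a minimal sketch, not WordInd's actual implementation: the function name `words` and its `keep_case` parameter are hypothetical, the pattern covers ASCII letters and digits only (WordInd relies on Java's character classification over ISO Latin-1, so accented characters may be handled differently), and keeping words in order of first appearance is an assumption based on the examples on this page.

```python
import re

# Assumption: ASCII letters and digits only. WordInd itself classifies
# characters through Java over ISO Latin-1, which this pattern does not cover.
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def words(text, keep_case=False):
    """Return the unique alphanumeric runs of text, in order of first appearance.

    keep_case is a hypothetical stand-in for the c option; when it is set,
    "The" and "the" are treated as distinct words (also an assumption).
    """
    seen = set()
    result = []
    for token in WORD_RE.findall(text):
        word = token if keep_case else token.lower()
        if word not in seen:
            seen.add(word)
            result.append(word)
    return result


print(words("Denis-Browne splint strapping"))
# ['denis', 'browne', 'splint', 'strapping']
print(words("This is a book.", keep_case=True))
# ['This', 'is', 'a', 'book']
```

Punctuation such as the hyphen in "Denis-Browne" acts as a token boundary, which is why that term yields two words.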

This page lists all system options for WordInd.

Original Flag	New Flag	Feature Descriptions
Input Filter Options:
tN	t:INT	Define the field to use as the input term field. The default is 1.
Global Behavior Options:
	c	Preserve the case of input terms.
h	h	Print program help information.
	hs	Print the option hierarchy structure.
	i:STR	Define input file name. The default is screen input.
	o:STR	Define output file name. The default is screen output.
	p	Show the prompt. The default is no prompt.
s'Char'	s:STR	Define a field separator for the input. The default is "|".
v	v	Return the current version identification of WordInd.
Output Filter Options:
oN	F:INT F:INT:INT:...	Copy specified field(s) from input to output.
n	n	Return a "-No Output-" message when an input produces no output.

Examples:

shell> wordInd -c
This is a book.
This
is
a
book

shell> wordInd -F:2:1
aa~bb~cc|dd~ee
dd~ee|aa~bb~cc|aa
dd~ee|aa~bb~cc|bb
dd~ee|aa~bb~cc|cc

shell> wordInd -t:7 -F:1:6
C0185495|ENG|P|L0223844|PF|S0298948|Denis-Browne splint strapping|3|
C0185495|S0298948|denis
C0185495|S0298948|browne
C0185495|S0298948|splint
C0185495|S0298948|strapping

shell> wordInd -i:in.data -o:out.data
Reads data from the file in.data and writes output to the file out.data.

shell> wordInd -n
$$$
-No Output-

shell> wordInd -s:/
aa/bb/cc|dd/ee
aa

shell> wordInd -t:2
a~bb~cc|dd~ee
dd
ee

shell> wordInd -t:2 -n
a~bb~cc||dd~ee
-No Output-

shell> wordInd -v
wordInd.2025
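
The way the t, F, s, and n options interact in the examples above can be mimicked with a short sketch. It is an illustration under stated assumptions, not WordInd's code: the function `word_ind_line` and its parameter names are hypothetical, the word pattern is ASCII-only, and joining the output with the same separator as the input is inferred from the examples rather than documented.

```python
import re

# Assumption: ASCII-only word pattern; WordInd uses Java character classes.
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def word_ind_line(line, term_field=1, copy_fields=(), sep="|", no_output_msg=False):
    """Hypothetical emulation of t:INT, F:INT:..., s:STR and n for one input line."""
    fields = line.split(sep)

    def field(n):
        # Field numbers are 1-based; a missing field is treated as empty.
        return fields[n - 1] if 1 <= n <= len(fields) else ""

    # Fields requested through F are copied in front of every output word.
    copied = [field(n) for n in copy_fields]

    seen = set()
    out = []
    for token in WORD_RE.findall(field(term_field)):
        word = token.lower()
        if word in seen:
            continue
        seen.add(word)
        # Assumption: output fields are joined with the input separator.
        out.append(sep.join(copied + [word]))

    if not out and no_output_msg:
        return ["-No Output-"]
    return out


for row in word_ind_line("aa~bb~cc|dd~ee", copy_fields=(2, 1)):
    print(row)
# dd~ee|aa~bb~cc|aa
# dd~ee|aa~bb~cc|bb
# dd~ee|aa~bb~cc|cc
```

With `sep="/"`, the line `aa/bb/cc|dd/ee` splits into four fields and the first one, `aa`, becomes the term, which matches the `-s:/` example.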