competition update
This commit is contained in:
719
language_model/srilm-1.7.3/man/html/nbest-scripts.1.html
Normal file
719
language_model/srilm-1.7.3/man/html/nbest-scripts.1.html
Normal file
@@ -0,0 +1,719 @@
|
||||
<! $Id: nbest-scripts.1,v 1.52 2019/09/09 22:35:37 stolcke Exp $>
|
||||
<HTML>
|
||||
<HEADER>
|
||||
<TITLE>nbest-scripts</TITLE>
|
||||
<BODY>
|
||||
<H1>nbest-scripts</H1>
|
||||
<H2> NAME </H2>
|
||||
nbest-scripts, combine-rover-controls, compare-sclite, compute-best-rover-mix, compute-sclite-nbest, compute-sclite, fix-ctm, merge-nbest, nbest-error, nbest-posteriors, nbest-rover, nbest-vocab, nbest-words, nbest2-to-nbest1, rescore-acoustic, rescore-decipher, rescore-reweight, rover-control-tying, rover-control-weights, search-rover-combo, sentid-to-sclite - rescore and evaluate N-best lists
|
||||
<H2> SYNOPSIS </H2>
|
||||
<PRE>
|
||||
<B>rescore-decipher</B> [ <B>-bytelog</B> ] [ <B>-nodecipherlm</B> ] [ <B>-multiwords</B> ] \
|
||||
[ <B>-multi-char</B> <I>C</I> ] [ <B>-pretty</B> <I>mapfile</I> ] \
|
||||
[ <B>-ngram-tool</B> <I>program</I> ][ <B>-filter</B> <I>command</I> ] \
|
||||
[ <B>-norescore</B> ] [ <B>-lm-only</B> ] [ <B>-count-oovs</B> ] [ <B>-limit-vocab</B> ] \
|
||||
[ <B>-vocab-aliases</B> <I>mapfile</I> ] [ <B>-fast</B> ] \
|
||||
<I>nbest-file-list</I> <I>score-dir</I> <B>-lm</B> ... <I>lm-options</I> ...
|
||||
<B>rescore-acoustic</B> <I>old-nbest-dir</I>|<I>old-file-list</I> <I>old-ac-weight</I> \
|
||||
<I>new-score-dir1</I> <I>new-ac-weight1</I> ... <I>new-nbest-dir</I> [ <I>max-nbest</I> ]
|
||||
<B>rescore-reweight</B> [ <B>-multiwords</B> ] [ <B>-multi-char</B> <I>C</I> ] <I>score-dir</I>|<I>file-list</I> \
|
||||
[ <I>lmw</I> [ <I>wtw</I> [ <I>score-dir1 score-weight1</I> ... ] [ <I>max-nbest</I> ]]]
|
||||
<B>rescore-minimize-wer</B> <I>score-dir</I> [ <I>lmw</I> [ <I>wtw</I> [ <I>max-nbest</I> ]]]
|
||||
<B>nbest2-to-nbest1</B> [ <I>nbest-file</I> ]
|
||||
<B>nbest-rover</B> [ <I>sentid-list</I> | <B>-</B> ] <I>control-file</I> \
|
||||
[ <I>posterior-file</I> [ <I>nbest-lattice-options</I> ] ]
|
||||
<B>combine-rover-controls</B> [ <B>lambda=</B><I>weights</I> ] [ <B>postscale=</B><I>S</I> ] \
|
||||
[ <B>keeppaths=1</B> ] <I>rover-control</I> [ ... ]
|
||||
<B>rover-control-weights</B> [ <B>weights=</B>"<I>w1 ... wn</I>" ] <I>control-file</I>
|
||||
<B>rover-control-tying</B> <I>control-file</I>
|
||||
<B>nbest-optimize-args-from-rover-control</B> [ <B>print_weights=1</B> ] [ <B>print_dirs=1</B> ] <I>control-file</I>
|
||||
<B>compute-best-rover-mix</B> [ <B>lambda=</B><I>weights</I> ] [ <B>addone=</B><I>c</I> ] [ <B>precision=</B><I>p</I> ] \
|
||||
[ <B>tying=</B>"<I>b1 ... bn</I>" ] [ <B>write_weights=</B><I>file</I> ] <I>reference-posteriors-output</I>
|
||||
<B>search-rover-combo</B> [ <B>-scorer</B> <I>script</I> ] [ <B>-datadir</B> <I>dir</I> ] \
|
||||
[ <B>-weights</B> <I>weights</I> ] [ <B>-sentids</B> <I>list</I> ] \
|
||||
[ <B>-refs</B> <I>refs</I> ] [ <B>-smooth-weight</B> <I>S</I> ] \
|
||||
[ <B>-J</B> <I>n</I> ] <I>list-of-control-files</I>
|
||||
<B>nbest-posteriors</B> [ <B>weight=</B><I>W</I> ] [ <B>lmw=</B><I>lmw</I> ] [ <B>wtw=</B><I>wtw</I> ] [ <B>postscale=</B><I>S</I> ] \
|
||||
[ <B>max_nbest=</B><I>M</I> ] <I>nbest-file</I>
|
||||
<B>merge-nbest</B> [ <B>multiwords=1</B> ] [ <B>multichar=</B><I>C</I> ] [ <B>nopauses=1</B> ] \
|
||||
[ <B>max_nbest=</B><I>M</I> ] <I>nbest-file</I> ...
|
||||
<B>nbest-vocab</B> [ <I>nbest-list</I> ... ]
|
||||
<B>nbest-words</B> [ <I>nbest-list</I> ... ]
|
||||
<B>nbest-oov-counts</B> <B>vocab=</B><I>vocabfile</I> [ <B>vocab_aliases=</B><I>aliasfile</I> ] <I>nbest-list</I>
|
||||
<B>nbest-error</B> <I>score-dir</I>|<I>file-list</I> <I>refs</I> [ <I>nbest-lattice-option</I> ... ]
|
||||
<B>sentid-to-sclite</B> <I>hyps</I>
|
||||
<B>sentid-to-ctm</B> <I>hyps</I>
|
||||
<B>fix-ctm</B> <I>ctmfile</I>
|
||||
<B>compute-sclite</B> <B>-r</B> <I>refs</I> <B>-h</B> <I>hyps</I> [ <B>-h</B> <I>hyps</I> ... ] [ <B>-S</B> <I>subset</I> ... ] \
|
||||
[ <B>-multiwords</B>|<B>-M</B> ] [ <B>-noperiods</B> ] [ <B>-R</B> ] [ <B>-g</B> <I>glmfile</I> ] [ <B>-H</B> ] \
|
||||
[ <B>-v</B> ] [ <I>sclite-options</I> ...]
|
||||
<B>compute-sclite-nbest</B> <I>file-list</I> <I>output-dir</I> -r <I>refs</I> [ <B>-filter</B> <I>script</I> ] [ <I>sclite-options</I> ...]
|
||||
<B>compare-sclite</B> <B>-r</B> <I>refs</I> <B>-h1</B> <I>hyps1</I> <B>-h2</B> <I>hyps2</I> [ <B>-S</B> <I>subset</I> ] \
|
||||
[ <I>compute-sclite-options</I> ... ]
|
||||
</PRE>
|
||||
<H2> DESCRIPTION </H2>
|
||||
These scripts perform common tasks on N-best hypotheses in
|
||||
<A HREF="nbest-format.5.html">nbest-format(5)</A>,
|
||||
especially those needed for rescoring and extracting and evaluating
|
||||
1-best hypotheses.
|
||||
<P>
|
||||
<B> rescore-decipher </B>
|
||||
applies a language model implemented by
|
||||
<A HREF="ngram.1.html">ngram(1)</A>
|
||||
to the N-best lists listed in
|
||||
<I>nbest-file-list</I>.<I></I><I></I><I></I>
|
||||
The N-best files may be in compressed format.
|
||||
The rescored N-best lists are stored in directory
|
||||
<I>score-dir</I>.<I></I><I></I><I></I>
|
||||
All following arguments are passed to
|
||||
<A HREF="ngram.1.html">ngram(1)</A>
|
||||
and are used to control the language model.
|
||||
The following options are handled by
|
||||
<B> rescore-decipher </B>
|
||||
itself:
|
||||
<DL>
|
||||
<DT><B> -bytelog </B>
|
||||
<DD>
|
||||
causes scores to be output on the bytelog scale
|
||||
(see
|
||||
<A HREF="nbest-format.1.html">nbest-format(1)</A>).
|
||||
<DT><B> -nodecipherlm </B>
|
||||
<DD>
|
||||
indicates that the recognizer language model is not being provided
|
||||
(with
|
||||
<B>-decipher-lm</B>).<B></B><B></B><B></B>
|
||||
(This is only possible if the N-best lists are not in ``NBestList1.0'' format.)
|
||||
<DT><B> -multiwords </B>
|
||||
<DD>
|
||||
specifies that N-best lists contain words joined by underscores, which are
|
||||
to be split into their component prior to rescoring.
|
||||
<DT><B>-multi-char</B><I> C</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
defines a multiword separator character.
|
||||
The default is underscore ``_''.
|
||||
<DT><B>-pretty</B><I> mapfile</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
specifies a word mapping file that allows individual words to be globally
|
||||
replaced by strings of zero or more other words, e.g., to remove vocabulary
|
||||
mismatches between the input N-best lists and the rescoring LM.
|
||||
The
|
||||
<I> mapfile </I>
|
||||
contains one mapping per line, the first field specifying the word to be
|
||||
replaced and subsequent fields forming the replacement string.
|
||||
<DT><B>-ngram-tool</B><I> program</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
specifies a non-standard
|
||||
<I> program </I>
|
||||
to perform the actual LM evaluation
|
||||
(by default,
|
||||
<A HREF="ngram.1.html">ngram(1)</A>
|
||||
is used).
|
||||
Such a program must understand
|
||||
<B>ngram</B>'s<B></B><B></B><B></B>
|
||||
command-line options related to N-best rescoring.
|
||||
<DT><B>-filter</B><I> command</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
specifies a
|
||||
<I> command </I>
|
||||
that is used to filter the N-best hypotheses prior to
|
||||
evaluating the language model.
|
||||
This may be used for more general textual rewriting so that non-standard
|
||||
LMs can be applied.
|
||||
The output N-best lists will contain the filtered hypotheses.
|
||||
<DT><B> -norescore </B>
|
||||
<DD>
|
||||
causes N-best lists to be simply reformatted from one of the Decipher formats
|
||||
into the SRILM N-best format, separating acoustic and LM scores, without
|
||||
replacing the existing LM scores.
|
||||
In this case only the
|
||||
<A HREF="ngram.1.html">ngram(1)</A>
|
||||
options
|
||||
<B>-decipher-lmw</B><B></B><B></B><B></B>
|
||||
and
|
||||
<B>-decipher-wtw</B><B></B><B></B><B></B>
|
||||
are relevant, and others are ignored.
|
||||
<B> -norescore </B>
|
||||
and
|
||||
<B> -filter </B>
|
||||
may be used together to perform textual rewriting of N-best lists.
|
||||
<DT><B> -lm-only </B>
|
||||
<DD>
|
||||
dumps out LM scores only, instead of complete N-best lists.
|
||||
<DT><B>-count-oovs</B><B></B><B></B><B></B>
|
||||
<DD>
|
||||
writes the count of out-of-vocabulary and zero-probability words to
|
||||
the output score files (instead of rescored N-best lists).
|
||||
<DT><B> -limit-vocab </B>
|
||||
<DD>
|
||||
saves memory by arranging for
|
||||
<A HREF="ngram.1.html">ngram(1)</A>
|
||||
to load only those N-gram parameters that are relevant to the vocabulary
|
||||
of the N-best lists to be rescored.
|
||||
After determining the N-best vocabulary the
|
||||
<B> -limit-vocab </B>
|
||||
option is passed to
|
||||
<A HREF="ngram.1.html">ngram(1)</A>.
|
||||
<DT><B>-vocab-aliases</B><I> map</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
declares that certain words are to be treated as alternative spellings
|
||||
of the same word for LM evaluation; see the same option for
|
||||
<A HREF="ngram.1.html">ngram(1)</A>.
|
||||
The
|
||||
<I> map </I>
|
||||
is filtered of unused words when used in conjunction with
|
||||
<B>-limit-vocab</B>,<B></B><B></B><B></B>
|
||||
and then passed on to
|
||||
<A HREF="ngram.1.html">ngram(1)</A>.
|
||||
<DT><B> -fast </B>
|
||||
<DD>
|
||||
performs rescoring using only functions built into
|
||||
<A HREF="ngram.1.html">ngram(1)</A>.
|
||||
This avoids some computational and I/O overhead and therefore runs faster,
|
||||
but the options
|
||||
<B>-filter</B>,<B></B><B></B><B></B>
|
||||
<B>-pretty</B>,<B></B><B></B><B></B>
|
||||
and
|
||||
<B> -lm-only </B>
|
||||
are not supported, and
|
||||
<B> -nodecipherlm </B>
|
||||
is obligatory.
|
||||
</DD>
|
||||
</DL>
|
||||
<P>
|
||||
<B> rescore-acoustic </B>
|
||||
replaces the acoustic scores in a set of N-best lists by a weighted
|
||||
combination of new scores.
|
||||
The old N-best lists are given by either a directory
|
||||
<I> old-score-dir </I>
|
||||
or a filelist
|
||||
<I>old-file-list</I>;<I></I><I></I><I></I>
|
||||
<I> old-ac-weight </I>
|
||||
is the weight given to the old scores.
|
||||
Directories containing the new scores are listed alternating with the
|
||||
corresponding weights; each score directory must contain one
|
||||
file per waveform segment, each having the same file basenames as
|
||||
the original N-best lists.
|
||||
The new scores should appear in a single column per file, one per line.
|
||||
The N-best lists containing the new combined acoustic scores are written to
|
||||
<I>new-nbest-dir</I>.<I></I><I></I><I></I>
|
||||
The optional
|
||||
<I> max-nbest </I>
|
||||
argument can be used to limit the length of the N-best lists output.
|
||||
Also, When a new score file is encountered containing fewer than
|
||||
<I> max-nbest </I>
|
||||
lines, the missing scores are set to the lowest score encountered so far.
|
||||
<P>
|
||||
<B> rescore-reweight </B>
|
||||
combines the scores in N-best lists with a set of weights and outputs
|
||||
the 1-best hypotheses.
|
||||
The N-best files are found in directory
|
||||
<I> score-dir </I>
|
||||
or listed in
|
||||
<I>file-list</I>.<I></I><I></I><I></I>
|
||||
Optional arguments set the language model weight
|
||||
<I> lmw </I>
|
||||
(default 8),
|
||||
the word transition weight
|
||||
<I> wtw </I>
|
||||
(default 0),
|
||||
and the maximum number
|
||||
<I> max-nbest </I>
|
||||
of hypotheses to consider (default all).
|
||||
Optionally, any number of additional score directories and associated
|
||||
weights
|
||||
<I> score-dir1 score-weight1 score-dir2 score-weight2 </I>
|
||||
... can be specified, following the
|
||||
<I> wtw </I>
|
||||
parameter.
|
||||
These additional scores are combined with those contained in the
|
||||
N-best lists themselves as in
|
||||
<B> rescore-acoustic </B>
|
||||
(using unit weight for the original acoustic scores).
|
||||
The
|
||||
<B> -multiwords </B>
|
||||
and
|
||||
<B> -multi-char </B>
|
||||
options have the same function as for
|
||||
<B>rescore-decipher</B>.<B></B><B></B><B></B>
|
||||
The output format for 1-best hypotheses is
|
||||
<PRE>
|
||||
<I>sentid</I> <I>w1</I> <I>w2</I> ...
|
||||
</PRE>
|
||||
where
|
||||
<I> sentid </I>
|
||||
is the sentence ID derived from the N-best filename, followed by
|
||||
the words.
|
||||
<P>
|
||||
<B> rescore-minimize-wer </B>
|
||||
is similar to
|
||||
<B> rescore-reweight </B>
|
||||
but picks hypotheses using the word error minimization algorithm
|
||||
of
|
||||
<A HREF="nbest-lattice.1.html">nbest-lattice(1)</A>.
|
||||
<P>
|
||||
<B> nbest2-to-nbest1 </B>
|
||||
converts an N-best list in ``NBestList2.0'' format to ``NBestlist1.0'',
|
||||
for the benefit of programs that have not yet been updated to deal with
|
||||
the new format.
|
||||
<P>
|
||||
<B> nbest-rover </B>
|
||||
combines hypotheses from multiple N-best lists at the word level,
|
||||
by performing the same kind of word error minimization as
|
||||
<A HREF="nbest-lattice.1.html">nbest-lattice(1)</A>,
|
||||
in a generalization of the ROVER algorithm.
|
||||
<I> sentid-list </I>
|
||||
is a file listing sentence IDs.
|
||||
These must match the filenames in a set of N-best directories,
|
||||
which are specified in a
|
||||
<I>control-file</I>.<I></I><I></I><I></I>
|
||||
The format for the latter is
|
||||
<PRE>
|
||||
<I>dir1</I> <I>lmw1</I> <I>wtw1</I> <I>w1</I> [<I>n1</I> [<I>s1</I>]]
|
||||
<I>dir2</I> <I>lmw2</I> <I>wtw2</I> <I>w2</I> [<I>n2</I> [<I>s2</I>]]
|
||||
...
|
||||
</PRE>
|
||||
Each line specifies an N-best directory, the language model and word transition
|
||||
weights to be used in score combination, and a weight to be applied to the
|
||||
posterior probabilities.
|
||||
A weight of "=" denotes a value equal to the previous system and is used to encode
|
||||
weight tying.
|
||||
An optional next-to-last parameter for each N-best list allows the lists to be
|
||||
truncated to the top <I>n1</I>, <I>n2</I>, etc., hypotheses.
|
||||
The final optional parameter sets the posterior distribution scaling factor,
|
||||
which defaults to the language model weight.
|
||||
Optionally,
|
||||
<I> control-file </I>
|
||||
can also contain lines of the form
|
||||
</PRE>
|
||||
<I>dir</I> <I>w</I> <B>+</B>
|
||||
</PRE>
|
||||
These indicate that additional score files can be found in directory
|
||||
<I> dir </I>
|
||||
and that the scores found therein should be added to the following
|
||||
N-best list set with weight
|
||||
<I>w</I>.<I></I><I></I><I></I>
|
||||
Several lines of this form may occur preceding a regular N-best
|
||||
directory specification; the corresponding additive combination of multiple
|
||||
scores is performed.
|
||||
<BR>
|
||||
If ``-'' is specified for
|
||||
<I>sentid-list</I>,<I></I><I></I><I></I>
|
||||
the sentence IDs are inferred from
|
||||
the contents of the first directory <I>dir1</I> specified in
|
||||
<I>control-file</I>.<I></I><I></I><I></I>
|
||||
If
|
||||
<I> posterior-file </I>
|
||||
is specified on the command line, posterior word probability estimates are
|
||||
written to that file.
|
||||
<P>
|
||||
Additional arguments are treated as options, in particular
|
||||
<DL>
|
||||
<DT><B> -missing-nbest </B>
|
||||
<DD>
|
||||
indicates that empty hypotheses are to be used for N-best lists that are missing
|
||||
from the directory specified in the control file.
|
||||
</DD>
|
||||
</DL>
|
||||
<P>
|
||||
Any other additional arguments are passed to the underlying
|
||||
<A HREF="nbest-lattice.1.html">nbest-lattice(1)</A>
|
||||
invocation.
|
||||
<BR>
|
||||
<B> nbest-rover </B>
|
||||
can process N-best lists in any of the formats described in
|
||||
<A HREF="nbest-format.5.html">nbest-format(5)</A>,
|
||||
<I>as long as all N-best lists for a given utterance are in the same format</I>.
|
||||
When Decipher formats are used only their acoustic scores are used.
|
||||
<P>
|
||||
<B> combine-rover-controls </B>
|
||||
takes one or more
|
||||
<B> nbest-rover </B>
|
||||
control files as arguments and outputs a new control file that specifies
|
||||
the combination of the input files.
|
||||
Directory names in the input files are adjusted to reflect the relative
|
||||
location of the input files,
|
||||
unless the
|
||||
<B> keeppaths=1 </B>
|
||||
option is used.
|
||||
Each input system is given equal weight,
|
||||
unless the optional
|
||||
<B>lambda=</B><I>weights</I><B></B><I></I><B></B><I></I><B></B>
|
||||
argument is used to specify a space-separated list of system weights
|
||||
(spaces in the weight vector need to be quoted on the command line).
|
||||
The
|
||||
<B>postscale=</B><I>S</I><B></B><I></I><B></B><I></I><B></B>
|
||||
argument overrides the posterior scaling factor in all input systems with the value
|
||||
<I>S</I>.<I></I><I></I><I></I>
|
||||
<P>
|
||||
<B> rover-control-weights </B>
|
||||
either retrieves or changes the weights in an nbest-rover control file.
|
||||
If the
|
||||
<B> weights= </B>
|
||||
argument is specified, the weights in the input control file are altered to the
|
||||
values in the argument and a new control file is written to stdout.
|
||||
Otherwise, the list of current weights is output as a single line.
|
||||
<P>
|
||||
<B> compute-best-rover-mix </B>
|
||||
estimates the best weighting of a set of nbest system outputs for
|
||||
combination with
|
||||
<B>nbest-rover</B>.<B></B><B></B><B></B>
|
||||
The required input file
|
||||
<I> reference-posteriors-output </I>
|
||||
is produced by running
|
||||
<B> nbest-rover </B>
|
||||
to record the posteriors of the reference word strings on a tuning set:
|
||||
<BR>
|
||||
<B>nbest-rover -</B> <I>control-file</I> /dev/null \
|
||||
<BR>
|
||||
<B>-refs</B> <I>references</I> \
|
||||
<BR>
|
||||
\fB-write-ref-posteriors\f <I>reference-posteriors-output</I>
|
||||
<BR>
|
||||
Initial weights are specified with
|
||||
<B>lambda=</B><I>weights.</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<BR>
|
||||
An additive constant for Laplace smoothing can be specified with
|
||||
<B>addone=</B><I>c.</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<BR>
|
||||
The
|
||||
<B> tying= </B>
|
||||
argument allows the system weights to be tied.
|
||||
It should specify a string of positive integers (the bin numbers) with one value
|
||||
for each system weight.
|
||||
For example
|
||||
<B> tying='1 1 2 3 3' </B>
|
||||
means that the first two and the last two of five weights are to be tied (put in the same bin).
|
||||
<BR>
|
||||
The estimated weight vector can optionally be written to a file using
|
||||
<B>write_weights=</B><I>file.</I><B></B><I></I><B></B><I></I><B></B>
|
||||
The weights can then be inserted into the original
|
||||
<I>control-file</I>,<I></I><I></I><I></I>
|
||||
e.g., using
|
||||
<B>rover-control-weights</B>.<B></B><B></B><B></B>
|
||||
<P>
|
||||
<B> rover-control-tying </B>
|
||||
extracts the value for the
|
||||
<B> compute-best-rover-mix </B>
|
||||
<B> tying= </B>
|
||||
argument from an existing nbest-rover control file.
|
||||
<P>
|
||||
<B> nbest-optimize-args-from-rover-control </B>
|
||||
extracts information from existing nbest-rover control files that can be passed as
|
||||
arguments to
|
||||
<A HREF="nbest-optimize.1.html">nbest-optimize(1)</A>
|
||||
for initializing the search.
|
||||
Options allow printing only the score weights,
|
||||
or only the list of additional scores directories.
|
||||
<P>
|
||||
<B> search-rover-combo </B>
|
||||
searches for a good subset of systems to combine via
|
||||
<B>nbest-rover</B>.<B></B><B></B><B></B>
|
||||
It performs a greedy search starting with the system that gives the lowest individual error,
|
||||
and then adds one system at a time until no further error reduction is possible.
|
||||
The required argument <I>list-of-control-files</I> is a file listing the nbest-rover control files
|
||||
representating the individual systems to be combined.
|
||||
An nbest-rover control file is written to stdout representing the combined system.
|
||||
Options are:
|
||||
<DL>
|
||||
<DT><B>-scorer</B><I> script</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
Specifies a script that evaluates a hypothesis file.
|
||||
The script must take a single argument that is a hypothesis file in sentid format and output a single number
|
||||
(the error rate) to stdout.
|
||||
For example, the script could be based on parsing
|
||||
<B> compute-sclite </B>
|
||||
output, but must know where to find the reference file etc.
|
||||
<DT><B>-weights</B><I> list-of-weights</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
Specifies the list of system weights to try when adding a system.
|
||||
By default this is just 1, but can be a space-separated list of weights, such as "1 0.5 0.2 0.1".
|
||||
<DT><B>-sentids</B><I> list</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
A list of sentence IDs to perform the evaluation on (as in the first argument to
|
||||
<B>nbest-rover</B>),<B></B><B></B><B></B>
|
||||
<DT><B>-datadir</B><I> dir</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
Where to place auxiliary data files.
|
||||
By default this is
|
||||
<B> SEARCH-DATA </B>
|
||||
in the current directory.
|
||||
<DT><B>-refs</B><I> refs</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
Triggers system weight optimization using
|
||||
<B> compute-best-rover-mix </B>
|
||||
for each system combination before evaluating
|
||||
its error rate.
|
||||
The file
|
||||
<I> refs </I>
|
||||
should point to a reference file in sentid format.
|
||||
Note that these references are not used to evaluate the error rate of a system
|
||||
(which is done within the scorer script, see above) but only to be
|
||||
passed to
|
||||
<B>compute-best-rover-mix</B>.<B></B><B></B><B></B>
|
||||
This option overrides the
|
||||
<B> -weights </B>
|
||||
option since system weights are estimated.
|
||||
<DT><B>-smooth-weight</B><I> S</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
Enables hierarchical weight smoothing.
|
||||
Each weight estimate is interpolated with the previous estimate (with one fewer systems);
|
||||
the previous weight vector gets weight
|
||||
<I>S</I>.<I></I><I></I><I></I>
|
||||
<DT><B>-J</B><I> n</I><B></B><I></I><B></B><I></I><B></B>
|
||||
<DD>
|
||||
Parallelize the evalation of system combinations with up to
|
||||
<I> n </I>
|
||||
parallel jobs.
|
||||
This uses the included parallelization script
|
||||
<B>rexport.gnumake</B>,<B></B><B></B><B></B>
|
||||
but the environment variable
|
||||
<B> REXPORT </B>
|
||||
may be set to a command that takes a list of command lines as argument and executes them in an appropriate manner.
|
||||
</DD>
|
||||
</DL>
|
||||
<P>
|
||||
<B> nbest-posteriors </B>
|
||||
rescales the scores in an N-best list to reflect (weighted) posterior
|
||||
probabilities.
|
||||
The output is the same N-best list with acoustic scores set to
|
||||
the log (base 10) of the posterior hyp probabilities and LM scores set to zero.
|
||||
<B>postscale=</B><I>S</I><B></B><I></I><B></B><I></I><B></B>
|
||||
attenuates the posterior distribution by dividing combined log
|
||||
scores by
|
||||
<I> S </I>
|
||||
(the default is
|
||||
<I>S</I>=<I>lmw</I>).<I></I><I></I>
|
||||
If
|
||||
<B>weight=</B><I>W</I><B></B><I></I><B></B><I></I><B></B>
|
||||
is specified the posteriors are multiplied by
|
||||
<I>W</I>.<I></I><I></I><I></I>
|
||||
<B>max_nbest=</B><I>M</I><B></B><I></I><B></B><I></I><B></B>
|
||||
limits the number of hypotheses used to the top
|
||||
<I>M</I>.<I></I><I></I><I></I>
|
||||
This script is used mostly as a helper in
|
||||
<B>nbest-rover</B>.<B></B><B></B><B></B>
|
||||
<P>
|
||||
<B> merge-nbest </B>
|
||||
merges hypotheses from one or more N-best lists into a single list,
|
||||
collapsing hypotheses that occur in more than one input list.
|
||||
If all input lists use the same
|
||||
<A HREF="nbest-format.5.html">nbest-format(5)</A>
|
||||
then the output will also be in that format and contain the information
|
||||
from the first list in which a hypothesis was encountered.
|
||||
Otherwise, the output will be in SRI Decipher(TM) NBestList1.0 format
|
||||
and contain acoustic scores and word strings only.
|
||||
The
|
||||
<B>max_nbest=</B><I>M</I><B></B><I></I><B></B><I></I><B></B>
|
||||
option limits input to the first
|
||||
<I> M </I>
|
||||
hypotheses from each input list.
|
||||
<B> multiwords=1 </B>
|
||||
merges hypotheses that are identical after resolving multiwords, with
|
||||
<B>multichar=</B><I>C</I><B></B><I></I><B></B><I></I><B></B>
|
||||
defining a non-default multiword separator character.
|
||||
<B> nopauses=1 </B>
|
||||
merges hypotheses that are identical after removal of pause words.
|
||||
<P>
|
||||
<B> nbest-vocab </B>
|
||||
outputs the vocabulary used in a set of N-best lists.
|
||||
(The N-best files cannot be compressed, but may be concatenated and
|
||||
supplied via stdin.)
|
||||
<P>
|
||||
<B> nbest-words </B>
|
||||
strips any score and alignment information from N-best lists and outputs
|
||||
only the words, one hypothesis per line.
|
||||
<P>
|
||||
<B> nbest-oov-counts </B>
|
||||
computes the number of out-of-vocabulary words for each hypothesis in an N-best list,
|
||||
relative to a vocabulary listed in
|
||||
<I>vocabfile</I>.<I></I><I></I><I></I>
|
||||
Optionally a vocabulary mapping from
|
||||
<I> aliasfile </I>
|
||||
is applied (as with the
|
||||
<A HREF="ngram.1.html">ngram(1)</A>
|
||||
<B> -vocab-aliases </B>
|
||||
option).
|
||||
The OOV counts are output on stdout and can be used as a score file for
|
||||
N-best rescoring.
|
||||
<P>
|
||||
<B> nbest-error </B>
|
||||
computes the overall oracle word error rate of a set of N-best lists
|
||||
in directory
|
||||
<I> score-dir </I>
|
||||
or listed in
|
||||
<I>file-list</I>.<I></I><I></I><I></I>
|
||||
The reference answers are given in
|
||||
<I> refs </I>
|
||||
in the format output by
|
||||
<B> rescore-reweight </B>
|
||||
(see above).
|
||||
Additional arguments are passed to the underlying invocation of
|
||||
<A HREF="nbest-lattice.1.html">nbest-lattice(1)</A>,
|
||||
and can be used to limit the depth of the N-best list,
|
||||
compute lattice error rather than N-best error, etc.
|
||||
<P>
|
||||
<B> sentid-to-sclite </B>
|
||||
converts 1-best hypotheses and references in the format used here to
|
||||
the ``trn'' format expected by the NIST
|
||||
<A HREF="sclite.1.html">sclite(1)</A>
|
||||
scoring software.
|
||||
<P>
|
||||
<B> sentid-to-ctm </B>
|
||||
converts 1-best hypotheses and references in the format used here to NIST
|
||||
<A HREF="ctm.5.html">ctm(5)</A>
|
||||
format.
|
||||
The script relies on an encoding of conversation IDs, channel, and utterance
|
||||
time marks in the sentence IDs and may need adjustment to local conventions.
|
||||
<P>
|
||||
<B> fix-ctm </B>
|
||||
converts output produced by the
|
||||
<B> -output-ctm </B>
|
||||
option of
|
||||
<A HREF="nbest-lattice.1.html">nbest-lattice(1)</A>
|
||||
and
|
||||
<A HREF="lattice-tool.1.html">lattice-tool(1)</A>
|
||||
to a format suitable for scoring with NIST
|
||||
<A HREF="sclite.1.html">sclite(1)</A>.
|
||||
It, too, relies on information encoded in the sentids IDs and may need
|
||||
adjustments.
|
||||
<P>
|
||||
<B> compute-sclite </B>
|
||||
is a wrapper around
|
||||
the NIST
|
||||
<A HREF="sclite.1.html">sclite(1)</A>
|
||||
scoring tool.
|
||||
<I> refs </I>
|
||||
and
|
||||
<I> hyps </I>
|
||||
are the reference and hypothesized transcripts, respectively.
|
||||
The
|
||||
<I> refs </I>
|
||||
file can be either in "sentid" format or in
|
||||
<A HREF="stm.5.html">stm(5)</A>
|
||||
format. In the latter case,
|
||||
<I> hyps </I>
|
||||
will be converted to
|
||||
<A HREF="ctm.5.html">ctm(5)</A>
|
||||
format using the
|
||||
<B> sentid-to-ctm </B>
|
||||
helper script.
|
||||
The
|
||||
<I> hyps </I>
|
||||
file can be either in "sentid" format or in
|
||||
<A HREF="ctm.5.html">ctm(5)</A>
|
||||
format.
|
||||
More than one
|
||||
<B> -h </B>
|
||||
option can be given to combine the contents of multiple hypotheses files.
|
||||
Optionally,
|
||||
<B> -S </B>
|
||||
specifies a
|
||||
sorted list of sentence IDs
|
||||
<I> subset </I>
|
||||
to score.
|
||||
Multiple
|
||||
<B> -S </B>
|
||||
options may be given, to form the intersection of several subsets.
|
||||
<B> -multiwords </B>
|
||||
or
|
||||
<B> -M </B>
|
||||
splits ``multiwords'' joined by underscores into their component words
|
||||
prior to scoring.
|
||||
<B> -noperiods </B>
|
||||
deletes periods from the hypotheses prior to scoring
|
||||
(typically used to bridge different conventions for spelled letters).
|
||||
<B> -R </B>
|
||||
preserves reject words in the hypotheses for scoring (as appropriate if
|
||||
references also contain rejects).
|
||||
<B> -g </B>
|
||||
<I> glmfile </I>
|
||||
enables filtering of references and hypotheses by the NIST
|
||||
<B> csrfilt.sh </B>
|
||||
script, controlled by the filter file
|
||||
<I> glmfile </I>
|
||||
(this is only possible with an stm reference file).
|
||||
In that case, the
|
||||
<B> -H </B>
|
||||
option causes hesitations (as defined by the filter)
|
||||
to be deleted from the output for scoring purposes.
|
||||
<B> -v </B>
|
||||
displays the complete command used to invoke
|
||||
<B>sclite</B>.<B></B><B></B><B></B>
|
||||
Any additional options are passed to
|
||||
<B>sclite</B>,<B></B><B></B><B></B>
|
||||
e.g., to control its output actions or alignment mode.
|
||||
<P>
|
||||
<B> compute-sclite-nbest </B>
|
||||
runs
|
||||
<B> compute-sclite </B>
|
||||
on a set of N-best lists specified by
|
||||
<I> file-list </I>
|
||||
and deposits the error counts in a directory
|
||||
<I>output-dir</I>.<I></I><I></I><I></I>
|
||||
These error counts may be used with the
|
||||
<A HREF="nbest-optimize.1.html">nbest-optimize(1)</A>
|
||||
<B> -errors </B>
|
||||
option to specify the hypothesis-level errors explicitly.
|
||||
The references must be given in a file
|
||||
<I> refs </I>
|
||||
one per line, with the first word in each line matching
|
||||
the file basename of the corresponding N-best list.
|
||||
Additional options to be passed to
|
||||
<B> compute-sclite </B>
|
||||
(and ultimately to
|
||||
<A HREF="sclite.1.html">sclite(1)</A>)
|
||||
may be specified at the end of the command line.
|
||||
The
|
||||
<B> -filter </B>
|
||||
option specifies a filtering
|
||||
<I> script </I>
|
||||
that edits the hypotheses before error computation.
|
||||
<P>
|
||||
<B> compare-sclite </B>
|
||||
scores two sets of hypotheses
|
||||
<I> hyps1 </I>
|
||||
and
|
||||
<I> hyps2 </I>
|
||||
for the same test set and computes in
|
||||
how many cases the first or second set had lower word error.
|
||||
The remaining options are as for
|
||||
<B>compute-sclite</B>.<B></B><B></B><B></B>
|
||||
The script ignores hypotheses for sentence that do not appear in both
|
||||
hypothesis files, to ensure comparable scoring results.
|
||||
<H2> SEE ALSO </H2>
|
||||
<A HREF="nbest-format.5.html">nbest-format(5)</A>, <A HREF="ngram.1.html">ngram(1)</A>, <A HREF="nbest-lattice.1.html">nbest-lattice(1)</A>, <A HREF="nbest-optimize.1.html">nbest-optimize(1)</A>, <A HREF="sclite.1.html">sclite(1)</A>,
|
||||
<A HREF="stm.5.html">stm(5)</A>, <A HREF="ctm.5.html">ctm(5)</A>.
|
||||
<BR>
|
||||
J.G. Fiscus, A Post-Processing System to Yield Reduced Word Error Rates:
|
||||
Recognizer Output Voting Error Reduction (ROVER),
|
||||
<I>Proc. IEEE Automatic Speech Recognition and Understanding Workshop</I>,
|
||||
Santa Barbara, CA, 347-352, 1997.
|
||||
<BR>
|
||||
A. Stolcke et al., "The SRI March 2000 Hub-5 Conversational Speech
|
||||
Transcription System",
|
||||
<I>Proc. NIST Speech Transcription Workshop</I>, College Park, MD, 2000.
|
||||
<H2> BUGS </H2>
|
||||
<B> sentid-to-sclite </B>
|
||||
has some assumptions about the structure of sentence IDs built-in and
|
||||
may need to be modified for
|
||||
<B> compute-sclite </B>
|
||||
and
|
||||
<B> compare-sclite </B>
|
||||
to work.
|
||||
<P>
|
||||
<B> rescore-decipher </B>
|
||||
<B> -pretty </B>
|
||||
may not work correctly with the
|
||||
<B> -limit-vocab </B>
|
||||
option if the word mapping adds to the vocabulary subset used in the N-best
|
||||
lists.
|
||||
<H2> AUTHOR </H2>
|
||||
Andreas Stolcke <stolcke@icsi.berkeley.edu>
|
||||
<BR>
|
||||
Copyright (c) 1995-2006 SRI International
|
||||
<BR>
|
||||
Copyright (c) 2011-2019 Andreas Stolcke
|
||||
<BR>
|
||||
Copyright (c) 2011-2018 Microsoft Corp.
|
||||
</BODY>
|
||||
</HTML>
|
||||
Reference in New Issue
Block a user