.\" $Id: nbest-lattice.1,v 1.49 2019/09/09 22:35:36 stolcke Exp $
.TH nbest-lattice 1 "$Date: 2019/09/09 22:35:36 $" "SRILM Tools"
.SH NAME
nbest-lattice \- rescore N-best lists and lattices
.SH SYNOPSIS
.nf
\fBnbest-lattice\fP [ \fB\-help\fP ] \fIoption\fP ...
.fi
.SH DESCRIPTION
.B nbest-lattice
rescores N-best lists or optimizes word-level recognition scores
(as opposed to sentence-level scores).
There are two rescoring modes.
In
.I "N-best word error minimization"
mode, the program computes the posterior expected word error for each
hypothesis relative to all hypotheses in the N-best list, choosing the one
with the lowest value.
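.PP
A minimal Python sketch of this selection rule follows.
It is an illustration, not the program's actual implementation:
the function names and the toy posteriors are hypothetical, and the
posteriors are assumed to be normalized already.
.nf
```python
def word_errors(hyp, ref):
    # Word-level Levenshtein distance (substitutions, insertions, deletions).
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        cur = [i]
        for j, r in enumerate(ref, start=1):
            cur.append(min(prev[j] + 1,              # extra word in hyp
                           cur[j - 1] + 1,           # missing word in hyp
                           prev[j - 1] + (h != r)))  # match or substitution
        prev = cur
    return prev[-1]

def min_expected_error(nbest):
    # nbest: list of (posterior, list of words); hypothetical layout.
    best = None
    for _, hyp in nbest:
        expected = sum(p * word_errors(hyp, other) for p, other in nbest)
        if best is None or expected < best[0]:
            best = (expected, hyp)
    return best

nbest = [(0.4, "a b c".split()),
         (0.35, "a b d".split()),
         (0.25, "a x d".split())]
expected, words = min_expected_error(nbest)
```
.fi
In this toy list the second hypothesis is chosen (expected error 0.65),
even though the first one has the highest posterior.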
.PP
In
.I "lattice word error minimization"
mode, the program constructs a word lattice from all the N-best hypotheses
and extracts the path with the lowest expected word error.
This is similar to N-best word error minimization but allows
hypotheses not contained in the N-best list.
A variant of this mode uses a word ``mesh'' instead of a word lattice,
in which all hypotheses are aligned into a grid of word positions,
and one is allowed to choose a word from each grid position, thus allowing an
even greater number of potential hypotheses.
.SH OPTIONS
.PP
Each filename argument can be an ASCII file, or a
compressed file (name ending in .Z or .gz), or ``-'' to indicate
stdin/stdout.
.TP
.B \-help
Print option summary.
.TP
.B \-version
Print version information.
.TP
.BI \-debug " level"
Controls the amount of output (the higher the
.IR level ,
the more).
At level 1, the expected word error counts for the chosen hypotheses
are printed.
At level 2, the word posterior probabilities are printed in addition
(only for lattice mode, similar to
.BR \-dump-posteriors ).
.TP
.B \-wer
Chooses N-best word error minimization mode.
.TP
.B \-lattice\-wer
Chooses lattice word error minimization mode (the default).
.TP
.B \-use-mesh
Choose the variant of lattice mode that uses word meshes
instead of simple lattices.
.TP
.BI \-deletion-bias " D"
Causes the probabilities of deletions to be biased by a factor
.I D
in doing mesh-based word error minimization.
This controls the trade-off between insertion and deletion errors.
The default is 1 (no bias).
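.IP
The effect of the bias can be sketched in Python as follows.
This is only an illustration under assumptions: the bias is taken to
multiply the posterior of the deletion entry, and the column layout,
the function name and the deletion marker used here are hypothetical.
.nf
```python
def pick_word(column, deletion_bias=1.0, null_word="*DELETE*"):
    # column: dict mapping word -> posterior; null_word marks a deletion.
    def biased(item):
        word, post = item
        return post * deletion_bias if word == null_word else post
    return max(column.items(), key=biased)[0]

column = {"the": 0.45, "*DELETE*": 0.40, "a": 0.15}
unbiased = pick_word(column)
biased = pick_word(column, deletion_bias=1.5)
```
.fi
With no bias the word ``the'' wins; with a bias of 1.5 the deletion entry
scores 0.60 and is chosen instead, trading an insertion risk for a
deletion risk.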
.TP
.B \-random-tie-break
Break ties between words with equal probability pseudo-randomly
when doing mesh-based word error minimization.
The default is to decide for the word with the lowest internal
index (which reflects the order in the vocab file, or in
which they are encountered in the input data).
.TP
.B \-no-tie-break
Disable all explicit tie breaking, for backward compatibility.
.TP
.BI \-rescore " file"
Reads the N-best list from
.IR file .
The N-best list can be in any of the formats described in
.BR nbest-format (5).
.TP
.BI \-nbest " file"
A synonym for
.BR \-rescore .
.TP
.BI \-write-nbest " file"
Outputs the N-best list to a file, after sorting and processing
(for validation or format conversion purposes).
.TP
.BI \-nbest-files " file-list"
Rescores multiple N-best lists whose filenames are read from
.IR file-list .
.TP
.BI \-write-nbest-dir " directory"
Outputs N-best lists to
.IR directory ,
to files named after the input N-best lists,
for when multiple N-best lists are processed (see
.BR \-nbest-files ).
.TP
.BI \-write-vocab " file"
Outputs vocabulary used in N-best list.
.TP
.B \-decipher-nbest
Output N-best list in Decipher
.BR nbest-format (5),
rather than the default native SRILM format.
(All N-best formats are accepted for input regardless of this option.)
.TP
.B \-no-rescore
Suppress rescoring of lattices;
useful if only the operations of lattice/N-best list reading/writing
are desired.
.TP
.BI \-max-nbest " n"
Limits the number of hypotheses read from each N-best list to the first
.IR n .
.TP
.BI \-max-rescore " m"
In N-best mode, only choose among the top
.I m
hypotheses when optimizing word error.
This is convenient to limit computation for long N-best lists.
The cutoff is made after reading all hypotheses (subject to
.BR \-max-nbest )
and reordering them according to the posterior probabilities.
.br
The worst-case time taken in N-best error minimization is proportional to
.I m
times
.IR n ,
where
.I n
is the length of the N-best list (or the value given to
.BR \-max-nbest ).
However, in practice the average time per sentence is independent of
.IR m ,
so this option is usually not necessary.
.br
In lattice mode, only align the top
.I m
scoring hypotheses (after reweighting and sorting) into the lattice.
.TP
.BI \-posterior-prune " threshold"
Don't process N-best hypotheses whose cumulative posterior probability
is below
.IR threshold .
This is another strategy to speed up the algorithm.
.TP
.B \-no-reorder
Process N-best hypotheses in the order in which they appear.
By default, hypotheses are first sorted by their aggregate scores.
.TP
.B \-nbest-backtrace
Preserve backtrace information (word-level timemarks and scores) when reading
N-best lists containing such information (see
.BR nbest-format (5)).
The default is to ignore backtrace information and record only sentence-level
scores and the word identities.
.TP
.B \-output-ctm
Output word hypotheses in NIST CTM (conversation time mark) format.
Note that word start times will be relative to the segment start times,
the first column will contain the N-best filename, and the channel field
is always 1.
The word confidence field contains posterior probabilities.
This option also implies
.BR \-nbest-backtrace .
.TP
.BI \-rescore-lmw " lmw"
Sets the language model weight used in combining the language model log
probabilities with acoustic log probabilities
(only relevant if separate scores are given in the N-best input).
.TP
.BI \-rescore-wtw " wtw"
Sets the word transition weight used to weight the number of words relative to
the acoustic log probabilities
(only relevant if separate scores are given in the N-best input).
.br
If
.B \-no-reorder
is not specified, and either
.I lmw
or
.I wtw
is specified to be non-zero, the aggregate scores are recomputed using
those weights; otherwise aggregate scores supplied in the input N-best lists
are used to sort hypotheses.
.TP
.BI \-posterior-scale " scale"
Divide the total weighted log score by
.I scale
when computing normalized posterior probabilities.
This controls the peakedness of the posterior distribution.
The default value is whatever was chosen for
.BR \-rescore-lmw ,
so that language model scores are scaled to have weight 1,
and acoustic scores have weight 1/\fIlmw\fP.
.TP
.BI \-posterior-amw " amw"
Sets the acoustic model weight for computing posteriors;
the default is 1.
This and the next two options allow posteriors to be computed using a
different weighting than that used in ranking and reordering the
hypotheses.
.TP
.BI \-posterior-lmw " lmw"
Sets the language model weight for computing posteriors.
The default is to use whatever was specified for
.BR \-rescore-lmw .
.TP
.BI \-posterior-wtw " wtw"
Sets the word transition weight for computing posteriors.
The default is to use whatever was specified for
.BR \-rescore-wtw .
.br
If all three of
.IR amw ,
.IR lmw ,
and
.I wtw
are set to zero, the posteriors are computed directly from the
aggregate scores stored in the N-best input.
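.IP
How the weights and the scale interact can be sketched in Python.
This is a hedged illustration, not the program's code: the function name
and tuple layout are hypothetical, a single
.I lmw
value stands in for both the rescoring and the posterior LM weight, and
scores are treated as natural-log values, which may not match the base
used by a given N-best format.
.nf
```python
import math

def posteriors(hyps, lmw, wtw=0.0, amw=1.0, scale=None):
    # hyps: list of (acoustic_logprob, lm_logprob, num_words)
    if scale is None:
        scale = lmw      # assumed default: scale follows the LM weight
    totals = [(amw * a + lmw * l + wtw * n) / scale for a, l, n in hyps]
    top = max(totals)    # subtract the maximum for numerical stability
    weights = [math.exp(t - top) for t in totals]
    norm = sum(weights)
    return [w / norm for w in weights]

hyps = [(-100.0, -10.0, 3), (-104.0, -9.0, 3)]
sharp = posteriors(hyps, lmw=2.0)
flat = posteriors(hyps, lmw=2.0, scale=20.0)
```
.fi
A larger
.I scale
flattens the distribution: the top hypothesis receives a smaller share
of the probability mass in the second call than in the first.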
.TP
.BI \-vocab " file"
Read the N-best list vocabulary from
.IR file .
This option is mostly redundant since words found in the N-best input
are implicitly added to the vocabulary.
.TP
.BI \-vocab-aliases " file"
Reads vocabulary alias definitions from
.IR file ,
consisting of lines of the form
.nf
\fIalias\fP \fIword\fP
.fi
This causes all tokens
.I alias
to be mapped to
.IR word .
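.IP
The mapping can be sketched in Python as follows; the function names and
the sample aliases are hypothetical and serve only to illustrate the
file layout.
.nf
```python
def load_aliases(lines):
    # Each non-empty line holds two fields: alias word
    aliases = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            aliases[fields[0]] = fields[1]
    return aliases

def apply_aliases(tokens, aliases):
    # Tokens without an alias entry pass through unchanged.
    return [aliases.get(tok, tok) for tok in tokens]

aliases = load_aliases(["uh-huh uhhuh", "OK okay"])
mapped = apply_aliases("OK uh-huh yes".split(), aliases)
```
.fi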
.TP
.B \-tolower
Map vocabulary to lowercase, eliminating case distinctions.
.TP
.B \-multiwords
Split multiwords (words joined by '_') into their components when reading
N-best lists.
.TP
.BI \-multi-char " C"
Character used to delimit component words in multiwords
(an underscore character by default).
.TP
.BI \-noise " noise-tag"
Designate
.I noise-tag
as a vocabulary item that is to be ignored in aligning hypotheses with
each other (the same as the -pau- word).
This is typically used to identify a noise marker.
.TP
.BI \-noise-vocab " file"
Read several noise tags from
.IR file ,
instead of, or in addition to, the single noise tag specified by
.BR \-noise .
.TP
.B \-keep-noise
Do not remove pause or noise tokens from hypotheses. The default
is to preserve noise tags but still eliminate pauses.
.TP
.B \-nbest-error
Compute the N-best error (minimum word error) of the N-best list read with
.BR \-nbest .
Pause and noise tokens (as specified with
.BR \-noise )
in the N-best list are ignored.
.TP
.B \-dump-posteriors
Output posterior probabilities of all N-best hypotheses
instead of choosing the best hypothesis.
In N-best mode, only the posterior probability for each hypothesis is output.
In lattice mode, the hypothesis posterior is followed by word posterior probabilities
for each (non-pause, non-noise) token in the hypothesis.
The
.B \-max-rescore
option limits the number of hypotheses per N-best list processed.
.TP
.B \-dump-errors
Output word correctness indicators for all N-best hypotheses
instead of choosing the best hypothesis.
For each hypothesis, a line is output containing first the total number of
errors and the list of indicators of whether the corresponding word is
correct, substituted or inserted relative to the reference string.
The location of deleted words is also indicated by a corresponding marker.
The
.B \-max-rescore
option limits the number of hypotheses per N-best list processed.
.TP
.BI \-reference " w1 w2 ..."
Specifies a reference word string for
.BR \-dump-errors ,
.BR \-nbest-error ,
and
.B \-lattice-error
options.
Additionally, in
.B \-use-mesh
mode, the reference words are recorded in the word mesh and can be output
with
.BR \-write ,
indicating which word in each alignment position is the correct one.
.TP
.BI \-refs " references"
Read a table of reference transcripts from file
.IR references ,
for when multiple N-best lists are processed (see
.BR \-nbest-files ).
Each line in
.I references
must contain the sentence ID (the last component in the N-best filename
path, minus any suffixes) followed by zero or more reference words.
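.IP
A minimal Python sketch of the lookup follows.
The function names and sample paths are hypothetical, and ``minus any
suffixes'' is interpreted here as dropping everything from the first
period onward, which is an assumption.
.nf
```python
import os

def sentence_id(nbest_path):
    # Last path component with all dot-suffixes removed (assumed rule).
    return os.path.basename(nbest_path).split(".")[0]

def load_refs(lines):
    refs = {}
    for line in lines:
        fields = line.split()
        if fields:
            refs[fields[0]] = fields[1:]   # zero or more reference words
    return refs

refs = load_refs(["utt1 hello world", "utt2"])
ref = refs[sentence_id("nbest/utt1.score.gz")]
```
.fi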
.PP
The following options only affect lattice mode.
.TP
.BI \-read " file"
Reads an initial lattice from
.IR file ,
to be merged with additional paths constructed from the
N-best hypotheses.
.TP
.BI \-lattice-files " file"
Reads the names of one or more lattices from
.I file
and aligns those lattices with the main lattice being built.
Each line of
.I file
must contain a lattice filename, optionally followed by a weight.
.TP
.B \-dump-lattice-alignments
Causes
.B \-lattice-files
to write out the position alignments between the
.B \-read
input lattice and each of the lattices in
.IR file ,
as well as their alignment costs.
.TP
.BI \-write " file"
Writes the resulting word posterior lattice or mesh to
.IR file ,
in
.BR wlat-format (5).
.TP
.BI \-write-dir " directory"
Write the resulting N-best lattices to
.IR directory ,
in files named after the input N-best lists,
for when multiple N-best lists are processed (see
.BR \-nbest-files ).
.TP
.B \-prime-lattice
Start building the lattice with the best hypothesis obtained from
N-best error minimization. This produces slightly better alignments
and sometimes lower error rates. The default is to start with the
top-scoring hypothesis.
.TP
.B \-prime-with-1best
Similar to
.BR \-prime-lattice ,
but uses the top-ranked sentence hypothesis for priming.
(Experience shows that
.B "\-no-reorder \-prime-lattice"
gives best results.)
.TP
.B \-prime-with-refs
Similar to
.BR \-prime-lattice ,
but uses the reference words for priming.
.TP
.B \-no-merge
Build a lattice from the N-best hypotheses without merging edges
(string/lattice alignment). This creates a lattice with one disjoint path
per hypothesis, and is useful mainly for debugging purposes.
This option has no effect with
.B \-use-mesh
since word meshes can represent only one word type per
alignment position.
.TP
.B \-lattice-error
Compute the lattice error (minimum word error) of the lattice read with
.B \-read
or built with
.BR \-nbest .
.TP
.BI \-dictionary " file"
Use word pronunciations listed in
.I file
to construct word alignments when building word meshes.
This will use an alignment cost function that reflects the number of
inserted/deleted/substituted phones, rather than words.
The dictionary
.I file
should contain one pronunciation per line, each naming a word in the first
field, followed by a string of phone symbols.
.TP
.BI \-hidden-vocab " file"
Read a subvocabulary from
.I file
and constrain word meshes to only align those words that are either all
in or outside the subvocabulary.
This may be used to keep ``hidden event'' tags from aligning with
regular words.
.TP
.BI \-suppress-vocab " file"
Read a subvocabulary from
.I file
and disallow its words when decoding the best word string from a lattice or mesh.
If such a word has the highest posterior probability at a given position, the word with
next highest posterior is chosen instead, or a null word if no other word choice is
available.
This is useful when special tokens are included in the N-best inputs to
mediate alignments, but are not meant to be included in the output.
.TP
.BI \-time-penalty " p"
Apply soft time constraints during word alignment (in word mesh mode only).
In addition to the expected word error, a penalty term is added to the
cost function minimized during alignment.
The penalty term is only applied if word meshes or input N-best lists contain
backtrace information (see
.BR \-nbest-backtrace )
and is scaled by the factor
.I p
(which is zero by default).
The penalty term for
aligning two word hypotheses or word mesh columns is the absolute difference in
their times (the time of a word mesh column is the posterior-averaged
time of all its component words).
If two successive words or word columns have time stamps in the wrong temporal order,
their time difference is added to the penalty term.
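.IP
The basic penalty can be sketched in Python as follows.
The names and data layout are hypothetical, and the extra term for
out-of-order time stamps is not modeled here.
.nf
```python
def column_time(column):
    # column: list of (posterior, time); posterior-weighted average time.
    total = sum(p for p, _ in column)
    return sum(p * t for p, t in column) / total

def time_penalty(time_a, time_b, p=0.0):
    # Absolute time difference, scaled by the factor p (zero by default).
    return p * abs(time_a - time_b)

col = [(0.75, 1.0), (0.25, 2.0)]
t = column_time(col)
penalty = time_penalty(t, 1.75, p=2.0)
```
.fi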
.TP
.B \-average-times
When aligning two instances of the same word, average their times (if available)
instead of adopting the one with the time information associated with the
highest posterior probability.
.TP
.B \-record-hyps
Record the ranks of the hyps contributing to each word hypothesis in the
resulting word lattice;
the information is included in
.B \-write
output.
.SH "SEE ALSO"
ngram(1), nbest-optimize(1), nbest-scripts(1), nbest-format(5), wlat-format(5).
.br
A. Stolcke, Y. Konig, and M. Weintraub,
``Explicit Word Error Minimization in N-best List Rescoring,''
\fIProc. Eurospeech\fP, 163\-166, 1997.
.br
The ``word meshes'' used here are equivalent to the ``confusion networks''
described in:
L. Mangu, E. Brill, and A. Stolcke, ``Finding Consensus Among Words:
Lattice-based Word Error Minimization.'' \fIProc. Eurospeech\fP,
vol. 1, 495\-498, 1999.
.SH BUGS
Several functions are not uniformly implemented for all rescoring modes
(e.g.,
.BR \-lattice-files ,
.BR \-dictionary ,
.BR \-record-hyps ,
and
.B \-nbest-backtrace
are currently effective only in mesh-lattice mode).
.br
It is a common mistake (not a bug) to use the default LM weight with
N-best lists directly from Decipher.
Decipher N-best lists have the recognizer's LM weight already
built in, so they should be processed with
.nf
nbest-lattice \-rescore-lmw 1 \-posterior-scale \fILMW\fP
.fi
where
.I LMW
is the LM weight during recognition.
This is not an issue if the N-best lists have been rescored with
.BR rescore-decipher .
.SH AUTHOR
Andreas Stolcke <stolcke@icsi.berkeley.edu>
.br
Copyright (c) 1996\-2010 SRI International
.br
Copyright (c) 2011\-2019 Andreas Stolcke
.br
Copyright (c) 2011\-2019 Microsoft Corp.