competition update
37
language_model/srilm-1.7.3/man/cat5/classes-format.5
Normal file
@@ -0,0 +1,37 @@
classes-format(5) classes-format(5)

NAME
classes-format - File format for word class definitions

SYNOPSIS
class [p] word1 word2 ...

DESCRIPTION
Various programs dealing with word classes use this format to define
the possible expansions of classes and their respective probabilities.
Each expansion appears on a separate line as in the synopsis, where
class names a word class, p gives the probability for the class
expansion, and word1 word2 ... defines the word string that the class
expands to. If p is omitted it is assumed to be 1. (All expansion
probabilities for a given class should sum to one; the software does
not necessarily enforce this, and a violation leads to improper
models.)

Note that the concept of word class here is generalized to include
``multi-words'', or phrases consisting of more than one word. All
expansions must have at least one word. Certain models might impose
more restrictive formats.
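
The line format above can be read with a few lines of Python. This is a
minimal sketch, not SRILM's own reader: the function names are
hypothetical, and it assumes that the second field is the probability p
whenever it parses as a number and at least one word follows it.

```python
from collections import defaultdict

def parse_class_line(line):
    """Parse one `class [p] word1 word2 ...` line into (class, p, words)."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("an expansion needs a class name and at least one word")
    name, rest = fields[0], fields[1:]
    prob = 1.0  # an omitted p is assumed to be 1
    # Assumption: the second field is p if it parses as a number and
    # at least one word follows it.
    if len(rest) > 1:
        try:
            prob = float(rest[0])
            rest = rest[1:]
        except ValueError:
            pass
    return name, prob, rest

def check_normalization(lines, tol=1e-6):
    """Return the classes whose expansion probabilities do not sum to one."""
    totals = defaultdict(float)
    for line in lines:
        if line.strip():
            name, prob, _ = parse_class_line(line)
            totals[name] += prob
    return {c: t for c, t in totals.items() if abs(t - 1.0) > tol}

example = [
    "CITY 0.6 new york",   # a multi-word expansion
    "CITY 0.3 boston",
    "GREETING hello",      # p omitted, so it counts as 1
]
print(check_normalization(example))
```

In this example CITY is reported because its expansions sum to 0.9
rather than one, which is the improper case the text warns about.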

SEE ALSO
ngram(1), ngram-class(1), disambig(1), training-scripts(1),
pfsg-scripts(5).

AUTHOR
Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 1999 SRI International

SRILM File Formats $Date: 2007/12/19 22:08:05 $ classes-format(5)
64
language_model/srilm-1.7.3/man/cat5/nbest-format.5
Normal file
@@ -0,0 +1,64 @@
nbest-format(5) nbest-format(5)

NAME
nbest-format - File formats for N-best hypotheses lists

DESCRIPTION
SRILM currently understands three different formats for lists of N-best
hypotheses for rescoring or 1-best hypothesis extraction. The first
two formats originated in the SRI Decipher(TM) recognition system; the
third format is particular to SRILM.

The first format consists of the header
    NBestList1.0
followed by one or more lines of the form
    (score) w1 w2 w3 ...
where score is a composite acoustic/language model score from the
recognizer, on the bytelog scale. (A bytelog is a logarithm to base
1.0001, divided by 1024 and rounded to an integer.) This format is
output by the SRI Decipher(TM) recognizer, by ngram -nbest, and by
nbest-lattice -write-nbest -decipher-nbest.
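
The bytelog definition above translates directly into a conversion to
base-10 logarithms. The sketch below follows the wording of this page
literally; the function names are hypothetical, and the constants used
inside SRILM itself may differ slightly from this reading.

```python
import math

# One bytelog unit expressed in log10 units, from the definition above.
LOG10_PER_BYTELOG = 1024 * math.log10(1.0001)

def prob_to_bytelog(p):
    """Logarithm to base 1.0001, divided by 1024, rounded to an integer."""
    return round(math.log(p, 1.0001) / 1024)

def bytelog_to_log10(score):
    """Convert a bytelog score to an approximate base-10 logarithm."""
    return score * LOG10_PER_BYTELOG

def parse_nbest1(text):
    """Return a list of (bytelog_score, words) from NBestList1.0 text."""
    lines = [l for l in text.splitlines() if l.strip()]
    if not lines or lines[0].strip() != "NBestList1.0":
        raise ValueError("missing NBestList1.0 header")
    hyps = []
    for line in lines[1:]:
        fields = line.split()
        # The score is the first field, wrapped in parentheses.
        hyps.append((float(fields[0].strip("()")), fields[1:]))
    return hyps

sample = """NBestList1.0
(-2500) the cat sat
(-2530) the cat sad
"""
for score, words in parse_nbest1(sample):
    print(" ".join(words), round(bytelog_to_log10(score), 2))
```

Because of the rounding step, a round trip through the bytelog scale
loses at most half a unit, or about 0.022 in log10 terms.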

The second Decipher(TM) format is an extension of the first format that
encodes word-level scores and time alignments. It is marked by a
header of the form
    NBestList2.0
The hypotheses are in the format
    (score) w1 ( st: st1 et: et1 g: g1 a: a1 ) w2 ...
where words are followed by start and end times, language model and
acoustic scores (bytelog-scaled), respectively. This format may also
contain scores and time marks for sub-word units (phones and HMM
states), in the same format as above, but with the w's denoting phone
and state names. Sub-word units will have time marks that are
contained in the duration of the preceding word units, and may thus be
easily identified.
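
A hypothesis line in this format can be split with a regular
expression, and the containment rule for sub-word units can then be
applied to recover the word sequence. This is a sketch under
assumptions: the helper names and the sample line are invented for
illustration, and the whitespace layout is taken from the pattern shown
above rather than from real recognizer output.

```python
import re

# token ( st: <start> et: <end> g: <lm score> a: <acoustic score> )
UNIT_RE = re.compile(
    r"(\S+)\s+\(\s*st:\s*(\S+)\s+et:\s*(\S+)\s+g:\s*(\S+)\s+a:\s*(\S+)\s*\)"
)

def parse_nbest2_hyp(line):
    """Split one NBestList2.0 hypothesis into its total score and units."""
    head, _, tail = line.strip().partition(")")
    total = float(head.lstrip("("))
    units = [
        {"token": w, "st": float(st), "et": float(et),
         "g": float(g), "a": float(a)}
        for w, st, et, g, a in UNIT_RE.findall(tail)
    ]
    return total, units

def words_only(units):
    """Drop units whose time span lies inside the preceding word unit."""
    words = []
    for u in units:
        if words and words[-1]["st"] <= u["st"] and u["et"] <= words[-1]["et"]:
            continue  # sub-word unit (phone or state) of the previous word
        words.append(u)
    return words

example = ("(-1500) hello ( st: 0.00 et: 0.40 g: -30 a: -700 ) "
           "h ( st: 0.00 et: 0.10 g: 0 a: -200 ) "
           "world ( st: 0.40 et: 0.90 g: -40 a: -730 )")
total, units = parse_nbest2_hyp(example)
print(total, [u["token"] for u in words_only(units)])
```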

The third format understood by SRILM lists hypotheses in the format
    ascore lscore nwords w1 w2 w3 ...
where the first three columns contain the acoustic model log
probability, the language model log probability, and the number of
words in the hypothesis string, respectively. All scores are
logarithms base 10. (This format must not be preceded by an
``NBestList'' header.) This format is output by ngram -rescore and by
nbest-lattice -write-nbest without the -decipher-nbest option.
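
Since this format keeps the acoustic and language model scores apart,
a reader can recombine them with its own weights. The following sketch
parses such lines and picks the best hypothesis; the function names, the
weight values, and the sample lines are assumptions made for
illustration only.

```python
def parse_srilm_nbest(line):
    """Split one hypothesis line into (ascore, lscore, nwords, words)."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected at least ascore, lscore and nwords")
    return float(fields[0]), float(fields[1]), int(fields[2]), fields[3:]

def combined_score(ascore, lscore, nwords, lm_weight=8.0, word_penalty=0.0):
    """Weighted log10 score; both weights are hypothetical tuning values."""
    return ascore + lm_weight * lscore + word_penalty * nwords

sample = [
    "-2000.5 -12.3 3 the cat sat",
    "-1995.0 -15.1 3 the cat sad",
]
hyps = [parse_srilm_nbest(line) for line in sample]
best = max(hyps, key=lambda h: combined_score(h[0], h[1], h[2]))
print(" ".join(best[3]))
```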

SEE ALSO
ngram(1), nbest-lattice(1), segment-nbest(1), nbest-scripts(1),
pfsg-scripts(1).

BUGS
All these formats are somewhat ad hoc and could use a more rational
design. The ``NBestList1.0'' format is particularly cumbersome because
it conflates acoustic and language model scores.
A generalization to an arbitrary number of separate scores would be
nice.

AUTHOR
Manual page written by Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 1999-2001 SRI International

SRILM File Formats $Date: 2007/12/19 22:08:05 $ nbest-format(5)
89
language_model/srilm-1.7.3/man/cat5/ngram-format.5
Normal file
@@ -0,0 +1,89 @@
ngram-format(5) ngram-format(5)

NAME
ngram-format - File format for ARPA backoff N-gram models

SYNOPSIS
\data\
ngram 1=n1
ngram 2=n2
...
ngram N=nN

\1-grams:
p w [bow]
...

\2-grams:
p w1 w2 [bow]
...

\N-grams:
p w1 ... wN
...

\end\

DESCRIPTION
The so-called ARPA (or Doug Paul) format for N-gram backoff models
starts with a header, introduced by the keyword \data\, listing the
number of N-grams of each length. Following that, N-grams are listed
one per line, grouped into sections by length, each section starting
with the keyword \N-grams:, where N is the length of the N-grams to
follow. Each N-gram line starts with the logarithm (base 10) of the
conditional probability p of that N-gram, followed by the words
w1...wN making up the N-gram. These are optionally followed by the
logarithm (base 10) of the backoff weight for the N-gram. The keyword
\end\ concludes the model representation.
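
The layout described above is simple enough to read with a short
state machine. The sketch below is an illustration rather than SRILM's
reader: parse_arpa and the toy model are hypothetical, and the code
reads only the first model in the text and skips anything before
\data\.

```python
import re

def parse_arpa(text):
    """Return (header_counts, ngrams) with ngrams[n][words] = (logp, bow)."""
    counts, ngrams = {}, {}
    order = None       # order of the section currently being read
    in_model = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "\\data\\":
            in_model = True
        elif not in_model:
            continue   # ancillary text before the model
        elif line == "\\end\\":
            break
        elif (m := re.fullmatch(r"ngram (\d+)=(\d+)", line)):
            counts[int(m.group(1))] = int(m.group(2))
        elif (m := re.fullmatch(r"\\(\d+)-grams:", line)):
            order = int(m.group(1))
            ngrams[order] = {}
        elif order is not None:
            fields = line.split()
            logp = float(fields[0])
            words = tuple(fields[1:1 + order])
            # The backoff weight is optional and follows the words.
            bow = float(fields[1 + order]) if len(fields) > 1 + order else None
            ngrams[order][words] = (logp, bow)
    return counts, ngrams

sample = r"""
\data\
ngram 1=3
ngram 2=2

\1-grams:
-0.5 a -0.3
-0.7 b -0.2
-99 <s> -0.4

\2-grams:
-0.2 <s> a
-0.4 a b

\end\
"""
counts, ngrams = parse_arpa(sample)
print(counts, {n: len(g) for n, g in ngrams.items()})
```

Comparing the header counts with the number of entries actually read
per section is a cheap way to spot a hand-edited file.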

Backoff weights are required only for those N-grams that form a prefix
of longer N-grams in the model. The highest-order N-grams in
particular will not need backoff weights (they would be useless).
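
The reason is visible in the usual backoff recursion, where a weight is
consulted only when its N-gram serves as the context of a longer,
unlisted N-gram. The sketch below assumes a flat dictionary from word
tuples to (log10 probability, log10 backoff weight) pairs; that layout
and the function name are hypothetical.

```python
def log10_prob(ngrams, words):
    """log10 P(last word | preceding words) under the backoff recursion."""
    words = tuple(words)
    entry = ngrams.get(words)
    if entry is not None:
        return entry[0]            # the N-gram is listed explicitly
    if len(words) == 1:
        return float("-inf")       # unlisted unigram: no probability
    context = ngrams.get(words[:-1])
    # A context without a stored weight contributes log10(1) = 0.
    bow = context[1] if context and context[1] is not None else 0.0
    # Back off: drop the oldest context word and retry.
    return bow + log10_prob(ngrams, words[1:])

model = {
    ("a",): (-0.5, -0.3),
    ("b",): (-0.7, -0.2),
    ("a", "b"): (-0.4, None),      # highest order: no backoff weight
}
print(log10_prob(model, ("a", "b")))   # listed bigram
print(log10_prob(model, ("b", "a")))   # backs off through the weight of "b"
```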

Since log(0) (minus infinity) has no portable representation, such
values are mapped to a large negative number when a model is written.
Conversely, the designated dummy value (-99 in SRILM) is interpreted as
log(0) when read back from file into memory.
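
A reader and writer can apply this mapping symmetrically. The sketch
below uses hypothetical helper names and assumes an exact comparison
against -99; how SRILM itself tests for the dummy value is not stated
here.

```python
import math

LOG_ZERO_DUMMY = -99.0   # the dummy value cited above for SRILM

def read_log10(field):
    """Parse a log10 field, mapping the dummy value to minus infinity."""
    value = float(field)
    return -math.inf if value == LOG_ZERO_DUMMY else value

def write_log10(logp):
    """Format a log10 value, mapping minus infinity to the dummy value."""
    return "-99" if logp == -math.inf else f"{logp:.6g}"

def to_prob(logp):
    """Convert a log10 value to a probability; log(0) becomes 0."""
    return 0.0 if logp == -math.inf else 10.0 ** logp

print(to_prob(read_log10("-99")), write_log10(read_log10("-0.5")))
```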

The correctness of the N-gram counts n1, n2, ... in the header is not
enforced by SRILM software when reading models (although a warning is
printed when an inconsistency is encountered). This allows easy
textual insertion or deletion of parameters in a model file. The proper
format can be recovered by passing the model through the command
    ngram -order N -lm input -write-lm output

Note that the format is self-delimiting, allowing multiple models to be
stored in one file, or to be surrounded by ancillary information. Some
extensions of N-gram models in SRILM store additional parameters after
a basic N-gram section in the standard format.

SEE ALSO
ngram(1), ngram-count(1), lm-scripts(1), pfsg-scripts(1).

BUGS
The ARPA format does not allow N-grams that have only a backoff weight
associated with them, but no conditional probability. This makes the
format less general than would otherwise be useful (e.g., to support
pruned models, or ones containing a mix of words and classes). The
ngram-count(1) tool satisfies this constraint by inserting dummy
probabilities where necessary.

For simplicity, an N-gram model containing N-grams up to length N is
referred to in the SRILM programs as an N-th order model, although
technically it represents a Markov model of order N-1.

BUGS
There is no way to specify words with embedded whitespace.

AUTHOR
The ARPA backoff format was developed by Doug Paul at MIT Lincoln Labs
for research sponsored by the U.S. Department of Defense Advanced
Research Projects Agency (ARPA).
Man page by Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 1999, 2004 SRI International

SRILM File Formats $Date: 2007/12/19 22:08:05 $ ngram-format(5)
65
language_model/srilm-1.7.3/man/cat5/pfsg-format.5
Normal file
@@ -0,0 +1,65 @@
pfsg-format(5) pfsg-format(5)

NAME
pfsg-format - File format for Decipher(TM) probabilistic finite-state
grammars

SYNOPSIS
name name
nodes N w1 ... wN
initial i
final f
transitions T
n1 n2 p
...

DESCRIPTION
Probabilistic finite-state grammars (PFSGs) are a form of finite-state
automaton or transducer used by the SRI Decipher(TM) recognizer. PFSGs
emit words (outputs) at the nodes, not on the arcs. Certain types of
language models manipulated by SRILM can be translated into PFSGs for
direct use in the recognizer.

Since it is usually fairly easy to convert between different
finite-state network representations, PFSGs can serve as an
intermediate format for the generation of other finite-state formats.
For example, PFSGs can be converted to the AT&T fsm(5) format.

Each PFSG is given a name. The name is significant if PFSGs are to be
composed, in which case the name specifies the category it expands.

The nodes line gives the number of nodes in the state graph, followed
by the word strings associated with each node. If the node represents
a category expanded by another PFSG, then the name string of that PFSG
is given here. The token NULL is special and designates the
corresponding node as non-emitting. It is conventional to use lowercase
strings for words, and uppercase for categories and PFSG names
(``NULL'' must be avoided, of course).

The initial and final lines specify the start and end states of the
grammar, respectively. Nodes are numbered starting at zero.

The transitions line gives the number of arcs (transitions) between
states. It is followed by that many lines, each specifying one
transition by its originating state n1, its target state n2, and the
transition cost p. The transition cost is usually interpreted as
10000.5 times the natural logarithm of a probability, and should be
normalized and scaled accordingly.
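
Because no field can contain whitespace, a PFSG can be read as a flat
token stream. The sketch below parses one grammar and converts between
probabilities and transition costs using the scaling described above.
The function names and the toy grammar are hypothetical, and rounding
costs to integers is an assumption of this example.

```python
import math

SCALE = 10000.5   # scaling factor quoted in the description above

def prob_to_cost(p):
    """Transition cost for probability p (rounding is an assumption)."""
    return round(SCALE * math.log(p))

def cost_to_prob(cost):
    """Invert the scaling to recover an approximate probability."""
    return math.exp(cost / SCALE)

def parse_pfsg(text):
    """Parse a single PFSG from whitespace-separated tokens."""
    tokens = iter(text.split())

    def expect(keyword):
        tok = next(tokens)
        if tok != keyword:
            raise ValueError(f"expected {keyword!r}, got {tok!r}")

    pfsg = {}
    expect("name")
    pfsg["name"] = next(tokens)
    expect("nodes")
    n = int(next(tokens))
    pfsg["nodes"] = [next(tokens) for _ in range(n)]   # node i emits nodes[i]
    expect("initial")
    pfsg["initial"] = int(next(tokens))
    expect("final")
    pfsg["final"] = int(next(tokens))
    expect("transitions")
    t = int(next(tokens))
    pfsg["transitions"] = [
        (int(next(tokens)), int(next(tokens)), float(next(tokens)))
        for _ in range(t)
    ]
    return pfsg

example = """name EXAMPLE
nodes 4 NULL hello world NULL
initial 0
final 3
transitions 3
0 1 0
1 2 -6932
2 3 0
"""
g = parse_pfsg(example)
for src, dst, cost in g["transitions"]:
    print(g["nodes"][src], "->", g["nodes"][dst], round(cost_to_prob(cost), 3))
```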

SEE ALSO
pfsg-scripts(1), fsm(1).

BUGS
File formats are a matter of taste ...
There is no way to specify words with embedded whitespace.

AUTHOR
PFSGs were developed as part of SRI's Decipher(TM) recognition system.
Manual page written by Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 1999, 2004 SRI International

SRILM File Formats $Date: 2007/12/19 22:08:05 $ pfsg-format(5)
116
language_model/srilm-1.7.3/man/cat5/wlat-format.5
Normal file
@@ -0,0 +1,116 @@
wlat-format(5) File Formats Manual wlat-format(5)

NAME
wlat-format - File format for SRILM word posterior lattices

SYNOPSIS
Word lattices:
    version 2
    name s
    initial i
    final f
    node n w a p n1 p1 n2 p2 ...
    ...

Word meshes (confusion networks):
    name s
    numaligns N
    posterior P
    align a w1 p1 w2 p2 ...
    reference a w
    hyps a w h1 h2 ...
    info a w start dur ascore gscore phones phonedurs
    time a t
    ...

DESCRIPTION
Word posterior lattices and meshes are lattices generated by aligning
N-best hypotheses with nbest-lattice(1), or by aligning PFSG or HTK
lattices with lattice-tool(1). They compactly encode possible word
hypothesis sequences and their posterior probabilities. (Word meshes
have become generally known as ``confusion networks'' or ``sausages.'')

A word lattice is a partially ordered directed graph with nodes
representing word hypotheses. Nodes are identified by non-negative
integers. The file format specifies the initial node i, the final node
f, and any number of additional nodes n. For each node n the following
associated information is given on the same line: the word identity w
(the string ``NULL'' is used with initial and final nodes), the
alignment position a (identical values in this field identify
hypotheses that occur at the same position), and the word posterior
probability p. Following these values, zero or more transitions to
successor nodes are specified, each given by the node index ni and the
transition posterior probability pi. In a properly normalized word
lattice the transition posteriors pi sum up to the node posterior p.
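
A node line can be unpacked field by field, and the normalization
property stated above then becomes a one-line check. This sketch uses
hypothetical names and an invented sample line, and it assumes that the
alignment position is an integer.

```python
def parse_node(line):
    """Parse `node n w a p n1 p1 n2 p2 ...` into a dictionary."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "node":
        raise ValueError("not a node line")
    rest = fields[5:]
    if len(rest) % 2:
        raise ValueError("successors must come in (index, posterior) pairs")
    return {
        "index": int(fields[1]),
        "word": fields[2],
        "align": int(fields[3]),       # assumption: integer positions
        "posterior": float(fields[4]),
        "transitions": [(int(rest[i]), float(rest[i + 1]))
                        for i in range(0, len(rest), 2)],
    }

def is_normalized(node, tol=1e-6):
    """Check that transition posteriors sum to the node posterior."""
    if not node["transitions"]:
        return True                    # nothing to check, e.g. a final node
    total = sum(p for _, p in node["transitions"])
    return abs(total - node["posterior"]) <= tol

node = parse_node("node 3 hello 1 0.7 4 0.5 5 0.2")
print(node["word"], node["transitions"], is_normalized(node))
```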

Word meshes represent a more constrained lattice format in which word
hypotheses are in a total order. A mesh contains a number of alignment
positions, and a set of mutually exclusive word hypotheses in each
position (the ``confusion sets''). The word mesh represents all
sentence hypotheses that can be generated by freely combining word
hypotheses at each position. The file format specifies the number of
alignment positions N and the total posterior probability mass P
contained in the lattice, followed by one or more confusion set
specifications. For each alignment position a, the hypothesized words
wi and their posterior probabilities pi are listed in alternation. The
pseudo-word string *DELETE* represents an empty hypothesis.
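
Reading the confusion sets and picking the highest-posterior word at
each position yields a consensus hypothesis in the sense of the Mangu,
Brill and Stolcke paper cited in SEE ALSO. The sketch below uses
hypothetical function names and invented sample lines, and it assumes
integer alignment positions.

```python
def parse_align(line):
    """Parse `align a w1 p1 w2 p2 ...` into (position, {word: posterior})."""
    fields = line.split()
    if len(fields) < 2 or fields[0] != "align":
        raise ValueError("not an align line")
    rest = fields[2:]
    if len(rest) % 2:
        raise ValueError("words and posteriors must alternate")
    words = {rest[i]: float(rest[i + 1]) for i in range(0, len(rest), 2)}
    return int(fields[1]), words

def consensus(confusion_sets):
    """Pick the top word per position, skipping empty hypotheses."""
    result = []
    for position in sorted(confusion_sets):
        candidates = confusion_sets[position]
        best = max(candidates, key=candidates.get)
        if best != "*DELETE*":
            result.append(best)
    return result

lines = [
    "align 0 the 0.9 a 0.1",
    "align 1 *DELETE* 0.6 big 0.4",
    "align 2 cat 0.8 hat 0.2",
]
mesh = dict(parse_align(line) for line in lines)
print(consensus(mesh))
```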

Optionally, the word mesh format encodes additional information about
the hypothesis alignment from which it resulted. The keyword reference
specifies the correct word w that was aligned at position a. The
keyword hyps is used to list the sentence hypotheses of which a certain
word hypothesis was a part. The word hypothesis is identified by an
alignment position a and the word string w, and is followed by the
integer IDs hi (typically, the N-best ranks) of the associated sentence
hypotheses.

As another optional element, the word mesh can contain word-level
acoustic and temporal information, following the keyword info, the
alignment position a, and the word identity w. This information is
derived by nbest-lattice(1) from word- and phone-level backtraces of
N-best hypotheses (as represented in Decipher NBestList2.0 format). The
details of this information are defined in the SRILM class
NBestWordInfo and subject to change, but currently include the
following.
    start: word start time (in seconds from the beginning of the
        waveform);
    dur: word duration (in seconds);
    ascore: acoustic model likelihood (log base 10);
    gscore: grammar (LM and pronunciation) score (log base 10);
    phones: sequence of phones in word (separated by colons);
    phonedurs: sequence of phone durations (in numbers of frames,
        separated by colons).
When word meshes are derived from HTK format lattices, the
pronunciation field will consist of the HTK phone alignment information,
which encodes both phone sequence and durations; the phone duration
field in turn is used to encode the duration model scores, if present.
Note: The encoded information pertains to the word hypothesis with the
highest posterior probability among all hypotheses of the same word
aligned to a given word mesh position.
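
An info line with the fields listed above can be unpacked as follows.
This is only a sketch: the function name and sample line are invented,
the field layout is described as subject to change, and the code
assumes the nbest-lattice(1) variant with colon-separated phones and
integer frame counts rather than the HTK-derived variant.

```python
def parse_info(line):
    """Parse `info a w start dur ascore gscore phones phonedurs`."""
    fields = line.split()
    if len(fields) != 9 or fields[0] != "info":
        raise ValueError("not an info line with the expected nine fields")
    _, a, w, start, dur, ascore, gscore, phones, phonedurs = fields
    return {
        "align": int(a),               # assumption: integer positions
        "word": w,
        "start": float(start),         # seconds from start of waveform
        "dur": float(dur),             # seconds
        "ascore": float(ascore),       # log base 10
        "gscore": float(gscore),       # log base 10
        "phones": phones.split(":"),
        "phonedurs": [int(d) for d in phonedurs.split(":")],  # frames
    }

info = parse_info("info 2 cat 0.41 0.27 -350.2 -4.1 k:ae:t 8:12:7")
print(info["word"], list(zip(info["phones"], info["phonedurs"])))
```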

The time keyword is used for debugging purposes and encodes the
estimated timestamp t of an alignment position a when the input
contains backtrace information. It is ignored when reading in word
meshes.

Both formats optionally encode the associated utterance IDs in the name
field. Word lattices and meshes can be converted to PFSG format using
the script wlat-to-pfsg.

SEE ALSO
nbest-lattice(1), lattice-tool(1), pfsg-scripts(1), pfsg-format(5),
nbest-format(5).
L. Mangu, E. Brill, & A. Stolcke, ``Finding consensus in speech
recognition: word error minimization and other applications of
confusion networks,'' Computer Speech and Language 14(4), 373-400, 2000.

BUGS
Detailed alignment and acoustic information is so far only implemented
for word meshes, although conceptually it would apply equally to word
lattices.

AUTHOR
Andreas Stolcke <andreas.stolcke@microsoft.com>
Copyright 2001-2011 SRI International
Copyright 2011-2019 Microsoft Corp.

SRILM File Formats $Date: 2019/02/06 09:53:12 $ wlat-format(5)