competition update

2025-07-02 12:18:09 -07:00
parent 9e17716a4a
commit 77dbcf868f
2615 changed files with 1648116 additions and 125 deletions
--- a/language_model/srilm-1.7.3/man/cat5/classes-format.5
+++ b/language_model/srilm-1.7.3/man/cat5/classes-format.5
@@ -0,0 +1,37 @@
+classes-format(5)                                            classes-format(5)
+
+
+
+NAME
+       classes-format - File format for word class definitions
+
+SYNOPSIS
+       class [p] word1 word2 ...
+
+DESCRIPTION
+       Various  programs  dealing  with word classes use this format to define
+       the possible expansions of classes and their respective  probabilities.
+       Each  expansion  appears  on  a separate line as in the synopsis, where
+       class names a word class, p gives the probability for the class  expan-
+       sion,  and  word1  word2  ...  defines  the  word string that the class
+       expands to.  If p is omitted it is assumed to  be  1.   (All  expansion
+       probabilities  for  a  given class should sum to one.  The software does
+       not necessarily enforce this, and violating it leads to  improper  mod-
+       els.)
+
+       Note  that  the  concept  of  word class here is generalized to include
+       ``multi-words'', or phrases consisting of  more  than  one  word.   All
+       expansions  must  have  at least one word.  Certain models might impose
+       more restrictive formats.
+
+SEE ALSO
+       ngram(1),  ngram-class(1),  disambig(1),   training-scripts(1),   pfsg-
+       scripts(1).
+
+AUTHOR
+       Andreas Stolcke <stolcke@speech.sri.com>.
+       Copyright 1999 SRI International
+
+
+
+SRILM File Formats       $Date: 2007/12/19 22:08:05 $        classes-format(5)
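
The classes-format(5) page above describes lines of the form "class [p] word1 word2 ...". A minimal Python sketch of a reader for that layout follows. It is not SRILM code: the function names are hypothetical, and deciding whether the second field is the optional probability p by testing whether it parses as a number is an assumption (a numeric first word would be misread).

```python
from collections import defaultdict


def _is_number(token):
    """Heuristic (assumption): the optional p field is whatever parses as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def parse_classes(lines):
    """Parse 'class [p] word1 word2 ...' lines into {class: [(p, words), ...]}."""
    expansions = defaultdict(list)
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        cls, rest = fields[0], fields[1:]
        # A numeric field counts as p only if at least one word follows it.
        if len(rest) >= 2 and _is_number(rest[0]):
            prob, words = float(rest[0]), rest[1:]
        else:
            prob, words = 1.0, rest  # p omitted, so it defaults to 1
        if not words:
            raise ValueError(f"expansion for {cls!r} has no words")
        expansions[cls].append((prob, words))
    return dict(expansions)


def unnormalized_classes(expansions, tol=1e-6):
    """Return the classes whose expansion probabilities do not sum to one."""
    return [cls for cls, exps in expansions.items()
            if abs(sum(p for p, _ in exps) - 1.0) > tol]
```

Calling unnormalized_classes() on the parsed result is one way to catch the improper models the page warns about before handing the file to a tool.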
--- a/language_model/srilm-1.7.3/man/cat5/nbest-format.5
+++ b/language_model/srilm-1.7.3/man/cat5/nbest-format.5
@@ -0,0 +1,64 @@
+nbest-format(5)                                                nbest-format(5)
+
+
+
+NAME
+       nbest-format - File formats for N-best hypotheses lists
+
+DESCRIPTION
+       SRILM currently understands three different formats for lists of N-best
+       hypotheses for rescoring or 1-best hypothesis  extraction.   The  first
+       two  formats originated in the SRI Decipher(TM) recognition system; the
+       third format is particular to SRILM.
+
+       The first format consists of the header
+            NBestList1.0
+       followed by one or more lines of the form
+            (score) w1 w2 w3 ...
+       where score is a composite acoustic/language model score from the  rec-
+       ognizer,  on  the  bytelog  scale.   (A  bytelog is a logarithm to base
+       1.0001, divided by 1024 and rounded to an  integer.)   This  format  is
+       output  by  the  SRI  Decipher(TM)  recognizer, by ngram -nbest, and by
+       nbest-lattice -write-nbest -decipher-nbest.
+
+       The second Decipher(TM) format is an extension of the first format that
+       encodes  word-level  scores  and  time  alignments.   It is marked by a
+       header of the form
+            NBestList2.0
+       The hypotheses are in the format
+            (score) w1 ( st: st1 et: et1 g: g1 a: a1 ) w2 ...
+       where each word is followed by its start time, end time, language model
+       score,  and  acoustic  score (both scores bytelog-scaled).  This format
+       may also contain scores and time marks for sub-word units  (phones  and
+       HMM  states),  in  the  same format as above, but with the w's denoting
+       phone and state names.  Sub-word units have time marks  that  are  con-
+       tained  in  the  duration  of the preceding word units, and may thus be
+       easily identified.
+
+       The third format understood by SRILM lists hypotheses in the format
+            ascore lscore nwords w1 w2 w3 ...
+       where the first three columns contain the acoustic model log  probabil-
+       ity, the language model log probability, and the number of words in the
+       hypothesis string, respectively.  All scores are  logarithms  base  10.
+       (This  format  must  not be preceded by an ``NBestList'' header.)  This
+       format  is  output  by ngram -rescore and by nbest-lattice -write-nbest
+       without the -decipher-nbest option.
+
+SEE ALSO
+       ngram(1),  nbest-lattice(1),  segment-nbest(1), nbest-scripts(1), pfsg-
+       scripts(1).
+
+BUGS
+       All these formats are somewhat ad hoc and could  use  a  more  rational
+       design.  The ``NBestList1.0'' format is particularly cumbersome because
+       it conflates acoustic and language model scores.
+       A generalization to an arbitrary number of  separate  scores  would  be
+       nice.
+
+AUTHOR
+       Manual page written by Andreas Stolcke <stolcke@speech.sri.com>.
+       Copyright 1999-2001 SRI International
+
+
+
+SRILM File Formats       $Date: 2007/12/19 22:08:05 $          nbest-format(5)
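
The nbest-format(5) page above defines a bytelog as a logarithm to base 1.0001, divided by 1024 and rounded to an integer, and describes the third format as "ascore lscore nwords w1 w2 ...". The Python sketch below applies those two statements literally. The helper names are hypothetical, and this is not how SRILM itself is implemented.

```python
import math

# One bytelog unit expressed in log10 units, taking the man page's definition
# literally: bytelog = round(log_base_1.0001(x) / 1024).
LOG10_PER_BYTELOG = 1024 * math.log10(1.0001)


def bytelog_to_log10(bytelog):
    """Convert a Decipher-style bytelog score to a base-10 log score."""
    return bytelog * LOG10_PER_BYTELOG


def log10_to_bytelog(log10_value):
    """Convert a base-10 log score to the nearest integer bytelog."""
    return round(log10_value / LOG10_PER_BYTELOG)


def parse_srilm_nbest_line(line):
    """Parse one 'ascore lscore nwords w1 w2 ...' line of the third format.

    Returns (ascore, lscore, nwords, words).  nwords is returned as read and
    is not checked against len(words).
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected at least three columns: {line!r}")
    ascore, lscore, nwords = float(fields[0]), float(fields[1]), int(fields[2])
    return ascore, lscore, nwords, fields[3:]
```

Under this definition one bytelog step is roughly 0.0445 in log10 units, so converting the first two formats to the third loses little precision.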
--- a/language_model/srilm-1.7.3/man/cat5/ngram-format.5
+++ b/language_model/srilm-1.7.3/man/cat5/ngram-format.5
@@ -0,0 +1,89 @@
+ngram-format(5)                                                ngram-format(5)
+
+
+
+NAME
+       ngram-format - File format for ARPA backoff N-gram models
+
+SYNOPSIS
+       \data\
+       ngram 1=n1
+       ngram 2=n2
+       ...
+       ngram N=nN
+
+       \1-grams:
+       p    w         [bow]
+       ...
+
+       \2-grams:
+       p    w1 w2          [bow]
+       ...
+
+       \N-grams:
+       p    w1 ... wN
+       ...
+
+       \end\
+
+DESCRIPTION
+       The  so-called  ARPA  (or  Doug  Paul) format for N-gram backoff models
+       starts with a header, introduced by the  keyword  \data\,  listing  the
+       number  of  N-grams of each length.  Following that, N-grams are listed
+       one per line, grouped into sections by length,  each  section  starting
+       with the keyword \N-grams:, where N is the length of the N-grams to fol-
+       low.  Each N-gram line starts with the logarithm (base 10) of the condi-
+       tional  probability p of that N-gram, followed by the words w1...wN mak-
+       ing up the N-gram.  These are  optionally  followed  by  the  logarithm
+       (base  10) of the backoff weight for the N-gram.  The keyword \end\ con-
+       cludes the model representation.
+
+       Backoff weights are required only for those N-grams that form a  prefix
+       of  longer N-grams in the model.  The highest-order N-grams in particu-
+       lar will not need backoff weights (they would be useless).
+
+       Since log(0) (minus infinity) has no portable representation, such val-
+       ues  are  mapped  to  a large negative number.  However, the designated
+       dummy value (-99 in SRILM) is interpreted as log(0) when read back from
+       file into memory.
+
+       The  correctness  of  the N-gram counts n1, n2, ... in the header is not
+       enforced by SRILM software when reading models (although a  warning  is
+       printed  when  an inconsistency is encountered).  This allows easy tex-
+       tual insertion or deletion of parameters in a model file.   The  proper
+       format can be recovered by passing the model through the command
+            ngram -order N -lm input -write-lm output
+
+       Note that the format is self-delimiting, allowing multiple models to be
+       stored in one file, or to be surrounded by ancillary information.  Some
+       extensions  of N-gram models in SRILM store additional parameters after
+       a basic N-gram section in the standard format.
+
+SEE ALSO
+       ngram(1), ngram-count(1), lm-scripts(1), pfsg-scripts(1).
+
+BUGS
+       The ARPA format does not allow N-grams that have only a backoff  weight
+       associated  with  them, but no conditional probability.  This makes the
+       format less general than would otherwise be useful  (e.g.,  to  support
+       pruned  models,  or  ones  containing a mix of words and classes).  The
+       ngram-count(1) tool satisfies this constraint by inserting dummy  proba-
+       bilities where necessary.
+
+       For  simplicity,  an  N-gram  model containing N-grams up to length N is
+       referred to in the SRILM programs as  an  N-th  order  model,  although
+       technically it represents a Markov model of order N-1.
+
+BUGS
+       There is no way to specify words with embedded whitespace.
+
+AUTHOR
+       The  ARPA backoff format was developed by Doug Paul at MIT Lincoln Labs
+       for research sponsored by  the  U.S.  Department  of  Defense  Advanced
+       Research Project Agency (ARPA).
+       Man page by Andreas Stolcke <stolcke@speech.sri.com>.
+       Copyright 1999, 2004 SRI International
+
+
+
+SRILM File Formats       $Date: 2007/12/19 22:08:05 $          ngram-format(5)
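
To make the ngram-format(5) layout above concrete, here is a minimal Python sketch of a reader for one ARPA model. It follows only what the page states: the \data\ header, \N-grams: sections, optional backoff weights, the -99 dummy for log(0), and \end\ as terminator. The function names are hypothetical, and malformed input is not handled.

```python
import math

LOG_ZERO_DUMMY = -99.0  # dummy value the man page says SRILM uses for log(0)


def read_arpa(lines):
    """Return (header_counts, ngrams) for the first ARPA model in lines.

    ngrams maps a word tuple to (log10_prob, log10_backoff_or_None).
    """
    header, ngrams, order, in_model = {}, {}, 0, False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line == "\\data\\":
            in_model = True
        elif line == "\\end\\":
            break  # the format is self-delimiting
        elif not in_model:
            continue  # ancillary text before the model is skipped
        elif order == 0 and line.startswith("ngram "):
            n, count = line[len("ngram "):].split("=")
            header[int(n)] = int(count)
        elif line.startswith("\\") and line.endswith("-grams:"):
            order = int(line[1:-len("-grams:")])
        elif order > 0:
            fields = line.split()
            logp = float(fields[0])
            if logp == LOG_ZERO_DUMMY:
                logp = -math.inf  # read the dummy back as log(0)
            words = tuple(fields[1:1 + order])
            bow = float(fields[1 + order]) if len(fields) > 1 + order else None
            ngrams[words] = (logp, bow)
    return header, ngrams


def header_mismatches(header, ngrams):
    """Return {order: (declared, actual)} where the header count is wrong."""
    actual = {}
    for words in ngrams:
        actual[len(words)] = actual.get(len(words), 0) + 1
    return {n: (declared, actual.get(n, 0))
            for n, declared in header.items()
            if declared != actual.get(n, 0)}
```

header_mismatches() mirrors the situation the page describes, where header counts can drift after manual edits and are only warned about, not enforced.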
--- a/language_model/srilm-1.7.3/man/cat5/pfsg-format.5
+++ b/language_model/srilm-1.7.3/man/cat5/pfsg-format.5
@@ -0,0 +1,65 @@
+pfsg-format(5)                                                  pfsg-format(5)
+
+
+
+NAME
+       pfsg-format  -  File format for Decipher(TM) probabilistic finite-state
+       grammars
+
+SYNOPSIS
+       name name
+       nodes N w1 ... wN
+       initial i
+       final f
+       transitions T
+       n1 n2 p
+       ...
+
+DESCRIPTION
+       Probabilistic finite-state grammars (PFSGs) are a form of  finite-state
+       automaton or transducer used by the SRI Decipher(TM) recognizer.  PFSGs
+       emit words (outputs) at the nodes, not on the arcs.  Certain  types  of
+       language  models  manipulated by SRILM can be translated into PFSGs for
+       direct use in the recognizer.
+
+       Since it is usually fairly easy to convert  between  different  finite-
+       state  network representations, PFSGs can serve as an intermediate for-
+       mat for the generation of other  finite-state  formats.   For  example,
+       PFSGs can be converted to the AT&T fsm(5) format.
+
+       Each  PFSG  is given a name.  The name is significant if PFSGs are to be
+       composed, in which case the name specifies the category it expands.
+
+       The nodes line gives the number of nodes in the state  graph,  followed
+       by  the word strings associated with each node.  If the node represents
+       a category expanded by another PFSG, then the name string of that  PFSG
+       is  given  here.   The  token  NULL is special and designates the corre-
+       sponding node as non-emitting.  It is  conventional  to  use  lowercase
+       strings  for  words,  and  uppercase  for  categories  and  PFSG  names
+       (``NULL'' must be avoided, of course).
+
+       The initial and final lines specify the start and  end  states  of  the
+       grammar, respectively.  Nodes are numbered starting at zero.
+
+       The  transitions  line  gives  the  number of arcs (transitions) between
+       states.  It is followed by that many lines, each specifying one transi-
+       tion  by  its  originating state n1, its target state n2, and the transi-
+       tion cost p.  The transition cost is  usually  interpreted  as  10000.5
+       times  the natural logarithm of a probability, and should be normalized
+       and scaled accordingly.
+
+SEE ALSO
+       pfsg-scripts(1), fsm(1).
+
+BUGS
+       File formats are a matter of taste ...
+       There is no way to specify words with embedded whitespace.
+
+AUTHOR
+       PFSGs were developed as part of SRI's Decipher(TM) recognition  system.
+       Manual page written by Andreas Stolcke <stolcke@speech.sri.com>.
+       Copyright 1999, 2004 SRI International
+
+
+
+SRILM File Formats       $Date: 2007/12/19 22:08:05 $           pfsg-format(5)
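
The pfsg-format(5) page above says a transition cost is usually read as 10000.5 times the natural logarithm of a probability. The Python sketch below shows that conversion together with a small reader for the keyword layout in the synopsis. The names are hypothetical, rounding costs to integers is an assumption, and the reader treats the file as one whitespace-separated token stream.

```python
import math

COST_SCALE = 10000.5  # scale factor quoted in the man page


def prob_to_cost(prob):
    """Scale the natural log of prob by 10000.5 (rounding is an assumption)."""
    return round(COST_SCALE * math.log(prob))


def cost_to_prob(cost):
    """Invert prob_to_cost, up to rounding error."""
    return math.exp(cost / COST_SCALE)


def parse_pfsg(text):
    """Parse one PFSG given in the keyword order shown in the synopsis."""
    tokens = iter(text.split())

    def keyword(expected):
        found = next(tokens)
        if found != expected:
            raise ValueError(f"expected {expected!r}, found {found!r}")

    keyword("name")
    name = next(tokens)
    keyword("nodes")
    num_nodes = int(next(tokens))
    labels = [next(tokens) for _ in range(num_nodes)]  # one label per node
    keyword("initial")
    initial = int(next(tokens))
    keyword("final")
    final = int(next(tokens))
    keyword("transitions")
    num_arcs = int(next(tokens))
    arcs = []
    for _ in range(num_arcs):
        source, target = int(next(tokens)), int(next(tokens))
        arcs.append((source, target, float(next(tokens))))
    return {"name": name, "labels": labels, "initial": initial,
            "final": final, "arcs": arcs}
```

Because nodes are numbered from zero, labels[i] is the word (or NULL, or category name) emitted at node i.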
--- a/language_model/srilm-1.7.3/man/cat5/wlat-format.5
+++ b/language_model/srilm-1.7.3/man/cat5/wlat-format.5
@@ -0,0 +1,116 @@
+wlat-format(5)                File Formats Manual               wlat-format(5)
+
+
+
+NAME
+       wlat-format - File format for SRILM word posterior lattices
+
+SYNOPSIS
+       Word lattices:
+       version 2
+       name s
+       initial i
+       final f
+       node n w a p n1 p1 n2 p2 ...
+       ...
+
+       Word meshes (confusion networks):
+       name s
+       numaligns N
+       posterior P
+       align a w1 p1 w2 p2 ...
+       reference a w
+       hyps a w h1 h2 ...
+       info a w start dur ascore gscore phones phonedurs
+       time a t
+       ...
+
+DESCRIPTION
+       Word  posterior  lattices and meshes are lattices generated by aligning
+       N-best hypotheses with nbest-lattice(1), or by  aligning  PFSG  or  HTK
+       lattices  with  lattice-tool(1).   They  compactly encode possible word
+       hypothesis sequences and their posterior probabilities.   (Word  meshes
+       have become generally known as ``confusion networks'' or ``sausages.'')
+
+       A  word lattice is a partially ordered directed graph with nodes repre-
+       senting word hypotheses.  Nodes are identified  by  non-negative  inte-
+       gers.   The  file  format specifies the initial node i, the final node f,
+       and any number of additional nodes n.  For each node  n  the  following
+       associated  information  is  given on the same line: the word identity w
+       (the string ``NULL'' is used with initial and final nodes), the  align-
+       ment  position  a  (identical  values  in this field identify hypotheses
+       that occur at the same position), and the word posterior probability p.
+       Following these values, zero or more transitions to successor nodes are
+       specified, each given by the node index ni and the transition posterior
+       probability  pi.   In  a properly normalized word lattice the transition
+       posteriors pi sum up to the node posterior p.
+
+       Word meshes represent a more constrained lattice format in  which  word
+       hypotheses are in a total order.  A mesh contains a number of alignment
+       positions, and a set of mutually  exclusive  word  hypotheses  in  each
+       position  (the  ``confusion sets'').  The word mesh represents all sen-
+       tence hypotheses  that  can  be  generated  by  freely  combining  word
+       hypotheses  at  each position.  The file format specifies the number of
+       alignment positions N and the total posterior probability mass  P  con-
+       tained in the lattice, followed by one or more confusion set specifica-
+       tions.  For each alignment position a, the  hypothesized  words  wi  and
+       their  posterior  probabilities  pi  are  listed  in  alternation.   The
+       pseudo-word string *DELETE* represents an empty hypothesis.
+
+       Optionally, the word mesh format encodes additional  information  about
+       the hypothesis alignment from which it resulted.  The keyword reference
+       specifies the correct word w that was aligned at position a.  The  key-
+       word  hyps  is  used  to list the sentence hypotheses of which a certain
+       word hypothesis was a part.  The word hypothesis is  identified  by  an
+       alignment position a and the word string w, and is followed by the inte-
+       ger IDs hi (typically,  the  N-best  ranks)  of  the  associated  sentence
+       hypotheses.
+
+       As  another  optional  element,  the  word  mesh can contain word-level
+       acoustic and temporal information,  following  the  keyword  info,  the
+       alignment  position  a,  and  the  word identity w.  This information is
+       derived by nbest-lattice(1) from word- and phone-level backtraces of N-
+       best  hypotheses (as represented in Decipher NBestList2.0 format).  The
+       details of this information are defined in the SRILM class NBestWordInfo
+       and  are  subject  to  change,  but  currently  include  the  following.
+       start: word start time (in seconds from the beginning of the waveform);
+       dur: word duration (in seconds); ascore: acoustic model likelihood (log
+       base 10); gscore: grammar (LM and pronunciation) score (log  base  10);
+       phones:  sequence  of  phones  in word (separated by colons); phonedurs:
+       sequence of  phone  durations  (in  numbers  of  frames,  separated  by
+       colons).   When  word  meshes  are derived from HTK format lattices, the
+       pronunciation field will consist of the HTK phone alignment information,
+       which  encodes  both  phone  sequence and durations; the phone duration
+       field in turn is used to encode the duration model scores, if  present.
+       Note:  The  encoded information pertains to the word hypothesis with the
+       highest posterior probability among all hypotheses  of  the  same  word
+       aligned to a given word mesh position.
+
+       The  time  keyword  is used for debugging purposes and encodes the esti-
+       mated timestamp t of an alignment position a when  the  input  contains
+       backtrace information.  It is ignored when reading in word meshes.
+
+       Both formats optionally encode the associated utterance IDs in the name
+       field.  Word lattices and meshes can be converted to PFSG format  using
+       the script wlat-to-pfsg.
+
+SEE ALSO
+       nbest-lattice(1),   lattice-tool(1),  pfsg-scripts(1),  pfsg-format(5),
+       nbest-format(5).
+       L. Mangu, E. Brill, & A. Stolcke, ``Finding consensus in speech  recog-
+       nition:  word  error  minimization  and other applications of confusion
+       networks,'' Computer Speech and Language 14(4), 373-400, 2000.
+
+BUGS
+       Detailed alignment and acoustic information is so far only  implemented
+       for  word  meshes, although conceptually it would apply equally to word
+       lattices.
+
+AUTHOR
+       Andreas Stolcke <andreas.stolcke@microsoft.com>
+       Copyright 2001-2011 SRI International
+       Copyright 2011-2019 Microsoft Corp.
+
+
+
+SRILM File Formats       $Date: 2019/02/06 09:53:12 $           wlat-format(5)
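
The word mesh description in wlat-format(5) above can be illustrated with a short Python sketch that reads the "align a w1 p1 w2 p2 ..." lines and picks the highest-posterior word at each position. The function names are hypothetical, treating the alignment position a as an integer is an assumption, and all other keywords (name, numaligns, posterior, reference, hyps, info, time) are simply skipped here.

```python
def parse_mesh_aligns(lines):
    """Collect 'align a w1 p1 w2 p2 ...' lines into {position: {word: posterior}}."""
    mesh = {}
    for raw in lines:
        fields = raw.split()
        if len(fields) < 2 or fields[0] != "align":
            continue  # other mesh keywords are ignored in this sketch
        position, pairs = int(fields[1]), fields[2:]
        if len(pairs) % 2:
            raise ValueError(f"unpaired word/posterior in: {raw!r}")
        mesh[position] = {pairs[i]: float(pairs[i + 1])
                          for i in range(0, len(pairs), 2)}
    return mesh


def best_path(mesh, delete_token="*DELETE*"):
    """Take the top-posterior word per position, dropping empty hypotheses."""
    words = []
    for position in sorted(mesh):
        confusion_set = mesh[position]
        if not confusion_set:
            continue  # an align line with no hypotheses contributes nothing
        word = max(confusion_set, key=confusion_set.get)
        if word != delete_token:
            words.append(word)
    return words
```

Since hypotheses in a mesh combine freely across positions, choosing the best word independently at each position yields the hypothesis with the highest product of word posteriors.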