competition update

2025-07-02 12:18:09 -07:00
parent 9e17716a4a
commit 77dbcf868f
2615 changed files with 1648116 additions and 125 deletions
--- a/language_model/srilm-1.7.3/man/cat3/File.3
+++ b/language_model/srilm-1.7.3/man/cat3/File.3
@@ -0,0 +1,76 @@
+File(3)                    Library Functions Manual                    File(3)
+
+
+
+NAME
+       File -  Wrapper for stdio streams
+
+SYNOPSIS
+       #include <File.h>
+
+DESCRIPTION
+       The  File  class provides a simple wrapper around stdio streams for use
+       with C++.  It provides two kinds of convenience: Firstly,  constructors
+       and  destructors  manage opening and closing of the stream.  The stream
+       is checked for errors on closing, and the default behavior is to exit()
+       with  an error message if a problem was found.  Secondly, the getline()
+       method can be used for line-oriented input.   It  strips  comments  and
+       keeps track of input line numbers for error reporting.
+
+CLASS MEMBERS
+       File(const char *name, const char *mode, int exitOnError = 1)
+
+       File(FILE *fp = 0, int exitOnError = 1)
+              A  File  object  can be initialized with either a filename or an
+              existing stdio stream.  In the first case, the  file  is  opened
+              according  to  mode  (as  if by fopen(3)).  The exitOnError flag
+              determines whether I/O errors should be treated as fatal.
+
+       ~File()
+              Destroying a File object implies closing the associated stream.
+
+       char *getline()
+              Returns the next line from the input, stored in a static  buffer
+              of up to maxLineLength characters.  Empty lines and lines start-
+              ing with # are skipped.
+
+       int close()
+              Closes the stream without destroying the File  object.   Returns
+              non-zero if an error condition occurs.
+
+       int error()
+              Returns  a  non-zero value if an error condition occurred on the
+              stream.
+
+       operator FILE *()
+              A File object can be cast to FILE *  to  access  the  underlying
+              stdio stream.
+
+       ostream &position(ostream &stream = cerr)
+              Outputs  the  current  line  number  on  stream.   The stream is
+              returned so it can be used as the left operand of the <<  opera-
+              tor.
+
+       const char *name
+              The filename used in creating the File object.
+
+       const unsigned lineno
+              The current line number as maintained by getline().
+
+       int exitOnError
+              When  set to true this causes errors on the stream to be handled
+              by program termination (after printing an error message).
+
+SEE ALSO
+       stdio(3)
+
+BUGS
+       Many other potentially useful functions are not provided (yet).
+
+AUTHOR
+       Andreas Stolcke <stolcke@icsi.berkeley.edu>
+       Copyright (c) 1995-1996 SRI International
+
+
+
+SRILM                    $Date: 2019/09/09 22:35:37 $                  File(3)
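The File(3) page above describes two conveniences: the stream is closed when the object is destroyed, and getline() skips empty lines and lines starting with # while counting input lines. The following is a minimal sketch of that pattern under stated assumptions, not SRILM's implementation. The class name LineReader, the 1024-byte buffer, and the choice to keep the trailing newline are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for the wrapper described in File(3); not SRILM code.
class LineReader {
public:
    unsigned lineno = 0;                 // number of input lines read so far

    explicit LineReader(std::FILE *fp) : fp_(fp) {}

    // The destructor closes the stream, mirroring "destroying implies closing".
    ~LineReader() {
        if (fp_) std::fclose(fp_);
    }

    LineReader(const LineReader &) = delete;
    LineReader &operator=(const LineReader &) = delete;

    // Returns the next line that is neither empty nor a '#' comment,
    // or nullptr at end of input. The result lives in an internal buffer
    // and is overwritten by the next call. Lines longer than the buffer
    // are split (and counted more than once) in this sketch.
    char *getline() {
        while (fp_ && std::fgets(buf_, sizeof buf_, fp_)) {
            ++lineno;
            if (buf_[0] == '\n' || buf_[0] == '#') continue;
            return buf_;
        }
        return nullptr;
    }

private:
    std::FILE *fp_;
    char buf_[1024];                     // assumed size, stands in for maxLineLength
};
```

Because skipped lines still advance lineno, an error message can report the position in the original file rather than the count of data lines.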
--- a/language_model/srilm-1.7.3/man/cat3/LM.3
+++ b/language_model/srilm-1.7.3/man/cat3/LM.3
@@ -0,0 +1,147 @@
+LM(3)                      Library Functions Manual                      LM(3)
+
+
+
+NAME
+       LM - Generic language model
+
+SYNOPSIS
+       #include <LM.h>
+
+DESCRIPTION
+       The  LM class specifies a minimal language model interface and provides
+       some generic utilities.
+
+       LM inherits from Debug, and the debugging level of an LM object  deter-
+       mines if and how much verbose information is printed by various
+       functions.
+
+CLASS MEMBERS
+       LM(Vocab &vocab)
+              Initializing an LM object requires  specifying  the  vocabulary
+              over  which  the  LM is defined.  The vocab object can be shared
+              among different LM instances.  The LM object can modify vocab as
+              a side-effect, e.g., as a result of reading an LM from a file.
+
+       LogP wordProb(VocabIndex word, const VocabIndex *context)
+
+       LogP wordProb(VocabString word, const VocabString *context)
+              Returns the conditional log probability of word given a history.
+              The history is given in reversed order (most recent word  first)
+              in  context,  and terminated by Vocab_None.  Word or history can
+              be specified either by strings or indices.   All  functional  LM
+              subclasses have to implement at least the first version.
+
+       LogP wordProbRecompute(VocabIndex word, const VocabIndex *context)
+              Returns  the same conditional log probability as wordProb(), but
+              on the promise that context is identical to that  of  the  last
+              call to wordProb().  This often allows for an efficient implemen-
+              tation that speeds up repeated lookups in the same context.
+
+       LogP sentenceProb(const VocabIndex *sentence, TextStats &stats)
+
+       LogP sentenceProb(const VocabString *sentence, TextStats &stats)
+              Returns the total log probability of a string of words  (a  sen-
+              tence).   The data in the stats object is incremented to reflect
+              the statistics of the sentence.
+
+       unsigned pplFile(File &file, TextStats &stats, const char *escapeString
+       = 0)
+              Reads  sentences  from  file,  computing their probabilities and
+              aggregate perplexity, and updating  the  stats.   The  debugging
+              state  of  the  LM  object  determines  how  much information is
+              printed to stderr.  debuglevel 0: total statistics  only;  debu-
+              glevel 1: per-sentence statistics; debuglevel 2: word probabili-
+              ties; debuglevel 3 and greater: LM specific information.
+              Lines in file that start with escapeString  are  copied  to  the
+              output.   This  allows extra information in the input file to be
+              passed through unchanged.
+
+       unsigned rescoreFile(File &file, double  lmScale,  double  wtScale,  LM
+       &oldLM,  double oldLmScale, double oldWtScale, const char *escapeString
+       = 0)
+              Reads N-best hypotheses and scores from file,  replaces  the  LM
+              scores with new ones computed from the current model, and prints
+              the new scores (including hypotheses) to  stdout.   lmScale  and
+              wtScale  are  the  LM and word transition weights, respectively.
+              oldLM is the LM whose  scores  are  included  in  the  aggregate
+              scores  read  from  the input (provided so that they can be sub-
+              tracted out), and oldLmScale and oldWtScale are the old  LM  and
+              word transition weights, respectively.
+              Lines  in  file  that  start with escapeString are copied to the
+              output.
+
+       void setState(const char *state)
+              This is a generic interface to change the internal ``state''  of
+              an LM.  The default implementation of this function does nothing,
+              but certain LM subclass implementations may interpret the  state
+              string to assume different internal configurations.
+
+       Prob wordProbSum(const VocabIndex *context)
+              Returns  the  sum  of all word probabilities in context.  Useful
+              for checking the well-definedness of a model.
+
+       VocabIndex generateWord(const VocabIndex *context)
+              Returns a word index from  the  vocabulary,  randomly  generated
+              according to the conditional probabilities in context.
+
+       VocabIndex   *generateSentence(unsigned   maxWords  =  maxWordsPerLine,
+       VocabIndex *sentence = 0)
+
+       VocabString  *generateSentence(unsigned  maxWords  =   maxWordsPerLine,
+       VocabString *sentence = 0)
+              Generates  a  random  sentence  of  length  up to maxWords.  The
+              result is placed in sentence if specified, or in a static buffer
+              otherwise.
+
+       void *contextID(const VocabIndex *context)
+              Returns  an  implementation-dependent value that identifies  the
+              word context used to compute a  conditional  probability.   (The
+              context  actually  used may be shorter than what is specified in
+              context).
+
+       Boolean isNonWord(VocabIndex word)
+              Returns true if word is a regular word in the LM, i.e., one that
+              the  LM computes probabilities for (as opposed to non-event tags
+              such as sentence-start).
+
+       Boolean read(File &file, Boolean limitVocab = false)
+              Reads an LM from file.  Returns true if the file  contents  were
+              formatted correctly and an internal LM representation  could  be
+              successfully  constructed  from  it.   The optional 2nd argument
+              controls whether words not already in the vocabulary are  to  be
+              added automatically.
+
+       void write(File &file)
+              Writes the LM to file in a format  that  can  be  read  back  by
+              read().
+
+       Vocab &vocab
+              The  vocabulary  object  associated  with LM (set at initializa-
+              tion).
+
+       VocabIndex noiseIndex
+              The index of the noise tag, i.e., a word that  is  skipped  when
+              computing probabilities.
+
+       const char *stateTag
+              A string introducing ``state'' information that should be passed
+              to the LM.  Input lines starting with this  tag  are  handed  to
+              setState() by pplFile() and rescoreFile().
+
+       Boolean reverseWords
+              If set to true, the LM reverses word order before computing sen-
+              tence probabilities.  This means wordProb() is expected to  com-
+              pute conditional probabilities based on right contexts.
+
+SEE ALSO
+       Vocab(3).
+
+BUGS
+AUTHOR
+       Andreas Stolcke <stolcke@icsi.berkeley.edu>.
+       Copyright (c) 1995-1996 SRI International
+
+
+
+SRILM                    $Date: 2019/09/09 22:35:37 $                    LM(3)
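The LM(3) page above states that wordProb() takes its history in reversed order (most recent word first) and terminated by Vocab_None, and that sentenceProb() returns the total log probability of a sentence. The sketch below illustrates only that calling convention. WordId, kNone, reversedContext, sentenceLogProb, and toyWordProb are hypothetical names; the sentinel value chosen for kNone is an assumption, and the summation ignores sentence-boundary tags and statistics that a real implementation would handle.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned WordId;                            // stand-in for VocabIndex
const WordId kNone = static_cast<WordId>(-1);       // stand-in for Vocab_None (assumed value)

// Builds the history for position i: most recent word first, kNone-terminated.
std::vector<WordId> reversedContext(const std::vector<WordId> &sentence,
                                    std::size_t i) {
    std::vector<WordId> context;
    for (std::size_t k = i; k > 0; --k) {
        context.push_back(sentence[k - 1]);
    }
    context.push_back(kNone);
    return context;
}

// Sums per-word log probabilities, passing each word its reversed history.
double sentenceLogProb(const std::vector<WordId> &sentence,
                       double (*wordProb)(WordId, const WordId *)) {
    double total = 0.0;
    for (std::size_t i = 0; i < sentence.size(); ++i) {
        std::vector<WordId> context = reversedContext(sentence, i);
        total += wordProb(sentence[i], context.data());
    }
    return total;
}

// Toy scorer used only to exercise the convention: the score depends on
// how many history words precede the kNone terminator.
double toyWordProb(WordId, const WordId *context) {
    std::size_t n = 0;
    while (context[n] != kNone) ++n;
    return -1.0 - static_cast<double>(n);
}
```

Putting the most recent word first lets a model that only needs the last few words read a prefix of the array and stop, without first finding the end of the history.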
--- a/language_model/srilm-1.7.3/man/cat3/Prob.3
+++ b/language_model/srilm-1.7.3/man/cat3/Prob.3
@@ -0,0 +1,79 @@
+Prob(3)                    Library Functions Manual                    Prob(3)
+
+
+
+NAME
+       Prob - Probabilities for SRILM
+
+SYNOPSIS
+       #include <Prob.h>
+
+DESCRIPTION
+       Prob is a collection of types, constants and utility functions for han-
+       dling probabilities in the SRILM library.
+
+TYPES
+       Prob   A floating point number representing a probability.
+
+       LogP   Logarithm to base 10 of a probability.
+
+CONSTANTS
+       LogP_Zero
+              Log of probability 0.
+
+       LogP_Inf
+              Log  of  probability  infinity  (not  a  legal  probability,  of
+              course).
+
+       LogP_One
+              Log of probability 1.
+
+       LogP_Precision
+              The number of significant digits in a LogP.
+
+       Prob_Epsilon
+              A  positive  value  close  to 0; probability sums less than this
+              should be considered effectively zero.
+
+FUNCTIONS
+       Boolean parseLogP(const char *string, LogP &prob)
+              Converts a floating point string  representation  into  a  LogP.
+              Returns true iff the number was parsed correctly.  This function
+              should be much faster  than  generic  C  library  functions  for
+              floating   point  parsing.   Also,  it  parses  singular  LogP's
+              (plus/minus infinity) correctly.
+
+       Prob LogPtoPPL(LogP prob)
+              Converts a LogP into a perplexity (PPL).
+
+       LogP ProbToLogP(Prob prob)
+              Converts a probability into a LogP.
+
+       LogP MixLogP(LogP prob1, LogP prob2, double lambda)
+              Computes the LogP resulting from interpolating two  LogP's.   If
+              p1  and  p2  are probabilities corresponding to prob1 and prob2,
+              respectively, then the  result  is  the  LogP  corresponding  to
+              lambda * p1 + (1 - lambda) * p2.
+
+       The  following  functions  deal with bytelogs.  Bytelogs are logarithms
+       scaled to represent probabilities and likelihoods as a short integer in
+       SRI's DECIPHER(TM) recognizer (bytelog(p) = log(p) * 10000.5 / 1024).
+
+       double ProbToBytelog(Prob prob)
+              Converts a probability to a bytelog.
+
+       double LogPtoBytelog(LogP prob)
+              Converts a LogP to a bytelog.
+
+       LogP BytelogToLogP(double bytelog)
+              Converts a bytelog to a LogP.
+
+SEE ALSO
+BUGS
+AUTHOR
+       Andreas Stolcke <stolcke@icsi.berkeley.edu>
+       Copyright (c) 1995-1996 SRI International
+
+
+
+SRILM                    $Date: 2019/09/09 22:35:37 $                  Prob(3)
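The Prob(3) page above defines LogP as a base-10 logarithm and gives the interpolation rule lambda * p1 + (1 - lambda) * p2 for MixLogP. As a worked illustration of that formula only, the sketch below converts to probabilities, mixes, and converts back. The names LogP10, logToProb, probToLog, and mixLog are hypothetical, and the direct pow/log10 round trip is an assumption; the library's own routine may well be more careful about numerical range.

```cpp
#include <cassert>
#include <cmath>

typedef double LogP10;   // stand-in for LogP: base-10 log of a probability

// Converts a base-10 log probability back to a probability.
double logToProb(LogP10 lp) {
    return std::pow(10.0, lp);
}

// Converts a probability (assumed > 0 here) to a base-10 log probability.
LogP10 probToLog(double p) {
    return std::log10(p);
}

// Interpolates in the probability domain, then returns to the log domain:
// log10(lambda * p1 + (1 - lambda) * p2).
LogP10 mixLog(LogP10 lp1, LogP10 lp2, double lambda) {
    double p1 = logToProb(lp1);
    double p2 = logToProb(lp2);
    return probToLog(lambda * p1 + (1.0 - lambda) * p2);
}
```

For example, mixing probabilities 0.2 and 0.4 with lambda = 0.5 gives 0.3 in the probability domain, so the result is log10(0.3), which is not the average of the two log values.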
--- a/language_model/srilm-1.7.3/man/cat3/Vocab.3
+++ b/language_model/srilm-1.7.3/man/cat3/Vocab.3
@@ -0,0 +1,241 @@
+Vocab(3)                   Library Functions Manual                   Vocab(3)
+
+
+
+NAME
+       Vocab - Vocabulary indexing for SRILM
+
+SYNOPSIS
+       #include <Vocab.h>
+
+DESCRIPTION
+       The  Vocab class represents sets of string tokens as typically used for
+       vocabularies, word class names, etc.  Additionally,  Vocab  provides  a
+       mapping  from  such  string tokens (type VocabString) to integers (type
+       VocabIndex).  VocabIndex values are typically used to  index  words  in
+       language  models to conserve space and speed up comparisons etc.  Thus,
+       Vocab essentially implements a symbol table into which strings  can  be
+       ``interned.''
+
+TYPES
+       VocabIndex
+              A non-negative integer for representing a string internally.
+
+       VocabString
+              A character array representing a vocabulary item (e.g., a word).
+
+CONSTANTS
+       maxWordLength
+              Maximum number of characters in a VocabString.
+
+       Vocab_None
+              A  special  VocabIndex  used to denote no vocabulary item and to
+              terminate VocabIndex arrays.
+
+       Vocab_Unknown
+
+       Vocab_SentStart
+
+       Vocab_SentEnd
+
+       Vocab_Pause
+              Default VocabString values for some common,  predefined  vocabu-
+              lary  items:  unknown  word,  sentence  begin, sentence end, and
+              pause, respectively.
+
+CLASS MEMBERS
+       Vocab(VocabIndex start = 0, VocabIndex end = 0x7fffffff)
+              When initializing a Vocab object, start and end  optionally  set
+              the minimum and maximum VocabIndex values assigned by the vocab-
+              ulary.  Indices are allocated in increasing  order  starting  at
+              start.
+
+       VocabIndex addWord(VocabString name)
+              Looks up the index of a word string name, adding the word if not
+              already part of the vocabulary.
+
+       VocabString getWord(VocabIndex index)
+              Returns the VocabString for index,  or  0  if  the  index  isn't
+              defined.
+
+       VocabIndex getIndex(VocabString name)
+              Returns  the VocabIndex for word name, or Vocab_None if the word
+              isn't defined.  (Unlike addWord(),  this  will  not  extend  the
+              vocabulary if the word is undefined.)
+
+       void remove(VocabString name)
+
+       void remove(VocabIndex index)
+              Deletes a vocabulary item, either by name or by index.
+
+       unsigned int numWords()
+              Returns the number of current vocabulary entries.
+
+       VocabIndex highIndex()
+              Returns  the highest VocabIndex value assigned so far.  The next
+              word added will receive an index  that  is  one  greater.   When
+              allocating various meaningful vocabulary subsets into contiguous
+              ranges, this function can be used to determine the corresponding
+              boundaries  in  VocabIndex  space,  and then use these values to
+              test subset membership etc.
+
+       VocabIndex unkIndex
+              The  index  of  the  unknown  word  (by  default   assigned   to
+              Vocab_Unknown).
+
+       VocabIndex ssIndex
+              The  index  of  the  sentence-start  tag (by default assigned to
+              Vocab_SentStart).
+
+       VocabIndex seIndex
+              The index of  the  sentence-end  tag  (by  default  assigned  to
+              Vocab_SentEnd).
+
+       VocabIndex pauseIndex
+              The index of the pause tag (by default assigned to Vocab_Pause).
+
+       Boolean unkIsWord
+              When  true,  the  unknown  word  is  considered  a  regular word
+              (default false).
+
+       Boolean toLower
+              When true, all word strings are mapped to  lowercase.   This  is
+              convenient to combine vocabularies, language models, etc., whose
+              vocabularies differ only in the case convention (default false).
+
+       Boolean isNonEvent(VocabString word)
+
+       Boolean isNonEvent(VocabIndex word)
+              Tests a word string or index for being a ``non-event'', i.e.,  a
+              token  that is not assigned probability in a language model.  By
+              default, sentence-start, pauses,  and  unknown  words  are  non-
+              events.
+
+       unsigned read(File &file)
+              Reads  word strings from a file and adds them to the vocabulary.
+              For convenience, only the first word on each line is significant
+              (so  extra  information  could  be  contained  in  such a file).
+              Returns the number of words read.
+
+       void write(File &file, Boolean sorted = true)
+              Writes the vocabulary strings to a file in a  format  compatible
+              with read().  The sorted argument controls whether the output is
+              lexicographically sorted.
+
+       Often one wants to manipulate not  single  vocabulary  items,  but
+       strings of them, e.g., to represent sentences.  Word strings are repre-
+       sented as self-delimiting arrays of type VocabString * or VocabIndex *.
+       The last element in a string is 0 or Vocab_None, respectively.
+
+       unsigned  getWords(const VocabIndex *wids, VocabString *words, unsigned
+       max)
+              Extends getWord() to strings of words.  The result is placed  in
+              words, which must have room for at least max words.  Returns the
+              actual number of indices in wids.
+
+       unsigned addWords(const VocabString *words, VocabIndex *wids,  unsigned
+       max)
+              Extends  addWord()  to  strings of words.  The result is placed
+              in wids, which must have room for at least max indices.  Returns
+              the actual number of words in words.
+
+       unsigned   getIndices(const   VocabString   *words,  VocabIndex  *wids,
+       unsigned max)
+              Extends getIndex() to strings of words.  The result  is  placed
+              in wids, which must have room for at least max indices.  Returns
+              the actual number of words in words.
+
+FUNCTIONS
+       The following static  member  functions  are  utilities  to  manipulate
+       strings of vocabulary items, independent of a particular vocabulary.
+
+       unsigned parseWords(char *line, VocabString *words, unsigned max)
+              Parses  a character string line into whitespace-delimited words.
+              On return, words contains pointers to null-terminated substrings
+              of line (whose contents are modified in the process). words must
+              have room for at least max pointers.  Returns the actual  number
+              of words parsed.
+
+       unsigned length(const VocabIndex *words)
+
+       unsigned length(const VocabString *words)
+              Returns the number of items in a word string.
+
+       Boolean contains(const VocabIndex *words, VocabIndex word)
+              Returns true if the word occurs among words.
+
+       VocabIndex *reverse(VocabIndex *words)
+
+       VocabString *reverse(VocabString *words)
+              Reverses  a  string  of  words  in  place  (and  returns it as a
+              result).
+
+       void write(File &file, const VocabString *words)
+              Writes a string of space-delimited words to a file.
+
+       int compare(VocabIndex word1, VocabIndex word2)
+
+       int compare(VocabString word1, VocabString word2)
+              Compares two vocabulary items lexicographically.  Returns -1, 0,
+              +1 for less than, equal, or greater than, respectively.
+
+       int compare(const VocabIndex *words1, const VocabIndex *words2)
+
+       int compare(const VocabString *words1, const VocabString *words2)
+              Extends the order of compare() to strings of words.
+
+       For compatibility with the C library calling conventions, compare() can-
+       not be a member function of a Vocab object.   For  index-based  compar-
+       isons  the  associated  vocabulary  needs  to be set globally.  This is
+       achieved by calling the  compareIndex()  member  function  of  a  Vocab
+       object.
+
+       ostream &operator<< (ostream &, const VocabString *words)
+
+       ostream &operator<< (ostream &, const VocabIndex *words)
+              These  operators  output  strings of words to a stream.  For the
+              second variant, the Vocab object used for  interpreting  indices
+              needs  to  be  identified  globally  by calling the use() member
+              function on the object.
+
+ITERATORS
+       The VocabIter class provides iteration over vocabularies.  An iteration
+       returns  the elements of a Vocab in some unspecified, but deterministic
+       order.
+
+       When copied or used  in  initialization  of  other  objects,  VocabIter
+       objects  retain  the current ``position'' in an iteration.  This allows
+       nested iterations that enumerate all pairs of distinct elements, etc.
+
+       NOTE: While an iteration over a Vocab object is ongoing,  no  modifica-
+       tions  are  allowed  to  the  object,  except removal of the ``current''
+       vocabulary item.
+
+       VocabIter(Vocab &vocab, Boolean sorted = false)
+              Creates an iteration over vocab.  If sorted is set to  true  the
+              vocabulary items will be enumerated in lexicographic order.
+
+       void init()
+              Reinitializes the iteration to its beginning.
+
+       VocabString next()
+
+       VocabString next(VocabIndex &index)
+              Steps  the  iteration and returns the next word string.  Option-
+              ally, the associated word index is returned in index.  Returns 0
+              if the vocabulary is exhausted.
+
+SEE ALSO
+       LM(3), File(3)
+
+BUGS
+       There  is  no good way to synchronize VocabIndex values across multiple
+       Vocab objects.
+
+AUTHOR
+       Andreas Stolcke <stolcke@icsi.berkeley.edu>
+       Copyright (c) 1995-1996 SRI International
+
+
+
+SRILM                    $Date: 2019/09/09 22:35:37 $                 Vocab(3)
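The Vocab(3) page above describes a symbol table that interns strings as integer indices: addWord() returns an existing index or allocates the next one, while getIndex() looks a word up without extending the vocabulary. The sketch below shows that behaviour with standard containers, as an illustration only. Interner, WordId, and kNone are hypothetical names, the sentinel value is an assumption, and removal, index ranges, and case mapping are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

typedef unsigned WordId;                            // stand-in for VocabIndex
const WordId kNone = static_cast<WordId>(-1);       // stand-in for Vocab_None (assumed value)

// Hypothetical minimal symbol table; not SRILM's Vocab class.
class Interner {
public:
    // Returns the index of name, adding it with the next free index if absent.
    WordId addWord(const std::string &name) {
        auto it = index_.find(name);
        if (it != index_.end()) return it->second;
        WordId id = static_cast<WordId>(words_.size());
        words_.push_back(name);
        index_[name] = id;
        return id;
    }

    // Looks up name without extending the table; kNone if it is not defined.
    WordId getIndex(const std::string &name) const {
        auto it = index_.find(name);
        return it == index_.end() ? kNone : it->second;
    }

    // Returns the string for id, or nullptr if the index is not defined.
    // The pointer is only guaranteed valid until the next addWord() call.
    const char *getWord(WordId id) const {
        return id < words_.size() ? words_[id].c_str() : nullptr;
    }

    std::size_t numWords() const { return words_.size(); }

private:
    std::vector<std::string> words_;                    // index -> string
    std::unordered_map<std::string, WordId> index_;     // string -> index
};
```

Once words are interned, comparing two of them is a single integer comparison, which is the space and speed benefit the description refers to.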