.\" $Id: classes-format.5,v 1.3 2007/12/19 22:08:05 stolcke Exp $
.TH classes-format 5 "$Date: 2007/12/19 22:08:05 $" "SRILM File Formats"
.SH NAME
classes-format \- File format for word class definitions
.SH SYNOPSIS
.nf
\fIclass\fP [\fIp\fP] \fIword1\fP \fIword2\fP ...
.fi
.SH DESCRIPTION
Various programs dealing with word classes use this format to define
the possible expansions of classes and their respective probabilities.
Each expansion appears on a separate line as in
the synopsis, where
.I class
names a word class,
.I p
gives the probability for the class expansion, and
.I "word1 word2 ..."
defines the word string that the class expands to.
If
.I p
is omitted it is assumed to be 1.
(All expansion probabilities for a given class should sum to one.
The software does not necessarily enforce this, and probabilities that
do not sum to one would lead to improper models.)
.PP
Note that the concept of word class here is generalized to include
``multi-words'', or phrases consisting of more than one word.
All expansions must have at least one word.
Certain models might impose more restrictive formats.
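.PP
As an illustration only (the class names, words, and probabilities below
are made up for this example and are not taken from any distributed model),
a class definition file following the synopsis might look like this:
.nf
DAY 0.5 monday
DAY 0.3 tuesday
DAY 0.2 wednesday
CITY 0.6 new york
CITY 0.4 boston
GREETING good morning
.fi
In this sketch the expansions of DAY and of CITY each sum to one,
CITY and GREETING have multi-word expansions, and GREETING omits
.I p
so its single expansion is assumed to have probability 1.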
.SH "SEE ALSO"
ngram(1), ngram-class(1), disambig(1), training-scripts(1), pfsg-scripts(1).
.SH AUTHOR
Andreas Stolcke <stolcke@speech.sri.com>.
.br
Copyright 1999 SRI International