@@ -191,11 +191,39 @@ bool LatticeBoost(const TransitionModel &trans,
191191 This function implements either the MPFE (minimum phone frame error) or SMBR
192192 (state-level minimum bayes risk) forward-backward, depending on whether
193193 "criterion" is "mpfe" or "smbr". It returns the MPFE
194- criterion of SMBR criterion for this file, and outputs the posteriors (which
195- may be positive or negative) into "arc_post".
196- Note: setting one_silence_class to false gives the old traditional behavior,
197- true gives a possibly improved behavior which will tend to reduce insertions
198- in the trained model.
194+ criterion or SMBR criterion for this utterance, and outputs the posteriors (which
195+ may be positive or negative) into "post".
196+
197+ @param [in] trans The transition model. Used to map the
198+ transition-ids to phones or pdfs.
199+ @param [in] silence_phones A list of integer ids of silence phones. The
200+ silence frames, i.e. the frames where num_ali
201+ corresponds to a silence phone, are treated specially.
202+ The behavior is determined by 'one_silence_class'
203+ being false (traditional behavior) or true.
204+ Usually in our setup, several phones including
205+ the silence, vocalized noise, non-spoken noise
206+ and unk are treated as "silence phones".
207+ @param [in] lat The denominator lattice
208+ @param [in] num_ali The numerator alignment
209+ @param [in] criterion The objective function. Must be "mpfe" or "smbr"
210+ for MPFE (minimum phone frame error) or sMBR
211+ (state-level minimum Bayes risk) training.
212+ @param [in] one_silence_class Determines how the silence frames are treated.
213+ Setting this to false gives the old traditional behavior,
214+ where the silence frames (according to num_ali) are
215+ treated as incorrect. However, this means that the
216+ insertions are not penalized by the objective.
217+ Setting this to true gives the new behavior, where we
218+ treat silence as any other phone, except that all pdfs
219+ of silence phones are collapsed into a single class for
220+ the frame-error computation. This can possibly reduce
221+ the insertions in the trained model. This is closer to
222+ the WER metric that we actually care about, since WER is
223+ generally computed after filtering out noises, but
224+ does penalize insertions.
225+ @param [out] post The "MBR posteriors", i.e. derivatives w.r.t. the
226+ pseudo log-likelihoods of states at each frame.
199227*/
200228BaseFloat LatticeForwardBackwardMpeVariants (
201229 const TransitionModel &trans,
@@ -212,12 +240,25 @@ BaseFloat LatticeForwardBackwardMpeVariants(
212240 used in our normal MMI training recipes, where it's instead done using various command
213241 line programs that each do a part of the job. This function was written for use in
214242 neural-net MMI training.
215- If drop_frames is true, it will not compute any posteriors on frames where the num and
216- den have disjoint pdf-ids.
217- If "convert_to_pdf_ids" is true, it will convert the output to be at the level of pdf-ids,
218- not transition-ids.
219- If "cancel" is true, it will cancel out any positive and negative parts from
220- the same transition-id (or pdf-id, if convert_to_pdf_ids == true).
243+
244+ @param [in] trans The transition model. Used to map the
245+ transition-ids to phones or pdfs.
246+ @param [in] lat The denominator lattice
247+ @param [in] num_ali The numerator alignment
248+ @param [in] drop_frames If "drop_frames" is true, it will not compute any
249+ posteriors on frames where the num and den have disjoint
250+ pdf-ids.
251+ @param [in] convert_to_pdf_ids If "convert_to_pdf_ids" is true, it will
252+ convert the output to be at the level of pdf-ids, not
253+ transition-ids.
254+ @param [in] cancel If "cancel" is true, it will cancel out any positive and
255+ negative parts from the same transition-id (or pdf-id,
256+ if convert_to_pdf_ids == true).
257+ @param [out] arc_post The output MMI posteriors of transition-ids (or
258+ pdf-ids if convert_to_pdf_ids == true) at each frame,
259+ i.e. the difference between the numerator
260+ and denominator posteriors.
261+
221262 It returns the forward-backward likelihood of the lattice. */
222263BaseFloat LatticeForwardBackwardMmi (
223264 const TransitionModel &trans,
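
The 'one_silence_class' description in the LatticeForwardBackwardMpeVariants comment can be read as a per-frame accuracy rule. The following is only a minimal sketch of that rule for the sMBR case, written from the comment text and not taken from the Kaldi sources. The helper name FrameAccuracy and its argument list are hypothetical, and the phone and pdf ids are assumed to have been looked up already via the transition model.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper (not a Kaldi function): per-frame accuracy as described
// in the doc comment. hyp_* come from a lattice arc, ref_* from num_ali.
double FrameAccuracy(int hyp_pdf, int hyp_phone, int ref_pdf, int ref_phone,
                     const std::set<int> &silence_phones,
                     bool one_silence_class) {
  assert(hyp_pdf >= 0 && ref_pdf >= 0);  // ids are assumed non-negative
  const bool hyp_is_sil = silence_phones.count(hyp_phone) > 0;
  const bool ref_is_sil = silence_phones.count(ref_phone) > 0;
  if (!one_silence_class) {
    // Traditional behavior: a frame whose reference is silence is never
    // counted as correct, even if the pdfs match.
    return (hyp_pdf == ref_pdf && !ref_is_sil) ? 1.0 : 0.0;
  }
  // New behavior: silence is scored like any other phone, except that all
  // pdfs of silence phones count as one class.
  return (hyp_pdf == ref_pdf || (hyp_is_sil && ref_is_sil)) ? 1.0 : 0.0;
}
```

Under this reading, a hypothesis that places speech on a reference-silence frame loses credit only when one_silence_class is true, which is how the comment motivates the expected reduction in insertions.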
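
The LatticeForwardBackwardMmi parameters 'drop_frames' and 'cancel' can be illustrated on a single frame. This is a rough sketch under stated assumptions, not the Kaldi implementation: MmiFramePosterior and the Posterior alias are hypothetical names, the denominator posterior for the frame is assumed to be given as a map from id to probability, and all ids are assumed to be at one level (pdf-ids, as with convert_to_pdf_ids == true).

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// (id, weight) pairs for one frame; weights may be positive or negative.
using Posterior = std::vector<std::pair<int, double>>;

// Hypothetical helper (not a Kaldi function): numerator minus denominator
// posterior for one frame. num_id is the id from the numerator alignment.
Posterior MmiFramePosterior(int num_id, const std::map<int, double> &den_post,
                            bool drop_frames, bool cancel) {
  assert(num_id >= 0);  // ids are assumed non-negative
  Posterior out;
  // drop_frames: emit nothing when the numerator id does not appear among
  // the denominator ids for this frame (the two sets are disjoint).
  if (drop_frames && den_post.count(num_id) == 0) return out;
  if (!cancel) {
    // Keep the positive and negative parts as separate entries.
    out.emplace_back(num_id, 1.0);
    for (const auto &kv : den_post) out.emplace_back(kv.first, -kv.second);
    return out;
  }
  // cancel: merge entries that share an id, so +1 and -p become 1 - p.
  std::map<int, double> merged;
  merged[num_id] += 1.0;
  for (const auto &kv : den_post) merged[kv.first] -= kv.second;
  for (const auto &kv : merged)
    if (kv.second != 0.0) out.emplace_back(kv.first, kv.second);
  return out;
}
```

For example, with a denominator posterior of 0.75 on id 3 and 0.25 on id 4 and a numerator id of 3, cancel == true yields two entries (weights 0.25 and -0.25), while cancel == false yields three.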