4.5.1 N-Best Generation
Until this point, we have only determined the single best hypothesis
for an utterance. The recognizer considers many other "guesses" during
the search, and although the best guess is usually all we need, some
applications require the ten or twenty best hypotheses. For this
experiment, we will generate the five best hypotheses and store the
results in a transcription database.
Go to the directory
$ISIP_TUTORIAL/sections/s04/s04_05_p01/
Run the command:
isip_recognize -parameter_file params_decode.sof -list $ISIP_TUTORIAL/databases/lists/identifiers_test.sof -verbose ALL
This will produce the following output:
Command: isip_recognize -parameter_file params_decode.sof -list $ISIP_TUTORIAL/databases/lists/identifiers_test.sof -verbose ALL
Version: 1.16 (not released) 2002/09/25 00:20:53
loading front-end: ../../recipes/frontend.sof
loading language model: ../../models/lm_model_update.sof
loading acoustic model: ../../models/ac_model_update.sof
loading audio database: ./audio_db.sof
opening the output file: ./hypo.out
processing file 1 (st_9z59362a): ../../features/st_9z59362a.sof
ref:
hyp[1]: NINE ZERO FIVE NINE THREE SIX TWO
score[1]: -18749.466796875 frames: 294
ref:
hyp[2]: NINE ZERO FOUR NINE THREE SIX TWO
score[2]: -18750.454832132 frames: 294
ref:
hyp[3]: NINE ZERO FIVE NINE THREE SIX SIX
score[3]: -18751.543258765 frames: 294
ref:
hyp[4]: FIVE ZERO FIVE NINE THREE SIX TWO
score[4]: -18753.238435132 frames: 294
ref:
hyp[5]: FIVE ZERO FIVE THREE THREE SIX TWO
score[5]: -18755.321462354 frames: 294
processed 1 file(s) successfully, attempted 1 file(s), 294 frame(s)
In this experiment, the output file contains the N best hypotheses
instead of just the single best hypothesis. The score values indicate
the likelihood of each of the N best hypotheses. The scores are
negative here, so a higher likelihood corresponds to a score closer to
zero. The lower the likelihood, the lower the confidence in that
hypothesis. Note that the best hypothesis, hyp[1], has the highest
overall likelihood.
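The ordering described above can be checked with a short script. The following is a minimal sketch in Python, not part of the ISIP toolkit: it assumes the screen output has been captured as text and that each hypothesis appears as a hyp[k] line followed by a score[k] line, exactly as in the listing above. The names parse_nbest and LOG are hypothetical, and the layout of the hypo.out file itself is not assumed here.

```python
import re

# Screen output copied from the listing above (hyp/score lines only).
LOG = """\
hyp[1]: NINE ZERO FIVE NINE THREE SIX TWO
score[1]: -18749.466796875 frames: 294
hyp[2]: NINE ZERO FOUR NINE THREE SIX TWO
score[2]: -18750.454832132 frames: 294
hyp[3]: NINE ZERO FIVE NINE THREE SIX SIX
score[3]: -18751.543258765 frames: 294
hyp[4]: FIVE ZERO FIVE NINE THREE SIX TWO
score[4]: -18753.238435132 frames: 294
hyp[5]: FIVE ZERO FIVE THREE THREE SIX TWO
score[5]: -18755.321462354 frames: 294
"""

# Assumed line formats, inferred only from the listing above.
HYP_RE = re.compile(r"^hyp\[(\d+)\]:\s*(.*)$")
SCORE_RE = re.compile(r"^score\[(\d+)\]:\s*(-?\d+(?:\.\d+)?)\s+frames:\s*(\d+)$")


def parse_nbest(text):
    """Return a list of (rank, score, words) tuples sorted by rank."""
    words_by_rank = {}
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        hyp = HYP_RE.match(line)
        if hyp:
            words_by_rank[int(hyp.group(1))] = hyp.group(2).split()
            continue
        score = SCORE_RE.match(line)
        if score:
            rank = int(score.group(1))
            entries.append((rank, float(score.group(2)), words_by_rank.get(rank, [])))
    return sorted(entries)


nbest = parse_nbest(LOG)
scores = [score for _, score, _ in nbest]

# The list is in decreasing likelihood if every score is >= the next one.
is_sorted = all(a >= b for a, b in zip(scores, scores[1:]))

for rank, score, words in nbest:
    print(rank, score, " ".join(words))
print("decreasing likelihood:", is_sorted)
```

Because the scores are negative, "decreasing likelihood" means each score is further from zero than the one before it, which is what the is_sorted check tests.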
As discussed above, the output generated by N-Best begins with the
best hypothesis, followed by the other hypotheses in order of
decreasing likelihood. This output format, a ranked list of
hypotheses, is commonly known as an N-best list. N-best lists are
popular because they can be postprocessed by many natural language
processing tools and reordered based on their grammatical and semantic
content. N-best lists are also popular in hybrid HMM-SVM based
systems. In such systems, the first recognition pass is performed
using a conventional continuous density HMM based system to generate
the N-best lists. These N-best lists are then used to perform a second
recognition pass using an SVM based system.
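The two-pass idea can be illustrated with a small reordering sketch in Python. This is only an illustration under stated assumptions: rescore and toy_second_pass are hypothetical names, the weight is arbitrary, and the toy scorer (a fixed penalty for each immediately repeated word) merely stands in for a real second-pass model such as an SVM or a language-processing tool. The first-pass scores and word strings are taken from the output listing above.

```python
def rescore(nbest, second_pass, weight=1.0):
    """Combine first-pass scores with a second-pass score and re-sort.

    nbest is a list of (first_pass_score, words) pairs. The combined
    score is first_pass_score + weight * second_pass(words), and the
    result is sorted so that the highest combined score comes first.
    """
    combined = [(first + weight * second_pass(words), words) for first, words in nbest]
    return sorted(combined, key=lambda item: item[0], reverse=True)


def toy_second_pass(words):
    """Hypothetical stand-in scorer: -5 for each immediately repeated word."""
    repeats = sum(1 for a, b in zip(words, words[1:]) if a == b)
    return -5.0 * repeats


# First-pass N-best list, copied from the output listing above.
nbest = [
    (-18749.466796875, "NINE ZERO FIVE NINE THREE SIX TWO".split()),
    (-18750.454832132, "NINE ZERO FOUR NINE THREE SIX TWO".split()),
    (-18751.543258765, "NINE ZERO FIVE NINE THREE SIX SIX".split()),
    (-18753.238435132, "FIVE ZERO FIVE NINE THREE SIX TWO".split()),
    (-18755.321462354, "FIVE ZERO FIVE THREE THREE SIX TWO".split()),
]

reordered = rescore(nbest, toy_second_pass)
for score, words in reordered:
    print(round(score, 3), " ".join(words))
```

With this toy scorer, the two hypotheses containing a repeated word (SIX SIX and THREE THREE) are pushed down the list, while the first-pass best hypothesis stays on top. A real second pass would replace toy_second_pass with a trained model and would normally tune the weight on held-out data.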