
 
 
  • Recognition of the Test Data:

    At this point we assume that a trained set of models is ready to be used to recognize the test data. If not, then please see this section for details on training a set of models.

    The decoder used for the evaluation is trace_projector. This decoder can be used in several modes. For the alphadigits task, we have defined a word graph in the data preparation section of the manual. So, we use the decoder in the Lattice Rescoring mode. Since our models are cross-word triphones, we need to set the appropriate options and parameters.

    A detailed tutorial on how to use multiple LMs simultaneously for decoding and switch LMs dynamically at runtime can be found here.

    • Required Data:

      Features for the test data - Use the extract_feature utility.

      Grammar/Lattice for the test data - We generated this in the data preparation stage. Note that we need one lattice file per test utterance, even when every utterance uses the same lattice, as is the case with alphadigits.

      Models from the final pass of training - states, model definitions, phone map, transitions and the lexicon.
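The one-lattice-per-utterance requirement above can be met by copying a single shared lattice once per utterance. The following is a minimal sketch under stated assumptions, not part of the ISIP toolkit: the helper name `replicate_lattice`, the `.lat` suffix, and the per-utterance naming scheme are all hypothetical and should be adapted to whatever file layout your parameter file expects.

```python
import shutil
from pathlib import Path


def replicate_lattice(shared_lattice, utterance_ids, out_dir, suffix=".lat"):
    """Copy one shared lattice to one file per utterance and return the new paths.

    The naming scheme <utterance_id><suffix> is an assumption for illustration.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for utt in utterance_ids:
        target = out / f"{utt}{suffix}"
        # Each utterance gets its own physical copy of the same lattice.
        shutil.copyfile(shared_lattice, target)
        paths.append(target)
    return paths
```

The returned paths can then be written, one per line, into whatever lattice list the decoder reads.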

    • Using the Decoder:

      Here is a parameter file we would use for the recognition process. Note that we specify the context_mode tag to be "cross_word".

      trace_projector -p params.text

      The pruning thresholds are set based on the complexity of the recognition task. Since alphadigits is a relatively simple task, the thresholds can be tight. For a task like Switchboard, they would be much higher. The "wdpenalty" option is very useful in controlling the insertion of short words like the letter "o" in the alphadigits task.
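To illustrate why a word penalty discourages spurious short words, here is a toy sketch. It assumes the common scheme in which a fixed penalty is added to a hypothesis score once per word; whether trace_projector applies "wdpenalty" with exactly this form and sign is an assumption, and the scores below are made-up numbers, not decoder output.

```python
def hypothesis_score(log_likelihood, n_words, word_penalty):
    """Total score = log-likelihood plus a fixed penalty per word (assumed form)."""
    return log_likelihood + n_words * word_penalty


def best_hypothesis(hypotheses, word_penalty):
    """Pick the highest-scoring (text, log_likelihood) pair and return its text."""
    return max(
        hypotheses,
        key=lambda h: hypothesis_score(h[1], len(h[0].split()), word_penalty),
    )[0]


# Hypothetical scores: the hypothesis with an extra "o" fits slightly better.
hypotheses = [("b o four", -100.0), ("b four", -102.0)]
no_penalty_choice = best_hypothesis(hypotheses, 0.0)
penalized_choice = best_hypothesis(hypotheses, -5.0)
```

With no penalty the three-word hypothesis wins on likelihood alone; with a penalty of -5 per word, its total drops to -115 against -112 for the two-word hypothesis, so the inserted "o" is no longer preferred.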


  • Word Error Rate (WER) Computation:

    We use the standard NIST scoring software for evaluation of recognition performance. We do, however, have a script that does the necessary format conversions to allow "sclite" to do the WER computation.

    The NIST tools expect a reference transcript in what we call the score format. The isip_eval utility is used to convert the output from the recognizer into this format and then to evaluate the recognition performance using the NIST tools. Here is the score file generated from the evaluation data we used. Since we ran the decoder to output data in the "word" format, we use isip_eval as follows:

    isip_eval isip_word output.list ref.score output.score

    The script outputs the error statistics to stdout and also creates a report file that contains the alignment, confusion pairs, etc.
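For readers unfamiliar with the metric itself, WER is conventionally defined as (substitutions + deletions + insertions) divided by the number of reference words, taken from a minimum edit-distance alignment. The sketch below computes it for a single utterance. It is not sclite or isip_eval and ignores their file formats and text normalization; the function names `align_counts` and `wer` are hypothetical.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) from a minimum-cost word alignment."""
    r, h = ref.split(), hyp.split()
    # cost[i][j] = minimum number of edits turning r[:i] into h[:j]
    cost = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(1, len(r) + 1):
        cost[i][0] = i
    for j in range(1, len(h) + 1):
        cost[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            diag = cost[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Walk back through the table to classify each edit.
    i, j = len(r), len(h)
    subs = dels = ins = 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (r[i - 1] != h[j - 1]):
            subs += r[i - 1] != h[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return subs, dels, ins


def wer(ref, hyp):
    """Word error rate in percent for one reference/hypothesis pair."""
    n_ref = len(ref.split())
    if n_ref == 0:
        raise ValueError("reference transcript is empty")
    return 100.0 * sum(align_counts(ref, hyp)) / n_ref
```

For example, scoring the hypothesis "a o b d" against the reference "a b c" gives one insertion ("o") and one substitution ("c" recognized as "d"), so two errors over three reference words. When several alignments have the same minimum cost, the split between error types can differ from what sclite reports, but the total error count does not.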


