The production system represents an implementation
of a hierarchical search approach to speech recognition,
as described in:
N. Deshmukh, A. Ganapathiraju and J. Picone,
"Hierarchical Search for Large Vocabulary Conversational
Speech Recognition,"
IEEE Signal Processing Magazine,
vol. 16, no. 5, pp. 84-107, September 1999.
Such an approach is extremely flexible, but places a significant
burden on the user in terms of system configuration.
Currently, very few speech recognition
toolkits provide flexible tools to support model configuration. In
many cases, the models or even the source code of the toolkit need
to be manually modified to accommodate new tasks.
To alleviate this burden,
we have recently released a configuration tool:
isip_network_builder. Network Builder is a powerful GUI-based tool
that automates building language models and designing acoustic models.
This tutorial will give you a basic overview of how to use
Network Builder. It is structured as follows: Installation, Interface,
Example, and Support.
Installation
To use Network Builder, you first need to install our latest
release of the
Production System (r00_n10).
Suppose the production system
is installed at $isip_r00_n10 in your environment. There are two
ways to install Network Builder:
- With CVS Client:
  - cd $isip_r00_n10
  - cvs checkout util/speech/isip_network_converter
  - cvs checkout util/speech/isip_network_builder
  - cd $isip_r00_n10/util/speech/isip_network_converter
  - make install
  - cd $isip_r00_n10/util/speech/isip_network_builder
  - make install
- Without CVS Client:
  - Download isip_network_builder and isip_network_converter to $isip_r00_n10/util/speech.
  - cd $isip_r00_n10/util/speech/
  - tar -xzvf builder_and_converter.tar.gz
  - cd $isip_r00_n10/util/speech/isip_network_converter
  - make install
  - cd $isip_r00_n10/util/speech/isip_network_builder
  - make install
Note that isip_network_converter is a driver program that
is required by isip_network_builder to support multiple
file formats.
After installation, you can run isip_network_builder to start
the program.
Interface
The basic idea of Network Builder is to represent knowledge sources
for speech recognizers in a generalized hierarchy. These knowledge
sources include the language model, pronunciation model and HMM topologies.
The following figure shows a typical hierarchy for a continuous
digit speech recognizer. At the top level (referred to as the word level),
the language model for sentences is represented as a
digraph with loop-back arcs, which allow any digit to follow any other digit
(often referred to as a null grammar).
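A null grammar of this kind can be sketched in a few lines of Python. The sketch is illustrative only: the digit vocabulary, the START and STOP labels, and the function names are assumptions made for this example, not part of isip_network_builder or its file formats.

```python
# Hypothetical digit vocabulary; the actual symbol set depends on the task.
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def build_null_grammar(symbols):
    """Return a loop-back digraph as a map: vertex -> set of successors."""
    arcs = {"START": set(symbols), "STOP": set()}
    for sym in symbols:
        # Any symbol may be followed by any symbol (itself included),
        # or the sentence may end here.
        arcs[sym] = set(symbols) | {"STOP"}
    return arcs


def accepts(arcs, sentence):
    """Check whether a word sequence is a path from START to STOP."""
    current = "START"
    for word in sentence:
        if word not in arcs[current]:
            return False
        current = word
    return "STOP" in arcs[current]


grammar = build_null_grammar(DIGITS)
print(accepts(grammar, ["one", "one", "nine"]))   # a digit string
print(accepts(grammar, ["one", "eleven"]))        # "eleven" is not a symbol
```

Because every digit has an arc to every digit, the grammar places no constraint on word order; it only restricts the vocabulary.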
In such a generalized hierarchical structure, all levels are
conceptually the same. Phrases, words, and phones are
simply symbols at different levels. Each level contains a list of
symbols, S_ij, and a list of graphs, G_ik,
where i is the index of the level, j is the index of the symbol,
and k is the index of the graph.
Each graph has at least two dummy vertices: the start vertex
and the terminal vertex, indicating, respectively, the start and the end points
of the graph in a search space.
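One way to picture this structure is as a small set of Python data classes. This is a minimal sketch under stated assumptions: the class and method names, the zero-based indexing, and the placeholder labels for the dummy vertices are all hypothetical, and the sketch does not reflect how isip_network_builder stores its data internally.

```python
from dataclasses import dataclass, field

# Assumed convention: every graph reserves index 0 for the dummy start
# vertex and index 1 for the dummy terminal vertex.
START, TERM = 0, 1


@dataclass
class Graph:
    """A graph at some level: vertices carry symbols, arcs are index pairs."""
    name: str
    vertices: list = field(default_factory=lambda: ["<start>", "<term>"])
    arcs: list = field(default_factory=list)

    def add_vertex(self, symbol):
        """Append a vertex labeled with a symbol and return its index."""
        self.vertices.append(symbol)
        return len(self.vertices) - 1

    def add_arc(self, source, target):
        """Add a directed arc between two vertex indices."""
        self.arcs.append((source, target))


@dataclass
class Level:
    """One level of the hierarchy: a symbol list and a graph list."""
    name: str
    symbols: list = field(default_factory=list)
    graphs: list = field(default_factory=list)


@dataclass
class Hierarchy:
    levels: list = field(default_factory=list)

    def symbol(self, i, j):
        """Symbol j of level i (zero-based)."""
        return self.levels[i].symbols[j]

    def graph(self, i, k):
        """Graph k of level i (zero-based)."""
        return self.levels[i].graphs[k]


sentence = Graph("sentence")
one = sentence.add_vertex("one")
sentence.add_arc(START, one)
sentence.add_arc(one, one)      # loop-back arc
sentence.add_arc(one, TERM)
hierarchy = Hierarchy([Level("word", ["one"], [sentence])])
print(hierarchy.symbol(0, 0), hierarchy.graph(0, 0).arcs)
```

The same three classes serve every level, which mirrors the point that phrases, words, and phones differ only in the level at which they appear.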
The main frame of isip_network_builder is divided into three
panels. The top-left panel, labeled Hierarchy, shows the
generalized hierarchical structure. The bottom-left
panel, labeled Graph List, shows the list of graphs at the
current level. The right panel is the drawing area, which allows users
to construct graphs by inserting nodes and arcs. The menu options
are explained below:
- File Menu:
  - New: Clears the current window and opens a blank file.
  - Open: Opens an existing network builder file.
  - Save: Saves the current file.
  - Save As: Saves the current file under a different file name.
  - Close: Closes the current file.
  - Exit: Exits network builder.
- Hierarchy Menu:
  - Add Level: Adds a level to the hierarchy.
  - Delete Level: Removes the currently selected level from the hierarchy.
  - Symbols: Opens the symbol table for the current level.
- Graph Menu:
  - Insert Start: Inserts a start node. Each graph must have a start node.
  - Insert Stop: Inserts a stop node. Each graph must have a stop node.
  - Insert Node: Inserts a node into the current graph.
  - Insert Arc: Inserts an arc between two nodes.
  - Insert Self Arc: Inserts a self-connecting arc.
  - Copy: Copies the selected node to the clipboard.
  - Cut: Copies and deletes the selected node.
  - Paste: Pastes the node from the clipboard.
  - Delete: Deletes the currently selected node.
  - Copy Graph: Copies the selected graph to the clipboard.
  - Paste Graph: Pastes the graph from the clipboard.
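The Graph Menu entries state that each graph must have a start node and a stop node. A check of that requirement might look like the sketch below. It is not code from isip_network_builder: the function name and data layout are hypothetical, and two details are assumptions added for illustration, namely that a graph has exactly one node of each kind and that the stop node should be reachable from the start node.

```python
from collections import deque


def check_graph(vertices, arcs):
    """Return a list of problems found in a graph; empty means none found.

    vertices: dict mapping vertex id -> kind ("start", "stop", or "node")
    arcs: list of (source id, target id) pairs
    """
    problems = []
    starts = [v for v, kind in vertices.items() if kind == "start"]
    stops = [v for v, kind in vertices.items() if kind == "stop"]
    if len(starts) != 1:
        problems.append("graph needs exactly one start node")
    if len(stops) != 1:
        problems.append("graph needs exactly one stop node")
    if problems:
        return problems

    # Breadth-first search from the start node over the directed arcs.
    seen = {starts[0]}
    queue = deque(seen)
    while queue:
        current = queue.popleft()
        for source, target in arcs:
            if source == current and target not in seen:
                seen.add(target)
                queue.append(target)
    if stops[0] not in seen:
        problems.append("stop node is not reachable from start node")
    return problems


vertices = {0: "start", 1: "node", 2: "stop"}
print(check_graph(vertices, [(0, 1), (1, 2)]))   # complete path
print(check_graph(vertices, [(0, 1)]))           # missing arc to stop
```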
Example
In this section, we provide a step-by-step example with three levels.
By the end of this example, you will be able to build a simple network using
isip_network_builder. Let's start from the beginning with
an empty screen, as shown below.
To create a new level, select Hierarchy --> Add Level from
the menu. The result is shown below.
To change the level name, click on the level name shown in the tree.
Change the level name to word. The result is shown below.
To add a start node to the graph, select Graph --> Insert Start
from the menu.
Place the start node by clicking on the drawing area.
The node will be inserted into the graph.
Similarly, insert another node and a stop node into the graph.
To insert an arc from the vertex Start to the vertex
Node, click on the vertex Start first. A selection
box will appear around the vertex.
Click on the vertex Node and an arc will be inserted into the graph.
The result is shown below.
Click on the vertex Node and the vertex Stop to
insert another arc.
To change the symbols represented by a vertex in a graph, right-click
the vertex. For example, right-click the vertex Node.
A dialog window appears that lets you
configure the symbols on the node.
Change the node name to one and add a symbol called one to this
node.
Click on the Add button. The symbol one is inserted into the
symbol list.
Click the OK button. You will see that the node name has changed in the
graph.
Add a second level called phone.
Construct the pronunciation model for the word one as follows.
Add a third level called state.
Construct the HMM for the phone w as follows.
After constructing all the models, you can save them to a file
in any of three different formats. You have now
successfully constructed a simple three-level model.
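To see how the three levels relate, the sketch below expands a word-level symbol down to state-level symbols. It rests on several assumptions that do not come from this tutorial: the pronunciation w ah n for the word one, three states per phone, the state names, and the simplification of each graph to a single linear path with no self arcs. It does not read or write any isip_network_builder file format.

```python
# Level names from the example, ordered from top to bottom.
LEVEL_ORDER = ["word", "phone", "state"]

# For each non-terminal level, map a symbol to the sequence of
# lower-level symbols along its (assumed linear) graph.
HIERARCHY = {
    "word": {"one": ["w", "ah", "n"]},            # assumed pronunciation
    "phone": {
        phone: [f"{phone}_s{i}" for i in (1, 2, 3)]   # assumed 3 states
        for phone in ("w", "ah", "n")
    },
}


def expand(symbol, level_index=0):
    """Recursively replace a symbol by the symbols of the levels below it."""
    level = LEVEL_ORDER[level_index]
    if level not in HIERARCHY:
        # Lowest level: state symbols are not expanded further.
        return [symbol]
    expanded = []
    for child in HIERARCHY[level][symbol]:
        expanded.extend(expand(child, level_index + 1))
    return expanded


print(expand("one"))
```

Each level only refers to symbols of the level directly beneath it, so adding a level (for example, phrases above words) would need a new table entry but no change to the expansion logic.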
Support
For more information about this tool, please refer to the FAQ section of the
user's guide or direct your questions to
ies_help@cavs.msstate.edu.