3.5.1 MFCC Example: Basic MFCCs
The Mel-Frequency Cepstrum Coefficients (MFCC) front end is a popular choice
for state-of-the-art speech recognition systems.
This method uses 12 absolute MFCCs, which are described in Section 3.3.3,
as well as the first- and second-order derivatives of those coefficients.
The picture below illustrates the general steps required to build this
front end.
See our
on-line workshop notes
for a more detailed description of the theory underlying this method.
This section explains how to create this front end using our
software. Two variations of the MFCC front end are presented.
The method shown in the diagram above is described below.
Enhancements to this method, including energy normalization and
cepstral mean subtraction, are described in Section 3.5.2.
In the example of a basic MFCC front end described below, we use
absolute energy, 12 MFCCs (often referred to as absolute MFCCs),
and the first- and second-order derivatives of these absolute
coefficients. These are concatenated into a single feature vector as follows:
| 13 | Absolute    | Energy (1) and MFCCs (12) |
| 13 | Delta       | First-order derivatives of the 13 absolute coefficients |
| 13 | Delta-Delta | Second-order derivatives of the 13 absolute coefficients |
| 39 | Total       | Basic MFCC Front End |
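The 39-dimensional layout in the table above can be sketched in a few lines of Python. This is only an illustration of the concatenation, not the toolkit's implementation: the function names `delta` and `stack_features`, the regression-style derivative, its window of two frames, and the edge handling are all assumptions made for this sketch, and the derivative computed by the recipe may differ. The input values are synthetic.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], window: int = 2) -> List[Frame]:
    """Regression-style derivative over +/- `window` frames.

    Assumption for this sketch: frames beyond either end of the
    utterance are replaced by the first or last frame.
    """
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(absolute: List[Frame]) -> List[Frame]:
    """Concatenate absolute, delta, and delta-delta coefficients per frame."""
    d1 = delta(absolute)   # first-order derivatives (13 per frame)
    d2 = delta(d1)         # second-order derivatives (13 per frame)
    return [a + b + c for a, b, c in zip(absolute, d1, d2)]


# Synthetic input: 5 frames, each with 13 made-up absolute coefficients
# (1 energy value followed by 12 MFCCs).
absolute = [[float(t + k) for k in range(13)] for t in range(5)]
features = stack_features(absolute)
print(len(features), len(features[0]))  # prints "5 39"
```

Each output frame holds the 13 absolute coefficients first, then the 13 deltas, then the 13 delta-deltas, matching the row order of the table.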
Go to the directory $ISIP_TUTORIAL/sections/s03/s03_05_p01/.
Here you'll see the file recipe_basic_mfcc.sof.
To view the signal flow graph for this recipe, start Transform
Builder as shown:
isip_transform_builder recipe_basic_mfcc.sof
To test this recipe, run the following command:
isip_transform -debug BRIEF -param recipe_basic_mfcc.sof -type text -suffix _mfcc speech.sof
You may compare your feature file, stored in the file
named speech_basic_mfcc.sof, to our reference,
speech_mfcc.sof,
to verify that you have produced the proper result.
See
Section 3.3.1
for tips on comparison and verification.
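As a rough complement to those tips, a line-by-line comparison of two text files can be sketched in Python. The helper name `diff_text_files` is hypothetical, and the sketch assumes only that both files are plain text; it knows nothing about the Sof format and does not reproduce the procedure from Section 3.3.1. The demonstration runs on two throwaway files with made-up contents rather than on real feature files.

```python
import difflib
import os
import tempfile
from typing import List


def diff_text_files(path_a: str, path_b: str) -> List[str]:
    """Return unified-diff lines; an empty list means the files match.

    Trailing whitespace is ignored so that line-ending differences
    alone are not reported.
    """
    with open(path_a) as fa, open(path_b) as fb:
        a = [line.rstrip() for line in fa]
        b = [line.rstrip() for line in fb]
    return list(difflib.unified_diff(a, b, path_a, path_b, lineterm=""))


# Demonstration on temporary files with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    mine = os.path.join(tmp, "mine.txt")
    ref = os.path.join(tmp, "reference.txt")
    with open(mine, "w") as f:
        f.write("1.0 2.0\n3.0 4.0\n")
    with open(ref, "w") as f:
        f.write("1.0 2.0\n3.0 4.5\n")
    same = diff_text_files(mine, mine)
    changed = diff_text_files(mine, ref)

print(len(same) == 0, len(changed) > 0)  # prints "True True"
```

To use it on the files from this section, one would pass the paths of the generated feature file and the reference file. An exact text match is a strict test: small formatting differences in floating-point values would show up as mismatches even when the features agree numerically.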