ACTIVE (Alphabetical Order By Directory):

  • Nonlinear Statistical Modeling of Speech

    Hidden Markov models (HMMs) have been the primary approach to speech recognition for almost 25 years. The goal of this project is to develop a new approach to statistical modeling of speech based on nonlinear statistics. Our first step will be to implement a speaker recognition system using a nonlinear time series approach to modeling the signal. This approach will be compared to our previous attempts to advance HMMs based on Support Vector Machines (SVMs) and Relevance Vector Machines (RVMs).

  • Internet-Accessible Speech Recognition Technology

    Speech recognition research remains a major activity for ISIP. Large vocabulary conversational speech recognition (LVCSR) is a fascinating technology that draws heavily from the diverse research areas of statistical pattern recognition, digital signal processing, artificial intelligence, linguistics, and information theory. On this web site you will find a powerful and flexible public domain speech recognition system written in C++.

INACTIVE (Alphabetical Order By Directory):

  • Vehicle Performance Monitoring System

    This project is a one-year collaboration with the Mississippi Department of Transportation (MDOT) to adapt and apply the Mississippi State University wireless web-based vehicle performance and monitoring system (VPMS), developed in the Campus Bus Networking project, to measure vehicle utilization, monitor performance in real time, and provide historical information on vehicle travel paths. This research will also provide key technology components to be incorporated into future transportation safety programs conducted by MDOT.

  • Campus Bus Networking

    Networked vehicles will be a cornerstone of the next generation intelligent transportation system. In this project, we are developing the hardware and software necessary to perform two-way communications with a vehicle, to track it, and to collect critical vehicle performance data. Visit our web page that tracks the campus bus system in real time.

  • IP Version 6 Research

    IP version 6 (IPv6) is the next generation Internet protocol that has the potential to drastically change the way we use the Internet as part of our everyday lives. We are exploring IPv6 and areas of research in which we can contribute to the development and deployment of this next generation protocol. We are currently investigating peer-to-peer IPv6 networks and applications, mobile IPv6, and high-performance routing.

  • Aurora Evaluation Of Speech Recognition Front Ends

    The goal of this project is to evaluate and compare the robustness of feature extraction algorithms on a large vocabulary task. The target application is cellular telephony. These evaluations are being conducted under the auspices of the Aurora Distributed Speech Recognition working group of the European Telecommunications Standards Institute (ETSI). The Wall Street Journal database (WSJ0) is being used as the basis for experiments.

  • Bulldog Stock Exchange

    As part of a unique entrepreneurship thrust in MS State's College of Engineering, EE Senior Design teams form companies. These companies are publicly traded on the Bulldog Stock Exchange. This simulation teaches our students about the intimate relationships between technology and business.

  • In-Vehicle Dialog Systems

    A voice interface is a superb tool for in-vehicle information access when your hands and eyes are busy. In this project, we are developing a dialog system that provides information about the university and its surrounding area. For example, a user can ask "Where is the nearest restaurant to my hotel?" or "How do I get from the airport to my hotel?"

  • A Japanese Command and Control Word Database

    The Japan Electronic Industry Development Association's Common Speech Data (JCSD) Corpus is an isolated phrase corpus consisting of 150 speakers (75 males/75 females) and almost 200,000 utterances. It represents an important milestone in Japanese speech recognition technology development. The JCSD Corpus was originally collected in 1986 in Japan in a nationwide project managed by Professor Shuichi Itahashi in coordination with the Japan Electronic Industry Development Association (JEIDA). Its importance to Japanese speech recognition technology development is, to some extent, comparable to Texas Instruments' famous 46-word speaker-dependent corpus. The JCSD Corpus was one of the first industry-standard and freely available corpora for the study of Japanese language speech recognition. Most of the competitive Japanese language speech recognition systems developed in Japan have been benchmarked on various subsets of this corpus. Hence, it is one of the most important standards of comparison that exist for Japanese language systems.

  • Automatic Pronunciation Generation

    Correct recognition of proper nouns is critical to problems in speech understanding and applications involving voice interfaces. The recognition system requires accurate pronunciation networks for correct recognition of such words. This is a challenging problem because a large number of proper nouns have multiple valid pronunciations that do not follow typical letter-to-sound conversion rules. Generating such pronunciation dictionaries by hand is highly impractical, and classical rule-based text-to-speech systems are unsuitable for this task because they inherently generate only a single pronunciation. ISIP has developed a suite of algorithms involving stochastic neural networks, decision trees, and other statistical techniques that are capable of automatically generating multiple pronunciations for proper nouns based only on the text-based spelling of the name.

  • Spoken Language Information Retrieval

    Our goal is to better understand how integration of prosodic information, speech recognition and parsing can impact the problem of information extraction from spoken documents. This research will provide initial steps towards information extraction from telephone messages, conversations, or university lectures, or from any text (such as encyclopedias), and can serve as the basis for a sorely needed sophisticated web browser technology and data mining applications.

  • Powertrain Design and Optimization

    State-of-the-art design tools in automotive engineering still lack the power, sophistication, and automation of design tools for the electronics industry. It is our goal to fundamentally advance automotive design engineering by introducing optimization and physics-based design principles into standard industry design tools. This will allow designers to globally optimize design criteria such as size, efficiency, cost, weight, and volume, and to achieve unprecedented reductions in design turnaround time.

  • Robust Acoustic Modeling

    Field deployment of speech recognition technology results in a number of interesting problems, such as microphone saturation, which severely limit the performance of speech recognition engines. In this project, we study the effects of microphone saturation and develop algorithms to improve robustness to saturation, clipping, and other forms of signal degradation.

  • Robust Low Perplexity Voice Interfaces

    Robust speech recognition technology for speech recorded and transmitted over narrowband channels requires advances in several components of a speech recognition system: signal processing techniques that produce invariant feature sets; acoustic modeling and training that produce channel-independent acoustic models; and noise cancellation techniques that mitigate the effects of impulsive and application-dependent transient noise. This project is a one-year collaboration with the MITRE Corporation that will result in a prototype of a near real-time system that provides a robust and flexible command and control voice interface in realistic, noisy tactical environments.

  • Southern-Accented Speech

    Southern accents are underrepresented in most publicly available databases. This has led to speculation that performance for such speakers is worse than for speakers of better-represented dialects. To test this hypothesis, a small data collection effort was recently conducted that targeted Southern-accented speakers. Data was collected from February 21 to February 25, 2000. The data collected consisted of a total of 23 speakers (13 males and 10 females) ranging in age from 18 to 56.

  • Switchboard Resegmentation

    The SWITCHBOARD Corpus (SWB) has become critical to the success of state-of-the-art LVCSR systems. Using this data, however, has not been without its share of drawbacks. Word-level transcription of SWB is difficult, and conventions associated with such transcriptions are highly controversial and often application dependent. By 1998, the quality of the SWB transcriptions for LVCSR was recognized to be less than ideal, and many years of small projects attempting to correct the transcriptions had taken their toll. In February of 1998 ISIP began a project to do a final cleanup of the SWB Corpus, and to organize and integrate all existing resources related to the data into this final release.

  • A Digital Telephone Interface For Sun Workstations

    Using the Linkon system, a speech data collection board, we have developed a fully expandable, robust system for platform-independent collection of telephone speech data. Our object-oriented software libraries and intuitive GUI provide powerful tools with which even a novice user can efficiently prototype complex applications. Using the system, one can generate programs that range from simple single-user prompt/record demonstrations to robust SWITCHBOARD-type multi-user applications.

  • Scenic Beauty Estimation of Forestry Images

    The United States Department of Agriculture and Forest Services require the automatic determination of the scenic beauty of a given forest scene. Their requirement is a consequence of rising public concern to preserve forest beauty. To achieve this, we have developed an extensive database that can support our algorithm development. The database consists of 637 unique images, each with various subjective ratings of its scenic beauty content. The database extensively samples several dimensions of the problem, including year, season, time of day, angle, and treatment. In order to automatically relate the beauty of an image to the subjective beauty ratings, we have developed algorithms to extract features from the image that determine its scenic beauty. The extracted features are compared to model files using a standard pattern matching paradigm. The other goal of this project is to recognize the various constituents of a forest scene. To achieve this, we will use a variety of techniques that are currently being used for speech recognition purposes. Currently, we have produced algorithms that can classify images into high, medium, or low scenic beauty with an accuracy of 62.3%.

  • Cognitive Assessment Using Voice Analysis

    The goal of this project is to design an effective fatigue monitoring and assessment system by characterizing changes in a human voice as a speaker becomes fatigued or stressed. A remote, near-real-time assessment system to monitor the fatigue levels of military personnel will be developed during the course of this project.

ISIP

Please direct questions or comments to Isip_help@ece.msstate.edu

Mississippi State University