The IUP Journal of Telecommunications
AM-FM Features and Their Application to Noise Robust Speech Recognition: A Review

The extraction and selection of the best parametric representation of acoustic signals is an important task in designing any speech recognition system. Many parametric representations of the speech signal are available for this task, including Linear Prediction Coding (LPC) and Mel Frequency Cepstrum Coefficients (MFCCs). MFCCs are currently the most popular choice, but one shortcoming is that they assume the signal is stationary within each analysis frame and therefore cannot capture non-stationary behavior inside the frame. To overcome this problem, several researchers have used amplitude and frequency modulation/demodulation (AM-FM) techniques to extract features from the speech signal. This paper outlines several techniques that apply the AM-FM model to a broadband signal such as speech, and their use in feature extraction for speech recognition. It also examines the use of Amplitude Modulation (AM), Frequency Modulation (FM), and Teager Energy Cepstral Coefficients (TECC) features.

 
 

Speech is one of the natural forms of communication. Despite many years of research, noise robustness in speech recognition remains a difficult problem. In most speech systems, the signal is assumed to be stationary within the analysis window. As a result, the fine structure of speech is partially hidden by the analysis and cannot be exploited. Demodulating the signal into its Amplitude Modulation (AM) and/or Frequency Modulation (FM) components allows this fine structure to be extracted. This paper gives an overview of the approaches researchers have used to investigate AM and FM in speech signals and to test their relative contributions to speech recognition.
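As an illustration of how AM and FM components can be extracted, the sketch below applies the Teager-Kaiser energy operator and one discrete variant of the Energy Separation Algorithm (commonly called DESA-2) to a single band-limited signal. The abstract does not say which demodulation variant the reviewed papers use, so this choice is an assumption, as are the function names `teager_energy` and `desa2`. In practice the speech signal would first be passed through a bandpass filterbank and the estimates smoothed; both steps are omitted here.

```python
import math


def teager_energy(x):
    """Teager-Kaiser energy psi[x](n) = x(n)^2 - x(n-1) * x(n+1).

    Returns one value for each interior sample n = 1 .. len(x) - 2.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def desa2(x):
    """Estimate instantaneous amplitude and frequency with DESA-2.

    Returns (amps, omegas) for samples n = 2 .. len(x) - 3, where omegas
    are in radians per sample. Samples where an energy estimate is not
    positive (possible for noisy or wideband input) are reported as 0.0.
    """
    psi_x = teager_energy(x)                                  # entry i is sample i + 1
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]   # symmetric difference
    psi_y = teager_energy(y)                                  # entry k is sample k + 2

    amps, omegas = [], []
    for k, py in enumerate(psi_y):
        px = psi_x[k + 1]  # align psi_x with psi_y at sample k + 2
        if px <= 0.0 or py <= 0.0:
            amps.append(0.0)
            omegas.append(0.0)
            continue
        # Clamp the arccos argument to guard against rounding error.
        c = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        omegas.append(0.5 * math.acos(c))
        amps.append(2.0 * px / math.sqrt(py))
    return amps, omegas


# Usage: a constant-amplitude, constant-frequency tone (hypothetical values).
fs, f0, a0 = 8000.0, 1000.0, 0.5
tone = [a0 * math.cos(2.0 * math.pi * f0 * n / fs + 0.3) for n in range(64)]
amps, omegas = desa2(tone)
freqs_hz = [w * fs / (2.0 * math.pi) for w in omegas]
```

For a pure tone the estimates recover the amplitude and frequency at every sample. For real speech they vary from sample to sample, and it is this variation that frame-level features computed under a stationarity assumption do not capture.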

Zhu and Alwan (2000) considered AM for speech recognition, using harmonic demodulation methods in computing Mel Frequency Cepstrum Coefficients (MFCCs). Unlike previous studies, Potamianos and Maragos (1999) proposed an AM-FM model to represent the speech signal. Using this AM-FM model, Dimitriadis et al. (2005 and 2007) proposed robust AM-FM features for speech recognition. These studies found that AM and FM features characterize the very fine structure of speech and improve the noise robustness of speech recognition.
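For orientation, the AM-FM model is commonly written in the following form, in which each speech resonance is a sinusoid with time-varying amplitude and frequency and the speech signal is the sum of the resonances. The notation here is an assumption for illustration and may differ from that used in the cited papers: $a_k(t)$ is the instantaneous amplitude, $f_k$ the center frequency of the $k$-th resonance, $q_k(t)$ the frequency deviation around it, and $\theta_k$ a phase offset.

```latex
r_k(t) = a_k(t)\,\cos\!\left( 2\pi \left[ f_k t + \int_0^t q_k(\tau)\,d\tau \right] + \theta_k \right),
\qquad
s(t) = \sum_{k=1}^{K} r_k(t)
```

Under this form the instantaneous frequency of the $k$-th resonance is $f_k + q_k(t)$. AM features are derived from estimates of $a_k(t)$ and FM features from estimates of $f_k + q_k(t)$.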

 
 

Telecommunications Journal, Mel Frequency Cepstral Coefficients, MFCCs, Amplitude Modulation AM, Frequency Modulation FM, Teager Energy Cepstral Coefficients, TECC, Fast Fourier Transform, FFT, Discrete Fourier Transform, DFT, Energy Separation Algorithm, ESA, Frequency Modulation Percentage, FMP.