The IUP Journal of Science & Technology
Automatic Classification and Indexing of Audio Broadcast Data

Audio classification is an active research area in audio processing and pattern recognition. Automatic audio classification is useful for audio indexing, content-based audio retrieval and online audio distribution, but extracting the most common and salient themes from unstructured raw audio data remains a major challenge. This paper presents effective algorithms that automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie. For these categories, acoustic features including linear predictive coefficients (LPC), linear predictive cepstral coefficients (LPCC) and Mel frequency cepstral coefficients (MFCC) are extracted to characterize the audio content. An autoassociative neural network (AANN) model captures the distribution of the acoustic feature vectors of each class, and the back propagation learning algorithm adjusts the network weights to minimize the mean square error for each feature vector. The work also proposes an efficient audio indexing system that indexes movie clips using the K-means clustering algorithm. Experimental results indicate that the proposed algorithms produce satisfactory results.
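The clustering step behind the indexing system can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) in plain Python. This is not the authors' implementation: the function names `kmeans` and `build_index`, the first-k-points initialisation, the fixed iteration count, and the idea of storing frame positions per cluster are all assumptions made for illustration. In practice the points would be acoustic feature vectors (for example MFCCs) computed per frame of a movie clip.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster feature vectors with Lloyd's algorithm.

    Returns (centroids, labels). Initialisation with the first k points
    is an assumption chosen here to keep the sketch deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def build_index(labels):
    """Hypothetical index: map each cluster id to the frame positions in it."""
    index = {}
    for position, label in enumerate(labels):
        index.setdefault(label, []).append(position)
    return index


# Toy two-dimensional "feature vectors" forming two well-separated groups.
features = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0],
            [5.1, 4.9], [0.1, 0.2], [4.9, 5.2]]
centroids, labels = kmeans(features, k=2)
index = build_index(labels)
```

With such an index, a query frame can be assigned to its nearest centroid and only the frames stored under that cluster need to be searched.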

 
 

Digital audio applications such as audio CDs, MP3 players, radio broadcasts, TV and video DVDs, video games, digital cameras with soundtracks, digital camcorders, telephones, telephone answering machines and telephone enquiry services based on speech or word recognition have become indispensable in everyday life. Audio, which includes voice, music and various kinds of environmental sounds, is an important type of media and also a significant part of video. Compared with the research on content-based image and video database management, very little work has been done on the audio part of the multimedia stream. However, as digital audio databases multiply, people are beginning to realize the importance of managing them effectively on the basis of audio content. Audio classification and segmentation can provide powerful tools for content management: if an audio clip can be classified automatically, it can be stored in an organized database, which can dramatically improve the management of audio.

Content-based classification and retrieval of audio is essentially a pattern recognition problem with two basic issues: feature selection and classification based on the selected features. In the first step, an audio signal is reduced to a small set of parameters using feature extraction techniques; the features used here are linear predictive coefficients (LPC), linear predictive cepstral coefficients (LPCC) and Mel frequency cepstral coefficients (MFCC). In the second step, classification algorithms, ranging from simple Euclidean distance methods to sophisticated statistical techniques, are applied to these coefficients. The efficacy of an audio classifier depends on its ability to capture appropriate audio features and to assign each feature set accurately to its own class.
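The two steps can be sketched in plain Python under some loud assumptions. The first function estimates LPC coefficients for one frame with the autocorrelation method and the Levinson-Durbin recursion; the second implements the simplest end of the classifier range mentioned above, a nearest-centroid rule with Euclidean distance. This is not the AANN classifier the paper proposes, and the names `lpc`, `class_centroid` and `classify` are hypothetical helpers introduced only for illustration.

```python
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one audio frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Step 1: predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k].

    Uses the Levinson-Durbin recursion on the frame's autocorrelation.
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy of the order-0 predictor
    for i in range(1, order + 1):
        if err == 0:  # silent frame: leave remaining coefficients at zero
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:]


def class_centroid(feature_vectors):
    """Mean feature vector of one class, computed from its training frames."""
    count = len(feature_vectors)
    return [sum(col) / count for col in zip(*feature_vectors)]


def classify(feature, centroids):
    """Step 2: label of the class centroid nearest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(feature, centroids[label]))
```

A typical use would compute `lpc(frame, order)` for every frame of the labelled training clips, average the vectors of each class with `class_centroid`, and then call `classify` on the feature vector of an unseen frame. The prediction order and frame length are left to the caller, since the text above does not specify them.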

 
 

Science and Technology Journal, Audio Broadcast Data, Audio Processing, Digital Audio Applications, MP3 Audio Players, Digital Camcorders, Artificial Neural Network (ANN), Gaussian Mixture Models (GMMs), Mel Frequency Cepstral Coefficients (MFCC), Movie Clip Indexing, Clustering Algorithm.