Computer Sciences Journal | Speaker Recognition Using Gaussian Mixture Model

The IUP Journal of Computer Sciences :

Speaker Recognition Using Gaussian Mixture Model

Article Details

Pub. Date	:	Apr, 2014
Product Name	:	The IUP Journal of Computer Sciences
Product Type	:	Article
Product Code	:	IJCS11404
Author Name	:	Satyendra Nath Mandal, Abhranil Chatterjee and Debayan Das
Availability	:	YES
Subject/Domain	:	Management
Download Format	:	PDF Format
No. of Pages	:	18

Price

For delivery in electronic format: Rs. 50; For delivery through courier (within India): Rs. 50 + Rs. 25 for Shipping & Handling Charges

Abstract

Speaker recognition is the computing task of recognizing a speaker from a speaker-dependent characteristic of his/her voice, known as a feature. The recognition process consists of three main stages: feature extraction, speaker modeling, and speaker pruning or decision making. In this paper, the numerical features of each speaker are reduced using k-means clustering. The result of k-means clustering is then improved by applying a Gaussian Mixture Model to reduce the time complexity. The features of the unknown speaker are compared with those of the speakers stored in the codebook, and the speaker is identified from the codebook as the one with the maximum probability under the likelihood function. It is observed that the accuracy of speaker recognition can be improved using this approach.
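
The decision step described in the abstract, picking the codebook entry with the highest likelihood, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes diagonal-covariance mixtures, toy 2-D feature vectors in place of real speech features, and hand-written mixture parameters. The names `codebook`, `identify_speaker`, `speaker_a` and `speaker_b` are hypothetical, and the k-means and model-fitting stages are skipped entirely.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of all feature frames under one speaker's GMM.

    gmm is a list of (weight, mean, variance) tuples, one per component.
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over the mixture components for numerical stability.
        terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total

def identify_speaker(frames, codebook):
    """Return the best-scoring speaker name and the per-speaker scores."""
    scores = {name: gmm_log_likelihood(frames, gmm) for name, gmm in codebook.items()}
    return max(scores, key=scores.get), scores

# Hypothetical codebook: made-up parameters, two components per speaker.
codebook = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 2.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [8.0, 8.0], [1.0, 1.0]), (0.5, [10.0, 10.0], [1.0, 1.0])],
}

# Toy feature frames from an "unknown" speaker (assumed, not real data).
unknown_frames = [[0.1, -0.2], [1.9, 2.1], [0.4, 0.3]]

best, scores = identify_speaker(unknown_frames, codebook)
print(best, scores)
```

Summing per-frame log-likelihoods rather than multiplying probabilities avoids numerical underflow when an utterance contains many frames. In a real system the mixture parameters would be fitted from each speaker's training features rather than written by hand.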

Description

Speaker recognition is the computing task of recognizing a speaker from the speaker-dependent part of his/her voice, known as a feature. It can be of two types: text-dependent and text-independent. In a text-dependent speaker recognition system, the speakers utter a predefined phrase known to the system, whereas a text-independent speaker recognition system can process any phrase, which makes it robust.

Speaker recognition is receiving more and more attention. It involves two applications: speaker identification and speaker verification. Speaker identification is the process of automatically recognizing the person speaking on the basis of individual information included in the speech waves. This technique makes it possible to use the speaker’s voice to verify his/her identity and control access to services such as voice dialing, banking by telephone, telephone shopping, database access services, information services, voice mail, security control for confidential information areas, and remote access to computers.

Keywords

Computer Sciences Journal, Mel Frequency Cepstral Coefficient (MFCC), Gaussian Mixture Model (GMM), Expectation-Maximization (EM) algorithm, k-means clustering, Speaker pruning, Codebook