The IUP Journal of Information Technology

Particle Swarm Optimization for Feature Selection: A Study on Microarray Data Classification

Article Details

Pub. Date	:	Mar, 2016
Product Name	:	The IUP Journal of Information Technology
Product Type	:	Article
Product Code	:	IJIT21603
Author Name	:	Ajay Kumar Mishra, Subhendu Kumar Pani and Bikram Kesari Ratha
Availability	:	YES
Subject/Domain	:	Science and Technology
Download Format	:	PDF Format
No. of Pages	:	10

Abstract

DNA microarray technology allows the expression levels of thousands of genes to be monitored and measured simultaneously in a single experiment. Data mining techniques such as classification are widely used on microarray data for medical diagnosis and gene analysis. However, the high dimensionality of the data degrades classification and prediction performance. Consequently, feature selection and dimensionality reduction are key issues in microarray data analysis for achieving better classification and predictive accuracy. Several machine learning approaches are available for feature selection. In this study, the Particle Swarm Optimization (PSO) technique was used for feature selection, and the classification performance of several popular classifiers was analyzed on a set of microarray datasets. The results indicate that PSO provides better results than the genetic algorithm.

Description

Microarray technology has attracted research attention in recent years. It is a promising tool for simultaneously monitoring and measuring the expression levels of thousands of genes of an organism in a single experiment. A microarray is essentially a glass slide that contains thousands of spots. Each spot may contain a few million copies of identical DNA molecules that uniquely correspond to a gene. Microarray technology is commonly used in medical diagnosis and genetic analysis. For example, genome-wide expression data from cancerous tissues helps in cancer diagnosis and classification (Guyon et al., 2002; and Wahde and Szallasi, 2006). Machine learning techniques, including classification and clustering, have been successfully applied to microarray data for such diagnosis. A significant number of new discoveries have resulted from microarray data analysis.

However, microarray analysis remains a great challenge for researchers because the data is inherently noisy and high dimensional. Natural biological fluctuations cause variations in measurements, which are reflected in the microarray data and complicate the analysis. Further, a microarray experiment involves complex scientific procedures, materials and instruments, so errors may be introduced through imperfections and limitations of the instruments, impurities in the materials and human negligence. Microarray data containing thousands of genes is also likely to include many irrelevant and redundant features. Thus, microarray data usually suffers from the curse of dimensionality and poses a hurdle for machine learning algorithms.
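
As a rough illustration of how PSO can be used to discard irrelevant features, the sketch below runs a binary PSO over feature masks. It is not the authors' implementation: the sigmoid-based binary variant, the nearest-centroid classifier, the penalty weight alpha, the synthetic data and every parameter value are assumptions chosen only to keep the example small and self-contained.

```python
import math
import random

def make_data(rng, n_samples=40, n_features=30, informative=(0, 1, 2)):
    """Synthetic stand-in for expression data (assumption): only the
    `informative` columns depend on the class label."""
    X, y = [], []
    for i in range(n_samples):
        label = (i // 2) % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y

def make_fitness(X, y, alpha=0.05):
    """Fitness = validation accuracy of a nearest-centroid classifier on the
    selected columns, minus a small penalty for each selected feature."""
    train = range(0, len(X), 2)   # even rows are used to fit the centroids
    valid = range(1, len(X), 2)   # odd rows are used to score the mask
    n_features = len(X[0])

    def fitness(mask):
        cols = [j for j, bit in enumerate(mask) if bit]
        if not cols:
            return -1.0           # an empty feature set is never acceptable
        centroids = {}
        for label in (0, 1):
            rows = [X[i] for i in train if y[i] == label]
            centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
        correct = 0
        for i in valid:
            dist = {label: sum((X[i][j] - c) ** 2
                               for j, c in zip(cols, centroids[label]))
                    for label in (0, 1)}
            predicted = 0 if dist[0] <= dist[1] else 1
            correct += predicted == y[i]
        return correct / len(valid) - alpha * len(cols) / n_features

    return fitness

def binary_pso(fitness, n_features, rng, n_particles=15, n_iter=30,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """Binary PSO: velocities are real-valued, and each bit is resampled
    with probability sigmoid(velocity) of being 1."""
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][j]
                     + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-v_max, min(v_max, v))  # clamp the velocity
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob_one else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

rng = random.Random(0)            # fixed seed so the run is repeatable
X, y = make_data(rng)
fitness_fn = make_fitness(X, y)
best_mask, best_fit = binary_pso(fitness_fn, len(X[0]), rng)
print("selected features:", [j for j, bit in enumerate(best_mask) if bit])
print("fitness:", round(best_fit, 3))
```

In this sketch each particle is a bit string with one bit per gene, so the swarm searches directly over feature subsets, and the penalty term steers it toward smaller subsets when accuracy is equal. On real microarray data the fitness function would typically wrap a cross-validated classifier rather than a single train/validation split.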

Keywords

Information Technology Journal, PSO, Microarray data classification, Feature selection, Microarray data analysis, Evolutionary computation.