Title

A novel feature extraction approach for microarray data based on multi-algorithm fusion

 

Authors

Zhu Jiang1,2* & Rong Xu1

 

Affiliation

1State Key Laboratory of Astronautic Dynamics, Xi’an Satellite Control Center, Xi’an, China; 2Key Laboratory of Fluid and Power Machinery, Ministry of Education, Xihua University, Chengdu, China

 

Email

hill5525@163.com; *Corresponding author

 

Article Type

Hypothesis

 

Date

Received December 09, 2014; Revised December 31, 2014; Accepted January 08, 2015; Published January 30, 2015

 

Abstract

Feature extraction is one of the most important and effective methods for reducing dimensionality in data mining, particularly with the emergence of high-dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes to build a pattern classifier with reduced complexity and improved generalization capabilities. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based and set-based, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are generally evaluated individually, without considering the inter-relationships between features, whereas set-based feature extraction evaluates features by their role in a feature set, taking into account the dependencies between features. As with learning methods, feature extraction has a problem with its generalization ability, namely robustness. However, the issue of robustness is often overlooked in feature extraction. To improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select features from the sample set, the proposed approach is able to improve feature extraction performance. The new approach is tested on gene expression datasets including Colon cancer data, CNS data, DLBCL data, and Leukemia data. The test results show that this algorithm performs better than existing solutions.
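
The distinction between ranking-based and set-based feature extraction, and the idea of fusing the two, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the signal-to-noise score, the greedy relevance-minus-redundancy criterion, the mean-rank fusion rule, and the function names `ranking_based`, `set_based`, and `fuse` are all assumptions chosen for illustration. The sketch assumes a samples-by-genes matrix `X` with no constant columns and binary labels `y` in {0, 1}.

```python
import numpy as np

def ranking_based(X, y):
    """Score each feature on its own (signal-to-noise ratio); higher is better."""
    a, b = X[y == 0], X[y == 1]
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (a.std(axis=0) + b.std(axis=0) + 1e-12)

def set_based(X, y, k):
    """Greedy forward selection: relevance to y minus redundancy with chosen features."""
    n_features = X.shape[1]
    corr = np.abs(np.corrcoef(X, rowvar=False))  # feature-feature |correlation|
    relevance = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_features)])
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        redundancy = corr[:, selected].mean(axis=1)
        gain = relevance - redundancy
        gain[selected] = -np.inf  # never pick the same feature twice
        selected.append(int(np.argmax(gain)))
    return selected

def fuse(X, y, k):
    """Assumed fusion rule: sum the rank positions given by both selectors."""
    n_features = X.shape[1]
    order_a = np.argsort(-ranking_based(X, y))
    rank_a = np.empty(n_features)
    rank_a[order_a] = np.arange(n_features)
    rank_b = np.full(n_features, float(n_features))  # unselected features get the worst rank
    for position, j in enumerate(set_based(X, y, k)):
        rank_b[j] = position
    return np.argsort(rank_a + rank_b, kind="stable")[:k].tolist()
```

Calling `fuse(X, y, k)` returns the indices of `k` features that rank well under both criteria. Summing rank positions is only one simple way to combine selectors; voting or weighted schemes would fit the same outline.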

 

Keywords

feature extraction; robustness; microarray data; multi-algorithm fusion

 

Citation

Jiang & Xu, Bioinformation 11(1): 027-033 (2015)
 

Edited by

P Kangueane

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.