Title

Random forest for gene selection and microarray data classification

 

Authors

Kohbalan Moorthy & Mohd Saberi Mohamad*

 

Affiliation

Artificial Intelligence & Bioinformatics Research Group, Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia

 

Email

saberi@utm.my; *Corresponding author

 

Article Type

Hypothesis

 

Date

Received August 30, 2011; Accepted September 21, 2011; Published September 28, 2011

 

Abstract

A random forest method is used to perform both gene selection and classification of microarray data. In this embedded method, selecting the smallest possible sets of genes with the lowest error rates is the key factor in achieving the highest classification accuracy. Hence, an improved gene selection method using random forest is proposed to obtain the smallest subset of genes as well as the biggest subset of genes prior to classification. The option to select the biggest subset is provided to assist researchers who intend to use the informative genes for further research. The enhanced random forest gene selection performed better at selecting both the smallest and the biggest subsets of informative genes with the lowest out-of-bag error rates. Furthermore, classification performed on the selected subsets of genes using random forest led to lower prediction error rates than the existing method and other similar available methods.
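The abstract does not spell out the selection rule itself, so the following is only a minimal sketch of the idea of choosing the smallest and the biggest gene subsets among those with the lowest out-of-bag (OOB) error. It assumes the candidate subsets and their OOB error rates have already been produced by some random forest run. The function name `pick_subsets`, the `tolerance` parameter, the gene labels, and the error values are all hypothetical and are not taken from the article.

```python
from typing import List, Sequence, Tuple

Candidate = Tuple[List[str], float]  # (gene subset, OOB error rate)


def pick_subsets(
    candidates: Sequence[Candidate], tolerance: float = 0.0
) -> Tuple[List[str], List[str]]:
    """Return (smallest, biggest) subsets whose OOB error is within
    `tolerance` of the lowest OOB error among all candidates.

    This is an illustrative rule, not the authors' published procedure.
    """
    if not candidates:
        raise ValueError("at least one candidate subset is required")
    lowest = min(err for _, err in candidates)
    eligible = [genes for genes, err in candidates if err <= lowest + tolerance]
    smallest = min(eligible, key=len)  # fewest genes at the lowest error
    biggest = max(eligible, key=len)   # most genes at the lowest error
    return smallest, biggest


# Hypothetical OOB error rates, for illustration only.
candidates = [
    (["g1", "g2", "g3", "g4", "g5", "g6"], 0.10),
    (["g1", "g2", "g3", "g4"], 0.05),
    (["g1", "g2", "g3"], 0.05),
    (["g1", "g2"], 0.05),
    (["g1"], 0.20),
]

smallest, biggest = pick_subsets(candidates)
print("smallest:", smallest)
print("biggest:", biggest)
```

With these made-up numbers, three subsets tie at the lowest error, so the sketch returns the two-gene subset as the smallest and the four-gene subset as the biggest. A nonzero `tolerance` would admit subsets whose error is slightly above the minimum.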

 

Keywords

Random forest, gene selection, classification, microarray data, cancer classification, gene expression data

 

Citation

Moorthy & Mohamad. Bioinformation 7(3): 142-146 (2011)
 

Edited by

P Kangueane

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.