
Title

Comparison of methods for identifying differentially expressed genes across multiple conditions from microarray data

 

Authors

Yuande Tan1 & Yin Liu2*

 

Affiliation

1School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas, United States of America; 2Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, Texas, United States of America

 

Email

yin.liu@uth.tmc.edu; *Corresponding author

 

Article Type

Hypothesis

 

Date

Selected publications from Asia Pacific Bioinformatics Network (APBioNet) 10th International Conference on Bioinformatics (InCoB 2011), Malaysia, November 30 to December 02, 2011

 

Abstract

Identifying genes that are differentially expressed across multiple conditions has become an important statistical problem in the analysis of large-scale microarray data. Many statistical methods have been developed to address this challenging problem, so an extensive comparison among them is important for helping experimental scientists choose a valid method for their data analysis. In this study, we conducted simulation studies to compare six statistical methods for identifying differentially expressed genes: the Bonferroni (B-) procedure, the Benjamini and Hochberg (BH-) procedure, the Local false discovery rate (Localfdr) method, the Optimal Discovery Procedure (ODP), the Ranking Analysis of F-statistics (RAF), and the Significance Analysis of Microarrays (SAM). We demonstrated that the strength of the treatment effect, the sample size, the proportion of differentially expressed genes, and the variance of gene expression significantly affect the performance of the different methods. The simulation results show that ODP exhibits extremely high power in identifying differentially expressed genes but significantly underestimates the False Discovery Rate (FDR) in all data scenarios. SAM performs poorly when the sample size is small but is among the best-performing methods when the sample size is large. The B-procedure is stringent and thus has low power in all data scenarios. Localfdr and RAF show statistical behavior comparable to that of the BH-procedure, with favorable power and conservative FDR estimation. RAF performs best when the proportion of differentially expressed genes is small and the treatment effect is weak, whereas Localfdr outperforms RAF when the proportion of differentially expressed genes is large.
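To illustrate why the B-procedure tends to be more stringent than the BH-procedure, the following minimal Python sketch applies the standard textbook form of both procedures to the same list of p-values. The function names and the ten p-values are hypothetical and chosen only for illustration; they are not taken from the simulations in this study, and the sketch does not cover Localfdr, ODP, RAF, or SAM.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Bonferroni: reject H_i when p_i <= alpha / m (controls family-wise error)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up: find the largest rank k with
    p_(k) <= (k / m) * alpha and reject the k smallest p-values."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    reject = [False] * m
    for idx in order[:max_k]:
        reject[idx] = True
    return reject


# Hypothetical per-gene p-values (illustration only, not data from the paper).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

n_bonferroni = sum(bonferroni_reject(p_values))
n_bh = sum(benjamini_hochberg_reject(p_values))
print("Bonferroni rejections:", n_bonferroni)
print("BH rejections:", n_bh)
```

On this toy input the Bonferroni threshold is 0.05 / 10 = 0.005 for every gene, whereas the BH threshold grows with the rank of the p-value, so BH can only reject at least as many hypotheses as Bonferroni at the same nominal level.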

 

Citation

Tan & Liu. Bioinformation 7(8): 400-404 (2011)
 

Edited by

TW Tan

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.