An in silico analytical study of lung cancer and smoker datasets from Gene Expression Omnibus (GEO) for prediction of differentially expressed genes



Atif Noorul Hasan1, 2, Mohammad Wakil Ahmad3, Inamul Hasan Madar4, B Leena Grace5 & Tarique Noorul Hasan2, 6 *




1Dept. of Bioinformatics, Jamia Millia Islamia, New Delhi, India; 2Division of Bioinformatics, Noor-Amna Foundation for Research and Education, Bettiah, Bihar, India; 3Dept. of Software Engg., College of Computer Science, King Saud University, Riyadh, Saudi Arabia; 4Dept. of Biotechnology and Bioinformatics, Bishop Heber College, Tiruchirappalli, TN, India; 5Dept. of Biotechnology, Vinayaka Missions University, Salem, TN, India; 6R & D Center, Bharathiar University, Coimbatore-641046, TN, India



Email; *Corresponding author


Article Type




Received February 26, 2015; Revised March 31, 2015; Accepted April 15, 2015; Published May 28, 2015



Smoking is the leading cause of lung cancer, and several genes have been identified as potential biomarkers for the disease. To add to current knowledge of lung cancer biomarkers, two datasets, GDS3257 and GDS3054, were downloaded from NCBI's GEO database and normalized with the RMA and GRMA packages (Bioconductor). Differentially expressed genes were extracted using R (3.1.2); the DAVID online tool was used for gene annotation, and the GeneMANIA tool was used to construct the gene regulatory network. Nine smoking-independent genes were found, with nearly identical average expression in both datasets; five of them were associated with cancer subtypes. Thirty smoking-specific genes were identified, eight of which were associated with cancer subtypes. GPR110, IL1RN and HSP90AA1 were found to be directly associated with lung cancer. SEMA6A is differentially expressed only in non-smoking lung cancer samples. FLG is a differentially expressed, smoking-specific gene related to the onset of various cancer subtypes. Functional annotation and network analysis revealed that FLG participates in various epidermal tissue developmental processes and is co-expressed with other genes. Because lung tissues are epidermal tissues, this suggests that alteration in FLG may cause lung cancer. We conclude that smoking alters the expression of several genes and their associated biological pathways during the development of lung cancer.
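The split into smoking-independent and smoking-specific genes can be illustrated with a minimal sketch. It is not the authors' R pipeline: the abstract does not state the fold-change cutoff or the exact group comparisons, so the threshold of 1.0 on the log2 scale, the helper names, the placeholder gene labels (GENE_A, GENE_B, GENE_C) and all expression values below are assumptions made purely for illustration.

```python
from statistics import mean

def log2_fold_changes(tumor, normal):
    """Per-gene difference of mean log2 expression (tumor minus normal)."""
    return {g: mean(tumor[g]) - mean(normal[g]) for g in tumor if g in normal}

def differentially_expressed(tumor, normal, threshold=1.0):
    """Genes whose absolute log2 fold change meets the (assumed) threshold."""
    changes = log2_fold_changes(tumor, normal)
    return {g for g, value in changes.items() if abs(value) >= threshold}

# Toy, already-normalized log2 expression values (hypothetical, not from GEO).
smoker_tumor = {"GENE_A": [9.0, 9.2], "GENE_B": [5.0, 5.1], "GENE_C": [7.9, 8.1]}
smoker_normal = {"GENE_A": [7.0, 7.2], "GENE_B": [5.0, 4.9], "GENE_C": [6.0, 6.0]}
nonsmoker_tumor = {"GENE_A": [9.1, 8.9], "GENE_B": [7.0, 7.2], "GENE_C": [6.1, 5.9]}
nonsmoker_normal = {"GENE_A": [7.0, 7.0], "GENE_B": [5.0, 5.0], "GENE_C": [6.0, 6.0]}

deg_smokers = differentially_expressed(smoker_tumor, smoker_normal)
deg_nonsmokers = differentially_expressed(nonsmoker_tumor, nonsmoker_normal)

# Differentially expressed in both groups: smoking-independent.
smoking_independent = deg_smokers & deg_nonsmokers
# Differentially expressed in smokers only: smoking-specific.
smoking_specific = deg_smokers - deg_nonsmokers
# Differentially expressed in non-smokers only.
nonsmoking_specific = deg_nonsmokers - deg_smokers
```

In this toy input, GENE_A changes in both groups, GENE_C only in smokers and GENE_B only in non-smokers. A real analysis would add a statistical test and multiple-testing correction rather than rely on a fold-change cutoff alone.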



Hasan et al. Bioinformation 11(5): 229-235 (2015)

Edited by P Kangueane






Biomedical Informatics



This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.