Title

An improved hypergeometric probability method for identification of functionally linked proteins using phylogenetic profiles

 

Authors

Appala Raju Kotaru1, Khader Shameer2, Pandurangan Sundaramurthy3, 4 & Ramesh Chandra Joshi1*

 

Affiliation

1Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee, 247667, Roorkee, India; 2Division of Biomedical Statistics and Informatics & Division of Cardiovascular Diseases, Mayo Clinic, Rochester 55905, USA; 3Department of Mathematics, Indian Institute of Technology Roorkee, 247667, Roorkee, India; 4School of Advanced Sciences, VIT University, Vellore - 632014, Tamil Nadu, India

 

Email

rcjosfec@gmail.com; sundaramurthy.p@vit.ac.in; *Corresponding authors

 

Article Type

Hypothesis

 

Date

Received January 16, 2013; Accepted March 06, 2013; Published April 13, 2013

 

Abstract

Predicting the functions of proteins and alternatively spliced isoforms encoded in a genome is one of the important applications of bioinformatics in the post-genome era. Because it is not practical to characterize all proteins encoded in a genome experimentally using biochemical studies, bioinformatics methods provide powerful tools for function annotation and prediction. These methods also help narrow the growing sequence-to-function gap. Phylogenetic profiling is a bioinformatics approach to identify the influence of a trait across species and can be employed to infer the evolutionary history of proteins encoded in genomes. Here we propose an improved phylogenetic profile-based method that uses the co-evolution of the reference genome to derive the basic similarity measure, and the background phylogeny of the target genomes to generate profiles and assign weights to target genomes. The ordering of genomes and the runs of consecutive matches between the proteins were used to define phylogenetic relationships in the approach. We used the Escherichia coli K12 genome as the reference genome and included its 4195 proteins in the current analysis. We compared our approach with two existing methods, and our initial results show that its predictions outperform both. In addition, we have validated our method using a targeted protein-protein interaction network derived from the STRING protein-protein interaction database. Our preliminary results indicate that function prediction can be improved by placing coevolution-based similarity measures and the runs on the same scale instead of computing them on different scales. Our method can be applied at the whole-genome level for annotating hypothetical proteins from prokaryotic genomes.
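To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch of a plain hypergeometric similarity between two binary phylogenetic profiles, together with the longest run of consecutive matches. It is not the improved method proposed in the article: the genome weighting and the common scaling of the two measures are not reproduced. The function names, the example profiles, and the definition of a "match" as a genome in which both proteins are present are assumptions made for illustration only.

```python
from math import comb


def hypergeom_tail(N, n, m, k):
    """P(X >= k) for a hypergeometric variable.

    N: number of target genomes, n: genomes containing protein A,
    m: genomes containing protein B, k: genomes containing both.
    A small value means the co-occurrence is unlikely by chance.
    """
    upper = min(n, m)
    hits = sum(comb(n, i) * comb(N - n, m - i) for i in range(k, upper + 1))
    return hits / comb(N, m)


def longest_match_run(a, b):
    """Longest run of consecutive genomes in which both proteins are present.

    Assumption: a "match" means both profiles are 1 at that position, and
    the profiles are already ordered by the chosen genome ordering.
    """
    best = current = 0
    for x, y in zip(a, b):
        current = current + 1 if (x == 1 and y == 1) else 0
        best = max(best, current)
    return best


def profile_similarity(a, b):
    """Return (hypergeometric p-value, longest match run) for two profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomes")
    N = len(a)
    n = sum(a)
    m = sum(b)
    k = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    return hypergeom_tail(N, n, m, k), longest_match_run(a, b)


# Hypothetical profiles over 8 target genomes (1 = homolog present).
profile_a = [1, 1, 0, 1, 1, 1, 0, 0]
profile_b = [1, 1, 0, 0, 1, 1, 0, 1]
p_value, run_length = profile_similarity(profile_a, profile_b)
print(p_value, run_length)
```

In this sketch the p-value and the run length are reported separately, on different scales. The abstract's stated contribution is to bring such measures onto the same scale, which would require the details given in the full article.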

 

Keywords

Protein function prediction, phylogenetic profiles, functional annotation, functional similarity.

 

Citation

Kotaru et al. Bioinformation 9(7): 368-374 (2013)

 

Edited by

P Kangueane

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.