
Title

Predicting highly-connected hubs in protein interaction networks by QSAR and biological data descriptors

 

Authors

 

Michael Hsing1, Kendall Byler2, Artem Cherkasov2*

Affiliation

 

1Bioinformatics Graduate Program, Faculty of Graduate Studies, University of British Columbia. 100-570 West 7th Avenue. Vancouver, BC, Canada. V5T 4S6; 2Division of Infectious Diseases, Department of Medicine, Faculty of Medicine, University of British Columbia. D 452 HP, VGH. 2733 Heather Street. Vancouver, BC, Canada. V5Z 3J5

 

Email

 

artc@interchange.ubc.ca

Article Type

 

Hypothesis

Date

 

Received September 30, 2009; Accepted October 13, 2009; Published October 15, 2009

Abstract

Hub proteins (those engaged in the most physical interactions in a protein interaction network, or PIN) have recently gained much research interest due to their essential role in mediating cellular processes and their potential therapeutic value. It is straightforward to identify hubs if the underlying PIN is experimentally determined; however, theoretical hub prediction remains a very challenging task, as the physicochemical properties that differentiate hubs from less connected proteins remain mostly uncharacterized. To adequately distinguish hubs from non-hub proteins, we have utilized over 1300 protein descriptors, some of which represent QSAR (quantitative structure-activity relationship) parameters, while others reflect sequence-derived characteristics of proteins, including domain composition and functional annotations. Those protein descriptors, together with available protein interaction data, were processed by a machine learning method (boosting trees), resulting in hub classifiers that are capable of predicting highly interacting proteins for four model organisms: Escherichia coli, Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. More importantly, through analysis of the most relevant protein descriptors, we are able to demonstrate that hub proteins not only share certain common physicochemical and structural characteristics that make them different from non-hub counterparts, but also exhibit species-specific characteristics that should be taken into account when analyzing different PINs. The developed prediction models can be used for determining highly interacting proteins in the four studied species to assist future proteomics experiments and PIN analyses.
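
As an illustration of the "straightforward" case mentioned above, where the PIN is already experimentally determined, hub labels can be derived directly from node degree. The following is only a minimal sketch under stated assumptions: the function names, the toy edge list, and the `top_fraction` cutoff are hypothetical and are not taken from the article, which does not state its hub threshold in this abstract. Labels of this kind are what a classifier such as boosting trees would then be trained to predict from protein descriptors.

```python
from collections import defaultdict


def degree_by_protein(interactions):
    """Count distinct interaction partners per protein in an undirected PIN."""
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        partners[a].add(b)
        partners[b].add(a)
    return {protein: len(others) for protein, others in partners.items()}


def label_hubs(interactions, top_fraction=0.1):
    """Label the most connected proteins as hubs.

    top_fraction is an assumed, illustrative cutoff; at least one protein
    is labeled as a hub whenever the network is non-empty.
    """
    degrees = degree_by_protein(interactions)
    # Sort by descending degree; break ties by name for reproducibility.
    ranked = sorted(degrees, key=lambda p: (-degrees[p], p))
    n_hubs = max(1, round(len(ranked) * top_fraction))
    hubs = set(ranked[:n_hubs])
    return {protein: protein in hubs for protein in ranked}


if __name__ == "__main__":
    # Hypothetical toy network, not real interaction data.
    toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
    print(degree_by_protein(toy_edges))
    print(label_hubs(toy_edges, top_fraction=0.2))
```

A rank-based cutoff is used here only because it adapts to networks of different sizes; a fixed degree threshold would be an equally reasonable choice.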

 

Keywords

 

QSAR; biological data; descriptors; protein interactions; network

Citation

 

Hsing et al., Bioinformation 4(4): 164-168 (2009)

Edited by

 

P. Kangueane

 

ISSN

 

0973-2063

 

Publisher

 

Biomedical Informatics

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.