Title
|
|
Predicting highly-connected hubs in protein interaction networks by QSAR and biological data descriptors
|
Authors
|
Michael Hsing1, Kendall Byler2, Artem Cherkasov2* | |
Affiliation
|
1Bioinformatics Graduate Program, Faculty of Graduate Studies, University of British Columbia. 100-570 West 7th Avenue. Vancouver, BC, Canada. V5T 4S6; 2Division of Infectious Diseases, Department of Medicine, Faculty of Medicine, University of British Columbia. D 452 HP, VGH. 2733 Heather Street. Vancouver, BC, Canada. V5Z 3J5
| |
Article Type
|
Hypothesis | |
Date
|
Received September 30, 2009; Accepted October 13, 2009; Published October 15, 2009 | |
Abstract |
Hub proteins (those engaged in the most physical interactions in a protein interaction network, or PIN) have recently gained much research interest because of their essential role in mediating cellular processes and their potential therapeutic value. Identifying hubs is straightforward when the underlying PIN has been experimentally determined; theoretical hub prediction, however, remains very challenging, because the physicochemical properties that differentiate hubs from less connected proteins are mostly uncharacterized. To distinguish hubs from non-hub proteins, we used over 1300 protein descriptors, some of which represent QSAR (quantitative structure-activity relationship) parameters and some of which reflect sequence-derived characteristics of proteins, including domain composition and functional annotations. These descriptors, together with available protein interaction data, were processed by a machine learning method (boosting trees), resulting in hub classifiers capable of predicting highly interacting proteins for four model organisms: Escherichia coli, Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. More importantly, through analysis of the most relevant protein descriptors, we demonstrate that hub proteins not only share certain common physicochemical and structural characteristics that distinguish them from their non-hub counterparts, but also exhibit species-specific characteristics that should be taken into account when analyzing different PINs. The developed prediction models can be used to identify highly interacting proteins in the four studied species and so assist future proteomics experiments and PIN analyses.
| |
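The abstract notes that hubs are straightforward to identify once a PIN has been experimentally determined. A minimal Python sketch of that step is given here for illustration only: it counts each protein's distinct interaction partners (its degree) and labels the top-ranked proteins as hubs. The function name `find_hubs`, the `top_fraction` cut-off and the toy protein identifiers are assumptions made for this example and are not taken from the article, which may define the hub threshold differently.

```python
from collections import defaultdict


def find_hubs(interactions, top_fraction=0.1):
    """Return proteins whose degree ranks in the top fraction of the network.

    `interactions` is an iterable of (protein_a, protein_b) pairs.
    `top_fraction` is a hypothetical cut-off chosen for illustration.
    """
    neighbours = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Degree = number of distinct interaction partners.
    degree = {protein: len(partners) for protein, partners in neighbours.items()}
    if not degree:
        return set()

    # Take at least one hub; rank by descending degree, then by name.
    n_hubs = max(1, int(len(degree) * top_fraction))
    ranked = sorted(degree, key=lambda p: (-degree[p], p))
    cutoff = degree[ranked[n_hubs - 1]]

    # Proteins tied with the cut-off degree are all included.
    return {p for p in degree if degree[p] >= cutoff}


# Toy network with made-up identifiers (not real data).
toy_pin = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
    ("P1", "P5"), ("P2", "P3"), ("P5", "P6"),
]
hubs = find_hubs(toy_pin, top_fraction=0.2)
```

Labels produced this way could serve as the training targets for a classifier built on protein descriptors; the descriptor calculation and the boosting-trees model themselves are outside the scope of this sketch.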
Keywords
|
QSAR; biological data; descriptors; protein interactions; network | |
Citation
|
Hsing et al., Bioinformation 4(4): 164-168 (2009) | |
Edited by
|
P. Kangueane
| |
ISSN
|
0973-2063
| |
Publisher
|
||
License
|
This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License. |