
Title

Prediction of enzymes and non-enzymes from protein sequences based on sequence derived features and PSSM matrix using artificial neural network

 

Authors

Pradeep Kumar Naik1,*, Viplav Shankar Mishra1, Mukul Gupta1, Kunal Jaiswal1

Affiliation

1Department of Bioinformatics and Biotechnology, Jaypee University of Information Technology, Waknaghat, Distt.-Solan, 173 215, Himachal Pradesh, India

 

Phone

91 1792 239227

Email

pknaik73@rediffmail.com; * Corresponding author

Article Type

Prediction Model

Date

Received September 08, 2007; revised November 06, 2007; accepted November 09, 2007; published online December 05, 2007

Abstract

Predicting enzymes and non-enzymes from protein sequence information remains an open problem in bioinformatics, and it is becoming more important as the volume of sequence data grows exponentially over time. We describe a novel approach for predicting enzymes and non-enzymes from amino-acid sequences using an artificial neural network (ANN). Using 61 sequence-derived features alone, we achieved 79% correct prediction of enzymes/non-enzymes on a set of 660 proteins. With the complete set of 61 parameters and 5-fold cross-validation, the ANN model achieved accuracy = 78.79 ± 6.86%, Q(pred) = 74.734 ± 17.08%, sensitivity = 84.48 ± 6.73%, and specificity = 77.13 ± 13.39%. The second ANN module is based on the position-specific scoring matrix (PSSM). On the same 5-fold cross-validation set, this ANN model predicts enzymes/non-enzymes with higher accuracy (accuracy = 80.37 ± 6.59%, Q(pred) = 67.466 ± 12.41%, sensitivity = 0.9070 ± 3.37%, specificity = 74.66 ± 7.17%).
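The four measures quoted above can be computed from per-fold confusion counts roughly as follows. This is a minimal sketch, not the authors' code: the function names `binary_metrics` and `summarize_folds` are hypothetical, Q(pred) is assumed to mean the percentage of predicted enzymes that are true enzymes, TP/(TP+FP), and the counts in the usage note are made up for illustration.

```python
from statistics import mean, stdev


def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, specificity and Q(pred) as percentages.

    Enzymes are treated as the positive class. Q(pred) is ASSUMED here to be
    TP / (TP + FP); the abstract itself does not define it.
    """
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "q_pred": 100.0 * tp / (tp + fp),
    }


def summarize_folds(fold_counts):
    """Aggregate per-fold metrics into (mean, sample standard deviation).

    fold_counts is a list of (tp, fp, tn, fn) tuples, one per
    cross-validation fold; at least two folds are required by stdev.
    """
    per_fold = [binary_metrics(*counts) for counts in fold_counts]
    return {
        name: (
            mean(m[name] for m in per_fold),
            stdev(m[name] for m in per_fold),
        )
        for name in per_fold[0]
    }
```

For example, `summarize_folds([(40, 10, 40, 10), (45, 5, 35, 15)])` returns one `(mean, standard deviation)` pair per measure, which corresponds to the "value ± spread" form used in the abstract, assuming the reported spread is a standard deviation across the five folds.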

 

Keywords

 

enzymes; non-enzymes; neural network; sequence-derived features; PSSM

Citation

Naik et al., Bioinformation 2(3): 107-112 (2007)

Edited by

P. Kangueane

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.