
Title

Cluster analysis identifies aminoacid compositional features that indicate Toxoplasma gondii adhesin proteins

 

Authors

Ailan F Arenas1*, Gladys E Salcedo2, Diego M Moncada1, Diego A Erazo1, Juan F Osorio1 & Jorge E Gomez-Marin1

 

Affiliation

1Grupo de Parasitología Molecular (GEPAMOL), Centro de Investigaciones Biomédicas, Universidad del Quindío, Armenia, Colombia; 2Grupo de Investigación y Asesoría en Estadística, Universidad del Quindío, Armenia, Colombia.

 

Email

aylanfarid@yahoo.com; *Corresponding author

 

Article Type

Hypothesis

 

Date

Received August 30, 2012; Accepted September 03, 2012; Published October 01, 2012

 

Abstract

Toxoplasma gondii invades host cells using a multi-step process that depends on the regulated secretion of adhesins. To identify key primary sequence features of adhesins in this parasite, we analyzed the relative frequency of individual amino acids, their dipeptide frequencies, and the polarity, polarizability and van der Waals volume of the individual amino acids using cluster analysis. This method identified cysteine as a key amino acid in the Toxoplasma adhesin group. The best vector algorithm of non-concatenated features used 2 attributes: the single amino acid relative frequency and the dipeptide frequency. Polarity, polarizability and van der Waals volume were not good classificatory attributes. Single amino acid attributes unambiguously clustered 67 apicomplexan hypothetical adhesins. This algorithm was also useful for clustering hypothetical Toxoplasma target host receptors. All of the cluster performances had over 70% sensitivity and 80% specificity. Compositional amino acid data can be useful for improving machine learning-based prediction software when homology and structural data are not sufficient.
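The two compositional attributes named in the abstract, single amino acid relative frequency and dipeptide frequency, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the handling of non-standard residues (skipped), the normalization of dipeptide counts by the number of valid overlapping pairs, and the toy sequence are all assumptions made for illustration.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_relative_frequency(seq):
    """Return a 20-entry dict: count of each amino acid / number of valid residues.

    Assumption: characters outside the 20 standard residues are ignored.
    """
    residues = [c for c in seq.upper() if c in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return {a: 0.0 for a in AMINO_ACIDS}
    return {a: residues.count(a) / total for a in AMINO_ACIDS}


def dipeptide_frequency(seq):
    """Return a 400-entry dict of overlapping dipeptide frequencies.

    Assumption: counts are normalized by the number of valid overlapping
    pairs; pairs containing a non-standard residue are skipped.
    """
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return {k: 0.0 for k in counts}
    return {k: v / total for k, v in counts.items()}


# Hypothetical toy sequence, not taken from the paper.
toy = "CCAC"
single = aa_relative_frequency(toy)
pairs = dipeptide_frequency(toy)
print(single["C"], single["A"])
print(pairs["CC"], pairs["CA"], pairs["AC"])
```

Each protein then yields a 20-dimensional and a 400-dimensional feature vector, which could be passed separately (non-concatenated) to any clustering routine.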

 

Keywords

Cluster analysis, adhesin, Toxoplasma

 

Citation

Arenas et al. Bioinformation 8(19): 916-923 (2012)
 

Edited by

P Kangueane

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.