
Title

An alphabetic code based atomic level molecular similarity search in databases

 

Authors

Saranya Nallusamy & Samuel Selvaraj*

 

Affiliation

Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirapalli – 620024, Tamilnadu, India

 

Email

selvarajsamuel@gmail.com; *Corresponding author

 

Article Type

Hypothesis

 

Date

Received May 14, 2012; Accepted May 27, 2012; Published June 16, 2012

 

Abstract

Atomic-level molecular similarity and diversity studies have gained considerable importance through their wide application to drug design in bioinformatics and chemoinformatics. The large and growing volume of data on chemical compounds calls for new methods that search these archives efficiently, in less time and with economical use of computational power. We describe an alphabetic algorithm for similarity searching based on the atom-atom bonding preferences of ligands. We represented 170 cyclin-dependent kinase 2 (CDK-2) inhibitors as strings over a pre-defined alphabet so that they could be searched with existing protein sequence alignment tools. A common pattern was extracted from this set of compounds and used in database searches to retrieve similar active compounds. The area under the receiver operating characteristic (ROC) curve was used to assess the discrimination between similar and dissimilar compounds in the databases. An average retrieval rate of about 60% was obtained in cross-validation using the home-grown dataset and data from the Directory of Useful Decoys (DUD, whose decoys are drawn from the ZINC database). This approach should help in the effective retrieval of similar compounds through database searching.
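As a rough illustration of the two ideas named in the abstract, the sketch below encodes a ligand's bonds as a string of letters and computes the area under the ROC curve from similarity scores. The letter assignments in `BOND_ALPHABET`, the function names, and the example scores are all hypothetical; the abstract does not give the actual alphabet or scoring scheme, so this should be read as a minimal sketch under those assumptions rather than the authors' method.

```python
# Hypothetical alphabet: one letter per unordered atom-pair bond type.
# The actual letter assignments used in the paper are not stated in the abstract.
BOND_ALPHABET = {
    frozenset(["C"]): "A",       # C-C
    frozenset(["C", "N"]): "B",  # C-N
    frozenset(["C", "O"]): "D",  # C-O
    frozenset(["C", "S"]): "E",  # C-S
    frozenset(["N"]): "F",       # N-N
    frozenset(["N", "O"]): "G",  # N-O
}


def encode_bonds(bonds, unknown="X"):
    """Map a list of (atom, atom) bonds to a string, one letter per bond.

    Bond types missing from the alphabet are written as `unknown`.
    """
    return "".join(BOND_ALPHABET.get(frozenset(pair), unknown) for pair in bonds)


def roc_auc(active_scores, decoy_scores):
    """Area under the ROC curve, computed pairwise.

    Equals the probability that a randomly chosen active scores higher
    than a randomly chosen decoy, with ties counted as one half.
    """
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Illustrative data only: a three-bond fragment and made-up similarity scores.
encoded = encode_bonds([("C", "C"), ("C", "N"), ("O", "C")])
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3])
```

Once ligands are strings, any sequence alignment tool can in principle supply the similarity scores; an AUC near 0.5 would indicate no discrimination between actives and decoys, while values approaching 1.0 indicate that actives are ranked consistently above decoys.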

 

Keywords

Atom pair, CDK-2, similarity searching, molecular similarity

 

Citation

Nallusamy & Selvaraj, Bioinformation 8(11): 498-503 (2012)
 

Edited by

P Kangueane

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.