Mining for single nucleotide polymorphisms and insertions / deletions in expressed sequence tag libraries of oil palm



Aykkal Riju 1, 4, Arumugam Chandrasekar 2, 4 and Vadivel Arunachalam 2, 3*



1Aikkal, Kanul P.O, Kannur, Kerala - 670564, India; 2Central Plantation Crops Research Institute, Indian Council of Agricultural Research, Kudlu P.O, Kasaragod 671124, Kerala, India; 3Genetic Transformation Laboratory, Biotechnology Theme, International Crops Research Institute for Semi-Arid Tropics, Patancheru, Hyderabad - 502324, Andhra Pradesh, India; 4Bioinformatics Center, Indian Institute of Spices Research, Calicut, Kerala, India


Email; * Corresponding author






Received August 24, 2007; revised November 06, 2007; accepted November 14, 2007; published online December 11, 2007



The oil palm is a tropical oil-bearing tree. EST-derived SNPs and SSRs are a free by-product of the currently expanding EST (expressed sequence tag) databases. The development of high-throughput methods for detecting SNPs (single nucleotide polymorphisms) and small indels (insertions / deletions) has led to a revolution in their use as molecular markers. The 5452 available oil palm EST sequences were retrieved from dbEST at NCBI. The CAP3 program was used to assemble the EST sequences into contigs. Candidate SNP and indel polymorphisms were detected using the Perl script auto_snip version 1.0, which used 576 ESTs for detecting SNP and indel sites. We found 1180 SNP sites and 137 indel polymorphisms, with a frequency of 1.36 SNPs / 100 bp. Among the six tissues from which the EST libraries had been generated, mesocarp had the highest frequency, 2.91 SNPs and indels per 100 bp, whereas zygotic embryos had the lowest, 0.15 per 100 bp. We also used the Shannon index to analyze the proportions of the ten possible types of SNP/indel. ESTs from normal apex tissue showed the highest Shannon index (0.60), whereas abnormal apex had the lowest (0.02). This report describes the use of the Shannon index for comparing SNP/indel frequencies mined from EST libraries and confirms that SNPs occur in oil palm ESTs at a frequency that allows their use as markers for genetic studies.
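The two summary statistics used above, variant frequency per 100 bp and the Shannon index over variant types, can be illustrated with a minimal Python sketch. It is not the authors' script. The function names, the ten class labels (six substitution classes plus four single-base indel classes) and the choice of logarithm base are assumptions made for illustration, since the abstract specifies none of them.

```python
import math

# Hypothetical labels for the ten variant classes; the abstract does not
# list them, so this split (6 substitutions + 4 indels) is an assumption.
VARIANT_TYPES = ["A/G", "C/T", "A/C", "A/T", "C/G", "G/T",
                 "A/-", "C/-", "G/-", "T/-"]


def per_100_bp(n_variants, total_bp):
    """Return the number of variants per 100 bp of aligned sequence."""
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    return 100.0 * n_variants / total_bp


def shannon_index(counts, base=math.e):
    """Shannon index H = -sum(p_i * log(p_i)) over variant-type counts.

    `counts` maps a variant type to the number of sites of that type.
    Types with a zero count contribute nothing to the sum. The log base
    is a parameter because the source does not say which one was used.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h


if __name__ == "__main__":
    # Made-up counts for one hypothetical library, for illustration only.
    example = dict.fromkeys(VARIANT_TYPES, 0)
    example.update({"A/G": 40, "C/T": 35, "A/C": 10, "A/-": 5})
    print(per_100_bp(sum(example.values()), 6000))
    print(shannon_index(example, base=10))
```

With ten classes and a base-10 logarithm the index ranges from 0 (all variants of one type) to 1 (all ten types equally frequent), so a library dominated by a single variant type scores near zero.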



Elaeis guineensis; in silico; molecular markers; Shannon index


Riju et al., Bioinformation 2(4): 128-131 (2007)


Edited by

G. Yadil






Biomedical Informatics



This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.