
Title

Simplifier: a web tool to eliminate redundant NGS contigs

 

Authors

Rommel Thiago Jucá Ramos1, Adriana Ribeiro Carneiro1, Vasco Azevedo2, Maria Paula Schneider1, Debmalya Barh3* & Artur Silva1

 

Affiliation

1Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém, PA, Brazil; 2Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil; 3Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, WB-721172, India.

 

Email

dr.barh@gmail.com; *Corresponding author

 

Article Type

Software

Date

Received August 22, 2012; Accepted August 28, 2012; Published October 13, 2012

 

Abstract

Modern genomic sequencing technologies produce large amounts of data at a reduced cost per base; however, these data consist of short reads. The shorter read length, compared with reads obtained by previous methodologies, presents new challenges, including the need for efficient algorithms to assemble genomes from short reads and to resolve repeats. Additionally, after ab initio assembly, curation of the hundreds or thousands of contigs generated by assemblers demands considerable time and computational resources. We developed Simplifier, a stand-alone software tool that selectively eliminates redundant sequences from the collection of contigs generated by ab initio assembly of genomes. Application of Simplifier to data generated by assembly of the genome of Corynebacterium pseudotuberculosis strain 258 reduced the number of contigs generated by ab initio methods from 8,004 to 5,272, a reduction of 34.14%; in addition, N50 increased from 1 kb to 1.5 kb. Processing the contigs of Escherichia coli DH10B with Simplifier reduced the mate-paired library by 17.47% and the fragment library by 23.91%. In tests with data from prokaryotic organisms, Simplifier removed redundant sequences from datasets produced by assemblers, thereby reducing the effort required to finalize genome assembly.
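The N50 statistic quoted above can be illustrated with a minimal sketch. It uses the standard definition of N50 (the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly length). The function name `n50` and the contig lengths are hypothetical toy values chosen for illustration; they are not taken from Simplifier or from the datasets in this article.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bases)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until at least half of the
    # total assembly length has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy data (assumption): six contigs, two of which are treated as redundant.
before = [1500, 1500, 1000, 1000, 1000, 1000]
after = [1500, 1500, 1000, 1000]  # the same set with two 1000-base contigs removed

print(n50(before))  # 1000
print(n50(after))   # 1500
```

In this toy case, removing short redundant contigs lowers the total length more than it lowers the length held in long contigs, so N50 rises even though no contig has been extended.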

 

Availability

Simplifier is available at http://www.genoma.ufpa.br/rramos/softwares/simplifier.xhtml. It requires Sun JDK 6 or higher.

 

Keywords

NGS sequencing, ab initio assembly of genomes, redundant sequences

 

Citation

Ramos et al. Bioinformation 8(20): 996-999 (2012)
 

Edited by

P Kangueane

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.