

Towards Personalized Medicine: An Improved De Novo Assembly Procedure for Early Detection of Drug Resistant HIV Minor Quasispecies in Patient Samples



Cindy Huang1,2, Vichetra Sam1,3, Sophie Du1,3, Tuan Le1, Anthony Fletcher1, William Lau1, Kathleen Meyer1,3, Esther Asaki1,3, Da Wei Huang1*, Calvin Johnson1*



1Center for Information Technology, National Institutes of Health, Bethesda, Maryland 10891;

2Thomas Wootton High School, Rockville, Maryland 20850; 3CSRA, Falls Church, VA 22042;




Article Type

Software Model



Received August 30, 2018; Revised September 17, 2018; Accepted September 17, 2018; Published September 18, 2018



The third-generation sequencing technology PacBio can sequence HIV amplicons in their full length. Compared with short reads from Illumina shotgun sequencing, the long reads of PacBio offer a distinct advantage for comprehensively understanding the complexity of virus evolution at the quasispecies level (i.e., they maintain linkage information between variants). However, because PacBio reads are highly noisy, building accurate contigs at high sensitivity remains a challenge. Most previously developed NGS assembly tools assume that the input reads are fairly accurate, which is largely true for data derived from Sanger or Illumina technologies. When applied to high-noise PacBio reads, these tools are largely driven by noise rather than true signal, which leads to poor results in most cases. In this study, we propose a de novo assembly procedure that comprises a positive focused strategy and linkage-frequency noise reduction, making it more suitable for high-noise PacBio reads. We further tested the de novo assembly procedure on HIV PacBio benchmark data and clinical samples, where it accurately assembled the dominant and minor populations of HIV quasispecies as expected. The improved de novo assembly procedure shows potential to promote PacBio technology in the field of clinical HIV drug-resistance detection, as well as in broader HIV phylogenetic studies.



De Novo Assembly, HIV, PacBio, quasispecies



Huang et al. Bioinformation 14(8): 449-454 (2018)


Edited by

P Kangueane






Biomedical Informatics



This is an Open Access article that permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. It is distributed under the terms of the Creative Commons Attribution License.