Title

UPIC: Perl scripts to determine the number of SSR markers to run

 

Authors

Renee S. Arias, Linda L. Ballard, Brian E. Scheffler

 

Affiliation

USDA/ARS Genomics and Bioinformatics Research Unit, 141 Experiment Station Rd., Stoneville, MS 38776

 

Email

 

brian.scheffler@ars.usda.gov

 

Article Type

 

Software

Date

 

received March 16, 2009; revised April 6, 2009; accepted April 11, 2009; published April 21, 2009

 

Abstract

We introduce the concept of Unique Pattern Informative Combinations (UPIC), a decision tool for the cost-effective design of DNA fingerprinting/genotyping experiments using simple sequence repeat/short tandem repeat (SSR/STR) markers. After a first screening of SSR markers on a subset of DNA samples, the user can apply UPIC to find marker combinations that maximize the genetic information obtained from a minimum or desired number of markers, which allows cost-effective planning of future experiments. We have developed Perl scripts that calculate all possible subset combinations of SSR markers and determine, based on unique patterns or alleles, which combinations can discriminate among all DNA samples included in a test. This makes UPIC an essential tool for optimizing resources when working with microsatellites. An example using real data from eight markers and 12 genotypes shows that UPIC detected groups of as few as three markers sufficient to discriminate all 12 DNA samples. If markers for future experiments are chosen based only on polymorphism information content (PIC), the number of markers needed to discriminate all samples cannot be determined. We also show that, when markers are chosen using UPIC, an informative combination of four markers can provide information similar to that of a combination of six markers (23 vs. 25 patterns, respectively), allowing more efficient planning of experiments. Perl scripts with documentation are also included to calculate the percentage of heterozygous loci in the DNA samples tested and to calculate three PIC values depending on the type of fertilization and allele frequency of the organism.
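The combination search described in the abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' Perl code. It assumes that each marker yields one pattern per DNA sample and that a combination "discriminates" when the joint pattern across its markers is unique for every sample. The function names and the toy data are hypothetical and do not come from the published scripts or data set.

```python
from itertools import combinations


def discriminating_combinations(genotypes, size):
    """Return marker combinations of `size` whose joint pattern is unique per sample.

    genotypes: dict mapping marker name -> list of patterns, one per sample,
    with samples in the same order for every marker (assumed input layout).
    """
    markers = sorted(genotypes)
    n_samples = len(genotypes[markers[0]])
    hits = []
    for combo in combinations(markers, size):
        # Joint pattern of each sample across the chosen markers.
        joint = {tuple(genotypes[m][i] for m in combo) for i in range(n_samples)}
        # All samples are told apart only if every joint pattern is distinct.
        if len(joint) == n_samples:
            hits.append(combo)
    return hits


def smallest_discriminating_sets(genotypes):
    """Return (size, combinations) for the smallest size that separates all samples."""
    for size in range(1, len(genotypes) + 1):
        hits = discriminating_combinations(genotypes, size)
        if hits:
            return size, hits
    return None, []


# Toy data invented for illustration: three markers scored on four samples.
toy = {
    "M1": ["a", "a", "b", "b"],
    "M2": ["a", "b", "a", "b"],
    "M3": ["a", "a", "a", "b"],
}

size, combos = smallest_discriminating_sets(toy)
print(size, combos)
```

In this toy case no single marker separates the four samples, but the pair M1 and M2 does. The search is exhaustive, so the number of combinations grows quickly with the number of markers; for a screening set of eight markers it remains small.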

 

Keywords

simple sequence repeats, software, best SSR markers, microsatellites, GeneMapper

 

Availability

http://www.ars.usda.gov/msa/jwdsrc/gbru

Citation

Arias et al. Bioinformation 3(8): 352-360 (2009)

 

Edited by

P. Kangueane

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

License

 

 

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.