SSR discovery

From Applied Bioinformatics Group

Molecular genetic markers are among the most powerful tools for analysing plant genomes and associating heritable traits with underlying genetic variation. One form of sequence-based marker, Simple Sequence Repeats (SSRs), also known as microsatellites, now predominates in modern plant genetic analysis. The falling cost of DNA sequencing and the increasing availability of large sequence data sets permit the mining of these data for large numbers of SSRs. These may then be used in applications such as genetic linkage analysis and trait mapping, diversity analysis, association studies, and marker-assisted selection.

SSRs are short stretches of DNA sequence occurring as tandem repeats of mono-, di-, tri-, tetra-, penta- and hexa-nucleotide units. They are highly polymorphic and informative markers; the high level of polymorphism arises from mutations that change the number of repeat units. The value of SSRs lies in their genetic co-dominance, abundance, wide distribution throughout eukaryotic genomes, multi-allelic variation and high reproducibility. These properties provide a number of advantages over other molecular markers: multiple SSR alleles may be detected at a single locus using a simple PCR-based screen, very small quantities of DNA are required for screening, and analysis is amenable to automated allele detection and sizing. The hypervariability of SSRs among related organisms makes them excellent markers for a wide range of applications, including genetic mapping, molecular tagging of genes, genotype identification, analysis of genetic diversity, phenotype mapping and marker-assisted selection. SSRs also demonstrate a high degree of transferability between species: PCR primers designed to an SSR within one species frequently amplify a corresponding locus in related species, enabling comparative genetic and genomic analysis, including studies of synteny and genome rearrangement across taxa.
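The definition above (tandem repeats of a 1 to 6 nucleotide unit) can be illustrated with a minimal Python sketch that scans a sequence for perfect repeats using regular expressions. This is only an illustration and is not the algorithm used by Sputnik or SSRPrimer; the function name `find_ssrs` and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen for the example.

```python
import re

# Hypothetical minimum copy numbers per unit length (1..6 nt); adjust as needed.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, copies) tuples for perfect tandem repeats.

    Coordinates are 0-based and half-open, as in Python slicing.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One unit, followed by at least (min_rep - 1) further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves built from a shorter unit,
            # e.g. "ATAT" is really the dinucleotide "AT".
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            copies = (m.end() - m.start()) // unit_len
            hits.append((m.start(), m.end(), motif, copies))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGCC" + "AT" * 8 + "CCGG" + "GAA" * 5 + "TT"
    for start, end, motif, copies in find_ssrs(example):
        print(f"{motif} x{copies} at {start}-{end}")
```

The sketch reports only perfect repeats; imperfect or compound SSRs, which dedicated tools can tolerate, are deliberately out of scope here.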

The rapid expansion in the availability of sequence data enables the identification of genes and markers underlying key traits for application in molecular breeding and germplasm enhancement. SSRs derived from Expressed Sequence Tags (ESTs) are gene specific, which makes EST-SSRs highly valuable markers for the construction and comparison of genetic maps. The development of SSRs has traditionally been limited by the time-consuming and labour-intensive requirement to construct, enrich and sequence genomic libraries. However, the identification of SSRs from expressed sequences produced during gene discovery projects provides a rich source of molecular markers. Furthermore, these sequences, and the markers developed from them, are a valuable resource for comparative genomic studies.

Previously, we applied the tool SSRPrimer for the rapid discovery of Simple Sequence Repeats (SSRs) from bulk sequence data (Robinson et al., 2004) and from all sequences in GenBank using SSRTaxonomy Tree (Jewell et al., 2006). SSRPrimer combines Sputnik, a program that identifies SSRs in a sequence, with Primer3, which designs PCR primers for amplifying the SSR. SSRTaxonomy Tree applies this tool on a species-wide scale. While these tools enable the rapid and cost-effective discovery of SSRs, laboratory assessment is still required to measure their polymorphic status and therefore their applicability to genetic studies. In silico enrichment of the discovered SSRs for those likely to be polymorphic would save considerable expense in laboratory assessment by reducing the number of primers designed to interrogate monomorphic SSRs. We have therefore developed a tool for the in silico identification of polymorphic SSRs from assembled redundant expressed sequences.
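The idea of in silico enrichment for polymorphic SSRs can be sketched as follows: if redundant sequences covering the same locus carry the same motif between the same flanking sequences but with different copy numbers, the SSR is a candidate polymorphism. The Python below is a simplified sketch under stated assumptions and is not the method implemented in the tool described above. It matches loci by exact flanking sequence rather than by using an assembly alignment, and the function name `candidate_polymorphic_ssrs`, the `SSR_PATTERN` threshold (at least four copies of a 2 to 6 nt unit) and the 8 nt flank length are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical rule: a 2-6 nt unit (shortest unit tried first) present in >= 4 copies.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def candidate_polymorphic_ssrs(reads, flank=8):
    """Group SSRs by (motif, left flank, right flank) across redundant reads.

    Returns only the loci where more than one copy number was observed,
    mapped to the sorted list of observed copy numbers.
    """
    observed = defaultdict(set)
    for read in reads:
        read = read.upper()
        for m in SSR_PATTERN.finditer(read):
            motif = m.group(1)
            # Ignore homopolymer runs and SSRs too close to a read end
            # to have a full-length flank on both sides.
            if len(set(motif)) == 1:
                continue
            if m.start() < flank or m.end() + flank > len(read):
                continue
            key = (
                motif,
                read[m.start() - flank:m.start()],
                read[m.end():m.end() + flank],
            )
            observed[key].add((m.end() - m.start()) // len(motif))
    return {k: sorted(v) for k, v in observed.items() if len(v) > 1}


if __name__ == "__main__":
    left, right = "GATTCCGT", "CGGTTAAC"
    cluster = [
        left + "AG" * 6 + right,
        left + "AG" * 8 + right,
        left + "AG" * 6 + right,
    ]
    for (motif, _, _), copies in candidate_polymorphic_ssrs(cluster).items():
        print(motif, copies)
```

Exact flank matching is used only to keep the sketch short; sequencing errors in a flank would split one locus into several keys, which is one reason a real pipeline would work from the assembly alignment instead.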


Tools

SSRPrimer

SSRTaxonomy Tree

SSRPoly


References:

  • Jewell E, Robinson A, Savage D, Erwin T, Love CG, Lim GAC, Li X, Batley J, Spangenberg GC and Edwards D. (2006) SSR Primer and SSR Taxonomy Tree: Biome SSR discovery. Nucleic Acids Research 34: W656–W659
  • Robinson AJ, Love CG, Batley J, Barker G and Edwards D. (2004) Simple Sequence Repeat Marker Loci Discovery using SSR Primer. Bioinformatics 20: 1475–1476
  • Hong CP, Piao ZY, Kang TW, Batley J, Yang TJ, Hur YK, Bhak J, Edwards D and Lim YP. (2007) Genomic distribution of Simple Sequence Repeats in Brassica rapa. Molecules and Cells 23(3): 349–356
  • Batley J, Hopkins CJ, Cogan NOI, Hand M, Jewell E, Kaur J, Kaur S, Li X, Ling AE, Love C, Mountford H, Todorovic M, Vardy M, Walkiewicz M, Spangenberg GC and Edwards D. (2007) Identification and characterisation of Simple Sequence Repeat (SSR) markers from Brassica napus expressed sequences. Molecular Ecology Notes 7: 886–889
  • Hopkins CJ, Cogan NOI, Hand M, Jewell E, Kaur J, Li X, Lim GAC, Ling AE, Love C, Mountford H, Todorovic M, Vardy M, Spangenberg GC, Edwards D and Batley J. (2007) Sixteen new simple sequence repeat markers from Brassica juncea expressed sequences and their cross-species amplification. Molecular Ecology Notes 7: 697–700
  • Ling AE, Kaur J, Burgess B, Hand M, Hopkins CJ, Li X, Love CG, Vardy M, Walkiewicz M, Spangenberg G, Edwards D and Batley J. (2007) Characterisation of Simple Sequence Repeat markers derived in silico from Brassica rapa Bacterial Artificial Chromosome sequences and their application in Brassica napus. Molecular Ecology Notes 7: 273–277
  • Burgess B, Mountford H, Hopkins CJ, Love C, Ling AE, Spangenberg G, Edwards D and Batley J. (2006) Identification and characterisation of Simple Sequence Repeat (SSR) markers derived in silico from Brassica oleracea genome shotgun sequences. Molecular Ecology Notes 6: 1191–1194
  • Keniry A, Hopkins CJ, Jewell E, Morrison B, Spangenberg GS, Edwards D and Batley J. (2006) Identification and characterisation of Simple Sequence Repeat (SSR) markers from Fragaria x ananassa expressed sequences. Molecular Ecology Notes 6: 319–322
  • Mortimer J, Batley J, Love C, Logan E and Edwards D. (2005) Simple Sequence Repeat (SSR) and GC distribution in the Arabidopsis thaliana genome. Journal of Plant Biotechnology 7: 17–25


