Sugarcane is an important vegetatively propagated crop which
is cultivated for its sugar-rich stalks. It contributes an estimated
75 % of the world's sucrose supply with its mature stem capable
of accumulating 12–16 % of its fresh weight and approx.
50 % of its dry weight as sucrose (Bull and Glasziou, 1963).
Sugarcane originated in South-east Asia and New Guinea (Lebot, 1999).
Modern cultivated sugarcane (Saccharum spp.) is a hybrid complex
originating from crosses between S. officinarum L. and S. spontaneum L.,
and in some lineages S. sinense Roxb. or S. barberi (Edme et al., 2005).
Limited numbers of clones, and hence genetic
variation, of the two major progenitors have been captured by
commercial breeding programmes. Sugarcane, like sorghum, is
a relatively recently domesticated species with little of the
available genetic diversity having been incorporated or actively
analysed for introgression into domesticated varieties. Breeding
programmes in the early 1900s focused on hybridization of S. officinarum
clones but soon progressed to interspecific crosses incorporating
S. spontaneum. This resulted in improved agronomic traits, such as
ratooning and disease resistance, but required a backcrossing programme
to S. officinarum, called ‘nobilization’, to elevate the sucrose content
(Roach, 1989; Edme et al., 2005).
Since then the majority of breeding programmes have focused
on intercrossing between the hybrids, though in recent decades
the larger increases in genetic gain have been made by incorporating
more diverse germplasm into the cultivated backgrounds (Edme et al., 2005).
Taxonomy of sugarcane
Sugarcane belongs to the genus Saccharum, first established by Linnaeus in Species Plantarum in 1753 with two species: S. officinarum and S. spicatum L. Linnaeus' original classification has since been revised to contain six species: S. officinarum, known as the noble cane; S. spontaneum L., S. robustum E.W. Brandes & Jeswiet ex Grassl, and S. edule Hassk., classified as wild species; and S. sinense Roxb. and S. barberi Jeswiet, classified as ancient hybrids (Buzacott, 1965; Daniels and Roach, 1987; D'Hont and Layssac, 1998). The genus falls in the tribe Andropogoneae of the grass family, Poaceae; the tribe also includes other tropical grasses such as Sorghum and Zea (maize). Closely related to Saccharum are another four genera (Erianthus section Ripidium, Miscanthus section Diandra, Narenga and Sclerostachya) that purportedly interbreed readily, forming the ‘Saccharum complex’ (Daniels and Roach, 1987). They have in common a high level of polyploidy and aneuploidy (unbalanced number of chromosomes) that creates a challenge for both the taxonomist and the molecular biologist (Daniels and Roach, 1987; Sreenivasan et al., 1987).
The sugarcane genome
The complexity and size of the sugarcane genome are a major limitation in genetic improvement. Whilst continued selective breeding for enhanced sucrose accumulation has achieved over half of the yield increase of the past 50 years, it has been reported to have reached a plateau due to limits to the gene pool exploited in traditional breeding programmes (Mariotti, 2002). Individual research programmes, however, have been shown to still be making significant annual genetic gains by maintaining a diverse gene pool (Edme et al., 2005). The employment of new technologies to assist in the association of traits with genetic markers and genetic maps can aid in achieving further yield increases in breeding programmes.
Most sugarcane cultivars contain more than 100 chromosomes, which can be assigned to eight homology groups (Rossi et al., 2003; Aitken et al., 2005). Over the past two decades, studies utilizing various molecular techniques to unravel the complexity of this important crop species have provided a greater understanding of its complex genetic make-up (Bonierbale et al., 1988; Wu et al., 1992; D'Hont, 1994; Sills et al., 1995; Grivet et al., 1996; Ming et al., 2001; Rossi et al., 2003). Significant milestones include the demonstrated use of single-dose markers (present on one chromosome only) and double-dose markers (present on two chromosomes) for mapping and QTL analysis (Ming et al., 2001, 2002; Hoarau et al., 2002; Aitken et al., 2004), and large-scale EST sequencing projects by SUCEST, the Sugar Cane EST Genome Project (Vettore et al., 2001); SASRI, the South African Sugar Research Institute (Carson and Botha, 2000); UGA, the University of Georgia, USA (Ma et al., 2004); and CSIRO, Australia's Commonwealth Scientific and Industrial Research Organization (Casu et al., 2004). Unfortunately, despite these achievements, the pace of progress with sugarcane genomics has lagged behind that achieved with other agricultural crops (Ramsay et al., 2000; Delseny et al., 2001; Mullet et al., 2002).
Analysis of variation in the sugarcane genome
In 1997, an effort was made by the International Consortium for Sugarcane Biotechnology to develop and evaluate simple sequence repeats (SSRs), or microsatellite sequences, as a marker system for sugarcane. Markers were developed from an enriched microsatellite library and were shown to have the capacity to distinguish between sugarcane genotypes due to their ability to detect large numbers of alleles (Cordeiro et al., 2000). To date, this marker system has delivered a number of applications that have advanced both sugarcane research and breeding. Published applications include the mapping of alleles generated from 72 SSR primer pairs onto a genetic map constructed on the Australian hybrid cultivar Q165A (Aitken et al., 2005); validation of the introgression of genes into F1 hybrids of crosses made between S. spontaneum and elite commercial clones (Pan et al., 2004); the confirmation of fertile intergeneric F1 hybrids of S. officinarum and E. arundinaceus, as well as backcross (BC1) progeny from the F1 to hybrid sugarcane (Cai et al., 2005); and the use of the markers to register and confirm sugarcane varieties by the United States Department of Agriculture (USDA) (Tew et al., 2003). SSR markers have also been used to draw useful information on the relationships between various members of the ‘Saccharum complex’ (Cordeiro et al., 2003; Cai et al., 2005) as well as relationships between clonal cultivars of hybrid canes (Pan et al., 2003a). A fingerprint database of major Australian sugarcane cultivars has been developed using these markers (Piperidis et al., 2001), as has molecular genotyping of elite clones produced by the USDA (Pan et al., 2003a, b).
High-throughput SNP genotyping
High-throughput genotyping technologies based on single nucleotide polymorphisms (SNPs) or small-scale insertions/deletions (indels) could become efficient alternatives to traditional markers because of their greater abundance in the genome and ease of measurement. SNPs are being identified and rapidly mapped to provide a rich source of genetic information with the potential to give greater insight into the genetic complexity of many organisms. SNPs are present at high frequency in any genome, are amenable to high-throughput analysis and can reveal hidden polymorphisms where other methods fail (Bhattramakki and Rafalski, 2001). In plants, a number of studies have been able to link SNPs with phenotypic traits of agronomic interest, such as the putative betaine aldehyde dehydrogenase gene responsible for the fragrance trait in rice (Bradbury et al., 2005) and SNPs found in the starch synthase IIa gene associated with starch gelatinization temperature in rice (Waters et al., 2005). These studies highlight the usefulness of SNP markers, demonstrating both the abundance of this marker type and the potential causal association between a single nucleotide alteration and organism phenotype. A further major advantage of SNP markers is that they allow easy and unambiguous identification of alleles or haplotypes.
Whilst numerous technical methods have been developed for SNP detection (Gut, 2001), the majority are applicable mainly to diploid genomes, where a simple presence/absence of either one or both of the alternative bases would indicate homozygosity or heterozygosity. Sugarcane, with its complex genome comprising an estimated 8–14 copies of every chromosome (Rossi et al., 2003; Aitken et al., 2004), can have up to 14 different alleles present, with individual alleles in varying numbers. Thus, the frequency of an SNP base at a gene locus will be determined by the number of chromosomes carrying the gene, the number of different alleles (or haplotypes), and the frequency of each allele possessing each SNP base. Hence, any method used to detect SNPs at a particular locus in sugarcane must be able to determine the frequency of each SNP base in different genotypes, rather than simply detecting the presence or absence of SNPs. Such detection systems are generally more complex and expensive than the simpler and more common methods used for less complex genomes (Ross et al., 1998; Ahmadian et al., 2000; Alderborn et al., 2000; Nurmi et al., 2001; Storm et al., 2003).
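The dependence of SNP base frequency on copy number and allele dosage can be illustrated with a minimal Python sketch. The function names and the haplotype labels (h1, h2, h3) are hypothetical and chosen only for illustration, and the sketch assumes that every chromosome copy at the locus contributes equally to the measured signal.

```python
from fractions import Fraction


def possible_snp_frequencies(copy_number):
    """Frequencies an alternate SNP base can take when it is carried by
    1 .. copy_number - 1 of the homo(eo)logous chromosomes at a locus."""
    return [Fraction(dose, copy_number) for dose in range(1, copy_number)]


def snp_base_frequency(haplotype_doses, haplotypes_with_alt):
    """Frequency of the alternate base at one SNP position.

    haplotype_doses     -- dict: haplotype label -> number of chromosome copies
    haplotypes_with_alt -- set of haplotype labels that carry the alternate base
    (Both arguments are illustrative assumptions, not data from the text.)
    """
    total = sum(haplotype_doses.values())
    alt = sum(dose for name, dose in haplotype_doses.items()
              if name in haplotypes_with_alt)
    return Fraction(alt, total)


if __name__ == "__main__":
    # A diploid heterozygote can only show a frequency of 1/2 ...
    print(possible_snp_frequencies(2))
    # ... whereas a locus present in 14 copies allows 13 distinct frequencies.
    print(len(possible_snp_frequencies(14)))
    # Hypothetical locus with 10 copies made up of three haplotypes.
    doses = {"h1": 5, "h2": 3, "h3": 2}
    print(snp_base_frequency(doses, {"h2"}))        # prints 3/10
    print(snp_base_frequency(doses, {"h2", "h3"}))  # prints 1/2
```

In a diploid the only informative value is 1/2, so a presence/absence call is enough. With 8–14 copies the same SNP can occur at many distinct frequencies, which is why a quantitative readout is needed.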
Use of SNPs in sugarcane
Currently, whilst there are only a limited number of papers describing the use of SNPs to understand the sugarcane genome, they point to this marker system as a valuable means of mapping candidate genes and of identifying the genetic basis of QTLs for agronomically important traits. These studies discuss the ability of SNPs to: delineate a set of 64 ESTs into two groups that are likely to represent two gene family members of 6-phosphogluconate dehydrogenase (Grivet et al., 2001); delineate 178 ESTs into three paralogous genes, revealing the expression of an Adh2 and two Adh1 genes in sugarcane (Grivet et al., 2003); support the development of co-dominant cleaved amplified polymorphic sequence (CAPS) markers (Quint et al., 2002); and map several candidate genes and ESTs (McIntyre et al., 2005).
In sugarcane, the proportional frequencies of each SNP base will vary depending on the number of alleles of the gene containing the SNP locus. The ability to capture this information accurately across several SNPs within a set of homo(eo)logous alleles can give an indication of the number of allele haplotypes present for a gene and potentially provide the haplotype sequences. This information could have implications for sugarcane breeding. High yield potential may be due to the presence of a specific allele (or alleles) at a gene locus, to the number of copies present, or possibly to a combination of both. Knowledge of the sequence underlying each allele haplotype has the potential to allow allele-specific markers to be designed.
Quantitative detection of allele dosage in sugarcane is now possible with such techniques as pyrophosphate sequencing using the Pyrosequencer™ platform (Cordeiro et al., 2006b) and mass spectrometry using the Sequenom™ platform (Cordeiro et al., 2006b). These methods have allowed the quantitative detection of the frequencies of consensus to alternate SNP bases at any particular SNP locus. Utilizing a group of SNP markers developed to the same EST or gene, it becomes possible to infer the likely copy number of the EST or gene. This information then allows possible haplotypes of a gene present in hybrid cane to be determined through statistical approaches (Cordeiro et al., 2006b).
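One simple way to picture the copy-number inference is a least-squares fit of the measured frequencies to the dose grid k/n for each candidate copy number n. The Python sketch below is an illustrative assumption only and is not the statistical approach of Cordeiro et al. (2006b); the function name, the candidate range of 8–14 copies (taken from the estimate quoted earlier) and the example frequencies are all hypothetical.

```python
def best_copy_number(observed_frequencies, candidates=range(8, 15)):
    """Return the candidate copy number n whose possible dose frequencies
    (k / n for k = 0 .. n) lie closest, in the least-squares sense, to the
    alternate-base frequencies measured at several SNPs in the same gene."""

    def cost(n):
        # For each SNP, use the squared distance to the nearest k / n.
        return sum(
            min((freq - k / n) ** 2 for k in range(n + 1))
            for freq in observed_frequencies
        )

    # min() keeps the smallest n when two candidates fit equally well.
    return min(candidates, key=cost)


if __name__ == "__main__":
    # Hypothetical frequencies from three SNPs in one EST.
    print(best_copy_number([0.1, 0.3, 0.5]))   # prints 10
    # Frequencies that are multiples of 1/8 point to eight copies.
    print(best_copy_number([0.25, 0.125]))     # prints 8
```

A single SNP is rarely decisive (a frequency of 0.5 fits every even copy number equally well), which is why several SNPs in the same EST or gene are combined. Real measurements carry noise, so a practical analysis would need an explicit error model.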
In theory, the association of SNP variations with either the presence or absence of different phenotypes among individuals, or among individuals from different populations, appears straightforward. This simplistic view does not account for the fact that the majority of base polymorphisms do not result in any amino acid change. Haplotypes are more important for predicting individual phenotypes than are the underlying SNPs. Determining haplotypes also makes it possible to infer the evolutionary history of a DNA region (Templeton et al., 1988; Tishkoff et al., 1998). However, difficulties are encountered in determining SNP haplotypes when inbred or homozygous individuals are not available (Rafalski, 2002), as is usually the case with sugarcane.
The ability to determine SNP base frequencies provides the means to determine the likely copy number of homo(eo)logous loci in sugarcane. Where chromosome counts have been performed for a genotype, this information can be used to support the inference of the most likely copy number of homo(eo)logous loci. Knowledge of the number of homo(eo)logous loci will assist in the deduction of the allelic composition of the locus in any particular sugarcane genotype. The ability to determine haplotypes also opens possibilities for unravelling the complexities of the sugarcane genome. By defining haplotypes in parents of crosses, it may be possible to deduce their segregation in progeny, or to determine allele dosage and composition in any particular genotype in relation to phenotypic performance. A further level of analysis is required to determine the level of expression of each of the haplotypes in this complex genome.
Genetic maps are widely used in plant breeding to identify genomic regions controlling traits of interest. Such information assists in understanding the genetic basis of the target trait, as well as providing DNA markers for use in marker-assisted breeding. In sugarcane, only markers that are present as a single copy in one parent and absent in the second [i.e. single-dose (SD) markers] can be incorporated into maps using populations of conventional size (approx. 250 progeny) (Wu et al., 1992). In these populations, SD markers segregate in a 1 : 1 ratio.
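Whether a marker behaves as single-dose can be checked against the expected 1 : 1 ratio with a chi-square goodness-of-fit test. The Python sketch below assumes this standard test; the text states only the expected ratio, and the function names and progeny counts are hypothetical.

```python
# Chi-square critical value for 1 degree of freedom at the 5 % level.
CRITICAL_5PCT_1DF = 3.841


def chi_square_1_to_1(present, absent):
    """Chi-square statistic for marker presence/absence counts in progeny,
    tested against the 1 : 1 ratio expected of a single-dose marker."""
    expected = (present + absent) / 2
    return ((present - expected) ** 2 + (absent - expected) ** 2) / expected


def fits_single_dose(present, absent):
    """True when the counts do not differ significantly from 1 : 1."""
    return chi_square_1_to_1(present, absent) <= CRITICAL_5PCT_1DF


if __name__ == "__main__":
    # Hypothetical scores in a population of 250 progeny.
    print(fits_single_dose(130, 120))   # close to 1 : 1, prints True
    print(fits_single_dose(188, 62))    # close to 3 : 1, prints False
```

Markers that fail the test are normally set aside rather than forced onto the map, since markers present in more than one copy segregate in other ratios.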
The first map of a cultivar was initiated on the selfed progeny of SP70-1006 (D'Hont, 1994). This map was later transferred and further developed on the cultivar R570 (Grivet et al., 1996) using RFLP probes from maize and sugarcane. By 2001, the R570 map, as it had become commonly known, contained some 600 RFLP markers derived from a number of grass (Poaceae) species (D'Hont and Glaszmann, 2001). The markers on this map are distributed over 98 cosegregation groups covering a total length of 2008 cM. A parallel mapping effort placed 939 single-dose AFLP markers on R570, of which 887 were distributed into 120 cosegregation or linkage groups (Hoarau et al., 2001). A more recent map has been developed from a cross between the Australian commercial variety Q165A (2n = 115) and the S. officinarum clone IJ76-514 (2n = 80) using a combination of AFLP and SSR markers. A total of 967 single-dose markers were generated from the two marker systems, and 910 were distributed across 116 linkage groups covering a total map length of 9058·3 cM (Aitken et al., 2005). Markers on these maps have all been generated through anonymous marker systems. However, the use of SNP markers is now resulting in ESTs being mapped onto the Q165A map.
EcoTILLING for mapping ESTs
Parallel to the development of quantitative SNP frequency scoring methods has been the adaptation of the EcoTILLING method for detecting and mapping sugarcane ESTs. TILLING utilizes the CelI mismatch-cleavage enzyme on heteroduplexed DNA strands, with detection of the end-labelled cleavage products (McCallum et al., 2000). A variant of this method utilizes natural populations for the discovery of polymorphisms (SNPs, SSRs and indels), and is referred to as EcoTILLING (Comai et al., 2004). Both methods as published rely largely on electrophoretic gels to separate and visualize the products. In sugarcane, this does not allow SNPs that occur on a single allele to be clearly detected. Modifying the protocol and moving the detection system to capillary electrophoresis has allowed single-dose SNPs in sugarcane to be identified (Cordeiro et al., 2006b) and mapped (McIntyre et al., 2006). Our early experience with family members of the sucrose phosphate synthase (SPS) gene indicates straightforward detection of 5–11 SNPs in fragments of genomic DNA between 300 bp and 400 bp in length. Neither prior knowledge of any SNP in the fragment nor the alignment of multiple ESTs is required to identify putative SNPs and their location. Whilst the method is as yet unable to indicate the frequency at which an individual SNP base is present, it has been demonstrated that the detected variation in base composition segregates as expected in progeny of mapping populations. Using the SPS gene family members as an example, mapping through the EcoTILLING approach supports sequence information that three of the five gene family members may contain more than one gene, with each gene possessing from one to five alleles (McIntyre et al., 2006). This observation will in time allow further unravelling of the complexities of the sugarcane genome.
Sorghum genome information as a resource for sugarcane
Sorghum is the closest cultivated relative of sugarcane. Sugarcane has a large genome that has duplicated at least twice since it diverged from sorghum, around 5 million years ago (Al-Janbi et al., 1997). The extensive similarity in gene order between these two genomes, where intercrosses are still possible (Ming et al., 1998), makes sorghum the best model crop for the Andropogoneae tribe (Price et al., 2005a), with the aim of understanding the extensive gene rearrangements and assisting the development of genetic maps in sugarcane.
Sequencing of Sorghum provides another model genome within the grasses which, particularly when utilized in conjunction with rice, will stimulate evolutionary understanding of the entire Poaceae. Sequencing will stimulate gene and allele discovery and crop improvement in Sorghum as it did in rice. Sugarcane genomics will be supported by the Sorghum sequence data. The sequences of Sorghum genes and, to a lesser extent, the location of genes in the genome should be useful in sugarcane.
Genetic resources for sorghum and sugarcane improvement have
been enhanced by the application of genomic tools to analysis
of wild relatives in the Sorghum
populations (including TILLING populations) of Sorghum
the options for gene discovery and genetic manipulation. Protocols
for EcoTILLING (Cordeiro et al., 2006a) and quantitative SNP
analysis in the complex sugarcane genome should be valuable
tools for gene mapping, gene discovery and association genetics
in sugarcane. The availability of a Sorghum genome sequence
will further accelerate the potential to apply these techniques
in both Sorghum
and sugarcane. Gene discovery in this germplasm
will also be supported by application of advances in expression
profiling tools as has been applied to other crop species in
the Poaceae (McIntosh et al., 2007).