Dawn of the wheat genomics era


It is one thing to sequence a genome, another to understand what it means. Now, new bioinformatics tools are making it possible to properly exploit the bread wheat genome sequence in breeding programs.

Bioinformatics group leader at the University of Adelaide Dr Ute Baumann discusses the DAWN bioinformatics tool with Dr Nathan Watson-Haigh, the lead bioinformatician who worked on DAWN’s development. PHOTO: University of Adelaide

Knowledge of the bread wheat genome has come of age after years of researchers grappling with its size and genetic complexity.

Now, the genome has been sequenced, its key genetic structures decoded, and differences between cultivars identified in ways that support accelerated rates of genetic improvement.

The milestone was made possible by two concurrent advances. The first involved the technically challenging grunt work of sequencing the entire genome in small fragments and then assembling that sequence in the correct order. The second involved analytical methods to make sense of the resulting enormous dataset.

Crucial to cracking the analysis challenge is Dr Ute Baumann, who heads the bioinformatics program at the Australian Research Council (ARC) Industrial Transformation Research Hub for Wheat in a Hot and Dry Climate, located at the Waite campus of the University of Adelaide.

GRDC investment made it possible for Dr Baumann to develop a web-accessible bioinformatics platform called Diversity Among Wheat geNomes (DAWN). This platform provides unprecedented – and once unimaginable – insights into the structure, genetic function, genealogy and biodiversity of the bread wheat genome.

Associate Professor Delphine Fleury, who heads the ARC wheat hub, says the bioinformatics breakthrough was essential for the grains industry to benefit from global efforts to fully sequence the wheat genome.

“If we can’t read the sequence information, we can’t use it,” Associate Professor Fleury says. “The genome is big – 39.5 times bigger than the rice genome and 5.6 times larger than the human genome. That poses computational challenges to handle, search and sort through that much sequence data. That’s the problem that Dr Baumann solved. She gave us access to the genome.”

The key to learning to ‘read’ the genome was comparing the sequences of 16 different bread wheat cultivars. That comparison provided the contrast needed to understand the functional structure of the genome at both macro and micro scales.

Making sense of genetic gibberish

As the University of Adelaide’s representative to the International Wheat Genome Sequencing Consortium (IWGSC), Dr Baumann was aware early on that the gargantuan size of the wheat genome would complicate its use as a research tool.

In 2017, however, she caught sight of a way to solve the analytical challenge.

The world’s biggest jigsaw puzzle

Accompanying the arrival of wheat bioinformatics is the release of the first assembled and annotated ‘reference sequence’ of the bread wheat genome by the International Wheat Genome Sequencing Consortium (IWGSC).

As the University of Adelaide’s representative to the IWGSC, Dr Ute Baumann explains that producing the reference sequence required bringing together datasets produced by different technical sequencing strategies.

“People have come to expect that genomes can be sequenced quickly and cheaply,” Dr Baumann says. “But the wheat genome involves an unprecedented level of complexity. If I were to list all the techniques and technological advances used to produce the reference sequence, it would illustrate the complexity of this project.”

While the wheat genome has just seven basic chromosome subtypes (numbered chromosome 1 through to 7), these chromosomes can be exceptionally big. For example, a single wheat chromosome can be bigger than the entire rice genome.

In another major complication, there are three versions of each chromosome subtype due to the co-existence of three progenitor grass genomes in wheat (called genomes A, B and D). Because of genetic similarities among the progenitor species, it is difficult to distinguish the A, B and D versions of each chromosome’s DNA (Figure 1).

One approach taken by the IWGSC was to first sort and separate the 21 different chromosomes (seven subtypes, each in an A, B and D version) before distributing them for sequencing to different laboratories around the world. Professor Rudi Appels, currently at La Trobe University, led the Australian team that sequenced and assembled chromosome 7A while based at Murdoch University.
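The arithmetic behind that count of 21 can be sketched in a few lines of Python. This is an illustration only: the labels follow the naming used in this article (for example, chromosome 7A), and the variable names are made up for the example.

```python
# Illustrative sketch: enumerate the bread wheat chromosome labels.
# Variable names are hypothetical; labels follow the article (e.g. "7A").
SUBTYPES = range(1, 8)          # chromosome subtypes 1 through 7
SUBGENOMES = ("A", "B", "D")    # the three progenitor genomes

# Each subtype exists once per progenitor genome: 7 x 3 combinations.
chromosomes = [f"{number}{genome}" for number in SUBTYPES for genome in SUBGENOMES]

print(len(chromosomes))   # total number of distinct chromosomes
print(chromosomes)
```

Seven subtypes multiplied by three progenitor genomes gives the 21 chromosomes that were sorted and distributed for sequencing.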

“In addition, the reference sequence is annotated,” Dr Baumann says. “That basically means that the genes have been identified along with their genetic structure, their likely function, in which tissue the genes are expressed, and the location of molecular markers for all the most common marker systems, such as the 90K iSelect platform. So the annotated genome is a really fantastic resource that encapsulates a lot of prior research.”

Dr Baumann explains that Bioplatforms Australia (BPA) used advanced sequencing technology to process the genomes of 16 different wheat cultivars. Importantly, the set included 11 Australian varieties, including Baxter (which expresses novel rust resistance), Drysdale (which has enhanced water use efficiency), Gladius and Yitpi.

“Comparative genomics produces an extraordinary amount of information because it can create a picture of the function associated with different areas of the genome,” Dr Baumann says.

There was one major obstacle, however. To sequence a genome, the continuous DNA strand within every chromosome must first be fragmented into tiny pieces before it can be sequenced. To achieve this, BPA used a technique called ‘shotgun sequencing’, which makes it very difficult to assemble the fragmented sequence in the right order.

To solve what amounts to the world’s greatest jigsaw puzzle, Dr Baumann made use of data produced by the IWGSC that provided clues to the right way to assemble the fragmented DNA sequence. From data that initially meant little emerged an entirely unprecedented view of bread wheat genetics.
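The general idea of using a known sequence as a guide for ordering shotgun fragments can be shown with a toy example. The sketch below is not the method used for DAWN, which the article does not describe in detail. It assumes short made-up sequences, exact matches only and no repeated regions, whereas real wheat data involves sequencing errors, cultivar differences and highly repetitive DNA. The function names `place_fragments` and `stitch` are hypothetical.

```python
# Toy sketch of reference-guided ordering of shotgun fragments.
# Assumptions: made-up sequences, exact matches only, no repeats.

def place_fragments(reference, fragments):
    """Return (position, fragment) pairs sorted by position on the reference."""
    placed = []
    for fragment in fragments:
        position = reference.find(fragment)  # first exact match, or -1 if none
        if position != -1:
            placed.append((position, fragment))
    return sorted(placed)


def stitch(placed):
    """Merge ordered, overlapping fragments into one continuous sequence."""
    if not placed:
        return ""
    start, sequence = placed[0]
    for position, fragment in placed[1:]:
        end = start + len(sequence)          # reference coordinate just past the sequence so far
        if position > end:
            raise ValueError("gap between fragments; cannot stitch")
        if position + len(fragment) > end:
            sequence += fragment[end - position:]  # append only the part not yet covered
    return sequence


reference = "ATGGCGTACGTTAGC"                      # hypothetical reference
fragments = ["GTTAGC", "ATGGCG", "GCGTACGT"]       # shuffled, overlapping pieces

placed = place_fragments(reference, fragments)
print(placed)
print(stitch(placed))
```

In this simplified setting, the reference tells each fragment where it belongs, and the overlaps let the pieces be joined back into a single sequence.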

“For the first time, we have achieved insights about the wheat genome that don’t just describe, but explain cultivar performance,” Dr Baumann says. “We can just about see how cultivars were bred just from the structure of their genome. We can even start to predict optimal genomic configurations needed to achieve higher-performing, new varieties.”

This amounts to a gateway technology for establishing predictive breeding capability in the wheat industry, as has already occurred for maize hybrids in the US.

“That is where we see the future going,” Dr Baumann says. “Essential to that goal is the ability to understand the genome structure that underlies genetic diversity, including the major agronomically important traits – flowering time, height, quality and yield genes. Only then can we try to design and construct a variety based on optimal gene variants and gene combinations while also designing the best breeding strategy to achieve that ideal cultivar.”

The inclusion of Australian cultivars within DAWN further means Australia is on the frontline of exploiting this astonishing new capability, with wheat-breeding companies, such as Limagrain, already expressing interest in tapping the analytical power of Dr Baumann’s bioinformatics tools.

DAWN brings both large-scale and small-scale details into focus.

Visible on a chromosome-wide basis is the intrusion of large segments of DNA from species related to wheat (called an alien introgression), including segments that have brought in important new traits, such as the rust-resistance gene Sr36. Dr Baumann can now account for chromosomal regions that will happily exchange and recombine DNA during reproduction and those that will not, causing genes contained within the region to always be inherited together.

“We can then also zoom into specific regions where we can find sequence differences of a single letter in the genetic code,” Dr Baumann says. “This kind of sequence diversity can account for trait variation between cultivars. It also allows researchers and breeders to develop diagnostic markers for important traits.”
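A single-letter difference of the kind Dr Baumann describes can be illustrated with a minimal comparison of two already-aligned sequences. The sequences and the function name `single_letter_differences` below are hypothetical and are not real cultivar data. The sketch also assumes both sequences are the same length and already lined up, which in practice requires an alignment step first.

```python
# Minimal sketch: list single-letter differences between two aligned sequences.
# The sequences are invented for illustration, not taken from any cultivar.

def single_letter_differences(seq_a, seq_b):
    """Return (1-based position, letter in seq_a, letter in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, a, b)
        for index, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


cultivar_1 = "ATGGCGTACGTTAGC"
cultivar_2 = "ATGGCGTACATTAGC"

print(single_letter_differences(cultivar_1, cultivar_2))
```

Each reported position is a candidate site for the kind of diagnostic marker mentioned in the quote, although linking a difference to a trait requires further evidence.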

An example of exploiting this level of resolution is the development of markers diagnostic for amylose content in grain. Another is the identification of two genes for yield – including one that is associated with heat tolerance – both located on the Australian-sequenced chromosome 7A.

“Bioinformatics is what grants researchers access to the wheat genome and it is sharpening our understanding of how best to develop new varieties,” Associate Professor Fleury says. “We can now see those regions that pose problems for breeding and can hold up genetic gain. So knowledge of genome structure matters to the grain industry, especially for a complex genome like wheat.”

Figure 1 Besides its size, the bread wheat genome is complex due to the co-existence in wheat of the genomes from three progenitor species, called genomes A, B and D. SOURCE: International Wheat Genome Sequencing Consortium

More information

Dr Ute Baumann
ute.baumann@adelaide.edu.au

Associate Professor Delphine Fleury
delphine.fleury@adelaide.edu.au

The DAWN bioinformatics platform is publicly accessible online.