Vast DNA tree of life for plants revealed by global science team using 1.8 billion letters of genetic code

Scientists sequenced the parasitic plant Pilostyles aethiopica that lives inside of other plants and is only visible when it flowers. DNA sequencing has reclassified the group in which this plant sits.

A new paper published today (April 24) in the journal Nature by an international team of 279 scientists led by the Royal Botanic Gardens, Kew presents the most up-to-date understanding of the flowering plant tree of life.

Using 1.8 billion letters of genetic code from more than 9,500 species covering almost 8,000 known flowering plant genera (ca. 60%), this incredible achievement sheds new light on the evolutionary history of flowering plants and their rise to ecological dominance on Earth.

The study’s authors believe the data will aid future attempts to identify new species, refine plant classification, uncover new medicinal compounds, and conserve plants in the face of climate change and biodiversity loss.

This major milestone for plant science, led by Kew and involving 138 organizations internationally, was built on 15 times more data than any comparable study of the flowering plant tree of life. Among the species sequenced for this study, more than 800 have never had their DNA sequenced before.

The sheer amount of data unlocked by this research, which would take a single computer 18 years to process, is a huge stride towards building a tree of life for all 330,000 known species of flowering plants—a massive undertaking by Kew’s Tree of Life Initiative.

Dr. Alexandre Zuntini, Research Fellow at RBG Kew, says, “Analyzing this unprecedented amount of data to decode the information hidden in millions of DNA sequences was a huge challenge. But it also offered the unique opportunity to reevaluate and extend our knowledge of the plant tree of life, opening a new window to explore the complexity of plant evolution.”

The Angiosperm Tree of Life was built on 15 times more data than comparable studies and involved sequencing more than 9,500 different species of flowering plants. Credit: RBG Kew

Unlocking historic herbarium specimens for cutting-edge research

The flowering plant tree of life, much like our own family tree, enables us to understand how different species are related to each other. The tree of life is uncovered by comparing DNA sequences between different species to identify changes (mutations) that accumulate over time like a molecular fossil record.
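The idea of comparing sequences to count accumulated changes can be illustrated with a toy sketch. This is not the method used in the study, which relies on far more sophisticated analyses of hundreds of genes. The sequences, the species names, and the helper functions below are all made up for illustration, and the sketch assumes the sequences are already aligned to the same length.

```python
from itertools import combinations

# Hypothetical, made-up aligned sequences (NOT data from the study).
ALIGNED = {
    "species_a": "ACGTACGTAC",
    "species_b": "ACGTACGTTC",
    "species_c": "ACGAACCTTC",
    "species_d": "TCGAACCTTG",
}


def count_differences(seq1, seq2):
    """Count positions where two aligned sequences carry different letters."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq1, seq2))


def pairwise_differences(aligned):
    """Map each pair of names to the number of differing positions."""
    return {
        (n1, n2): count_differences(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


def closest_pair(aligned):
    """Return the pair of names separated by the fewest differences."""
    diffs = pairwise_differences(aligned)
    return min(diffs, key=diffs.get)


if __name__ == "__main__":
    for pair, n in pairwise_differences(ALIGNED).items():
        print(pair, n)
    print("closest:", closest_pair(ALIGNED))
```

In this toy data, the pair with the fewest differences would be treated as the most closely related. Real tree building goes well beyond raw difference counts, for example by modelling how mutations occur over time.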

Our understanding of the tree of life is improving rapidly in tandem with advances in DNA sequencing technology. For this study, new genomic techniques were developed to magnetically capture hundreds of genes and hundreds of thousands of letters of genetic code from every sample, orders of magnitude more than earlier methods.

A key advantage of the team’s approach is that it enables a wide diversity of plant material, old and new, to be sequenced, even when the DNA is badly damaged. The vast treasure troves of dried plant material in the world’s herbarium collections, which comprise nearly 400 million scientific specimens of plants, can now be studied genetically.

Using such specimens, the team successfully sequenced a sandwort specimen (Arenaria globiflora) collected nearly 200 years ago in Nepal and, despite the poor quality of its DNA, were able to place it in the tree of life.

The team even analyzed extinct plants, such as the Guadalupe Island olive (Hesperelaea palmeri), which has not been seen alive since 1875. In fact, 511 of the species sequenced are already at risk of extinction, according to the IUCN Red List, including three more like Hesperelaea that are already extinct.

Professor William Baker, Senior Research Leader–Tree of Life, says, “In many ways this novel approach has allowed us to collaborate with the botanists of the past by tapping into the wealth of data locked up in historic herbarium specimens, some of which were collected as far back as the early 19th century.

“Our illustrious predecessors such as Charles Darwin or Joseph Hooker could not have anticipated how important these specimens would be in genomic research today. DNA was not even discovered in their lifetimes!

“Our work shows just how important these incredible botanical museums are to ground-breaking studies of life on Earth. Who knows what other undiscovered science opportunities lie within them?”

Of the 9,506 species sequenced, more than 3,400 came from material sourced from 163 herbaria in 48 countries. Additional material from plant collections around the world (e.g., DNA banks, seeds, living collections) has been vital for filling key knowledge gaps and shedding new light on the history of flowering plant evolution. The team also benefited from publicly available data for more than 1,900 species, highlighting the value of the open science approach to future genomic research.

Illuminating Darwin’s abominable mystery

Flowering plants alone account for about 90% of all known plant life on land and are found virtually everywhere on the planet, from the steamiest tropics to the rocky outcrops of the Antarctic Peninsula. And yet, the question of how these plants came to dominate the scene so soon after their origin has baffled scientists for generations, including Charles Darwin.

Flowering plants originated more than 140 million years ago, after which they rapidly overtook other vascular plants, including their closest living relatives, the gymnosperms (non-flowering plants that have naked seeds, such as cycads, conifers, and ginkgo).

Darwin was mystified by the seemingly sudden appearance of such diversity in the fossil record. In an 1879 letter to Joseph Dalton Hooker, his close confidant and Director of RBG Kew, he wrote, “The rapid development as far as we can judge of all the higher plants within recent geological times is an abominable mystery.”

Utilizing 200 fossils, the authors scaled their tree of life to time, revealing how flowering plants evolved across geological time. They found that early flowering plants did indeed explode in diversity, giving rise to more than 80% of the major lineages that exist today shortly after their origin.

However, this trend then declined to a steadier rate for the next 100 million years until another surge in diversification about 40 million years ago, coinciding with a global decline in temperatures. These new insights would have fascinated Darwin and will surely help today’s scientists grappling with the challenges of understanding how and why species diversify.

The oldest plant sequenced for the study was a dried herbarium specimen of Arenaria globiflora collected in 1829 by Nathaniel Wallich. Credit: RBG Kew

A truly global collaboration

Assembling a tree of life this extensive would have been impossible without Kew’s scientists collaborating with many partners across the globe. In total, 279 authors were involved in the research, representing many different nationalities from 138 organizations in 27 countries. They include the Genomics for Australian Plants (GAP) consortium who were early adopters of the team’s techniques and who worked in close collaboration with Kew to maximize the number of Australian plant species in the tree.

International collaborators also shared their unique botanical expertise, as well as many precious plant samples from around the world that could not be obtained without their help. The comprehensive nature of the tree is in no small part a result of this wonderful partnership.

Dr. Mabel Lum, Program Manager at Bioplatforms Australia and from the GAP consortium, says, “We are proud to be a major partner and collaborator in RBG Kew’s effort to build global research infrastructure to advance our understanding of flowering plant tree of life. This fruitful collaboration underscored our commitment to fostering innovation and collaboration in scientific research, providing a springboard for future discoveries that will help shape our understanding of the natural world for generations to come.”

Alstonia spectabilis is a species of medicinal importance to the indigenous Tetun people and has been sequenced for the very first time. Credit: RBG Kew

Putting the plant tree of life to good use

The flowering plant tree of life has enormous potential in biodiversity research. This is because, just as one can predict the properties of an element based on its position in the periodic table, the location of a species in the tree of life allows us to predict its properties. The new data will thus be invaluable for enhancing many areas of science and beyond.

To enable this, the tree and all of the data that underpin it have been made openly and freely accessible to both the public and scientific community, including through the Kew Tree of Life Explorer. The study’s authors believe such open access is key to democratizing access to scientific data across the globe.

Open access will also help scientists to make the best use of the data, such as combining it with artificial intelligence to predict which plant species may include molecules with medicinal potential. Similarly, the tree of life can be used to better understand and predict how pests and diseases are going to affect the plants of the U.K. in the future. Ultimately, the authors note, the applications of this data will be driven by the ingenuity of the scientists accessing it.

Dr. Melanie-Jayne Howes, Senior Research Leader at RBG Kew who was not an author on the study but will make use of the data in her research, says, “Plant chemicals have inspired many pharmaceutical drugs, but still have great untapped potential to aid future drug discovery. The challenge is knowing which to investigate scientifically in the search for new medicines out of the ca. 330,000 flowering plant species.

“At Kew we are applying AI to predict which plant species contain chemicals with pharmaceutical potential for malaria. The availability of this vast new dataset offers exciting opportunities to enhance these predictions and hence accelerate drug discovery from plants for malaria and other diseases too.”

The new tree of life has reclassified the family and genus of Medusanthera laxiflora, a small tropical tree with bizarre fruit. Credit: Danilo Tandang

Remarkable species in the flowering plant tree of life

  • Extinct due to feral goats: Hesperelaea palmeri, also known as Guadalupe Island olive (olivo de la Isla de Guadalupe). Sequenced from an herbarium specimen at Kew collected on Guadalupe Island, off Baja California, Mexico in 1875 by medical doctor Edward Palmer. A tree belonging to the olive family (Oleaceae), it is now extinct because of overgrazing by non-native goats.
  • Oldest specimen sequenced: Arenaria globiflora, also known as Nepalese sandwort. Sequenced from an herbarium specimen at Kew collected in 1829 by Nathaniel Wallich. This remarkable specimen comes from a Himalayan mountain plant that grows at over 3,600m.
  • Parasitic plant family mystery solved: Pilostyles aethiopica, member of the stemsucker family (Apodanthaceae). Sequenced from plant tissue collected in Zimbabwe in 2012 by Kew’s Sidonie Bellot. This weird parasite lives inside the branches of other plants and is only visible when it erupts into flower. Previously thought to be closely related to pumpkins and begonias (Cucurbitales), it was found by the study to sit in the group Malpighiales.
  • Bizarre tropical tree reclassified: Medusanthera laxiflora, member of the buff-beech family (Stemonuraceae). Sequenced from an herbarium specimen at Kew collected in Indonesian New Guinea in 1993. This small tropical tree with bizarre pin fruits was previously classified alongside the holly family. The new tree of life has reclassified its genus and family into a whole new order.
  • Bamboo from Hooker’s 1850s Himalayan expedition: Cephalostachyum capitatum, member of the grass family (Poaceae). Sequenced from an herbarium specimen collected in India in 1850 by Joseph Hooker, RBG Kew’s second director, and his friend Thomas Thomson.
  • Medicinal plant sequenced for the very first time: Alstonia spectabilis, known as Kroti metan by the Tetun people. Sequenced from an herbarium specimen at Kew collected in Papua New Guinea in 1954. This massive, 20m tall tree is found in the rainforests of SE Asia and Australia. Despite being used by the Tetun people of West Timor to treat malaria, and being a valuable source of timber, it had never had its DNA sequenced until this study.