‘Tree of Life’ reconstructs evolutionary history

Jeanne Chapin

A ‘Tree of Life,’ filled with all known species on Earth, is beginning to grow at Iowa State.

Two ISU computer science professors are developing software and mathematical methods to categorize living species on Earth. The species will be organized in a database that shows how different organisms are related, a process that could lead to a greater understanding of evolution.

“[The project] is about reconstructing the evolutionary history of species that are currently alive — the species that we have evidence for,” said Oliver Eulenstein, assistant professor of computer science.

The project of classifying species has roots in Charles Darwin’s Tree of Life concept, a genealogical timeline of all existing species, he said.

This project will be far larger than regular species classification projects, said David Fernandez-Baca, professor of computer science. While typical classification trees categorize 60 or 70 species, the Tree of Life will categorize roughly 2 million.

However, this is still only a fraction of all the species on Earth. Scientists have estimated the total number of existing species to be around 100 million. If the actual number is anywhere near the estimate, more than 90 percent of Earth’s species are unknown, meaning a complete tree may never be built, Eulenstein said.

“What we’re trying to do is make it feasible to build really large trees based on the efforts of small groups. People have already developed small trees, but building bigger trees is more of a challenge,” Fernandez-Baca said.

The level of interest in the Tree of Life and the amount of available resources will ultimately have a large impact on how much of the project can be completed, said Raul Piaggio, graduate student in computer science. The large number of species that have yet to be discovered is also an obstacle to completing the Tree, he said.

The research will lead to a better understanding of evolution, Fernandez-Baca said.

“A lot of research relies on knowledge about evolution; for instance, finding out what makes certain species develop the way they do,” he said. “That’s why evolutionary trees come in handy, because they tell you what is closely related and what is not.”

The Tree of Life could help trace evolutionary processes across species through comparison and help predict how an organism functions based on its similarities to other organisms, Fernandez-Baca said.

“Determining function is just important, period — how your metabolism works, how diseases are transmitted, how viruses evolve; knowing their relationship can actually help you develop treatments for certain diseases,” he said.

The Tree of Life is built on computers by comparing the amino acid sequences of globins, proteins that carry oxygen in the blood. Globin is used to classify organisms because its amino acid sequence differs from species to species.
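
To give a rough sense of what comparing sequences can look like, here is a minimal Python sketch. It is not the ISU team's software: the species names and sequence fragments are made up for illustration, the sequences are assumed to be already aligned to the same length, and the measure used (the fraction of positions that differ) is only one simple way to score how far apart two sequences are.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Return the fraction of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def pairwise_distances(sequences: dict) -> dict:
    """Compute the distance for every pair of named sequences."""
    return {
        (x, y): p_distance(sequences[x], sequences[y])
        for x, y in combinations(sorted(sequences), 2)
    }


def closest_pair(sequences: dict) -> tuple:
    """Return the pair of names whose sequences differ the least."""
    distances = pairwise_distances(sequences)
    return min(distances, key=distances.get)


# Hypothetical, made-up fragments (NOT real globin sequences).
toy_sequences = {
    "species_A": "MVLSPADKTN",
    "species_B": "MVLSAADKTN",
    "species_C": "MVHSGEDKSN",
}

if __name__ == "__main__":
    for pair, distance in pairwise_distances(toy_sequences).items():
        print(pair, round(distance, 2))
    print("closest:", closest_pair(toy_sequences))
```

In this toy data, species_A and species_B differ at one position out of 10, while each differs from species_C at four, so the first two would be treated as the closest relatives.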

“Efforts to build evolutionary, or phylogenetic, trees have existed for a long time, using techniques based on the observation of morphological traits in organisms,” Piaggio said. “But with recent advances in molecular biology, DNA sequences are now used instead.”
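
One simple way to turn sequence differences into a tree is to repeatedly join the two closest groups. The Python sketch below does this with average-linkage clustering. It is an illustration only, not the method the researchers describe: the species names and distance values are invented, and real phylogenetic software uses far more sophisticated models.

```python
from itertools import combinations


def average_linkage_tree(names, dist):
    """Build a nested-tuple tree by repeatedly merging the two closest clusters.

    names: list of leaf names.
    dist:  dict mapping frozenset({a, b}) to the distance between leaves a and b.
    """
    # Each cluster is stored as (subtree, tuple_of_leaf_names).
    clusters = [(name, (name,)) for name in names]

    def cluster_distance(c1, c2):
        # Average of all leaf-to-leaf distances between the two clusters.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the indices of the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return clusters[0][0]


# Hypothetical distances between four made-up species.
toy_distances = {
    frozenset({"A", "B"}): 0.1,
    frozenset({"A", "C"}): 0.4,
    frozenset({"B", "C"}): 0.4,
    frozenset({"A", "D"}): 0.6,
    frozenset({"B", "D"}): 0.6,
    frozenset({"C", "D"}): 0.5,
}

if __name__ == "__main__":
    print(average_linkage_tree(["A", "B", "C", "D"], toy_distances))
```

With these invented numbers, A and B are joined first, C joins that pair next, and D joins last, which mirrors the idea that a tree shows what is closely related and what is not.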

The National Science Foundation funded the project with a five-year, $975,000 grant. The money will support graduate students involved in the project, attract outside researchers and allow researchers to attend meetings and hold workshops, Fernandez-Baca said.

“I’m not sure we will get to [finish the Tree] by the end of the five years,” Fernandez-Baca said. “We’re hoping to do a large subset of the plants by the end.”

Fernandez-Baca and Eulenstein are collaborating on the project with two biologists from the University of California-Davis and the University of Pennsylvania, as well as with three ISU graduate students.

“We have a very strong team,” Fernandez-Baca said. “We do what we’re good at, which is computer science, and they do what they’re good at, which is biology.”