Proteins have been quietly taking control of our lives since the start of the COVID-19 pandemic. We live at the mercy of the virus’s so-called “spike” protein, which has mutated dozens of times to create increasingly deadly variants. But the truth is, we have always been ruled by proteins. At the cellular level, they are responsible for almost everything.
Proteins are so fundamental that DNA, the genetic material that makes each of us unique, is essentially one long sequence of protein blueprints. This is true for animals, plants, fungi, bacteria, archaea and even viruses. And just as these groups of organisms evolve and change over time, so do proteins and their components.
A new study by researchers at the University of Illinois, published in Scientific Reports, maps the evolutionary history and interrelationships of protein domains, the subunits of protein molecules, over 3.8 billion years.
“Knowing how and why domains combine in proteins during evolution could help scientists understand and design protein activity for applications in medicine and bioengineering. For example, this information could guide disease management, such as making better vaccines from the spike protein of the COVID-19 virus,” explains Gustavo Caetano-Anollés, professor in the Department of Crop Sciences, affiliate of the Carl R. Woese Institute for Genomic Biology at Illinois, and lead author of the article.
Caetano-Anollés has studied the evolution of COVID mutations since the early stages of the pandemic, but that timeline is a tiny fraction of what he and doctoral student Fayez Aziz took on in their current study.
The researchers compiled the sequences and structures of millions of proteins encoded in hundreds of genomes across all taxonomic groups, including higher organisms and microbes. They focused not on whole proteins, but on structural domains.
“Most proteins are made up of several domains. They are compact structural units, or modules, which house specialized functions,” explains Caetano-Anollés. “More importantly, these are the units of evolution.”
After sorting proteins into domains to build evolutionary trees, the researchers set out to build a network showing how domains developed and were shared among proteins over billions of years of evolution.
“We have built a time series of networks that describe how domains accumulated and how proteins rearranged their domains during evolution. This is the first time that such a network of ‘domain organization’ has been studied as an evolutionary timeline,” says Fayez Aziz. “Our investigation revealed that there is a large evolving network describing how domains combine with each other in proteins.”
Each link in the network represents a time when a particular domain was recruited into a protein, usually to perform a new function.
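To make the idea of a “time series of networks” concrete, here is a minimal Python sketch. It is not the authors’ method or data: the protein list, the domain names and the relative “age” values are invented for illustration, and a link is simply drawn between any two domains that appear together in a protein. The sketch builds one cumulative network per time point and counts how many links appear for the first time at each step.

```python
from itertools import combinations

# Hypothetical toy data (not from the study): each protein lists its domains
# and an assumed relative age at which that domain arrangement first appeared
# (0.0 = oldest, 1.0 = present day).
PROTEINS = [
    {"domains": ["A", "B"], "age": 0.1},
    {"domains": ["B", "C"], "age": 0.4},
    {"domains": ["A", "B", "D"], "age": 0.6},
    {"domains": ["C", "D"], "age": 0.9},
]


def network_at(proteins, cutoff):
    """Return the set of domain-domain links that exist at or before `cutoff`."""
    edges = set()
    for protein in proteins:
        if protein["age"] <= cutoff:
            # Link every pair of distinct domains found in the same protein.
            for pair in combinations(sorted(set(protein["domains"])), 2):
                edges.add(pair)
    return edges


def network_series(proteins, cutoffs):
    """Build one cumulative network per time point."""
    return {cutoff: network_at(proteins, cutoff) for cutoff in cutoffs}


def new_links(series):
    """Count the links that appear for the first time at each time point."""
    counts = {}
    seen = set()
    for cutoff in sorted(series):
        counts[cutoff] = len(series[cutoff] - seen)
        seen |= series[cutoff]
    return counts


series = network_series(PROTEINS, [0.25, 0.5, 0.75, 1.0])
growth = new_links(series)
```

In this toy setting, a time point with an unusually high count in `growth` would be the analogue of a burst of new domain combinations.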
“This fact alone strongly suggests that domain recruitment is a powerful force in nature,” says Fayez Aziz. The timeline also revealed which domains contributed important protein functions. For example, the researchers were able to trace the origin of the domains responsible for environmental sensing as well as for secondary metabolites, or toxins used in bacterial and plant defenses.
The analysis showed that domains began to combine early in protein evolution, but there were also periods of explosive network growth. For example, the researchers describe a “big bang” of domain combinations 1.5 billion years ago, coinciding with the rise of multicellular organisms and eukaryotes, organisms with membrane-bound nuclei that include humans.
The existence of biological big bangs is not new. The Caetano-Anollés team previously reported the massive and early origin of metabolism, and they recently found it again when tracking the history of metabolic networks.
The historical record of a big bang in the evolutionary patchwork of proteins provides new tools for understanding protein composition.
“This could help identify, for example, why structural variations and genomic recombinations often occur in SARS-CoV-2,” explains Caetano-Anollés.
He adds that this new way of understanding proteins could help prevent pandemics by dissecting the origin of viral diseases. It could also help mitigate disease by improving vaccine design when outbreaks occur.
Reference: “Evolution of networks of protein domain organization” by M. Fayez Aziz and Gustavo Caetano-Anollés, June 8, 2021, Scientific Reports.
DOI: 10.1038/s41598-021-90498-8
The work was supported by the National Science Foundation and the US Department of Agriculture.
The Department of Crop Sciences is part of the College of Agricultural, Consumer, and Environmental Sciences at the University of Illinois.