Study sheds light on the origin of the genetic code

Despite awe-inspiring diversity, nearly every lifeform — from bacteria to blue whales — shares the same genetic code. How and when this code came about has been the subject of much scientific controversy.

Taking a fresh approach to an old problem, Sawsan Wehbi, a doctoral student in the Genetics Graduate Interdisciplinary Program at the University of Arizona, discovered strong evidence that the textbook version of how the universal genetic code evolved needs revision. Wehbi is the first author of a study published in the journal PNAS suggesting that the order in which amino acids, the code’s building blocks, were recruited is at odds with what is widely considered the “consensus” of genetic code evolution.

“The genetic code is this amazing thing in which a string of DNA or RNA containing sequences of four nucleotides is translated into protein sequences using 20 different amino acids,” said Joanna Masel, the paper’s senior author and a professor of ecology and evolutionary biology at the U of A. “It’s a mind-bogglingly complicated process, and our code is surprisingly good. It’s nearly optimal for a whole bunch of things, and it must have evolved in stages.”

The study revealed that early life preferred smaller amino acid molecules over larger and more complex ones, which were added later, while amino acids that bind to metals joined in much earlier than previously thought. Finally, the team discovered that today’s genetic code likely came after other codes that have since gone extinct.

The authors argue that the current understanding of how the code evolved is flawed because it relies on misleading laboratory experiments rather than evolutionary evidence. For example, one of the cornerstones of conventional views of genetic code evolution rests on the famous Urey-Miller experiment of 1952, which attempted to simulate the conditions on early Earth that likely witnessed the origin of life.

While valuable in demonstrating that nonliving matter could give rise to life’s building blocks, including amino acids, through simple chemical reactions, the experiment’s implications have been called into question. For example, it did not yield any amino acids containing sulfur, despite the element being abundant on early Earth. As a result, sulfur-containing amino acids are believed to have joined the code much later. However, the result is hardly surprising, considering that sulfur was omitted from the experiment’s ingredients.

According to co-author Dante Lauretta, Regents Professor of Planetary Science and Cosmochemistry at the U of A Lunar and Planetary Laboratory, early life’s sulfur-rich nature offers insights for astrobiology, particularly in understanding the potential habitability and biosignatures of extraterrestrial environments.

“On worlds like Mars, Enceladus and Europa, where sulfur compounds are prevalent, this could inform our search for life by highlighting analogous biogeochemical cycles or microbial metabolisms,” he said. “Such insights might refine what we look for in biosignatures, aiding the detection of lifeforms that thrive in sulfur-rich or analogous chemistries beyond Earth.”

The team used a new method to analyze sequences of amino acids across the tree of life, all the way back to the last universal common ancestor, or LUCA, a hypothesized population of organisms that lived around 4 billion years ago and represents the shared ancestor of all life on Earth today. Unlike previous studies, which used full-length protein sequences, Wehbi and her group focused on protein domains, shorter stretches of amino acids.

“If you think about the protein being a car, a domain is like a wheel,” Wehbi said. “It’s a part that can be used in many different cars, and wheels have been around much longer than cars.”

To get a handle on when a specific amino acid likely was recruited into the genetic code, the researchers used statistical data analysis tools to compare the enrichment of each individual amino acid in protein sequences dating back to LUCA, and even farther back in time. An amino acid that shows up preferentially in ancient sequences was likely incorporated early on. Conversely, LUCA’s sequences are depleted for amino acids that were recruited later but became available by the time less ancient protein sequences emerged.
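The enrichment comparison described above can be illustrated with a small sketch. This is not the study's actual statistical method, which the article does not detail. It is a simplified log-ratio of amino acid frequencies between two sets of sequences, and the function names, the pseudocount and the toy sequences are all hypothetical.

```python
from collections import Counter
from math import log2

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(sequences, pseudocount=1.0):
    """Relative frequency of each amino acid, pooled over all sequences.

    The pseudocount (an assumption of this sketch) keeps amino acids
    that never appear from producing a zero frequency.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts[aa] for aa in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def log2_enrichment(ancient, younger, pseudocount=1.0):
    """log2(frequency in ancient set / frequency in younger set).

    Positive values mean the amino acid is enriched in the ancient set,
    negative values mean it is depleted there.
    """
    anc = amino_acid_frequencies(ancient, pseudocount)
    yng = amino_acid_frequencies(younger, pseudocount)
    return {aa: log2(anc[aa] / yng[aa]) for aa in AMINO_ACIDS}


# Made-up toy sequences, for illustration only (not real protein domains).
ancient_toy = ["GAGAVG", "AGVGDA"]
younger_toy = ["GAWYFW", "WYKRFA"]

scores = log2_enrichment(ancient_toy, younger_toy)
for aa in sorted(scores, key=scores.get, reverse=True):
    print(aa, round(scores[aa], 2))
```

In this toy input, glycine (G) scores above zero and tryptophan (W) below zero, mirroring the reasoning that an amino acid over-represented in older sequences was probably recruited earlier. A real analysis would need properly dated sequence families and a statistical model rather than a raw ratio.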

The team identified more than 400 families of sequences dating back to LUCA. More than 100 of them originated even earlier and had already diversified prior to LUCA. These turned out to contain more amino acids with aromatic ring structures, like tryptophan and tyrosine, despite these amino acids being late additions to our code.

“This gives hints about other genetic codes that came before ours, and which have since disappeared in the abyss of geologic time,” Masel said. “Early life seems to have liked rings.”
