When the first coronavirus cases in Chicago appeared in January, they bore the same genetic signatures as a germ that emerged in China weeks before.
But as Egon Ozer, an infectious-disease specialist at the Northwestern University Feinberg School of Medicine, examined the genetic structure of virus samples from local patients, he noticed something different.
A change in the virus was appearing again and again. This mutation, associated with outbreaks in Europe and New York, eventually took over the city. By May, it was found in 95% of all the genomes Ozer sequenced.
At a glance, the mutation seemed trivial. About 1,300 amino acids serve as building blocks for a protein on the surface of the virus. In the mutant virus, the genetic instructions for just one of those amino acids – number 614 – switched from a “D” (shorthand for aspartic acid) to a “G” (short for glycine).
But the location was significant, because the switch occurred in the part of the genome that codes for the all-important “spike protein” – the protruding structure that gives the coronavirus its crownlike profile and allows it to enter human cells the way a burglar picks a lock.
And its ubiquity is undeniable. Of the approximately 50,000 genomes of the new virus that researchers worldwide have uploaded to a shared database, about 70% carry the mutation, officially designated D614G but known more familiarly to scientists as “G.”
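For readers who want to see how the label works, the naming convention can be sketched in a few lines of Python: the first letter is the original amino acid, the number is its position in the protein, and the last letter is the replacement. This is only an illustration; the function names, the two-entry name table and the idea of representing a protein as a string of one-letter codes are assumptions made for this example, not tools used by the researchers quoted here.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str     # one-letter code of the amino acid being replaced
    position: int     # 1-based position within the protein
    replacement: str  # one-letter code of the new amino acid


# Only the two amino acids named in the article; a full table has 20 entries.
AMINO_ACID_NAMES = {"D": "aspartic acid", "G": "glycine"}


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'D614G' into its three parts (hypothetical helper)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a single-substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(protein: str, sub: Substitution) -> str:
    """Return a copy of the protein string with the one amino acid swapped."""
    index = sub.position - 1  # labels count from 1, Python strings from 0
    if index < 0 or index >= len(protein):
        raise ValueError(f"position {sub.position} is outside the protein")
    if protein[index] != sub.original:
        raise ValueError(
            f"expected {sub.original} at position {sub.position}, "
            f"found {protein[index]}"
        )
    return protein[:index] + sub.replacement + protein[index + 1:]


if __name__ == "__main__":
    sub = parse_substitution("D614G")
    print(
        f"position {sub.position}: "
        f"{AMINO_ACID_NAMES[sub.original]} -> {AMINO_ACID_NAMES[sub.replacement]}"
    )
```

The sketch changes exactly one character and leaves the length of the string untouched, which mirrors the point above: a single building block out of roughly 1,300 is swapped.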
“G” hasn’t just dominated the outbreak in Chicago – it has taken over the world. Now scientists are racing to figure out what it means.
At least four laboratory experiments suggest that the mutation makes the virus more infectious, although none of that work has been peer-reviewed. Another unpublished study led by scientists at Los Alamos National Laboratory asserts that patients with the G variant actually have more virus in their bodies, making them more likely to spread it to others.
The mutation doesn’t appear to make people sicker, but a growing number of scientists worry that it has made the virus more contagious.
“The epidemiological study and our data together really explain why the [G variant’s] spread in Europe and the U.S. was really fast,” said Hyeryun Choe, a virologist at Scripps Research and a lead author of an unpublished study on the G variant’s enhanced infectiousness in laboratory cell cultures. “This is not just accidental.”
But there may be other explanations for the G variant’s dominance: biases in where genetic data are being collected, or quirks of timing that gave the mutated virus an early foothold in susceptible populations.
“The bottom line is, we haven’t seen anything definitive yet,” said Jeremy Luban, a virologist at the University of Massachusetts Medical School.
The scramble to unravel this mutation mystery embodies the challenges of science during the coronavirus pandemic. With millions of people infected and thousands dying every day around the world, researchers must strike a high-stakes balance between getting information out quickly and making sure that it’s right.
SARS-CoV-2, the novel coronavirus that causes the disease covid-19, can be thought of as an extremely destructive burglar. Unable to live or reproduce on its own, it breaks into human cells and co-opts their biological machinery to make thousands of copies of itself. That leaves a trail of damaged tissue and triggers an immune system response that for some people can be disastrous.
This replication process is messy. Even though it has a “proofreading” mechanism for copying its genome, the coronavirus frequently makes mistakes, or mutations. The vast majority of mutations have no effect on the behavior of the virus.
But since the virus’s genome was first sequenced in January, scientists have been on the lookout for changes that are meaningful. And few genetic mutations could be more significant than ones that affect the spike protein – the virus’s most powerful tool.
This protein attaches to a receptor on respiratory cells called ACE2, which opens the cell and lets the virus slip inside. The more effective the spike protein, the more easily the virus can break into the bodies of its hosts. Even when the original variant of the virus emerged in Wuhan, China, it was obvious that the spike protein on SARS-CoV-2 was already quite effective.
But it could have been even better, said Choe, who has studied spike proteins and the way they bind to the ACE2 receptor since the severe acute respiratory syndrome outbreak in 2003.
The spike protein for SARS-CoV-2 has two parts that don’t always hold together well. In the version of the virus that arose in China, Choe said, the outer part – which the virus needs to attach to a human receptor – frequently broke off. Equipped with this faulty lock pick, the virus had a harder time invading host cells.
“I think this mutation happened to compensate,” Choe said.
Studying both versions of the gene using a proxy virus in a petri dish of human cells, Choe and her colleagues found that viruses with the G variant had more spike proteins, and the outer parts of those proteins were less likely to break off. This made the virus approximately 10 times more infectious in the lab experiment.
The mutation does not seem to lead to worse outcomes in patients. Nor did it alter the virus’s response to antibodies from patients who had the D variant, Choe said, suggesting that vaccines being developed based on the original version of the virus will be effective against the new strain.
Choe has uploaded a manuscript describing this study to the website BioRxiv, where scientists can post “preprint” research that has not yet been peer reviewed. She has also submitted the paper to an academic journal, which has not yet published it.
The distinctive infectiousness of the G strain is so strong that scientists have been drawn to the mutation even when they weren’t looking for it.
Neville Sanjana, a geneticist at the New York Genome Center and New York University, was trying to figure out which genes enable SARS-CoV-2 to infiltrate human cells. But in experiments based on a gene sequence taken from an early case of the virus in Wuhan, he struggled to get that form of the virus to infect cells. Then the team switched to a model virus based on the G variant.
“We were shocked,” Sanjana said. “Voilà! It was just this huge increase in viral transduction.” They repeated the experiment in many types of cells, and every time the variant was many times more infectious.
Their findings, published as a preprint on BioRxiv, generally matched what Choe and other laboratory scientists were seeing.
But the New York team offers a different explanation as to why the variant is so infectious. Whereas Choe’s study proposes that the mutation made the spike protein more stable, Sanjana said experiments in the past two weeks, not yet made public, suggest that the improvement is actually in the infection process. He hypothesized that the G variant is more efficient at beginning the process of invading the human cell and taking over its reproductive machinery.
Luban, who has also been experimenting with the D614G mutation, has been drawn to a third possibility: His experiments suggest that the mutation allows the spike protein to change shape as it attaches to the ACE2 receptor, improving its ability to fuse to the host cell.
Different approaches to making their model virus might explain these discrepancies, Luban said. “But it’s quite clear that something is going on.”
Although these experiments are compelling, they’re not conclusive, said Kristian Andersen, a Scripps virologist not involved in any of the studies. The scientists need to figure out why they’ve identified different mechanisms for the same effect. All the studies still have to pass peer review, and they have to be reproduced using the real version of the virus.
Even then, Andersen said, it will be too soon to say that the G variant transmits faster among people.
Cell culture experiments have been wrong before, noted Anderson Brito, a computational biologist at Yale University. Early experiments with hydroxychloroquine, a malaria drug, hinted that it was effective at fighting the coronavirus in a petri dish. The drug was touted by President Trump, and the Food and Drug Administration authorized it for emergency use in hospitalized covid-19 patients. But that authorization was withdrawn this month after evidence showed that the drug was “unlikely to be effective” against the virus and posed potential safety risks.
So far, the biggest study of transmission has come from Bette Korber, a computational biologist at Los Alamos National Laboratory who helped build one of the world’s biggest viral genome databases for tracking HIV. In late April, she and colleagues at Duke University and the University of Sheffield in Britain released a draft of their work arguing that the mutation boosts transmission of the virus.
Analyzing sequences from more than two dozen regions across the world, they found that most places where the original virus was dominant before March were eventually taken over by the mutated version. This switch was especially apparent in the United States: Ninety-six percent of early sequences here belonged to the D variant, but by the end of March, almost 70% of sequences carried the G amino acid instead.
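The kind of tally behind such percentages can be sketched as follows: group the sequenced genomes by time period and compute what fraction carry “G” at position 614. This is a minimal illustration, not the Los Alamos team’s method; the function name and the input format (pairs of a period label and the amino acid found at position 614) are assumptions made for this example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def variant_share_by_period(
    records: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Fraction of sequences per period with 'G' at position 614.

    Each record is a hypothetical (period, residue) pair, for example
    ("2020-03", "G"). Periods are returned in sorted order.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for period, residue in records:
        counts[period][residue] += 1
    return {
        period: tally["G"] / sum(tally.values())
        for period, tally in sorted(counts.items())
    }
```

A share computed this way describes only the genomes that happened to be sequenced and uploaded. It says nothing by itself about why one variant became more common, which is why the raw percentages leave room for competing explanations.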
The British researchers also found evidence that people with the G variant had more viral particles in their bodies. Although this higher viral load didn’t seem to make people sicker, it might explain the G variant’s rapid spread, the scientists wrote. People with more virus to shed are more likely to infect others.
The Los Alamos draft drew intense scrutiny when it was released in the spring, and many researchers remain skeptical of its conclusions.
“There are so many biases in the data set here that you can’t control for and you might not know exist,” Andersen said. At a time when as many as 90% of U.S. infections are still undetected and countries with limited public health infrastructure are struggling to keep up with surging cases, a shortage of data means “we can’t answer all the questions we want to answer.”
Pardis Sabeti, a computational biologist at Harvard University and the Broad Institute, noted that the vast majority of sequenced genomes come from Europe, where the G variant first emerged, and the United States, where infections thought to have been introduced by travelers from Europe spread undetected for weeks before the country shut down. This could at least partly explain why the G variant appears so dominant.
The mutation’s success might also reflect a “founder effect,” she said. Arriving in a place like Northern Italy – where the vast majority of sequenced infections are caused by the G variant – the mutated virus found easy purchase in an older and largely unprepared population, which then unwittingly spread it far and wide.
Scientists may be able to rule out these alternative explanations with more rigorous statistical analyses or a controlled experiment in an animal population. And as studies on the D614G mutation accumulate, researchers are starting to be convinced of its significance.
“I think that slowly we’re beginning to come to a consensus,” said Judd Hultquist, a virologist at Northwestern University.
Solving the mystery of the D614G mutation won’t make much of a difference in the short term, Andersen said. “We were unable to deal with D,” he said. “If G transmits even better, we’re going to be unable to deal with that one.”
But it’s still essential to understand how the genome influences the behavior of the virus, scientists say. Identifying emerging mutations allows researchers to track their spread. Knowing what genes affect how the virus transmits enables public health officials to tailor their efforts to contain it. Once therapeutics and vaccines are distributed on a large scale, having a baseline understanding of the genome will help pinpoint when drug resistance starts to evolve.
“Understanding how transmissions are happening won’t be a magic bullet, but it will help us respond better,” Sabeti said. “This is a race against time.”