How long have coronaviruses been around? Hundreds of years? Thousands? Millions? How far back in time do we have to send Bruce Willis to stop all possible coronavirus pandemics? Even though viruses don’t leave fossils, evolutionary biologists have developed some sophisticated and creative strategies to answer these questions. Does it matter? Does the answer tell us anything about the pandemic we are in?
Before SARS, MERS and COVID, we knew that there are 4 common coronaviruses that circulate among people. Each typically causes little more than cold-like symptoms. A common belief is that this kind of low pathogenicity comes from a long evolutionary association. Conversely, deadly infections are perceived to be the hallmark of newly acquired afflictions. Like a lot of things that make intuitive sense, neither is necessarily true.
Confirmation bias could easily have us believing that all new viral encounters are going to result in outbreaks like HIV, SARS, MERS, COVID-19 and Ebola. More likely we are regularly being assaulted with a wide range of new but entirely non-virulent viruses that just do not get noticed. These rarely get counted. If we only notice the killers, we’d be forgiven for thinking every new virus is a killer, and that every killer is new.
Would anyone have noticed if, instead of Spanish Flu, there was a COVID-19 pandemic in 1918? The average life expectancy was only 50 years back then. COVID-19 may not have registered so much as a blip against the backdrop of aggregate deaths from tuberculosis, diphtheria, typhoid, and seasonal flu.
Nor is it true that a pathogen will always evolve to be less deadly, just to avoid burning down its own house. If transmission is favored by making people bleed out of their eye sockets and fingernails (e.g., Ebola, Marburg, Lassa) we should not expect to see strains that don’t do that. Similarly, if a virus mostly kills people already past the age of reproduction (over 50) and is otherwise well-tolerated, there is not much of a selective pressure to favor less pathogenic strains of SARS CoV 2.
Though often conflated, resistance and immunity are not the same thing. While a recovered person might have immunity against being reinfected by a virus, that immunity does not get passed on. The Conquistadors were able to sit back and watch Aztecs get wiped out by smallpox not because their ancestors passed on an acquired immunity. Rather, somewhere back in time enough Europeans who were very susceptible to death from smallpox already died of it. They did not survive to reproduce. That left a population that was not so much immune from infection as just less likely to die when infected.
Natural selection is a weeding process. Selective pressures of the natural world do not magically generate mutations to meet those challenges. It is not so much survival of the fittest as it is death of the inadequate. But those deaths make way for expansion of the descendants of the individuals who don’t die before reproducing.
“By the toll of a billion deaths man has bought his birthright of the earth, and it is his against all comers; it would still be his were the Martians ten times as mighty as they are. For neither do men live nor die in vain.”
H. G. Wells – The War of the Worlds
Mutations themselves occur randomly over time through DNA copying errors. Anything that occurs randomly over time can be used as a measure of time. If someone receives an average of 4 snail-mail letters a day, the amount of mail piling up on their doorstep can be a reliable indicator of how long they’ve been on vacation. Emile Zuckerkandl and Linus Pauling realized that the same principle that allows carbon-14 dating (from random radioactive decay) might provide us with a “molecular clock” (from random mutations).
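The mailbox analogy can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_elapsed`, the pile of 40 letters, and the assumption that letters arrive as a Poisson process (which is what makes the uncertainty the square root of the count) are all mine, not part of the original argument.

```python
import math

def estimate_elapsed(count, rate_per_day):
    """Estimate elapsed days from a pile of randomly arriving items.

    Assumes arrivals follow a Poisson process with a known average rate,
    so the count's standard deviation is sqrt(count).
    """
    estimate = count / rate_per_day
    std_error = math.sqrt(count) / rate_per_day
    return estimate, std_error

# Hypothetical pile: 40 letters on the doorstep at 4 letters a day.
days, err = estimate_elapsed(40, 4.0)
print(f"About {days:.1f} days away, give or take {err:.1f}")
```

The bigger the pile, the smaller the uncertainty relative to the estimate, which is one reason a clock of this kind is more trustworthy over long stretches than short ones.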
If you can measure the number of differences in a strand of DNA across primates, you might be able to make a reasonable estimate of how long ago their common ancestor was alive. All that’s missing is a way to calibrate the clock. That’s where fossils come in handy. Even without fossils, we can correlate against continental drift and tectonic events.
Fossil evidence suggests that humans and chimps went their separate ways about 6 million years ago. If you compare cytochrome b genes (a.k.a. cytB) of humans and chimps, you will see that they have “drifted apart” by about 120 differences (“substitutions”) out of 1140 letters (“sites”) in the DNA code for that gene. Put half of that on the branch going to humans and the other half on the branch going to chimps, and you arrive at 60 mutations in the 6 million years since the common ancestor, or 10 per million years on each branch. Using that rate of change, the 168 differences between humans and orangutans (84 per branch) could imply that their most recent common ancestor lived about 8.4 million years ago.
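The back-of-the-envelope arithmetic above can be written out as a short Python sketch. The function names are made up for illustration, and the model is the same deliberately simplistic one: a single calibration point, a constant rate, and differences split evenly between the two branches.

```python
def rate_per_branch(differences, split_mya):
    """Substitutions per million years on one lineage.

    Half of the observed differences are assigned to each branch.
    """
    return (differences / 2) / split_mya

def estimate_split_mya(differences, rate):
    """Millions of years since the common ancestor, given a per-branch rate."""
    return (differences / 2) / rate

# Calibrate on the human-chimp split: 120 differences, about 6 million years.
rate = rate_per_branch(120, 6.0)

# Apply that rate to the 168 human-orangutan differences.
orangutan_split = estimate_split_mya(168, rate)

print(f"Rate: {rate:.0f} substitutions per million years per branch")
print(f"Estimated human-orangutan split: {orangutan_split:.1f} million years ago")
```

Run as written, this lands at a little over 8 million years, which is the naive estimate the text goes on to pick apart.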
However, the most recent common ancestor of humans and orangutans was actually closer to 18 million, not 8 million years ago. Our estimate was wrong for a lot of reasons (not the least of which is the simplistic model I used). Molecular clocks need more than a single fossil calibration point and they work better the more fossils we have. These studies also assume that the differences we see are just random, like two life jackets in a calm ocean randomly drifting away from each other. We could also be wrong because the changes that survive in DNA for us to see today are not a random sample of the changes that occurred in the past.
Certain mutations in the gene (cytB) cause a muscular disease that makes it hard for humans to run away from lions. We should not expect to see those kinds of changes in living humans as often as we see other kinds of changes that have no effect. Changes that tend to be “purified” out by things like lion attacks are under what we call purifying selection. For this sort of work, evolutionary biologists try to avoid genes that are under any kind of selection (disfavored/negative/purifying or favored/positive/Darwinian) and stick to those that just drift.
We can use this approach to figure out the origin of things like viruses too. However, while the human genome has about 30,000 genes, a lot of which are drifting under neutral selection, choices are limited in coronaviruses. Viruses have very compact genomes. In SARS CoV 2 there are fewer than 20 genes (“open reading frames”) to choose from.
The first estimates for the origin of coronaviruses concluded that they originated about 10,000 years ago. That’s very recent. It also implies a wildly high speciation rate and an amazing ability to spread around the planet. It’s also probably very wrong because the calibrations were very limited (there are no fossils) and because of the problem of purifying selection.
Not only are there strong selective pressures to keep a viral genome compact (for rapid replication and encapsulation), there is intense selective pressure on viruses from the immune systems of their hosts. It is doubtful that any part of the viral genome escapes some sort of evolutionary selection. After correcting for purifying selection, more recent work suggested that coronaviruses have been around for at least 55 million years, and probably much longer.
Obviously then, their mutation rate is much lower than the first estimates assumed when they concluded that the entire family of viruses had such a recent origin. This is so because coronaviruses proofread when copying their own genetic code. They check for errors and fix them on the fly. That strongly limits the mutation rate. In fact, among RNA viruses, coronaviruses have some of the lowest rates of mutation.
This is good news!
With a low mutation rate and a proofreading virus, a vaccine developed against SARS CoV 2 likely will work on a global scale. It should also work for a long enough period to beat back the pandemic without re-engineering it every year (unlike flu vaccines). This low mutation rate also suggests that we don’t have to worry too much about (evolutionarily) new coronaviruses popping up all of the time. We can manage our risk against the ones that already exist in the wild, simply by better managing and minimizing our interaction with wildlife.