SARS-CoV-2 coronavirus origins alternative theories – do they hold up against science? Part 1

Controversies around the SARS-CoV-2 virus continue

Earlier this month a deeply controversial and outright shocking report came out purporting to provide smoking-gun evidence that the SARS-CoV-2 coronavirus was laboratory engineered. Apparently Zenodo is gaining recognition as a "source" for those in the "know".

We have previously covered the topic of SARS-CoV-2 origins (we highly recommend reading that post to gain a better grasp of the information presented in this one) and we dismissed the notion of a covert genome manipulation due to a lack of any evidence to support such claims. With a title like "Unusual Features of the SARS-CoV-2 Genome Suggesting Sophisticated Laboratory Modification Rather Than Natural Evolution and Delineation of Its Probable Synthetic Route", this report promised some incriminating statements, so obviously we were going to investigate it and see how the published science stacked up against its claims. This turned out to be such an arduous endeavour that the analysis will span two separate posts.

Furthermore, we will contrast their conspiracy theory of a genetic manipulation origin of SARS-CoV-2 with another obscure but quite fascinating theory of how the virus might have naturally evolved and then escaped to produce the current COVID-19 pandemic. Along the way, we are certain you are going to bump into information you have never heard before, and this will give you new and very surprising insights! Get ready for lots of scientific content woven into this bizarre story.

Let's dive into this latest controversy.

The authors of the report (we will refer to them as such for the remainder of the post) commence their odyssey of accusations by bringing attention to two bat coronaviruses that exhibit high similarity to SARS-CoV-2 but which are rarely discussed in science publications. These are the closely related ZC45 and ZXC21 bat coronaviruses. The authors propose that the extremely high sequence similarity between these two viruses and SARS-CoV-2 indicates that one of these viruses (or a very similar one that we do not know about) is the true ancestor of SARS-CoV-2. Below is the breakdown of how closely related these viruses are when comparing the amino acid sequences of different virus proteins.

Protein            Amino acid identity, ZC45/ZXC21 vs SARS-CoV-2
Overall            89%
Nucleocapsid       94%
Orf8               94.2%
Spike S2 domain    95%
Membrane           98.6%
E                  100%

Such high conservation is extremely unusual, the authors claim, especially for the Orf8 protein, which is otherwise poorly conserved in coronaviruses. All other coronaviruses share no more than 58% identity with the SARS-CoV-2 Orf8 protein. The authors also neatly displayed a figure comparing coronaviruses’ genome sequence identity, except they omitted the most “related” currently known virus to SARS-CoV-2. Keep reading to find out why.

Figure: Sequence identity between SARS-CoV-2 and related coronaviruses. Adapted from Yan L-M et al. 2020. DOI: 10.5281/zenodo.4028830

At the top of the figure is the layout of the SARS-CoV-2 genome showing where the different genes coding for the different viral proteins reside. Underneath that are the sequence identity plots between different related coronaviruses and SARS-CoV-2, with the x-axis denoting the SARS-CoV-2 genome sequence position (corresponding to the layout on top) and the y-axis showing the identity value along the genome (if you are wondering how they do this, they compare a window of 1000 nucleotides at a time, sliding that window 100 nucleotides along the genome for each step. Clever, eh?). The ZC45 bat virus is compared, as is one of the strains of SARS-CoV (of the SARS outbreak that took place in Asia around 2003), as well as a couple of more distantly related bat coronaviruses. Where you see peaks, there is high sequence identity. Where you see valleys, there is lower sequence identity between SARS-CoV-2 and the other viruses.
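To make the sliding-window idea concrete, here is a minimal sketch in Python. It is not necessarily the tool the report's authors used. It assumes the two genomes have already been aligned to the same length (with "-" marking gaps), and the function name, the gap handling and the default window and step sizes are our own illustrative choices.

```python
def window_identity(ref, other, window=1000, step=100):
    """Percent identity of `other` to `ref` in sliding windows.

    Assumes both inputs are rows of the same alignment
    (equal length, '-' marking gaps).
    Returns a list of (window_start, percent_identity) tuples.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")

    results = []
    last_start = max(len(ref) - window, 0)
    for start in range(0, last_start + 1, step):
        pairs = zip(ref[start:start + window], other[start:start + window])
        # Illustrative choice: ignore columns where the reference has a gap.
        compared = [(a, b) for a, b in pairs if a != "-"]
        if not compared:
            continue
        matches = sum(a == b for a, b in compared)
        results.append((start, 100.0 * matches / len(compared)))
    return results


# Toy example with a tiny window so the numbers are easy to follow.
for start, identity in window_identity("ACGTACGTAC", "ACGTTCGTAC", window=4, step=2):
    print(start, identity)
```

Plotting the second value of each tuple against the first gives the kind of peaks-and-valleys curve described above.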

The authors then immediately jump to the conclusion that "Such evidence, when considered together, is consistent with a hypothesis that the SARS-CoV-2 genome has an origin based on the use of ZC45/ZXC21 as a backbone and/or template for genetic gain-of-function modifications."

We do not agree. Not even close. Why should this be thought of as evidence of gain-of-function (when a genome is manipulated on purpose to provide the virus with new capabilities not present before)? Just because of these sequence similarities? Besides the already mentioned Orf8 protein, the authors especially focused on the fact that the E protein is 100% conserved, which they argue should also be highly unlikely because spontaneous mutations should create differences between all these different viruses.

But the E protein argument is weak, and the authors' own evidence shows it. The E protein sequences they showed for different viruses, meant to demonstrate how easily the protein should mutate, are actually highly conserved. In fact, apart from a few individual strains that exhibit one or two mutations, the E protein is nearly identical in all the coronaviruses they chose to present. On top of that, while they treat 100% identity with the bat coronaviruses ZC45/ZXC21 (their suspected backbone for SARS-CoV-2 development) as suspicious, they themselves mention that such 100% conservation of the E protein has been seen between the SARS-CoV virus and another bat virus. Thus this was perhaps a bit of a premature conclusion. But let's move on.

Fake virus, fake news?

Also interesting is that the ZC45 and ZXC21 bat coronaviruses were discovered and characterized by Chinese military research laboratories, with the findings published in 2018. The authors claim that the bat RaTG13 coronavirus, currently believed to be the closest relative of SARS-CoV-2 with 96% sequence identity, was published as a fake virus to divert attention away from the connection between SARS-CoV-2 and the ZC45 and ZXC21 bat coronaviruses. If that was the intention, it worked! We wrote previously about the race to publish the first SARS-CoV-2 genome sequence, and about how odd it was that the Wuhan Institute of Virology, when presenting its SARS-CoV-2 sequence, also included, seemingly out of nowhere, the highly similar Bat-CoV RaTG13 sequence in the same publication. That was immediately seen as highly suspicious by conspiracy theorists worldwide.

But here is where the conspiracy deepens - the lab that actually won the race to first publish the SARS-CoV-2 genome was the Shanghai Public Health Clinical Centre, and they seemingly did not know about RaTG13, so they compared it to the closest relatives known at the time - the ZC45 and ZXC21 bat viruses. What is also unnerving is that this Shanghai laboratory was shut down soon after by the Chinese authorities, and the authors propose that this was to punish the lab for inadvertently exposing the link between SARS-CoV-2 and the ZC45/ZXC21 viruses. So in the end, no one ever reports on these two viruses. If indeed the RaTG13 virus is fake, then this tactic worked, as that is the only relative that is now usually mentioned in the literature.

Now let's get to the SARS-CoV-2 spike protein, which is responsible for interacting with receptors on human cells and thus mediating the invasion of those cells. The part of the spike protein that interacts with the human cell receptors is called the receptor-binding motif (RBM), and we discussed it in detail in our previous posts at the onset of the pandemic.

The RBM of SARS-CoV-2 differs significantly from those of ZC45 and ZXC21, and not just by a few amino acid changes. The RBMs of the ZC45/ZXC21 viruses are missing three stretches of amino acids that are present in both SARS-CoV-2 and the original SARS-CoV virus. Instead, the RBM of SARS-CoV-2 closely resembles the RBM of the SARS-CoV spike protein. This of course makes sense considering that both of these viruses have an affinity for the ACE2 receptors on human cells, about which we also wrote extensively. The big question is - how did this affinity get there?


Excuse me, just how many amino acids can we mutate before a day’s end?

Here is where the authors propose that basically the SARS-CoV RBM was used to generate the RBM of SARS-CoV-2. But it seems like a stretch. Just look at how the authors explained it: "Although this is not an exact 'copy and paste', careful examination of the Spike-hACE2 ["h" here stands for human origin] structures reveals that all residues essential for either hACE2 binding or protein folding [...] are 'kept'." Later they add "At the same time, majority of the amino acid residues that are non-essential have 'mutated'." The implication is that the SARS-CoV-2 RBM had to be manipulated, and that the extra non-essential mutations were then thrown in to disguise the synthetic nature of the design and make it look natural.

We do not buy it.

First of all, if the virus has affinity for human receptors, then of course the amino acids essential for that interaction will be "kept". After all, that is why the virus binds to these human receptors. Second of all, there are just too many changes between the two spike proteins. If you have ever worked in a lab that does mutagenesis studies like that, you know that every single change has to be characterised for its impact. The SARS-CoV-2 spike protein RBM has so many differences from that of SARS-CoV that it would take months and months of trial-and-error work to determine which changes are acceptable, especially since some of those changes fall within amino acids that are essential for the hACE2 interaction as well (a point the authors gloss over). The biggest issue we find with this idea is one we already wrote about in our original post dedicated to the topic of SARS-CoV-2 origins: the spike protein could have been mutated into an even better interaction with hACE2, because how to do this is already well understood. Why go through all the effort of building the SARS-CoV-2 virus? To create a weapon, a tool for developing future vaccines, or simply a better understanding of potential future emerging threats. But if it were to be used as a weapon, it could have been engineered to be even more deadly.

This point is completely moot if the virus escaped from a lab by accident, i.e. there is no telling whether SARS-CoV-2 is the only virus or maybe an intermediate virus in a (hypothetical) gain-of-function experiment to make a weapon. So SARS-CoV-2 got out but SARS-CoV-10 is still safely(?) in the lab!

But by the way, the lead author did propose that the virus was designed to be a weapon in an interview with Tucker Carlson of Fox News. You can see the brief interview (and very sparse media attention to this report, which is quite surprising considering the nature of its claims) below:

Another scenario we thought of would actually be to do a "copy and paste" and then culture the virus in specially engineered animals that have human receptors for the virus, and allow mutagenesis to take place spontaneously that enhances the infectiousness of the virus. Actually, the authors claim that this would not be likely when they examined how the virus had to come about in nature. This topic is quite fascinating, and this is where the report finally gets really interesting, so let's dive into it.


How could SARS-CoV-2 have come about in nature?

We are going to quote extensively from the report. The authors propose, "if SARS-CoV-2 does indeed come from natural evolution, its RBM could have only been acquired in one of the two possible routes:

  1. an ancient recombination event followed by convergent evolution or
  2. a natural recombination event that occurred fairly recently."

First, let's define some terms.

A “recombination event” is when two genomes of independent organisms swap some genetic material with each other. This is a part of normal evolution. Genetic material is remarkably open to modification, and so changes happen to it all the time, even in our own cells. Hence different cells in your body will accumulate all sorts of remarkable mutations throughout your lifespan, some of which can eventually lead to disease, and this is also suspected to be a contributing process in aging.

“Convergent evolution” is when two independent evolutionary events result in traits of similar function developing in unrelated species. This means the trait is not passed on from one species to the other, even though we might first suspect that due to the similarity of the observed traits. The species evolved it independently of one another. One famous example of convergent evolution is the recurrent development of the ability to fly. Another one is the fact that humans and pigs can eat absolutely anything. Ok, don't take that last one seriously!

Here is how the authors argued against the two points:

"In the first scenario, the ancestor of SARS-CoV-2, a ZC45/ZXC21-like bat coronavirus would have recombined and 'swapped' its RBM with a coronavirus carrying a relatively 'complete' RBM (in reference to SARS). [...] Subsequently, the virus would have to adapt extensively in its new host, where the ACE2 protein is highly homologous [similar] to hACE2. Random mutations across the genome would have to have occurred to eventually shape the RBM to its current form – resembling SARS-CoV RBM in a highly intelligent manner. However, this convergent evolution process would also result in the accumulation of a large amount of mutations in other parts of the genome, rendering the overall sequence identity relatively low. The high sequence identity between SARS-CoV-2 and ZC45/ZXC21 on various proteins (94-100% identity) do not support this scenario and, therefore, clearly indicates that SARS-CoV-2 carrying such an RBM cannot come from a ZC45/ZXC21-like bat coronavirus through this convergent evolutionary route."

This would also be the argument against making the virus more efficient using animal models with human ACE2 receptors, because other parts of the genome would be altered significantly. One possible scenario, though, is that the ZC45/ZXC21 viruses and SARS-CoV-2 originate from another, more ancestral virus. A difference of even a few percent between coronavirus sequences apparently already indicates many years of evolution, as we mentioned in our previous post on virus origins. Thus we are not sure what sequence difference the authors would expect. Moreover, the spike protein of the virus should mutate faster than other parts of the genome because it is under greater evolutionary pressure to adapt to new host receptor binding.

Off to the second option.

"In the second scenario, the ZC45/ZXC21-like coronavirus would have to have recently recombined and swapped its RBM with another coronavirus that had successfully adapted to bind an animal ACE2 highly homologous to hACE2. The likelihood of such an event depends, in part, on the general requirements of natural recombination:

  1. that the two different viruses share significant sequence similarity;
  2. that they must co-infect and be present in the same cell of the same animal;
  3. that the recombinant virus would not be cleared by the host or make the host extinct;
  4. that the recombinant virus eventually would have to become stable and transmissible within the host species."

The authors continued with their reasoning: "In regard to this recent recombination scenario, the animal reservoir could not be bats because the ACE2 proteins in bats are not homologous enough to hACE2 and therefore the adaption would not be able to yield an RBM sequence as seen in SARS-CoV-2. This animal reservoir also could not be humans as the ZC45/ZXC21-like coronavirus would not be able to infect humans. In addition, there has been no evidence of any SARS-CoV-2 or SARS-CoV-2-like virus circulating in the human population prior to late 2019. Intriguingly, [...] SARS-CoV-2 was well-adapted for humans since the start of the outbreak."

This is indeed true, and these are still unsolved mysteries: which intermediate host species allowed the virus to evolve to become infectious to humans, and how did SARS-CoV-2 instantly show such high adaptation for infection in humans without ever being observed before in a more intermediate form? Thus far these remain gaps in the current theory of the natural evolution of the virus that await a solution. In the next post, though, we will look into another incredible theory that could answer this. In that second post, we will also point out that these authors are probably wrong about neither bats nor humans being the reservoir for the recent recombination scenario. There is some funky science in store for you!


Fake it till you make it... again! Virus style

A weird kink in the SARS-CoV-2 story was the recent report finding coronaviruses in pangolins which had a nearly identical spike protein receptor binding domain to the one reported in the SARS-CoV-2 virus. Perhaps the ancestral bat and pangolin virus could have recombined. How do the authors deal with this anomaly?

They propose that the pangolin virus information is also fake! They cite subsequent analyses arguing that these initial findings (also from China) were problematic, and point to the fact that coronavirus has never been seen in the last decade of testing pangolins. The most damning evidence put forth is that the SARS-CoV-2 spike protein, which was also seen in the pangolin coronaviruses, binds human ACE2 receptors 10X more strongly than those of pangolins!

But not so fast. Indeed this all strongly suggests that pangolins might not be the intermediate species, which has yet to be found, but could this not be an example of such a recombination event taking place in pangolins, where one type of coronavirus was able to recombine with an ancestor of SARS-CoV-2? Another hole in the authors' carefully constructed plot is that pangolins have been reported to be hosts to SARS-CoV, which we wrote about previously too. So it is not exactly “never” that coronaviruses are observed in pangolins.

Coming back to the suggestion that the reported science was faked to obfuscate the true origin of SARS-CoV-2: this is a very troubling notion. Troubling because faked scientific information can indeed be published, and it sometimes takes many years to prove that such information does not fit the overall scientific consensus. People are people, and data can be manipulated, cherry-picked or outright falsified in order to publish a desired narrative. The pressure on scientists to publish is immense and, as in any field of human endeavor, there are always people willing to transgress the boundaries of ethics for their own gain or to push a desired agenda. Science is definitely not immune to falsehoods. There are plenty of examples throughout history, and while retractions are usually due to accidental flaws in experimental design, fraud does play a part, and the difficulty of reproducing experiments is well known. Luckily, fraud does not appear to be too common, but basically anything that is published in science should not be taken as accurate until additional supporting information builds up. And that takes time.

The point is that if one wanted to drive a deep wedge into scientific consensus, fake information can set back science for many years. So the authors’ claim that this is what is purposefully taking place to hide the truth about SARS-CoV-2 would, if true, be very disturbing. We hope it is not.

We reported previously on these pangolin coronavirus findings. And indeed it was weird that suddenly, and out of nowhere, multiple papers reported them all at the same time (on the same day, in fact!). But we never assumed that any of that information was false. Such claims of falsehood require very strong supporting evidence. The authors provide references, but do not supply any damning evidence, instead leaving it to the reader to investigate further why the existence of either the RaTG13 bat coronavirus or the pangolin coronaviruses might be fake. (Although apparently such a report is in the making by the authors, so we shall wait and see and then offer our judgement.)


Tricky question: who's up for hosting a virus?

The final statement against the natural origin of the SARS-CoV-2 virus was that prior modeling showed the SARS-CoV-2 spike protein had a higher affinity for the human ACE2 receptor than for any other known animal ACE2 protein. The authors thus conveniently concluded that "This last study virtually exempted all animals from their suspected roles as an intermediate host." So it has to be man-made! That is a sweeping statement. First of all, if you recall the original dire model projections of how deadly the COVID-19 pandemic was expected to be and compare them to the ensuing reality, you have a perfect example of how complicated modeling can be (although it is still definitely a useful and important tool). Second of all, we can hardly dismiss the entire animal kingdom just because the few known animal ACE2 proteins (14 to be exact) did not show the same affinity in computer models. Have you heard of civets before? They are believed to have been the intermediate host for the evolution of the original SARS-CoV. Heck, many people had probably never heard of pangolins either. Who knows what animals could still be out there that could have been the intermediate host. We see this more as fitting the information to a desired narrative, and thus the language used by the authors should have been more speculative rather than so authoritative in its conclusions.

By the way, guess which animal showed the second highest affinity between the SARS-CoV-2 spike protein and the ACE2 receptor, after humans, in that study? Pangolins! The authors of that study even commented that the similarity between the pangolin and human ACE2 receptors might have driven convergent evolution, resulting in such similar receptor binding domains! This is not based on some fake virus but rather on the pangolin ACE2 receptor architecture! Why was this information ignored by the authors of the report?

Then, just to demonstrate such an impartial stance, the authors claim that even if we "assume that such a host does exist", the recombination would be highly unlikely to take place because:

  • The spike protein RBM would have to be recombined from one virus to another, which is a rare form of recombination (no explanation is given as to why; we presume it is because of the high sequence diversity in this region)
  • Only one type of SARS-CoV has been observed in human history (until now), so producing another that resembles the SARS-CoV RBM would be rare (we do not follow this logic. Clearly coronaviruses exist with a variety of different spike proteins with varying degrees of similarity. Potentially there are lots of options to mutate into a form that would target humans. SARS-CoV had to come about somehow too)
  • The two ancestral viruses swapping the RBM would need to reside in the same cell

As to that last point, it brings us to another very unusual and completely overlooked report regarding the history of SARS-CoV-2 which proposed how this very event might have taken place: hosting two viruses at the same time to produce a new one. And not inside some intermediate host, but in humans! It is a story that is bound to take you by surprise.

For that, stay tuned!


This article has been produced by Merogenomics Inc. and edited by Jason Chouinard, B.Sc. Reproduction and reuse of any portion of this content requires Merogenomics Inc. permission and source acknowledgment. It is your responsibility to obtain additional permissions from the third party owners that might be cited by Merogenomics Inc. Merogenomics Inc. disclaims any responsibility for any use you make of content owned by third parties without their permission.

