Wuhan coronavirus uniqueness – what does science say?
Dr.M.Raszek
First, the terminology. You might have come across the term COVID-19. Most people assume it refers to the virus, but it actually refers to the disease that the novel coronavirus causes. It is shorthand for “coronavirus disease 2019”.
The name for the virus is severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). To add to the confusion, you may also have seen 2019 novel coronavirus, or 2019-nCoV for short (also nCoV-2019), which was a provisional name used until the virus was better understood. All of these terms have been used interchangeably, but only SARS-CoV-2 and 2019-nCoV actually name the same thing, the virus. The official name is SARS-CoV-2.
Second, if you want to read the official, sanctioned educational resources on this virus for yourself, you will find them linked at the bottom of this article. There was a ton of science to go through in the making of this post, and that led to many useful links.
Coronaviruses are a large family of enveloped viruses with single-stranded RNA genomes. The family includes the virus that caused severe acute respiratory syndrome (SARS). Along with the Wuhan SARS-CoV-2, it belongs to a specific subgroup of coronaviruses, the beta genus (betacoronaviruses). There are four of these groups: two of them include coronaviruses that target mammals, and two contain viruses that mainly use birds as hosts.
The origin of the SARS-CoV-2 virus has not been conclusively demonstrated, meaning we do not know how it started to infect humans. There has been a tremendous amount of confusion and rumour about whether this virus was human engineered rather than originating in nature. One of the reasons is that the epicentre of the first outbreak was Wuhan city in China, which is also home to China’s highest biosafety level laboratory, the Wuhan Institute of Virology. These types of institutes study the nastiest and most dangerous viruses imaginable. To fuel the suspicion, the Wuhan laboratory was also involved in studying bat viruses. So you can imagine how inflammatory this mix of facts could become.
So let’s pose the nasty question.
Could the virus be synthesized by humans?
In other words, not of natural origin. Or is it natural? To answer this we will dive deep into current scientific understanding of the architecture of this virus.
From a scientific point of view, there is no published evidence at all that the SARS-CoV-2 virus was engineered. Well, almost none. We will get to that in a second. But the scientific support for a natural origin of the virus is so substantial that a number of scientists publicly denounced the “synthetic origins” theory of the coronavirus. These concerned scientists cited 10 publications in which the genome of the SARS-CoV-2 virus was analyzed without any suggestion that something was unusually out of place. That is a lot of publications on the genome of this virus with no hint of anything unusual. Since then, also nothing.
It is on these detailed investigations that the conclusion rests that this is a virus of natural origin.
Let us explore.
Who published the first genome sequences of the SARS-CoV-2 virus? The weird part was that at first it looked like it was in fact the Wuhan Institute of Virology. That kind of makes sense; after all, the outbreak was happening around them. They were indeed the first to publish an online article about it, as a preprint.
Before we go on, let’s mention science preprints. They will be a common theme in our story. Preprints are scientific research publications that have not yet been peer reviewed at the time of their online publication, meaning they have not been officially published by any journal and therefore might not be fully credible as a source of information. It is research presented online before any official analysis. Preprints are normal these days, and they allow scientists to have the entire global community effectively peer-review their work, with the goal of spotting any issues prior to the actual review by a panel of acclaimed scientists (almost always anonymous, not sure why) before publication in a scientific journal.
So, going back to the SARS-CoV-2 genome, it was actually the Shanghai Public Health Clinical Center that was the first to publish the virus genome sequence in GenBank, the global repository of sequenced organisms. The race between the two Chinese institutes was very close, and the peer-reviewed journal publications from both institutes appeared at the same time in the same journal (considered to be the most prestigious journal in the world, we might add).
Here is the timeline of events.
Institute | GenBank publishing date | BioRxiv publishing date | Journal publishing date
Shanghai Public Health Clinical Center | Jan 12 | Jan 25 | Feb 3
Wuhan Institute of Virology | Feb 11 | Jan 23 | Feb 3
Each of these analyzed viruses was isolated from a different individual.
By the way, you can see how the virus infection is tracked around the world using genome sequencing data and how the virus is continuously evolving. We will come back to that point soon.
When bats are the closest relatives we are left in the dark
However, it was the Wuhan Institute of Virology scientists who discovered that the closest relative to SARS-CoV-2 is a bat virus termed Bat-CoV RaTG13. The two genomes are 96.2% identical, which supported the view that SARS-CoV-2 has its origins in bats (later backed by additional publications). Bats are natural reservoir hosts of SARS-related coronaviruses, meaning these viruses infect bats and persist in that species. Some viruses can jump across species with no problem, and some cannot. Sometimes those that cannot will mutate until they are able to infect another species. That is how we sometimes pick up viruses that are new to us: they were residing in another species, and so we ignored them. In theory we could get infected like that from any species, but bats are especially bad because they seem not to be affected by some of these nasty viruses that can be really dangerous to us. That is where the SARS coronavirus came from that started a scary outbreak nearly two decades ago (it is believed to have gone from bats to another species to humans). Incidentally, this is how the virus that caused SARS disease got its name: SARS-CoV. So you can probably figure out by now that if 2019-nCoV is also called SARS-CoV-2, then SARS-CoV and SARS-CoV-2 are going to be related. And they are: 79.5% identical.
Importantly, they also discovered that SARS-CoV-2 enters our cells the same way that SARS-CoV does, by attaching to cell receptors encoded by the ACE2 gene. Since then this has been confirmed by scores of scientific studies - this will be important in a moment.
In other words, we have another SARS-like outbreak, but this time a whole lot worse in terms of infectivity. Don’t ask why the disease is called COVID-19 and not SARS-2 - perhaps because symptomatically they present differently.
We will also add that coronavirus genomes are made of RNA and not DNA (other viruses can have DNA genomes), and you can see a simple video on how viruses infect and use our cells below.
But, despite the seemingly close relation between SARS-CoV-2 and Bat-CoV RaTG13, they are already different enough to suggest that the lineage leading to SARS-CoV-2 separated from that of Bat-CoV RaTG13 decades ago. You can estimate evolutionary time from the number of differences and the estimated rate of mutation. In the same way, you can track the history of a virus in humans by observing how different the viruses isolated from humans are from one another. The SARS-CoV-2 genomes studied thus far show very few differences between them (although they constantly mutate), indicating that SARS-CoV-2 began infecting humans very recently. This means we have a missing link between Bat-CoV RaTG13 and SARS-CoV-2, which is why you might have heard the theory that an animal other than a bat was the source of the virus that started the outbreak.
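To make figures like 96.2% and 79.5% concrete, here is a minimal sketch of how percent identity can be computed once two sequences have been aligned. The function name and the two toy fragments are made up for illustration and are not real viral sequence; real genome comparisons rely on dedicated alignment software rather than a few lines of Python.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which both sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must already be aligned to the same length")
    # Ignore alignment columns where both sequences have a gap ("-").
    columns = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable positions")
    # A gap opposite a base counts as a difference.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical 20-base RNA fragments that differ at exactly one position.
toy_a = "AUGGCUACGUUAGCCGAUAA"
toy_b = "AUGGCUACGUUAGCCGAUGA"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identical")
```

One mismatch in 20 positions gives 95% identity. The same arithmetic applied across a genome of roughly 30,000 bases is what sits behind the percentages quoted above.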
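The idea of dating a split from the mutation rate can be sketched with a very crude "molecular clock" calculation. The substitution rate used below is an illustrative assumption, not a figure from this article, and the function name is hypothetical. Real analyses use far more elaborate models that account for repeated mutations at the same site and for recombination, so treat the result as an order-of-magnitude illustration only.

```python
def divergence_years(percent_identity: float, subs_per_site_per_year: float) -> float:
    """Crude estimate of the time since two lineages shared an ancestor.

    Both lineages accumulate changes independently after the split, so the
    observed fraction of differing sites is divided by twice the rate.
    """
    distance = 1.0 - percent_identity / 100.0
    return distance / (2.0 * subs_per_site_per_year)


# ASSUMPTION: an illustrative rate of one change per 1,000 sites per year.
ASSUMED_RATE = 1e-3

# 96.2% is the identity quoted above for SARS-CoV-2 and Bat-CoV RaTG13.
years = divergence_years(96.2, ASSUMED_RATE)
print(f"roughly {years:.0f} years under the assumed rate")
```

Under this assumed rate the estimate comes out at about two decades. A slower assumed rate pushes the split further back, which is why such estimates are always quoted with wide uncertainty.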
Another weird part of the story: Bat-CoV RaTG13, the super close relative of SARS-CoV-2, was reported for the first time by the Wuhan Institute of Virology scientists at the same time as they published the SARS-CoV-2 sequence for the first time. This added to the conspiracy theories (well, isn’t it convenient that they had a bat virus so similar?), but without it, SARS-CoV-2 would have appeared quite different from any other close relative. To some that looked super suspicious, and you can see why controversies can sometimes just create themselves.
Here is why. One of the papers that did analyze the SARS-CoV-2 virus genome concluded as follows: “the new coronavirus provides a new lineage for almost half of its genome, with no close genetic relationships to other viruses within the subgenus of sarbecovirus [sublineage of the beta-coronaviruses group to which SARS-CoV-2 belongs]. This genomic part comprises half of the spike region encoding a multifunctional protein responsible also for virus entry into host cells”. In other words, here we have a brand new mysterious virus that appeared to have a totally new genome around the region that is responsible for infecting our cells (perhaps explaining the higher ability of SARS-CoV-2 to infect), and the only other naturally derived genome that shows any similarity to this region is the Bat-CoV RaTG13 virus, which was also reported by the Wuhan Institute of Virology but only once the outbreak had occurred. We can’t blame people’s sceptical nature for taking over and becoming suspicious. Too many bizarre coincidences stacked together for some folks not to start wondering. This story is about to get even stranger. But coincidence is not proof of synthetic creation, and we will get to that.
It’s all in the receptor binding intimacy
Now let’s get back to ACE2. The SARS-CoV and SARS-CoV-2 genomes encode a spike protein (also referred to as S protein) that is responsible for the coronavirus entry into our cells. These spike proteins are anchored in the viral envelope, and they are the ones that give these viruses the characteristic crown-like shape from which these viruses derive their “coronavirus” name. Since they are on the surface of the viral envelope, they can come in contact with proteins found on the surface of human cells. One such protein is called angiotensin-converting enzyme 2 (ACE2), and it is a receptor for the viral spike protein. We are talking the key-lock analogy here.
The viral spike protein has a specific region that is involved in that key-lock interaction with ACE2, and we call it the receptor-binding motif. This receptor-binding motif can be quite mutated, can differ dramatically between coronaviruses, and can determine the degree to which coronaviruses infect different species. We know that specific amino acids, the building blocks of proteins, can enhance viral binding to human ACE2 when altered. They have been meticulously mapped out in SARS-CoV, and the significant locations are 442, 472, 479, 480 and 487 along the chain of spike protein amino acids (although the chain of amino acids adopts a complex three-dimensional structure, so amino acids that are close to one another on the chain might not end up close in three-dimensional space). Viruses have been engineered that put all of these mutations together, leading to the creation of super infectious coronaviruses.
So let us look at the corresponding ACE2-binding positions in SARS-CoV-2. Get ready for this - it’s an intense load of information.
SARS-CoV-2 spike protein amino acid position | Corresponding SARS-CoV amino acid position | SARS-CoV-2 spike protein amino acid identity | SARS-CoV-2 binding to ACE2 expected outcome
455 | 442 | Leucine | Favourable
486 | 472 | Phenylalanine | Optimal
493 | 479 | Glutamine | Optimal
494 | 480 | Serine | Favourable
501 | 487 | Asparagine | Favourable
The three-dimensional structure of the SARS-CoV spike protein interacting with the ACE2 receptor has been mapped out in exquisite detail. We are talking atomic-level resolution, where we understand where each atom of every amino acid is in relation to the others in the complex interaction between these two proteins! It is like saying a drone in the sky can see what you have stuck between your teeth. By now the structure of the SARS-CoV-2 spike protein has also been elucidated. From that, we can compare it to SARS-CoV and infer how good a binding partner it is for ACE2. Such analyses concluded that SARS-CoV-2 is more efficient than SARS-CoV at recognizing ACE2. Another set of authors who also analyzed the above three-dimensional molecular interaction even went as far as to say that “This is strong evidence that SARS-CoV-2 is not the product of genetic engineering”, because in theory you could get an even better, more infectious virus. The three amino acids in the SARS-CoV-2 spike protein that are merely favourable for ACE2 binding could be mutated to amino acids that are even better for that interaction.
On top of that, apparently there is also a form of the virus in nature that has the same amino acids in these key positions as SARS-CoV-2. This in fact was the first paper we saw that argued against “laboratory manipulation” and called it “improbable”. Instead, these authors argued that the virus went through two cycles of natural selection before becoming our new pain in the lungs: first in a host animal before jumping to humans (zoonotic transfer), and then subsequently in the human population.
One more item: amino acid position 487 of SARS-CoV in the table above was known to be important for human-to-human transmission during the SARS outbreak. The amino acid at the corresponding position in SARS-CoV-2 is still favourable, and this was the very first hint that SARS-CoV-2 could be transmitted between humans, which was indeed later confirmed as the outbreak escalated.
Another controversial twist to the story: cleavage exposed!
But the story does get even more interesting. That same paper, the one proclaiming that “genomic evidence does not support the idea that SARS-CoV-2 is a laboratory construct”, also reported another highly unusual feature of this virus. In the part of the genome that codes for the spike protein the virus needs for binding to ACE2 receptors, there is an inserted sequence. It leads to a very specific modification of the spike protein: it introduces a cleavage site within the amino acid sequence. This means that the spike protein can be cut in two by specialized scissor proteins (donated by the infected host, meaning ourselves). This is significant because such a site has not been observed before in this lineage of coronaviruses, and in other types of viruses that have such a feature it is known to enhance viral fusion with host cell membranes. In other words, it increases the infectivity of the virus. Whether that is indeed an acquired benefit for SARS-CoV-2 still needs to be determined. Another publication that studied this unique feature of SARS-CoV-2 in detail suggested that it is a result of “convergent evolution”, where similar features arise independently in different organisms.
Conspiracy theorists would go “yeah right!”. Except that this is how you know that many of those people peddling different conspiracy theories around this virus do not know/understand the science behind the virus because none of them have seemed to really pick up on this part yet.
Although wouldn’t you know, it’s not like somebody didn’t try engineering SARS-CoV to have this feature to see if it would become more infectious. Before you ask, no it was not Wuhan Institute of Virology.
By the way, if you heard the rumours that SARS-CoV-2 could be treated with anti-HIV drugs, it is because HIV employs a similar cleavage mechanism to enhance its infectivity (the mechanism is also found in Ebola and measles). Some anti-HIV drugs are used to inhibit our host proteins that do this cutting. One such molecular set of scissors of ours is furin, which is widely expressed in lung cells. One Chinese preprint even compiled a list of all the available anti-furin drugs as candidates for treating the 2019 outbreak coronavirus. One might ring a bell with you, and it was a surprise: folic acid.
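Spotting a candidate furin cleavage site in a protein sequence is a simple pattern search. The sketch below scans for the minimal furin recognition pattern, arginine, any two residues, arginine (written R-X-X-R in one-letter amino acid code). The function name is hypothetical, and the short peptide is an illustrative fragment from around the reported spike insertion; it is not taken from this article, so treat it as an assumption. A pattern match only flags a candidate and does not prove that cutting happens in a living cell.

```python
import re

# Minimal furin recognition pattern R-X-X-R. The lookahead lets the scan
# report overlapping candidates as well.
MINIMAL_FURIN_MOTIF = re.compile(r"(?=(R..R))")


def find_minimal_furin_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for each R-X-X-R hit."""
    return [(m.start() + 1, m.group(1))
            for m in MINIMAL_FURIN_MOTIF.finditer(protein.upper())]


# ASSUMPTION: illustrative fragment around the reported insertion.
fragment = "NSPRRARSVAS"
hits = find_minimal_furin_motifs(fragment)
print(hits)
```

Positions are counted within the fragment, not within the full spike protein. Running the same scan over the equivalent region of a related virus that lacks the insert would return an empty list.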
And this was all the non-controversial stuff thus far. Let’s get into the controversial bit, or the one and only scientific proclamation thus far pointing to evidence that SARS-CoV-2 was engineered, or as authors phrased it “of unconventional evolution”. But since its online premiere (again, as a preprint), this paper has faced so much criticism that it has been retracted by its authors. What did they claim?
These scientists from India compared the spike protein sequences of different coronaviruses and found that the SARS-CoV-2 spike protein is endowed with four unique inserts not found in the others. Yes, this sounds familiar, and the unique insert we were just discussing above is one of these four. Furthermore, these authors claimed that the sequences of all four inserts happened to be similar to sequences found in HIV, and that to coincidentally have four such features resembling a completely different specific virus was “unlikely to be fortuitous in nature”, as they put it.
They got destroyed for their claims! There was an outcry from online commentators that these inserts are so tiny that you cannot claim they all come from HIV, because such tiny stretches of sequence appear in so many viruses anyway. On top of that, the authors completely missed the boat on the fact that one of the inserts is the cleavage site we just discussed (which, as we mentioned, is indeed present in HIV).
From a statistical point of view, sure, it may well be true that it was premature to call all of these inserts related to HIV. But what is interesting is that we never saw the presence of the three additional short inserts addressed by anyone else either. Why are they there and what was their origin? As it happens, all three of these inserts (we are ignoring the cleavage site) congregate at a location of the spike protein that is actually used for binding to ACE2. So they also might have an important influence on how infectious the virus is or how it is transmitted. You saw how dramatic an influence a change of a single amino acid can have on the ability of the virus to interact with a receptor. Should this not have been studied in detail? Once again, this did not help stem the flow of suspicion over the origins of this virus.
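The statistical objection raised by the commentators can be illustrated with a back-of-the-envelope calculation: how often would one specific short peptide turn up purely by chance in a large sequence database? The sketch below assumes every amino acid is equally likely and independent of its neighbours, which real proteins are not. The peptide lengths and the database size are hypothetical values chosen for illustration, not figures from the retracted preprint.

```python
def expected_chance_hits(peptide_length: int, database_residues: int,
                         alphabet_size: int = 20) -> float:
    """Expected number of exact chance occurrences of one specific peptide.

    Simplifying assumption: residues are independent and each of the
    `alphabet_size` amino acids is equally likely.
    """
    # Number of windows of this length that fit in the database.
    positions = max(database_residues - peptide_length + 1, 0)
    return positions / (alphabet_size ** peptide_length)


# ASSUMPTION: a hypothetical database holding one billion amino acids.
HYPOTHETICAL_DATABASE_RESIDUES = 1_000_000_000

for k in (6, 8, 12):  # hypothetical insert lengths
    hits = expected_chance_hits(k, HYPOTHETICAL_DATABASE_RESIDUES)
    print(f"{k}-residue peptide: about {hits:.3g} chance matches expected")
```

The expected count falls off steeply with length: the shorter the stretch, the more likely it is to find a match somewhere by chance alone, which is the core of the criticism. Searches that tolerate mismatches, as similarity searches usually do, make chance hits more likely still.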
But the story about this virus is constantly evolving (excuse the pun).
Enter the pangolins! Wait, what are pangolins?
It turns out that a related coronavirus has been found with a spike protein receptor-binding domain nearly identical to that of SARS-CoV-2! There go your suspicions of the virus spike protein being engineered, straight out the conspiratorial window.
Which animal was the carrier of this new virus you may ask? Pangolins! We don’t blame you if you don’t know what that is, so here is a cute picture (seriously, would you eat that?).
Suddenly a flurry of preprints on this discovery came all on the same day (all from China)! Four of them! What are the odds they would all come out at the same time? Pretty amazing.
Overall, the Pangolin-CoV (or pangolin-CoV-2020) virus showed 85.5-92.4% similarity to SARS-CoV-2.
But more importantly, the receptor-binding domain of the Pangolin-CoV spike protein was nearly identical to that of SARS-CoV-2 (with only one amino acid difference), compared to 89.2% amino acid similarity in that same area between SARS-CoV-2 and the Bat-CoV RaTG13 mentioned earlier. Good thing those scientists from India retracted their claims about the HIV comparison, because now their story would look pretty goofy.
However, this does not mean that it was pangolins that infected us with this virus. In fact, overall, the spike protein of Bat-CoV RaTG13 is more similar to that of SARS-CoV-2 than the Pangolin-CoV spike protein is. The Pangolin-CoV spike protein also does not have the cleavage site that SARS-CoV-2 has. Rather, the pangolin viruses might be a common ancestor of SARS-CoV-2 and Bat-CoV RaTG13. But this clearly indicates that pangolins might be host to viruses that could be transmitted to us, and hence could be dangerous to eat. Hint hint. Rather than being hunted for food or their scales, they should be studied to track the evolution of their own coronaviruses.
One of the preprints concluded that “2019-nCoV might have originated from the recombination of a Pangolin-CoV-like virus with a Bat-CoV-RaTG13-like virus”. So we are no closer yet to solving the mystery of where the virus came from, but it is unlikely to have come directly from pangolins, as the sequence difference is too great. The missing link for now continues to be missing. Pangolin-CoV has likely donated the spike protein component found in SARS-CoV-2, and SARS-CoV has previously been found in pangolins, thus pangolins could very likely be the intermediate host that facilitated the evolution towards SARS-CoV-2. For now, bats and pangolins remain the only mammals we know of that are infected by this lineage of viruses. You might say they are nature’s own laboratory for synthesizing new viruses.
Epilogue to SARS-CoV-2: global response
Could the SARS-CoV-2 virus have been human engineered though? Of course it could have, and no one would have been the wiser: how would we know whether any of these unique changes we observe in the virus were natural or not? Technically we can make viruses in whatever fashion we want, piecing them together like pieces of Lego, and for good measure, a synthetic version of the 2019 coronavirus was recently demonstrated, created in a span of only days. This technology has been developed so that we can learn about viruses and how to protect ourselves from them when a dangerous form like SARS-CoV-2 appears. In theory you could build anything you want.
Whether the virus was an outcome of “convergent evolution” or “unconventional evolution”, the point is that these changes we observe in the virus have made it more infectious and more deadly. And the sooner this pandemic is over, the sooner we can breathe a collective sigh of relief. This is the time where the world has to be smart together. It has to have dawned by now on everyone that this virus is taking over the world. How far it will go, we just do not know.
Judging from the type of punishment that China was willing to inflict on itself in order to stop this virus from being severely damaging to its own people and the rest of the world, it is a pretty good hint that this virus is serious. By all accounts, China has taken a tremendous beating. Its economy is shut down. We are lucky so few have died, and we all have to be smart together to minimize the impact of such a smartly designed biological threat as this SARS-CoV-2 virus.
Our global response has been smart so far not to underestimate this. People on the street are also voting with their actions and choosing to be prepared just in case too.
This is the time when sworn enemies in this world extend a helping hand to one another, and amid all of this fear of the unknown and of what this virus is already doing across the world, it is so heartwarming to see exactly that: the whole world coming together. Countries are coming together to deal with this, both in terms of stopping the outbreak and sharing resources.
Right now we need the world to be unified because the pain from this virus is global. And the world has responded.
Important SARS-CoV-2 virus information resources.
Source | Link
Alberta AHS Situation Summary | https://albertahealthservices.ca/topics/Page16944.aspx
US CDC Situation Summary | https://www.cdc.gov/coronavirus/2019-nCoV/summary.html
Clinicians' Biosecurity News | http://www.centerforhealthsecurity.org/cbn/2020/cbnreport-02272020.html
Global Cases Tracker | https://www.gisaid.org/epiflu-applications/global-cases-covid-19/
China CDC Epidemiological Characteristics | http://weekly.chinacdc.cn/en/article/id/e53946e2-c6c4-41e9-9a9b-fea8db1a8f51
Therapeutic and Triage Strategies | https://www.thelancet.com/journals/lanres/article/PIIS2213-2600(20)30071-0/fulltext
Information for Clinicians | https://jamanetwork.com/journals/jama/fullarticle/2760782
This article has been produced by Merogenomics Inc. and edited by Jason Chouinard, B.Sc. Reproduction and reuse of any portion of this content requires Merogenomics Inc. permission and source acknowledgment. It is your responsibility to obtain additional permissions from the third party owners that might be cited by Merogenomics Inc. Merogenomics Inc. disclaims any responsibility for any use you make of content owned by third parties without their permission.