
Pesky common diseases have pesky genetic roots
Dr. M. Raszek
Complex traits, complex origins
I am obviously an advocate of sequencing one’s own genome to glean valuable information for one's own benefit, and today we will focus on one particular aspect of a genome that is more likely to provide future benefits than currently known ones. This is the information about all of the complex traits you inherit that exist due to a confluence of many different genes and other areas of your genome, as opposed to one single gene. Such complex traits exist due to dozens, or even hundreds, of different tiny or big variations dispersed throughout people's genomes. To give you some familiar examples, coronary heart disease, type 2 diabetes, hypertension, and obesity fall into this category, as do many cancers.
The underlying genomic architecture behind these problems is so convoluted that how these genome variations come together to produce the final outcome is still a mystery. So they have not entered the stage of clinical action, but as time goes by, it is expected that one day they will, and I'll show you some evidence to support both points. Which is exciting because a sequenced genome is a genome that keeps on giving!
Benefits of sequencing your own genome
Your typical current benefit of getting your genome sequenced is to learn about your predisposition to diseases by discovering what pathogenic mutations are lurking in your genome. These can be both treatable and untreatable conditions. It is up to you to decide if you are willing and capable of handling the news about potential untreatable conditions that may or may not arrive in the future (because with the complexity of genomics, almost nothing can be guaranteed). Your other big benefit is obtaining pharmacogenomic information, a fancy way of saying knowledge about how you might process and react to medications. The final entry in the big trifecta is your carrier status: the mutations you might carry that are harmless in you, but if your partner has the same mutations, then you two run the risk of passing a disease to your children. Since that can now be circumvented with reproductive planning, you don't have to ditch each other just because of bad genes!
Single gene disorders
But the predisposition to diseases can be defined in a couple of ways. Monogenic conditions are produced by something going wrong with a single gene. These are the bread and butter of clinical genomics because they are more easily understood, and more easily predictable in terms of potential outcomes, so it becomes more likely that a diagnosis can be obtained. Not that it is a trivial process, and in a true clinical diagnostic setting, it will take a team of overly smart people who stare at computers far too long to produce that medical report for your doctor. And there are thousands of diseases to contend with! By the latest count we are looking at just over 5000 disorders with over 3500 contributing genes!
How many genes can be used clinically towards diagnosis? Currently available tests analyze well over a thousand genes for which enough data has been accumulated to be informative. Of these, a few hundred genes are actionable, meaning if you or your child were to find that you had one such mutated gene, doctors could spring into some form of action for you. Out of those few hundred, 59 genes are backed by enough evidence of their contribution to disease, enough observations of such mutations in people with known outcomes, enough knowledge of how the disease is expected to manifest, and a proven track record of successful intervention, that they are sanctioned as the minimum gold standard of information to be delivered to a person who is sequencing her or his genome. And that list will only grow bigger, as it already has in select institutes that collect their own data (as I previously described in a post on the BabySeq project and the ClinGen project).
Common complex diseases: definitely common, definitely complex
The other kind are the aforementioned multigenic conditions, referred to as common complex diseases. Common because they are ridiculously prevalent in our population (think of cardiovascular problems), and complex because of the multitude of genomic locations somehow contributing to their outcome.
Multitude of genomic locations. Somehow contributing to the outcome. Sounds obscure. And it is. This is why they are not ready for prime time. But... they are certainly being investigated! And I will tell you about it!
But how are these genomic locations even known about if their contribution is so uncertain? The approach has a seemingly self-explanatory name of genome wide association studies (right?), with a memorable acronym of GWAS! Well, in an oversimplified explanation, you take two groups of people: those that are considered healthy or "normal" (red flag right there) as your control group, and those who show the particular trait you want to study. And this can range from those serious health conditions, to even wildly obscure behavioural traits, or some outstanding abilities (as long as there is an appropriate control group to compare to). Scientists will find no limit in what could and should be studied, so these can be outrageously surprising sometimes. Like your likelihood of producing ear wax or something like it.
But the point is that you start with what you think are two distinctly different groups. So you can probably guess that the more accurately you place these people in such groups, the more accurate the data will be. And now you can also imagine that there certainly will be some overlap (where a person appearing healthy might not be, which is more common than you would think, or a person appearing sick might have similar symptoms but not actually be sick), and why this data might not always produce accurate results. But you still try your best, and you compare their genomes and look for differences. And voila, you find that people with heart attacks carry certain mutations (or variants, to use the more recently adopted term) more often than healthy people do, or that people with certain sport abilities have such and such variants, or people with certain metabolic patterns have such and such variants, or those who have hairy backs have such and such variants, and so on. In such a way, you can find dozens and dozens of variants, each of which can be measured for its impact on a given trait based on how strongly it is tied to that trait.
And that was the oversimplified version! I don't think I could handle the complex version!
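To make the case-versus-control comparison a little more concrete, here is a minimal sketch of how the strength of a single variant's tie to a trait is commonly summarized as an odds ratio. The allele counts are made up for illustration, the function names are my own, and a real GWAS repeats this kind of test across the whole genome with corrections this sketch ignores.

```python
import math

def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds of carrying the variant allele in cases relative to controls."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)

def odds_ratio_interval(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Approximate 95% confidence interval on the log-odds scale (Woolf method)."""
    log_or = math.log(odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref))
    std_err = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    return math.exp(log_or - z * std_err), math.exp(log_or + z * std_err)

# Hypothetical allele counts at one genomic location (not from any study).
counts = {"case_alt": 300, "case_ref": 700, "ctrl_alt": 200, "ctrl_ref": 800}

or_value = odds_ratio(**counts)
low, high = odds_ratio_interval(**counts)
print(f"odds ratio {or_value:.2f}, 95% interval {low:.2f} to {high:.2f}")
```

An odds ratio above 1 whose interval stays above 1 suggests the variant is seen more often in the affected group. Misplacing people between the two groups pulls that ratio back towards 1, which is one reason group assignment matters so much.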
And by the way, many of the direct-to-consumer genetic tests you can frivolously buy for yourself without ever requiring a doctor, are based on such GWAS studies. So to some degree that can be a problem. Because the degree of supporting evidence to confirm that some of these variants are indeed correct can range from strong to very weak (even outright wrong). In addition, often these studies are done on Caucasians and therefore the data might not be applicable to other ethnicities, and a bias can be easily introduced based on the group selection. And even how the impact of all these variants should be measured together is not known! There are multiple prevailing suggested methods, but frankly, no one yet knows. And who knows what "proprietary" methods are used by these consumer companies to calculate those final risk scores presented to the public. Good luck with the client knowing that!
Not that I discourage people from taking such tests. Quite the opposite. One way in which we can enhance our understanding is by producing and analyzing more data. What I am saying though, is that perhaps don't take the results of those tests as absolute certainty and go chop your body parts off! Instead, consider it as a voluntary financial and genetic contribution on your part towards one day someone deciphering that data. As I said, the prime time has not arrived yet for analysis based on such studies. The understanding of the contribution of these variants is just lacking. But it is certainly expected that one day that understanding will arrive, so thank you.
But those complex diseases are here, and they are not going away. And what is fascinating is that unlike the monogenic diseases mentioned above, the majority of the variants contributing to these complex traits reside outside of the genes! They are in the no man's land, or in the regulatory regions dedicated to controlling the function of these genes. That is one of the values of having the entire genome sequence, and why I say go for the full genome instead, because all of that information is captured, and even if not fully understood now, it might be in the very near future.
This understanding can be achieved in a couple of ways: by finding more and more of the variants that collectively contribute to these complex traits, so that more of the overall heritable contribution to these traits is accounted for; and by studying the impact of these variants over the long term. How often do the variants which are claimed to contribute to a complex trait actually result in that trait if found in a person currently not displaying it? So if I have the mutation but I don't show the problem, will the problem show up?
And that is just the heritable component. To make matters more complex, the majority of these complex traits are also influenced by the environment we live in, and not just genetics. For these complex traits, what lurks in your genome is only part of the story. What you do with your life is another, and can very much influence the severity or mildness of the final outcome. Think of your propensity to obesity for example. You might have high-risk genes, or low-risk genes, but despite whatever Mother Nature has genetically bestowed on you, your lifestyle choices, such as your exercise regimen, your diet, your smoking habits, medication, propensity to stress, and a gazillion other factors big or small, will contribute to the final weight count.
Clinical utility fraught with many problems
So with that being said, enter a scientific study! A rare study that actually analyzed the multigenic conditions, some of which included those mentioned above, and looked to see how good this can get in clinical use. Even after carefully selecting which variants to include in the analysis, some of the traits still had massive numbers of variants involved in their contribution: 105 variants for multiple sclerosis, 95 for coronary heart disease, and 77 for diabetes. You see what I mean? And 91% of all the variants studied were outside of the genes!
And since we mentioned that no one really knows how best to assess the cumulative impact of so many variants contributing to a complex trait, the authors compared three prevalent methods. 379 individuals of Caucasian descent were assessed, and divided into 10 subgroups of total risk. While all three methods correlated closely for all of the individuals, at best you'd get a 50% agreement between any two methods to get the same individuals into the subgroup with the riskiest variants, and even less for all three methods to agree. Problem number one.
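To see how two scoring methods can correlate overall and still disagree on who lands in the riskiest subgroup, here is a small simulation sketch. Everything in it is an assumption for illustration: the odds ratios and allele frequencies are randomly generated, and the two scoring rules (a plain count of risk alleles and a sum weighted by the log of each odds ratio) are common textbook choices, not necessarily the three methods the authors compared.

```python
import math
import random

random.seed(0)

N_PEOPLE = 379   # cohort size quoted in the text
N_VARIANTS = 95  # variant count quoted for coronary heart disease

# Hypothetical per-variant odds ratios and risk-allele frequencies.
odds_ratios = [random.uniform(1.02, 1.30) for _ in range(N_VARIANTS)]
frequencies = [random.uniform(0.05, 0.95) for _ in range(N_VARIANTS)]

def simulate_genotype():
    """Number of risk alleles (0, 1 or 2) a person carries at each variant."""
    return [sum(random.random() < f for _ in range(2)) for f in frequencies]

def count_score(genotype):
    """Method A: every risk allele counts the same."""
    return sum(genotype)

def weighted_score(genotype):
    """Method B: each risk allele is weighted by the log of its odds ratio."""
    return sum(g * math.log(o) for g, o in zip(genotype, odds_ratios))

def top_group(scores, n_groups=10):
    """Indices of the people falling into the highest of n_groups risk groups."""
    size = len(scores) // n_groups
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:size])

people = [simulate_genotype() for _ in range(N_PEOPLE)]
top_count = top_group([count_score(g) for g in people])
top_weighted = top_group([weighted_score(g) for g in people])

agreement = len(top_count & top_weighted) / len(top_count)
print(f"{len(top_count)} people per top group, agreement {agreement:.0%}")
```

The agreement figure printed here depends entirely on the made-up inputs, so it should not be compared with the 50% reported in the study. The point is only that the membership of the riskiest group is a property of the scoring rule as much as of the genomes.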
The vast majority of the analyzed variants showed a significant difference in impact between ethnic groups, once again showing that when genetically assessing individuals against a control group, ethnicity is important. Problem number two.
The authors settled for one key method for the rest of the study, and looked at its capability by assessing a massive data set of nearly 17 000 individuals that had previously been looked at for the presence of such variants in their genomes. And here comes the big outcome of the study! Once again, the individuals were placed into 10 groups of lowest to highest risk, and for those who ended up in the highest risk group, all of that multigenic risk amounted to a very weak positive predictive value, or in other words, a weak prediction of the disease materializing. Positive predictive value takes disease prevalence in the general population into consideration (for a background on this see previous post on NIPT predictive value). And the rarer a disease is, the worse the predictive value that can be expected. But even with what are considered common diseases (for example, 1% of the population with rheumatoid arthritis, 2% with bipolar disorder, 6% with coronary heart disease and 8% with type 2 diabetes), the predictive values for these same diseases using multigenic variants were still only 2%, 6%, 10% and 12%! Yeah, pretty dismal and nowhere near good enough for clinical use. And for Crohn's disease, which has a handful of cases for every 100 000 people, it was, get ready, only 0.04% predictive value! Problem number three.
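A short sketch may help show why prevalence drags the predictive value down. The first part simply divides the predictive values quoted above by the matching prevalence, to show how little the highest risk group is enriched over the general population. The second part applies the standard formula for positive predictive value with a sensitivity and specificity that I have invented purely for illustration; they are not taken from the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Chance that a person flagged as high risk really develops the disease."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Figures quoted in the text, in percent:
# (population prevalence, predictive value in the highest genetic risk group).
quoted = {
    "rheumatoid arthritis": (1, 2),
    "bipolar disorder": (2, 6),
    "coronary heart disease": (6, 10),
    "type 2 diabetes": (8, 12),
}
enrichment = {name: ppv / prev for name, (prev, ppv) in quoted.items()}
for name, fold in enrichment.items():
    print(f"{name}: {fold:.1f} times the population prevalence")

# Hypothetical test characteristics, NOT from the study.
SENSITIVITY = 0.20
SPECIFICITY = 0.90
for prevalence in (0.08, 0.01, 0.0001):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.2%}: predictive value {ppv:.2%}")
```

With the test characteristics held fixed, the predictive value shrinks as the disease gets rarer, because the false positives among the many healthy people swamp the few true positives.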
And that was for what was considered to be the group of people with the accumulated highest genetic risk. Such classification was also highly sensitive to the number of variants chosen for the analysis. When the authors started playing with the data, knocking away some of the variants from the analysis, they showed that removing half of the variants resulted in the loss of nearly half of the people in the riskiest group. And for traits that only had a small number of variants associated with a particular condition, the removal of even just one variant from the data could have a big impact, removing such an individual from that riskiest group classification to any other of the 9 risk groups (remember, people were classified into ten subgroups of increasing risk).
That is important, because this is how your risk scores are often determined in those online genetic tests, and you can really see that the accuracy of such testing is hugely dependent on many factors. So how valuable will that data actually be, you wonder? Obviously we still have many new variants to discover that contribute to these complex traits! So problem number four.
Nevertheless, these brave authors forged ahead, and used such an approach to provide common complex disease risks to cardiologists for a number of their patients, and this will be part of a longitudinal study that will assess these people over a long period of time. So Father Time can contribute to the overall accuracy.
But even despite this discouraging information, these guys think that such an approach will possibly win the day in the future, boldly concluding that "risk predictions for common diseases attributable to common genetic variants, may be informative for clinicians and patients to promote specific health behaviors" even if "currently not useful in medical practice."
So there is a glimmer of hope.
Looking back in time to measure the future
In fact, the same group has published another study prior to the one above pointing to just that! Enter scientific study number two! Here the authors did something tricky: they took the available GWAS catalog of information, and they compared the risk prediction of common diseases based on past data at different years (like pretending to go back in time when less information was available). And this time they focused on breast cancer, prostate cancer, type 2 diabetes, and coronary heart disease, so quite a collection of complex diseases.
The data was looked at in 2007, 2009, 2011 and 2013. Not surprisingly, the largest amount of data accumulation was observed for the last period, showing the rapid rise of data acquisition through these technologies, and as more variants were discovered associated with these diseases, more individuals were being classified into the high-risk group (this time the authors just used three classifications: low, average and high-risk groups, where high-risk was defined as two times higher than the average population risk). The high-risk group grew from less than 3% in 2007 to 5% in 2011 to 11% in 2013. That makes sense, as the more variants associated with a disease you have, the more likely you are to find people with some sort of combination of these variants that are indicative of a higher risk of disease development.
But more importantly, between 16% and 24% of individuals had to be reclassified for their risk (depending on which disease you looked at) between 2011 and 2013 alone, and the majority of that included shifts from high-risk to lower risk groups. Between 2007 and 2013, it was even more dramatic, ranging between 18% and 50% of people reclassified for their risk. That is because when fewer variants were known, more weight was given to their importance, which has now been diluted by the discovery of new contributing factors. So now that each variant is worth less in its power to be associated with a disease, the risk estimation changes from past to present.
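The bookkeeping behind such reclassification figures can be sketched in a few lines. The ten people and their risk groups below are entirely made up; only the three group names (low, average, high) come from the study description above.

```python
from collections import Counter

GROUP_ORDER = {"low": 0, "average": 1, "high": 2}

def tally_moves(before, after):
    """Count who moved to a higher group, a lower group, or stayed put."""
    moves = Counter()
    for person, old_group in before.items():
        new_group = after[person]
        if GROUP_ORDER[new_group] > GROUP_ORDER[old_group]:
            moves["up"] += 1
        elif GROUP_ORDER[new_group] < GROUP_ORDER[old_group]:
            moves["down"] += 1
        else:
            moves["same"] += 1
    return moves

# Hypothetical classifications of the same ten people using an older and a
# newer version of the variant catalog.
older = {"p1": "high", "p2": "high", "p3": "average", "p4": "average",
         "p5": "average", "p6": "low", "p7": "low", "p8": "average",
         "p9": "high", "p10": "low"}
newer = {"p1": "average", "p2": "high", "p3": "average", "p4": "high",
         "p5": "average", "p6": "low", "p7": "low", "p8": "average",
         "p9": "average", "p10": "low"}

moves = tally_moves(older, newer)
reclassified = (moves["up"] + moves["down"]) / len(older)
print(f"up {moves['up']}, down {moves['down']}, same {moves['same']}, "
      f"reclassified {reclassified:.0%}")
```

Anyone who kept their raw genome data could, in principle, rerun this kind of comparison each time the catalog of known variants is updated.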
This is important because it clearly shows you that whatever information you are going to get now for these complex traits based on our current knowledge, it is very likely going to change in time, so it's not as substantial as it may seem. Not at least until we know all of those genomic contributing factors!
So going back again to those commercial DNA tests, the longer ago you did them, the less likely they were to have provided you with accurate information for any of those complex diseases. Sorry to break it to you, but at the same time, it is good to see a trend for the better, right? All that information is only suggestive at best.
Or is it?
While the predictive power might not be here yet, and you run the risk of having your classification altered in the future, it might still be of value for you to know that if you are at a higher genetic risk for common diseases, you might as well do your best to alter your lifestyle to more healthy ways. Your doctor might not be able to do much with such data clinically, at best being vigilant for future changes indicative of such disease development, and using appropriate screening approaches. But your doctor will never chastise you for improving your lifestyle to be healthier! In fact, it is the best gift you can provide to your doctor, because often they have a hard time persuading their patients to adopt healthier lifestyles. We are stubborn beasts I guess.
I'll give you one amazing example.
Better lifestyle, better outcomes, no matter the genes
Not that long ago, a study came out where such multigenic risk scores were applied for coronary artery disease (using 50 variants in total), in multiple very large study groups, together totalling more than 55 000 people. They were followed for around 20 years and scored for any negative coronary events, including death if caused by such an event. So this is the type of longitudinal study we were talking about earlier. These authors divided the people into five risk groups.
And what did they find? Those in the highest risk group had a 91% higher risk of experiencing a coronary event than the people in the lowest risk group. So this is clearly pointing to the fact that these common disease variants can really have an impact. But the real gem of that study was the fact that these participants were also scored for a favourable versus unfavourable lifestyle (and you can probably guess that a favourable lifestyle consisted of no smoking, a healthy diet, regular exercise, and no present obesity, and all this was clearly measured).
No matter which genetic risk group you look at, a favourable lifestyle decreased your chance of a negative health outcome throughout your life by up to 46%, including if you happen to be burdened with those bad genetic factors (highest risk group). This means that for any given period of 10 years, even if you are in the highest multigenic risk group, your incidence of negative coronary events drops from 11% to 5% if you lead a healthy lifestyle (example from one of the studied populations). For the same population, even if you are in the lowest risk group, an unfavourable lifestyle increased your incidence of negative outcomes to 6% from the 3% seen in those with a healthy lifestyle. This is a huge discovery considering that coronary artery diseases are a big contributing factor to deaths worldwide.
So your genetics for complex traits are just one component, and environmental factors, including your lifestyle choices, can have an additional influence on the final outcome of these complex traits. Genetic factors don't determine everything, and your lifestyle is going to play a big role no matter what, so there is no excuse not to try the best you can, especially if your genetics are not in your favour!
And this is why you should always listen to your doctor when she or he tells you to lead a healthy lifestyle! It can seriously counteract problems that will not be visible to you until one day you need to be hospitalized. So best try your best. No smoking, healthy eating, exercise.
As mentioned, I am a proponent of whole genome sequencing so that the entirety of information can be captured, whether monogenic disease predisposition, or that of complex multigenic diseases, and everything else. And at Merogenomics, we can assist you in safely accessing a quality genome sequencing service. You can take immediate advantage of the current interpretation, which might already benefit your medical care, and have the advantage of reanalyzing the sequence that is in your hands at any future time of your choosing. You can observe over time how the multigenic information is affected as the depth of understanding accumulates. And if you get a hint of being in the higher risk group, either now or in the future, your doctor might consider appropriate screening methods, or at least you have healthy lifestyle options at your disposal.
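The ten-year figures above can be put side by side with a little arithmetic. This sketch uses only the four incidence numbers quoted for that one study population, and the helper function name is my own. Note that these are simple differences in percentage points, which is not the same calculation as the adjusted 46% figure.

```python
# Ten-year incidence of coronary events, in percent, for the one study
# population quoted in the text: (unfavourable lifestyle, favourable lifestyle).
incidence = {
    "highest genetic risk": (11, 5),
    "lowest genetic risk": (6, 3),
}

def absolute_reduction(unfavourable, favourable):
    """Percentage points of incidence avoided with the favourable lifestyle."""
    return unfavourable - favourable

for group, (unfavourable, favourable) in incidence.items():
    saved = absolute_reduction(unfavourable, favourable)
    print(f"{group}: {unfavourable}% -> {favourable}% "
          f"({saved} fewer events per 100 people over 10 years)")
```

Two things stand out from these numbers: the healthy lifestyle avoids more events in the highest risk group than in the lowest, and people with the worst genetics and a favourable lifestyle (5%) still do better than people with the best genetics and an unfavourable one (6%).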
Live well, stay healthy.
This article has been produced by Merogenomics Inc. and edited by Kerri Bryant. Reproduction and reuse of any portion of this content requires Merogenomics Inc. permission and source acknowledgment. It is your responsibility to obtain additional permissions from the third party owners that might be cited by Merogenomics Inc. Merogenomics Inc. disclaims any responsibility for any use you make of content owned by third parties without their permission.