The consequences of DNA quality for your DNA test results
Dr. M. Raszek
DNA test quality: knowledge is power
So you want to sequence your genome, all of your DNA, and look deep into the hidden secrets of your biological code? Then you will surely want quality information! It is easy to get excited about the results, but the majority of people who purchase any type of commercial DNA sequencing test, and even many of those selling it, have a poor understanding of the complexity of the process and the meaning of the results. Given the speed at which new DNA sequencing tests are coming onto the market (at least 10 medical DNA tests are released per day, and who knows how many non-medical tests), many of them, if not the majority, will provide DNA results that lack any scientific validation, and hence have no actual utility apart from a bit of fun. However, while you are having that fun, remember that you are giving away access to your most private and precious biological information, your DNA. DNA information should instead be closely guarded by families and reserved for serious medical needs.
A personal genome sequence contains extremely powerful information. After all, DNA is the program code of life, unique to each individual. Every individual who has offspring passes on approximately 50% of that information to their child. Therefore, a DNA sequence can inform the health management of related family members as well as that of the person who is actually tested. This type of benefit can continue for multiple generations, and therefore well beyond someone’s lifespan. Someone’s genome sequence can confer medical benefits to grandchildren and beyond.
Thus, you had better select the best quality test to capture your data.
Measuring DNA is a measure of success
While there is a myriad of pitfalls where quality can be compromised in a process as complex as deciphering a human genome, measuring the amount of DNA is probably not the first one you would think of in this day and age. And yet, after years of perfecting genome sequencing, and with thousands of people having been assessed, there is still no gold standard for how to most accurately measure DNA concentration prior to sequencing. Say what?! And as it turns out, this can have a profound effect. One publication demonstrated that the most commonly used DNA quantification method might not actually be the best, and demonstrated better alternatives, including a new method of its own. That new method was the captivating part of the paper, so let’s break it down here.
So why is high accuracy of DNA quantification so vital? In most cases, sequencing is performed using technology from Illumina, the company that dominates the market. When a genome is sequenced on the Illumina platform, it is sequenced as a series of short segments of DNA into which the human genome has been fragmented. These short fragments (collectively termed a “library”) are joined to short sequences called adapters, and then attached to what is called a flow cell, where each fragment is multiplied many times over (up to about 1000 copies) to produce a cluster of same-sequence DNA (suitably named a “clonal cluster”). The formation of these clusters increases the signal intensity that is measured when the DNA is sequenced base by base, and therefore enhances the accuracy with which the DNA is decoded. An old but great explanation of the DNA sequencing process from Illumina itself is available at this link.
Naturally, there is a physical space limit on the flow cell, so when fragments of DNA are attached to form clusters, it is desirable for the clusters to be spaced apart so that they do not interfere with each other’s signal. To achieve that, there is a recommended optimal concentration of DNA to apply to a flow cell, which strikes the right balance between using the available space and obtaining the greatest possible output. Underclustering, when not enough of the flow cell space is used, does not affect the sequencing quality, but it wastes valuable space and is therefore an unnecessary cost.
However, overclustering, when so many clusters form that they sit too close together to be read accurately, can degrade the sequencing signal and therefore greatly reduce the ability to obtain quality data. And again, it is an unnecessary cost. The most common cause of overclustering is inaccurate DNA quantification.
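To see why a quantification error turns directly into a clustering problem, here is a minimal Python sketch. It assumes the commonly used approximation of about 660 g/mol per base pair of double-stranded DNA to convert a mass concentration into molarity. The function names and all numbers are hypothetical illustrations, not values from the study or from any instrument manual.

```python
# Approximate average molar mass of one base pair of double-stranded DNA (g/mol).
# This is a widely used rule of thumb, not an exact constant.
AVG_BP_MASS = 660.0


def ng_per_ul_to_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a mass concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def actually_loaded(target: float, measured: float, true: float) -> float:
    """Concentration that really ends up on the flow cell.

    The library is diluted by the factor target / measured, so if the
    measurement is wrong, the true concentration after dilution is
    true * target / measured (all three values in the same unit).
    """
    return true * target / measured


if __name__ == "__main__":
    # Hypothetical library: 10 ng/uL with a mean fragment length of 400 bp.
    print(round(ng_per_ul_to_nm(10.0, 400.0), 2), "nM")
    # Hypothetical error: the method reads 8 units when the library is
    # really 10 units, and the dilution aims at a target of 10 units.
    print(actually_loaded(target=10.0, measured=8.0, true=10.0))
```

In this made-up case, a 20% underestimate of the concentration puts 25% more DNA on the flow cell than intended, which is the direction that leads to overclustering.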
How to measure your DNA
Consequently, you would think that we would have a gold standard for quantification figured out by now. Now bear with me for a moment. The most commonly used methods for quantifying DNA are UV absorption, intercalating dyes, quantitative polymerase chain reaction (qPCR, which can be coupled with signal-emitting probes to increase specificity), and droplet digital PCR (an emulsion-based PCR). Unless you are involved with these processes, they may sound like a litany of cosmic weapons, so let’s briefly touch upon them, especially their strengths and weaknesses.
UV light, old, tried and true, is absorbed by DNA, so the more DNA there is, the more UV light is absorbed, and you can easily measure that. The problem here is that this method does not differentiate between double-stranded DNA, which is what we want to measure (as it is your wholesome intact DNA), and single-stranded DNA or even individual nucleotides, the building blocks of DNA, if they are floating about. So that is out, and it is not even considered by the industry or in the above-mentioned study.
Dyes are small chemicals that intercalate (wiggle their way in) between the bases (nucleotides) of double-stranded DNA, so they are very accurate. But they also measure DNA that is not of interest, such as primers, small specially synthesized fragments of DNA that are used to amplify the DNA we want to learn about. Amplifying, which basically means increasing the amount of DNA, is easy. You just need the right proteins to do the work of duplicating the DNA, the necessary building blocks of DNA (the nucleotides), and a short fragment of DNA with a signal encoded in it that tells these proteins, hey, this is where you are going to start duplicating the DNA, and thank you very much! These short fragments of DNA that initiate the chain reaction of duplicating the DNA of interest are aptly nicknamed “primers”.
Then you have qPCR, which is itself an amplification process, where with the use of primers the same DNA molecules can be copied many times over. Currently qPCR is the accepted standard because it measures only the double-stranded DNA of interest, and it requires only small amounts of DNA because during the process the sample is copied over and over (amplified). The DNA is measured either with the dyes already mentioned, which fluoresce, or with probes specific to the DNA sequence fragment being measured. Such probes are labelled with a fluorescent marker, which can be measured once the probe is displaced from the DNA in the process of copying it. These fluorescent probes emit a specific color of light when you stimulate them with another wavelength of light. So everyone is happy: the fluorescent probe gets to shine brightly, and we get to measure it.
Droplet digital PCR (ddPCR) is still a fairly new technology, in which the DNA is not measured in one single volume (containing all of the different DNA molecules we want to amplify and measure), but is instead dispersed molecule by molecule into thousands (and with some technologies even millions) of individual droplets before being amplified by the PCR process. This allows for much smaller reaction sizes, and allows DNA molecules to be quantified directly, without the need for any standards to compare against (as is the case with the other methods).
In the paper analyzing the different methods, the authors looked into direct dye fluorescence (with a system called QuBit), qPCR amplification also measured with dye, ddPCR, and their new method, in which they incorporated probes into ddPCR. These were special probes attached to the primers used to amplify the DNA, so they could be used and measured on DNA molecules of any random sequence. The primers in turn target the attached adapters (remember, these are the small fragments of DNA we attach to our DNA of interest, and they are used to attach the DNA to the flow cell), so in one go the same primers can be used to amplify many different genome fragments. The authors called this method ddPCR-Tail. The probes themselves are also an ingenious design, as they use chemically modified DNA bases that can still participate in regular DNA interactions, but with increased affinity, making them more specific. In nature they would probably be almost too specific, but for this scientific application they are ingenious because they allow the probes to be very short (only a few bases long).
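For the curious, the arithmetic behind a UV reading is simple. The sketch below uses the standard rule-of-thumb conversion factors: an absorbance of 1.0 at 260 nm over a 1 cm light path corresponds to roughly 50 ng/µL of double-stranded DNA, or roughly 33 ng/µL of single-stranded DNA. The function name and the example reading are hypothetical.

```python
# Rule-of-thumb conversion factors (ng/uL per absorbance unit at 260 nm,
# 1 cm path length). These are approximations, not exact constants.
NG_PER_UL_PER_A260 = {"dsDNA": 50.0, "ssDNA": 33.0}


def a260_to_ng_per_ul(a260: float, species: str = "dsDNA",
                      dilution_factor: float = 1.0,
                      path_cm: float = 1.0) -> float:
    """Estimate a nucleic acid concentration from an A260 reading."""
    return a260 / path_cm * NG_PER_UL_PER_A260[species] * dilution_factor


if __name__ == "__main__":
    reading = 0.5  # hypothetical absorbance value
    for species in NG_PER_UL_PER_A260:
        print(species, a260_to_ng_per_ul(reading, species), "ng/uL")
```

The same reading gives a different answer depending on which factor you pick, and the instrument cannot tell you which one applies. That is the weakness in a nutshell: the absorbance reports everything that absorbs at 260 nm, not just the intact double-stranded DNA.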
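Because qPCR relies on comparison with standards of known concentration, the calculation usually goes through a standard curve. Here is a minimal Python sketch of that idea, assuming the usual linear relationship between the cycle at which the signal appears (often called Cq) and the logarithm of the starting concentration. The function names and the dilution series are made up for illustration and are not taken from the study.

```python
import math


def fit_standard_curve(concentrations, cq_values):
    """Least-squares line: Cq = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """A value of 1.0 means the DNA doubles perfectly in every cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def quantify(cq, slope, intercept):
    """Read an unknown sample's concentration off the fitted line."""
    return 10 ** ((cq - intercept) / slope)


if __name__ == "__main__":
    # Made-up ten-fold dilution series of a standard (arbitrary units).
    standards = [100.0, 10.0, 1.0, 0.1]
    cqs = [13.36, 16.68, 20.00, 23.32]
    slope, intercept = fit_standard_curve(standards, cqs)
    print("slope:", round(slope, 2))
    print("efficiency:", round(amplification_efficiency(slope), 2))
    print("unknown at Cq 18.34:", round(quantify(18.34, slope, intercept), 2))
```

Any error in the standards, or any difference in how well the standards and the sample amplify, carries straight through to the final number, which is one reason a standard-free method is attractive.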
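The reason no standards are needed is that the answer comes from counting. After amplification, each droplet either lights up or stays dark, and Poisson statistics correct for droplets that happened to receive more than one molecule. Here is a minimal Python sketch of that calculation. The function names, the droplet counts, and the droplet volume are hypothetical illustrations; real droplet volumes depend on the instrument.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Average number of target molecules per droplet (Poisson correction).

    If molecules are distributed at random, the fraction of empty droplets
    is exp(-lambda), so lambda = -ln(1 - positive / total).
    """
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def copies_per_ul(positive: int, total: int, droplet_volume_nl: float) -> float:
    """Target concentration in the reaction, in copies per microliter."""
    droplet_volume_ul = droplet_volume_nl * 1e-3
    return copies_per_droplet(positive, total) / droplet_volume_ul


if __name__ == "__main__":
    # Hypothetical run: 5,000 of 10,000 droplets are positive, and each
    # droplet is assumed to hold 1 nL.
    print(round(copies_per_droplet(5000, 10000), 3))
    print(round(copies_per_ul(5000, 10000, droplet_volume_nl=1.0), 1))
```

Note that the calculation breaks down when every droplet is positive, which is why samples are diluted so that a good share of droplets stay empty.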
Nothing is trivial about DNA sequencing
The results of the study might be somewhat surprising: while all quantification methods estimated DNA concentration in a similar range, the measurements still differed significantly from one method to another. One measurement was totally off (ddPCR had some issues in one instance). This was discouraging because it showcases how much variability can be introduced just by the choice of DNA quantification method, which could in turn impact how well a human genome can be analyzed.
To investigate further, the genomes were sequenced and the quality of the results was assessed. Quality was measured in terms of the quality score for correctly identifying the DNA code, as well as how evenly different libraries that had been mixed together were each read. It turned out that QuBit and ddPCR-Tail were the best methods for quantifying DNA. It was somewhat surprising that qPCR was outranked by QuBit, and that is sobering food for thought considering that qPCR is the current standard. It was also somewhat surprising that ddPCR did not show much of an improvement over qPCR. Both of these methods led to overclustering, resulting in reduced data quality, as well as non-uniform reading of the different samples that were being sequenced at the same time. Such outcomes could have a negative impact on how effectively the sequence can be determined.
The winner appeared to be the ddPCR-Tail method because, apart from being comparable to QuBit, it required far less sample for quantification. This means such a method could also be used for difficult-to-prepare samples where the starting DNA material is scarce. QuBit, on the other hand, depends on a much higher quantity. The authors also added that the reason this method performed so well was the high quality of their own starting material, which is unlikely to be true of most samples.
What are you to do about this if you plan to sequence your genome or take any other DNA test? Are you supposed to check with the company how they measure their DNA? Well, probably not. As mentioned, it will most likely be the standard method described above. But if you were to inquire about anything, you should at least ask whether the laboratory that will be handling your DNA is certified for such tests. Regulatory certification means the laboratory has to comply with certain standards, and they are not trivial. It means that the laboratory has to regularly demonstrate and log quality control. That includes the DNA quantification process, along with everything else related to sequencing DNA.
You can try to help out a little by providing a blood DNA sample instead of saliva. While the quality of DNA in a quality lab will be the same from either of these samples, saliva DNA will be contaminated by bacteria, so with blood you will at least be providing purer DNA. It does not necessarily improve the quality of the results, but at least it makes the process more cost efficient for the provider, with no flow cell space wasted on bacterial junk (although one day I expect that information to be of clinical value to the customer as well). In poor hands, however, the quality of DNA from saliva can potentially impact the results.
The bottom line is that the quality of someone’s whole genome data is highly dependent on the quality of dozens and dozens of steps taken along the way. Measurement of your DNA is just one simple demonstration. There can be many pitfalls in how the sample is prepared for sequencing, including its accurate quantification before it is loaded onto the flow cell. And there are many pitfalls in how your sequenced data is then analyzed and interpreted. Crazily enough, this area of genome sequencing is not even regulated at the moment. Total wild west! We only know of one provider that is actually certified for the analysis of your DNA data to the same degree as a laboratory that purifies and sequences your DNA. Those who are interested in sequencing their genomes for the wealth of information they hold, especially medical information, do have some hazards to watch out for, and selecting an appropriate DNA test provider is paramount. The best of the best will look through every detail to ensure the highest quality data is provided to your doctor and yourself. They will frequently measure the performance of their equipment and their technicians, and they will stay on top of the best technology to provide the best outcome.
So if you plan to sequence your genome, choose carefully. And Merogenomics can help. That’s what we are here for!
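The idea of "non-uniform reading" of samples sequenced together can be put into numbers. The minimal Python sketch below takes the number of reads obtained for each library in a pool and reports each library's share, plus the coefficient of variation as a single measure of unevenness. The function names and read counts are hypothetical, and this is not necessarily the metric the authors used.

```python
from statistics import mean, pstdev


def read_shares(read_counts: dict) -> dict:
    """Fraction of all reads that went to each library in the pool."""
    total = sum(read_counts.values())
    return {name: count / total for name, count in read_counts.items()}


def coefficient_of_variation(values) -> float:
    """Standard deviation divided by the mean; 0.0 means perfectly even."""
    values = list(values)
    return pstdev(values) / mean(values)


if __name__ == "__main__":
    # Hypothetical pool of four libraries that were meant to be mixed equally.
    counts = {"lib_A": 26_000_000, "lib_B": 24_000_000,
              "lib_C": 35_000_000, "lib_D": 15_000_000}
    for name, share in read_shares(counts).items():
        print(name, round(share, 3))
    print("CV:", round(coefficient_of_variation(counts.values()), 3))
```

If every library had been quantified accurately before mixing, each share would be close to one quarter and the coefficient of variation close to zero.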
This article has been produced by Merogenomics Inc. and edited by Kerri Bryant. Reproduction and reuse of any portion of this content requires Merogenomics Inc. permission and source acknowledgment. It is your responsibility to obtain additional permissions from the third party owners that might be cited by Merogenomics Inc. Merogenomics Inc. disclaims any responsibility for any use you make of content owned by third parties without their permission.