Written by: Nermin Đuzić, M.Sc. in Genetics, Content Specialist
As we briefly mentioned in the previous article, we will soon rely on polygenic risk score assessment to improve our results and provide more comprehensive genetic reports with recommended actions through the practitioners’ portal. This article therefore explains the polygenic risk scoring methodology.
Before we dig deeper into the technical and statistical terms, let’s revisit some basic principles of genetics and extend them a little.
Everything starts with genes!
Our article about genetic variations explained that some conditions are caused by a mutation in a single gene; these are called monogenic. For example, a single mutation in the gene encoding the beta-globin chain of hemoglobin causes sickle cell anemia, and mutations in the CFTR gene cause cystic fibrosis.
However, have you ever asked yourself how we become more or less prone to conditions like diabetes, heart disease, or cancer? Does our genetics also determine whether we will develop some of these or not?
The answer is yes! These conditions also have a genetic background. However, the picture is much more complex: hundreds or even thousands of genes interact in the genetic architecture of these conditions, which is why they are called polygenic.
Most genetically influenced conditions are complex, or polygenic, rather than monogenic. Since hundreds or thousands of genes influence the onset of complex traits and conditions, we need more elaborate statistical machinery to estimate the risk they confer: polygenic risk scoring.
Polygenic risk scores, also known as genetic risk scores or polygenic scores, are single-number estimates of a person's relative genetic predisposition to a given trait or condition that is known to be associated with many genetic variants.
In other words, we first look at the individual risks a person carries at many locations in the genome for a certain trait or condition and then aggregate them into a single number called a polygenic risk score. These scores usually include hundreds to thousands of SNPs identified in large genome-wide association studies (GWAS). GWAS results are used to estimate and validate the allele effect weights for all SNPs associated with the given trait, and the polygenic risk score is calculated from those weights.
We look at the reference database
The PRS model considers the variants found in your DNA and their impact on the likelihood of developing a complex trait or condition. To calculate a polygenic risk score for a particular condition or disease, we must sum up the individual risks carried by SNPs across your genome. But wait, how do we know which variants we should consider for a given condition?
The answer lies in genome-wide association studies (GWAS), which serve as a reference. A GWAS typically includes thousands to millions of participants, often drawn from a specific ancestry group or population, whose variants are tested for association with a given trait.
In simple terms, a GWAS compares people who have a certain condition with people who do not, and reports which genetic variants are associated with its onset. In other words, these studies identify the alleles that discriminate between affected and unaffected individuals.
In this way, GWAS tell us which genetic variants (SNPs) are more common in people with a particular trait or disorder, and therefore which variants we should look for in the DNA samples submitted by our users when assessing genetic risk for a condition of interest.
Each variant contributes to the overall risk of developing a certain condition. This contribution is expressed as the allele effect size of the associated genetic variant. The PRS model takes the effect size of each allele and assigns it a weight that represents its individual risk. The greater the effect size, the greater the weight given to the variant. The final polygenic risk score is the sum of the individual risks carried by each variant.
Since most GWAS have been conducted in populations of European ancestry, polygenic risk scores are mainly available for people of European descent. This reduces the accuracy of the PRS methodology when it is applied to other ancestry groups, since allele frequencies and variant effects can differ between populations. However, the good news is that large-scale scientific efforts are already underway to address this.
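To make the comparison between affected and unaffected people more concrete, here is a minimal Python sketch of how an effect size could be derived for one variant from allele counts. The counts are invented and the function name is hypothetical; real GWAS typically rely on regression models that adjust for covariates rather than a raw two-by-two table.

```python
import math

def allele_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds of carrying the risk allele in cases divided by the odds in controls."""
    return (case_risk / case_other) / (control_risk / control_other)

# Invented allele counts for a single hypothetical SNP
odds_ratio = allele_odds_ratio(
    case_risk=600, case_other=400,        # alleles observed in affected people
    control_risk=500, control_other=500,  # alleles observed in unaffected people
)

# The natural log of the odds ratio is a common choice of per-allele effect size
effect_size = math.log(odds_ratio)

print(f"odds ratio: {odds_ratio:.2f}, effect size: {effect_size:.3f}")
```

An odds ratio above 1 (a positive effect size) means the allele is more common among affected people; a value below 1 means it is less common.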
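The weighted sum described above is commonly written in the following additive form. The symbols are our own notation, introduced here for illustration: $M$ is the number of variants in the score, $\beta_i$ is the effect-size weight of variant $i$, and $G_i$ is the number of effect alleles (0, 1, or 2) a person carries at that variant.

```latex
\mathrm{PRS} = \sum_{i=1}^{M} \beta_i \, G_i
```

This is a sketch of the simplest additive model; the exact model used for a particular report may differ.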
Let’s summarize everything!
In summary, PRS is calculated following several steps:
- We analyze millions of SNPs (genetic variants) across your genome.
- Each variant has an effect size that reflects its impact on developing a trait or condition. Based on the effect sizes reported by GWAS, we generate an effect size weight for each variant.
- We use appropriate genetic models to add up all effect size weights and get a final polygenic risk score for a given trait or condition.
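The steps above can be sketched in a few lines of Python. This is only an illustration under a simple additive model: the SNP identifiers, weights, and genotype are invented, the function name is hypothetical, and treating a missing genotype as zero effect alleles is an assumption made to keep the example short.

```python
# Hypothetical effect-size weights (for example, log odds ratios) from a GWAS
weights = {"rs0001": 0.12, "rs0002": -0.05, "rs0003": 0.30}

# Hypothetical genotype: number of effect alleles carried at each SNP (0, 1, or 2)
genotype = {"rs0001": 2, "rs0002": 1, "rs0003": 0}

def polygenic_risk_score(genotype, weights):
    """Sum of weight x effect-allele count over all SNPs in the score.

    SNPs missing from the genotype are counted as 0 effect alleles
    (an assumption made for this sketch).
    """
    return sum(weight * genotype.get(snp, 0) for snp, weight in weights.items())

score = polygenic_risk_score(genotype, weights)
print(f"polygenic risk score: {score:.2f}")
```

With these made-up numbers the score is roughly 0.19. On its own that number means little; it becomes informative only when compared with scores from a reference population.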
We can compare your polygenic risk score to the average risk score in the population based on the predefined list of SNPs. These results represent a lifetime risk. However, they may differ according to your age group and gender.
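One common way to make such a comparison is to standardize the score against a reference group. The sketch below assumes that scores in the reference population are roughly normally distributed; the reference scores and the function name are hypothetical and serve only as an illustration.

```python
from statistics import NormalDist, mean, stdev

def population_percentile(score, reference_scores):
    """Return the z-score and approximate percentile of a score.

    Assumes the reference scores follow a roughly normal distribution.
    """
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)
    z = (score - mu) / sigma
    percentile = NormalDist().cdf(z) * 100
    return z, percentile

# Invented scores for a small reference group
reference_scores = [0.05, 0.10, 0.15, 0.20, 0.25]

z_score, percentile = population_percentile(0.19, reference_scores)
print(f"z-score: {z_score:.2f}, percentile: {percentile:.0f}")
```

A score above the reference average gives a positive z-score and a percentile above 50; a score below the average gives the opposite.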
Although the results provided by the PRS model represent a lifetime risk of developing a particular condition, always keep in mind that our health and well-being are influenced by both our genetics and our environment. Here, the environment includes all non-genetic factors, such as our habits and lifestyle.
Certain traits are more genetically determined than others. This is captured by the term heritability: the proportion of variability within a population that can be attributed to inherited genetic factors rather than environmental ones.
Complex conditions and traits are influenced by both genetic and environmental factors, such as diet and lifestyle. Therefore, you should keep the factors you can control, like diet and lifestyle, in check and look at your PRS as one part of the holistic picture.
You should also be aware that these results may indicate that you have an increased or decreased risk of developing a condition based on your genetics. However, this does not mean that you will or will not develop it during your lifetime. PRS is one small part of the grand scheme of the 4P medicine (predictive, preventive, personalized, and participatory) approach.
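In its simplest textbook form, heritability can be written as a ratio of variances. The notation is ours, introduced for illustration: $V_G$ is the variance in the trait due to genetic differences, $V_E$ is the variance due to environmental differences, and $V_P$ is the total observed (phenotypic) variance. The sketch assumes that genetic and environmental effects simply add up, with no interaction between them.

```latex
H^2 = \frac{V_G}{V_P} = \frac{V_G}{V_G + V_E}
```

A value close to 1 means that most of the variation in a population is attributable to genetic differences, while a value close to 0 means that it is mostly environmental.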
- Choi, S. W., Mak, T. S. H., & O’Reilly, P. F. (2020). Tutorial: a guide to performing polygenic risk score analyses. Nature Protocols, 15(9), 2759-2772.
- Tam, V., Patel, N., Turcotte, M., Bossé, Y., Paré, G., & Meyre, D. (2019). Benefits and limitations of genome-wide association studies. Nature Reviews Genetics, 20(8), 467-484.