Author manuscript; available in PMC: 2011 Feb 15.
Published in final edited form as: Nature. 2010 Aug 5;466(7307):707–713. doi: 10.1038/nature09270

Biological, Clinical, and Population Relevance of 95 Loci for Blood Lipids

Tanya M Teslovich 1,118, Kiran Musunuru 2,3,4,5,6,118, Albert V Smith 7,8, Andrew C Edmondson 9,10, Ioannis M Stylianou 10, Masahiro Koseki 11, James P Pirruccello 2,5,6, Samuli Ripatti 12,13, Daniel I Chasman 4,14, Cristen J Willer 1, Christopher T Johansen 15, Sigrid W Fouchier 16, Aaron Isaacs 17, Gina M Peloso 18,19, Maja Barbalic 20, Sally L Ricketts 21, Joshua C Bis 22, Yurii S Aulchenko 17, Gudmar Thorleifsson 23, Mary F Feitosa 24, John Chambers 25, Marju Orho-Melander 26, Olle Melander 26, Toby Johnson 27, Xiaohui Li 28, Xiuqing Guo 28, Mingyao Li 9,10, Yoon Shin Cho 29, Min Jin Go 29, Young Jin Kim 29, Jong-Young Lee 29, Taesung Park 30,31, Kyunga Kim 32, Xueling Sim 33, Rick Twee-Hee Ong 34, Damien C Croteau-Chonka 35, Leslie A Lange 35, Joshua D Smith 36, Kijoung Song 37, Jing Hua Zhao 38, Xin Yuan 37, Jian'an Luan 38, Claudia Lamina 39; CARDIoGRAM Consortium, ENGAGE Consortium, Candidate Gene Association Resource (CARe) Lipids Working Group, Andreas Ziegler 40, Weihua Zhang 25, Robert YL Zee 4,14, Alan F Wright 41, Jacqueline CM Witteman 17,42, James F Wilson 43, Gonneke Willemsen 44, H-Erich Wichmann 45, John B Whitfield 46, Dawn M Waterworth 37, Nicholas J Wareham 38, Gérard Waeber 47, Peter Vollenweider 47, Benjamin F Voight 2,5, Veronique Vitart 41, Andre G Uitterlinden 17,42,48, Manuela Uda 49, Jaakko Tuomilehto 50, John R Thompson 51, Toshiko Tanaka 52,53, Ida Surakka 12,13, Heather M Stringham 1, Tim D Spector 54, Nicole Soranzo 54,55, Johannes H Smit 56, Juha Sinisalo 57, Kaisa Silander 12,13, Eric JG Sijbrands 17,48, Angelo Scuteri 58, James Scott 59, David Schlessinger 60, Serena Sanna 49, Veikko Salomaa 13, Juha Saharinen 61, Chiara Sabatti 62, Aimo Ruokonen 63, Igor Rudan 43, Lynda M Rose 14, Robert Roberts 64, Mark Rieder 36, Bruce M Psaty 65, Peter P Pramstaller 66, Irene Pichler 66, Markus Perola 12,13, Brenda WJH Penninx 56, Nancy L Pedersen 67, Cristian Pattaro 66, Alex N Parker 68, Guillaume Pare 69, Ben A Oostra 70, Christopher J 
O'Donnell 4,19, Markku S Nieminen 57, Deborah A Nickerson 36, Grant W Montgomery 46, Thomas Meitinger 71,72, Ruth McPherson 64, Mark I McCarthy 73,74,75, Wendy McArdle 76, David Masson 11, Nicholas G Martin 46, Fabio Marroni 77, Massimo Mangino 54, Patrik KE Magnusson 67, Gavin Lucas 78, Robert Luben 21, Ruth J F Loos 38, Maisa Lokki 38, Guillaume Lettre 79, Claudia Langenberg 38, Lenore J Launer 80, Edward G Lakatta 60, Reijo Laaksonen 81, Kirsten O Kyvik 82, Florian Kronenberg 39, Inke R König 40, Kay-Tee Khaw 21, Jaakko Kaprio 12,13,83, Lee M Kaplan 84, Åsa Johansson 85, Marjo-Riitta Jarvelin 86,87, A Cecile JW Janssens 17, Erik Ingelsson 67, Wilmar Igl 85, G Kees Hovingh 16, Jouke-Jan Hottenga 44, Albert Hofman 17,42, Andrew A Hicks 66, Christian Hengstenberg 88, Iris M Heid 45,89, Caroline Hayward 41, Aki S Havulinna 50,90, Nicholas D Hastie 41, Tamara B Harris 80, Talin Haritunians 28, Alistair S Hall 91, Ulf Gyllensten 85, Candace Guiducci 5, Leif C Groop 26,92, Elena Gonzalez 5, Christian Gieger 45, Nelson B Freimer 93, Luigi Ferrucci 94, Jeanette Erdmann 95, Paul Elliott 86,96, Kenechi G Ejebe 5, Angela Döring 45, Anna F Dominiczak 97, Serkalem Demissie 18,19, Panagiotis Deloukas 55, Eco JC de Geus 44, Ulf de Faire 98, Gabriel Crawford 5, Francis S Collins 99, Yii-der I Chen 28, Mark J Caulfield 27, Harry Campbell 43, Noel P Burtt 5, Lori L Bonnycastle 99, Dorret I Boomsma 44, S Matthijs Boekholdt 100, Richard N Bergman 101, Inês Barroso 55, Stefania Bandinelli 102, Christie M Ballantyne 103, Themistocles L Assimes 104, Thomas Quertermous 104, David Altshuler 2,4,5, Mark Seielstad 34, Tien Y Wong 105, E-Shyong Tai 106, Alan B Feranil 107, Christopher W Kuzawa 108, Linda S Adair 109, Herman A Taylor Jr 110, Ingrid B Borecki 24, Stacey B Gabriel 5, James G Wilson 110, Kari Stefansson 23, Unnur Thorsteinsdottir 23, Vilmundur Gudnason 7,111, Ronald M Krauss 112, Karen L Mohlke 35, Jose M Ordovas 113, Patricia B Munroe 114, Jaspal S Kooner 59, Alan R Tall 11, 
Robert A Hegele 15, John JP Kastelein 16, Eric E Schadt 115, Jerome I Rotter 28, Eric Boerwinkle 20, David P Strachan 116, Vincent Mooser 37, Hilma Holm 23, Muredach P Reilly 9,10, Nilesh J Samani 61,117, Heribert Schunkert 95, L Adrienne Cupples 18,19,119, Manjinder S Sandhu 21,38,55,119, Paul M Ridker 4,14,119, Daniel J Rader 9,10,119, Cornelia M van Duijn 17,42,119, Leena Peltonen 5,12,13,55,119, Gonçalo R Abecasis 1,119, Michael Boehnke 1,119, Sekar Kathiresan 2,3,4,5,119
PMCID: PMC3039276  NIHMSID: NIHMS213289  PMID: 20686565

Abstract

Serum concentrations of total cholesterol, low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG) are among the most important risk factors for coronary artery disease (CAD) and are targets for therapeutic intervention. We screened the genome for common variants associated with serum lipids in >100,000 individuals of European ancestry. Here we report 95 significantly associated loci (P < 5 × 10⁻⁸), with 59 showing genome-wide significant association with lipid traits for the first time. The newly reported associations include single nucleotide polymorphisms (SNPs) near known lipid regulators (e.g., CYP7A1, NPC1L1, and SCARB1) as well as in scores of loci not previously implicated in lipoprotein metabolism. The 95 loci contribute not only to normal variation in lipid traits but also to extreme lipid phenotypes and impact lipid traits in three non-European populations (East Asians, South Asians, and African Americans). Our results identify several novel loci associated with serum lipids that are also associated with CAD. Finally, we validated three of the novel genes (GALNT2, PPP1R3B, and TTC39B) with experiments in mouse models. Taken together, our findings provide the foundation to develop a broader biological understanding of lipoprotein metabolism and to identify new therapeutic opportunities for the prevention of CAD.


Serum concentrations of total cholesterol (TC), low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, and triglycerides are heritable risk factors for cardiovascular disease and targets for therapeutic intervention [1]. Genome-wide association studies (GWASs) involving up to 20,000 individuals of European ancestry have identified >30 genetic loci contributing to inter-individual variation in serum lipid concentrations [2-10]. Half of these loci harboured genes previously known to influence serum lipid concentrations, establishing the technical validity of the lipid GWAS. Nevertheless, the practical value of the GWAS approach remains a subject of debate [11-14].

Here we focus on three key questions motivated by recent progress in genetic mapping: (1) are loci identified in populations of European descent important in non-European groups, suggesting relevance in different global populations; (2) are these loci of clinical relevance, providing the framework to identify potential novel drug targets for the treatment of extreme lipid phenotypes and prevention of coronary artery disease (CAD); and (3) do these loci harbour genes with biological relevance, i.e., genes that are directly involved in lipid regulation and metabolism?

We address these questions using several approaches: a genome-wide association screen for serum lipids in >100,000 individuals of European ancestry; evaluation of mapped variants in East Asians, South Asians, and African Americans; association testing in individuals with and without CAD; evaluation of genetic variants in patients with extreme serum lipid concentrations; and genetic manipulation in mouse models.

Genome-Wide Association Mapping in >100,000 Individuals

To identify additional common variants associated with serum TC, LDL-C, HDL-C, and TG concentrations, we performed meta-analysis of 46 lipid GWASs (Supplementary Tables 1-4). These studies together comprise >100,000 individuals of European descent (maximum sample size 100,184 for TC; 95,454 for LDL-C; 99,900 for HDL-C; and 96,598 for TG), ascertained in the United States, Europe, or Australia. In each study, we used genotyped single nucleotide polymorphisms (SNPs) and phased chromosomes from the HapMap CEU (Utah residents with ancestry from northern and western Europe) sample to impute autosomal SNPs catalogued in the HapMap; SNPs with minor allele frequency (MAF) >1% and good imputation quality (see Methods) were analysed. A total of ∼2.6 million directly genotyped or imputed SNPs were tested for association with each of the four lipid traits in each study. For each SNP, evidence of association was combined across studies using a fixed-effects meta-analysis.
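As an illustration of how per-study evidence for a single SNP can be combined, the following is a minimal sketch of one common fixed-effects scheme, inverse-variance weighting. The text does not state which weighting scheme was used, so this choice is an assumption, and the function name and inputs (per-study effect estimates and standard errors) are hypothetical.

```python
import math


def fixed_effects_meta(betas, ses):
    """Combine per-study effect estimates for one SNP.

    Assumes inverse-variance weighting (an illustrative choice, not
    necessarily the scheme used in the study).
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p
```

Under this scheme, larger or more precise studies contribute more to the pooled estimate, and a SNP would be called genome-wide significant when the returned P-value falls below 5 × 10⁻⁸.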

We identified 95 loci that showed genome-wide significant association (P < 5 × 10⁻⁸) with at least one of the four traits tested (Fig. 1; Supplementary Fig. 1; Supplementary Table 2). These include all of the 36 loci previously reported by GWAS at genome-wide significance [2-10] and 59 loci reported here in a GWAS for the first time. Among these 59 novel loci, 39 demonstrated genome-wide significant association with TC, 22 with LDL-C, 31 with HDL-C, and 16 with TG. Among the 36 known loci, 21 demonstrated genome-wide significant association with another lipid phenotype in addition to that previously described. To rule out spurious associations arising as a result of imputation artifact, at nearly all loci we were able to identify proxy SNPs that had been directly genotyped on Illumina and/or Affymetrix arrays and confirm each of the associations (Supplementary Table 5). The full association results for each of the four traits are available at http://www.broadinstitute.org/mpg/pubs/lipids2010/ or http://www.sph.umich.edu/csg/abecasis/public/lipids2010.

Figure 1. Meta-analysis of plasma lipid concentrations in >100,000 individuals of European descent.

The gene name listed in “Locus” column is either a plausible biological candidate gene in the locus or the nearest annotated gene to the lead SNP. Listed in “Lead Trait” column is the lipid trait with best P-value among all four traits. Listed in “Other Traits” are additional lipid traits with P < 5 × 10⁻⁸. Listed in “Alleles/MAF” column are: major allele, minor allele, and minor allele frequency (MAF) within the combined cohorts included in this meta-analysis (alleles designated with respect to the “+” strand; Supplementary Table 2). Numbers in “Effect Size” column are in mg/dL for the lead trait, modeled as an additive effect of the minor allele. P-values are listed for the lead traits. In the “eQTL” column, “Y” indicates that lead SNP has an eQTL with at least one gene within 500 kb with P < 5 × 10⁻⁸ in at least one of the three tissues tested (liver, omental fat, subcutaneous fat). In the “CAD” column, “Y” indicates that the lead SNP meets the pre-specified statistical significance threshold of P < 0.001 for association with CAD and that the direction of the lipid effect is concordant with the change in CAD risk. In the “Ethnic” column, “+” indicates concordant effect on lead trait of the variant between the primary meta-analysis cohort and the European or non-European group, “−” indicates discordant effect on lead trait, and “?” indicates data not available for the group; in order, the ethnic groups are European, East Asian, South Asian, and African American (Supplementary Table 11).

To evaluate whether additional independent association signals existed at each locus, we performed conditional association analyses for each of the four lipid traits including genotypes at the lead SNPs for each of the 95 loci as covariates in the association analyses (see Methods). These analyses identified secondary signals in 26 loci (Supplementary Table 6); when these additional SNPs are combined with the lead SNPs, the total set of mapped variants explains 12.4% (TC), 12.2% (LDL-C), 12.1% (HDL-C), and 9.6% (TG) of the total variance in each lipid trait in the Framingham Heart Study, corresponding to ∼25-30% of the genetic variance for each trait.

Previous studies have suggested sex-specific heritability of lipid traits [15]. A key challenge in addressing this issue is evaluating enough men and women to achieve adequate statistical power for each sex. We re-analysed the GWAS for the four lipid traits separately in women (n = 63,274) and in men (n = 38,514). Four of the 95 loci identified in the primary analysis showed significant heterogeneity of effect size (P < 0.0005) between men and women (Supplementary Table 7). Moreover, an additional five loci had significant association in only one sex and not in the sex-combined analysis. Two loci associated with HDL-C in the sex-combined analysis (KLF14 and ABCA8) showed female-specific association with TG and LDL-C, respectively. The KLF14 locus is a striking example, with rs1562398 significantly associated with TG in women (effect size = −0.046 for the C allele, P = 2 × 10⁻¹²), but not in men (effect size = −0.012, P = 0.05) (Supplementary Fig. 2; Supplementary Table 7).
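A heterogeneity comparison between two independent strata, such as women and men, can be sketched as a z-test on the difference between the two effect estimates. The text does not describe the test actually used, and it does not report standard errors, so the test choice, the function name, and all inputs below are assumptions for illustration.

```python
import math


def heterogeneity_z_test(beta_a, se_a, beta_b, se_b):
    """Test whether two independent effect estimates differ.

    Illustrative z-test on the difference; assumes the two strata
    share no individuals. Returns (z, two_sided_p).
    """
    diff = beta_a - beta_b
    # Variances add because the strata are assumed independent.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p
```

With this form, a locus would be flagged as heterogeneous when the returned P-value falls below the chosen threshold (P < 0.0005 in the analysis above).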

To gain insight into how DNA variants in associated loci might influence serum lipid concentrations, we tested whether the mapped DNA sequence variants regulate the expression levels of nearby genes (expression quantitative trait loci, or eQTLs) in human tissues relevant to lipoprotein metabolism (liver and fat) [16]. We carried out genotyping and RNA expression profiling of >39,000 transcripts in three types of human tissue samples: liver (960 samples), omental fat (741 samples), and subcutaneous fat (609 samples). We examined the correlations between each of the lead SNPs at the 95 loci and the expression levels of transcripts located within 500 kb of the SNP. We pre-specified a conservative threshold of statistical significance at P < 5 × 10⁻⁸. At this threshold, we identified 38 SNP-to-gene eQTLs in liver, 28 in omental fat, and 19 in subcutaneous fat (Fig. 1; Supplementary Tables 8-10). Some lead SNPs are quite remote from the associated gene transcripts. For example, rs9987289 (associated with both LDL-C and HDL-C) correlates with a two-fold change in liver expression of PPP1R3B, yet is 174 kb away from the gene, which as demonstrated below is likely to be a causal gene. Similarly, rs2972146 (associated with both HDL-C and TG in this study, as well as with insulin resistance and type 2 diabetes mellitus in a prior study [17]) correlates with IRS1 expression in omental fat, despite being located 495 kb away from the gene.

Relevance of GWAS Loci in Non-Europeans

As all of the individuals studied in our primary GWAS were of European ancestry, it remained unclear if the loci we identified in Europeans are relevant in non-European individuals. To address this question, we performed additional analyses in cohorts comprising >15,000 East Asians (Chinese, Koreans, and Filipinos), >9,000 South Asians, and >8,000 African Americans (Fig. 1; Supplementary Table 11). As a similarly sized control, we also performed genotyping in a cohort of 7,000 additional Europeans.

In the European group, we found that 35 of 36 lead SNPs tested against LDL-C had the same direction of association as seen in the primary (>100,000 person) analysis (see Supplementary Table 12 for explanation); 44 of 47 SNPs for HDL-C; and 29 of 32 SNPs for TG. Such directional consistency for the three traits is unlikely to be due to chance (P = 5 × 10⁻¹⁰ for LDL-C; P = 1 × 10⁻¹⁰ for HDL-C; and P = 1 × 10⁻⁶ for TG). For further replication evidence, we performed direct genotyping of a subset of the lead SNPs in two European cohorts together totalling 12,000 individuals and found that 24 of 26 tested SNPs had the same direction of association (Supplementary Table 13).
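The directional-consistency P-values can be understood through a sign test: if each SNP were equally likely to agree or disagree in direction by chance, the number of concordant SNPs would follow a binomial distribution with probability 0.5. The text does not name the test used, so the one-sided binomial form below is an assumption, although it gives values in line with those reported (for example, about 5.4 × 10⁻¹⁰ for 35 of 36). The function name is hypothetical.

```python
from math import comb


def sign_test_p(concordant, total):
    """One-sided binomial sign test.

    Probability of seeing at least `concordant` same-direction SNPs
    out of `total` when each direction is equally likely by chance.
    """
    if not 0 <= concordant <= total:
        raise ValueError("concordant must be between 0 and total")
    # Sum the upper tail of Binomial(total, 0.5) using exact integers.
    tail = sum(comb(total, k) for k in range(concordant, total + 1))
    return tail / 2 ** total
```

For instance, `sign_test_p(35, 36)` evaluates 37 / 2³⁶, since only the outcomes with 35 or 36 concordant SNPs lie in the tail.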

We observed similar proportions in South Asians, with 29 of 32 lead SNPs tested against LDL-C having the same direction of association as in the primary analysis (P = 1 × 10⁻⁶); 35 of 39 SNPs for HDL-C (P = 2 × 10⁻⁷); and 24 of 27 SNPs for TG (P = 3 × 10⁻⁵). We also had consistent results with East Asians [LDL-C: 29 of 36, P = 2 × 10⁻⁴; HDL-C: 38 of 44, P = 5 × 10⁻⁷; TG: 26 of 28, P = 2 × 10⁻⁶], with more modest evidence for replication in African Americans [LDL-C: 33 of 36, P = 1 × 10⁻⁷; HDL-C: 37 of 44, P = 3 × 10⁻⁶; TG: 24 of 30, P = 7 × 10⁻⁴]. Furthermore, we found that the proportions of SNPs that had the same direction of association and P < 0.05 were similar in the European, South Asian, and East Asian replication groups, with smaller proportions in African Americans (Supplementary Table 12). Of note, for a majority of the loci, there was no evidence of heterogeneity of effects between the primary European groups and each of the non-European groups (Supplementary Table 11).

These observations suggest that most (but likely not all) of the 95 lipid loci identified in this study contribute to the genetic architecture of lipid traits widely across global populations. They also suggest future studies to localize causal DNA variants by leveraging differences in linkage disequilibrium (LD) patterns among populations. We evaluated the potential for fine mapping by comparing the number of SNPs in LD with lead SNPs in three HapMap populations (Supplementary Table 14). At many loci, only a subset of SNPs in high LD (r² ≥ 0.8) with the lead SNP in HapMap CEU are also in high LD with the lead SNP in HapMap YRI (Yoruba in Ibadan, Nigeria) individuals or in the joint JPT+CHB (Japanese in Tokyo, Japan and Han Chinese in Beijing, China) cohort. Such differential LD patterns can prove useful to refine association boundaries and prioritize SNPs for functional evaluation, as demonstrated for the LDL-C-associated locus on chromosome 1p13 (reported in a separate study in this issue of Nature [18]).

Clinical Relevance of GWAS Loci

To assess whether the GWAS approach yields clinical insights of potential therapeutic relevance, we sought to determine which of the lipid-associated lead SNPs are also associated with CAD in a manner consistent with established epidemiological relationships (i.e., SNP alleles which increase TC, LDL-C, or TG or that decrease HDL-C should be associated with increased risk of CAD). Whereas LDL-C is an accepted causal risk factor for CAD, it is unclear whether HDL-C and/or TG are also causal risk factors. This uncertainty was reinforced by the failure of a drug that raised HDL-C via CETP inhibition to reduce the risk of cardiovascular disease [19].

Whether other drugs that specifically raise HDL-C or lower TG can reduce CAD risk remains an open question. In contrast, the most widely marketed drugs for lowering of LDL-C, statins, have been demonstrated in numerous clinical trials to reduce risk of CAD. Statins inhibit 3-hydroxy-3-methylglutaryl coenzyme A reductase (the protein product of HMGCR) and thereby reduce LDL-C and TC levels. We observed that the variant of our lead SNP in the HMGCR locus that is associated with lower LDL-C levels is also associated with lower CAD risk (P = 0.004), consistent with the clinical effects of statins. Analogously, common variants in other lipid-associated loci that are also associated with CAD may implicate genes at these loci as possible therapeutic targets.

We performed association testing for each of the lead SNPs from this study in 24,607 individuals of European descent with CAD and 66,197 without CAD, with a pre-specified one-sided significance threshold of P < 0.001 requiring directionality consistent with the relevant lipid-CAD epidemiological relationship. A limited number of loci met this criterion (Fig. 1; Supplementary Table 15), with most of them being associated with LDL-C—consistent with LDL-C being a causal risk factor for CAD.

Four novel CAD-associated loci related specifically to HDL-C or TG but not LDL-C: IRS1 (HDL-C, TG), C6orf106 (HDL-C), KLF14 (HDL-C), and NAT2 (TG). That these loci were associated with CAD suggests that there may be selective mechanisms by which HDL-C or TG can be altered in ways that also modulate CAD risk. However, it is also possible that causal genes in these loci may have pleiotropic effects on non-lipid parameters that are causal for CAD risk reduction. For example, the major allele of the lead SNP in the IRS1 locus is associated with increased risk of type 2 diabetes mellitus, insulin resistance, and hyperinsulinemia [17], along with decreased HDL-C, increased TG, and increased risk of CAD; it remains unclear which of the metabolic risk factors are responsible for the increased CAD risk.

Besides CAD, a second clinically relevant phenotype is hyperlipidemia. We asked whether the common variants in the 95 associated loci, each with individually small effects on serum lipids, combine to contribute to extreme lipid phenotypes. We genotyped individuals identified in three independent studies as having high LDL-C (n = 532, mean 219 mg/dL), high HDL-C (n = 652, mean 90 mg/dL), or high TG (n = 344, mean 1,079 mg/dL). For each extreme case group, individuals with low serum LDL-C (n = 532, mean 110 mg/dL), HDL-C (n = 784, mean 36.2 mg/dL), or TG (n = 144, mean 106 mg/dL) served as control groups. In each case-control sample set, we calculated risk scores summarizing the number of LDL-C-, HDL-C-, or TG-raising alleles weighted by effect size.
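The weighted risk score described above can be sketched as a sum over SNPs of the number of trait-raising alleles an individual carries, each multiplied by that SNP's effect size. The function name and the input layout (one allele count of 0, 1, or 2 per SNP, in the same order as the effect sizes) are assumptions made for illustration.

```python
def weighted_risk_score(allele_counts, effect_sizes):
    """Effect-size-weighted count of trait-raising alleles.

    allele_counts: copies of the trait-raising allele per SNP (0, 1, or 2).
    effect_sizes:  per-allele effect of each SNP, in the same order.
    """
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("one effect size is required per SNP")
    if any(count not in (0, 1, 2) for count in allele_counts):
        raise ValueError("allele counts must be 0, 1, or 2")
    return sum(count * effect for count, effect in zip(allele_counts, effect_sizes))
```

A carrier of two raising alleles at a strong-effect SNP therefore contributes more to the score than a carrier of two raising alleles at a weak-effect SNP.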

For LDL-C, we found that individuals with an LDL-C allelic dosage score in the top quartile were 13 times as likely to have high LDL-C as individuals in the bottom quartile (P = 1 × 10⁻¹⁴) (Supplementary Fig. 3; Supplementary Tables 16, 17). For HDL-C, individuals in the top quartile of HDL-C risk score were four times as likely to have high HDL-C as individuals in the bottom quartile (P = 2 × 10⁻¹⁶). For TG, individuals in the top quartile of TG risk score were 44 times as likely to be hypertriglyceridemic as individuals in the bottom quartile (P = 4 × 10⁻²⁸). These results suggest that the additive effects of multiple common variants contribute to determining membership in the extremes of a quantitative trait distribution.
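A top-versus-bottom quartile comparison of this kind can be sketched as an odds ratio computed from a case-control sample. The sketch below assumes that quartile cut points are taken from the pooled scores of cases and controls and that "times as likely" is read as an odds ratio; neither detail is stated in the text, and the function name is hypothetical.

```python
import statistics


def top_vs_bottom_quartile_odds_ratio(scores, is_case):
    """Odds ratio for case status, top score quartile vs bottom quartile.

    scores:  risk score per individual.
    is_case: True for extreme-phenotype cases, False for controls.
    """
    # Quartile cut points from the pooled sample (an assumption).
    q1, _, q3 = statistics.quantiles(scores, n=4)
    top_cases = top_controls = bottom_cases = bottom_controls = 0
    for score, case in zip(scores, is_case):
        if score >= q3:
            if case:
                top_cases += 1
            else:
                top_controls += 1
        elif score <= q1:
            if case:
                bottom_cases += 1
            else:
                bottom_controls += 1
    if top_controls == 0 or bottom_cases == 0:
        raise ValueError("odds ratio undefined: an empty cell in the 2x2 table")
    # Cross-product form of the odds ratio.
    return (top_cases * bottom_controls) / (top_controls * bottom_cases)
```

Individuals in the two middle quartiles are ignored, so the result contrasts only the extremes of the score distribution.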

Biological Relevance of GWAS Loci

Whether the GWAS approach can yield biological insights that improve our understanding of the mechanisms underlying phenotypes such as serum lipid concentrations remains an open question. Loci identified through GWAS may explain a very small proportion of the variance in a phenotype through naturally occurring common variants in humans, but they may have a greater impact through rare variants or when targeted by pharmacological or genetic intervention.

We surveyed our 95 GWAS loci and asked whether any nearby genes are linked to known Mendelian lipid disorders. There is remarkable overlap between the loci identified here and 18 genes previously implicated in Mendelian lipid disorders (Supplementary Table 18). Fifteen of the genes underlying these Mendelian disorders lie within 100 kb of one of our lead SNPs, including 8 that lie within 10 kb of the nearest lead SNP. In 1,000,000 simulations of 95 randomly drawn SNPs, selected to match our lead SNPs with respect to MAF and the number of nearby genes, the average simulation showed no overlapping loci and none showed >8 overlapping loci.
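The simulation logic can be sketched as an empirical P-value: draw many random SNP sets of the same size, count how many fall near a Mendelian lipid gene, and ask how often the count reaches the observed overlap. This simplified version draws uniformly from a candidate pool and omits the matching on MAF and number of nearby genes described above; the function and parameter names are hypothetical.

```python
import random


def empirical_overlap_p(observed, candidate_snps, near_mendelian,
                        n_draw, n_sims, seed=0):
    """Empirical P-value for overlap with Mendelian disease genes.

    observed:       overlap count seen for the real lead SNPs.
    candidate_snps: pool of SNP identifiers to draw from.
    near_mendelian: set of SNP identifiers lying near a Mendelian gene.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        sample = rng.sample(candidate_snps, n_draw)
        overlap = sum(1 for snp in sample if snp in near_mendelian)
        if overlap >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sims + 1)
```

When no simulated draw reaches the observed overlap, the estimate is bounded by the number of simulations, which is why a large number of draws is needed to support a very small P-value.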

An additional two loci represent well-established drug targets for the treatment of hyperlipidemia: HMGCR (statins) and NPC1L1 (ezetimibe). Several other loci harbor genes already appreciated to influence lipid metabolism prior to this study: LPA, which encodes lipoprotein(a); PLTP, which encodes phospholipid transfer protein; ANGPTL3 and ANGPTL4, lipoprotein lipase inhibitors; SCARB1, an HDL receptor which mediates selective uptake of cholesteryl ester; CYP7A1, which encodes cholesterol 7-alpha-hydroxylase; STARD3, a cholesterol transport gene; and LRP1 and LRP4, members of the LDL receptor-related protein family. Notably, the protein product of one of the genes implicated by our study, MYLIP, is a ubiquitin ligase that had no recognized role in lipid metabolism prior to our study's inception, but has since been independently demonstrated to be a regulator of cellular LDL receptor levels and is now termed Idol (inducible degrader of the LDL receptor) [20].

GALNT2 (encoding UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 2) is a member of a family of GalNAc-transferases, which transfer an N-acetyl galactosamine to the hydroxyl group of a serine/threonine residue in the first step of O-linked oligosaccharide biosynthesis. It is the only gene in the mapped locus on chromosome 1q42 within 150 kb of the lead SNP (rs4846914), which is located in an intron of the gene. We therefore reasoned that GALNT2 would be an ideal candidate for functional validation in a mouse model. We introduced the mouse orthologue Galnt2 into mouse liver via a viral vector. Liver-specific overexpression of Galnt2 resulted in significantly lower plasma HDL-C (24% compared to control mice) by 4 weeks (Fig. 2a). We also performed knockdown of endogenous liver Galnt2 through delivery of an shRNA via a viral vector. Reduction of the transcript level (∼95% knockdown as determined by qRT-PCR) resulted in higher HDL-C levels by 4 weeks (71% compared to control mice) (Fig. 2b). These observations validate GALNT2 as a biological mediator of HDL-C levels.

Figure 2. Effects of altered Galnt2, Ppp1r3b, or Ttc39b expression in mouse liver on plasma lipid levels.

a, b, Overexpression and knockdown of Galnt2. Shown are plasma HDL-C levels at baseline, 2 weeks, or 4 weeks after injection of viral vectors. n = 6 mice per group. c, Overexpression of Ppp1r3b. Shown are plasma HDL-C levels at baseline, 2 weeks, or 4 weeks after injection of viral vectors. n = 7 mice per group. d, Knockdown of Ttc39b. Shown are plasma HDL-C levels at baseline, 4 days, or 7 days after injection of viral vectors. n = 6 mice per group. Error bars show standard deviations. Because independent experiments were performed at different times and/or sites, there is variability in baseline HDL-C levels.

We further asked whether eQTL studies could facilitate the identification of causal genes in loci with multiple genes. Out of several genes surrounding a locus on chromosome 8p23 found to be associated with HDL-C, LDL-C, and TC (Fig. 1), only PPP1R3B [encoding protein phosphatase 1, regulatory (inhibitor) subunit 3B] was found to have an eQTL in liver (Supplementary Table 7). The allele associated with increased expression correlated with lower levels of each of the lipid traits. This eQTL relationship suggests that higher expression of PPP1R3B will lower plasma lipids. Consistent with this prediction, overexpression of the mouse orthologue Ppp1r3b in mouse liver via a viral vector resulted in significantly lower plasma HDL-C levels at two weeks (25%) and four weeks (18%) (Fig. 2c), as well as lower TC levels at two weeks (21%) and four weeks (14%) (data not shown).

Similarly, at a locus on chromosome 9p22 found to be associated with HDL-C, TTC39B (encoding tetratricopeptide repeat domain 39B) was the only one of several genes in the locus to exhibit an eQTL in liver (Supplementary Table 7), with the allele associated with decreased expression correlating with increased HDL-C. Consistent with this eQTL, knockdown of the mouse orthologue Ttc39b via a viral vector, with 50% knockdown of transcript as determined by qRT-PCR, resulted in significantly higher plasma HDL-C levels at four days (19%) and seven days (14%) (Fig. 2d). These data suggest PPP1R3B and TTC39B as causal genes for lipid regulation. These findings, combined with the demonstration that SORT1 is a causal gene for LDL-C and is regulated in its expression by a GWAS SNP (reported in a separate study in this issue of Nature [18]), support the use of eQTL studies to prioritize functional validation of GWAS-nominated genes.

Together, these observations establish that some of the identified 95 loci harbour novel bona fide lipid regulatory genes and suggest that with additional functional studies many, if not all, of the loci will yield insights into the biological underpinnings of lipid metabolism.

New Biological, Clinical, and Genetic Insights

Through a series of studies, we demonstrate that (1) at least 95 loci across the human genome harbour common variants associated with serum lipid traits in Europeans; (2) the loci contribute to lipid traits in multiple non-European populations; (3) some of these loci are associated not only with lipids but also with risk for CAD; (4) common variants in the loci combine to contribute to extreme lipid phenotypes; and (5) many of the identified loci harbour genes that contribute to lipid metabolism, including the novel lipid genes GALNT2, PPP1R3B, and TTC39B that we validated in mouse models.

It has recently been suggested that conducting genetic studies with increasingly larger cohorts will be relatively uninformative for the biology of complex human disease, particularly if initial studies have failed to explain a sizable fraction of the heritability of the disease in question [11]. As the reasoning goes, analysis of a few thousand individuals will uncover the common variants with the strongest effect on phenotype. Larger studies will suffer from a plateau phenomenon in which either no additional common variants will be found or any common variants that are identified will have too small an effect to be of biological interest.

Our study provides strong empirical evidence against this assertion. We extended a GWAS for serum lipids from ∼20,000 to ∼100,000 individuals and identified 95 loci (of which 59 are novel) that, in aggregate, explain 10%-12% of the total variance (representing ∼25-30% of the genetic variance). Even though the lipid-associated SNPs we identified have relatively small effect sizes, some of the 59 new loci contain genes of clear biological and clinical importance, among them LDLRAP1 (responsible for autosomal recessive hypercholesterolemia), SCARB1 (receptor for selective uptake of HDL-C), NPC1L1 (established drug target), MYLIP (recently characterized regulator of LDL-C), and PPP1R3B (newly characterized regulator of HDL-C). We expect that future investigations of the new loci (e.g., resequencing efforts to identify low-frequency and rare variants, or functional experiments in cells and animal models, as demonstrated for SORT1 in a separate study reported in this issue of Nature [18]) will uncover additional important new genes. Thus, the data presented in this study provide a foundation from which to develop a broader biological understanding of lipoprotein metabolism and to identify potential new therapeutic opportunities.

Methods Summary

The full Methods are in Supplementary Information and provide information about: (1) study samples and phenotypes; (2) genotyping and imputation; (3) genome-wide association analyses; (4) meta-analyses of directly typed and imputed SNPs; (5) estimation of effect sizes; (6) conditional analyses of top signals; (7) sex-specific analyses; (8) cis-expression quantitative trait locus analyses; (9) analyses of lipid-associated SNPs in European and non-European samples; (10) analyses of lipid-associated SNPs in individuals with and without CAD; (11) analyses of associated SNPs in patients with extreme LDL-C, HDL-C, or TG levels; (12) simulation studies to assess overlap between GWAS signals and Mendelian disease loci; and (13) details of mouse studies.

Acknowledgments

We wish to dedicate this paper to the memory of Dr Leena Peltonen, who passed away on 11 March 2010. A full listing of acknowledgements is provided in Supplementary Information.

Footnotes

Author Contributions: T.M.T., K.M., A.V.S., A.C.E., I.M.S., M.K., and J.P.P. carried out the primary data analyses and/or experimental work. All other authors contributed to additional analyses. L.A.C., M.S.S., P.M.R., D.J.R., C.M.v.D., L.P., G.R.A., M.B., and S.K. conceived, designed, and supervised the study. K.M. wrote the manuscript.

Author Information: Reprints and permissions information is available at www.nature.com/reprints.

D.M.W., K.S., and V.M. are full-time employees of GlaxoSmithKline. M.S.S. has received research funding from GlaxoSmithKline. P.V. and G.W. received grant money from GlaxoSmithKline to fund the CoLaus study. G.T., K.S., U.T., and H.H. are full-time employees of deCODE genetics. The other authors declare no competing financial interests.

References

  • 1. Kathiresan S, et al. A genome-wide association study for blood lipid phenotypes in the Framingham Heart Study. BMC Med Genet. 2007;8(1):S17. doi: 10.1186/1471-2350-8-S1-S17.
  • 2. Diabetes Genetics Initiative of Broad Institute of Harvard and MIT, et al. Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science. 2007;316:1331–1336. doi: 10.1126/science.1142358.
  • 3. Willer CJ, et al. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nat Genet. 2008;40:161–169. doi: 10.1038/ng.76.
  • 4. Kathiresan S, et al. Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nat Genet. 2008;40:189–197. doi: 10.1038/ng.75.
  • 5. Kooner JS, et al. Genome-wide scan identifies variation in MLXIPL associated with plasma triglycerides. Nat Genet. 2008;40:149–151. doi: 10.1038/ng.2007.61.
  • 6. Wallace C, et al. Genome-wide association study identifies genes for biomarkers of cardiovascular disease: serum urate and dyslipidemia. Am J Hum Genet. 2008;82:139–149. doi: 10.1016/j.ajhg.2007.11.001.
  • 7. Sabatti C, et al. Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nat Genet. 2009;41:35–46. doi: 10.1038/ng.271.
  • 8. Aulchenko YS, et al. Loci influencing lipid levels and coronary heart disease risk in 16 European population cohorts. Nat Genet. 2009;41:47–55. doi: 10.1038/ng.269.
  • 9. Kathiresan S, et al. Common variants at 30 loci contribute to polygenic dyslipidemia. Nat Genet. 2009;41:56–65. doi: 10.1038/ng.291.
  • 10. Chasman DI, et al. Forty-three loci associated with plasma lipoprotein size, concentration, and cholesterol content in genome-wide analysis. PLoS Genet. 2009;5:e1000730. doi: 10.1371/journal.pgen.1000730.
  • 11. Goldstein DB. Common genetic variation and human traits. N Engl J Med. 2009;360:1696–1698. doi: 10.1056/NEJMp0806284.
  • 12. Hirschhorn JN. Genomewide association studies—illuminating biologic pathways. N Engl J Med. 2009;360:1699–1701. doi: 10.1056/NEJMp0808934.
  • 13. Kraft P, Hunter DJ. Genetic risk prediction—are we there yet? N Engl J Med. 2009;360:1701–1703. doi: 10.1056/NEJMp0810107.
  • 14. Hardy J, Singleton A. Genomewide association studies and human disease. N Engl J Med. 2009;360:1759–1768. doi: 10.1056/NEJMra0808700.
  • 15. Weiss LA, Pan L, Abney M, Ober C. The sex-specific genetic architecture of quantitative traits in humans. Nat Genet. 2006;38:218–222. doi: 10.1038/ng1726.
  • 16. Schadt EE, et al. Mapping the genetic architecture of gene expression in human liver. PLoS Biol. 2008;6:e107. doi: 10.1371/journal.pbio.0060107.
  • 17. Rung J, et al. Genetic variant near IRS1 is associated with type 2 diabetes, insulin resistance and hyperinsulinemia. Nat Genet. 2009;41:1110–1115. doi: 10.1038/ng.443.
  • 18. Musunuru K, et al. From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature. 2010. doi: 10.1038/nature09266.
  • 19. Barter PJ, et al. Effects of torcetrapib in patients at high risk for coronary events. N Engl J Med. 2007;357:2109–2122. doi: 10.1056/NEJMoa0706628.
  • 20. Zelcer N, Hong C, Boyadjian R, Tontonoz P. LXR regulates cholesterol uptake through Idol-dependent ubiquitination of the LDL receptor. Science. 2009;325:100–104. doi: 10.1126/science.1168974.
