[Preprint]. 2024 Sep 25:2024.09.24.614730.
doi: 10.1101/2024.09.24.614730.

Characterization of non-coding variants associated with transcription factor binding through ATAC-seq-defined footprint QTLs in liver

Max F Dudek et al. bioRxiv. 2024.

Abstract

Non-coding variants discovered by genome-wide association studies (GWAS) are enriched in regulatory elements harboring transcription factor (TF) binding motifs, strongly suggesting a connection between disease association and the disruption of cis-regulatory sequences. Occupancy of a TF inside a region of open chromatin can be detected in ATAC-seq, where bound TFs block the transposase Tn5, leaving a pattern of relatively depleted Tn5 insertions known as a "footprint". Here, we sought to identify variants associated with TF binding, or "footprint quantitative trait loci" (fpQTLs), in ATAC-seq data generated from 170 human liver samples. We used computational tools to scan the ATAC-seq reads to quantify TF binding likelihood as "footprint scores" at variants derived from whole-genome sequencing generated in the same samples. We tested for association between genotype and footprint score and observed 693 fpQTLs associated with footprint-inferred TF binding (FDR < 5%). Given that Tn5 insertion sites are measured with base-pair resolution, we show that fpQTLs can aid GWAS and QTL fine-mapping by precisely pinpointing TF activity within broad trait-associated loci where the underlying causal variant is unknown. Liver fpQTLs were strongly enriched across ChIP-seq peaks, liver expression QTLs (eQTLs), and liver-related GWAS loci, and their inferred effect on TF binding was concordant with their effect on underlying sequence motifs in 80% of cases. We conclude that fpQTLs can reveal causal GWAS variants, define the role of TF binding site disruption in disease, and provide functional insights into non-coding variants, ultimately informing novel treatments for common diseases.

Figures

Figure 1. ATAC-seq footprinting analysis can detect genotype-dependent binding events
(A) Calculation of FP score. TF binding is detectable in ATAC-seq experiments because bound TFs block the insertion of Tn5, leaving a site of relatively depleted cutsites within a larger ATAC-seq peak, known as a footprint. The PRINT software calculates the footprint (FP) score of a local insertion pattern using a supervised regression model trained on the insertion patterns of known binding sites. The resulting FP score can be interpreted as the relative likelihood of a binding event, which can depend on the genotype of a local SNP. (B) fpQTL discovery. Liver samples were taken from 170 donors, and analyzed by ATAC-seq and whole-genome sequencing (WGS). PRINT was used to calculate a footprint score at every SNP location in every sample, and for every SNP an FP score was regressed onto SNP genotype across samples to calculate a P-value for the strength of association.
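The per-SNP association test described in panel B can be sketched as an ordinary least-squares regression of FP score on genotype. This is a minimal illustration, not the authors' pipeline: the function name `regress_fp_on_genotype`, the 0/1/2 alt-allele dosage coding, the absence of covariates, and the normal approximation used for the P-value are all assumptions made here. An exact test would use a t distribution with n − 2 degrees of freedom.

```python
import math


def regress_fp_on_genotype(genotypes, fp_scores):
    """Simple OLS of FP score on genotype dosage for one SNP.

    genotypes: alt-allele counts per sample (assumed coding: 0, 1, 2).
    fp_scores: footprint scores for the same samples, in the same order.
    Returns (beta1, standard_error, approx_two_sided_p).
    """
    n = len(genotypes)
    if n != len(fp_scores) or n < 3:
        raise ValueError("need matched inputs with at least 3 samples")

    mean_g = sum(genotypes) / n
    mean_y = sum(fp_scores) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("genotype does not vary across samples")
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, fp_scores))

    beta1 = sxy / sxx                 # slope: change in FP score per alt allele
    beta0 = mean_y - beta1 * mean_g   # intercept
    rss = sum((y - beta0 - beta1 * g) ** 2 for g, y in zip(genotypes, fp_scores))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        return beta1, 0.0, 0.0        # perfect fit; P-value degenerates to 0

    # Normal approximation to the two-sided P-value for H0: beta1 = 0.
    # This is an assumption of the sketch, reasonable only for large n.
    z = beta1 / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta1, se, p_value


# Toy data (hypothetical, not from the study): FP score rises with dosage.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_scores = [0.1, 0.2, 0.4, 0.5, 0.8, 0.9]
toy_beta1, toy_se, toy_p = regress_fp_on_genotype(toy_genotypes, toy_scores)
```

Running this once per SNP would yield one β1 and one P-value per variant, which is the input a multiple-testing correction step would then consume.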
Figure 2. fpQTLs are enriched near transcription start sites (TSSs)
(A) Manhattan plot. For each SNP, the FP score was regressed onto genotype to calculate a coefficient of association (β1). P-values were calculated by testing the null hypothesis that β1 = 0. The vertical line represents an FDR-adjusted P-value of 0.05. (B) fpQTL proportions based on TSS proximity. SNPs were binned based on distance to the nearest TSS, and the proportion of SNPs within each bin labeled as fpQTLs was calculated. 95% binomial confidence intervals are shown. (C) Promoter fpQTLs have higher effect sizes (|β1|) than TSS-distal fpQTLs. fpQTLs were considered within a promoter if the distance to the nearest TSS was < 1 kb. (D) For all fpQTLs, the regression β1 (x-axis) is plotted against ΔFP score = alt allelic FP score – ref allelic FP score (y-axis), where the allelic FP scores were calculated by considering insertions in heterozygous samples separately based on their allele. Purple fpQTLs are concordant between their across-sample and within-sample effects. The number of fpQTLs is labeled in each quadrant (Fisher’s exact test OR = 2.2, P-value = 8.1×10⁻⁷).
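The FDR-adjusted threshold in panel A implies a multiple-testing correction across all tested SNPs. The text here does not name the procedure, so the sketch below assumes the Benjamini-Hochberg method, a common choice for FDR control. The function name `benjamini_hochberg` and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values, in the input order.

    Assumption: BH is used for illustration only; the source text reports
    an FDR threshold without specifying the procedure.
    """
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four SNPs.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
# SNPs passing a 5% FDR threshold would be labeled fpQTLs.
fpqtl_flags = [q < 0.05 for q in adj_p]
```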
Figure 3. fpQTLs are enriched in ChIP peaks and concordant with underlying sequence motifs
(A) The expected and observed number of fpQTLs within ChIP peaks for every TF with ChIP data. Liver-related TFs are labeled in red (see Supplementary Table 3). Expected number of fpQTLs was calculated as [#SNPs in ChIP peaks × proportion of SNPs that are fpQTLs]. (B) Number of concordant and discordant fpQTLs which overlap given motifs, grouped by TF. Three redundant CTCF motifs were excluded. Motifs from JASPAR, matched with P = 5×10⁻⁴. (C) Comparison of fpQTL effect size with the change in motif score, for all fpQTL-motif overlaps. The y-axis represents the regression beta, with positive values indicating an increase in binding for the allele with the stronger motif. Spearman coefficient and P-value shown.
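The expected count in panel A follows directly from the bracketed formula. The short sketch below works through it once. The function name is hypothetical, and every input except the 693 fpQTLs reported in the abstract is a made-up number chosen for illustration; the total number of tested SNPs is not given in this text.

```python
def expected_fpqtls_in_peaks(n_snps_in_peaks, n_fpqtls_total, n_snps_total):
    """Expected fpQTLs in a TF's ChIP peaks under no enrichment.

    Implements: #SNPs in ChIP peaks x proportion of SNPs that are fpQTLs.
    """
    if n_snps_total <= 0:
        raise ValueError("total SNP count must be positive")
    proportion_fpqtl = n_fpqtls_total / n_snps_total
    return n_snps_in_peaks * proportion_fpqtl


# 693 comes from the abstract; the other two values are hypothetical.
expected = expected_fpqtls_in_peaks(
    n_snps_in_peaks=50_000,     # hypothetical
    n_fpqtls_total=693,
    n_snps_total=1_000_000,     # hypothetical
)
observed = 120                  # hypothetical observed count
fold_enrichment = observed / expected
```

Comparing the observed count with this expectation per TF gives the kind of enrichment the panel plots.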
Figure 4. fpQTLs are enriched for lipid-associated SNPs
(A) Enrichment of fpQTLs in GWAS/QTL SNPs for different traits, using odds ratios (OR). GWAS SNPs investigated were defined as all SNPs which are either (1) a lead SNP reported in literature, or (2) a proxy of a lead SNP with r² > 0.8. For the top three traits, no such GWAS SNP is an fpQTL (OR = 0). P-values come from Fisher’s exact test; 95% confidence intervals are shown. Traits which are nominally significant (P < 0.05) are annotated with ✱. (B) Enrichment of GWAS heritability in fpQTLs for several traits, calculated by stratified LD score regression. P-values are calculated by ldsc using permutations. Error bars show ± standard error of enrichment. ldsc can sometimes return negative enrichment values, which are indicated for T2D and TG.
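The odds ratios and Fisher's exact test P-values in panel A come from a 2×2 table of SNPs cross-classified by GWAS status and fpQTL status. The sketch below computes both from scratch using the hypergeometric distribution. The function name, the table layout, and the example counts are assumptions for illustration. For genome-scale counts a statistics library would be far faster than this direct enumeration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Odds ratio and two-sided Fisher's exact P-value for a 2x2 table.

    Assumed layout (hypothetical labels):
                    fpQTL   not fpQTL
        GWAS SNP      a         b
        other SNP     c         d
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Two-sided P: sum over all tables no more probable than the observed one.
    # The small tolerance guards against floating-point ties.
    p_value = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7)
    )
    odds_ratio = (a * d) / (b * c) if b * c > 0 else float("inf")
    return odds_ratio, min(p_value, 1.0)


# Hypothetical small table, not data from the study.
toy_or, toy_p = fisher_exact_two_sided(3, 1, 1, 3)
```

When no GWAS SNP for a trait is an fpQTL, the top-left cell is zero and the odds ratio is 0, which matches the OR = 0 case noted in the caption.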
Figure 5. fpQTLs can fine-map GWAS loci
Significance plots show P-values for fpQTLs (top), LDL GWAS (middle), and eQTLs (bottom). (A) SORT1 locus significance plot. (B) FP score at rs12740374 (at SORT1 locus) across samples based on genotype. (C) Bias-corrected Tn5 insertions around rs12740374 (marked with x) based on genotype, aggregated across samples. (D) ZFPM1 locus significance plot, with the effect of rs55823018 on the RXRA binding motif shown below. (E) SLC12A8 locus significance plot, with the effect of rs11710930 on the HNF4A binding motif shown below.
