Abstract
Recent efforts to sequence the genomes of thousands of matched normal-tumor samples have led to the identification of millions of somatic mutations, the majority of which are non-coding. Most of these mutations are believed to be passengers, but a small number of non-coding mutations could contribute to tumor initiation or progression, e.g. by leading to dysregulation of gene expression. Efforts to identify putative regulatory drivers rely primarily on information about the recurrence of mutations across tumor samples. However, in regulatory regions of the genome, individual mutations are rarely seen in more than one donor. Instead of using recurrence information, here we present a method to prioritize putative regulatory driver mutations based on the magnitude of their effects on transcription factor-DNA binding. For each gene, we integrate the effects of mutations across all its regulatory regions, and we ask whether these effects are larger than expected by chance, given the mutation spectra observed in regulatory DNA in the cohort of interest. We applied our approach to analyze mutations in a liver cancer data set with ample somatic mutation and gene expression data available. By combining the effects of mutations across all regulatory regions of each gene, we identified dozens of genes whose regulation in tumor cells is likely to be significantly perturbed by non-coding mutations. Overall, our results show that focusing on the functional effects of non-coding mutations, rather than their recurrence, has the potential to prioritize putative regulatory drivers and the genes they dysregulate in tumor cells.
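The test described above asks whether a gene's combined mutation effects exceed what chance would produce. The abstract does not give the exact statistic or null model, so the following is only a minimal Monte Carlo sketch under stated assumptions: each mutation has a precomputed binding-change score, a pool of background scores reflects the cohort's mutation spectrum, and the per-gene statistic is the sum of absolute scores. The function name, its parameters, and the choice of statistic are hypothetical and are not taken from the paper.

```python
import random


def gene_effect_pvalue(observed_effects, background_effects, n_draws=10000, seed=0):
    """Empirical p-value for the total |effect| of one gene's regulatory mutations.

    observed_effects: hypothetical TF-binding change scores, one per mutation
        observed in the gene's regulatory regions.
    background_effects: scores for mutations sampled according to the cohort's
        mutation spectrum (assumed to be precomputed elsewhere).
    """
    rng = random.Random(seed)
    k = len(observed_effects)
    observed_total = sum(abs(e) for e in observed_effects)

    # Count null draws whose total effect is at least as large as the observed one.
    hits = 0
    for _ in range(n_draws):
        null_total = sum(abs(e) for e in rng.choices(background_effects, k=k))
        if null_total >= observed_total:
            hits += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_draws + 1)


# Toy background: mostly small effects, with one rare large effect.
background = [0.1] * 99 + [5.0]
p_value = gene_effect_pvalue([5.0, 5.0, 5.0], background, n_draws=2000)
```

In a genome-wide analysis, one such p-value per gene would still need multiple-testing correction, which this sketch leaves out.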
J. Zhao and V. Martin contributed equally to this work.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, J., Martin, V., Gordân, R. (2022). Transcription Factor-Centric Approach to Identify Non-recurring Putative Regulatory Drivers in Cancer. In: Pe'er, I. (eds) Research in Computational Molecular Biology. RECOMB 2022. Lecture Notes in Computer Science, vol 13278. Springer, Cham. https://doi.org/10.1007/978-3-031-04749-7_3
DOI: https://doi.org/10.1007/978-3-031-04749-7_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04748-0
Online ISBN: 978-3-031-04749-7
eBook Packages: Computer Science (R0)