Null hypothesis significance testing interpreted and calibrated by estimating probabilities of sign errors: a Bayes-frequentist continuum. (English) Zbl 07632826


MSC:

62-XX Statistics

References:

[1] Bayarri, M.; Benjamin, D. J.; Berger, J. O.; Sellke, T. M., “Rejection Odds and Rejection Ratios: A Proposal for Statistical Practice in Testing Hypotheses, Journal of Mathematical Psychology, 72, 90-103 (2016) · Zbl 1357.62018 · doi:10.1016/j.jmp.2015.12.007
[2] Begley, C. G.; Ioannidis, J. P., “Reproducibility in Science, Circulation Research, 116, 116-126 (2015) · doi:10.1161/CIRCRESAHA.114.303819
[3] Benjamin, D. J.; Berger, J. O., “Three Recommendations for Improving the Use of p-Values, The American Statistician, 73, 186-191 (2019) · Zbl 07588201
[4] Benjamin, D. J.; Berger, J. O.; Johannesson, M.; Nosek, B. A.; Wagenmakers, E. J.; Berk, R.; Bollen, K. A.; Brembs, B.; Brown, L.; Camerer, C.; Cesarini, D.; Chambers, C. D.; Clyde, M.; Cook, T. D.; De Boeck, P.; Dienes, Z.; Dreber, A.; Easwaran, K.; Efferson, C.; Fehr, E.; Fidler, F.; Field, A. P.; Forster, M.; George, E. I.; Gonzalez, R.; Goodman, S.; Green, E.; Green, D. P.; Greenwald, A. G.; Hadfield, J. D.; Hedges, L. V.; Held, L.; Hua Ho, T.; Hoijtink, H.; Hruschka, D. J.; Imai, K.; Imbens, G.; Ioannidis, J. P. A.; Jeon, M.; Jones, J. H.; Kirchler, M.; Laibson, D.; List, J.; Little, R.; Lupia, A.; Machery, E.; Maxwell, S. E.; McCarthy, M.; Moore, D. A.; Morgan, S. L.; Munafò, M.; Nakagawa, S.; Nyhan, B.; Parker, T. H.; Pericchi, L.; Perugini, M.; Rouder, J.; Rousseau, J.; Savalei, V.; Schönbrodt, F. D.; Sellke, T.; Sinclair, B.; Tingley, D.; Van Zandt, T.; Vazire, S.; Watts, D. J.; Winship, C.; Wolpert, R. L.; Xie, Y.; Young, C.; Zinman, J.; Johnson, V. E., “Redefine Statistical Significance, Nature Human Behaviour, 2, 6-10 (2018) · doi:10.1038/s41562-017-0189-z
[5] Bernardo, J. M., “Integrated Objective Bayesian Estimation and Hypothesis Testing, Bayesian Statistics, 9, 1-68 (2011)
[6] Bickel, D. R., “Estimating the Null Distribution to Adjust Observed Confidence Levels for Genome-Scale Screening, Biometrics, 67, 363-370 (2011) · Zbl 1219.62164 · doi:10.1111/j.1541-0420.2010.01491.x
[7] Bickel, D. R., “Coherent Frequentism: A Decision Theory Based on Confidence Sets, Communications in Statistics—Theory and Methods, 41, 1478-1496 (2012) · Zbl 1319.62007
[8] Bickel, D. R., “Empirical Bayes Interval Estimates That Are Conditionally Equal to Unadjusted Confidence Intervals or to Default Prior Credibility Intervals, Statistical Applications in Genetics and Molecular Biology, 11, 7 (2012) · Zbl 1296.92018
[9] Bickel, D. R., “Simple Estimators of False Discovery Rates Given as Few as One or Two p-Values Without Strong Parametric Assumptions, Statistical Applications in Genetics and Molecular Biology, 12, 529-543 (2013) · Zbl 1311.62109
[10] Bickel, D. R., Genomics Data Analysis: False Discovery Rates and Empirical Bayes Methods (2019), New York: Chapman and Hall/CRC, New York
[11] Bickel, D. R., “Maximum Entropy Derived and Generalized Under Idempotent Probability to Address Bayes-Frequentist Uncertainty and Model Revision Uncertainty, Working Paper (2019) · doi:10.5281/zenodo.2645555
[12] Bickel, D. R., “Null Hypothesis Significance Testing Defended and Calibrated by Bayesian Model Checking, The American Statistician (2019) · Zbl 07632862 · doi:10.1080/00031305.2019.1699443
[13] Bickel, D. R., “Sharpen Statistical Significance: Evidence Thresholds and Bayes Factors Sharpened into Occam’s Razor, Stat, 8, e215 (2019) · Zbl 07851102
[14] Bickel, D. R., “Confidence Distributions and Empirical Bayes Posterior Distributions Unified as Distributions of Evidential Support, Communications in Statistics—Theory and Methods (2020) · Zbl 07535583 · doi:10.1080/03610926.2020.1790004
[15] Bickel, D. R., “Interval Estimation, Point Estimation, and Null Hypothesis Significance Testing Calibrated by an Estimated Posterior Probability of the Null Hypothesis, Working Paper (2020) · Zbl 07649640 · doi:10.5281/zenodo.3694136
[16] Bickel, D. R.; Rahal, A., “Correcting False Discovery Rates for Their Bias Toward False Positives, Communications in Statistics—Simulation and Computation (2019) · Zbl 1497.62203 · doi:10.1080/03610918.2019.1630432
[17] Butler, J. S.; Jones, P., “Theoretical and Empirical Distributions of the p Value, METRON, 76, 1-30 (2018) · Zbl 1416.62040
[18] Button, K. S.; Ioannidis, J. P.; Mokrysz, C.; Nosek, B. A.; Flint, J.; Robinson, E. S.; Munafò, M. R., “Power Failure: Why Small Sample Size Undermines the Reliability of Neuroscience, Nature Reviews Neuroscience, 14, 365 (2013) · doi:10.1038/nrn3475
[19] Carlin, B. P.; Louis, T. A., Bayesian Methods for Data Analysis (2009), New York: Chapman & Hall/CRC, New York
[20] Casella, G.; Berger, R. L., “Reconciling Bayesian and Frequentist Evidence in the One-Sided Testing Problem, Journal of the American Statistical Association, 82, 106-111 (1987) · Zbl 0612.62021
[21] Colquhoun, D., “The Reproducibility of Research and the Misinterpretation of p-Values, Royal Society Open Science, 4, 171085 (2017)
[22] Colquhoun, D., “The False Positive Risk: A Proposal Concerning What to Do About p-Values, The American Statistician, 73, 192-201 (2019) · Zbl 07588202
[23] Cox, D. R., “The Role of Significance Tests, Scandinavian Journal of Statistics, 4, 49-70 (1977) · Zbl 0358.62006
[24] de Ruiter, J., “Redefine or Justify? Comments on the Alpha Debate, Psychonomic Bulletin & Review, 26, 430-433 (2019)
[25] Dreber, A.; Pfeiffer, T.; Almenberg, J.; Isaksson, S.; Wilson, B.; Chen, Y.; Nosek, B. A.; Johannesson, M., “Using Prediction Markets to Estimate the Reproducibility of Scientific Research, Proceedings of the National Academy of Sciences of the United States of America, 112, 15343-15347 (2015) · doi:10.1073/pnas.1516179112
[26] Efron, B., Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction (2010), Cambridge: Cambridge University Press, Cambridge · Zbl 1277.62016
[27] Efron, B.; Tibshirani, R., “Empirical Bayes Methods and False Discovery Rates for Microarrays, Genetic Epidemiology, 23, 70-86 (2002) · doi:10.1002/gepi.1124
[28] Efron, B.; Tibshirani, R.; Storey, J. D.; Tusher, V., “Empirical Bayes Analysis of a Microarray Experiment, Journal of the American Statistical Association, 96, 1151-1160 (2001) · Zbl 1073.62511
[29] Evans, M., Chapman & Hall/CRC Monographs on Statistics & Applied Probability, Measuring Statistical Evidence Using Relative Belief (2015), New York: CRC Press, New York · Zbl 1358.62004
[30] Goodman, S. N., “Toward Evidence-Based Medical Statistics. 2: The Bayes Factor, Annals of Internal Medicine, 130, 1005-1013 (1999) · doi:10.7326/0003-4819-130-12-199906150-00019
[31] Grandhi, A.; Guo, W.; Romano, J., “Control of Directional Errors in Fixed Sequence Multiple Testing, Statistica Sinica, 29, 1047-1064 (2019) · Zbl 1426.62217
[32] Greenland, S.; Poole, C., “Living With p Values: Resurrecting a Bayesian Perspective on Frequentist Statistics, Epidemiology, 24, 62-68 (2013) · doi:10.1097/EDE.0b013e3182785741
[33] Grundy, P. M., “Fiducial Distributions and Prior Distributions: An Example in Which the Former Cannot Be Associated With the Latter, Journal of the Royal Statistical Society, Series B, 18, 217-221 (1956) · Zbl 0073.14902
[34] Hannig, J.; Iyer, H.; Lai, R. C.; Lee, T. C., “Generalized Fiducial Inference: A Review and New Results, Journal of the American Statistical Association, 111, 1346-1361 (2016)
[35] Held, L.; Ott, M., “How the Maximal Evidence of p-Values Against Point Null Hypotheses Depends on Sample Size, The American Statistician, 70, 335-341 (2016) · Zbl 07665893
[36] Held, L.; Ott, M., “On p-Values and Bayes Factors, Annual Review of Statistics and Its Application, 5, 393-419 (2018)
[37] Huang, D. W.; Sherman, B. T.; Lempicki, R. A., “Bioinformatics Enrichment Tools: Paths Toward the Comprehensive Functional Analysis of Large Gene Lists, Nucleic Acids Research, 37, 1-13 (2009) · doi:10.1093/nar/gkn923
[38] Hughes, B., Psychology in Crisis (2018), London: Palgrave, London
[39] Hurlbert, S.; Lombardi, C., “Final Collapse of the Neyman-Pearson Decision Theoretic Framework and Rise of the neoFisherian, Annales Zoologici Fennici, 46, 311-349 (2009)
[40] Ioannidis, J. P., “Why Most Published Research Findings Are False, PLoS Medicine, 2, e124 (2005) · doi:10.1371/journal.pmed.0020124
[41] Johnson, V.; Payne, R.; Wang, T.; Asher, A.; Mandal, S., “On the Reproducibility of Psychological Science, Journal of the American Statistical Association, 112, 1-10 (2017) · doi:10.1080/01621459.2016.1240079
[42] Lakens, D.; Adolfi, F. G.; Albers, C. J.; Anvari, F.; Apps, M. A.; Argamon, S. E.; Baguley, T.; Becker, R. B.; Benning, S. D.; Bradford, D. E.; Buchanan, E. M., “Justify Your Alpha, Nature Human Behaviour, 2, 168 (2018)
[43] Lindley, D. V., “Fiducial Distributions and Bayes’ Theorem, Journal of the Royal Statistical Society, Series B, 20, 102-107 (1958) · Zbl 0085.35503
[44] Marsman, M.; Wagenmakers, E. J., “Three Insights From a Bayesian Interpretation of the One-Sided p Value, Educational and Psychological Measurement, 77, 529-539 (2017) · doi:10.1177/0013164416669201
[45] Martin, R.; Liu, C., “A Note on p-Values Interpreted as Plausibilities, Statistica Sinica, 24, 1703-1716 (2014) · Zbl 1480.62010
[46] Mayo, D. G. (2019)
[47] McShane, B. B.; Gal, D.; Gelman, A.; Robert, C.; Tackett, J. L., “Abandon Statistical Significance, The American Statistician, 73, 235-245 (2019) · Zbl 07588206
[48] Montazeri, Z.; Yanofsky, C. M.; Bickel, D. R., “Shrinkage Estimation of Effect Sizes as an Alternative to Hypothesis Testing Followed by Estimation in High-Dimensional Biology: Applications to Differential Gene Expression, Statistical Applications in Genetics and Molecular Biology, 9, 23 (2010) · Zbl 1304.92046
[49] Nadarajah, S.; Bityukov, S.; Krasnikov, N., “Confidence Distributions: A Review, Statistical Methodology, 22, 23-46 (2015) · Zbl 1486.62071
[50] Nieuwenhuis, S.; Forstmann, B. U.; Wagenmakers, E. J., “Erroneous Analyses of Interactions in Neuroscience: A Problem of Significance, Nature Neuroscience, 14, 1105-1107 (2011)
[51] Open Science Collaboration (2015), “Estimating the Reproducibility of Psychological Science,” Science, 349, aac4716.
[52] Pace, L.; Salvan, A., Principles of Statistical Inference: From a Neo-Fisherian Perspective, Advanced Series on Statistical Science & Applied Probability (1997), Singapore: World Scientific, Singapore · Zbl 0911.62003
[53] Polansky, A. M., Observed Confidence Levels: Theory and Application (2007), New York: Chapman and Hall, New York
[54] Pratt, J. W., “Bayesian Interpretation of Standard Inference Statements, Journal of the Royal Statistical Society, Series B, 27, 169-203 (1965) · Zbl 0142.15203
[55] Schachtman, N. A., “Palavering About p-Values,” (2019)
[56] Sellke, T.; Bayarri, M. J.; Berger, J. O., “Calibration of p Values for Testing Precise Null Hypotheses, The American Statistician, 55, 62-71 (2001) · Zbl 1182.62053
[57] Shen, J.; Liu, R. Y.; Xie, M. G., “Prediction With Confidence—A General Framework for Predictive Inference, Journal of Statistical Planning and Inference, 195, 126-140 (2018) · Zbl 1383.62076
[58] Shi, H.; Yin, G., “Reconnecting p-Value and Posterior Probability Under One- and Two-Sided Tests, The American Statistician (2020) · Zbl 07632864 · doi:10.1080/00031305.2020.1717621
[59] Singh, K.; Xie, M.; Strawderman, W. E., “Confidence Distribution (CD)—Distribution Estimator of a Parameter, IMS Lecture Notes Monograph Series 2007, 54, 132-150 (2007)
[60] Stephens, M., “False Discovery Rates: A New Deal, Biostatistics, 18, 275-294 (2016)
[61] van den Bergh, D.; Haaf, J. M.; Ly, A.; Rouder, J. N.; Wagenmakers, E. J., “A Cautionary Note on Estimating Effect Size,” PsyArXiv (2019) · doi:10.31234/osf.io/h6pr8
[62] Vovk, V. G., “A Logic of Probability, With Application to the Foundations of Statistics, Journal of the Royal Statistical Society, Series B, 55, 317-341 (1993) · Zbl 0806.62004
[63] Wacholder, S.; Chanock, S.; Garcia-Closas, M.; Ghormli, L. E.; Rothman, N., “Assessing the Probability That a Positive Report Is False: An Approach for Molecular Epidemiology Studies, Journal of the National Cancer Institute, 96, 434-442 (2004) · doi:10.1093/jnci/djh075
[64] Wasserstein, R. L.; Lazar, N. A., “The ASA’s Statement on p-Values: Context, Process, and Purpose, The American Statistician, 70, 129-133 (2016) · Zbl 07665862
[65] Wasserstein, R. L.; Schirm, A. L.; Lazar, N. A., “Moving to a World Beyond ‘p < 0.05’, The American Statistician, 73, 1-19 (2019) · Zbl 07588180
[66] Wilkinson, G. N., “On Resolving the Controversy in Statistical Inference” (with discussion), Journal of the Royal Statistical Society, Series B, 39, 119-171 (1977) · Zbl 0373.62002
[67] Wilson, B. M.; Wixted, J. T., “The Prior Odds of Testing a True Effect in Cognitive and Social Psychology, Advances in Methods and Practices in Psychological Science, 1, 186-197 (2018)
[68] Xie, M. G.; Singh, K., “Confidence Distribution, the Frequentist Distribution Estimator of a Parameter: A Review, International Statistical Review, 81, 3-39 (2013) · Zbl 1416.62170
[69] Yang, Z.; Li, Z.; Bickel, D. R., “Empirical Bayes Estimation of Posterior Probabilities of Enrichment: A Comparative Study of Five Estimators of the Local False Discovery Rate, BMC Bioinformatics, 14, 87 (2013) · doi:10.1186/1471-2105-14-87
[70] Yanofsky, C. M.; Bickel, D. R., “Validation of Differential Gene Expression Algorithms: Application Comparing Fold-Change Estimation to Hypothesis Testing, BMC Bioinformatics, 11, 63 (2010) · doi:10.1186/1471-2105-11-63
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.