Dual-semiparametric regression using weighted Dirichlet process mixture. (English) Zbl 1469.62148

Summary: An efficient and flexible Bayesian approach is proposed for a dual-semiparametric regression model, which models the mean function semiparametrically and estimates the distribution of the error term nonparametrically. The approach uses a weighted Dirichlet process mixture (WDPM) and is developed under the assumption that the distributions of the response variables are unknown. The WDPM approach is especially useful in real applications where the error distributions are heterogeneous or the data come from a mixture of distributions. In the mean function, the unknown functions are estimated using natural cubic smoothing splines. For the error terms, several WDPMs are proposed, with weights that depend on the distances between the covariates. Their marginal likelihoods are derived, and a method for computing the WDPM marginal likelihood is provided, together with efficient Markov chain Monte Carlo (MCMC) algorithms. In a simulation study, the Bayesian approaches based on the different WDPMs are compared, in terms of the Bayes factor, with the parametric error model and the Dirichlet process mixture (DPM) error model; the results suggest better performance of the WDPM-based approach. The advantage of the proposed Bayesian approach is also demonstrated on credit rating data.
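
The summary does not state which weight functions the WDPMs use, only that the weights depend on distances between covariates. As a purely illustrative sketch of that idea, and not the authors' construction, the Python snippet below computes normalized Gaussian-kernel weights from squared Euclidean distances. The function name `distance_weights`, the `centers` argument (hypothetical candidate locations), and the `bandwidth` parameter are all assumptions introduced here.

```python
import numpy as np


def distance_weights(x, centers, bandwidth=1.0):
    """Normalized Gaussian-kernel weights from covariate distances.

    Illustrative assumption only: the Gaussian kernel, `centers`, and
    `bandwidth` are hypothetical choices, not taken from the paper.

    x       : (n, p) array of covariates
    centers : (m, p) array of candidate locations
    returns : (n, m) array of weights; each row sums to 1
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    centers = np.atleast_2d(np.asarray(centers, dtype=float))
    # Squared Euclidean distance between every observation and every center.
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Shift by the row minimum before exponentiating for numerical stability.
    shifted = sq_dist - sq_dist.min(axis=1, keepdims=True)
    kernel = np.exp(-shifted / (2.0 * bandwidth ** 2))
    # Normalize so that the weights for each observation sum to 1.
    return kernel / kernel.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    x = np.array([[0.0], [1.0]])
    centers = np.array([[0.0], [1.0]])
    print(distance_weights(x, centers, bandwidth=0.5))
```

Under this assumed kernel, an observation places more weight on nearby locations than on distant ones, and a smaller `bandwidth` concentrates the weight more sharply on the nearest location.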

MSC:

62-08 Computational methods for problems pertaining to statistics
62G08 Nonparametric regression and quantile regression
62F15 Bayesian inference
62P05 Applications of statistics to actuarial sciences and financial mathematics

Software:

SemiPar; gamair

References:

[1] Basu, S.; Chib, S., Marginal likelihood and Bayes factors for Dirichlet process mixture models, J. Amer. Statist. Assoc., 98, 224-235, (2003) · Zbl 1047.62023
[2] Blackwell, D.; MacQueen, J. B., Ferguson distributions via Pólya urn schemes, Ann. Statist., 1, 353-355, (1973) · Zbl 0276.62010
[3] Carvalho, C.; Polson, N. G., The horseshoe estimator for sparse signals, Biometrika, 97, 465-480, (2010) · Zbl 1406.62021
[4] Chae, M., Lin, L., Dunson, D.B., 2016. Bayesian sparse linear regression with unknown symmetric error, arXiv:1608.02143.
[5] Chib, S., Marginal likelihood from the Gibbs output, J. Amer. Statist. Assoc., 90, 1313-1321, (1995) · Zbl 0868.62027
[6] Chib, S.; Greenberg, E., Additive cubic spline regression with Dirichlet process mixture errors, J. Econometrics, 156, 322-336, (2010) · Zbl 1431.62088
[7] Chu, W.; Ghahramani, Z., Gaussian processes for ordinal regression, J. Mach. Learn. Res., 6, 1019-1041, (2005) · Zbl 1222.68170
[8] Dunson, D. B.; Pillai, N.; Park, J., Bayesian density regression, J. R. Stat. Soc. Ser. B, 69, 163-183, (2007) · Zbl 1120.62025
[9] Dunson, D. B.; Stanford, J. B., Bayesian inferences on predictors of conception probabilities, Biometrics, 61, 126-133, (2005) · Zbl 1077.62106
[10] Durrleman, S.; Simon, R., Flexible regression models with cubic splines, Stat. Med., 8, 551-561, (1989)
[11] Escobar, M. D.; West, M., Bayesian density estimation and inference using mixtures, J. Amer. Statist. Assoc., 90, 577-588, (1995) · Zbl 0826.62021
[12] Ferguson, T. S., A Bayesian analysis of some nonparametric problems, Ann. Statist., 1, 209-230, (1973) · Zbl 0255.62037
[13] Ghosal, S., (Hjort, N. L.; Holmes, C.; Muller, P.; Walker, S. G., The Dirichlet Process, Related Priors, and Posterior Asymptotic, Bayesian Nonparametrics, (2009), Cambridge University Press New York), (Chapter 2)
[14] Green, P. J.; Silverman, B. W., (Nonparametric Regression and Generalized Linear Models: a Roughness Penalty Approach, Chapman & Hall/CRC Monographs on Statistics & Applied Probability, vol. 58, (1994)) · Zbl 0832.62032
[15] Hannah, L. A.; Blei, D. M.; Powell, W. B., Dirichlet process mixtures of generalized linear models, J. Mach. Learn. Res., 12, 1923-1953, (2011) · Zbl 1280.62031
[16] Holmes, C. C.; Held, L., Bayesian auxiliary variable models for binary and multinomial regression, Bayesian Anal., 1, 145-168, (2006) · Zbl 1331.62142
[17] Irwin, M.; Cox, N.; Kong, A., Sequential imputation for multilocus linkage analysis, Proc. Natl. Acad. Sci. USA, 91, 11684-11688, (1994)
[18] Jensen, M. J.; Maheu, J. M., Bayesian semiparametric multivariate GARCH modeling, J. Econometrics, 176, 3-17, (2013) · Zbl 1284.62559
[19] Kim, I.; Cheong, H.; Kim, H., Semiparametric regression models for detecting effect modification in matched case-control studies, Stat. Med., 30, 1837-1851, (2011)
[20] Kim, I.; Cohen, N., Semiparametric and nonparametric modelling for effect modification in matched case-control studies, Comput. Statist. Data Anal., 46, 631-643, (2004) · Zbl 1429.62039
[21] Kim, I.; Cohen, N.; Carroll, R., Semiparametric regression splines in matched case-control studies, Biometrics, 59, 1158-1169, (2003) · Zbl 1274.62804
[22] Lancaster, P.; Šalkauskas, K., Curve and Surface Fitting: An Introduction, (1986), Academic Press San Diego · Zbl 0649.65012
[23] Li, Q.; Racine, J. S., Nonparametric Econometrics: Theory and Practice, (2006), Princeton University Press Princeton
[24] MacEachern, S., Dependent Dirichlet Processes, (2000), Department of Statistics, The Ohio State University, (unpublished manuscript)
[25] MacEachern, S. N.; Möller, P., Estimating mixtures of Dirichlet process models, J. Comput. Graph. Stat., 7, 223-238, (1998)
[26] Mahmoud, H. F.F.; Kim, I.; Kim, H., Semiparametric single index multi change points model with an application of environmental health study on mortality and temperature, Environmetrics, 27, 494-506, (2016) · Zbl 1525.62172
[27] Marsh, L. C.; Cormier, D. R., (Spline Regression Models, Series: Quantitative Applications in the Social Sciences, vol. 137, (2002))
[28] Pagan, A.; Ullah, A., Nonparametric Econometrics: Themes in Modern Econometrics, (1999), Cambridge University Press Cambridge
[29] Pang, H. Kim. I.; Zhao, H., Bayesian semiparametric regression models for evaluating pathway effects on continuouse and binary clinical outcomes, Stat. Med., 31, 1633-1651, (2012)
[30] Pati, D.; Dunson, D. B., Bayesian nonparametric regression with varying residual density, Ann. Inst. Statist. Math., 66, 1-31, (2014) · Zbl 1281.62102
[31] Ruppert, D., Selecting the number of knots for penalized splines, J. Comput. Graph. Stat., 11, 735-757, (2002)
[32] Ruppert, D.; Wand, M. P.; Carroll, R. J., Semiparametric Regression, (2003), Cambridge University Press New York, NY, USA · Zbl 1038.62042
[33] Sethuraman, J., A constructive definition of Dirichlet priors, Statist. Sinica, 4, 639-650, (1994) · Zbl 0823.62007
[34] Verbeek, M., A Guide To Modern Econometrics, (2008), John Wiley & Sons, Ltd Chichester · Zbl 1013.62109
[35] Ortega Villa, A.; Kim, I.; Kim, H., Semiparametric time varying coefficient model for matched case-crossover studies, Stat. Med., 36, 998-1013, (2017)
[36] Wasserman, L., (All of Nonparametric Statistics, Springer Texts in Statistics, (2006), Springer New York) · Zbl 1099.62029
[37] Wood, S. N., (Generalized Additive Models: An Introduction with R, Texts in Statistical Science, (2006), Chapman & Hall/CRC Boca Raton, FL) · Zbl 1087.62082
[38] Zellner, A., (On Assessing Prior Distributions and Bayesian Regression Analysis with G-Prior Distributions, Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, (1986), North-Holland Amsterdam), 233-243 · Zbl 0655.62071
[39] Zhang, H.; Kim, I.; Park, I., Semiparametric Bayesian hierarchical models for heterogeneous population in nonlinear mixed effect model: application to gastric emptying studies, J. Appl. Stat., 41, 2743-2760, (2014) · Zbl 1514.62969
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.