
A new algorithm to estimate monotone nonparametric link functions and a comparison with parametric approach. (English) Zbl 1406.62082

Summary: The generalized linear model (GLM) is a class of regression models in which the means of the response variables and the linear predictors are joined through a link function. The standard GLM assumes that the link function is fixed; a more flexible GLM can be formed either by estimating the link function within a parametric family of link functions or by estimating it nonparametrically. In this paper, we propose a new algorithm that uses P-splines to estimate the link function nonparametrically while guaranteeing that the estimate is monotone. This is equivalent to fitting the generalized single index model with a monotonicity constraint. We also conduct extensive simulation studies to compare our nonparametric approach to estimating the link function with various parametric approaches, including the traditional logit, probit and robit link functions and two recently developed link functions, the generalized extreme value link and the symmetric power logit link. The simulation studies show that the link function estimated nonparametrically by our proposed algorithm performs well under a wide range of true link functions and outperforms the parametric approaches when they are misspecified. A real data example is used to illustrate the results.
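
To make the idea of a monotone P-spline concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes the linear predictor is already known and fits only the curve by penalized least squares, whereas the paper estimates the link together with the single index model. Monotonicity is imposed by writing the B-spline coefficients as a cumulative sum of nonnegative increments, which is a sufficient condition for a nondecreasing spline. The function names `bspline_basis` and `fit_monotone_pspline`, the default number of segments and the penalty weight `lam` are all illustrative assumptions.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import lsq_linear


def bspline_basis(x, xl, xr, nseg=10, degree=3):
    """Evaluate an equally spaced B-spline basis on [xl, xr] at the points x."""
    h = (xr - xl) / nseg
    # Knots extend `degree` steps beyond both ends, as is usual for P-splines.
    knots = xl + h * np.arange(-degree, nseg + degree + 1)
    n_basis = nseg + degree
    # Identity coefficients return all basis functions at once, shape (len(x), n_basis).
    return BSpline(knots, np.eye(n_basis), degree)(np.asarray(x, dtype=float))


def fit_monotone_pspline(x, y, nseg=10, degree=3, lam=1.0):
    """Penalized least-squares fit with nondecreasing B-spline coefficients.

    Illustrative only: a least-squares criterion on a fixed predictor x,
    not a likelihood-based fit and not the method of the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xl, xr = float(x.min()), float(x.max())
    B = bspline_basis(x, xl, xr, nseg, degree)
    n = B.shape[1]
    D = np.diff(np.eye(n), n=2, axis=0)   # second-order difference penalty
    L = np.tril(np.ones((n, n)))          # coef = L @ theta is a cumulative sum
    # Stack data rows and penalty rows into one bounded least-squares problem.
    A = np.vstack([B @ L, np.sqrt(lam) * (D @ L)])
    b = np.concatenate([y, np.zeros(D.shape[0])])
    # The first element is a free intercept; later increments must be >= 0.
    lower = np.concatenate([[-np.inf], np.zeros(n - 1)])
    upper = np.full(n, np.inf)
    theta = lsq_linear(A, b, bounds=(lower, upper)).x
    return L @ theta, (xl, xr)


# Usage on simulated binary responses with a logistic true link (assumed data).
rng = np.random.default_rng(0)
eta = np.linspace(-3.0, 3.0, 300)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
coef, (xl, xr) = fit_monotone_pspline(eta, y)
grid = np.linspace(xl, xr, 101)
fitted = bspline_basis(grid, xl, xr) @ coef   # nondecreasing in grid
```

Because the criterion is least squares, the fitted curve is not constrained to lie in [0, 1]; a likelihood-based fit with additional bound constraints would be needed to treat it as a probability.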

MSC:

62J12 Generalized linear models (logistic models)
65D07 Numerical computation using splines
62G08 Nonparametric regression and quantile regression

Software:

gamair; nloptr
