Abstract
Consider a general regression model in which the response Y depends on the discrete predictors X only through the index \( \boldsymbol{\beta }^{T}\mathbf{X} \). It is well known that the ordinary least squares (OLS) estimator recovers the underlying direction \( \boldsymbol{\beta } \) exactly when the link function between Y and X is linear. Li and Duan (Ann Stat 17:1009–1052, 1989) showed that the OLS estimator recovers \( \boldsymbol{\beta } \) up to a proportionality constant when the predictors satisfy the linear conditional mean (LCM) condition. For discrete predictors, we demonstrate that the LCM condition generally does not hold. To improve the OLS estimator in the presence of discrete predictors, we model the conditional mean \( \mathrm{E}(\mathbf{X}\mid \boldsymbol{\beta }^{T}\mathbf{X}) \) as a polynomial function of \( \boldsymbol{\beta }^{T}\mathbf{X} \) and use the central solution space (CSS) estimator. The superior performance of the CSS estimators is confirmed through numerical studies.
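To make the polynomial-CSS idea concrete, here is a minimal Python sketch for a toy single-index example with two discrete predictors. It assumes the estimating equation takes the moment form \( \mathrm{E}[\{\mathbf{X} -\mathrm{E}(\mathbf{X}\mid \boldsymbol{\beta }^{T}\mathbf{X})\}Y ] = \mathbf{0} \), with \( \mathrm{E}(\mathbf{X}\mid \boldsymbol{\beta }^{T}\mathbf{X}) \) fitted by a quadratic polynomial in the index; the data-generating model, polynomial degree, and grid search over directions are illustrative choices rather than the chapter's implementation.

```python
import numpy as np

# Hypothetical CSS-type sketch: find the unit vector b minimizing the norm of
# the sample moment (1/N) * sum_i {x_i - Ehat(X | b^T x_i)} y_i, where
# Ehat(X_j | b^T X) is a quadratic polynomial fit in t = b^T X.
rng = np.random.default_rng(1)
N = 5000
X = rng.binomial(4, 0.3, size=(N, 2)).astype(float)   # two discrete predictors
beta_true = np.array([1.0, 1.0])
Y = (X @ beta_true) ** 2 + rng.normal(size=N)          # Y depends on X only via beta^T X

def css_objective(b, X, Y, degree=2):
    t = X @ b
    resid = np.empty_like(X)
    for j in range(X.shape[1]):
        coef = np.polyfit(t, X[:, j], degree)          # polynomial model for E(X_j | t)
        resid[:, j] = X[:, j] - np.polyval(coef, t)
    moment = resid.T @ Y / len(Y)                      # sample analogue of E[{X - E(X|t)}Y]
    return moment @ moment

# With two predictors, search over directions on the unit half-circle.
angles = np.linspace(0.0, np.pi, 721)
vals = [css_objective(np.array([np.cos(a), np.sin(a)]), X, Y) for a in angles]
a_hat = angles[np.argmin(vals)]
b_hat = np.array([np.cos(a_hat), np.sin(a_hat)])
print(b_hat / b_hat[0])                                # roughly (1, 1), i.e., proportional to beta
```

In this toy setting the minimizer lies close to the 45° direction, matching \( \boldsymbol{\beta } = (1,1)^{T} \) up to scale; in higher dimensions the grid search would be replaced by a general-purpose optimizer over unit vectors.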
References
Carroll, R. J., Fan, J., Gijbels, I., & Wand, M. P. (1997). Generalized partially linear single-index models. Journal of the American Statistical Association, 92, 477–489.
Cook, R. D. (1998). Regression graphics. New York: Wiley.
Cook, R. D., & Li, L. (2009). Dimension reduction in regressions with exponential family predictors. Journal of Computational and Graphical Statistics, 18, 774–791.
Cook, R. D., & Nachtsheim, C. (1994). Reweighting to achieve elliptically contoured covariates in regression. Journal of the American Statistical Association, 89, 592–599.
Cui, X., Härdle, W., & Zhu, L. X. (2011). The EFM approach for single-index models. The Annals of Statistics, 39, 1658–1688.
Dong, Y., & Li, B. (2010). Dimension reduction for non-elliptically distributed predictors: Second order methods. Biometrika, 97, 279–294.
Dong, Y., & Yu, Z. (2012). Dimension reduction for the conditional kth moment via central solution space. Journal of Multivariate Analysis, 112, 207–218.
Härdle, W., Hall, P., & Ichimura, H. (1993). Optimal smoothing in single-index models. The Annals of Statistics, 21, 157–178.
Horowitz, J. L., & Härdle, W. (1996). Direct semiparametric estimation of single-index models with discrete covariates. Journal of the American Statistical Association, 91, 1632–1640.
Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics, 58, 71–120.
Li, B., & Dong, Y. (2009). Dimension reduction for non-elliptically distributed predictors. The Annals of Statistics, 37, 1272–1298.
Li, K. C., & Duan, N. (1989). Regression analysis under link violation. The Annals of Statistics, 17, 1009–1052.
Powell, J. L., Stock, J. H., & Stoker, T. M. (1989). Semiparametric estimation of index coefficients. Econometrica, 57, 1403–1430.
Sheng, W., & Yin, X. (2013). Direction estimation in single-index models via distance covariance. Journal of Multivariate Analysis, 122, 148–161.
Zhang, N., & Yin, X. (2015). Direction estimation in single-index regressions via Hilbert-Schmidt independence criterion. Statistica Sinica, 25, 743–758.
Appendix
Proof of \( \boldsymbol{\beta }_{\text{OLS}} \propto \boldsymbol{\beta } \) in Example 3, case iii.
For \( (X_{1},X_{2},W)^{T}\sim \text{Multinomial}(n,(p_{1},p_{2},p_{3})) \), it is well known that \( X_{i}\sim \text{Binomial}(n,p_{i}) \) for i = 1, 2, and that \( \mathrm{cov}(X_{1},X_{2}) = -np_{1}p_{2} \). It follows that \( \mathrm{E}(X_{1}X_{2}) =\mathrm{cov}(X_{1},X_{2}) +\mathrm{E}(X_{1})\mathrm{E}(X_{2}) = -np_{1}p_{2} + n^{2}p_{1}p_{2} = n(n - 1)p_{1}p_{2} \). It can be shown that
Next we denote \( k_{3} = n - k_{1} - k_{2} \) and calculate \( \mathrm{E}(X_{1}^{2}X_{2}) \) as follows
For \( Y = (X_{1} + X_{2})^{2} \), we thus have
Similarly, \( \mathrm{E}\{(X_{2} -\mathrm{E}(X_{2}))Y \} \) is equal to
Recall that \( \mathbf{X} = (X_{1},X_{2})^{T} \) and \( \mathrm{var}(\mathbf{X}) = \boldsymbol{\varSigma } \). Let \( \vert \boldsymbol{\varSigma }\vert \) be the determinant of \( \boldsymbol{\varSigma } \). Since the first row of \( \boldsymbol{\varSigma }^{-1} \) is \( \vert \boldsymbol{\varSigma }\vert ^{-1}\{np_{2}(1 - p_{2}),np_{1}p_{2}\} \), the first component of \( \boldsymbol{\beta }_{\text{OLS}} \) becomes
Due to the symmetry between \( p_{1} \) and \( p_{2} \) in the expression above, the second component of \( \boldsymbol{\beta }_{\text{OLS}} \) is exactly the same as the first component. Thus we have \( \boldsymbol{\beta }_{\text{OLS}} \propto (1,1)^{T} = \boldsymbol{\beta } \). □
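The conclusion can also be checked by simulation. Below is a minimal Monte Carlo sketch in Python that computes the sample version of \( \boldsymbol{\beta }_{\text{OLS}} = \boldsymbol{\varSigma }^{-1}\mathrm{cov}(\mathbf{X},Y) \) for \( Y = (X_{1} + X_{2})^{2} \); the number of trials n, the cell probabilities, and the Monte Carlo sample size are illustrative choices.

```python
import numpy as np

# Monte Carlo check that beta_OLS is proportional to (1, 1)^T when
# (X1, X2, W) ~ Multinomial(n, (p1, p2, p3)) and Y = (X1 + X2)^2.
# The values of N, n, p1, p2 below are illustrative, not from the chapter.
rng = np.random.default_rng(0)
N, n = 200_000, 10
p1, p2 = 0.2, 0.5                       # deliberately asymmetric cell probabilities
counts = rng.multinomial(n, [p1, p2, 1.0 - p1 - p2], size=N)
X = counts[:, :2].astype(float)         # X = (X1, X2)
Y = (X[:, 0] + X[:, 1]) ** 2            # Y = (X1 + X2)^2

Sigma = np.cov(X, rowvar=False)                                   # sample var(X)
cov_XY = np.array([np.cov(X[:, j], Y)[0, 1] for j in range(2)])   # sample cov(X, Y)
beta_ols = np.linalg.solve(Sigma, cov_XY)
print(beta_ols / beta_ols[0])           # approximately (1, 1)
```

Even though \( p_{1}\neq p_{2} \), the two components of the estimate agree up to Monte Carlo error, in line with the algebra above.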
Proof of Proposition 1.
Suppose that \( \mathrm{E}\{\mathrm{E}(\mathbf{X}\mid \boldsymbol{\eta }^{T}\mathbf{X})Y \} =\mathrm{E}\{\mathrm{E}(\mathbf{X}\mid \boldsymbol{\beta }^{T}\mathbf{X})Y \} \) with probability 1 for some \( \boldsymbol{\eta } \) that is not proportional to \( \boldsymbol{\beta } \). Then both \( \boldsymbol{\eta } \) and \( \boldsymbol{\beta } \) satisfy Eq. (6), so the solution of (6) is not unique up to scalar multiplication. Hence, under the assumption that \( \mathrm{Pr}(\mathrm{E}\{\mathrm{E}(\mathbf{X}\mid \boldsymbol{\eta }^{T}\mathbf{X})Y \}\neq \mathrm{E}\{\mathrm{E}(\mathbf{X}\mid \boldsymbol{\beta }^{T}\mathbf{X})Y \}) > 0 \) whenever \( \boldsymbol{\eta } \) is not proportional to \( \boldsymbol{\beta } \), the solution of (6) is unique up to scalar multiplication. Because \( \boldsymbol{\beta } \) satisfies (6) and this solution is unique up to scalar multiplication, we have \( \boldsymbol{\beta }_{\text{CSS}} \propto \boldsymbol{\beta } \). Under the additional LCM condition (2), \( \boldsymbol{\beta }_{\text{OLS}} \) is also proportional to \( \boldsymbol{\beta } \); consequently \( \boldsymbol{\beta }_{\text{CSS}} \propto \boldsymbol{\beta }_{\text{OLS}} \). □
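As a side remark on why \( \boldsymbol{\beta } \) itself satisfies the estimating equation: if Eq. (6) has the moment form \( \mathrm{E}\{\mathrm{E}(\mathbf{X}\mid \boldsymbol{\beta }^{T}\mathbf{X})Y \} =\mathrm{E}(\mathbf{X}Y) \) (an assumed form here, since the equation is not reproduced in this excerpt), then it holds at the true \( \boldsymbol{\beta } \) because the single-index model implies \( Y\perp \mathbf{X}\mid \boldsymbol{\beta }^{T}\mathbf{X} \):

\[
\mathrm{E}(\mathbf{X}Y) =\mathrm{E}\{\mathrm{E}(\mathbf{X}Y \mid \boldsymbol{\beta }^{T}\mathbf{X})\} =\mathrm{E}\{\mathrm{E}(\mathbf{X}\mid \boldsymbol{\beta }^{T}\mathbf{X})\,\mathrm{E}(Y \mid \boldsymbol{\beta }^{T}\mathbf{X})\} =\mathrm{E}\{\mathrm{E}(\mathbf{X}\mid \boldsymbol{\beta }^{T}\mathbf{X})\,Y \}.
\]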