An Anscombe type robust regression statistic. (English) Zbl 0900.62166

Summary: A new robust regression estimator is proposed. Its use involves sampling elemental sets in a scheme very similar to Rousseeuw’s least median of squares (LMS). Since such a statistic is constructed from regression residuals, the problem reduces to parameter estimation in a one-dimensional sample in the presence of outliers. Our proposal relies on the work of Anscombe; it applies the ideas of insurance premium and protection to outlier identification, with the addition of Rosner’s backward elimination. The new estimator may represent a modest improvement over methods such as the LMS: it appears able to solve marginal cases that have so far resisted solution, and it requires the extraction of fewer elemental sets to reach a reasonable likelihood of success. The proposed method, however, is not uniformly preferable to the LMS, and it should complement the latter rather than replace it.
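
The elemental-set sampling that the summary refers to can be sketched as follows. This is a minimal illustration of the plain LMS scheme for a straight-line fit, not the Anscombe-type estimator proposed in the paper, whose details the summary does not give. The function names `subsets_needed` and `lms_line`, the default success probability of 0.99, and the fixed seed are assumptions made for this example. `subsets_needed` uses the standard argument that a random elemental set of size p is outlier-free with probability (1 - eps)^p when a fraction eps of the data is contaminated.

```python
import math
import random
from statistics import median


def subsets_needed(p, eps, prob=0.99):
    """Number of random elemental sets of size p needed so that, with
    probability `prob`, at least one set is free of outliers when a
    fraction `eps` of the observations is contaminated."""
    clean = (1.0 - eps) ** p  # chance that one elemental set is clean
    if clean >= 1.0:
        return 1  # no contamination: a single set suffices
    return math.ceil(math.log(1.0 - prob) / math.log(1.0 - clean))


def lms_line(xs, ys, n_sets, seed=0):
    """Approximate LMS fit of y = a + b*x by elemental-set sampling.

    Each elemental set is a pair of points; the line through the pair is
    scored by the median of the squared residuals over all observations.
    Returns (criterion, intercept, slope) for the best line found, or
    None if every sampled pair was degenerate (equal x values).
    """
    rng = random.Random(seed)  # assumed fixed seed, for reproducibility
    n = len(xs)
    best = None
    for _ in range(n_sets):
        i, j = rng.sample(range(n), 2)
        if xs[i] == xs[j]:
            continue  # vertical line: skip this elemental set
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        crit = median((y - intercept - slope * x) ** 2
                      for x, y in zip(xs, ys))
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    return best
```

For example, calling `lms_line(xs, ys, n_sets=subsets_needed(2, 0.3))` draws just enough pairs to expect, with probability 0.99, at least one outlier-free pair under 30% contamination. A method that succeeds with fewer elemental sets, as the summary claims for the new estimator in some cases, would lower this count.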

MSC:

62F35 Robustness and adaptive procedures (parametric inference)
62J99 Linear inference, regression

References:

[1] Anscombe, F.J.: Rejection of outliers. Technometrics 2, 123-147 (1960) · Zbl 0091.14806
[2] Bradu, D.; Hawkins, D.M.: Location of outliers in two-way tables using tetrads. Technometrics 24, 103-108 (1982)
[3] Bradu, D.; Hawkins, D.M.: Sample size requirements for multiple outlier location techniques based on elemental sets. Comput. statist. Data anal. 16, 257-270 (1993) · Zbl 0937.62536
[4] Davies, L.; Gather, U.: The identification of multiple outliers. J. amer. Statist. assoc. 88, No. 423, 782-801 (1993) · Zbl 0797.62025
[5] Donoho, D.L.; Huber, P.J.: The notion of breakdown point. A Festschrift for Erich Lehmann (1983) · Zbl 0523.62032
[6] Draper, N.R.; Smith, H.: Applied regression analysis. (1966) · Zbl 0158.17101
[7] Hawkins, D.M.: The accuracy of elemental sets approximations for regression. J. amer. Statist. assoc. 88, 580-589 (1993)
[8] Hawkins, D.M.: The feasible set algorithm for least median of squares regression. Comput. statist. Data anal. 16, No. 1, 81-101 (1993) · Zbl 0875.62305
[9] Hawkins, D.M.; Bradu, D.; Kass, G.V.: Location of several outliers in multiple regression data using elemental sets. Technometrics 26, 197-208 (1984)
[10] Peña, D.; Yohai, V.J.: The detection of influential subsets in linear regression using an influence matrix. (1991)
[11] Portnoy, S.: Using regression fractiles to identify outliers. Statistical data analysis based on the L1-norm and related methods (1987)
[12] Rosner, B.: Percentage points for a generalized ESD many outlier procedure. Technometrics 25, No. 2, 165-172 (1983) · Zbl 0536.62030
[13] Rousseeuw, P.J.: Least median of squares regression. J. amer. Statist. assoc. 79, 871-880 (1984) · Zbl 0547.62046
[14] Rousseeuw, P.J.; Bassett, G.J.: Robustness of the p-subset algorithm for regression with high breakdown point. Directions in robust statistics and diagnostics, part II, the IMA volumes in mathematics and its applications, 185-194 (1991) · Zbl 0738.62071
[15] Rousseeuw, P.J.; Leroy, A.M.: Robust regression and outlier detection. (1987) · Zbl 0711.62030
[16] Rousseeuw, P.J.; Van Zomeren, B.C.: Unmasking multivariate outliers and leverage points. J. amer. Statist. assoc. 85, 633-651 (1990)
[17] Simonoff, J.S.: General approaches to stepwise identification of unusual values in data analysis. Directions in robust statistics and diagnostics, part II, the IMA volumes in mathematics and its applications, 233-242 (1991)
[18] Stromberg, A.J.: Computing the exact least median of squares estimate and stability diagnostics in multiple linear regression. SIAM J. Sci. comput. 14 (1993) · Zbl 0788.65144