An easy way to increase the finite-sample efficiency of the resampled minimum volume ellipsoid estimator. (English) Zbl 0900.62278

Summary: In robust analysis, the minimum volume ellipsoid (MVE) estimator is often used to estimate both multivariate location and scatter. The MVE scatter estimator is defined by the smallest ellipsoid covering half of the observations, while the MVE location estimator is the center of that ellipsoid. The MVE estimators can be computed by minimizing a certain criterion over a high-dimensional space. In practice, one mostly uses algorithms that minimize the objective function over a sequence of trial estimates. One of these algorithms uses a resampling scheme and yields the (p+1)-subset estimator. In this note, we show how this estimator can easily be adapted, yielding a considerable increase in statistical efficiency at finite samples. This gain in precision is also observed when sampling from contaminated distributions, and it becomes larger as the dimension increases. The key idea is to average over several trials close to the optimum, instead of just picking the trial with the lowest value of the objective function. The adaptation needs no additional computation time, and only a few lines have to be added to existing computer programs. The resulting estimator keeps the equivariance and robustness properties of the original MVE estimator. This idea can also be applied to several other robust estimators, including least-trimmed-squares regression.
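
The resampling scheme and the averaging idea described in the summary can be illustrated with a minimal Python sketch. It is not the authors' code and not the AS 282 routine: the function name `mve_resampling`, the parameters `n_trials` and `n_average`, the coverage count `h = (n + p + 1) // 2`, and the use of a plain arithmetic mean over the best trials are all assumptions made for illustration, and no consistency or small-sample correction factor is applied to the scatter matrix.

```python
import numpy as np

def mve_resampling(X, n_trials=500, n_average=1, seed=0):
    """Approximate the MVE by (p+1)-subset resampling (illustrative sketch).

    n_average=1 keeps only the trial with the smallest objective value;
    n_average>1 averages the n_average best trials (assumed averaging rule).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = (n + p + 1) // 2  # assumed coverage: roughly half of the observations
    trials = []
    for _ in range(n_trials):
        # Draw p+1 observations and use their mean and covariance as a trial.
        idx = rng.choice(n, size=p + 1, replace=False)
        mean = X[idx].mean(axis=0)
        cov = np.atleast_2d(np.cov(X[idx], rowvar=False))
        sign, logdet = np.linalg.slogdet(cov)
        if sign <= 0:
            continue  # degenerate subset, skip it
        diff = X - mean
        d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        m2 = np.partition(d2, h - 1)[h - 1]  # h-th smallest squared distance
        if m2 <= 0:
            continue
        # Inflate the trial ellipsoid until it covers h points; its squared
        # volume is proportional to det(m2 * cov) = m2**p * det(cov).
        objective = p * np.log(m2) + logdet
        trials.append((objective, mean, m2 * cov))
    trials.sort(key=lambda t: t[0])
    best = trials[:n_average]
    location = np.mean([t[1] for t in best], axis=0)
    scatter = np.mean([t[2] for t in best], axis=0)
    return location, scatter

# Example: 160 clean points plus 40 outliers shifted to (10, 10).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(160, 2)),
               rng.normal(loc=10.0, size=(40, 2))])
loc_best, scat_best = mve_resampling(X, n_trials=300, n_average=1)
loc_avg, scat_avg = mve_resampling(X, n_trials=300, n_average=10)
```

Both calls evaluate the same trials, so the averaged variant differs from the single-best variant only in the last few lines, which matches the summary's remark that only a few lines need to be added. Because each trial is affine equivariant, their arithmetic mean is as well.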

MSC:

62H12 Estimation in multivariate analysis
62F35 Robustness and adaptive procedures (parametric inference)
65C99 Probabilistic methods, stochastic differential equations

Software:

AS 282
Full Text: DOI

References:

[1] Beirlant, J.; Mason, D. M.; Vynckier, C.: Goodness-of-fit tests for multivariate normality based on generalised quantiles. Internal report (1996) · Zbl 1061.62532
[2] Cook, R. D.; Hawkins, D. M.; Weisberg, S.: Exact iterative computation of the robust multivariate minimum volume ellipsoid estimator. Statist. and probab. Lett. 16, 213-218 (1993)
[3] Croux, C.; Rousseeuw, P. J.: A class of high-breakdown scale estimators based on subranges. Comm. statist. Theory methods 21, 1935-1951 (1992) · Zbl 0774.62035
[4] Croux, C.; Rousseeuw, P. J.; Van Bael, A.: Robust regression by minimizing nested scale estimators. J. statist. Plann. inference 53, 197-235 (1996) · Zbl 0854.62027
[5] Davies, P. L.: Asymptotic behavior of S-estimates of multivariate location parameters and dispersion matrices. Ann. statist. 15, 1269-1292 (1987) · Zbl 0645.62057
[6] Davies, P. L.: The asymptotics of Rousseeuw's minimum volume ellipsoid estimator. Ann. statist. 20, 1828-1843 (1992) · Zbl 0764.62046
[7] Donoho, D. L.; Huber, P. J.: The notion of breakdown point. A Festschrift for Erich L. Lehmann, 157-184 (1983)
[8] Einmahl, J. H. J.; Mason, D. M.: Generalized quantile processes. Ann. statist. 20, 1062-1078 (1992) · Zbl 0757.60012
[9] Hampel, F. R.; Ronchetti, E. M.; Rousseeuw, P. J.; Stahel, W. A.: Robust statistics: the approach based on influence functions. (1986) · Zbl 0593.62027
[10] Hawkins, D. M.: A feasible solution algorithm for the minimum volume ellipsoid estimator in multivariate data. Comput. statist. 8, 95-107 (1993)
[11] Hawkins, D. M.: A feasible solution algorithm for the minimum covariance determinant estimator. Comput. statist. Data anal. 17, 197-210 (1994) · Zbl 0937.62595
[12] Hawkins, D. M.; Simonoff, J. S.: AS 282: high breakdown regression and multivariate estimation. Appl. statist. 42, 423-432 (1993)
[13] Hössjer, O.: Exact computation of the least-trimmed-squares estimate in simple linear regression. Comput. statist. Data anal. 19, 265-268 (1995)
[14] Lopuhaä, H. P.: Multivariate \({\tau}\)-estimators for location and scatter. Canad. J. Statist. 19, 307-321 (1991) · Zbl 0746.62034
[15] Lopuhaä, H. P.; Rousseeuw, P. J.: Breakdown points of affine equivariant estimators of multivariate location and covariance matrices. Ann. statist. 19, 229-248 (1991) · Zbl 0733.62058
[16] Maronna, R. A.; Yohai, V. J.: The behavior of the Stahel-Donoho robust multivariate estimator. J. amer. Statist. assoc. 90, 330-341 (1995) · Zbl 0820.62050
[17] Maronna, R. A.; Stahel, W. A.; Yohai, V. J.: Bias-robust estimators of multivariate scatter based on projections. J. multivariate anal. 42, 141-161 (1992) · Zbl 0777.62057
[18] Rousseeuw, P. J.: Multivariate estimation with high breakdown point. Mathematical statistics and applications, 283-297 (1985) · Zbl 0609.62054
[19] Rousseeuw, P. J.; Bassett, G. W.: Robustness of the p-subset algorithm for regression with high breakdown point. Directions in robust statistics and diagnostics, part II, 185-194 (1991) · Zbl 0738.62071
[20] Rousseeuw, P. J.; Leroy, A. M.: Robust regression and outlier detection. (1987) · Zbl 0711.62030
[21] Rousseeuw, P. J.; Van Zomeren, B. C.: Unmasking multivariate outliers and leverage points. J. amer. Statist. assoc. 85, 633-639 (1990)
[22] Ruppert, D.: Computing S-estimators for regression and multivariate location/dispersion. J. comput. Graph. statist. 1, 253-270 (1992)
[23] Tyler, D. E.: A distribution-free M-estimator of multivariate scatter. Ann. statist. 15, 234-251 (1987) · Zbl 0628.62053
[24] Woodruff, D. L.; Rocke, D. M.: Heuristic search algorithms for the minimum volume ellipsoid. J. comput. Graph. statist. 2, 69-95 (1993)
[25] Woodruff, D. L.; Rocke, D. M.: Computable robust estimation of multivariate location and shape in high dimension using compound estimators. J. amer. Statist. assoc. 89, 888-896 (1994) · Zbl 0825.62485