370
Combining the EID and FSA for Computing the Minimum Volume Ellipsoid
Steven Grambow and Arnold Stromberg
Abstract:
The minimum volume ellipsoid (MVE) is an important robust estimator of location and covariance structure in multivariate data analysis. This estimator is defined to be the ellipsoid of minimum volume covering at least h points of the data set (Rousseeuw and Leroy, 1987). Here h is taken to be [(n+p+1) /2] where [.] is the greatest integer function in order for the MVE to have the maximum breakdown point. This paper addresses the problem of subset selection for the MVE. There are a variety of algorithms currently available due to Rousseeuw and Leroy (1987), Woodruff and Rocke (1993), Hawkins (1993), and most recently Poston, et al. (1997). Discussion will focus on the feasible solution algorithm (FSA) by Hawkins and the effective independence distribution (EID) algorithm by Poston, et al. Although the FSA is computationally expensive and is not deterministic, it often finds the exact MVE. The EID is fast and deterministic but rarely finds the exact MVE. By combining features of the FSA and EID we propose an algorithm which is reasonably fast, deterministic and has a good chance of finding the exact MVE.
For the salinity data (Rousseeuw and Leroy, 1987), Hawkins points out that the FSA approach would need approximately 5000 random starts to locate the MVE reliably, requiring approximately 7092 seconds of computation time on an HP C- 110 workstation. Poston, et al. note that the EID method finds a solution in approximately 0.02 seconds with only 8.4\% error in the actual volume of the MVE. The authors' combined algorithm is deterministic and locates the exact MVE in approximately 5.9 seconds.
Key Words: Minimum covariance determinant, Robust estimators, Subset selection.
Here is the full postscript text for this
technical report. It is 139310 bytes long.