Abstract
Several approaches can be made to the choice of bandwidth in the kernel smoothing of distribution functions. Recent proposals by Sarda (1993) and by Altman & Leger (1995) are analogues of the 'leave-one-out' and 'plug-in' methods which have been widely used in density estimation. In contrast, a method of crossvalidation appropriate to the smoothing of distribution functions is proposed. Selection of the bandwidth parameter is based on unbiased estimation of a mean integrated squared error curve whose minimising value defines an optimal smoothing parameter. This procedure is shown to lead to asymptotically optimal bandwidth choice, not just in the usual first-order sense but also in the second-order sense in which kernel methods improve on the standard empirical distribution function. Some general theory on the performance of optimal, data-based methods of bandwidth choice is also provided, leading to results which do not have analogues in the context of density estimation. The numerical performances of all the methods discussed in the paper are compared. A bandwidth based on a simple reference distribution is also included. Simulations suggest that the crossvalidatory proposal works well, although the simple reference bandwidth is also quite effective.
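The abstract does not state the crossvalidation criterion or the reference bandwidth explicitly, so the following is only an illustrative sketch under stated assumptions. It assumes a normal kernel, and it assumes a criterion of the form CV(h) = n^{-1} Σ_i ∫ {I(x ≥ X_i) − F̃_{−i}(x; h)}² dx, where F̃_{−i} is the kernel estimate computed without the i-th observation. The reference bandwidth uses the constant 4^{1/3} ≈ 1.587, which is what the usual asymptotic expansion gives for a normal kernel and a normal reference distribution; it is not quoted from the abstract. All function names, the candidate bandwidths and the grid size are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def norm_cdf(z):
    """Standard normal distribution function, used here as the integrated kernel."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kernel_cdf(x, data, h):
    """Kernel estimate of F(x): average of integrated kernels centred at the data."""
    return sum(norm_cdf((x - xi) / h) for xi in data) / len(data)


def cv_score(h, data, grid):
    """Assumed criterion: CV(h) = (1/n) sum_i int (1{x >= X_i} - F_{-i}(x))^2 dx.

    F_{-i} is the kernel estimate computed without observation i. The integral
    is approximated by a Riemann sum over an equally spaced grid.
    """
    n = len(data)
    dx = grid[1] - grid[0]
    total = 0.0
    for x in grid:
        k = [norm_cdf((x - xi) / h) for xi in data]
        s = sum(k)
        for xi, ki in zip(data, k):
            loo = (s - ki) / (n - 1)        # leave-one-out estimate at x
            ind = 1.0 if x >= xi else 0.0   # indicator 1{x >= X_i}
            total += (ind - loo) ** 2
    return total * dx / n


def select_bandwidth(data, bandwidths, grid_size=400):
    """Return the candidate bandwidth with the smallest crossvalidation score."""
    pad = 4.0 * max(bandwidths)             # margin so the integrand is near 0 at both ends
    lo, hi = min(data) - pad, max(data) + pad
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + j * step for j in range(grid_size)]
    return min(bandwidths, key=lambda h: cv_score(h, data, grid))


def reference_bandwidth(data):
    """Normal-reference bandwidth for a normal kernel (assumed constant 4^(1/3))."""
    n = len(data)
    return 4.0 ** (1.0 / 3.0) * statistics.stdev(data) * n ** (-1.0 / 3.0)


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(50)]
candidates = [0.1 * j for j in range(1, 16)]    # hypothetical search grid, 0.1 to 1.5
h_cv = select_bandwidth(sample, candidates)
h_ref = reference_bandwidth(sample)
print(f"crossvalidation bandwidth: {h_cv:.2f}, reference bandwidth: {h_ref:.2f}")
```

The grid search over a fixed list of candidates keeps the sketch short; a continuous optimiser could replace it. The same integration grid is used for every candidate so that the scores are comparable across bandwidths.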
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 799-808 |
Number of pages | 10 |
Journal | Biometrika |
Volume | 85 |
Issue number | 4 |
Publication status | Published - 1998 |
Externally published | Yes |
Keywords
- Crossvalidation
- Empirical distribution function
- Kernel
- Smoothing