Abstract
This paper considers nonparametric estimation of age- and time-specific trends in disease incidence using serial prevalence data collected from multiple cross-sectional samples of a population over time. The methodology accounts for differential selection of diseased and undiseased individuals resulting, for example, from differences in mortality. It is shown that when a log-linear incidence odds model is adopted, an EM algorithm provides a convenient method for carrying out maximum likelihood estimation, primarily using existing generalized linear models software. The procedure is quite general, allowing a range of age-time incidence models to be fitted under the same framework. Furthermore, by making use of existing software for fitting generalized additive models, the procedure can be generalized with virtually no extra complexity to allow maximization of a penalized likelihood for smooth nonparametric estimation. Automatic choice of smoothing level for the penalized likelihood estimates is discussed, using generalized cross-validation. The method is applied to a data set on serial toxoplasmosis prevalence, which has previously been analyzed under the assumption of nondifferential selection. A variety of age-time incidence models are fitted, and the sensitivity to plausible differential selection patterns is considered. It is found that nonmultiplicative models are unnecessary and that qualitative incidence trends are fairly robust to differential selection.
Original language | English |
---|---|
Pages (from-to) | 1384-1398 |
Number of pages | 15 |
Journal | Biometrics |
Volume | 53 |
Issue number | 4 |
Publication status | Published - Dec 1997 |
Externally published | Yes |
Keywords
- Cross-sectional data
- Differential selection
- EM algorithm
- Generalized additive models
- Incidence and prevalence
- Penalized likelihood