Statistical modeling of Southern Ocean marine diatom proxy and winter sea ice data: model comparison and developments

Alexander J. Ferry*, Tania Prvan, Brian Jersky, Xavier Crosta, Leanne K. Armand

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)


We compare the performance of the modern analog technique (MAT), the Imbrie and Kipp transfer function (IKTF), the generalized additive model (GAM) and weighted averaging partial least squares (WA PLS) on a Southern Hemisphere diatom relative abundance and winter sea ice concentration training data set. All relevant model assumptions were tested with a random 10-fold cross-validation, whilst a hold-out cross-validation tested the explanatory power of each model on spatially independent validation data. We used autocorrelograms on model residuals, variance partitioning, and principal coordinates analysis of neighbor matrices (PCNM) to investigate the importance of the spatial structure of our training database. A set of hierarchical logistic regression models (or Huisman-Olff-Fresco models) was used to infer the response of each diatom species along the winter sea ice gradient. Our analyses suggest that IKTF is an inappropriate sea ice estimation approach, as its underlying statistical assumptions do not hold and the fit of IKTF to our data under cross-validation was poor. We conclude that MAT may be biased by spatial autocorrelation and, together with IKTF, fails to provide unbiased estimates of winter sea ice. We find GAM and WA PLS are more appropriate than IKTF and MAT for the estimation of paleo winter sea ice cover throughout the Southern Ocean. However, as WA PLS is based on a unimodal species response, which is rarely exhibited by diatoms along the winter sea ice gradient, we ultimately advocate the application of GAM. GAM uses only diatoms with a statistically significant association with, and an ecologically based link to, sea ice. GAM outperformed all other models under cross-validation in terms of performance statistics, the fit of GAM to the training data set and diagnostic tests for model assumptions.
We also demonstrate that GAM provides a more detailed and potentially more accurate (based on a comparison with New Zealand and southeast Australian paleo climatic records) paleo winter sea ice record for the southwestern Pacific Ocean in comparison with IKTF, MAT and WA PLS.

Original language: English
Pages (from-to): 100-112
Number of pages: 13
Journal: Progress in Oceanography
Publication status: Published - 1 Feb 2015

