TY - JOUR
T1 - Spatially-explicit modelling with support of hyperspectral data can improve prediction of plant traits
AU - Rocha, Alby D.
AU - Groen, Thomas A.
AU - Skidmore, Andrew K.
PY - 2019/9/15
Y1 - 2019/9/15
AB - Data from remote sensing with finer spectral and spatial resolution are increasingly available. While this allows more accurate prediction of plant traits at different spatial scales, it raises concerns about a lack of independence between observations. Hyperspectral wavelengths are serially correlated, provoking multicollinearity among the predictors. As the collection of ground reference points for validation remains time-consuming and difficult in many environments, empirical models are trained with a limited number of observations compared to the number of wavelengths. Moreover, any set of observations collected from a continuous surface is also likely to be spatially autocorrelated. Machine learning regression facilitates the task of selecting the most informative wavelengths and then transforming them into latent variables to avoid the problem of multicollinearity. However, these regression methods do not solve the problem of spatial autocorrelation in the model residuals. In this study we show that, when significant spatial autocorrelation is observed, models that explicitly deal with spatial information and use a spectral index as a covariate exhibit higher prediction accuracy than machine learning regressions do. However, for these models to work, the number of (hyperspectral) bands included in the models has to be drastically reduced, and the model cannot be directly extrapolated to a new (unobserved) location in another area. We conclude that quantifying spatial autocorrelation a priori in the data can help in deciding whether the spatial and the spectral dimensions should be modelled together or not.
KW - INLA
KW - Machine learning
KW - Spatially explicit models
KW - Data simulation
KW - Radiative transfer models
KW - Spatial autocorrelation
KW - Plant traits
UR - http://www.scopus.com/inward/record.url?scp=85066246279&partnerID=8YFLogxK
U2 - 10.1016/j.rse.2019.05.019
DO - 10.1016/j.rse.2019.05.019
M3 - Article
AN - SCOPUS:85066246279
SN - 0034-4257
VL - 231
SP - 1
EP - 13
JO - Remote Sensing of Environment
JF - Remote Sensing of Environment
M1 - 111200
ER -