Application of feature selection methods and machine learning algorithms for saltmarsh biomass estimation using Worldview-2 imagery

Sikdar M. M. Rasel, Hsing Chung Chang, Timothy J. Ralph, Neil Saintilan, Israt Jahan Diti

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Assessing large-scale plant productivity of coastal marshes is essential for understanding the resilience of these systems to climate change. Two machine learning approaches, random forest (RF) and support vector machine (SVM) regression, were tested to estimate the biomass of a common saltmarsh species, salt couch grass (Sporobolus virginicus). Reflectance and vegetation indices derived from the 8 bands of Worldview-2 multispectral data were used in four experiments to develop the biomass model: Experiment-1 used the 8 bands of the Worldview-2 image; Experiment-2 used Normalized Difference Vegetation Index (NDVI)-type vegetation indices built from all possible combinations of the Worldview-2 bands; Experiment-3 combined the bands and the vegetation indices; and Experiment-4 used the variables selected from Experiment-3 by variable selection methods. The main objectives of this study are (i) to recommend an affordable, low-cost data source for predicting the biomass of a common saltmarsh species, (ii) to suggest a variable selection method suitable for multispectral data, and (iii) to assess the performance of RF and SVM in the biomass prediction model. Cross-validation of the parameter optimization for SVM showed that the optimized parameters of ε-SVR failed to provide a reliable prediction; hence, ν-SVR was used for the SVM model. Among the different variable selection methods, recursive feature elimination (RFE) selected the smallest number of variables (only 4), with an RMSE of 0.211 kg/m². Experiment-4 (only selected bands) provided the best results for both machine learning regression methods, RF (R² = 0.72, RMSE = 0.166 kg/m²) and SVR (R² = 0.66, RMSE = 0.200 kg/m²). When a 10-fold cross-validation of the RF model was compared with a 10-fold cross-validation of SVR, a significant difference (p < 0.0001) was observed for RMSE. One-to-one comparisons of actual and predicted biomass showed that RF underestimates high biomass values, whereas SVR overestimates them; this suggests a need for further investigation and refinement.
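As a rough illustration of the NDVI-type indices and the RMSE metric mentioned in the abstract, the following minimal sketch builds one normalized-difference index per band pair and scores a set of predictions. It is not the authors' code. The band names follow the standard WorldView-2 band order; the function names, the choice of unordered pairs (28 indices from 8 bands), the zero-denominator handling, and the example reflectance values are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

# WorldView-2 multispectral bands, ordered from short to long wavelength.
WV2_BANDS = ["coastal", "blue", "green", "yellow", "red", "red_edge", "nir1", "nir2"]


def normalized_difference(a: float, b: float) -> float:
    """NDVI-type index (a - b) / (a + b); returns 0.0 if the denominator is 0."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi_type_indices(reflectance: dict[str, float]) -> dict[str, float]:
    """One index per unordered band pair, longer-wavelength band first (assumed)."""
    indices = {}
    for short, long_ in combinations(WV2_BANDS, 2):
        name = f"nd_{long_}_{short}"
        indices[name] = normalized_difference(reflectance[long_], reflectance[short])
    return indices


def rmse(observed: list[float], predicted: list[float]) -> float:
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical reflectance values for a single pixel (not data from the study).
pixel = {
    "coastal": 0.03, "blue": 0.04, "green": 0.07, "yellow": 0.06,
    "red": 0.05, "red_edge": 0.20, "nir1": 0.40, "nir2": 0.42,
}
indices = ndvi_type_indices(pixel)
print(len(indices), "indices; classic NDVI =", round(indices["nd_nir1_red"], 3))
```

Under this pairing assumption, `nd_nir1_red` corresponds to the classic NDVI, and indices like these would be the candidate variables passed to a selection step such as RFE before fitting RF or ν-SVR.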

Language: English
Number of pages: 25
Journal: Geocarto International
DOI: 10.1080/10106049.2019.1624988
Publication status: E-pub ahead of print - 11 Jun 2019

Keywords

  • salt couch
  • spectral band
  • variable selection
  • vegetation indices
  • Worldview-2

Cite this

@article{fd339e687d0a43d490ba964269fa69e1,
title = "Application of feature selection methods and machine learning algorithms for saltmarsh biomass estimation using Worldview-2 imagery",
keywords = "salt couch, spectral band, variable selection, vegetation indices, Worldview-2",
author = "Rasel, {Sikdar M. M.} and Chang, {Hsing Chung} and Ralph, {Timothy J.} and Saintilan, Neil and Diti, {Israt Jahan}",
year = "2019",
month = "6",
day = "11",
doi = "10.1080/10106049.2019.1624988",
language = "English",
journal = "Geocarto International",
issn = "1010-6049",
publisher = "Taylor \& Francis",
}

Application of feature selection methods and machine learning algorithms for saltmarsh biomass estimation using Worldview-2 imagery. / Rasel, Sikdar M. M.; Chang, Hsing Chung; Ralph, Timothy J.; Saintilan, Neil; Diti, Israt Jahan.

In: Geocarto International, 11.06.2019.

TY - JOUR

T1 - Application of feature selection methods and machine learning algorithms for saltmarsh biomass estimation using Worldview-2 imagery

AU - Rasel, Sikdar M. M.

AU - Chang, Hsing Chung

AU - Ralph, Timothy J.

AU - Saintilan, Neil

AU - Diti, Israt Jahan

PY - 2019/6/11

Y1 - 2019/6/11

KW - salt couch

KW - spectral band

KW - variable selection

KW - vegetation indices

KW - Worldview-2

UR - http://www.scopus.com/inward/record.url?scp=85067594457&partnerID=8YFLogxK

U2 - 10.1080/10106049.2019.1624988

DO - 10.1080/10106049.2019.1624988

M3 - Article

JO - Geocarto International

T2 - Geocarto International

JF - Geocarto International

SN - 1010-6049

ER -