Background: Semi-automating the analysis of accelerometry data makes it possible to synthesize large data sets. However, when constructing activity budgets from accelerometry data, there are many ways to extract, analyse and report data and results. Machine learning, for instance, is a robust approach to classifying data. We used a new method, super learning, that combines base learners (different machine learning methods) in an optimal manner to achieve overall improved accuracy. Other facets of the approach include the number of behavioural categories to predict, the epoch length (sample window size) used to split data for training and testing, and the parameters on which to train the models.
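As a rough illustration of what combining base learners "in an optimal manner" can mean, the sketch below picks non-negative weights that sum to one and minimise the squared error of the blended out-of-fold predictions. It is not the authors' implementation. The binary behaviour label, the probability values, the coarse weight grid (standing in for a proper constrained optimiser) and all function names are assumptions made for the example.

```python
from itertools import product


def convex_weight_grid(n_learners, steps=10):
    """Yield weight vectors on a grid that are non-negative and sum to 1."""
    for combo in product(range(steps + 1), repeat=n_learners):
        if sum(combo) == steps:
            yield [c / steps for c in combo]


def ensemble_loss(weights, oof_probs, labels):
    """Mean squared error of the weighted average of out-of-fold probabilities."""
    total = 0.0
    for i, y in enumerate(labels):
        blended = sum(w * probs[i] for w, probs in zip(weights, oof_probs))
        total += (blended - y) ** 2
    return total / len(labels)


def fit_super_learner_weights(oof_probs, labels, steps=10):
    """Return the grid point with the lowest cross-validated loss."""
    return min(
        convex_weight_grid(len(oof_probs), steps),
        key=lambda w: ensemble_loss(w, oof_probs, labels),
    )


# Made-up example: 1 = behaviour present, 0 = absent (hypothetical labels).
labels = [1, 0, 1, 1, 0, 0]

# Made-up out-of-fold predicted probabilities, one list per base learner.
oof_probs = [
    [0.9, 0.2, 0.6, 0.7, 0.4, 0.1],  # hypothetical base learner A
    [0.6, 0.5, 0.9, 0.4, 0.2, 0.3],  # hypothetical base learner B
]

weights = fit_super_learner_weights(oof_probs, labels)
blended_loss = ensemble_loss(weights, oof_probs, labels)
```

Because the grid includes the corner points that put all weight on a single learner, the blended loss on these out-of-fold predictions can never be worse than that of the best individual base learner.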
Results: The super learner classified behaviour categories with higher accuracy and lower variance than the comparative models. For all models tested, using four behaviour categories rather than six achieved higher accuracy. Epoch length also affected accuracy, with shorter epochs (7 and 13) performing better than longer epochs (25 and 75).
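To make the role of epoch length concrete, the following minimal sketch splits a one-dimensional signal into windows of 7, 13, 25 and 75 samples and summarises each window. The non-overlapping windows, the choice of mean and standard deviation as features, the synthetic signal and the function name are all assumptions for illustration; the abstract does not specify how the windows or features were built.

```python
from statistics import mean, pstdev


def epoch_features(signal, epoch_len):
    """Split a signal into non-overlapping epochs and summarise each one.

    Trailing samples that do not fill a complete epoch are dropped.
    """
    features = []
    for start in range(0, len(signal) - epoch_len + 1, epoch_len):
        window = signal[start:start + epoch_len]
        features.append({"mean": mean(window), "sd": pstdev(window)})
    return features


# Synthetic stand-in for one accelerometer axis (not real data).
signal = [((i * 37) % 11) / 10 for i in range(150)]

# Number of training rows produced by each epoch length.
rows_per_epoch_len = {
    epoch_len: len(epoch_features(signal, epoch_len))
    for epoch_len in (7, 13, 25, 75)
}
```

Shorter epochs yield more, noisier windows from the same recording, while longer epochs yield fewer windows that are more likely to span a change in behaviour.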
Conclusions: Correct model selection, training and testing are imperative to creating reliable and valid classification models. To do so means model fitting must use a wide array of selection criteria. We evaluated a number of these including model, number of behaviours to classify and epoch length and then used a parameter grid search to implement the models. We found that all criteria tested contributed to the models' overall accuracies. Fewer behaviour categories and shorter epoch length improved the performance of all models tested. The super learner classified behaviours with higher accuracy and lower variance than other models tested. However, when using this model, users need to consider the additional human and computational time required for implementation. Machine learning is a powerful method for classifying the behaviour of animals from accelerometers. Care and consideration of the modelling parameters evaluated in this study are essential when using this type of statistical analysis.
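The selection criteria named above (model, number of behaviours and epoch length) can be enumerated as a grid, as in the sketch below. The model labels, the `grid_search` helper and the scoring function are hypothetical placeholders. In practice the score would be a cross-validated accuracy computed from labelled accelerometry data, which this example does not attempt.

```python
from itertools import product


def grid_search(grid, evaluate):
    """Score every combination in the grid and return results, best first."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        results.append((evaluate(config), config))
    # Sort on the score only, so the config dicts are never compared.
    results.sort(key=lambda item: item[0], reverse=True)
    return results


# Criteria taken from the abstract; the model names are hypothetical labels.
grid = {
    "model": ["base_learner_a", "base_learner_b", "super_learner"],
    "n_behaviours": [4, 6],
    "epoch_len": [7, 13, 25, 75],
}


def placeholder_score(config):
    """Stand-in only: replace with a real cross-validated accuracy."""
    return 0.0


results = grid_search(grid, placeholder_score)
n_configs = len(results)
```

Each added criterion multiplies the number of model fits, which is one source of the additional computational time noted above.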
Bibliographical note: Copyright the Author(s) 2017. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
- behavioural classification
- machine learning
- marine mammal
- super learner