Leveraging machine learning approaches to decode hive sounds for stress prediction

Saba Mustafa, Mahsa Mohaghegh*, Iman Ardekani, Abdolhossein Sarrafzadeh

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Beekeeping plays a vital role in preserving ecosystems through pollination and increasing biodiversity. Effective monitoring of honeybee health and hive conditions is essential to balance bee populations and their environment. This study addresses the challenges of data scarcity and generalization in beehive health monitoring by introducing a semi-supervised learning model that employs a Transformer-based encoder-classifier for acoustic analysis of hive sounds. This research demonstrates the application of a Transformer-based architecture specifically tailored for bee bioacoustics and stress detection, integrating advanced feature extraction and fine-tuning techniques. The main objective is to identify stress-related indicators from audio data collected via smart beehives. The proposed method utilizes a dataset of 5,336 labeled audio clips from diverse sources, including the NU-hive project and YouTube audio, to aid the learning process and enhance the classification accuracy for both labeled and unlabeled data. The audio features used in the analysis include Mel-frequency cepstral coefficients (MFCCs) and their delta and delta-delta variants, root mean square (RMS) energy, spectral centroid, and dominant frequency from the Short-Time Fourier Transform (STFT). The Transformer-based encoder-classifier is implemented to classify bee behavior within the hive as Normal, NoQueen, or Swarm, and to distinguish stressed from not stressed states. Evaluations indicate that the semi-supervised Transformer encoder-classifier achieves 99% accuracy on labeled data, with precision and recall values of 0.99 or higher for the Normal and NoQueen classes, and 0.96 for the Swarm class. Cluster validation produced a silhouette score of 0.47 and a Davies-Bouldin index of 0.57, indicating moderate cluster separability and compactness. The model was able to pseudo-label 94.7% of unlabeled data, validated against the nearest labeled neighbors. These results show the effectiveness of AI-driven beehive monitoring in supporting sustainable beekeeping practices and ecosystem conservation efforts.
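As a rough illustration of three of the frame-level features named in the abstract (RMS energy, spectral centroid, and dominant frequency from one STFT frame), the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the abstract does not state frame length, hop size, windowing, or tooling, so the Hann window, the 8 kHz sample rate, the 256-sample frame, the synthetic 250 Hz test tone, and the function name `frame_features` are all assumptions made for the example. MFCCs and their deltas are omitted because they need a Mel filterbank that would normally come from an audio library.

```python
import cmath
import math


def frame_features(frame, sample_rate):
    """Return (rms, spectral_centroid_hz, dominant_frequency_hz) for one frame.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(frame)

    # RMS energy: square root of the mean squared amplitude.
    rms = math.sqrt(sum(x * x for x in frame) / n)

    # Hann window (an assumed choice) to reduce spectral leakage.
    windowed = [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]

    # Naive DFT over the non-negative frequency bins: one STFT column.
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(
            windowed[i] * cmath.exp(-2j * math.pi * k * i / n)
            for i in range(n)
        )
        magnitudes.append(abs(acc))

    # Centre frequency of each bin in Hz.
    freqs = [k * sample_rate / n for k in range(n // 2 + 1)]

    # Spectral centroid: magnitude-weighted mean frequency.
    total = sum(magnitudes)
    centroid = (
        sum(f * m for f, m in zip(freqs, magnitudes)) / total
        if total > 0
        else 0.0
    )

    # Dominant frequency: the bin with the largest magnitude.
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    dominant = freqs[peak_bin]
    return rms, centroid, dominant


# Assumed demo values: a synthetic 250 Hz tone, not real hive audio.
SAMPLE_RATE = 8000
FRAME_LEN = 256
tone = [
    math.sin(2 * math.pi * 250 * i / SAMPLE_RATE) for i in range(FRAME_LEN)
]
rms, centroid, dominant = frame_features(tone, SAMPLE_RATE)
print(f"RMS={rms:.4f}  centroid={centroid:.1f} Hz  dominant={dominant:.1f} Hz")
```

For a pure tone that falls exactly on a frequency bin, as here, the centroid and dominant frequency coincide at the tone frequency. On real hive recordings the two would generally differ, which is part of why both are useful as separate features.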

Original language: English
Pages (from-to): 147953-147971
Number of pages: 19
Journal: IEEE Access
Volume: 13
Early online date: 15 Aug 2025
Publication status: Published - 2025
Externally published: Yes

Bibliographical note

Copyright the Author(s) 2025. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
