Abstract
Digitalisation of the built environment provides multiple benefits, such as improvements in operational and energy productivity, and supports the participation of buildings in the management of electricity networks. Automated methods to infer contextual information from building management system and Internet of Things sensor metadata play a significant role in this process. In this paper, we study the problem of using transfer learning on text metadata to automatically tag building sensors with semantic tags. We demonstrate that state-of-the-art pre-trained neural language models are a promising approach that, to the best of our knowledge, has not been studied because of the lack of pre-processors to tokenise the texts. We develop a tokeniser based on the unigram language model that is capable of tokenising the idiosyncratic text found in building sensor metadata, and use it to train a transformer-based language model from scratch on sensor metadata from 152 buildings. The resulting weights are then used to train a tagset classifier via transfer learning, which is tested on 30 buildings. Metrics such as precision, recall and the Jaccard similarity coefficient are used to evaluate the suitability of our results for various buildings. The proposed method can predict building tagsets with over 70% accuracy on a noisy real-world dataset.
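The evaluation metrics named in the abstract can be illustrated with a minimal sketch that scores one predicted tagset against its reference tagset. This is not the authors' evaluation code: the function name `tagset_scores` and the example tags are hypothetical, and the sketch assumes each tagset is treated as a plain set of tag strings.

```python
def tagset_scores(true_tags, predicted_tags):
    """Return (precision, recall, jaccard) for a single sensor's tagset.

    Assumption: tagsets are compared as unordered sets of tag strings.
    """
    true_set, pred_set = set(true_tags), set(predicted_tags)
    overlap = len(true_set & pred_set)   # tags predicted correctly
    union = len(true_set | pred_set)     # all tags seen in either set

    # Guard against empty sets to avoid division by zero.
    precision = overlap / len(pred_set) if pred_set else 0.0
    recall = overlap / len(true_set) if true_set else 0.0
    # Two empty tagsets are treated as identical (an illustrative convention).
    jaccard = overlap / union if union else 1.0
    return precision, recall, jaccard


# Hypothetical reference and predicted tags for one sensor point.
truth = {"zone", "air", "temperature", "sensor"}
guess = {"zone", "air", "temperature", "setpoint"}

precision, recall, jaccard = tagset_scores(truth, guess)
print(f"precision={precision:.2f} recall={recall:.2f} jaccard={jaccard:.2f}")
```

Here three of the four predicted tags are correct, so precision and recall are both 0.75, while the Jaccard coefficient is 3/5 because the union of the two sets holds five distinct tags. Averaging these per-sensor scores over a building would be one plausible way to compare buildings, although the abstract does not say how the scores were aggregated.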
| Original language | English |
|---|---|
| Article number | 100041 |
| Pages (from-to) | 1-10 |
| Number of pages | 10 |
| Journal | Advances in Applied Energy |
| Volume | 3 |
| Early online date | 25 May 2021 |
| Publication status | Published - 25 Aug 2021 |
Bibliographical note
Copyright the Author(s) 2021. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Keywords
- Smart buildings
- Metadata inference
- Tagging
- Machine learning
- Transfer learning
- Transformers
- Tokenizer
- BACnet
- RoBERTa
- Unigram
- Internet of Things
Title: Advancing smart building readiness: automated metadata extraction using neural language processing methods
Projects
- A Large-Scale Distributed Experimental Facility for the Internet of Things (finished)
  Sheng, M. (Primary Chief Investigator), Bouguettaya, A. (Chief Investigator), Loke, S. (Chief Investigator), Li, X. (Chief Investigator), Liang, W. (Chief Investigator), Benattalah, B. (Chief Investigator), Ali Babar, M. (Primary Chief Investigator), Yang, J. (Chief Investigator), Zomaya, A. Y. (Chief Investigator), Wang, Y. (Chief Investigator), Zhou, W. (Chief Investigator), Yao, L. (Chief Investigator), Taylor, K. (Chief Investigator) & Bergmann, N. (Chief Investigator)
  1/01/18 → 31/12/20
  Project: Research