Advancing smart building readiness: automated metadata extraction using neural language processing methods

David Waterworth*, Subbu Sethuvenkatraman, Quan Z. Sheng

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review


Abstract

Digitalisation of the built environment provides multiple benefits, such as operational and energy productivity improvements, and supports the participation of buildings in the management of electricity networks. Automated methods to infer contextual information from building management systems and Internet of Things sensor metadata play a significant role in this process. In this paper, we study the problem of using transfer learning on text metadata to automatically tag building sensors with semantic tags. We demonstrate that state-of-the-art pre-trained neural language models are a promising approach which, to the best of our knowledge, has not been studied due to the lack of pre-processors to tokenise the texts. We develop a tokeniser based on the unigram language model that is capable of tokenising the idiosyncratic text found in building sensor metadata, and use it to train a transformer-based language model from scratch using sensor metadata from 152 buildings. The weights are then used to train a tagset classifier using transfer learning, which is tested on 30 buildings. Metrics such as precision, recall and the Jaccard similarity coefficient are used to evaluate the suitability of our results for various buildings. The proposed method can predict building tagsets with over 70% accuracy against a real-world noisy dataset.
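To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how precision, recall and the Jaccard similarity coefficient can be computed for a single sensor point by comparing its predicted tagset with its reference tagset. The function name `tagset_scores` and the example tags are hypothetical and chosen only for illustration; the abstract does not state how the authors aggregate scores across points or buildings, so no averaging is shown.

```python
def tagset_scores(true_tags, predicted_tags):
    """Compare one predicted tagset with its reference tagset.

    Illustrative helper (not from the paper): both arguments are
    iterables of tag strings and are treated as unordered sets.
    """
    true_set, pred_set = set(true_tags), set(predicted_tags)
    overlap = len(true_set & pred_set)   # tags predicted correctly
    union = len(true_set | pred_set)     # all tags seen in either set

    # Precision: share of predicted tags that are correct.
    precision = overlap / len(pred_set) if pred_set else 0.0
    # Recall: share of reference tags that were predicted.
    recall = overlap / len(true_set) if true_set else 0.0
    # Jaccard: intersection over union; two empty sets count as identical.
    jaccard = overlap / union if union else 1.0

    return {"precision": precision, "recall": recall, "jaccard": jaccard}


# Hypothetical tags for one sensor point: three of four tags match.
scores = tagset_scores(
    true_tags=["zone", "air", "temp", "sensor"],
    predicted_tags=["zone", "air", "temp", "sp"],
)
print(scores)
```

In this example the two sets share three tags out of five distinct tags overall, so the Jaccard coefficient is lower than precision and recall. It penalises both missing and spurious tags in a single number, which is one reason it is a common choice for multi-label tagging tasks.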

Original language: English
Article number: 100041
Pages (from-to): 1-10
Number of pages: 10
Journal: Advances in Applied Energy
Volume: 3
Early online date: 25 May 2021
Publication status: Published - 25 Aug 2021

Bibliographical note

Copyright the Author(s) 2021. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • Smart buildings
  • Metadata inference
  • Tagging
  • Machine learning
  • Transfer learning
  • Transformers
  • Tokenizer
  • BACnet
  • RoBERTa
  • Unigram
  • Internet of Things

Projects

  • A Large-Scale Distributed Experimental Facility for the Internet of Things

    Sheng, M. (Primary Chief Investigator), Bouguettaya, A. (Chief Investigator), Loke, S. (Chief Investigator), Li, X. (Chief Investigator), Liang, W. (Chief Investigator), Benattalah, B. (Chief Investigator), Ali Babar, M. (Primary Chief Investigator), Yang, J. (Chief Investigator), Zomaya, A. Y. (Chief Investigator), Wang, Y. (Chief Investigator), Zhou, W. (Chief Investigator), Yao, L. (Chief Investigator), Taylor, K. (Chief Investigator) & Bergmann, N. (Chief Investigator)

    1/01/18 to 31/12/20

    Project: Research
