Fine-grained recognition of manipulation activities on objects via multi-modal sensing

Xiulong Liu, Bojun Zhang, Lizhang Wang, Sheng Chen, Xin Xie*, Xinyu Tong*, Tao Gu, Keqiu Li

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Fine-grained recognition of human manipulation activities on objects is crucial in the era of human-computer-object integration. However, there is a lack of solutions for simultaneous recognition of human identity, manipulation activities (including drawing and rotation), and manipulated objects. Therefore, we propose RF-Camera, a system that combines RFID and computer vision techniques to address this challenge in multi-person and multi-object scenarios. In RF-Camera, we employ a skeleton-assisted method to extract facial images of target individuals, enabling precise recognition of their identities. To identify manipulation activities, we analyze the 3D hand trajectory and the fingertip vector angle to differentiate drawing from rotation. Additionally, we model the target person's hand movements to predict the phase data of the target tag, which allows person-object relationships to be determined. Implementing RF-Camera with COTS RFID and Kinect devices involves overcoming challenges such as extracting effective data from noisy streams, predicting virtual phase data while accounting for the hand-tag offset, and ensuring high tag reading rates in tag-dense scenarios. We conducted experiments in which six participants performed object manipulation activities, including drawing letters/symbols and rotation movements. Extensive experimental results show that RF-Camera achieves over 90% accuracy in recognizing person identity, manipulation activities, and person-object matching in most conditions.
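The person-object matching idea in the abstract (predict the phase a tag would report if it followed the hand, then compare against measured tag phases) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard backscatter phase model θ = (4πd/λ) mod 2π, a single antenna at a known position, and a carrier wavelength of about 0.326 m; it ignores the spatial hand-tag offset the abstract mentions and only cancels a constant phase offset by comparing phase changes. All function names, tag names, and the synthetic data are hypothetical.

```python
import numpy as np

WAVELENGTH_M = 0.326  # assumption: roughly a 920 MHz UHF RFID carrier


def predict_phase(hand_xyz, antenna_xyz, wavelength=WAVELENGTH_M):
    """Virtual phase a tag would report if it moved exactly with the hand."""
    d = np.linalg.norm(hand_xyz - antenna_xyz, axis=1)  # hand-antenna distance
    return np.mod(4 * np.pi * d / wavelength, 2 * np.pi)  # round trip = 2d


def phase_mismatch(predicted, measured):
    """Mean wrapped difference between sample-to-sample phase changes.

    Comparing changes rather than raw phases cancels any constant offset
    (hardware or tag dependent) between the two series.
    """
    delta = np.diff(predicted) - np.diff(measured)
    wrapped = np.angle(np.exp(1j * delta))  # wrap into (-pi, pi]
    return float(np.mean(np.abs(wrapped)))


def match_tag(hand_xyz, antenna_xyz, tag_phases):
    """Return the tag whose measured phase best follows the hand trajectory."""
    predicted = predict_phase(hand_xyz, antenna_xyz)
    scores = {tag: phase_mismatch(predicted, ph) for tag, ph in tag_phases.items()}
    return min(scores, key=scores.get), scores


# Synthetic demo (hypothetical data): the hand draws a circle of radius 0.2 m.
t = np.linspace(0, 2 * np.pi, 200)
hand = np.stack([0.2 * np.cos(t), 0.2 * np.sin(t), np.full_like(t, 1.0)], axis=1)
antenna = np.array([1.0, 0.0, 0.0])
rng = np.random.default_rng(0)

tag_phases = {
    # Tag on the manipulated object: follows the hand, plus offset and noise.
    "tag_on_object": np.mod(
        predict_phase(hand, antenna) + 1.3 + rng.normal(0, 0.02, t.size), 2 * np.pi
    ),
    # Tag on an untouched object: constant phase plus noise.
    "tag_on_shelf": np.mod(2.0 + rng.normal(0, 0.02, t.size), 2 * np.pi),
}

best, scores = match_tag(hand, antenna, tag_phases)
print(best, scores)
```

In this sketch the tag with the lowest mismatch score is taken as the one being manipulated; a real system would also need time alignment between the camera and reader streams, which is not modeled here.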

Original language: English
Pages (from-to): 9614-9628
Number of pages: 15
Journal: IEEE Transactions on Mobile Computing
Volume: 23
Issue number: 10
DOIs
Publication status: Published - Oct 2024

Keywords

  • computer vision
  • human sensing
  • multi-modal fusion
  • object manipulation activities
  • RFID
