TY - JOUR
T1 - SimFuPulse: a self-similarity supervised model for remote photoplethysmography extraction from facial videos
AU - Xiao, Hanguang
AU - Li, Zhipeng
AU - Xia, Ziyi
AU - Liu, Tianqi
AU - Zhou, Feizhong
AU - Avolio, Alberto
PY - 2024/12
Y1 - 2024/12
AB - Remote Photoplethysmography (rPPG) is a non-contact technique for extracting physiological signals using facial videos, exhibiting broad application prospects in fields such as anti-spoofing face recognition, healthcare, and affective computing. However, extracting rPPG signals from facial video sequences encounters challenges due to subtle color variations and noise interference. Additionally, the presence of phase offset between ground truth and facial videos further complicates this endeavor. To address the issues of weak signals, strong noise, and phase offset, we propose a self-similarity supervised learning approach, named SimFuPulse, to mitigate noise and enhance rPPG representation by fusing original and differential video frames. By employing a 3D convolutional network (ResPhys) with an encoder–decoder architecture, enhanced spatiotemporal features are modeled to extract reliable rPPG signals. Moreover, a self-similarity mechanism is devised to mitigate the impact of phase offset on model training. The proposed method demonstrates superior accuracy over current state-of-the-art approaches across three publicly available datasets.
KW - Face video
KW - Heart rate
KW - Remote photoplethysmography
KW - Supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85201315712&partnerID=8YFLogxK
U2 - 10.1016/j.bspc.2024.106736
DO - 10.1016/j.bspc.2024.106736
M3 - Article
AN - SCOPUS:85201315712
SN - 1746-8094
VL - 98
SP - 1
EP - 13
JO - Biomedical Signal Processing and Control
JF - Biomedical Signal Processing and Control
M1 - 106736
ER -