Remote photoplethysmography (rPPG) is a non-contact technique for estimating heart rate (HR) by analyzing subtle, blood-volume-induced color changes in skin captured by standard video cameras. While promising for applications in telemedicine and continuous health monitoring, practical rPPG implementation is severely hampered by susceptibility to motion artifacts, illumination variations, and skin-tone differences. In this study, we develop a comprehensive rPPG pipeline using the UBFC-rPPG dataset, employing standard face detection and signal extraction algorithms (Green, CHROM, POS, LGI) to retrieve baseline pulse waveforms. To address the critical challenge of signal noise, we implement and rigorously compare two distinct performance optimization strategies: (1) a composite Normalized Signal Quality Index (NSQI)-based filtering framework designed to autonomously identify and discard corrupted signal segments, and (2) a supervised Machine Learning (ML) regression approach utilizing a 1D Convolutional Neural Network (1D-CNN) to map physiological signals directly to HR estimates. Our experimental results demonstrate that the NSQI filtering approach effectively reduces Mean Squared Error (MSE) and Mean Absolute Error (MAE) by selectively removing low-quality data. Conversely, the 1D-CNN approach exhibited significant generalization issues, failing to capture morphological signal features and instead acting as a mean regressor due to inherent dataset bias and normalization challenges.
Introduction
This paper addresses heart rate (HR) estimation using remote photoplethysmography (rPPG), a contactless technique that extracts pulse signals from facial video by analyzing subtle changes in skin color caused by blood flow. Unlike traditional contact-based methods such as ECG and pulse oximetry, rPPG requires no sensor attached to the body, but it is highly sensitive to noise from motion, lighting changes, and camera instability, which makes accurate heart rate estimation challenging.
The study reviews two main approaches to improving rPPG performance: advanced signal processing methods and deep learning techniques. Traditional methods such as Green channel analysis, CHROM, POS, and LGI attempt to isolate pulse signals using color transformations and mathematical modeling. In contrast, deep learning models like CNNs aim to directly learn heart rate patterns from raw or processed video signals but require large datasets and strong generalization capabilities.
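To make the idea of a color transformation concrete, the following is a minimal sketch of the POS projection as it is commonly described in the rPPG literature. It is not taken from this study's implementation: the function name `pos_pulse`, the 1.6 s window length, and the input layout (one spatially averaged RGB triple per frame) are assumptions made for illustration.

```python
import numpy as np

def pos_pulse(rgb, fs, window_sec=1.6):
    """Project a mean-RGB trace onto the plane orthogonal to skin tone (POS).

    rgb: array of shape (n_frames, 3), spatially averaged skin color per frame.
    fs:  video frame rate in Hz.
    window_sec is an assumed sliding-window length, not a value from the paper.
    """
    rgb = np.asarray(rgb, dtype=float)
    n_frames = rgb.shape[0]
    win = int(round(window_sec * fs))
    # Both rows sum to zero, so intensity changes common to R, G and B cancel.
    projection = np.array([[0.0, 1.0, -1.0],
                           [-2.0, 1.0, 1.0]])
    pulse = np.zeros(n_frames)
    for start in range(0, n_frames - win + 1):
        block = rgb[start:start + win]                 # (win, 3)
        normalized = block / block.mean(axis=0)        # temporal normalization per channel
        s = normalized @ projection.T                  # (win, 2) projected signals
        alpha = s[:, 0].std() / (s[:, 1].std() + 1e-12)
        h = s[:, 0] + alpha * s[:, 1]                  # tuned combination of both axes
        pulse[start:start + win] += h - h.mean()       # overlap-add of zero-mean segments
    return pulse
```

Because the projection rows sum to zero, a brightness change that scales all three channels equally is suppressed, which is the property that makes this family of methods more robust than the raw Green channel.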
The proposed system builds a complete rPPG pipeline using the UBFC-rPPG dataset, starting with face detection and region-of-interest extraction, followed by spatial averaging and bandpass filtering to isolate physiological signals. Four baseline extraction methods (Green, CHROM, POS, LGI) are evaluated for signal quality.
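The spatial averaging, bandpass filtering, and HR read-out steps can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the 0.7 to 4.0 Hz passband, the FFT-masking filter (a simple stand-in for whatever bandpass design was actually used), and all function names are hypothetical.

```python
import numpy as np

def spatial_average(frames_roi):
    """Collapse ROI pixels of shape (n_frames, h, w, 3) into a (n_frames, 3) RGB trace."""
    return np.asarray(frames_roi, dtype=float).mean(axis=(1, 2))

def bandpass_fft(signal, fs, low_hz=0.7, high_hz=4.0):
    """Zero every frequency bin outside [low_hz, high_hz] (assumed passband)."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal - signal.mean())
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=signal.size)

def estimate_hr_bpm(signal, fs, low_hz=0.7, high_hz=4.0):
    """Return the strongest in-band spectral peak, converted from Hz to BPM."""
    signal = np.asarray(signal, dtype=float)
    power = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    return 60.0 * freqs[band][np.argmax(power[band])]
```

For the Green baseline, the trace passed to `bandpass_fft` would simply be column 1 of the output of `spatial_average`. Note that the frequency resolution of the peak search is `fs / n_samples`, so a 10 s window at 30 fps resolves HR only to within 6 BPM unless the spectrum is interpolated.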
To improve reliability, the study introduces two optimization strategies:
1) NSQI (Normalized Signal Quality Index) filtering, which evaluates signal quality using multiple statistical and spectral features (such as SNR, entropy, skewness, kurtosis, zero-crossing rate, and phase-space behavior). Low-quality signal segments are removed to improve accuracy.
2) A 1D-CNN regression model, which directly predicts heart rate from time-domain signals using convolutional layers and supervised learning.
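Several of the quality features named above can be computed per segment as in the sketch below. The composite NSQI formula is not given in this text, so the code stops at individual features plus a hypothetical two-threshold rule; the function names, the SNR definition (peak bin and its neighbors versus the rest of the band), the thresholds, and the passband are all assumptions. The phase-space feature is omitted.

```python
import numpy as np

def quality_features(segment, fs, low_hz=0.7, high_hz=4.0):
    """Compute a few per-segment quality features (assumed definitions)."""
    x = np.asarray(segment, dtype=float)
    x = x - x.mean()
    z = x / (x.std() + 1e-12)
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    band_power = power[(freqs >= low_hz) & (freqs <= high_hz)]
    # Spectral SNR: power at the peak bin and its neighbors vs. the rest of the band.
    peak = int(np.argmax(band_power))
    lo, hi = max(peak - 1, 0), min(peak + 2, band_power.size)
    signal_power = band_power[lo:hi].sum()
    noise_power = band_power.sum() - signal_power
    snr_db = 10.0 * np.log10((signal_power + 1e-12) / (noise_power + 1e-12))
    # Spectral entropy, normalized to [0, 1]; near 0 for a single sharp peak.
    p = band_power / (band_power.sum() + 1e-12)
    entropy = float(-(p * np.log(p + 1e-12)).sum() / np.log(p.size))
    # Fraction of consecutive sample pairs whose sign differs.
    zcr = float(np.mean(np.signbit(x[:-1]) != np.signbit(x[1:])))
    return {
        "snr_db": float(snr_db),
        "spectral_entropy": entropy,
        "skewness": float(np.mean(z ** 3)),
        "kurtosis": float(np.mean(z ** 4)),   # non-excess; 3 for Gaussian noise
        "zero_crossing_rate": zcr,
    }

def keep_segment(features, min_snr_db=0.0, max_entropy=0.8):
    """Hypothetical accept/reject rule; thresholds are illustrative only."""
    return bool(features["snr_db"] >= min_snr_db
                and features["spectral_entropy"] <= max_entropy)
```

A rule of this kind only decides whether a segment contributes to the HR estimate; it does not repair the segment, which is why filtering trades coverage for accuracy.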
Results show that advanced methods like POS and CHROM perform better than basic approaches, and NSQI filtering improves accuracy by removing noisy signal segments. However, NSQI only filters bad data rather than correcting it. The 1D-CNN model showed a major limitation: it failed to learn meaningful pulse features and instead predicted nearly constant heart rates, highlighting challenges in training deep models on rPPG data without strong data quality and diversity.
Conclusion
This paper presented a comparative study of signal quality filtering versus machine learning regression for improving rPPG accuracy. The findings demonstrate that NSQI-based filtering is a viable, interpretable strategy for enhancing reliability by discarding noisy data, though its ultimate performance is capped by the quality of the underlying signal extraction algorithms. Conversely, the 1D-CNN approach, while theoretically powerful, failed to generalize due to severe dataset bias and normalization challenges, highlighting the difficulty of training regression models on limited physiological data.
Future work will focus on addressing the identified limitations:
1) Dynamic ROI Tracking: Replacing the fixed heuristic ROI with facial landmark tracking to maintain skin coverage during motion.
2) ML Data Augmentation: Implementing temporal resampling techniques to artificially balance the dataset across a wider range of heart rates (e.g., 40-140 BPM) to prevent mean-regression overfitting.
3) Instance Normalization: Applying Z-score normalization on a per-window basis to help the CNN learn features independent of signal amplitude.
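Items 2) and 3) can be illustrated with a short sketch. It assumes fixed-length training windows with one HR label each; the function names, the use of linear interpolation, and the choice of resampling factor are hypothetical and are not described in this paper.

```python
import numpy as np

def resample_window(window, hr_bpm, factor, out_len):
    """Replay a pulse window `factor` times faster and scale its HR label to match.

    Samples are read at positions 0, factor, 2*factor, ... by linear interpolation,
    so the source window must cover (out_len - 1) * factor samples.
    """
    x = np.asarray(window, dtype=float)
    positions = np.arange(out_len) * factor
    if positions[-1] > x.size - 1:
        raise ValueError("source window too short for this factor and output length")
    resampled = np.interp(positions, np.arange(x.size), x)
    return resampled, hr_bpm * factor

def zscore_window(window, eps=1e-8):
    """Per-window (instance) normalization to zero mean and unit variance."""
    x = np.asarray(window, dtype=float)
    return (x - x.mean()) / (x.std() + eps)
```

A factor above 1 raises the apparent heart rate and a factor below 1 lowers it, so drawing factors such that the scaled labels fall inside the target range (for example 40-140 BPM) flattens the label distribution. Applying `zscore_window` afterwards removes amplitude and offset differences between windows, leaving only waveform shape and periodicity for the network to learn from.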