Low-quality measurements were excluded from both the PPG and monitor data. For the PPG wristband vitals, a low quality index can originate from motion artefacts or a low signal-to-noise ratio. For HR and RR, an arrhythmia flagged by the arrhythmia detection algorithm also resulted in a low quality score [21]. For the reference monitor, the logged ECG and capnography signals were visually inspected over time to identify low-quality measurements.
Baseline characteristics are expressed as mean (SD) or, for nonnormally distributed values, as median (IQR). Agreement between the PPG wristband and reference monitor measurements on a second-to-second basis was visualized using Bland-Altman plots [22]. As multiple observations from the same patients were analyzed, the bias and limits of agreement were calculated using the method for repeated measures of Zou et al [23]. Additionally, the 95% CIs around the limits of agreement were calculated using the method of variance estimates recovery (MOVER) [23].
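To illustrate why repeated observations per patient change the limits of agreement, the following is a minimal sketch of a variance-components calculation. It is a simplified stand-in, not the exact procedure of Zou et al [23]: it assumes a one-way ANOVA (method of moments) split of the paired differences into between-patient and within-patient variance, and it omits the MOVER confidence intervals. The function name `repeated_measures_loa` and its arguments are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def repeated_measures_loa(subject_ids, differences, z=1.96):
    """Return (bias, lower_loa, upper_loa) for clustered paired differences.

    Assumption: one-way ANOVA variance components; this is an illustrative
    approximation, not the published repeated-measures method.
    """
    # Group the device-minus-reference differences by patient.
    by_subject = defaultdict(list)
    for sid, d in zip(subject_ids, differences):
        by_subject[sid].append(d)

    n = len(by_subject)                      # number of patients
    sizes = [len(v) for v in by_subject.values()]
    total = sum(sizes)                       # number of paired observations
    if n < 2 or total <= n:
        raise ValueError("need >= 2 patients and repeated observations")

    # Bias: mean of all paired differences.
    bias = sum(sum(v) for v in by_subject.values()) / total

    # Within-patient and between-patient sums of squares.
    ss_within = 0.0
    ss_between = 0.0
    for v in by_subject.values():
        mean_i = sum(v) / len(v)
        ss_within += sum((d - mean_i) ** 2 for d in v)
        ss_between += len(v) * (mean_i - bias) ** 2

    ms_within = ss_within / (total - n)
    ms_between = ss_between / (n - 1)

    # Effective number of observations per patient (handles unequal counts).
    m0 = (total - sum(m * m for m in sizes) / total) / (n - 1)

    # Between-patient variance component, truncated at zero.
    var_between = max(0.0, (ms_between - ms_within) / m0)

    sd_total = sqrt(var_between + ms_within)
    return bias, bias - z * sd_total, bias + z * sd_total
```

Treating all second-to-second differences as independent would ignore the between-patient component and typically produce limits of agreement that are too narrow, which is the reason a repeated-measures method is needed here.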
According to the American National Standards Institute consensus standard, the error for HR measurements should be ≤10% or ≤5 bpm. In this analysis, an error of ≤5 bpm for HR and ≤3 rpm for RR was considered clinically acceptable. Additionally, Clarke error grid analysis was performed to quantify the clinical implications of differences between the vital signs measured by the reference monitor and the PPG wristband. Because Clarke error grid analysis was originally developed for blood glucose measurements, the boundaries of the zones were adapted on the basis of the Modified Early Warning Score protocol used in our hospital [8,17,24,25].
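The acceptability thresholds stated above (≤10% or ≤5 bpm for HR under the consensus standard; ≤5 bpm for HR and ≤3 rpm for RR in this analysis) can be expressed as simple checks on paired measurements. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the percentage error is taken relative to the reference monitor value.

```python
def within_ansi_hr(reference_bpm, measured_bpm):
    """True if the HR error is <= 5 bpm or <= 10% of the reference value.

    Assumption: the 10% criterion is relative to the reference monitor.
    """
    error = abs(measured_bpm - reference_bpm)
    return error <= 5.0 or error <= 0.10 * reference_bpm


def fraction_within(reference, measured, limit):
    """Fraction of paired measurements whose absolute error is <= limit."""
    pairs = list(zip(reference, measured))
    if not pairs:
        raise ValueError("no paired measurements")
    hits = sum(1 for ref, meas in pairs if abs(meas - ref) <= limit)
    return hits / len(pairs)


# Thresholds used in this analysis (from the text above).
HR_LIMIT_BPM = 5
RR_LIMIT_RPM = 3
```

For example, `fraction_within(hr_reference, hr_wristband, HR_LIMIT_BPM)` would give the share of HR pairs that meet the ≤5 bpm criterion, where `hr_reference` and `hr_wristband` are hypothetical, time-aligned sequences.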