One-Minute Finger Pulsation Measurement for Diabetes Rapid Screening with 1.3% to 13% False-Negative Prediction Rate

Previous non-invasive diabetes mellitus (DM) prediction methods for rapid screening suffer from a trade-off between speed and accuracy. Questionnaires give accurate results only when they rely on long and detailed questions, which sacrifices speed, whereas photoplethysmography (PPG) offers convenient and fast testing but lacks accuracy. In this work, we developed a 5-grade model to accurately screen out non-DM subjects (low prediction grades) via a one-minute PPG measurement. This rapid screening reduces the load of further invasive verification, which is then needed only for the remaining subjects in the DM grades. A total of 2538 subjects were recruited (DM: 1310, non-DM: 1228), with two 1-minute PPG samples taken from each subject. The model includes 8 features: 3 autonomic-related and 3 vascular-related PPG features, heart rate, and waist circumference. All 8 features change monotonically with increasing DM prediction grade. The model provides users with 5 DM risk grades. When grades 1 and 2 are defined as non-DM grades, the prediction shows a low false-negative rate of 13%. When only grade 1 is considered non-DM, the false-negative rate is reduced to 1.3%. Thus, subjects predicted as grades 1 and 2 are unlikely to have DM. The remaining subjects with higher DM risk grades, such as grades 3, 4, and 5 (or, less likely, grade 2), are recommended to take a clinical-standard invasive DM test for corresponding therapeutic treatment. A table for assessing the risk index of each feature is also compiled. We have experimentally demonstrated that a 1-minute pulsation measurement with a PPG-based device (SpO2 oximeter, smartphone, or wearable device) can be an efficient and effective rapid screening technique to filter out non-DM subjects. The resulting high-risk feature indexes also serve as warning signs of the degradation of autonomic or vascular function for personal healthcare management. The fast and convenient execution and the useful results suggest that our approach is simple and informative for quick DM risk assessment.


Introduction
The diabetes mellitus (DM) population is growing rapidly, and DM is often accompanied by cardiovascular diseases (CVD), which increases the mortality rate [1]. To detect DM or monitor disease progression in this rapidly growing population, an efficient DM detection method that also gives some information on disease progression is needed. An accurate diagnosis of diabetes remains expensive and inconvenient because of time-consuming and invasive procedures, such as the traditional oral glucose tolerance test [2], HbA1C test [3], and C-peptide test [4]. On the other hand, current non-invasive DM classification methods can generally be divided into questionnaire-based and signal-based binary classification models. Although they avoid uncomfortable or painful invasive procedures, various issues still prevent them from being widely used in clinical settings.
Attempts to predict DM from questionnaire information such as body shape, lifestyle, and family history have been studied with machine learning and data mining techniques in recent decades. Logistic regression, decision tree, and SVM methods have been applied to undiagnosed diabetes, pre-diabetes, and diagnosed diabetes in large-scale surveys in the United States [5][6][7]. Among the body shape information, waist circumference is an indispensable index for DM evaluation. Carey et al. [8] found that the relative risk of developing diabetes for people with a waist circumference of 92 cm could be 5.1 times greater than for those with a waist circumference of 67 cm. Despite its prominent performance, a questionnaire depends heavily on how detailed the questions are, and information that subjects are unaware of may sometimes be the key to determining their current physiological status precisely. Among signal-based prediction methods, photoplethysmography (PPG), commonly used for measuring heart rate and peripheral oxygen saturation (SpO2), is popular in wearable device applications and has been studied for DM classification [9]. DM can be caused by factors such as obesity, unhealthy diet, and genetics, and it results in the degradation of autonomic and vascular functions. Features extracted from PPG signals can generally be divided into two categories: heart rate variability (HRV) and morphological profile. HRV can be correlated with autonomic neuropathy [10], whereas the morphological profile can be correlated with vascular function [11] in DM progression.
HRV, originally derived from ECG R-R intervals for evaluating the autonomic nervous system [12], is believed to be significantly decreased in DM patients. More specifically, HRV has been viewed as an easy and reliable way to assess cardiovascular and autonomic neuropathy [10]. Since HRV is calculated from heart beats over a time span, PPG, which is widely used in wearable products, has been studied as a convenient surrogate for ECG, with some restrictions [13,14]. The lower sharpness of PPG peaks makes them more difficult to locate accurately and can cause some deviation in the calculation. Another concern is the recording time span. Previous studies [13,14] mainly use short-term HRV features (normally from a 5-minute PPG measurement). It is too lengthy and unrealistic to ask users to remain still for 5 minutes for every measurement in a real application.
The effectiveness of ultra-short-term (one-minute measurement preferred) features on diabetes prediction remains unclear. On the other hand, morphological features derived from PPG have been associated with vascular stiffness, age, and cardiovascular diseases [16,17]. Previous studies extracted tens to hundreds of features from PPG signals, and then reduced the feature size by feature selection, signal decomposition, or dimensional projection [18][19][20][21].
These works face various challenges, such as the complexity of the machine-learning model, nonlinear transformation characteristics, and small or homogeneous sample sets, which make the results difficult to interpret. In this study, we introduce an efficient and effective model that assesses DM risk with a 5-grade representation while using a 1-minute PPG measurement to achieve usability in practice. To identify DM patients, this study integrates both body shape information (waist circumference) and PPG signal-derived features (HRV and morphology) as input. Grade 1 and grade 5 correspond to the most confident non-DM and DM predictions, respectively, and we also expect the grades to correlate with the degree or severity of DM. The logistic regression model is chosen for its generalizability and interpretability, in order to observe a global trend of the relevant factors. The quantitative risk indexes of the features are also available so that users can learn where their DM risk grade comes from. The rationale of DM risk prediction with the PPG signal and some morphological features is illustrated in Figure 1.

Material and Methods
We conducted a large-scale study comparing the PPG signals and basic physical examination items of DM and non-DM subjects. A total of 5076 samples of 1-minute ECG and PPG signals were recorded from 2538 subjects (DM: 1310, non-DM: 1228); the ECG signals were used only as a reference for the PPG signals. The study was approved by the Institutional Review Board of Academia Sinica, Taiwan (Application No: AS-IRB01-16081).

Measurement Protocol
The subjects were asked to sit on a chair in a resting position for at least 5 minutes while filling out a questionnaire. Personal information, including sex, age, height, weight, waist circumference, smoking habit, family history, SpO2 (peripheral oxygen saturation), blood pressure, blood glucose, and HbA1C, was asked or measured with the commercial products listed in the next section. The subjects were then asked to attach the ECG patches at the lead-I angle and the PPG finger clips to the index fingers of both hands for two consecutive 1-minute recordings of waveform signals. The experimental setup for PPG is depicted in Figure 1 (b).

Hardware
The devices and instruments used in the experiment are as follows: POM-201 for SpO2, Omron HEM-7320 for blood pressure, Roche Accu-Chek Mobile for blood glucose, Siemens DCA Vantage Analyzer for HbA1C, and CardioChek PA analyzer for blood lipids. PPG and ECG were recorded with the TI AFE4490 module and the ADI AD8232-EVALZ, respectively.

Data Process
Filters were first applied to the raw PPG signals to obtain a usable AC signal. High-frequency components (>40 Hz) were filtered out to remove noise, and low-frequency components (<0.1 Hz) were filtered out to remove the DC component [16]. PPG peaks and valleys were then located by applying peak detection to the signals. ECG signals were used only to confirm the validity of each PPG sample in this study. The valley-to-valley (V-V) interval was selected to represent the PPG pulse interval because of its better consistency with R-R intervals. Out of the total 5076 samples from 2538 subjects, 58 samples were removed because the number of V-V intervals did not match the number of R-R intervals, leaving 5018 samples. Samples from subjects with a measured HbA1C level over 6.5% (48 mmol/mol), or already under diabetes-related treatment, were labeled as DM in this study.
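The valley detection and V-V interval extraction described above can be illustrated with a minimal sketch. It is not the detection routine used in the study: the function names `find_valleys` and `vv_intervals`, the 0.3 s minimum spacing, the 100 Hz sampling rate, and the synthetic sine wave standing in for a filtered PPG AC signal are all assumptions made for illustration.

```python
import math

def find_valleys(signal, fs, min_interval_s=0.3):
    """Return indices of local minima that are at least min_interval_s apart.

    The 0.3 s spacing is an assumed refractory period, not a value from the study.
    """
    min_gap = int(min_interval_s * fs)
    valleys = []
    for i in range(1, len(signal) - 1):
        is_local_min = signal[i - 1] > signal[i] <= signal[i + 1]
        if not is_local_min:
            continue
        if valleys and i - valleys[-1] < min_gap:
            # Too close to the previous valley: keep only the deeper one.
            if signal[i] < signal[valleys[-1]]:
                valleys[-1] = i
            continue
        valleys.append(i)
    return valleys

def vv_intervals(valleys, fs):
    """Convert consecutive valley indices into V-V intervals in seconds."""
    return [(b - a) / fs for a, b in zip(valleys, valleys[1:])]

# Synthetic stand-in for a filtered PPG AC signal: 1.2 Hz (72 beats/min), 10 s.
fs = 100  # assumed sampling rate in Hz
ppg = [math.sin(2 * math.pi * 1.2 * n / fs) for n in range(10 * fs)]
valleys = find_valleys(ppg, fs)
intervals = vv_intervals(valleys, fs)
```

On this synthetic signal every interval is close to 1/1.2 s. On real recordings, the number of V-V intervals would additionally be compared against the number of R-R intervals from the ECG reference, as the text describes.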

Arrhythmia (Irregular Rhythm) Samples Removal
After valley detection and segmentation, the area, amplitude, and interval of each pulse were evaluated. Samples with significantly irregular rhythm or abnormal pulse profiles were removed so that they would not confound the HRV and morphology calculations. A signal was determined to be irregular if the widths of any two consecutive pulses, or the areas under their signal curves, differed by a factor of 2 or more (that is, one value was at least twice, or at most half, the other). A total of 97 samples (71 DM, 26 non-DM) were removed at this step from the remaining 5018 samples. Finally, 4921 samples were used for the following work.
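The 2-fold rule above can be written as a short check over consecutive pulses. This is a sketch under stated assumptions: the function names are hypothetical, the inputs are assumed to be positive per-pulse widths and areas, and the example values are invented.

```python
def has_fold_change(values, fold=2.0):
    """True if any two consecutive values differ by a factor of `fold` or more."""
    for prev, curr in zip(values, values[1:]):
        ratio = curr / prev  # assumes strictly positive values
        if ratio >= fold or ratio <= 1.0 / fold:
            return True
    return False

def is_irregular_sample(pulse_widths, pulse_areas):
    """Flag a sample if either pulse width or pulse area violates the 2-fold rule."""
    return has_fold_change(pulse_widths) or has_fold_change(pulse_areas)

# Invented examples: widths in seconds, areas in arbitrary units.
regular = is_irregular_sample([0.80, 0.82, 0.79], [1.00, 1.05, 0.98])
irregular = is_irregular_sample([0.80, 0.82, 1.70], [1.00, 1.05, 0.98])
```

Here `regular` is `False` and `irregular` is `True`, because 1.70 s is more than twice the preceding width of 0.82 s.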

HRV and Morphological Features
Although many features are listed below, a large portion was removed by the feature selection process during modeling. A total of 6 features (3 HRV and 3 morphological) were used in conjunction with heart rate and waist circumference to build two models: (1) a model without waist circumference (W/O-WC) and (2) a model with waist circumference (W-WC). HRV features are correlated with autonomic function, and morphological features are correlated with vascular function.
Many HRV features in the time and frequency domains were calculated from the V-V intervals of PPG.
Time-domain features are derived from interval differences as follows:
1) SDNN: the standard deviation of NN (normal beat-to-beat) intervals. NN intervals are equivalent to V-V intervals in this study.
2) RMSSD: the root mean square of successive differences between adjacent NN intervals.
3) SDSD: the standard deviation of successive differences between adjacent NN intervals.
4) pNN50: the proportion of NN50 (the number of successive NN interval differences exceeding 50 ms) divided by the total number of NN intervals.
5) pNN20: the proportion of NN20 (the number of successive NN interval differences exceeding 20 ms) divided by the total number of NN intervals.
Frequency-domain (power spectral density) features are calculated from the one-dimensional discrete Fourier transform of the sample's pulse intervals as follows:
1) LFP: low-frequency power (variance) of frequencies between 0.04 and 0.15 Hz [12].
2) HFP: high-frequency power (variance) of frequencies between 0.15 and 0.4 Hz [12].
3) TP: total power (variance) of frequencies below 0.4 Hz.
4) FP_ratio: ratio between LFP and HFP.
5) nLFP: normalized low-frequency power, LFP/TP.
6) nHFP: normalized high-frequency power, HFP/TP.
Pulse-wise morphological features are defined from the time span at a relative amplitude of the AC signal as follows:
1) FW_25: full width at 25% amplitude.
2) FW_50: full width at 50% amplitude.
3) FW_75: full width at 75% amplitude.
4) nFW_25: full width at 25% amplitude normalized by pulse width.
5) nFW_50: full width at 50% amplitude normalized by pulse width.
6) nFW_75: full width at 75% amplitude normalized by pulse width.
7) UT: pulse width (time) from the valley to 100% amplitude.
8) Total_Area: sum of the pulse areas under the curve in 1 minute.
9) Normed_Area: mean area under the curve for each pulse in 1 minute (Total_Area divided by the number of pulses).
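The time-domain definitions above translate directly into a few lines of code. The following is a minimal sketch, not the study's implementation: the function name `time_domain_hrv` is hypothetical, the use of the sample (n-1) standard deviation is an assumption, and the five NN intervals are invented. Following the text, pNN50 and pNN20 are divided by the total number of NN intervals.

```python
import math
import statistics

def time_domain_hrv(nn_ms):
    """Compute time-domain HRV features from NN (here V-V) intervals in ms."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]

    def pnn(threshold_ms):
        # Successive differences exceeding the threshold, over all NN intervals.
        count = sum(1 for d in diffs if abs(d) > threshold_ms)
        return count / len(nn_ms)

    return {
        "SDNN": statistics.stdev(nn_ms),   # sample standard deviation (assumed)
        "RMSSD": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        "SDSD": statistics.stdev(diffs),   # sample standard deviation (assumed)
        "pNN50": pnn(50),
        "pNN20": pnn(20),
    }

# Invented NN intervals in milliseconds.
features = time_domain_hrv([800, 830, 790, 860, 845])
```

For these invented intervals the successive differences are 30, -40, 70, and -15 ms, so one difference exceeds 50 ms and three exceed 20 ms. This illustrates why, over a short recording, the 20 ms threshold registers more events than the 50 ms threshold.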

Model
The logistic regression model is chosen because it explains the influence of each feature on the target (DM risk probability, P), is easy to implement, and provides probability information. The logistic regression module from scikit-learn version 0.19.1 for Python 3 is used in this section. The logistic regression function is as follows:

$$P(Y = 1 \mid X = x) = \frac{1}{1 + \exp\left(-\left(\beta_0 + \sum_{i=1}^{N} \beta_i x_i\right)\right)}$$

The DM risk corresponds to the conditional probability P, which describes the chance of observing Y=1 given that X is a particular vector x. The binary outcome Y is 1 for DM and 0 for non-DM. The vector component x_i is the value of selected feature i, β_i is the coefficient of the i-th feature, and β_0 is the intercept. The coefficients β are estimated by maximum likelihood, which minimizes the error of the model prediction with respect to the true class. The N features are selected through a backward elimination procedure based on the p-value of each individual feature. In each iteration, a model is trained and the feature with the highest p-value is removed, eliminating the least significant features. This process continues until the model scores stop improving. The 8 remaining features are waist circumference, heart rate, Normed_Area, FW_50, Total_Area, TP, SDNN, and pNN20. After feature selection, the probability predicted by the model is stored and converted into the 5-grade representation. Grades 1 to 5 correspond to the 0~20%, 20~40%, 40~60%, 60~80%, and 80~100% ranges of DM probability. In addition to the 5-grade probabilities, risk indexes for the individual features were also derived. For each feature, the mid-points between the mean values of consecutive grades are used as boundaries to define its risk index grades.
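The mapping from feature values to a probability and then to one of the five grades can be sketched as follows. This is an illustration under assumptions rather than the study's code: the function names are hypothetical, the coefficients are invented rather than fitted, the inclusion of an intercept is assumed, and the treatment of values exactly on a grade boundary (assigned to the higher grade here) is an assumption because the text does not specify it.

```python
import math

def dm_probability(x, beta, beta0=0.0):
    """Logistic function: probability of DM given feature vector x."""
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    # Numerically stable form of 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def risk_grade(p):
    """Map a probability in [0, 1] to grade 1..5 using 20%-wide bins."""
    return min(int(p * 5) + 1, 5)

# Invented coefficients and standardized feature values, for illustration only.
beta = [0.9, 0.5, -0.7]
x = [1.2, 0.3, -0.4]
p = dm_probability(x, beta, beta0=-0.2)
grade = risk_grade(p)
```

A probability of 0.05 falls in grade 1, 0.5 in grade 3, and 0.95 in grade 5. In practice the coefficients would come from a fitted model rather than being written by hand.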

Result and Discussion
First, overviews of the W-WC and W/O-WC models are given, and their DM probabilities are compared in the 5-grade representation. Next, we show how the hybrid model is introduced to reduce the false-negative predictions. Then, we present an overview of our input features and their distribution across grades. Last, we explain how the model can be used.

Model Comparison
The distributions of the two models in Figure 2 (b) show that the two models may give different prediction results for the same sample. In comparison, using waist circumference gives a generally more desirable distribution of the prediction results (better differentiability and more separation between DM and non-DM samples). On the downside, it also places more DM subjects in the lowest DM risk grade. To compare the differences in the low DM risk grades, the confusion matrices using either grades 1 and 2 or only grade 1 as non-DM, and the rest as DM, are shown in Table 1. Although model W-WC has a lower false-negative rate for grades 1 and 2 combined, it classifies significantly more DM samples into grade 1 than model W/O-WC, as shown in Figure 2. The false-negative rate in grade 1 specifically increased from 2.3% to 3%. In a clinical setting, a false-negative diagnosis is more costly and problematic, because it could result in delayed treatment or no treatment at all. A false-positive result would only require the user to take a more comprehensive test for verification.
A hybrid model was created by combining the models with and without waist circumference. When the two models give inconsistent prediction results for the same sample, the higher DM risk grade is taken as the hybrid model prediction grade. Waist circumference is the strongest feature, but it can be over-dominant in some cases and lead to false-negative results. Our hybrid model addresses this problem with the aim of reducing the false-negative rate for rapid screening applications. The whole modeling process is summarized as a flowchart in Figure 2 (a). The hybrid model significantly reduces the number of false-negative predictions, down to 32 samples in grade 1 and 290 samples in grade 2.
This results in a false-negative rate of only 1.3% when grade 1 alone is treated as non-DM and 13% when grades 1 and 2 are combined. The hybrid model thus minimizes the number of false-negative predictions at the cost of classifying somewhat more non-DM samples into higher grades. In the high DM risk grades of the hybrid model, 85.1% (333 out of 391 samples) of the samples in grade 5 and 69.0% (1170 out of 1696 samples) of the samples in grade 4 already have diabetes. Our study labels DM and non-DM using the diabetes definition of HbA1C above 6.5% (48 mmol/mol), and many of the samples from non-DM subjects classified into grades 4 and 5 have an elevated HbA1C that is just under this DM threshold. Around three quarters of the non-DM samples in grade 4 (353 out of 526) and grade 5 (48 out of 58) were within the prediabetes range (HbA1C 5.7%~6.5% or 39~46 mmol/mol), leaving very few truly non-diabetic samples in grades 4 and 5 (183 out of 2087). Because many samples labeled non-diabetic are actually in the pre-diabetes range, this may partly explain why the result distribution is skewed toward higher DM grades.
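The hybrid rule and the false-negative rate used in this comparison can be sketched in a few lines. The function names and the six toy samples below are hypothetical and do not reproduce the study data. The false-negative rate is computed as the share of all DM samples that fall into the grades treated as non-DM.

```python
def hybrid_grade(grade_with_wc, grade_without_wc):
    """Hybrid rule: when the two models disagree, keep the higher risk grade."""
    return max(grade_with_wc, grade_without_wc)

def false_negative_rate(grades, labels, non_dm_grades):
    """Share of DM samples (label 1) whose grade is in non_dm_grades."""
    dm_grades = [g for g, y in zip(grades, labels) if y == 1]
    missed = sum(1 for g in dm_grades if g in non_dm_grades)
    return missed / len(dm_grades)

# Toy data (invented): grades from each model and true labels (1 = DM).
w_wc = [1, 2, 4, 5, 1, 3]
wo_wc = [2, 2, 3, 5, 1, 4]
labels = [1, 0, 1, 1, 0, 1]

hybrid = [hybrid_grade(a, b) for a, b in zip(w_wc, wo_wc)]
fn_grade1 = false_negative_rate(hybrid, labels, {1})
fn_grade12 = false_negative_rate(hybrid, labels, {1, 2})
```

In the toy data, the first DM sample is graded 1 by the W-WC model but 2 by the W/O-WC model, so the hybrid rule moves it out of grade 1. The grade 1 false-negative rate drops to zero, while the sample still counts as a false negative when grades 1 and 2 are combined.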

Feature Analysis
The selected features from a backward elimination process for DM prediction are shown in Table 2 with the format of mean value ± standard deviation for all samples in each grade.
Heart rate and waist circumference are deemed the most important features by models W/O-WC and W-WC, respectively, and both show an increasing trend with higher prediction grade.
Compared with non-DM subjects, both HRV and morphological features show decreased values in DM subjects, which supports the assumption of vascular and autonomic dysfunction. Instead of pNN50, an HRV feature more commonly used in other studies [15], pNN20 proved more suitable for the 1-minute PPG signal because of the significantly shorter measuring time span. Another notable eliminated feature is UT, which correlates with the systolic state. The elimination of UT is attributed to the difficulty of labeling the signal peak accurately, owing to the low sharpness of the PPG pulsation signal, which results in some inconsistency. The table also summarizes the relationship between the model-predicted grades and the features, with the corresponding HbA1C values, for deriving the individual feature risk indexes. Table 2 also illustrates that the higher the grade of the logistic regression probability, the higher the HbA1C values of the corresponding subjects. This evidence suggests that the model learns physiological meaning from the binary DM labels, and that the predicted probability is appropriate to be treated as a quantitative measure of DM progression. Box plots that visualize the relationship across grades are shown in Figure 3. As many lifestyle factors affect DM progression, users can track their own status dynamically with this grade and be aware of the corresponding risk features to improve.
The table shows the mean value ± standard deviation of the individual features and HbA1C of all samples, grouped by predicted risk grade and by DM and non-DM group. Based on this table, we can assess an individual feature risk index by using the mid-point between the mean values of adjacent grades as the boundary. We can also see that the DM groups generally have higher HR and waist circumference, and lower HRV-related and morphology-related features.

Interpretation of Predicted DM Risk Grades and Corresponding Recommendations
From the application point of view, there are two scenarios for using the proposed model to predict DM risk: undiagnosed and diagnosed DM patients. For undiagnosed DM patients, the model can be used as a rapid screening tool to identify high DM risk patients and to offer warnings to low-risk patients at the same time. Devices capable of recording PPG signals, such as smartphones, wearable devices, and SpO2 oximeters, are commonly available. This method can therefore be deployed on a massive scale at little to no additional cost if implemented on such devices. For diagnosed DM patients, the predicted results and feature indexes can help them monitor their physiological status in order to manage or improve their DM condition and prevent further degradation, especially of vascular and autonomic functions.
The risk grading of individual features is defined by using the mid-point between the mean values of adjacent grades as the boundary. The mean values of the features for each risk grade predicted by the model are presented in Table 2. Assessing DM status using the hybrid model prediction grades in combination with the grades of individual features provides a more informative result for users. For example, when a non-DM subject is classified into a higher risk group, this may be a false-positive prediction in terms of diabetes risk; however, with the individual feature risk indexes, the user can assess whether they may be suffering from other underlying health conditions that are often associated with high DM risk. For prediction results in the lower DM risk grades, users can still pay extra attention to individual features as warning signs and act on them accordingly.
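The mid-point rule for the individual feature risk index can be sketched as follows. The function names are hypothetical and the per-grade mean values are invented placeholders, not the values of Table 2. The sketch assumes the per-grade means are monotonic and handles both increasing features (such as waist circumference) and decreasing features (such as the HRV features).

```python
import bisect

def midpoint_boundaries(grade_means):
    """Mid-points between the mean feature values of adjacent grades."""
    return [(a + b) / 2 for a, b in zip(grade_means, grade_means[1:])]

def feature_risk_index(value, grade_means):
    """Risk index 1..5 of a feature value, given monotonic means for grades 1..5."""
    bounds = midpoint_boundaries(grade_means)
    if grade_means[0] > grade_means[-1]:
        # Decreasing feature: negate so the boundaries are ascending for bisect.
        return bisect.bisect_right([-b for b in bounds], -value) + 1
    return bisect.bisect_right(bounds, value) + 1

# Invented per-grade means, for illustration only.
waist_means = [78, 83, 88, 93, 98]   # increasing with grade (cm)
pnn20_means = [50, 40, 30, 20, 10]   # decreasing with grade (%)

waist_index = feature_risk_index(86, waist_means)
pnn20_index = feature_risk_index(28, pnn20_means)
```

With these placeholder means, a waist circumference of 86 cm lies between the boundaries 85.5 and 90.5 and receives index 3, and a pNN20 of 28% lies between 35 and 25 and also receives index 3.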

Comparison of Autonomic-Related Features
All of the HRV parameters show a decreasing trend with increasing DM risk grade, which is consistent with previous studies [10]. However, we found that in our 1-minute recording setting, the most sensitive parameter is pNN20, whereas the least sensitive parameter is SDNN. The reason why pNN20 has not been highlighted as a significant DM factor among HRV parameters in previous studies may be the time span of the calculation. A longer time span increases the chance of recording larger changes in interval variation. For the 1-minute time span, 20 ms may be the appropriate threshold to distinguish DM from non-DM.

Comparison of Vascular-Related Features
For vascular-related features, the model selected total area, normed area (average pulse area), and the full width at half amplitude as significant features for DM classification. These features all reflect blood perfusion: higher values suggest better blood perfusion, which is expected in healthy subjects. It is possible that heart rate is the true factor that dominates these features. We believe that Normed_Area relates to Total_Area as stroke volume relates to cardiac output, which reflects the characteristics of blood perfusion.

Comparison to Previous Studies
Previous works generally give a binary DM or non-DM answer with an overall accuracy and emphasize new methods or features [5][6][7][8][9]. In this study, we focused on achieving a low false-negative rate for practical use. To the best of our knowledge, this is the best result in terms of low false-negative rate.

Conclusion
This study demonstrates a probability-based 5-grade classification scheme for DM risk prediction. In particular, we conclude that this technique, using waist circumference and PPG signal-derived features, is capable of rapidly screening out non-DM subjects. All of the features used in the model increase or decrease monotonically with the risk grade. Our model has a false-negative rate of 1.3% to 13%, as only 1.3% and 11.8% of all DM samples were classified into grade 1 and grade 2, respectively. Of the samples classified into grade 4 and grade 5, 69.0% and 85.1% have diabetes, respectively. Even though the rest of these samples were not from subjects with diabetes, their high-risk autonomic-related or vascular-related features may arise from other health complications. Based on these results, we conclude that the 5-grade DM prediction with the hybrid model is effective and efficient. With the additional feature risk indexes derived from the 1-minute PPG measurement for in-depth analysis, this is also a simple and informative methodology for predicting DM risk.
J. C. wrote the manuscript and research model. W. Y. wrote the manuscript draft and research model. T. H. reviewed/edited the manuscript. F. Y. organized the research and coordinated the manuscript. F. Y. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.