Nocturnal Cough and Snore Detection Using Smartphones in Presence of Multiple Background-Noises

Author: Sudip Vhaduri (Fordham University)

DOI: https://doi.org/10.1145/3378393.3402273

Session: 3.2. Health

Abstract: Non-speech human sounds, such as coughs and snores, and their patterns are associated with respiratory diseases, including asthma and chronic obstructive pulmonary disease (COPD), as well as with other health difficulties such as sleep disorders. Accordingly, researchers and physicians use coughs and snores as symptoms when reporting and assessing respiratory diseases, their stages, and sleep quality. So far, however, these assessments frequently depend on patient-reported surveys, which inherently suffer from limitations such as recall bias and human error. Automated detection and reporting of coughs and snores can therefore improve disease assessment and monitoring. In this paper, we present an automated approach to detect coughs and snores from smartphone microphones using generalized, semi-personalized, and personalized modeling schemes. We analyze three separate datasets and different combinations of three types of nocturnal noise (i.e., sounds from air conditioners (AC), dog barks, and sirens) using Mel-frequency cepstral coefficient (MFCC) features and different classification techniques. We find that a generalized model with the support vector machine (SVM) classifier can achieve an average accuracy of 0.86 ± 0.14, an F1 score of 0.86 ± 0.13, and an area under the receiver operating characteristic curve (AUC-ROC) of 0.94 ± 0.08. These results can be further improved to an average accuracy of 0.96 ± 0.08, an F1 score of 0.96 ± 0.08, and an AUC-ROC of 0.98 ± 0.04 using the personalized random forest (RF) model. The results show the potential for smartphones to automatically report symptoms of respiratory diseases as well as sleep disorders. Furthermore, we find that our models perform consistently well when tested on separate datasets in the presence of multiple background noises.
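
The three metrics reported in the abstract (accuracy, F1 score, and AUC-ROC) can be illustrated with a minimal Python sketch. This is not the author's code: the function names (`accuracy`, `f1_score`, `auc_roc`, `summarize`) are hypothetical, the labels and scores are toy values, and reading the "±" figures as mean and standard deviation over repeated evaluations is an assumption, since the abstract does not define them.

```python
from statistics import mean, pstdev


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def auc_roc(y_true, scores, positive=1):
    """AUC-ROC as the probability that a random positive example
    receives a higher score than a random negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(values):
    """Mean and population standard deviation of per-run metric values
    (assumed interpretation of the 'mean ± spread' figures)."""
    return mean(values), pstdev(values)


# Toy data (hypothetical): 1 = cough/snore, 0 = background noise.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = auc_roc(y_true, scores)
avg, spread = summarize([0.8, 1.0])
```

The pairwise form of AUC-ROC is quadratic in the number of examples; it is used here only because it makes the definition explicit. A rank-based computation would be preferable for large test sets.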