Publications

The machine learning-based prediction of the sound pressure level from pathological and healthy speech signal

Abstract

Vocal intensity is quantified by the sound pressure level (SPL). The SPL can be measured either with a sound level meter or by comparing the energy of the recorded speech signal with the energy of a recorded calibration tone of known SPL. Neither approach can be used if speech is recorded in real-life conditions with a device that is not calibrated for SPL measurements. To measure the SPL from non-calibrated recordings, in which speech is represented on a normalized amplitude scale, this study investigates machine learning (ML)-based estimation of the SPL. Several ML-based systems, each consisting of a feature extraction stage and a regression stage, were built. For the feature extraction stage, four conventional acoustic features, two state-of-the-art pre-trained features, and their combined feature set were compared. For the regression stage, three regression models were compared. The systems were trained on healthy speech from an open repository and evaluated on both pathological speech produced by patients suffering from heart failure and speech produced by healthy controls. The results showed that the best combination of feature and regression model yielded a mean absolute error of about 2 dB in the SPL estimation task.
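The calibration-tone approach mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that the speech and the calibration tone were recorded with the same device and gain, and it compares mean power (energy per sample) rather than total energy, so that the two recordings may differ in duration. The function name `spl_from_calibration` and the synthetic signals are hypothetical and serve only as an illustration.

```python
import numpy as np


def spl_from_calibration(speech, calibration_tone, calibration_spl_db):
    """Estimate the SPL (dB) of `speech` relative to a calibration tone.

    Assumption: both signals were captured with the same device and gain,
    so the ratio of their mean powers reflects the true level difference.
    """
    speech = np.asarray(speech, dtype=float)
    calibration_tone = np.asarray(calibration_tone, dtype=float)
    speech_power = np.mean(speech ** 2)          # mean power of the speech
    tone_power = np.mean(calibration_tone ** 2)  # mean power of the tone
    # Level difference in dB, added to the known SPL of the tone.
    return calibration_spl_db + 10.0 * np.log10(speech_power / tone_power)


# Synthetic illustration (hypothetical values, not data from the study).
fs = 16000                       # sampling rate in Hz
t = np.arange(fs) / fs           # one second of samples
tone = 0.1 * np.sin(2 * np.pi * 1000 * t)        # tone taken to be 94 dB SPL
speech_like = 0.01 * np.sin(2 * np.pi * 200 * t)  # amplitude 10 times smaller

estimated = spl_from_calibration(speech_like, tone, 94.0)
print(f"Estimated SPL: {estimated:.1f} dB")
```

Because the second signal has one tenth of the tone's amplitude, its level is 20 dB below the tone's stated SPL. Once a recording has been amplitude-normalized, this power ratio no longer carries level information, which is the situation the ML-based estimation targets.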

Metadata

publication
Journal of the Acoustical Society of America, 2025
year
2025
publication date
2025
authors
Manila Kodali, Sudarsana Kadiri, Shrikanth Narayanan, Paavo Alku
link
https://research.aalto.fi/en/publications/the-machine-learning-based-prediction-of-the-sound-pressure-level
resource_link
https://research.aalto.fi/files/173189136/Jasa_Kodali_et_al.pdf
journal
Journal of the Acoustical Society of America
publisher
Acoustical Society of America