The machine learning-based prediction of the sound pressure level from pathological and healthy speech signal

Abstract

Vocal intensity is quantified by the sound pressure level (SPL). The SPL can be measured either with a sound level meter or by comparing the energy of the recorded speech signal with the energy of a recorded calibration tone of known SPL. Neither approach can be used if speech is recorded in real-life conditions using a device that is not calibrated for SPL measurements. To estimate the SPL from such non-calibrated recordings, where speech is represented on a normalized amplitude scale, this study investigates machine learning (ML)-based estimation of the SPL. Several ML-based systems, each consisting of a feature extraction stage and a regression stage, were built. For the former, four conventional acoustic features, two state-of-the-art pre-trained features, and their combined feature set were compared. For the latter, three regression models were compared. The systems were trained using healthy speech from an open repository and evaluated using both pathological speech produced by patients with heart failure and speech produced by healthy controls. The results showed that the best combination of feature and regression model provided a mean absolute error of about 2 dB in the SPL estimation task.
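The calibration-tone approach and the error metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the synthetic sine signals, the 16 kHz sampling rate, and the 94 dB calibration level are all assumptions chosen for illustration. The sketch assumes the speech and the calibration tone were recorded with the same device, gain, and microphone distance, so that their energy ratio maps directly to a level difference in dB.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def spl_from_calibration(speech, cal_tone, cal_spl_db):
    """Estimate the SPL (dB) of `speech` from a calibration tone of known SPL.

    Assumes both signals were recorded with identical gain and
    microphone distance (hypothetical helper, for illustration only).
    """
    ratio = mean_power(speech) / mean_power(cal_tone)
    return cal_spl_db + 10.0 * math.log10(ratio)


def mean_absolute_error(true_db, predicted_db):
    """Mean absolute error between reference and estimated SPL values."""
    errors = [abs(t - p) for t, p in zip(true_db, predicted_db)]
    return sum(errors) / len(errors)


# Synthetic stand-ins for real recordings (assumed values).
fs = 16000  # sampling rate in Hz
cal = [0.10 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
speech = [0.05 * math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]

# Half the amplitude of the calibration tone gives a level about 6 dB lower.
spl = spl_from_calibration(speech, cal, cal_spl_db=94.0)
print(f"Estimated SPL: {spl:.2f} dB")
```

Once a recording is peak-normalized, the energy ratio above no longer carries level information, which is why the study turns to features and regression models instead of this direct comparison.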

Date: November 4, 2025
Authors: Manila Kodali, Sudarsana Kadiri, Shrikanth Narayanan, Paavo Alku
Journal: Journal of the Acoustical Society of America
Publisher: Acoustical Society of America


