Publications

75-Speaker Annot-16: A benchmark dataset for speech articulatory rt-MRI annotation with articulator contours and phonetic alignment

Abstract

High-quality speech articulatory databases are essential for advancing speech science and technology research. However, the lack of standardized annotations limits their full potential use and broad accessibility. In this context, we introduce 75-Speaker Annot-16, a comprehensive annotation dataset derived from the 75-Speaker vocal tract MRI database. Annot-16 provides phonetic alignments, articulator contour annotations, and handmade ground-truth articulator contours. Our annotation process integrates automated algorithms with expert verification to ensure accuracy and efficiency. To demonstrate its utility, we establish three benchmark tasks: speech phoneme recognition, articulatory contour segmentation, and articulatory phoneme recognition. Annot-16 can serve as a valuable resource for speech modeling, computer vision, and cross-modal learning, bridging engineering applications, speech science, and linguistic research. Database webpage: https://sail.usc.edu/span/75speakers_annot/.

Date
February 5, 2026
Authors
Xuan Shi, Yubin Zhang, Yijing Lu, Marcus Ma, Tiantian Feng, Asterios Toutios, Haley Hsu, Louis Goldstein, Shrikanth Narayanan
Conference
Proc. Interspeech 2025
Pages
2175-2179