Publications – Information Sciences Institute

ModalityMirror: Enhancing Audio Classification in Modality Heterogeneity Federated Learning via Multimodal Distillation

Abstract

Multimodal Federated Learning (FL) frequently encounters client modality heterogeneity, which degrades performance on the secondary modality in multimodal learning. The problem is particularly prevalent in audiovisual learning, where audio is often assumed to be the weaker modality in recognition tasks. To address this challenge, we introduce ModalityMirror, which improves audio model performance by leveraging knowledge distillation from an audiovisual federated learning model. ModalityMirror involves two phases: a modality-wise FL stage that aggregates unimodal encoders, and a federated knowledge distillation stage on multimodal clients that trains a unimodal student model. Our results demonstrate that ModalityMirror significantly improves audio classification compared to state-of-the-art FL methods such as Harmony, particularly in audiovisual FL when video is missing. Our approach unlocks the potential …
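The two phases named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a FedAvg-style weighted average for the modality-wise aggregation stage and a standard temperature-softened KL distillation loss for the second stage, neither of which the abstract specifies. The function names (`fedavg`, `distillation_loss`) and all numeric values are hypothetical.

```python
import math

def fedavg(client_weights, client_sizes):
    """Weighted average of client parameter vectors (assumed FedAvg-style)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: a generic distillation objective, not necessarily the
    loss used in the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Phase 1 (sketch): aggregate audio-encoder parameters from two clients
# holding 1 and 3 samples respectively (toy values).
audio_encoder = fedavg([[0.0, 2.0], [4.0, 6.0]], client_sizes=[1, 3])

# Phase 2 (sketch): on a multimodal client, the audiovisual teacher's
# logits supervise the audio-only student's logits (toy values).
loss = distillation_loss([1.0, 0.5, -0.5], [2.0, 0.1, -1.0])
```

In this sketch the teacher would be the audiovisual model and the student the audio-only model; the loss is zero when both produce identical logits and grows as their softened predictions diverge.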

Metadata

publication: Proceedings of the 35th Workshop on Network and Operating System Support for …, 2025
year: 2025
publication date: 2025/3/31
authors: Tiantian Feng, Tuo Zhang, Salman Avestimehr, Shrikanth Narayanan
link: https://dl.acm.org/doi/abs/10.1145/3712678.3721885
resource_link: https://dl.acm.org/doi/pdf/10.1145/3712678.3721885
book: Proceedings of the 35th Workshop on Network and Operating System Support for Digital Audio and Video
pages: 78-83