Publications

ModalityMirror: Enhancing Audio Classification in Modality Heterogeneity Federated Learning via Multimodal Distillation

Abstract

Multimodal federated learning (FL) frequently encounters client modality heterogeneity, which degrades performance on the secondary modality. The problem is particularly prevalent in audiovisual learning, where audio is often assumed to be the weaker modality in recognition tasks. To address this challenge, we introduce ModalityMirror, which improves audio model performance through knowledge distillation from an audiovisual federated learning model. ModalityMirror involves two phases: a modality-wise FL stage that aggregates unimodal encoders, and a federated knowledge distillation stage on multimodal clients that trains a unimodal student model. Our results demonstrate that ModalityMirror significantly improves audio classification compared to state-of-the-art FL methods such as Harmony, particularly in audiovisual FL when video is missing. Our approach unlocks the potential …
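
The two phases named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the aggregation rule or the distillation loss, so the sketch assumes sample-weighted averaging (FedAvg-style) for the modality-wise stage and a temperature-softened KL divergence for the distillation stage. All function names, the client dictionary layout, and the toy numbers are hypothetical.

```python
import math

def fedavg(params, sizes):
    """Sample-weighted average of equal-length parameter vectors (assumed rule)."""
    total = sum(sizes)
    dim = len(params[0])
    return [sum(p[i] * n for p, n in zip(params, sizes)) / total for i in range(dim)]

def modality_wise_aggregate(clients):
    """Phase 1 sketch: average each modality's encoder over the clients that hold it."""
    modalities = sorted({m for c in clients for m in c["encoders"]})
    aggregated = {}
    for m in modalities:
        holders = [c for c in clients if m in c["encoders"]]
        aggregated[m] = fedavg(
            [c["encoders"][m] for c in holders],
            [c["n"] for c in holders],
        )
    return aggregated

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Phase 2 sketch: KL(teacher || student) on softened outputs (assumed loss)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical toy clients: one audiovisual, one audio-only.
clients = [
    {"n": 10, "encoders": {"audio": [1.0, 2.0], "video": [0.0, 4.0]}},
    {"n": 30, "encoders": {"audio": [3.0, 6.0]}},
]
global_encoders = modality_wise_aggregate(clients)

# On a multimodal client, the audiovisual model acts as teacher
# and the audio-only model as student (logits here are made up).
loss_same = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
loss_diff = distillation_loss([2.0, 0.5, -1.0], [0.0, 0.0, 0.0])
```

In this sketch the video encoder is averaged only over the client that actually has video, while the audio encoder draws on both clients; the distillation loss is zero when the student matches the teacher and positive otherwise.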

Metadata

publication
Proceedings of the 35th Workshop on Network and Operating System Support for …, 2025
year
2025
publication date
2025/3/31
authors
Tiantian Feng, Tuo Zhang, Salman Avestimehr, Shrikanth Narayanan
link
https://dl.acm.org/doi/abs/10.1145/3712678.3721885
resource_link
https://dl.acm.org/doi/pdf/10.1145/3712678.3721885
book
Proceedings of the 35th Workshop on Network and Operating System Support for Digital Audio and Video
pages
78-83