Publications
ModalityMirror: Enhancing Audio Classification in Modality Heterogeneity Federated Learning via Multimodal Distillation
Abstract
Multimodal federated learning (FL) frequently encounters client modality heterogeneity, which leads to poor performance for the secondary modality in multimodal learning. The problem is particularly prevalent in audiovisual learning, where audio is often assumed to be the weaker modality in recognition tasks. To address this challenge, we introduce ModalityMirror, which improves audio model performance by leveraging knowledge distillation from an audiovisual federated learning model. ModalityMirror involves two phases: a modality-wise FL stage that aggregates unimodal encoders, and a federated knowledge distillation stage on multimodal clients that trains a unimodal student model. Our results demonstrate that ModalityMirror significantly improves audio classification compared to state-of-the-art FL methods such as Harmony, particularly in audiovisual FL when video is missing. Our approach unlocks the potential …
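The two phases named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the aggregation rule or the distillation loss, so this example assumes sample-size-weighted averaging (FedAvg-style) for the modality-wise stage and a temperature-softened KL divergence for the distillation stage. All function names, the client dictionary layout, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of flat parameter lists (assumed rule)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def modality_wise_aggregate(clients):
    """Phase 1 sketch: average each unimodal encoder over the clients holding it."""
    aggregated = {}
    for modality in ("audio", "video"):
        holders = [c for c in clients if modality in c["encoders"]]
        if holders:
            aggregated[modality] = fedavg(
                [c["encoders"][modality] for c in holders],
                [c["num_samples"] for c in holders],
            )
    return aggregated

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Phase 2 sketch: KL(teacher || student) on softened outputs (assumed loss)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Toy federation: the second client has audio only (video missing).
clients = [
    {"num_samples": 100, "encoders": {"audio": [1.0, 2.0], "video": [0.0, 4.0]}},
    {"num_samples": 300, "encoders": {"audio": [3.0, 6.0]}},
]
global_encoders = modality_wise_aggregate(clients)
```

In this sketch the audio encoder is averaged over both clients while the video encoder comes only from the client that has video, which is one simple way to handle modality heterogeneity. On a multimodal client, `distillation_loss` would be evaluated with the audiovisual model's logits as the teacher and the audio-only model's logits as the student.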
Metadata
- publication
- Proceedings of the 35th Workshop on Network and Operating System Support for …, 2025
- year
- 2025
- publication date
- 2025/3/31
- authors
- Tiantian Feng, Tuo Zhang, Salman Avestimehr, Shrikanth Narayanan
- link
- https://dl.acm.org/doi/abs/10.1145/3712678.3721885
- resource_link
- https://dl.acm.org/doi/pdf/10.1145/3712678.3721885
- book
- Proceedings of the 35th Workshop on Network and Operating System Support for Digital Audio and Video
- pages
- 78-83