Authors: Heitor R. Guimarães, Arthur Pimentel, Anderson R. Avila, Mehdi Rezagholizadeh, Boxing Chen, Tiago H. Falk
Published on: March 13, 2024
Impact Score: 7.6
arXiv code: 2403.08654
Summary
- What is new: Introduces RobustDistiller, a knowledge distillation recipe that reduces model size while increasing robustness to noise and reverberation in self-supervised speech representation models.
- Why this is important: Existing self-supervised speech representation models are too large for edge applications and are not robust against noise and reverberation.
- What the research proposes: RobustDistiller combines knowledge distillation with a multi-task learning objective to produce smaller, noise-invariant speech representation models (see the sketch after this list).
- Results: The 23M-parameter Student model performed comparably to the 95M-parameter Teacher across twelve downstream tasks, while improving robustness to noise and reverberation.
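
The sketch below illustrates the core idea in PyTorch: the Student encodes a noisy or reverberant utterance while the frozen Teacher encodes the clean version, and a small enhancement head gives the Student an additional denoising objective on top of the distillation loss. Module names, loss weights, and the log-mel reconstruction target are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of distillation with a multi-task denoising objective
# (assumed details: module names, loss weights, reconstruction target).
import torch
import torch.nn.functional as F

def distillation_loss(student_feats, teacher_feats):
    """L1 plus cosine distance between student and teacher representations."""
    l1 = F.l1_loss(student_feats, teacher_feats)
    cos = 1.0 - F.cosine_similarity(student_feats, teacher_feats, dim=-1).mean()
    return l1 + cos

def robust_distiller_step(teacher, student, enhance_head,
                          clean_wav, noisy_wav, alpha=1.0, beta=1.0):
    # Teacher encodes the clean signal; its outputs are frozen targets.
    with torch.no_grad():
        teacher_feats = teacher(clean_wav)          # (B, T, D)

    # Student only ever sees the degraded signal.
    student_feats = student(noisy_wav)              # (B, T, D)

    # (1) Knowledge distillation: match the teacher's clean representations.
    kd = distillation_loss(student_feats, teacher_feats)

    # (2) Multi-task enhancement: reconstruct a clean target (here a log
    #     magnitude spectrogram of the clean waveform) from student features.
    clean_target = torch.log1p(
        torch.stft(clean_wav, n_fft=400, hop_length=320,
                   return_complex=True).abs()
    ).transpose(1, 2)                               # (B, T', F)
    recon = enhance_head(student_feats)             # (B, T, F)
    num_frames = min(recon.size(1), clean_target.size(1))
    enh = F.l1_loss(recon[:, :num_frames], clean_target[:, :num_frames])

    return alpha * kd + beta * enh
```

The two weighted terms reflect the paper's premise: the distillation term compresses the Teacher into a smaller Student, while the enhancement term pushes the Student's representations to be invariant to the added noise and reverberation.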
Technical Details
Technological frameworks used: Not specified
Models used: RobustDistiller, DPWavLM
Data used: Not specified
Potential Impact
Speech recognition technology providers, edge computing device manufacturers, and companies relying on voice interfaces and audio analysis.
Want to implement this idea in a business?
We have generated a startup concept here: ClearSpeak.