Federating Dynamic Models using Early-Exit Architectures for Automatic Speech Recognition on Heterogeneous Clients


Automatic speech recognition models require large amounts of speech recordings for training. However, collecting such data is often cumbersome and raises privacy concerns. Federated learning has been widely used as an effective decentralized technique that collaboratively learns a shared prediction model while keeping the data local on different clients. Unfortunately, client devices often have limited computation and communication resources, leading to practical difficulties for large models. In addition, the heterogeneity of edge devices makes it sub-optimal to generate a single model that fits all of them. Unlike the recent literature, where multiple models with different architectures are used, in this work we propose using dynamic architectures which, employing early-exit solutions, can adapt their processing (i.e., the layers traversed) depending on the input and on the operating conditions. This solution falls within the realm of partial training methods and brings two benefits: a single model can be used on a variety of devices, and federating the models after local training is straightforward. Experiments on public datasets show that our proposed approach is effective and can be combined with basic federated learning strategies.
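
To make the partial-training idea concrete, here is a minimal PyTorch sketch of the mechanism the abstract describes. All names (`EarlyExitEncoder`, `federated_round`), the toy linear blocks, the per-exit loss, and the averaging rule are illustrative assumptions on our part; the paper's actual ASR architecture and aggregation details are not reproduced here. The key point: a client forwards only through its first `depth` blocks and their exit heads, so local training touches just those parameters, and the server averages each parameter only over the clients that updated it.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class EarlyExitEncoder(nn.Module):
    """Toy stack of blocks with a classifier ("exit") after each one.

    A low-resource client traverses (and trains) only its first few
    blocks; a powerful client uses the full depth. The linear blocks
    and all sizes are placeholders, not the paper's ASR model.
    """

    def __init__(self, num_blocks=4, dim=64, vocab=32):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
            for _ in range(num_blocks)
        )
        self.exits = nn.ModuleList(
            nn.Linear(dim, vocab) for _ in range(num_blocks)
        )

    def forward(self, x, max_depth=None):
        depth = max_depth or len(self.blocks)
        logits = []
        for block, exit_head in zip(self.blocks[:depth], self.exits[:depth]):
            x = block(x)
            logits.append(exit_head(x))  # one prediction per traversed exit
        return logits


def federated_round(global_model, client_depths, client_batches, lr=0.1):
    """One FedAvg-style round with partial training: each client updates
    only the blocks/exits it traversed; the server averages every
    parameter over the clients that actually touched it."""
    sums, counts = {}, {}
    for depth, (x, y) in zip(client_depths, client_batches):
        local = copy.deepcopy(global_model)
        opt = torch.optim.SGD(local.parameters(), lr=lr)
        # Sum the losses of all traversed exits (a common early-exit loss).
        loss = sum(F.cross_entropy(l, y) for l in local(x, max_depth=depth))
        opt.zero_grad()
        loss.backward()
        opt.step()
        for name, param in local.named_parameters():
            block_idx = int(name.split(".")[1])  # "blocks.K..." / "exits.K..."
            if block_idx < depth:  # this parameter was actually trained
                sums[name] = sums.get(name, 0) + param.detach()
                counts[name] = counts.get(name, 0) + 1
    with torch.no_grad():
        for name, param in global_model.named_parameters():
            if name in counts:  # layers no client reached keep global weights
                param.copy_(sums[name] / counts[name])
    return global_model


# Example: three heterogeneous clients with depth budgets 1, 2 and 4.
model = EarlyExitEncoder()
batch = (torch.randn(8, 64), torch.randint(0, 32, (8,)))
model = federated_round(model, [1, 2, 4], [batch, batch, batch])
```

Because every client trains a prefix of the same shared network, no parameter mapping between heterogeneous architectures is needed, which is why federating after local training is straightforward in this setting.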

https://arxiv.org/abs/2405.17376

https://arxiv.org/pdf/2405.17376.pdf
