DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution


Fine-tuning large-scale pre-trained models is inherently a resource-intensive task. While it can enhance the capabilities of the model, it also incurs substantial computational costs, posing challenges to its practical application in downstream tasks. Existing parameter-efficient fine-tuning (PEFT) methods such as Low-Rank Adaptation (LoRA) rely on a bypass framework that ignores the differing parameter-budget requirements across weight matrices, which may lead to suboptimal fine-tuning outcomes. To address this issue, we introduce the Dynamic Low-Rank Adaptation (DoRA) method. DoRA decomposes high-rank LoRA layers into structured single-rank components, allowing the parameter budget to be dynamically pruned during training based on each component's importance to the specific task, thereby making the most of a limited parameter budget. Experimental results demonstrate that DoRA achieves performance competitive with LoRA and full-model fine-tuning, and outperforms various strong baselines under the same storage parameter budget. Our code is publicly available.
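To make the idea concrete, here is a minimal sketch of a DoRA-style layer as described in the abstract: a rank-r LoRA update decomposed into r single-rank components, each with a gate, so that low-importance components can be pruned during training. All names, the importance proxy, and the pruning rule are my assumptions for illustration, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

class DoRALayerSketch(nn.Module):
    """Hypothetical sketch (not the paper's code): a rank-r LoRA update
    written as a sum of r rank-1 components gate_i * B_i^T A_i, so that
    individual components can be pruned to fit a parameter budget."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # r "down" vectors
        self.B = nn.Parameter(torch.zeros(rank, d_out))        # r "up" vectors (zero-init, as in LoRA)
        self.gate = nn.Parameter(torch.ones(rank))             # per-component gate
        self.scaling = alpha / rank

    def delta_weight(self) -> torch.Tensor:
        # sum_r gate_r * B[r]^T A[r]  ->  (d_out, d_in) low-rank weight update
        return torch.einsum("r,rd,ro->od", self.gate, self.A, self.B) * self.scaling

    def importance(self) -> torch.Tensor:
        # Crude importance proxy (an assumption): |gate_i| * ||A_i|| * ||B_i||.
        return self.gate.abs() * self.A.norm(dim=1) * self.B.norm(dim=1)

    def prune_to_budget(self, budget: int) -> None:
        # Keep only the top-`budget` components by importance; zero the rest,
        # so the remaining update has rank at most `budget`.
        with torch.no_grad():
            keep = self.importance().topk(budget).indices
            mask = torch.zeros_like(self.gate)
            mask[keep] = 1.0
            self.gate.mul_(mask)
```

After `prune_to_budget(k)` the update `delta_weight()` has rank at most `k`, so the per-matrix budget can be reallocated across layers by pruning unimportant components in some layers while leaving others at full rank.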

https://arxiv.org/abs/2405.17357

https://arxiv.org/pdf/2405.17357.pdf
