FLM-TopK: expediting federated large language model tuning by sparsifying intervalized gradients

Wenqi Qiu, Yipeng Zhou, Jinzhi Wang, Quan Z. Sheng, Laizhong Cui*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

Abstract

The past few years have witnessed the unprecedented capability of large language models (LLMs). To adapt LLMs to various downstream tasks, fine-tuning methods such as Low-Rank Adaptation (LoRA) have been proposed to tune LLMs efficiently. Meanwhile, federated LLM tuning has emerged for refining LLMs with clients that own private data. During federated tuning, the server and clients frequently exchange fine-tuning gradients over the Internet, giving rise to a communication challenge. To overcome this challenge, most existing works employ quantization methods for compressing gradients, because sparsification methods such as TopK incur heavy overhead for transmitting the position IDs (PIDs) of sparsified gradients. In this work, to expedite federated LLM tuning with a higher compression rate, we design the Federated LLM Tuning with TopK (FLM-TopK) algorithm. Specifically, FLM-TopK intervalizes gradients before compression. TopK is then applied separately to the gradients in each interval so that the overhead of representing PIDs is constrained. To optimize our algorithm, we empirically study the distribution of gradients, which follows a Gaussian distribution. Based on this distribution, we formulate an optimization problem that minimizes the compression error by jointly optimizing the interval size and the per-interval sparsification rate. We prove that this non-convex problem can be approximately solved by alternating optimization. To demonstrate the superiority of FLM-TopK, we conduct extensive experiments on nine public datasets. The results show that FLM-TopK significantly outperforms state-of-the-art (SOTA) baselines, achieving a 6.42%-18.87% improvement in accuracy and a 17.07%-44.44% reduction in communication traffic.
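To illustrate the idea of intervalized TopK described in the abstract, the following is a minimal Python sketch, not the authors' implementation: the flattened gradient is split into fixed-size intervals, the largest-magnitude entries in each interval are kept, and positions are stored as small local IDs relative to the interval start, which is what bounds the PID overhead. The function name intervalized_topk and the parameters interval_size and keep_ratio are illustrative assumptions; the paper additionally optimizes these quantities, which is not shown here.

```python
import numpy as np

def intervalized_topk(grad, interval_size=256, keep_ratio=0.05):
    """Hypothetical sketch of intervalized TopK sparsification.

    Splits the flattened gradient into intervals of `interval_size` entries and
    keeps the `keep_ratio` fraction with the largest magnitude in each interval.
    Position IDs are stored relative to each interval, so each PID needs only
    about log2(interval_size) bits (uint16 assumes interval_size <= 65536).
    """
    grad = np.asarray(grad, dtype=np.float32).ravel()
    k = max(1, int(round(keep_ratio * interval_size)))
    kept_values, kept_local_pids, interval_starts = [], [], []
    for start in range(0, grad.size, interval_size):
        chunk = grad[start:start + interval_size]
        k_chunk = min(k, chunk.size)
        # Indices of the k largest-magnitude entries within this interval only.
        local_ids = np.argpartition(np.abs(chunk), -k_chunk)[-k_chunk:]
        kept_values.append(chunk[local_ids])
        kept_local_pids.append(local_ids.astype(np.uint16))
        interval_starts.append(start)
    return kept_values, kept_local_pids, interval_starts

# Usage example on a toy gradient vector.
rng = np.random.default_rng(0)
values, local_pids, starts = intervalized_topk(rng.normal(size=1000),
                                               interval_size=250,
                                               keep_ratio=0.05)
```

Under these assumptions, only the kept values and their short local PIDs (plus the implicit interval boundaries) would need to be transmitted, in contrast to plain TopK over the whole gradient, where each PID must index the full parameter vector.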

Original language: English
Title of host publication: IEEE INFOCOM 2025 - IEEE Conference on Computer Communications
Place of publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 10
ISBN (Electronic): 9798331543051
ISBN (Print): 9798331543068
Publication status: Published - 2025
Event: 2025 IEEE Conference on Computer Communications, INFOCOM 2025 - London, United Kingdom
Duration: 19 May 2025 - 22 May 2025

Publication series

ISSN (Print): 0743-166X
ISSN (Electronic): 2641-9874

Conference

Conference: 2025 IEEE Conference on Computer Communications, INFOCOM 2025
Country/Territory: United Kingdom
City: London
Period: 19/05/25 - 22/05/25
