Communication compression techniques in distributed deep learning: a survey

Zeqin Wang, Ming Wen, Yuedong Xu*, Yipeng Zhou, Jessie Hui Wang, Liang Zhang

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

21 Citations (Scopus)

Abstract

Training data and neural network models are growing increasingly large, making the training time of deep learning unbearably long on a single machine. To reduce the computation and storage burdens, distributed deep learning has been proposed to collaboratively train a large neural network model across multiple computing nodes in parallel. However, the unbalanced development of computation and communication capabilities has caused training time to be dominated by communication time, making communication overhead a major obstacle to efficient distributed deep learning. Communication compression is an effective way to alleviate this overhead, and it has evolved from simple random sparsification or quantization to versatile strategies and data structures. In this survey, existing communication compression techniques are reviewed and classified to provide a bird's-eye view. The main properties of each class of compression methods are analyzed, and their applications or theoretical convergence guarantees are described where relevant. This survey aims to help researchers and engineers understand up-to-date achievements in communication compression techniques that accelerate the training of large deep learning models.
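To make the two basic compression operators named in the abstract concrete, the sketch below illustrates top-k gradient sparsification and uniform stochastic quantization. It is a minimal illustration, not the survey's own method: the function names and the parameters k and num_levels are hypothetical choices, and real systems would additionally handle error feedback, encoding of indices, and aggregation across workers.

```python
# Minimal sketch of two gradient compression operators (illustrative only).
import numpy as np

def topk_sparsify(grad, k):
    """Keep only the k largest-magnitude entries of a gradient; zero the rest."""
    flat = grad.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]   # indices of the top-k magnitudes
    values = flat[idx]
    # In practice only (idx, values, shape) would be transmitted, not the dense tensor.
    return idx, values, grad.shape

def stochastic_quantize(grad, num_levels=16):
    """Unbiased uniform stochastic quantization onto `num_levels` levels per tensor."""
    scale = np.abs(grad).max()
    if scale == 0:
        return np.zeros(grad.shape, dtype=np.int16), scale
    normalized = np.abs(grad) / scale * (num_levels - 1)
    lower = np.floor(normalized)
    # Round up with probability equal to the fractional part, so the
    # dequantized value equals the original gradient in expectation.
    levels = lower + (np.random.random(grad.shape) < (normalized - lower))
    return (np.sign(grad) * levels).astype(np.int16), scale

# Example: compress a synthetic gradient before sending it to an aggregator.
g = np.random.randn(1_000_000).astype(np.float32)
idx, vals, shape = topk_sparsify(g, k=10_000)        # ~1% of entries survive
q, s = stochastic_quantize(g, num_levels=16)         # 4-bit-style representation
g_hat = q.astype(np.float32) / (16 - 1) * s          # dequantized estimate of g
```

The receiver reconstructs an approximate gradient from the sparse index/value pairs or from the integer levels and the per-tensor scale, trading a small accuracy loss for far fewer transmitted bits.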

Original language: English
Article number: 102927
Pages (from-to): 1-26
Number of pages: 26
Journal: Journal of Systems Architecture
Volume: 142
DOIs
Publication status: Published - Sept 2023

Keywords

  • Distributed deep learning
  • Communication compression
  • Sparsification
  • Quantization
