TY - JOUR
T1 - Glider
T2 - rethinking congestion control with deep reinforcement learning
AU - Xia, Zhenchang
AU - Wu, Libing
AU - Wang, Fei
AU - Liao, Xudong
AU - Hu, Haiyan
AU - Wu, Jia
AU - Wu, Dan
PY - 2023/1
Y1 - 2023/1
AB - Traditional congestion control protocols may fail to achieve consistently high performance over a wide range of networking environments because their hardwired policies are optimized for specific network conditions. In this paper, we depart from conventional wisdom and propose Glider, a new congestion control protocol that uses deep reinforcement learning to be more versatile and adaptive to dynamic environments. In particular, Glider uses a framework based on Deep Q-Network in which a sender keeps adapting its congestion control strategies by continuously interacting with the network environment. However, the sender transmits data continuously, which makes it challenging to apply reinforcement learning algorithms that require step-by-step state computation to congestion control. We therefore design a Dynamic Bisection Division Algorithm (DBDA) that discretizes the packet transmission process into steps, making Glider feasible for congestion control. Extensive experiments on Pantheon show that Glider adapts well to varying buffer sizes and is resilient to random loss. Moreover, on wide-area inter-data-center links, it achieves 6.4× and 1.4× higher throughput than TCP CUBIC and BBR, respectively, and performance comparable to other learning-based congestion control protocols in the literature.
KW - Congestion control
KW - Intelligent decision-making
KW - Reinforcement learning
KW - Deep learning
UR - http://www.scopus.com/inward/record.url?scp=85132840796&partnerID=8YFLogxK
DO - 10.1007/s11280-022-01018-1
M3 - Article
AN - SCOPUS:85132840796
SN - 1386-145X
VL - 26
SP - 115
EP - 137
JO - World Wide Web
JF - World Wide Web
IS - 1
ER -