
An empirical study on model pruning and quantization

Yuzhe Tian, Tom H. Luan, Xi Zheng*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

Abstract

In machine learning, model compression is vital for resource-constrained Internet of Things (IoT) devices such as unmanned aerial vehicles (UAVs) and smartphones. Several state-of-the-art (SOTA) compression methods exist, but few studies have evaluated them across different models and datasets. In this paper, we present an in-depth study of two SOTA model compression methods, pruning and quantization. We apply these methods to AlexNet, ResNet18, VGG16BN and VGG19BN on three well-known datasets: Fashion-MNIST, CIFAR-10, and UCI-HAR. Our study leads to three conclusions. First, on spatial-domain datasets (e.g. images), pruning followed by retraining preserves performance (average less than degrade) while reducing the model size (at compression rate). Second, on temporal-domain datasets (e.g. motion sensor data), performance degrades more (average about degrade). Third, the performance of quantization is related to the pruning rate and the network architecture. We also compare different clustering methods and show their impact on model accuracy and quantization ratio. Finally, we outline some directions for future research.

Original language: English
Title of host publication: Broadband communications, networks, and systems
Subtitle of host publication: 13th EAI International Conference, BROADNETS 2022, Virtual Event, March 12-13, 2023 proceedings
Editors: Wei Wang, Jun Wu
Place of Publication: Cham
Publisher: Springer, Springer Nature
Pages: 111-125
Number of pages: 15
ISBN (Electronic): 9783031404672
ISBN (Print): 9783031404665
Publication status: Published - 2023
Event: 13th EAI International Conference on Broadband Communications, Networks, and Systems - Virtual, Online
Duration: 12 Mar 2023 → 13 Mar 2023

Publication series

Name: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Publisher: Springer
Volume: 511
ISSN (Print): 1867-8211
ISSN (Electronic): 1867-822X

Conference

Conference: 13th EAI International Conference on Broadband Communications, Networks, and Systems
Abbreviated title: BROADNETS 2022
City: Virtual, Online
Period: 12/03/23 → 13/03/23

Keywords

  • Model compression
  • Deep neural network
  • Edge computing
