Deduplication in SSDs: model and quantitative analysis

Jonghwa Kim*, Choonghyun Lee, Sangyup Lee, Ikjoon Son, Jongmoo Choi, Sungroh Yoon, Hu Ung Lee, Sooyong Kang, Youjip Won, Jaehyuk Cha

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution

49 Citations (Scopus)

Abstract

In NAND flash-based SSDs, deduplication can effectively address three critical issues: cell lifetime, write performance, and garbage collection overhead. However, deduplication at the SSD device level differs in many respects from deduplication in enterprise storage systems; its success depends on properly exploiting the very limited underlying hardware resources and the workload characteristics of SSDs. In this paper, we develop a novel deduplication framework tailored for SSDs. We first develop an analytical model that lets us calculate the minimum duplication rate required to achieve a performance gain for a given deduplication overhead. Then, we explore a number of design choices for implementing deduplication components in hardware or software. As a result, we propose two acceleration techniques: sampling-based filtering and recency-based fingerprint management. The former selectively applies deduplication based on sampling, and the latter makes effective use of limited controller memory while maximizing the deduplication ratio. We prototype the proposed framework on three physical hardware platforms and investigate deduplication efficiency across various CPU capabilities and hardware/software alternatives. Experimental results show a duplication rate ranging from 4% to 51%, with an average of 17%, for the nine workloads considered in this work. The response time of a write request improves by up to 48%, with an average of 15%, while the lifespan of SSDs is expected to increase by up to 4.1 times, with an average of 2.4 times.
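The two ideas the abstract names, a break-even duplication rate and recency-based fingerprint management, can be illustrated with a minimal sketch. This is not the paper's model or implementation: the cost model (a fixed per-write deduplication overhead, with duplicate writes skipping the flash program step), the SHA-1 fingerprint, the LRU eviction policy, and every name and number below are assumptions chosen for illustration only.

```python
import hashlib
from collections import OrderedDict


def break_even_duplication_rate(dedup_overhead_us, program_latency_us):
    """Smallest duplicate fraction d at which dedup stops costing write latency.

    Assumed toy model (not taken from the paper):
      without dedup: every write costs program_latency_us
      with dedup:    every write pays dedup_overhead_us, and only the
                     non-duplicate fraction (1 - d) is programmed to flash
    Gain requires overhead + (1 - d) * program < program,
    which rearranges to d > overhead / program.
    """
    if program_latency_us <= 0:
        raise ValueError("program_latency_us must be positive")
    if dedup_overhead_us < 0:
        raise ValueError("dedup_overhead_us must be non-negative")
    return dedup_overhead_us / program_latency_us


class RecencyFingerprintTable:
    """Bounded fingerprint store that evicts the least recently used entry.

    Hypothetical stand-in for a fingerprint manager that must fit in
    limited controller memory.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._entries = OrderedDict()  # fingerprint -> physical page number

    def write(self, data, physical_page):
        """Return (page that holds the data, True if the write was a duplicate)."""
        fingerprint = hashlib.sha1(data).digest()
        if fingerprint in self._entries:
            # Duplicate: refresh recency and reuse the already-stored page.
            self._entries.move_to_end(fingerprint)
            return self._entries[fingerprint], True
        # New content: remember it, evicting the oldest entry if over capacity.
        self._entries[fingerprint] = physical_page
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)
        return physical_page, False
```

Under these assumptions, a hypothetical 50 µs deduplication overhead against a 200 µs program latency gives a break-even duplication rate of 0.25, so a workload would need more than a quarter of its writes to be duplicates before deduplication shortens the average write.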

Original language: English
Title of host publication: MSST 2012
Subtitle of host publication: IEEE 28th Symposium on Mass Storage Systems and Technologies 2012
Place of Publication: Piscataway, New Jersey
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 12
ISBN (Electronic): 9781467317474
ISBN (Print): 9781467317450
Publication status: Published - 2012
Externally published: Yes
Event: 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies, MSST 2012 - Pacific Grove, CA, United States
Duration: 16 Apr 2012 to 20 Apr 2012

Other

Other: 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies, MSST 2012
Country: United States
City: Pacific Grove, CA
Period: 16/04/12 to 20/04/12
