Deduplication Storage System:
How It Works and Why It Changes the Storage Industry
Date and time: 11.30am - 12.30pm, Thursday 20th August, 2009
Venue: 12.13.03 (Building 12, Level 13, Room 03)
Abstract:
Deduplication has emerged as the hottest technology in the storage industry. The latest deduplication storage systems can achieve a 10x to 30x compression ratio on backup data, with an inline multi-stream deduplication throughput of 1.5 Gbytes/sec (by contrast, a common compression tool such as gzip or WinZip achieves about 2-3x compression at about 30 Mbytes/sec on a typical server). Deduplication storage systems have now become the standard for backups and remote data replication in enterprise data centers. What is deduplication? How does a deduplication storage system work? Can deduplication storage systems go beyond the backup use cases?
This talk answers these questions by first giving an introduction to deduplication technology and then describing the internals of the Data Domain deduplication file system. We will give an in-depth discussion of how to solve the key technical challenge of achieving high deduplication throughput with minimal CPU and memory resources, and of how deduplication technology can impact nearline storage and primary storage systems.
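For readers new to the topic, the basic idea of deduplication can be sketched in a few lines of Python. This is only an illustrative toy, not the Data Domain design, which this announcement does not describe: the fixed 4096-byte chunk size, the use of SHA-256 as the chunk fingerprint, the in-memory dictionary, and the names DedupStore, write, read and dedup_ratio are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes (an assumption)


class DedupStore:
    """Toy in-memory store that keeps one copy of each unique chunk."""

    def __init__(self):
        self.chunks = {}        # fingerprint -> chunk bytes
        self.logical_bytes = 0  # total bytes written by clients

    def write(self, data: bytes) -> list:
        """Store data and return its recipe: an ordered list of fingerprints."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Only a chunk that has not been seen before consumes new space.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        self.logical_bytes += len(data)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original data from its recipe."""
        return b"".join(self.chunks[fp] for fp in recipe)

    def stored_bytes(self) -> int:
        """Bytes actually held after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())

    def dedup_ratio(self) -> float:
        """Logical bytes written divided by bytes actually stored."""
        return self.logical_bytes / self.stored_bytes()


if __name__ == "__main__":
    store = DedupStore()
    # Four distinct chunks, written ten times like ten unchanged nightly backups.
    backup = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))
    recipes = [store.write(backup) for _ in range(10)]
    assert store.read(recipes[0]) == backup
    print(store.dedup_ratio())  # prints 10.0
```

Writing the same unchanged data ten times stores it once, which is why repeated backups tend to deduplicate so well. A real system would also have to keep the fingerprint index fast when it no longer fits in memory, which is the kind of throughput challenge the abstract refers to.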
About the speaker:
Dr. Kai Li is a Charles Fitzmorris Professor at the Computer Science Department of Princeton University. His research interests include operating systems, computer architecture, storage systems, and large-scale data analysis and visualization systems. He has led several research projects at Princeton, including the Shared Virtual Memory project, which studies how to build shared memory on a cluster without physically shared memory; the Scalable I/O project, which attacks I/O bottleneck problems for supercomputers; the Scalable High-performance Really Inexpensive MultiProcessor (SHRIMP) project, which investigates how to build high-performance servers on a cluster; and the Scalable Display Wall project, which explores how to build and use a high-resolution, wall-size display system to visualize massive datasets.
During his sabbatical from
He joined
Seminar Organisation
Seminars are free and open to the general public. No booking is necessary. If you are interested in giving a presentation in this seminar series, or would like to suggest speakers, please contact Xiaodong Li, the seminar co-ordinator.