Michael Stumm: Publications

Paper Details

Reference:

Siying Dong, Mark Callaghan, Leonidas Galan, Dhruba Borthakur, Tony Savor, and Michael Stumm,
"Optimizing space amplification in RocksDB",
In Proceedings of the Conference on Innovative Data Systems Research (CIDR'17), Santa Cruz, CA, USA, January 2017, published online.

Abstract:

RocksDB is an embedded, high-performance, persistent key-value storage engine developed at Facebook. Much of our current focus in developing and configuring RocksDB is to prioritize resource efficiency over more standard performance metrics, such as response time latency and throughput, as long as they remain acceptable. In particular, we optimize space efficiency as long as read/write latencies are able to meet target service-level requirements for the intended workloads, because storage space is most often the bottleneck when using Flash SSDs under typical production workloads at Facebook. RocksDB uses Log-structured Merge Trees (LSM) to obtain significant space efficiency and better write throughput while still being good at read performance.

We describe how we apply a number of approaches to reduce storage usage in RocksDB. We discuss how we are able to trade off storage efficiency and CPU overhead, as well as read and write amplification. Based on results of experimental evaluations of MySQL with RocksDB as the embedded storage engine (using TPC-C and LinkBench benchmarks) and based on measurements taken from production databases, we show that RocksDB uses less than half the storage that InnoDB uses, yet performs well and in many cases even better than the B-tree-based InnoDB storage engine. To the best of our knowledge, this is the first time an LSM-based storage engine has shown competitive performance when running OLTP workloads at large scale.

Keywords:

MySQL, Log-structured merge tree, LSM, RocksDB, space amplification, key-value storage, write amplification

BibTeX:

@inproceedings(Dong-CIDR-16,
    author = {Siying Dong and Mark Callaghan and Leonidas Galan and Dhruba Borthakur and Tony Savor and Michael Stumm},
    title = {Optimizing space amplification in {RocksDB}},
    booktitle = {Proceedings of the Conference on Innovative Data Systems Research (\textbf{CIDR'17})},
    location = {Santa Cruz, CA, USA},
    month = {January},
    year = {2017},
    pages = {online},
    keywords = {MySQL, Log-structured merge tree, LSM, RocksDB, space amplification, key-value storage, write amplification}
)