TeRM: extending RDMA-attached memory with SSD
RDMA-based in-memory storage systems offer high performance but are restricted by the capacity of physical memory. In this paper, we propose TeRM to extend RDMA-attached memory with SSD. TeRM achieves fast remote access on the SSD-extended memory by ...
Combining buffered I/O and direct I/O in distributed file systems
- Yingjin Qian,
- Marc-André Vef,
- Patrick Farrell,
- Andreas Dilger,
- Xi Li,
- Shuichi Ihara,
- Yinjin Fu,
- Wei Xue,
- André Brinkmann
Direct I/O allows I/O requests to bypass the Linux page cache and was introduced over 20 years ago as an alternative to the default buffered I/O mode. However, high-performance computing (HPC) applications still mostly rely on buffered I/O, even if ...
OmniCache: collaborative caching for near-storage accelerators
We propose OmniCache, a novel caching design for near-storage accelerators that combines near-storage and host memory capabilities to accelerate I/O and data processing. First, OmniCache introduces a "near-cache" approach, maximizing data access to the ...
Symbiosis: the art of application and kernel cache cooperation
We introduce Symbiosis, a framework for key-value storage systems that dynamically configures application and kernel cache sizes to improve performance. We integrate Symbiosis into three production systems - LevelDB, WiredTiger, and RocksDB - and, ...
Optimizing file systems on heterogeneous memory by integrating DRAM cache with virtual memory management
This paper revisits the usage of DRAM cache in DRAM-PM heterogeneous memory file systems. With a comprehensive analysis of existing file systems with cache-based and DAX-based designs, we show that both suffer from suboptimal performance due to excessive ...
Kosmo: efficient online miss ratio curve generation for eviction policy evaluation
In-memory caches play an important role in reducing the load on backend storage servers for many workloads. Miss ratio curves (MRCs) are an important tool for configuring these caches with respect to cache size and eviction policy. MRCs provide insight ...
I/O Passthru: upstreaming a flexible and efficient I/O path in Linux
- Kanchan Joshi,
- Anuj Gupta,
- Javier González,
- Ankit Kumar,
- Krishna Kanth Reddy,
- Arun George,
- Simon Lund,
- Jens Axboe
New storage interfaces continue to emerge fast on Non-Volatile Memory Express (NVMe) storage. Fitting these innovations in the general-purpose I/O stack of operating systems has been challenging and time-consuming. The NVMe standard is no longer limited ...
Metis: file system model checking via versatile input and state exploration
We present Metis, a model-checking framework designed for versatile, thorough, yet configurable file system testing in the form of input and state exploration. It uses a nondeterministic loop and a weighting scheme to decide which system calls and their ...
RFUSE: modernizing userspace filesystem framework through scalable kernel-userspace communication
With the advancement of storage devices and the increasing scale of data, filesystem design has transformed in response to this progress. However, implementing new features within an in-kernel filesystem is a challenging task due to development ...
The design and implementation of a capacity-variant storage system
We present the design and implementation of a capacity-variant storage system (CVSS) for flash-based solid-state drives (SSDs). CVSS aims to maintain high performance throughout the lifetime of an SSD by allowing storage capacity to gracefully reduce ...
I/O in a flash: evolution of ONTAP to low-latency SSDs
- Matthew Curtis-Maury,
- Ram Kesavan,
- V R Bharadwaj,
- Nikhil Mattankot,
- Vania Fang,
- Yash Trivedi,
- Kesari Mishra,
- Qin Li
Flash-based persistent storage media are capable of sub-millisecond latency I/O. However, a storage architecture optimized for spinning drives may contain software delays that make it impractical for use with such media. The NetApp® ONTAP® storage system ...
We ain't afraid of no file fragmentation: causes and prevention of its performance impact on modern flash SSDs
A few studies reported that fragmentation still adversely affects the performance of flash solid-state disks (SSDs) particularly through request splitting. This research investigates the fragmentation-induced performance degradation across three levels: ...
In-memory key-value store live migration with NetMigrate
Distributed key-value stores today require frequent key-value shard migration between nodes to react to dynamic workload changes for load balancing, data locality, and service elasticity. In this paper, we propose NetMigrate, a live migration approach ...
IONIA: high-performance replication for modern disk-based KV stores
We introduce IONIA, a novel replication protocol tailored for modern SSD-based write-optimized key-value (WO-KV) stores. Unlike existing replication approaches, IONIA carefully exploits the unique characteristics of SSD-based WO-KV stores. First, it ...
Physical vs. logical indexing with IDEA: inverted deduplication-aware index
In the realm of information retrieval, the need to maintain reliable term-indexing has grown more acute in recent years, with vast amounts of ever-growing online data searched by a large number of search-engine users and used for data mining and natural ...
MIDAS: minimizing write amplification in log-structured systems through adaptive group number and size configuration
Log-structured systems are widely used in various applications because of their high write throughput. However, high garbage collection (GC) cost is widely regarded as the primary obstacle to their wider adoption. There have been numerous attempts to ...
What's the story in EBS glory: evolutions and lessons in building cloud block store
- Weidong Zhang,
- Erci Xu,
- Qiuping Wang,
- Xiaolu Zhang,
- Yuesheng Gu,
- Zhenwei Lu,
- Tao Ouyang,
- Guanqun Dai,
- Wenwen Peng,
- Zhe Xu,
- Shuo Zhang,
- Dong Wu,
- Yilei Peng,
- Tianyun Wang,
- Haoran Zhang,
- Jiasheng Wang,
- Wenyuan Yan,
- Yuanyuan Dong,
- Wenhui Yao,
- Zhongjie Wu,
- Lingjun Zhu,
- Chao Shi,
- Yinhu Wang,
- Rong Liu,
- Junping Wu,
- Jiaji Zhu,
- Jiesheng Wu
In this paper, we qualitatively and quantitatively discuss the design choices, production experience, and lessons in building the Elastic Block Storage (EBS) at Alibaba Cloud over the past decade. To cope with hardware advancement and users' demands, we ...
ELECT: enabling erasure coding tiering for LSM-tree-based storage
Given the skewed nature of practical key-value (KV) storage workloads, distributed KV stores can adopt a tiered approach to support fast data access in a hot tier and persistent storage in a cold tier. To provide data availability guarantees for the hot ...
MinFlow: high-performance and cost-efficient data passing for I/O-intensive stateful serverless analytics
Serverless computing has revolutionized application deployment, obviating traditional infrastructure management and dynamically allocating resources on demand. A significant use case is I/O-intensive applications like data analytics, which widely employ ...
COLE: a column-based learned storage for blockchain systems
Blockchain systems suffer from high storage costs as every node needs to store and maintain the entire blockchain data. After investigating Ethereum's storage, we find that the storage cost mostly comes from the index, i.e., Merkle Patricia Trie (MPT). ...
Baleen: ML admission & prefetching for flash caches
- Daniel Lin-Kit Wong,
- Hao Wu,
- Carson Molder,
- Sathya Gunasekar,
- Jimmy Lu,
- Snehal Khandkar,
- Abhinav Sharma,
- Daniel S. Berger,
- Nathan Beckmann,
- Gregory R. Ganger
Flash caches are used to reduce peak backend load for throughput-constrained data center services, reducing the total number of backend servers required. Bulk storage systems are a large-scale example, backed by high-capacity but low-throughput hard ...
Seraph: towards scalable and efficient fully-external graph computation via on-demand processing
Fully-external graph computation systems exhibit optimal scalability by computing the ever-growing, large-scale graph with a constant amount of memory on a single machine. In particular, they keep the entire massive graph data in storage and iteratively ...
Index Terms
- Proceedings of the 22nd USENIX Conference on File and Storage Technologies