Tailslayer, a new open-source C++ library, targets tail-latency spikes in RAM reads, which are often blamed on DRAM refresh stalls. It replicates data across independent memory channels and issues hedged reads, returning whichever replica responds first. The project relies on reverse-engineered, undocumented DRAM channel scrambling offsets across AMD, Intel, and AWS Graviton, and ships tooling to profile refresh timing plus a HedgedReader template for integrating triggers and result handling. Benchmarks and discussion emphasize a key trade-off: lowering long-tail latency can raise median latency and add replication overhead, prompting calls for clearer documentation and stronger evidence for performance claims.
tailslayer: Library for reducing tail latency in RAM reads
Tailslayer is a C++ library that cuts tail latency for RAM reads by replicating data across independent DRAM channels with uncorrelated refresh schedules and issuing hedged reads to return the fastest response. It leverages undocumented channel scrambling offsets observed on AMD, Intel, and AWS Graviton platforms to place replicas on channels with different refresh timings. Users include the header hedged_reader.hpp and provide a signal function (to trigger reads) and a final work function (to handle results); the library duplicates inserts across replicas and pins per-replica workers to cores. The repo includes examples, benchmarks, and a discovery toolkit for measuring DRAM refresh behavior. This matters for latency-sensitive systems where DRAM refresh stalls cause long-tail read delays.
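The replicate-and-race idea described above can be sketched in a few lines of C++. This is only an illustration of the hedged-read technique, not Tailslayer's actual API: the names ReplicatedTable and hedged_read are hypothetical, the two copies are ordinary heap vectors rather than buffers placed on separate DRAM channels, and the threads are spawned per read instead of being long-lived workers pinned to cores.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical illustration only: these names are NOT Tailslayer's API.
// Two full copies of the data. A real implementation would place each copy
// on a different DRAM channel; here they are plain heap allocations.
struct ReplicatedTable {
    std::array<std::vector<std::uint64_t>, 2> replicas;

    explicit ReplicatedTable(std::size_t n) {
        for (auto& r : replicas) r.assign(n, 0);
    }

    // Every insert is duplicated so that both replicas stay identical.
    void insert(std::size_t index, std::uint64_t value) {
        assert(index < replicas[0].size());
        for (auto& r : replicas) r[index] = value;
    }
};

// Issue the same read against both replicas and keep the first answer.
// Spawning threads per read is far too slow for real use; it only makes
// the "first responder wins" logic easy to see.
std::uint64_t hedged_read(const ReplicatedTable& table, std::size_t index) {
    std::atomic<bool> claimed{false};
    std::uint64_t result = 0;

    auto worker = [&](std::size_t replica) {
        // Read through a volatile pointer so the load is actually performed.
        const volatile std::uint64_t* p = &table.replicas[replica][index];
        std::uint64_t value = *p;
        // Only the first worker to finish publishes its value.
        if (!claimed.exchange(true, std::memory_order_acq_rel)) {
            result = value;
        }
    };

    std::thread a(worker, 0), b(worker, 1);
    a.join();
    b.join();
    return result;  // written by exactly one worker, read after both joins
}
```

In this sketch, a caller would construct a ReplicatedTable, call insert for each value, and then call hedged_read wherever a single load would otherwise be used. The cost side of the trade-off is visible even here: memory use doubles and every read occupies two threads.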
Tailslayer is a newly posted open-source library (GitHub) that aims to reduce tail latency on RAM reads by changing how loads are issued and managed. Shared on Hacker News with an accompanying announcement and a demo video, the project claims to lower long-tail read latency at the cost of increased median latency. Commenters push back: one notes that the README and headers do not discuss this trade-off explicitly and criticizes the approach as shifting baseline latency by the same factor in order to reduce tails, while another disputes the presenter's claim that certain CPUs (Graviton) lack performance counters. The discussion highlights practical trade-offs and the need for clearer documentation and evidence. This matters for systems, databases, and low-latency services concerned with memory-access tail events.
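Because the debate turns on median versus tail latency, it helps to be precise about how each is computed from a set of measurements. The helper below is a minimal sketch using the nearest-rank percentile definition; the function name percentile is an assumption for illustration and is not taken from the Tailslayer repository or its benchmarks.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it. Takes the vector by value so
// the caller's data is left unsorted.
double percentile(std::vector<double> samples, double p) {
    assert(!samples.empty() && p >= 0.0 && p <= 100.0);
    std::sort(samples.begin(), samples.end());
    std::size_t rank = static_cast<std::size_t>(
        std::ceil(p / 100.0 * static_cast<double>(samples.size())));
    if (rank == 0) rank = 1;  // p == 0 maps to the minimum sample
    return samples[rank - 1];
}
```

Comparing percentile(latencies, 50) with percentile(latencies, 99) for a baseline run and a hedged run shows both sides of the trade-off at once: a technique can lower the 99th percentile while raising the 50th, which is the effect commenters wanted documented.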
Tailslayer is a C++ library that cuts RAM read tail latency by replicating data across independent DRAM channels with uncorrelated refresh schedules and issuing hedged reads to return the fastest result. It leverages undocumented channel scrambling offsets on AMD, Intel, and AWS Graviton. It currently supports two-channel replication (the benchmarks suggest N-way replication is possible) and provides a HedgedReader template that requires a signal function (when to trigger a read) and a final work function (how to process the returned value). Each replica's worker is pinned to a core, every insert is written to all copies, and tooling in discovery/ covers DRAM refresh-timing profiling and benchmarks. This matters for latency-sensitive systems where DRAM refresh stalls create rare but harmful tail latency spikes.