Linux 7.0 introduced a scheduler preemption change that caused severe PostgreSQL slowdowns on high-parallelism Graviton4 systems. An AWS engineer’s patch report showed pgbench throughput on a 96-vCPU instance fell from ~98.6k tps on Linux 6.x to ~50.8k tps on Linux 7.0. Profiling with perf found ~55% of CPU time spent in PostgreSQL’s buffer locking path (s_lock / GetVictimBuffer / StrategyGetBuffer), implicating increased preemption and contention during page buffer management. The article traces how the kernel’s new scheduling/preemption behavior interacts with PostgreSQL’s memory and page handling, why the regression appears at certain page sizes and workload scales, and discusses patches and mitigations proposed on the kernel mailing list. This matters for cloud DB performance, kernel maintainers, and operators running high-concurrency workloads.
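As a quick check on the reported figures, the relative drop can be computed directly. This is a minimal sketch using the rounded throughput values quoted above; the function and variable names are illustrative and do not come from the original report.

```python
def relative_drop(before_tps: float, after_tps: float) -> float:
    """Return the fractional throughput loss going from before_tps to after_tps."""
    if before_tps <= 0:
        raise ValueError("before_tps must be positive")
    return (before_tps - after_tps) / before_tps


# Rounded figures quoted in the summary (pgbench tps on the 96-vCPU instance).
linux_6x_tps = 98_600
linux_70_tps = 50_800

drop = relative_drop(linux_6x_tps, linux_70_tps)
print(f"Throughput drop: {drop:.1%}")  # prints "Throughput drop: 48.5%"
print(f"Retained throughput: {linux_70_tps / linux_6x_tps:.1%}")  # prints "Retained throughput: 51.5%"
```

With these inputs the loss is about 48.5%, which is where the "roughly halved" description comes from.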
A regression introduced in Linux 7.0 exposed a kernel preemption bug that breaks PostgreSQL under certain workloads. The issue surfaced when systems running the new kernel with older PostgreSQL releases encountered stability and correctness problems; containerized deployments amplify the risk because kernels and userland can be upgraded independently. The postmortem traces the regression to changes in kernel preemption timing and scheduling paths that violate PostgreSQL’s assumptions, outlines how it manifests, and discusses mitigation options (a kernel patch, PostgreSQL workarounds, or upgrading both layers). This matters because many production services rely on PostgreSQL and container patterns, so the regression puts data integrity and availability at risk until the kernel or the database code is fixed.
Linux 7.0 removed the PREEMPT_NONE preemption model on modern CPUs, leaving PREEMPT_LAZY and PREEMPT_FULL in its place, a change that halved PostgreSQL throughput on a 96-vCPU Graviton4 test. AWS engineer Salvatore Dipietro benchmarked pgbench under heavy parallel load and found Linux 6.x achieved ~98.6k TPS versus ~50.8k TPS on Linux 7.0. Profiling showed roughly 55% of CPU time stuck in PostgreSQL’s s_lock path while servicing buffer reads, exposing contention patterns that PREEMPT_LAZY alters compared with PREEMPT_NONE. The article explains why PostgreSQL’s shared buffer and page-access semantics interact badly with the new preemption behavior, why most server workloads are unaffected, and why this regression matters for high-parallel database deployments and cloud operators.
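Since the regression depends on which preemption model the kernel is running, operators may want to check it on their hosts. The sketch below assumes a kernel built with dynamic preemption that exposes `/sys/kernel/debug/sched/preempt` through debugfs, and assumes that file lists the available models with the active one in parentheses. Neither the path nor the format is taken from the article, the sample strings are hypothetical, and reading the file normally requires root.

```python
import re
from typing import Optional

# Assumed debugfs location; present only with dynamic preemption and debugfs mounted.
PREEMPT_PATH = "/sys/kernel/debug/sched/preempt"


def active_preempt_mode(text: str) -> Optional[str]:
    """Extract the parenthesized (active) model from a line such as
    'none voluntary full (lazy)'. Returns None if nothing is marked."""
    match = re.search(r"\(([^)\s]+)\)", text)
    return match.group(1) if match else None


def read_preempt_mode(path: str = PREEMPT_PATH) -> Optional[str]:
    """Read the active preemption model, or None if the file is unavailable."""
    try:
        with open(path, encoding="ascii") as handle:
            return active_preempt_mode(handle.read())
    except OSError:
        # File missing, debugfs not mounted, or insufficient permissions.
        return None


if __name__ == "__main__":
    mode = read_preempt_mode()
    print(f"Active preemption model: {mode or 'unknown'}")
```

A result of `lazy` or `full` on a high-core-count PostgreSQL host would indicate the configuration described above; `unknown` only means the file could not be read, not that the host is unaffected.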
A detailed analysis explains why Linux 7.0 caused a major drop in PostgreSQL benchmark throughput, roughly halving performance in the reporter's tests. The author traces the regression to kernel preemption and scheduler changes in Linux 7.0 that altered CPU scheduling and contention behavior for PostgreSQL workloads, increasing lock contention and context-switch overhead. They reproduce the issue with benchmarks, examine kernel commits and scheduler parameters, and discuss mitigation strategies such as tuning scheduler and preemption settings, rolling back kernel configuration, or awaiting upstream fixes. This matters because a database performance regression in a mainstream kernel can affect cloud providers, enterprise deployments, and open-source database users, forcing urgent investigations and operational workarounds.