Benchmark and microbenchmark results shape hardware design and compiler targeting. The SPEC CPU2026 update and AVX-512 speedups such as Lemire's IPv6 parser affect how architects and performance engineers evaluate Zen 5 and other CPUs. Understanding the workload mix, the choice of reference machine, and real-world SIMD gains helps engineers prioritize optimizations and select platforms.
Dossier last updated: 2026-05-25 04:30:27
Daniel Lemire demonstrates an AVX-512 SIMD implementation that parses IPv6 text addresses about 12x faster than the standard inet_pton on a single Intel Xeon Gold core. Using 512-bit registers to locate colons, expand bytes, permute hex digits, and combine values with multiply-accumulate, the branch-minimized routine achieves ~71 million addresses/sec versus inet_pton's ~5.7 million in his benchmark, with far fewer instructions and higher instruction throughput. The code and benchmark details are published on Lemire's blog, showing practical speedups for high-throughput networking or logging systems where IPv6 parsing is a bottleneck. This matters for server-side networking stacks, probes, and telemetry that need extremely fast text-to-binary IP conversion.
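The steps named in the summary (locate colons, turn hex digits into values, combine them with multiply-accumulate) can be illustrated with a plain scalar sketch. This is not Lemire's AVX-512 code and says nothing about its performance; the function names `parse_group` and `parse_ipv6` are hypothetical, and the sketch assumes pure hexadecimal notation with no embedded IPv4 suffix and no zone identifier.

```python
import ipaddress

HEX_DIGITS = "0123456789abcdef"

def parse_group(group: str) -> int:
    """Combine 1 to 4 hex digits into a 16-bit value by multiply-accumulate."""
    if not 1 <= len(group) <= 4:
        raise ValueError(f"bad group length: {group!r}")
    value = 0
    for ch in group.lower():
        # str.index raises ValueError for a non-hex character.
        value = value * 16 + HEX_DIGITS.index(ch)
    return value

def parse_ipv6(text: str) -> bytes:
    """Scalar analogue of the pipeline: split at colons, expand '::', pack bytes."""
    if text.count("::") > 1:
        raise ValueError("at most one '::' is allowed")
    if "::" in text:
        head, tail = text.split("::")
        head_groups = head.split(":") if head else []
        tail_groups = tail.split(":") if tail else []
        missing = 8 - len(head_groups) - len(tail_groups)
        if missing < 1:
            raise ValueError("'::' must stand for at least one group")
        groups = head_groups + ["0"] * missing + tail_groups
    else:
        groups = text.split(":")
        if len(groups) != 8:
            raise ValueError("expected eight groups")
    out = bytearray()
    for group in groups:
        out += parse_group(group).to_bytes(2, "big")
    return bytes(out)

# Cross-check against the standard library's own parser.
sample = "2001:db8::1"
assert parse_ipv6(sample) == ipaddress.IPv6Address(sample).packed
```

The SIMD version described above performs the same logical work on many bytes at once and avoids the per-character loop and branches that this scalar sketch relies on.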
SPEC updated its long-standing CPU benchmark suite to SPEC CPU2026, increasing workloads from 43 to 52 and enlarging individual programs to better reflect modern code. The author evaluated CPU-oriented performance using GCC 14.2.0 on Linux, focusing on hardware comparisons. SPEC CPU2026’s reference baseline uses an Ampere eMAG 8180 (score 1.0), which the author criticizes as anachronistic and too slow compared with modern desktop CPUs. Tests show Intel’s recent Lion Cove and AMD Zen 5 delivering similar integer results while Zen 5 often leads in floating-point, partly due to GCC emitting AVX-512 and wide-vector code for several workloads (e.g., 706.stockfish, 749.fotonik3d). The article highlights concerns about the reference choice and the suite’s implications for evaluating contemporary CPU designs.
SPEC released CPU2026, an updated CPU benchmark suite with 52 workloads (up from 43) and larger code footprints, aiming to modernize tests while keeping portability. TechScan examined CPU2026 on Linux with GCC 14.2.0 (-O3, native) to focus on hardware behavior; GCC 15.2.0 was avoided due to issues. SPEC CPU2026 uses an Ampere eMAG 8180 as the reference (score 1.0), which the author criticizes as unrepresentative of modern hardware and skewing perceptions of performance. Tests show current Intel and AMD desktop cores greatly outpace the eMAG, with Zen 5 often leading in floating-point workloads. Several workloads (e.g., 706.stockfish, 749.fotonik3d, 765.roms) generate AVX-512 code under GCC, indicating SIMD-heavy demands that stress modern CPU features.
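To see why the reference machine matters, consider how a score of 1.0 arises. The sketch below assumes CPU2026 follows the convention of earlier SPEC CPU suites, where each workload's ratio is the reference time divided by the measured time and the suite score is the geometric mean of those ratios; the summaries above do not confirm this for CPU2026. The timings are invented for illustration only, and `spec_ratio` and `suite_score` are hypothetical helper names.

```python
from statistics import geometric_mean

def spec_ratio(reference_seconds: float, measured_seconds: float) -> float:
    """Per-workload ratio: how many times faster than the reference machine."""
    return reference_seconds / measured_seconds

def suite_score(ratios: list[float]) -> float:
    """Suite score as the geometric mean of per-workload ratios (assumed convention)."""
    return geometric_mean(ratios)

# Hypothetical timings in seconds, NOT measured results.
reference_times = [1000.0, 800.0, 1200.0]
test_machine_times = [100.0, 200.0, 150.0]

# The reference machine compared against itself scores 1.0 by construction.
reference_score = suite_score(
    [spec_ratio(t, t) for t in reference_times]
)

# A faster machine scores above 1.0 on every workload.
test_score = suite_score(
    [spec_ratio(r, m) for r, m in zip(reference_times, test_machine_times)]
)
print(f"reference: {reference_score:.2f}, test machine: {test_score:.2f}")
```

Under this convention a slow reference inflates every modern machine's score, which is the perception problem the author raises.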
Daniel Lemire demonstrates an AVX-512-based IPv6 text parser that outperforms the standard inet_pton. Working on 512-bit registers, his routine locates colons, expands bytes, permutes hex digits, and uses multiply-accumulate steps to assemble bytes in a mostly branch-free pipeline. Benchmarks on an Intel Xeon Gold 6548N show the AVX-512 parser handling ~71.3 million addresses/sec (14.0 ns/addr), about 12× faster than inet_pton (~5.7 million addresses/sec, 175.3 ns/addr), while executing roughly one-eighth as many instructions at a higher instruction throughput. Lemire provides source code and notes that the approach is applicable on recent Intel and AMD CPUs, making high-throughput IPv6 parsing practical for networking stacks and commodity servers.
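The reported latency and throughput figures can be cross-checked with simple arithmetic. The sketch below uses only the per-address latencies quoted above; `addresses_per_second` is a hypothetical helper name, and small differences from the quoted throughput are expected because the latencies are rounded.

```python
def addresses_per_second(ns_per_address: float) -> float:
    """Convert a per-address latency in nanoseconds to a throughput."""
    return 1e9 / ns_per_address

simd_ns = 14.0        # AVX-512 parser, ns per address (from the summary)
inet_pton_ns = 175.3  # inet_pton, ns per address (from the summary)

simd_rate = addresses_per_second(simd_ns)
inet_pton_rate = addresses_per_second(inet_pton_ns)
speedup = inet_pton_ns / simd_ns

print(f"AVX-512:   {simd_rate / 1e6:.1f} million addresses/sec")
print(f"inet_pton: {inet_pton_rate / 1e6:.1f} million addresses/sec")
print(f"speedup:   {speedup:.1f}x")
```

The results agree with the quoted throughputs to within rounding and give a speedup between 12× and 13×, consistent with the "about 12×" claim.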