Over four months, Redis creator Salvatore Sanfilippo and contributors used advanced AI (notably GPT-5.x/Codex) to design, implement, and optimize a new Array data type. The design evolved iteratively from sparse/dense slice ideas into a multi-level super-directory of sliced dense directories that supports large sparse indices, efficient ARSET/ARSCAN/ARPOP semantics, and memory-friendly behavior. AI assisted with writing the specification, generating C code, finding inefficiencies, and optimizing integrations such as the TRE-based ARGREP regex search. A parallel demo of TRE Python bindings highlights TRE's ReDoS resistance and linear scaling. The effort underscores AI's role in accelerating complex systems programming while keeping expert oversight central.
Redis adding a native Array data type and documenting an AI-assisted development process signals shifts in developer workflows and database capabilities; tech professionals should track how AI changes design, testing, and deployment. The TRE demo highlights the continuing importance of regex libraries that resist ReDoS attacks when integrating native or extension code.
Dossier last updated: 2026-05-10 04:33:49
Antirez (Redis creator Salvatore Sanfilippo) recaps the four-month development of a new Redis Array data type, crediting heavy use of AI (notably GPT-5.x/Codex) with accelerating specification, coding, testing, and optimization. He details the design choices: sparse and dense slice directories, with dynamic reshaping into a super-directory of sliced dense directories (default 4,096 elements per slice), so that ARSET, ARSCAN, and ARPOP run in time proportional to the number of stored elements rather than the span of the index range. AI-assisted coding and code review revealed inefficiencies, which were iteratively rewritten. While prototyping file-like use cases, he added ARGREP, with regex support via the TRE library, which he optimized with GPT's help. He concludes that AI lets him tackle more complex systems programming while remaining deeply involved.
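The slice idea can be illustrated with a toy model. The sketch below is not the Redis implementation: the class name, the method names, and the use of a Python dict as the directory are all assumptions made for illustration, and only the 4,096-slot slice size comes from the recap. It shows how allocating only the slices that hold data lets a scan over a huge index range skip the empty space entirely.

```python
SLICE_SIZE = 4096  # default slice size cited in the recap


class SlicedArray:
    """Toy sparse array: only slices that hold an element are allocated.

    Hypothetical illustration; values must not be None in this toy.
    """

    def __init__(self):
        # slice number -> list of SLICE_SIZE slots (None marks an empty slot)
        self.slices = {}

    def set(self, index, value):
        num, off = divmod(index, SLICE_SIZE)
        slots = self.slices.get(num)
        if slots is None:
            # Allocate one dense slice on first write into its range.
            slots = self.slices[num] = [None] * SLICE_SIZE
        slots[off] = value

    def get(self, index):
        num, off = divmod(index, SLICE_SIZE)
        slots = self.slices.get(num)
        return None if slots is None else slots[off]

    def scan(self, start, end):
        """Yield (index, value) for stored elements with start <= index <= end."""
        first, last = start // SLICE_SIZE, end // SLICE_SIZE
        # Visit only allocated slices; the gaps between them cost nothing.
        for num in sorted(n for n in self.slices if first <= n <= last):
            base = num * SLICE_SIZE
            for off, value in enumerate(self.slices[num]):
                idx = base + off
                if value is not None and start <= idx <= end:
                    yield idx, value


arr = SlicedArray()
arr.set(5, "a")
arr.set(10_000_000_000, "b")
print(list(arr.scan(0, 10**12)))  # walks two slices, not a trillion slots
print(len(arr.slices))
```

In this toy the scan cost is bounded by the number of allocated slices times the slice size, which is a coarser bound than the per-element proportionality the recap describes, but it conveys why the range span drops out of the cost.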
Simon Willison demonstrated a minimal Python ctypes binding to the TRE regular expression library and showed TRE's robustness against ReDoS (regular expression denial-of-service) attacks. In benchmarks using malicious "evil" patterns, TRE processed massive inputs (up to 10 million characters) far faster than Python's built-in re module and scaled linearly rather than exponentially. Willison cites TRE's lack of backtracking as the key reason for the improved performance, and points to community interest (including antirez integrating TRE into Redis) as his motivation for exploring TRE from Python. The write-up presents an easy experimental binding and a practical mitigation for regex-based DoS risks in Python applications.
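The backtracking problem that ReDoS exploits can be reproduced with the standard library alone. The following is a minimal sketch, not Willison's benchmark: the pattern `(a+)+$` and the input sizes are assumptions chosen to keep the run short, and it exercises only Python's built-in re module, not TRE.

```python
import re
import time

# A classic "evil" pattern: nested quantifiers let a backtracking engine
# split a run of 'a's in exponentially many ways before giving up.
EVIL = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return (match result, elapsed seconds) for 'a' * n plus a trailing 'b'."""
    text = "a" * n + "b"  # the final 'b' guarantees the match fails
    start = time.perf_counter()
    result = EVIL.match(text)
    return result, time.perf_counter() - start


# Input sizes are kept small on purpose; each extra character roughly
# doubles the work for a backtracking engine.
for n in (10, 14, 18):
    result, seconds = time_failed_match(n)
    print(f"n={n:2d} matched={result is not None} seconds={seconds:.4f}")
```

An engine that does not backtrack avoids this blow-up by design, which is the property the demo attributes to TRE.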
Redis creator antirez announced the new Array data type has landed after a four-month development process driven heavily by AI-assisted design and coding. He documented the specification, iterated on C structures, sparse representations, cursor semantics and ARINSERT, then used GPT-5.x (referred to as Codex) to co-design and auto-generate implementation code. Early design choices required rework: to support large sparse indices without heavy allocations, the internal layout evolved into a multi-level "super directory" of sliced dense directories. Antirez credits AI for enabling faster exploration and deeper design changes while he continuously reviewed and refined the output. The change matters for Redis performance, memory behavior, and array semantics for large-scale indexed access.
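One way to picture a multi-level directory is as a split of the flat index into fields, one per level. The sketch below is an assumption-laden illustration rather than the actual Redis layout: the 12-bit slice offset matches the 4,096-element default slice size reported in the recap, while `DIR_BITS`, the function names, and the two-level split itself are hypothetical.

```python
SLICE_BITS = 12  # 2**12 = 4096 slots per slice (default reported in the recap)
DIR_BITS = 10    # hypothetical: slices per dense directory, for illustration only


def split_index(index):
    """Break a flat index into (super-directory key, directory slot, slice offset)."""
    offset = index & ((1 << SLICE_BITS) - 1)
    slice_number = index >> SLICE_BITS
    dir_slot = slice_number & ((1 << DIR_BITS) - 1)
    super_key = slice_number >> DIR_BITS
    return super_key, dir_slot, offset


def join_index(super_key, dir_slot, offset):
    """Inverse of split_index: rebuild the flat index from its three fields."""
    return (((super_key << DIR_BITS) | dir_slot) << SLICE_BITS) | offset


for index in (4095, 4096, 1 << 22, 10_000_000_000):
    parts = split_index(index)
    assert join_index(*parts) == index  # the split is lossless
    print(index, parts)
```

With a split like this, a write to a very large index touches one super-directory entry, one dense directory, and one slice, so no allocation has to cover the empty range below it.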
The author describes a four-month effort to design and implement a new Array data type for Redis, using AI (GPT-5.x/Codex) extensively for specification, code generation, review, and testing. The project evolved from an initial spec and a simple sparse/dense slice design into a more complex super-directory of sliced dense directories that supports efficient ARSET, ARSCAN, and ARPOP operations without large allocations and with scans proportional to the number of existing elements. AI helped accelerate iterative redesigns, testing, 32-bit support, and optimization of the TRE regex library for ARGREP (regex-backed search) to avoid pathological performance. The author concludes that AI scaled their capacity, but system-level expertise and hands-on involvement remained essential.