Agentic AI is pushing both model design and infrastructure toward longer, tool-driven workflows. Xiaomi has put MiMo-V2.5-Pro into public beta, touting stronger long-horizon coherence, "harness-aware" structured development, and self-correction across multi-thousand-step tool chains (demonstrated by end-to-end software builds and even analog circuit optimization) without raising API pricing. In parallel, Google is reshaping its TPU strategy for the agentic era by splitting TPU v8 into TPU 8t for throughput-heavy training and TPU 8i for low-latency inference and agent loops, pairing large-scale SuperPod capacity with latency-cutting on-chip memory and networking upgrades.
Xiaomi has launched MiMo-V2.5-Pro in public beta, a major upgrade focused on agentic capabilities, long-horizon coherence, and complex software and engineering tasks. Deployed across Xiaomi's API Platform and AI Studio with no change in API pricing, the model reportedly sustains multi-thousand-step tool workflows. Internal benchmarks highlight three flagship feats: building a complete SysY-to-RISC-V compiler in Rust (passing 233/233 tests) over 4.3 hours and 672 tool calls; producing an 8,192-line multi-track desktop video editor across 1,868 tool calls in 11.5 hours; and designing and optimizing an FVF-LDO analog regulator in TSMC 180nm via closed-loop ngspice iteration, meeting multiple target metrics in about an hour. Xiaomi emphasizes the model's "harness awareness", structured development style, and self-correcting behavior, suggesting broader implications for autonomous agent workflows and developer productivity.
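The closed-loop analog optimization described above can be pictured as a simulate-check-adjust loop. The sketch below is hypothetical, not Xiaomi's harness: the parameter names, metric targets, and the `simulate` stand-in (which a real agent would replace with an actual ngspice run over a netlist) are all illustrative assumptions.

```python
# Hypothetical sketch of a closed-loop circuit-optimization harness.
# A real agent would invoke ngspice on a netlist and parse its output;
# here the simulator is a toy stand-in so the control flow runs on its own.

def simulate(params):
    """Stand-in for an ngspice run: returns metrics for an LDO-like circuit."""
    # Toy relations, illustrative only: load regulation improves with loop
    # gain, quiescent current grows with bias.
    return {
        "load_reg_mV": 50.0 / params["loop_gain"],
        "iq_uA": 2.0 * params["bias"],
    }

def meets_spec(metrics, spec):
    """All metrics must be at or below their spec limits."""
    return all(metrics[k] <= limit for k, limit in spec.items())

def optimize(params, spec, max_iters=50):
    """Simulate, check metrics, nudge parameters, repeat until specs pass."""
    for i in range(max_iters):
        metrics = simulate(params)
        if meets_spec(metrics, spec):
            return i, params, metrics
        # Simple corrective rules standing in for the model's reasoning step.
        if metrics["load_reg_mV"] > spec["load_reg_mV"]:
            params["loop_gain"] *= 1.5   # more gain -> better regulation
        if metrics["iq_uA"] > spec["iq_uA"]:
            params["bias"] *= 0.8        # less bias -> lower quiescent current
    raise RuntimeError("spec not met within iteration budget")

iters, final_params, final_metrics = optimize(
    {"loop_gain": 10.0, "bias": 8.0},
    {"load_reg_mV": 1.0, "iq_uA": 5.0},
)
print(iters, final_metrics)
```

The point of the structure, rather than the toy numbers, is that each iteration is a full tool call whose result feeds the next correction, which is the loop shape the compiler and video-editor runs also follow at much larger scale.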
Google unveiled TPU v8 at Cloud Next '26 as two purpose-built chips: TPU 8t for large-scale training and TPU 8i for low-latency inference and agentic workloads. The split addresses the operational gap between throughput-optimized training and latency-sensitive agent loops (decomposition, dispatch, evaluation), where per-step latency compounds across long workflows. TPU 8t targets training scale: 9,600 chips per SuperPod, 121 exaFLOPS, 2 PB of shared high-bandwidth memory presented as a single pool, double the interchip bandwidth of the previous generation, and multi-site clusters exceeding a million TPUs. TPU 8i targets inference: 3x more on-chip SRAM, a Collectives Acceleration Engine that cuts collective latency 5x, and a Boardfly high-radix network that halves communication latency, yielding larger inference pods and roughly 80% better inference performance per dollar. Google frames the launch as vertically integrated infrastructure (chip design, systems, datacenter orchestration) optimized for agentic AI at cloud scale, a competitive lever against rivals such as NVIDIA.
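Why per-step latency compounds is simple arithmetic: a serial agent loop repeats decompose, dispatch, and evaluate on every iteration, so wall-clock time scales linearly with per-step latency. The sketch below uses made-up millisecond figures, not measured TPU numbers, purely to show the scaling.

```python
# Illustrative only: how per-step latency compounds in a serial agent loop.
# None of these millisecond figures are real hardware measurements.

def loop_wall_clock(steps, decompose_ms, dispatch_ms, evaluate_ms):
    """Total wall-clock seconds for a serial agent loop of `steps` iterations."""
    per_step_ms = decompose_ms + dispatch_ms + evaluate_ms
    return steps * per_step_ms / 1000.0

# A hypothetical 1,000-step workflow under two serving latency profiles.
slow = loop_wall_clock(1000, decompose_ms=120, dispatch_ms=40, evaluate_ms=140)
fast = loop_wall_clock(1000, decompose_ms=60, dispatch_ms=20, evaluate_ms=70)
print(slow, fast)  # → 300.0 150.0 : halving step latency halves the run
```

This linear scaling is why a chip tuned for batch throughput and a chip tuned for per-step latency end up as different designs once workflows reach thousands of steps.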