Thinking Machines Lab unveiled 'interaction models,' neural systems trained from scratch to handle continuous, multimodal human-AI collaboration in real time. Built on a multi-stream, micro-turn architecture, these models ingest audio, video, and text simultaneously and respond with low latency, avoiding the perception freezes of turn-based systems. Researchers say this approach restores copresence, contemporality, and simultaneity, letting humans interject, show, and receive information continuously, with benefits for productivity, developer tools, and real-time applications. The preview claims interaction models deliver qualitatively new interactive capabilities while remaining competitive in intelligence and responsiveness, marking a step toward more natural, synchronous AI assistants for practical workflows.
Interaction models aim to restore real-time, synchronous human-AI copresence, which affects how developers design collaboration tools, interfaces, and latency-sensitive applications. Tech professionals must reconsider architectures, data pipelines, and UX patterns to support continuous multimodal streams and low-latency responsiveness.
Dossier last updated: 2026-05-15 09:18:37
Former OpenAI CTO Mira Murati — now leading startup Thinking Machines Lab — previewed new “interaction models” designed to keep humans in the loop by natively understanding continuous, messy human communication via camera and microphone. Unlike current pipeline approaches that transcribe speech into text for standard LLM processing, these models interpret pauses, interruptions, and tone so they can adapt in real time as users clarify or change intent. Thinking Machines, which has raised billions and already ships Tinker (an API for fine-tuning frontier models), says the tech aims to enable customizable, personalized collaboration rather than automation-driven replacement. The approach contrasts with OpenAI, Anthropic and Google’s push toward increasingly autonomous model outputs. Why it matters: interaction models could reshape human-AI workflows and product design by prioritizing collaboration and intent recognition.
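To make the pipeline contrast concrete, here is a minimal sketch of the cascaded approach described above, with stub functions standing in for real ASR, LLM, and TTS components. Every name here is an illustrative assumption, not an API from Thinking Machines or any vendor; the point it demonstrates is that prosody is discarded at the transcription step, before the model ever runs.

```python
# Hypothetical sketch of a cascaded speech pipeline (all components are stubs).
# Pauses, interruptions, and tone are lost when audio is flattened to text.

def transcribe(audio: bytes) -> str:
    """Stub ASR: reduces audio to words, discarding prosody and timing."""
    return "transcribed words only"

def llm_complete(prompt: str) -> str:
    """Stub text-only LLM turn."""
    return f"reply to: {prompt}"

def synthesize(text: str) -> bytes:
    """Stub TTS."""
    return text.encode()

def cascaded_turn(audio: bytes) -> bytes:
    # By the time the LLM runs, the signals interaction models are said to
    # interpret natively (pauses, interruptions, tone) are already gone.
    return synthesize(llm_complete(transcribe(audio)))

print(cascaded_turn(b"\x00\x01"))
```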
Thinking Machines Lab: Thinking Machines Lab details interaction models, which can think and respond in real time, letting users and AI interact continuously for better collaboration — Today, we're announcing a research preview of interaction models: models that handle interaction natively rather than through external scaffolding.
A research preview introduces interaction models — neural models trained from scratch to natively handle continuous, multimodal human-AI collaboration in real time. Using a multi-stream, micro-turn architecture, these models ingest audio, video, and text simultaneously to respond and act with low latency, addressing a limitation of turn-based frontier models, which freeze perception while the user speaks or while the model generates. The paper argues that real work benefits from humans remaining in the loop via copresence, contemporality, and simultaneity, and that interaction models overcome collaboration bottlenecks by enabling seamless, synchronous exchanges. This matters for productivity, developer tools, and real-time applications across AI, UX, and multimodal platforms.
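As an illustration only, the following asyncio sketch shows one way a multi-stream, micro-turn loop could be structured: a producer per modality keeps feeding a shared queue while the model step drains whatever has arrived on a fixed cadence, so perception never blocks on generation. All names, the 100 ms cadence, and the queue design are assumptions made for this sketch, not details from the preview.

```python
import asyncio

MICRO_TURN_MS = 100  # assumed micro-turn cadence, not a published figure

async def stream_reader(name: str, queue: asyncio.Queue) -> None:
    """Stub producer for one modality (audio, video, or text)."""
    for i in range(5):
        await queue.put((name, f"{name}-frame-{i}"))
        await asyncio.sleep(0.03)  # frames keep arriving while the model acts

async def micro_turn_loop(queue: asyncio.Queue) -> None:
    """Each micro-turn drains whatever arrived and responds; new frames
    continue to queue in the background, so perception is never frozen."""
    for _ in range(10):
        frames = []
        while not queue.empty():
            frames.append(queue.get_nowait())
        if frames:
            # Stand-in for a model step that sees all modalities at once.
            print(f"micro-turn saw {len(frames)} frames: {frames}")
        await asyncio.sleep(MICRO_TURN_MS / 1000)

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    readers = [stream_reader(m, queue) for m in ("audio", "video", "text")]
    await asyncio.gather(*readers, micro_turn_loop(queue))

asyncio.run(main())
```

The design property worth noting is that nothing in the loop waits for an end-of-turn signal; "turns" are just clock ticks, which is the behavior the preview credits with restoring simultaneity between user and model.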
Researchers previewed interaction models — neural models trained from scratch to natively handle continuous, multimodal human-AI collaboration (audio, video, text) with a multi-stream, micro-turn design for real-time responsiveness. They argue that current turn-based, single-threaded models and interfaces push humans out of the loop and limit collaborative workflows; interaction models aim to restore copresence, contemporality, and simultaneity so people can interject, show, and receive information continuously. The research claims both qualitatively new interactive capabilities and state-of-the-art performance on combined intelligence and responsiveness, positioning these models as a step toward more natural, synchronous AI assistants for practical, real-world tasks.