Researchers have released Talkie, a 13B-parameter “vintage” language model trained exclusively on pre-1931 text, positioning it as both a 1930s-flavored chatbot and a controlled research instrument for studying how training data shapes capabilities. Because the corpus avoids modern contamination, the team can use Talkie to probe temporal generalization: forecasting later events, “discovering” post-cutoff inventions, and tackling modern Python tasks through in-context learning. Early results show occasional surprising successes but clear gaps versus web-trained models, underscoring the limits of historical-only data. The release also sparked hobbyist interest in local deployment, VRAM requirements, and integration with tools like Ollama.
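For readers curious about the Ollama angle, a minimal sketch of querying a locally served build through Ollama's HTTP generate endpoint is shown below. It assumes the checkpoint has already been converted to GGUF and imported under the hypothetical local tag talkie-1930; the prompt is purely illustrative.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

payload = {
    "model": "talkie-1930",  # hypothetical local tag; depends on how the GGUF was imported
    "prompt": "Describe the latest advances in aviation.",
    "stream": False,         # return a single JSON object instead of a token stream
}

resp = requests.post(OLLAMA_URL, json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["response"])
```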
Researchers Nick Levine, David Duvenaud and Alec Radford released talkie, a 13B-parameter “vintage” language model trained on 260B tokens of pre-1931 English, along with an instruction-tuned chat checkpoint (talkie-1930-13b-it). Both models are released under the Apache 2.0 license; the base model’s training corpus is entirely out of copyright, while the chat model’s instruction tuning relied on modern LLMs (Claude Sonnet/Opus) to generate synthetic prompts and preference signals. The project probes open research questions: can models trained on historical text predict future events, rediscover scientific advances, or learn to program? The team flags the risks of contamination from post-1931 data and of anachronistic behavior introduced by RL with AI feedback, and plans to bootstrap era-appropriate judges in future work. The models and an interactive demo are publicly available.
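For a rough idea of how the instruction-tuned checkpoint might be driven with Hugging Face transformers, a minimal sketch follows; the Hub path is a guess based on the checkpoint name and should be replaced with the actual repository id.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "talkie-1930-13b-it"  # hypothetical Hub path; substitute the real repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Build a chat-formatted prompt and generate a reply from the 1930s persona.
messages = [{"role": "user", "content": "What do you make of these new talking pictures?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```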
Researchers released talkie, a 13-billion-parameter “vintage” language model trained solely on pre-1931 text to study how historical training data shapes LM behavior. Built and shared on GitHub/Hugging Face by Nick Levine, David Duvenaud and Alec Radford (April 2026), talkie is intended both as a conversational novelty that simulates a 1930s perspective and as an experimental tool for measuring forecasting, creativity, and generalization without modern data contamination. Early evaluations measure how surprising post-cutoff events are to the model, show that it struggles but occasionally succeeds at “discovering” post-cutoff inventions, and find that it performs poorly yet nontrivially on modern Python coding tasks when given in-context examples. The project highlights contamination-free benchmarks, opportunities to study scaling trends, and the limits of historical-only training for modern capabilities.
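One plausible way to operationalize that “surprisingness” measurement is the model’s average per-token negative log-likelihood on a statement describing the event. The sketch below assumes a hypothetical Hub path for the base checkpoint; the project’s actual metric may differ.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "talkie-1930-13b"  # hypothetical Hub path for the base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
).eval()

def mean_surprisal(text: str) -> float:
    """Average negative log-likelihood per token, in nats."""
    ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        # With labels supplied, the model returns the mean cross-entropy loss.
        return model(ids, labels=ids).loss.item()

for event in [
    "In 1927 Charles Lindbergh flew nonstop from New York to Paris.",
    "In 1969 astronauts landed on the Moon and returned safely to Earth.",
]:
    print(f"{mean_surprisal(event):.2f}  {event}")
```

Under this reading, a higher score on the post-cutoff statement than on the pre-cutoff one would indicate the event is more surprising to the vintage model.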
A project called Talkie released a 13-billion-parameter language model branded as “vintage” and themed around 1930s-style language, sparking discussion on Hacker News. Commenters joked about running on antique hardware and debated resource needs (VRAM) and deployment options like Ollama; others suggested simulating old-timey mannerisms by prompting much larger models. The post highlights community interest in stylistic or era-specific LLMs, practical constraints for hobbyists (model size, memory), and interoperability with existing local-serving tools. This matters because niche, persona-driven models showcase demand for bespoke LLM behavior, influence tooling integrations, and raise questions about model efficiency and accessibility for developers and hobbyists.
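On the VRAM question raised in the discussion, a quick back-of-envelope calculation covers the weights only; the KV cache and runtime overhead add a few more GiB on top.

```python
# Approximate weight memory for a 13B-parameter model at common precisions.
PARAMS = 13e9

for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    gib = PARAMS * bits / 8 / 2**30
    print(f"{label:>5}: ~{gib:.1f} GiB")

# Roughly 24 GiB at fp16, 12 GiB at int8, and 6 GiB at 4-bit, which is why
# quantized builds are the usual route for single consumer GPUs.
```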
Researchers released talkie, a 13-billion-parameter “vintage” language model trained only on text published before 1931 to simulate historical conversation and probe fundamental LM behaviors. Demonstrated live with Claude prompting talkie, the project uses contamination-free training data to evaluate forecasting, invention discovery, and generalization: measuring how surprising post-cutoff historical events are to the model, testing whether it could independently “discover” later inventions (e.g., helicopters, Turing machines), and assessing in-context learning by having the vintage model attempt modern Python coding tasks. Results show clear performance gaps versus modern web-trained models, but the setup offers a controlled path to studying scaling trends, temporal generalization, and how training corpora shape capabilities. Key players include the talkie authors and the platforms hosting the model (GitHub, Hugging Face).
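A minimal sketch of what such an in-context-learning probe could look like is a few worked Python examples followed by an unfinished one for the model to complete; the actual tasks and evaluation harness in the project may differ.

```python
# Hypothetical few-shot prompt for the coding probe; the completion would be
# sent to the vintage model (e.g. via the generate call sketched earlier)
# and checked against unit tests for the target function.
FEW_SHOT_PROMPT = '''\
# Task: return the square of a number.
def square(x):
    return x * x

# Task: return True if a word reads the same forwards and backwards.
def is_palindrome(word):
    return word == word[::-1]

# Task: return the largest value in a list.
def largest(values):
'''
```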