Semble, a new CPU-only code search engine, aims to make agent-driven code retrieval faster and cheaper by returning concise snippets that use roughly 98% fewer tokens than a grep-and-read workflow. Built with static Model2Vec embeddings, BM25, RRF fusion and code-aware reranking, Semble indexes a typical repo in ~250 ms and answers queries in ~1.5 ms on CPU while reaching ~99% of the retrieval quality of a 137M-parameter code transformer. It can run locally as an MCP server for agents like Claude Code or Codex. Separately, AWS's Agent Toolkit shows that MCP servers and curated skills only work reliably when agents are given a short rules file directing them to prefer MCP tools and verify APIs. Without that file, agents ignore the available skills and fall back to their training data.
Efficient, local code retrieval reduces latency, cost, and token usage for agent-driven developer workflows while preserving retrieval quality. MCP rules files and tool-preference guidance are crucial for agents to reliably use local servers and curated skills instead of falling back to training data.
Dossier last updated: 2026-05-18 08:29:44
Semble is a lightweight code-search library designed for AI agents that returns precise snippets with about 98% fewer tokens than a grep-and-read workflow. It indexes a typical repo in ~250 ms and answers queries in ~1.5 ms on CPU, claiming ~200x faster indexing and ~10x faster queries than a code-specialized transformer while maintaining 99% of retrieval quality (NDCG@10 ≈ 0.854). Semble runs locally (no GPU or API keys), can operate as an MCP server compatible with Claude Code, Codex, Cursor, OpenCode and others, and supports searching local paths or git URLs via a CLI (semble search / find-related). The tool reduces token usage for agents, speeds agent access to code, and enables on-device, privacy-friendly code retrieval for developer workflows and LLM agent integrations.
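For readers unfamiliar with the quality metric quoted above, here is a minimal Python sketch of how NDCG@10 is commonly computed for a single query. It uses the standard linear-gain definition; the source does not say which variant Semble's benchmark uses, and the function names and sample relevance grades are hypothetical.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k relevance grades."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_rels, all_rels, k=10):
    """NDCG@k for one query.

    ranked_rels: relevance grades of the returned results, in ranked order.
    all_rels: relevance grades of every relevant document for the query,
              used to build the ideal ranking.
    """
    ideal = dcg_at_k(sorted(all_rels, reverse=True), k)
    return dcg_at_k(ranked_rels, k) / ideal if ideal > 0 else 0.0

# Hypothetical query with one relevant document, returned at rank 2.
score = ndcg_at_k([0, 1, 0], [1])
print(round(score, 3))
```

A benchmark figure such as 0.854 would be the mean of this per-query score over all queries, so a value near 1.0 means relevant snippets usually appear at or near the top of the results.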
MinishLab/semble: Fast and Accurate Code Search for Agents. Uses ~98% fewer tokens than grep+read
Semble is a new code-search library optimized for AI agents that claims to deliver fast, accurate, token-efficient retrieval, returning only relevant snippets and using ~98% fewer tokens than a grep-and-read workflow. It indexes repositories in ~250 ms and answers queries in ~1.5 ms on CPU, while matching 99% of the retrieval quality of code-specialized transformer models (NDCG@10 ≈ 0.854) and offering ~200x faster indexing and ~10x faster queries in the authors' benchmarks. Semble runs locally with no API keys or GPUs, can act as an MCP server for agents like Claude Code, Codex, Cursor, and OpenCode, or be invoked via a CLI/Bash integration, and supports local paths or git URLs with automatic re-indexing. These properties matter for agent-driven development workflows and for code-assistance tools that are sensitive to cost and latency.
Semble, an open-source, CPU-only code search engine from MinishLab, claims to cut token usage by 98% versus grep+read while matching 99% of the retrieval quality of a 137M-parameter code transformer. Founders Stephan and Thomas combine static Model2Vec embeddings (potion-code-16M) with BM25, RRF fusion and code-aware reranking to deliver fast, accurate results: ~250 ms to index a typical repo and ~1.5 ms per query on CPU, measured in their benchmark of ~1,250 query/document pairs across 63 repos and 19 languages. Semble requires no GPUs or API keys, offers an MCP server for Claude Code and other agents, and publishes its benchmarks, installation instructions, and model on GitHub and Hugging Face.
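The RRF fusion step mentioned above can be illustrated with a minimal Python sketch of textbook Reciprocal Rank Fusion, which merges a lexical (BM25) ranking and an embedding ranking using only rank positions. This is not Semble's actual code: the constant k=60 is the conventional default rather than a value taken from the source, and the function name and file names are hypothetical.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed, conventional constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical rankings for one query from two retrievers.
bm25_ranking = ["auth.py", "session.py", "utils.py"]
embedding_ranking = ["session.py", "tokens.py", "auth.py"]

fused = rrf_fuse([bm25_ranking, embedding_ranking])
print(fused)  # ['session.py', 'auth.py', 'tokens.py', 'utils.py']
```

Because RRF relies only on rank positions, it needs no score normalization between BM25 and embedding similarity, which is one common reason to choose it for hybrid search. A code-aware reranker, as the summary describes, would then reorder the fused list.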
AWS's new Agent Toolkit (released May 6, 2026) gives coding agents access to an MCP server, sandboxed execution, and 20+ curated AWS skills, but the toolkit includes a short rules file that you must add for agents to reliably use those tools. The author set up the MCP server and skills but skipped the 17-line rules file, so the agent answered from its training data instead of loading skills or consulting documentation. The rules file instructs agents to prefer the AWS MCP for AWS interactions, check for and retrieve relevant skills before starting tasks, and verify uncertain API parameters against docs. Adding the rules file made the agent load skills and search docs first, improving accuracy and tool use.