Once you’ve got ChunkHound up and running, it’s time to dive deeper. This guide covers advanced indexing strategies, server deployment, and optimization for large codebases.
For large codebases, indexing is a separate step that provides significant benefits:
- **Performance:** Index once, search many times. Initial indexing takes time, but subsequent searches are instant.
- **Smart Diffing:** Only processes changed files, preserving embeddings for unchanged code.
- **Fix Command:** Repairs inconsistencies. `chunkhound index` detects and fixes database drift.
- **Enterprise Ready:** Battle-tested scaling, used on codebases with millions of LOC.
```
$ chunkhound index /path/to/large-codebase
Scanning 10,000 files...
Processing 8,234 Python files, 1,766 TypeScript files...
✓ 45,000 chunks indexed
✓ Embeddings: 45,000 generated
⏱️ Time: 7m 30s

$ chunkhound index  # After editing 3 files
Detecting changes...
✓ 3 files modified, 9,997 files unchanged
✓ 150 chunks updated
✓ Embeddings: 150 generated, 44,850 reused
⏱️ Time: 18 seconds
```

When using MCP servers, ChunkHound automatically watches your files and updates the index as you edit. No manual commands are needed.
This makes ChunkHound perfect for live memory systems - create a folder of markdown notes that stays searchable as you add and modify content.
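As a concrete sketch of that workflow (the `~/notes` folder and file names are illustrative, not a ChunkHound convention): index the notes once, then let the MCP server keep the index current as files change.

```sh
# Illustrative live-notes setup; paths and contents are hypothetical.
mkdir -p ~/notes
echo "# Standup 2024-06-03: decided to ship v2 behind a flag" > ~/notes/standup.md

chunkhound index ~/notes   # one-time initial index
chunkhound mcp ~/notes     # MCP server re-indexes automatically as notes change
```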
ChunkHound’s MCP server runs in two modes:

| Use Case | Mode | Command |
|---|---|---|
| Personal development | stdio | `chunkhound mcp` |
| Team/production use | HTTP | `chunkhound mcp --http` |
In stdio mode, your IDE starts and stops the server automatically, and the index stays in memory for instant searches. Perfect for personal development with a single IDE.
```
chunkhound mcp /path/to/project
```
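Most MCP clients launch this command from a JSON config file. A minimal sketch of such a registration, assuming the common `mcpServers` layout (the file name and exact schema vary by IDE and are assumptions here, not ChunkHound documentation):

```sh
# Hypothetical client registration: the IDE spawns the command below
# and speaks MCP over stdin/stdout. File name and schema vary by client.
cat > mcp-servers.json <<'EOF'
{
  "mcpServers": {
    "chunkhound": {
      "command": "chunkhound",
      "args": ["mcp", "/path/to/project"]
    }
  }
}
EOF
```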
In HTTP mode, you start the server once and multiple IDEs can connect. Ideal for teams or when switching between multiple git worktrees.

```
chunkhound mcp /path/to/project --http --port 8000
# Connect IDEs to http://localhost:8000
```

ChunkHound is production-ready and actively tested on codebases with 200,000+ lines of code. Here’s how to deploy it effectively.
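For a shared team server, you typically want the HTTP process to outlive any single shell session. A minimal sketch, assuming a Linux host and plain `nohup` (the project path and log file are illustrative; a systemd unit or container works just as well):

```sh
# Keep the shared server alive after logout; /srv/monorepo is hypothetical.
nohup chunkhound mcp /srv/monorepo --http --port 8000 \
  > chunkhound.log 2>&1 &

# Teammates connect their IDEs to http://your-host:8000
```

In the field, it has held up across very different codebases: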
- **Scale & Performance:** 4.8M lines in 56 minutes. Proven performance on massive codebases with minimal CPU overhead.
- **Project Diversity:** Battle-tested across architectures, from GoatDB’s TypeScript monolith to Kubernetes’ multi-language ecosystem.
- **Multi-Language Support:** 22+ languages: Python, TypeScript, Go, Rust, Java, C++, and more via Tree-sitter.
- **AI-Built Architecture:** 100% AI-generated. The entire codebase was written by AI agents, using the cAST algorithm for intelligent code chunking.
We indexed the entire Kubernetes codebase - 4.8 million lines across 23,000 files. Here’s what happened on a MacBook Pro M4:
56 minutes. That’s it.
But here’s the surprising part: ChunkHound barely broke a sweat. CPU usage stayed at 47% the entire time. Your laptop wasn’t struggling - it was waiting. Waiting for the embedding API to process each chunk and send it back.
This tells you something important about scaling ChunkHound: the bottleneck isn’t your hardware, it’s your embedding provider.
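You can see this pattern on your own machine: for an API-bound process, wall-clock time far exceeds CPU time. A quick check using the shell’s `time` builtin (the path is illustrative):

```sh
# If (user + sys) is much smaller than real, the process spent most of
# its time waiting on the embedding API, not computing locally.
time chunkhound index /path/to/large-codebase
```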
Want it faster? Since the bottleneck is the embedding API rather than your hardware, the lever is on the provider side - a faster embedding endpoint or higher API throughput shortens indexing time directly.
The Kubernetes test proves ChunkHound can handle anything you throw at it. But most days, you won’t need to index millions of lines. Start with what you’re working on, expand as needed.