# Code Research

## Architectural Understanding, Not Just Search Results

Ask: "How does authentication work?" Don't get a list of files containing "auth." Get a comprehensive report mapping auth components, relationships, security patterns, and configuration, with `file.ts:45` citations.
Code Research performs breadth-first exploration of your codebase’s semantic graph, following connections between components and synthesizing findings into structured markdown reports.
## Upgrading from v3 (Code Expert Agent)

If you previously configured the "Code Expert Agent" in `.claude/agents/code-expert.md`:
### What Changed

- No longer needed - Code Research is now a built-in MCP tool, not a separate agent
- LLM configuration now required - Code Research needs LLM provider configuration for synthesis and analysis
- Same functionality - Deep architectural research works the same way, just integrated directly into ChunkHound
### Migration Steps

1. Add LLM configuration to `.chunkhound.json`:

   ```json
   { "llm": { "provider": "claude-code-cli" } }
   ```

2. Remove the old agent file:

   ```bash
   rm .claude/agents/code-expert.md
   ```

3. Restart your MCP server:

   ```bash
   # Stop current server (Ctrl+C if running in terminal)
   # Restart ChunkHound MCP server
   chunkhound mcp
   ```

4. Verify it works:

   Ask your AI assistant: "Research the authentication implementation"

   You should see Code Research tool invocation instead of agent delegation.
### Will Both Run Simultaneously?

No. Once the agent file is removed, only the built-in tool runs. If you keep both, Claude Code may try to use the agent instead of the built-in tool, so remove the agent file to ensure a clean migration.
### What Breaks Without Migration?

Code Research features won't work without LLM configuration. However, the base search tools (semantic and regex) continue to work; only the Code Research tool requires an LLM provider.
## When to Use It

- Before implementing features - Find existing patterns to reuse instead of reinventing
- During debugging - Map complete flows to find the actual failure point
- Refactoring prep - Understand all dependencies before making changes
- Code archaeology - Learn unfamiliar systems quickly
## When to Use Direct Search Instead

Code Research is designed for architectural exploration. For simpler queries, use the base search tools directly:
- Quick symbol lookups - Use regex search to find all occurrences of a specific function or class name
- Known file/function - Use semantic search when you know roughly what you’re looking for
- Architectural questions - Use Code Research to understand how components interact and why
Via CLI:

```bash
chunkhound research "how does rate limiting work?"
```

Via MCP (with your AI assistant):

```
"Research our rate limiting implementation"
```

## What You Get

Code Research returns a structured markdown report with architectural insights and precise file citations. Here's what a typical report looks like:
### Example: Rate Limiting Research Output

````markdown
## Rate Limiting Architecture

### Overview

The application implements token bucket rate limiting using Redis for distributed state.
Rate limiting is applied at the middleware layer with per-endpoint configuration.

### Core Components

**RateLimitMiddleware** (`src/middleware/ratelimit.ts:45-120`)
- Token bucket algorithm with sliding window
- Redis-based distributed counters
- Custom headers for limit status
- Applied to 12 API endpoints

**Configuration** (`config/limits.yaml:1-30`)
- Per-endpoint rate definitions
- Default: 100 requests per 15-minute window
- Environment-based overrides supported

### Usage Pattern

Found across these endpoints:
- `POST /api/auth/login` - 5 requests/min (src/routes/auth.ts:23)
- `POST /api/users/create` - 10 requests/min (src/routes/users.ts:45)
- `GET /api/data/*` - 100 requests/min (src/routes/data.ts:67)

### Implementation Recommendation

Reuse existing middleware for new endpoints:

```typescript
app.use('/api/new-endpoint', rateLimiter({
  windowMs: 15 * 60 * 1000,
  max: 100
}));
```

### Key Files

- `src/middleware/ratelimit.ts` - Core implementation
- `src/services/redis.ts:89-145` - Redis client
- `config/limits.yaml` - Configuration
- `tests/middleware/ratelimit.test.ts` - Test examples
````

Parameters:

- `query` (required) - Your research question
The report includes:
- Architectural overview and design patterns
- Component locations with `file.ts:line` citations
- Usage examples from your codebase
- Implementation recommendations
## Setup & Configuration

Code Research requires an LLM provider for intelligent synthesis and query expansion. ChunkHound uses a dual-provider architecture:
- Utility Provider - Fast operations: query expansion, follow-up generation
- Synthesis Provider - Deep analysis: final synthesis with large context windows
Quick setup examples:
```jsonc
// Claude Code CLI (recommended for Claude Code users)
{ "llm": { "provider": "claude-code-cli" } }
```

```jsonc
// Codex CLI (recommended for Codex users)
{ "llm": { "provider": "codex-cli", "codex_reasoning_effort": "medium" } }
```

```jsonc
// OpenAI (for users without CLI subscriptions)
{ "llm": { "provider": "openai", "api_key": "sk-your-key" } }
```

For complete setup instructions including environment variables, mixed providers, and all configuration options, see the LLM Configuration section of the Configuration guide.
## How It Works

Code Research is a specialized sub-agent system optimized for code understanding. Unlike simple semantic search, which returns matching chunks, it performs breadth-first exploration of your codebase's semantic graph, following connections and understanding architectural relationships.
The system combines:
- Multi-hop semantic search: Starting from your query, it expands outward through semantic relationships, exploring connected components
- Hybrid semantic + symbol search: Discovers conceptually relevant code, then finds all exact symbol references for comprehensive coverage
- Intelligent synthesis: Generates structured markdown reports with architectural insights and precise `file:line` citations
Token budgets scale with repository size (30k-150k input tokens), and the system automatically allocates resources based on what it discovers.
For deep implementation details, see the Advanced: Technical Deep Dive section below or the Under the Hood documentation.
## Advanced: Technical Deep Dive

### Multi-Hop BFS Traversal

Starting from your query, the system expands outward through semantic relationships:
Query: "authentication error handling"
Level 0: Direct matches → auth_error_handler() → validate_credentials()
Level 1: Connected components (semantic neighbors) → error_logger() (shares error handling patterns) → token_validator() (shares auth validation logic)
Level 2: Architectural relationships → database_retry() (error logger uses it) → session_cleanup() (token validator calls it)At each level, an LLM generates context-aware follow-up questions to explore promising directions, turning semantic search into guided exploration of architectural connections.
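To make the traversal concrete, here is a minimal sketch of how such a loop could be orchestrated. The `search` and `followups` callables stand in for the real tools; their signatures, the `Chunk` type, and the score threshold are illustrative assumptions, not ChunkHound's actual internals:

```python
# Sketch of multi-hop BFS over semantic search (hypothetical helpers).
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class Chunk:
    id: int
    score: float  # reranked relevance
    text: str

def research_bfs(
    query: str,
    search: Callable[[str], Iterable[Chunk]],      # semantic search tool
    followups: Callable[[str, Chunk], list[str]],  # LLM follow-up generator
    max_depth: int = 2,
    min_score: float = 0.5,
) -> list[Chunk]:
    frontier = [query]             # queries to run at the current BFS level
    visited: dict[int, Chunk] = {}

    for _ in range(max_depth + 1):
        next_frontier: list[str] = []
        for q in frontier:
            for chunk in search(q):
                if chunk.score < min_score:
                    continue       # convergence: drop low-relevance tails
                if chunk.id not in visited:
                    visited[chunk.id] = chunk
                    # LLM proposes context-aware follow-up questions
                    next_frontier.extend(followups(q, chunk))
        if not next_frontier:
            break                  # nothing promising left to explore
        frontier = next_frontier

    return list(visited.values())
```

The score threshold plays the convergence-detection role described later: when expansion stops surfacing high-relevance chunks, traversal stops rather than wandering indefinitely.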
### Graph RAG Without the Graph

Traditional Graph RAG systems build explicit knowledge graphs: extracting entities, mining relationships, and storing them in graph databases. Code Research approximates graph-like exploration through orchestration, trading explicit relationship modeling for zero upfront cost and automatic scaling.

#### How Orchestration Creates a Virtual Graph

ChunkHound's base layer (cAST index + semantic/regex search) provides traditional RAG capabilities. The Code Research sub-agent orchestrates these tools to create Graph RAG behavior:
Base Layer Foundation:
- Chunks as nodes: cAST chunking preserves metadata (function names, class hierarchies, parameters, imports)
- Vector similarity as edges: Semantic search finds conceptually related chunks via HNSW index
- Symbol references as edges: Regex search finds all exact symbol occurrences
Orchestration Layer Creates the Graph:
- BFS traversal: Iteratively calls semantic search, starting from initial results and expanding through related chunks
- Query expansion: Generates multiple semantic entry points, exploring different “neighborhoods” in parallel
- Symbol extraction + regex: Pulls symbols from semantic results, triggers parallel regex to find all references
- Follow-up questions: Creates targeted queries based on discovered code, recursively exploring architectural boundaries
- Convergence detection: Monitors score degradation to prevent infinite traversal
Because cAST chunks preserve semantic boundaries, multi-hop expansion follows meaningful architectural connections rather than arbitrary text proximity. This structural awareness is why orchestration can approximate graph traversal—the base chunks already encode relationships that orchestration discovers through iterative search.
The virtual graph emerges through orchestrated tool use, not pre-computed storage:
- Initial semantic search → discovers conceptually relevant chunks
- Multi-hop expansion → follows vector similarity “edges” through BFS
- Symbol extraction → identifies key entities from high-relevance results
- Regex search → finds all references, completing the “graph” of connections
- Follow-ups → explores architectural relationships discovered in results
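Continuing the BFS sketch above, the whole pipeline can be pictured as one short orchestration loop. Here `expand_query`, `semantic_search`, `generate_followups`, `extract_symbols`, `regex_search`, and `synthesize` are illustrative stand-ins, not ChunkHound's actual API:

```python
# Hypothetical end-to-end orchestration; helper names are illustrative.
import re

def code_research(question: str) -> str:
    chunks: dict[int, Chunk] = {}

    # 1-2. Query expansion, then multi-hop BFS from each entry point
    for q in expand_query(question):              # several "neighborhoods"
        for chunk in research_bfs(q, search=semantic_search,
                                  followups=generate_followups):
            chunks.setdefault(chunk.id, chunk)

    # 3-4. Symbol extraction + regex completes the reference "edges"
    for symbol in extract_symbols(chunks.values()):
        pattern = rf"\b{re.escape(symbol)}\b"
        for chunk in regex_search(pattern):       # local DB, no API cost
            chunks.setdefault(chunk.id, chunk)

    # 5. Synthesize a structured markdown report with file:line citations
    return synthesize(question, list(chunks.values()))
```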
This approach scales efficiently to multi-million LOC repositories because there’s no explicit graph to maintain—the “graph” is the pattern of orchestrated search calls, adapted dynamically to each query’s needs.
### Hybrid Semantic + Symbol Search

After each semantic search finds conceptually relevant chunks, the system extracts symbols (function names, class names, parameter names) and runs parallel regex searches to find every occurrence of those symbols across the codebase.
This hybrid approach combines:
- Semantic search: Discovers what’s conceptually relevant (understanding)
- Regex search: Finds all exact symbol references (precision)
The results are unified through simple deduplication by chunk ID. Semantic results retain their reranked relevance scores from the multi-hop search phase, while regex results add new chunks containing exact symbol matches that weren’t discovered semantically. This gives you comprehensive coverage: the semantic “why this matters” plus the regex “everywhere this appears.” Since regex is a local database operation, this adds zero API costs while providing more complete results.
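The unification step is simple enough to sketch directly. Assuming each result carries a `chunk_id`, and semantic results carry their reranked score (names are illustrative):

```python
# Illustrative merge of semantic and regex results, deduplicated by chunk ID.
def merge_results(semantic_results, regex_results):
    merged = {}
    # Semantic results keep their reranked relevance scores
    for r in semantic_results:
        merged[r.chunk_id] = r
    # Regex results add exact-match chunks not found semantically;
    # existing semantic entries win on conflict
    for r in regex_results:
        merged.setdefault(r.chunk_id, r)
    return list(merged.values())
```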
### Why This Works

Traditional semantic search finds conceptually similar code but misses architectural relationships. Knowledge graphs model these relationships explicitly but require expensive upfront extraction and ongoing maintenance.
Code Research combines base search capabilities (semantic + regex) with intelligent orchestration:
- Query expansion - Multiple semantic entry points discover different code neighborhoods
- Multi-hop exploration - BFS through semantic neighborhoods following architectural connections
- Symbol extraction + regex - Comprehensive coverage beyond semantic discovery
- Follow-up generation - Context-aware questions explore architectural boundaries
- Adaptive scaling - Token budgets (30k-150k) scale with codebase size
- Map-reduce synthesis - Parallel cluster synthesis with deterministic citation remapping
The virtual graph emerges through orchestrated tool use—no upfront construction, no separate storage, no synchronization overhead. Query-adaptive orchestration scales from quick searches to deep architectural exploration automatically.
### Adaptive Scaling

Token budgets scale with repository size (30k-150k input tokens) and traversal depth (shallow→deep). The system automatically allocates resources based on what it's discovering.
### Intelligent Synthesis

Small result sets use single-pass synthesis (one LLM call). Large result sets trigger map-reduce synthesis (cluster chunks, synthesize clusters, combine summaries). Output is always a structured markdown report with architectural insights and `file.ts:45` citations.
### Quality Filtering Before Synthesis

Code Research doesn't blindly pass all collected chunks to synthesis. After BFS exploration completes, the system performs a final reranking pass against your original query to filter for quality and relevance:
- File-level reranking: All discovered files are reranked using the reranker model against your original question
- Token budget allocation: Files are prioritized by relevance score, and only the highest-scoring files fit within the synthesis token budget
- Chunk filtering: Only chunks from budgeted files make it to the final synthesis
This implements a classic precision-recall tradeoff—cast a wide net during exploration (maximize recall), then filter for quality before synthesis (maximize precision). Low-relevance findings are excluded, ensuring the LLM synthesizes only the most pertinent architectural insights.
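A sketch of the filtering step under these assumptions (a hypothetical `rerank` call returning file-level scores; all names are illustrative):

```python
# Hypothetical precision filter: rerank files, then greedily fill the budget.
def filter_for_synthesis(question, files, budget_tokens):
    scored = rerank(question, files)                  # reranker model scores
    scored.sort(key=lambda f: f.score, reverse=True)

    selected, used = [], 0
    for f in scored:
        if used + f.token_count > budget_tokens:
            continue                                  # skip files that overflow
        selected.append(f)
        used += f.token_count

    # Only chunks from budgeted files reach the final synthesis
    return [c for f in selected for c in f.chunks]
```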
### Map-Reduce Synthesis with Clustering

When semantic clustering produces multiple clusters from filtered results, the system uses two-phase HDBSCAN clustering with map-reduce synthesis to prevent context collapse:

- Phase 1 (Natural Boundary Discovery): HDBSCAN (Hierarchical Density-Based Spatial Clustering) discovers natural semantic boundaries in the embedding space, grouping files where they are cohesively related rather than forcing arbitrary partitions. This respects the inherent structure of your codebase, identifying both semantically dense clusters and outliers that don't fit natural groupings.

- Phase 2 (Token-Budget Grouping): Clusters are greedily merged based on centroid distance while respecting the 30k token limit per cluster, preserving semantic coherence during merging (see the sketch after this list).
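A minimal sketch of both phases using the `hdbscan` package. The greedy merge heuristic and token accounting here are illustrative assumptions, not ChunkHound's exact algorithm:

```python
# Two-phase clustering sketch: HDBSCAN boundaries, then budgeted merging.
import numpy as np
import hdbscan  # pip install hdbscan

def two_phase_clusters(embeddings: np.ndarray, tokens: list[int],
                       limit: int = 30_000) -> list[list[int]]:
    # Phase 1: discover natural boundaries; label -1 marks outliers, which
    # become singleton clusters instead of being forced into a grouping.
    labels = hdbscan.HDBSCAN(min_cluster_size=2).fit_predict(embeddings)
    buckets: dict = {}
    for i, lbl in enumerate(labels):
        buckets.setdefault(f"noise-{i}" if lbl == -1 else int(lbl), []).append(i)
    clusters = list(buckets.values())

    def centroid(members): return embeddings[members].mean(axis=0)
    def size(members): return sum(tokens[i] for i in members)

    # Phase 2: repeatedly merge the two closest centroids whose combined
    # size still fits within the per-cluster token limit.
    while True:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                if size(clusters[a]) + size(clusters[b]) > limit:
                    continue
                d = float(np.linalg.norm(centroid(clusters[a]) - centroid(clusters[b])))
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best is None:
            return clusters
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
```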
Files are partitioned into token-bounded clusters, synthesized in parallel with cluster-local citations [1][2][3], then deterministically remapped to global numbers before the reduce phase combines summaries.
This avoids progressive compression loss from iterative summarization chains (summary → summary-of-summary). Each cluster synthesizes once with full context, preserving architectural details while enabling arbitrary scaling. Cluster-local citation namespaces enable maximum parallelism—no coordination needed during map phase. The reduce LLM integrates remapped summaries with explicit instructions to preserve citations (not generate new ones), ensuring every [N] traces to actual source files.
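The remapping itself is deterministic string rewriting, no LLM involved. A sketch under these assumptions: each cluster summary uses local `[1]`, `[2]`, ... markers, and `cited_files[n-1]` is the file behind local citation `[n]` (the regex and data shapes are illustrative):

```python
# Illustrative deterministic remap of cluster-local citations to global numbers.
import re

def remap_citations(cluster_summaries: list[tuple[str, list[str]]]):
    global_files: list[str] = []   # global citation order, built as we go
    remapped: list[str] = []
    for text, files in cluster_summaries:
        offsets: dict[int, int] = {}
        for local, path in enumerate(files, start=1):
            if path not in global_files:
                global_files.append(path)
            offsets[local] = global_files.index(path) + 1
        # Rewrite [local] -> [global]; every [N] still traces to a real file
        text = re.sub(r"\[(\d+)\]",
                      lambda m: f"[{offsets[int(m.group(1))]}]", text)
        remapped.append(text)
    return remapped, global_files
```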
Result: 10KB repos use single-pass synthesis (k=1), 1M+ LOC repos automatically scale to map-reduce (k=5+) without context collapse or citation hallucination.