sem-mcp performance

6 MCP tools. Entity-level intelligence for agents.
Real numbers from sem's own codebase (48 Rust files).

6 MCP tools
75% fewer tokens
4.7x faster with caching
2.3x agent accuracy

Token efficiency

How many tokens does an agent need to understand EntityGraph and everything it affects?

Read all 73 source files: ~32,000 tokens
sem_context (8K budget): 8,000 tokens · 121 entities packed
sem_context (4K budget): 4,000 tokens · 44 entities packed
Read just graph.rs: ~3,906 tokens · 1 file, no cross-file deps
sem_context (2K budget): 1,887 tokens · target entity only
sem_context packs the target entity + all dependents + transitive signatures into a token budget. The agent gets the blast radius, not the whole repo. At 8K tokens, it fits 121 entities from across the codebase.
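The packing step can be sketched as a greedy fill against the token budget. This is a minimal illustration, not sem's actual implementation: the entity names, token estimates, and ranking order below are assumptions.

```rust
// Hypothetical sketch: pack ranked entities (target first, then dependents,
// then transitive signatures) until the token budget is exhausted.
// Names and token costs are illustrative, not sem's actual data.

struct Entity {
    name: String,
    tokens: usize, // estimated token cost of including this entity
}

fn pack_context(ranked: &[Entity], budget: usize) -> Vec<String> {
    let mut used = 0;
    let mut packed = Vec::new();
    for e in ranked {
        if used + e.tokens > budget {
            continue; // doesn't fit; a smaller entity later still might
        }
        used += e.tokens;
        packed.push(e.name.clone());
    }
    packed
}

fn main() {
    let ranked = vec![
        Entity { name: "EntityGraph".into(), tokens: 900 },
        Entity { name: "GraphBuilder::build".into(), tokens: 700 },
        Entity { name: "resolve_refs (signature only)".into(), tokens: 300 },
    ];
    // A 1.5K budget fits the target plus one signature, not the full dependent.
    let packed = pack_context(&ranked, 1_500);
    assert_eq!(packed, vec!["EntityGraph", "resolve_refs (signature only)"]);
}
```

The greedy skip-and-continue is what lets a tight budget still pack many small signatures around one large target entity.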

Impact precision

"How many things break if EntityGraph changes?"

grep EntityGraph: 30 string matches (imports, comments, type annotations)
sem_impact (transitive): 304 entities in the transitive dependency chain
grep finds string matches. sem_impact walks the entity dependency graph and finds everything that transitively depends on the target. No hallucination, no false positives, no missed cross-file callers. In 56ms.
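A walk of this kind can be sketched as a breadth-first traversal over reverse dependency edges. This is an illustration of the technique only; the map shape, entity names, and function are assumptions, not sem's internals.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Hypothetical sketch: `dependents` maps an entity to the entities that
// directly depend on it; a BFS collects everything transitively affected.
fn transitive_impact(dependents: &HashMap<&str, Vec<&str>>, target: &str) -> HashSet<String> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::from([target]);
    while let Some(node) = queue.pop_front() {
        for &dep in dependents.get(node).into_iter().flatten() {
            if seen.insert(dep.to_string()) {
                queue.push_back(dep); // visit each dependent exactly once
            }
        }
    }
    seen
}

fn main() {
    let mut dependents = HashMap::new();
    dependents.insert("EntityGraph", vec!["GraphBuilder", "sem_impact"]);
    dependents.insert("GraphBuilder", vec!["sem_context"]);
    let impacted = transitive_impact(&dependents, "EntityGraph");
    // sem_context is reached only transitively, through GraphBuilder.
    assert_eq!(impacted.len(), 3);
    assert!(impacted.contains("sem_context"));
}
```

Because the traversal follows actual dependency edges rather than text, a caller two files away is found even when it never mentions the target by name.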

Test targeting

"Which tests should I run after changing EntityGraph?"

cargo test (run everything): 44 tests
sem_impact(mode="tests"): 24 tests that actually depend on EntityGraph
Impact analysis filtered to test entities. Agents know exactly which tests matter for their change. 45% fewer tests to run, zero guessing.
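The mode="tests" behavior amounts to the same transitive walk followed by a filter on entity kind. A minimal sketch, assuming a hypothetical Kind enum and entity names that are not sem's actual data model:

```rust
// Hypothetical sketch: mode="tests" is impact analysis plus a filter on
// entity kind. Kind and Entity are illustrative stand-ins.

#[derive(PartialEq)]
enum Kind {
    Function,
    Test,
}

struct Entity {
    name: &'static str,
    kind: Kind,
}

fn impacted_tests(impacted: Vec<Entity>) -> Vec<&'static str> {
    impacted
        .into_iter()
        .filter(|e| e.kind == Kind::Test) // keep only test entities
        .map(|e| e.name)
        .collect()
}

fn main() {
    let impacted = vec![
        Entity { name: "GraphBuilder::build", kind: Kind::Function },
        Entity { name: "test_graph_roundtrip", kind: Kind::Test },
        Entity { name: "test_impact_transitive", kind: Kind::Test },
    ];
    assert_eq!(
        impacted_tests(impacted),
        vec!["test_graph_roundtrip", "test_impact_transitive"]
    );
}
```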

Agent accuracy

Same questions about code changes, answered by Claude Sonnet 4.5. One gets sem diff JSON, the other gets raw git diff.

Question (metric): sem diff vs git diff
Q1: List added functions (F1): 93% vs 75%
Q2: Files with modified entities (F1): 100% vs 55%
Q3: Entity type counts (accuracy): 91% vs 13%
Q4: Added/modified/deleted counts (exact): 100% vs 22%
Overall average: 96% vs 41%
+131% improvement with structured entity diffs.

See detailed findings and failure modes →

Graph caching

Agents call multiple graph tools per session. Without caching, each tool rebuilds the entity graph from scratch. With caching, the first call builds it once and every subsequent call reuses it.

6 tool calls in sequence: entities + diff + blame + impact + log + context, 5 of which need the entity graph.

Without caching (5 rebuilds, separate processes): 495ms
With caching (1 build + 4 memory-cache hits, one session): 106ms
4.7x faster. 389ms saved per session.
Cold start (no cache): 114ms
Warm start (SQLite): 96ms
Avg per call (cached): 21ms
Two cache layers: in-memory (keyed by file manifest hash) and SQLite at .sem/cache.db. Memory cache serves sequential calls in the same session. SQLite cache survives process restarts. If any file's mtime changes, the cache is invalidated and the graph rebuilds.
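The keying and invalidation scheme can be sketched as hashing the (path, mtime) manifest into a cache key. Illustrative only: the hasher choice, the String stand-in for the graph, and the GraphCache type are assumptions, not sem's implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical sketch: the key hashes every (path, mtime) pair, so touching
// any file yields a new key and a stale graph is never served.
fn manifest_key(files: &[(&str, u64)]) -> u64 {
    let mut h = DefaultHasher::new();
    for (path, mtime) in files {
        path.hash(&mut h);
        mtime.hash(&mut h);
    }
    h.finish()
}

struct GraphCache {
    memory: HashMap<u64, String>, // manifest key -> built graph (stand-in type)
}

impl GraphCache {
    // First call with a key builds; later calls with the same key hit cache.
    fn get_or_build(&mut self, key: u64, build: impl FnOnce() -> String) -> &String {
        self.memory.entry(key).or_insert_with(build)
    }
}

fn main() {
    let before = manifest_key(&[("graph.rs", 100), ("lib.rs", 100)]);
    let after = manifest_key(&[("graph.rs", 101), ("lib.rs", 100)]); // mtime bump
    assert_ne!(before, after); // any file change invalidates the key

    let mut cache = GraphCache { memory: HashMap::new() };
    cache.get_or_build(before, || "graph v1".to_string());
    let hit = cache.get_or_build(before, || unreachable!("cache hit expected"));
    assert_eq!(hit, "graph v1");
}
```

Keying by the whole manifest, rather than tracking dirty files individually, trades incremental rebuilds for a check that is trivially correct: any change anywhere produces a different key.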

Latency (per tool)

Every tool, measured against sem's own codebase (48 Rust files). Single-process cold calls.

sem_entities: 30ms
sem_diff: 47ms
sem_blame: 40ms
sem_log: 85ms
sem_impact: 120ms
sem_context: 136ms

Cold per-process calls. In a live session with graph caching, graph tools (impact, context) average 21ms after the first call.