Commands

Every command supports --format json for machine-readable output.

sem diff                       Semantic diff of changes
  -s, --staged                 Show staged changes only
  -c, --commit <sha>           Diff a specific commit
  --from <ref> --to <ref>      Diff a commit range
  -f, --format <fmt>           terminal, json, plain, or markdown
  --file-exts <ext>...         Filter by extension (e.g. .py .rs)
  -v, --verbose                Show inline before/after content
  --stdin                      Read file changes from stdin (no git needed)

sem impact                     What breaks if this entity changes?
  <entity>                     Name of the entity to analyze
  --file <path>                Disambiguate when multiple entities share a name
  --deps                       Show direct dependencies only
  --dependents                 Show direct dependents only
  --tests                      Show affected test entities only
  --json                       Output as JSON
  --file-exts <ext>...         Filter by extension

sem blame                      Who last modified each function/class
  <file>                       File to blame
  --json                       Output as JSON

sem log                        Evolution of an entity through git history
  <entity>                     Name of the entity to trace
  --file <path>                File containing the entity (auto-detected if omitted)
  --limit <n>                  Maximum commits to scan (default: 50)
  -v, --verbose                Show content diff between versions
  --json                       Output as JSON

sem entities                   List entities under a file or directory path
  [path]                       File or directory path; defaults to .
  --json                       Output as JSON

sem context                    Token-budgeted context for an entity (for AI prompts)
  <entity>                     Name of the entity
  --file <path>                Disambiguate when multiple entities share a name
  --budget <n>                 Token budget (default: 8000)
  --json                       Output as JSON
  --file-exts <ext>...         Filter by extension

sem setup                      Replace git diff with sem diff globally
                               Installs a git diff wrapper so every git diff runs through sem automatically

sem unsetup                    Restore default git diff behavior

JSON output

Pipe into your AI agent, CI pipeline, or automation.

sem diff --format json | jq
{
  "summary": {
    "fileCount": 2,
    "added": 1,
    "modified": 1,
    "deleted": 1,
    "total": 3
  },
  "changes": [
    {
      "entityId": "src/auth.ts::function::validateToken",
      "changeType": "added",
      "entityType": "function",
      "entityName": "validateToken",
      "filePath": "src/auth.ts"
    }
  ]
}
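
For automation, the payload above can be consumed with any JSON parser. The following is a minimal Python sketch that relies only on the fields visible in the sample. The helper name summarize_changes is hypothetical and not part of sem, and the embedded string stands in for real sem diff --format json output.

```python
import json
from collections import defaultdict

# Sample payload copied from the output above; in practice this would be
# the stdout of `sem diff --format json`.
SAMPLE = """
{
  "summary": {"fileCount": 2, "added": 1, "modified": 1, "deleted": 1, "total": 3},
  "changes": [
    {
      "entityId": "src/auth.ts::function::validateToken",
      "changeType": "added",
      "entityType": "function",
      "entityName": "validateToken",
      "filePath": "src/auth.ts"
    }
  ]
}
"""


def summarize_changes(payload: str) -> dict[str, list[str]]:
    """Group entity IDs by change type (hypothetical helper, not part of sem)."""
    report = json.loads(payload)
    grouped: dict[str, list[str]] = defaultdict(list)
    for change in report.get("changes", []):
        grouped[change["changeType"]].append(change["entityId"])
    return dict(grouped)


if __name__ == "__main__":
    for change_type, entity_ids in summarize_changes(SAMPLE).items():
        print(change_type, entity_ids)
```

A CI step could use the same grouping to fail a build when, say, any entity under a protected path shows up as deleted.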

Entity matching

A three-phase algorithm detects additions, modifications, deletions, renames, and moves.

Phase 1: Exact ID. Same entity ID in before and after? Modified or unchanged.
Phase 2: Structural hash. Same AST structure, different name? Renamed or moved.
Phase 3: Fuzzy similarity. More than 80% token overlap? Probable rename.
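
The three phases can be sketched in Python as below. This is an illustration under stated assumptions, not sem's actual implementation: the Entity type, the match_entities function, and the change labels are hypothetical; the structural hash is taken as a precomputed string; and "token overlap" is approximated with Jaccard similarity over token sets, which may differ from the metric sem uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_id: str            # e.g. "src/auth.ts::function::validateToken"
    structural_hash: str      # assumed: hash of the AST shape, names excluded
    tokens: tuple[str, ...]   # assumed: lexed tokens of the entity body
    content: str              # raw text, used to tell modified from unchanged


def token_overlap(a: Entity, b: Entity) -> float:
    """Jaccard overlap of token sets (an assumed stand-in for sem's metric)."""
    sa, sb = set(a.tokens), set(b.tokens)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def match_entities(before, after, threshold=0.8):
    """Return (change_type, before_id, after_id) tuples for two snapshots."""
    old = {e.entity_id: e for e in before}
    new = {e.entity_id: e for e in after}
    changes = []

    # Phase 1: exact ID. Present on both sides means modified or unchanged.
    for eid in sorted(old.keys() & new.keys()):
        if old[eid].content != new[eid].content:
            changes.append(("modified", eid, eid))
    gone = [e for e in before if e.entity_id not in new]
    fresh = [e for e in after if e.entity_id not in old]

    # Phase 2: structural hash. Same shape under a different ID.
    for o in list(gone):
        twin = next((n for n in fresh if n.structural_hash == o.structural_hash), None)
        if twin is not None:
            changes.append(("renamed_or_moved", o.entity_id, twin.entity_id))
            gone.remove(o)
            fresh.remove(twin)

    # Phase 3: fuzzy similarity. Best remaining candidate above the threshold.
    for o in list(gone):
        best = max(fresh, key=lambda n: token_overlap(o, n), default=None)
        if best is not None and token_overlap(o, best) > threshold:
            changes.append(("probable_rename", o.entity_id, best.entity_id))
            gone.remove(o)
            fresh.remove(best)

    # Whatever is still unmatched is a plain deletion or addition.
    changes += [("deleted", o.entity_id, None) for o in gone]
    changes += [("added", None, n.entity_id) for n in fresh]
    return changes
```

Running the cheap exact-ID pass first means the more expensive similarity comparison in phase 3 only touches entities that neither earlier phase could pair up.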

sem vs git

Feature            git diff                 sem diff
Granularity        lines                    entities (functions, classes, properties)
Code parsing       no                       tree-sitter (21 languages)
Config files       lines                    key-path entities
Rename detection   heuristic (file-level)   3-phase (ID + hash + fuzzy)
Machine output     patch format             JSON
Agent accuracy     41.5%                    95.9% (benchmark)
Speed              9ms                      8ms

Benchmarks

Measured with hyperfine on the sem repo. 50 runs, median reported.

Small commit (1 file)         5ms
Medium commit (5 files)       8ms
Large commit (13 files)       19ms
Range (8 commits, 30 files)   24ms

sem diff vs git diff

Same commit (5 files), same repo.

git diff (line-level only)    9ms
sem diff (entity-level)       8ms

On this commit, sem diff is about 1ms faster than git diff (8ms vs 9ms) while also doing semantic parsing, rename detection, and structural hashing.

Internal profiler

Built-in instrumentation via sem diff --profile. Shows where time is spent inside the binary.

Small commit (1 file, 8 entities)

git2 open repo                1.2ms
git diff + content            0.9ms
parse + match                 1.8ms
format output                 0.1ms
Total (wall)                  4.9ms

Large commit (13 files, 65 entities)

git2 open repo                1.2ms
git diff + content            3.6ms
parse + match (parallel)      10.8ms
format output                 0.2ms
Total (wall)                  19.4ms

Range: 8 commits (30 files, 1383 entities)

git2 open repo                1.2ms
git diff + content            3.8ms
parse + match (parallel)      17.2ms
format output                 0.4ms
Total (wall)                  24.0ms
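
The listed stages do not add up to the wall-clock totals, so it can help to compute each stage's share and the remainder. The Python sketch below does that arithmetic on the three profiles reported in this section. The breakdown helper is hypothetical, and what the unaccounted remainder consists of is not stated by the profiler output.

```python
# Stage timings in milliseconds, copied from the profiler output in this section.
PROFILES = {
    "small": {
        "stages": {"git2 open repo": 1.2, "git diff + content": 0.9,
                   "parse + match": 1.8, "format output": 0.1},
        "wall": 4.9,
    },
    "large": {
        "stages": {"git2 open repo": 1.2, "git diff + content": 3.6,
                   "parse + match": 10.8, "format output": 0.2},
        "wall": 19.4,
    },
    "range": {
        "stages": {"git2 open repo": 1.2, "git diff + content": 3.8,
                   "parse + match": 17.2, "format output": 0.4},
        "wall": 24.0,
    },
}


def breakdown(profile):
    """Return (percent share of wall time per stage, unaccounted ms)."""
    stages, wall = profile["stages"], profile["wall"]
    shares = {name: round(100 * ms / wall, 1) for name, ms in stages.items()}
    unaccounted = round(wall - sum(stages.values()), 1)
    return shares, unaccounted


if __name__ == "__main__":
    for name, profile in PROFILES.items():
        shares, rest = breakdown(profile)
        print(f"{name}: parse + match {shares['parse + match']}% of wall, "
              f"{rest}ms outside the listed stages")
```

On these numbers, parse + match grows from roughly 37% of wall time on the small commit to roughly 72% on the 8-commit range, while opening the repository stays constant at 1.2ms.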

Flame graph

CPU time breakdown for sem diff --commit (large, 13 files). Segments: git2 repo, git diff + content, tree-sitter parse, entity matching, format output.

All supported formats

Format         Extensions                     Entities
TypeScript     .ts .tsx .mts .cts             functions, classes, interfaces, types, enums, exports
JavaScript     .js .jsx .mjs .cjs             functions, classes, variables, exports
Python         .py                            functions, classes, decorated definitions
Go             .go                            functions, methods, types, vars, consts
Rust           .rs                            functions, structs, enums, impls, traits, mods, consts
Java           .java                          classes, methods, interfaces, enums, fields, constructors
C              .c .h                          functions, structs, enums, unions, typedefs
C++            .cpp .cc .hpp                  functions, classes, structs, enums, namespaces, templates
C#             .cs                            classes, methods, interfaces, enums, structs, properties
Ruby           .rb                            methods, classes, modules
PHP            .php                           functions, classes, methods, interfaces, traits, enums
Swift          .swift                         functions, classes, protocols, structs, enums, properties
Elixir         .ex .exs                       modules, functions, macros, guards, protocols
Bash           .sh                            functions
HCL/Terraform  .hcl .tf .tfvars               blocks, attributes (qualified names)
Kotlin         .kt .kts                       classes, interfaces, objects, functions, properties
Fortran        .f90 .f95 .f                   functions, subroutines, modules, programs
Vue            .vue                           template/script/style blocks + inner TS/JS entities
XML            .xml .plist .svg .csproj       elements (nested, tag-name identity)
ERB            .erb                           blocks, expressions, code tags
Svelte         .svelte .svelte.js .svelte.ts  component blocks, rune modules + inner JS/TS entities
JSON           .json                          properties, objects (RFC 6901 paths)
YAML           .yml .yaml                     sections, properties (dot paths)
TOML           .toml                          sections, properties
CSV            .csv .tsv                      rows (first column as ID)
Markdown       .md .mdx                       heading-based sections
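
To make the "RFC 6901 paths" entry for JSON concrete, here is a short Python sketch that derives a JSON Pointer for every property in a document. The escaping rules (~ becomes ~0, / becomes ~1) come from RFC 6901 itself. Which nodes sem actually reports as entities, and whether array items are included, is an assumption here, and the function names are hypothetical.

```python
def escape_token(key: str) -> str:
    """Escape one reference token per RFC 6901: '~' first, then '/'."""
    return key.replace("~", "~0").replace("/", "~1")


def pointer_paths(value, prefix=""):
    """List a JSON Pointer for every nested property and array item."""
    paths = []
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}/{escape_token(key)}"
            paths.append(path)
            paths.extend(pointer_paths(child, path))
    elif isinstance(value, list):
        # Assumption: array items are addressed by index, as RFC 6901 allows.
        for index, child in enumerate(value):
            path = f"{prefix}/{index}"
            paths.append(path)
            paths.extend(pointer_paths(child, path))
    return paths


if __name__ == "__main__":
    document = {"scripts": {"build": "tsc", "a/b": 1}, "tags": ["x"], "m~n": 2}
    for path in pointer_paths(document):
        print(path)
```

Because each path is unique within a file, it can serve as a stable identity for a property across two versions, in the same way a function name does for code.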