Learn

Deep dives into how weave works and why it exists.

How weave-driver differs from git's merge driver

Git merges lines. Weave merges entities. A deep dive into why line-level merging fails for multi-agent workflows, and how entity-level merging fixes it.

merge · fundamentals

What is a CRDT and why does weave use one?

CRDTs let multiple agents edit shared state without coordination. How weave uses Automerge to track who's editing what, and why advisory locks beat hard locks.

crdt · fundamentals

How weave-driver differs from git's merge driver

merge · fundamentals

Git's merge driver: line-level

Git's built-in merge works at the line level. When you run git merge, git compares the base version with both branches line by line. It uses a diff algorithm (Myers diff by default) that finds the longest common subsequence of lines, then identifies which lines were added, removed, or changed in each branch.

The critical concept is hunks — contiguous groups of changed lines. Git identifies hunks in each branch, then checks if any hunks from the two branches overlap or are adjacent. If they do, git declares a conflict, even if the actual changes are logically independent.

The problem: Git has zero understanding of what the lines mean. It doesn't know what a function is, what a class is, or where one logical unit ends and another begins. Two agents editing completely different functions can trigger a conflict simply because the functions happen to be near each other in the file.
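
The overlap check described above can be sketched in a few lines. This is a simplified illustration, not git's actual xdiff logic: the `Hunk` shape and the `hunksCollide` and `findCollisions` helpers are hypothetical names, and each hunk is reduced to the range of base-file lines it replaces.

```typescript
// A hunk is described here by the half-open range [start, end) of base-file
// lines it replaces. A pure insertion has start === end.
interface Hunk {
  start: number;
  end: number;
}

// Simplified rule: two hunks collide when their base ranges overlap or touch.
// Nothing here looks at what the lines contain.
function hunksCollide(a: Hunk, b: Hunk): boolean {
  return a.start <= b.end && b.start <= a.end;
}

// Returns every pair of hunks (one from each branch) that would collide.
function findCollisions(ours: Hunk[], theirs: Hunk[]): Array<[Hunk, Hunk]> {
  const collisions: Array<[Hunk, Hunk]> = [];
  for (const o of ours) {
    for (const t of theirs) {
      if (hunksCollide(o, t)) collisions.push([o, t]);
    }
  }
  return collisions;
}

// Ours replaces base lines 10-11, theirs replaces base lines 12-14: adjacent.
const touching = findCollisions([{ start: 10, end: 12 }], [{ start: 12, end: 15 }]);
// Ours replaces base lines 10-11, theirs replaces base lines 20-22: far apart.
const apart = findCollisions([{ start: 10, end: 12 }], [{ start: 20, end: 23 }]);

console.log(touching.length, apart.length); // 1 0
```

The check depends only on line ranges, which is why two unrelated functions that share a boundary can still collide.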

weave-driver: entity-level

weave-driver replaces git's line-level comparison with entity-level comparison. Instead of "which lines changed?", weave asks "which functions/classes/properties changed?"

Here's how it works:

  1. Parsing step: weave runs sem-core (tree-sitter based) on all three versions of the file (base, ours, theirs) to extract named entities — functions, classes, methods, interfaces, types, properties, etc.
  2. Matching step: Entities are matched across the three versions by their stable ID (type + name + parent). function::processData in base is matched to function::processData in ours and theirs.
  3. Per-entity resolution: Each entity is resolved independently. If only one branch modified processData, that version wins. If only the other branch modified validateInput, that version wins. No conflict — they're different entities.
  4. Fallback: If both branches modified the same entity differently, weave falls back to diffy::merge (3-way line merge) on just that entity's body. So you only get a conflict when two branches genuinely changed the same function in incompatible ways.
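
The four steps above can be sketched as a small three-way resolver. This is a minimal sketch under stated assumptions, not weave's implementation: parsing is skipped entirely, each file version is assumed to be already reduced to a record from stable ID to entity source text, and the `Entities`, `Resolution`, `resolveEntity`, and `mergeEntities` names are hypothetical. The line-level fallback is only signalled, not performed.

```typescript
// Assumed input: stable ID (e.g. "function::processData") -> entity source text.
type Entities = Record<string, string>;

type Resolution =
  // body === undefined means the entity is absent from the merged result
  | { kind: "resolved"; body: string | undefined }
  // both sides changed the entity differently: hand off to a 3-way line merge
  | {
      kind: "needsLineMerge";
      base: string | undefined;
      ours: string | undefined;
      theirs: string | undefined;
    };

// Classic three-way rule, applied to one entity at a time.
function resolveEntity(
  base: string | undefined,
  ours: string | undefined,
  theirs: string | undefined
): Resolution {
  if (ours === theirs) return { kind: "resolved", body: ours }; // same on both sides
  if (ours === base) return { kind: "resolved", body: theirs }; // only theirs changed
  if (theirs === base) return { kind: "resolved", body: ours }; // only ours changed
  return { kind: "needsLineMerge", base, ours, theirs };
}

// Resolve every entity that appears in any of the three versions.
function mergeEntities(
  base: Entities,
  ours: Entities,
  theirs: Entities
): Record<string, Resolution> {
  const out: Record<string, Resolution> = {};
  const ids = Object.keys(base).concat(Object.keys(ours), Object.keys(theirs));
  for (const id of ids) {
    out[id] = resolveEntity(base[id], ours[id], theirs[id]);
  }
  return out;
}

// Placeholder bodies stand in for real source text.
const mergedEntities = mergeEntities(
  { "function::processData": "sync body", "function::validateInput": "null check" },
  { "function::processData": "async body (ours)", "function::validateInput": "null check" },
  { "function::processData": "sync body", "function::validateInput": "throwing check (theirs)" }
);

console.log(mergedEntities["function::processData"].kind);   // resolved
console.log(mergedEntities["function::validateInput"].kind); // resolved
```

Because each entity is resolved on its own, a change to one function can never conflict with a change to another, no matter how close they sit in the file.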

Why this matters for AI agents

This distinction becomes critical in multi-agent workflows. When you have two or more AI agents working on the same codebase:

  • Agent A is told to refactor processData() to use async/await
  • Agent B is told to add validation to validateInput()

Both agents might work in the same file. With git, even though they're editing completely different functions, the merge can conflict whenever the changed line ranges overlap or touch, for example when the two functions sit back to back. Someone (or something) then has to manually resolve a "conflict" that isn't a logical conflict at all.

With weave, the merge is automatic. weave sees that Agent A changed entity function::processData and Agent B changed entity function::validateInput. Different entities, no conflict, merge cleanly.

A concrete example

Consider this TypeScript file:

src/lib.ts (base)
export function processData(input: string) {
  return input.trim();
}

export function validateInput(data: unknown) {
  return data !== null;
}

Agent A changes processData:

Agent A's version (ours)
export async function processData(input: string) {
  const cleaned = input.trim();
  const result = await transform(cleaned);
  return result;
}

export function validateInput(data: unknown) {
  return data !== null;
}

Agent B changes validateInput:

Agent B's version (theirs)
export function processData(input: string) {
  return input.trim();
}

export function validateInput(data: unknown) {
  if (data === null || data === undefined) {
    throw new Error("Input required");
  }
  return true;
}

Git's result

CONFLICT (content): Merge conflict in src/lib.ts
Git reports this whenever the changed line ranges from the two branches overlap or touch relative to the base file, which becomes likely once edits to neighboring functions reach their shared boundary lines. Git can't tell the functions apart; it only sees changed lines.

Weave's result

2 entities matched, 2 modified, 0 conflicts
weave sees: function::processData changed in ours only → use ours. function::validateInput changed in theirs only → use theirs. Clean merge with both changes preserved.

The mental model

Think of it this way:

  • Git sees a file as a list of lines. Any change is a line change. Proximity = potential conflict.
  • Weave sees a file as a list of named entities. Each entity is resolved independently. Only same-entity, different-content changes conflict.

This is the same conceptual leap that git made over raw patches — operating at a higher semantic level to make merging smarter and less painful.

tl;dr: Git doesn't know what a function is. Weave does. That's the core difference, and it removes the false conflicts that arise when agents edit different entities in the same file.

What is a CRDT and why does weave use one?

crdt · fundamentals

The problem: shared state without a server

When multiple agents work on the same codebase, they need to coordinate. Agent A needs to know that Agent B is already editing processData() so it picks something else. But we don't want to run a central server — weave should work as a local tool, just like git.

This is the classic distributed systems problem: how do multiple writers share state without a central coordinator?

CRDT: Conflict-free Replicated Data Type

A CRDT is a data structure designed so that any two copies can always be merged without conflicts. No matter what order the operations arrive in, and even if some updates arrive late or are delivered twice, every copy that has seen the same updates ends up in the same state.

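The simplest example: a counter. Instead of storing "the count is 5", each agent keeps its own running total: Agent A has added 3, Agent B has added 2. To merge two copies, keep the larger total recorded for each agent; the counter's value is the sum across agents. It doesn't matter who went first: 3 + 2 = 2 + 3 = 5. Always. Merging the same copy twice changes nothing.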

The key insight: CRDTs are designed so the merge function is commutative (merge(A, B) = merge(B, A)), associative (merge(merge(A, B), C) = merge(A, merge(B, C))), and idempotent (merge(A, A) = A). This means copies converge to the same state regardless of the order in which they are merged or how often an update is repeated.
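
The counter example can be made concrete with a grow-only counter (often called a G-Counter). This is a generic textbook sketch, not code from weave or Automerge, and the `GCounter`, `increment`, `mergeCounters`, and `value` names are chosen for illustration.

```typescript
// One running total per agent. Totals only ever grow.
type GCounter = Record<string, number>;

// Returns a new counter with `by` added to the given agent's own total.
function increment(counter: GCounter, agent: string, by: number): GCounter {
  const next: GCounter = { ...counter };
  next[agent] = (next[agent] || 0) + by;
  return next;
}

// Merge keeps the larger total seen for each agent.
// Taking a per-agent maximum is commutative, associative, and idempotent.
function mergeCounters(a: GCounter, b: GCounter): GCounter {
  const out: GCounter = { ...a };
  for (const agent of Object.keys(b)) {
    out[agent] = Math.max(out[agent] || 0, b[agent]);
  }
  return out;
}

// The counter's value is the sum of all per-agent totals.
function value(counter: GCounter): number {
  let total = 0;
  for (const agent of Object.keys(counter)) total += counter[agent];
  return total;
}

const countA = increment({}, "agent-a", 3);
const countB = increment({}, "agent-b", 2);

console.log(value(mergeCounters(countA, countB))); // 5
console.log(value(mergeCounters(countB, countA))); // 5 (order does not matter)
console.log(value(mergeCounters(countA, countA))); // 3 (merging a copy with itself changes nothing)
```

Note that the merge is a per-agent maximum rather than a plain sum. That is what makes replaying the same update harmless.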

Automerge: a CRDT for JSON documents

weave uses Automerge, a CRDT library whose documents behave like JSON. A document can hold nested maps, lists, and scalar values, and any two copies of it can always be merged.

weave's state document tracks:

  • Entities: every code entity weave knows about — its name, type, file, who claimed it, who last modified it
  • Agents: every registered agent — its name, branch, last heartbeat, what it's working on
  • Operations: an audit log of claims, releases, and modifications
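
To make the three collections above easier to picture, here is one possible shape for the state document, written as plain TypeScript types. The article lists what is tracked but not the exact schema, so every field name below (`claimedBy`, `lastHeartbeat`, `workingOn`, and so on) is an assumption, as is the `claimsHeldBy` helper.

```typescript
// Hypothetical schema: field names are illustrative, not weave's actual layout.
interface EntityRecord {
  name: string;
  type: string;                  // e.g. "function", "class"
  file: string;
  claimedBy: string | null;      // agent name, or null when unclaimed
  lastModifiedBy: string | null;
}

interface AgentRecord {
  name: string;
  branch: string;
  lastHeartbeat: number;         // assumed to be epoch milliseconds
  workingOn: string[];           // entity IDs
}

interface OperationRecord {
  kind: "claim" | "release" | "modify";
  agent: string;
  entityId: string;
  at: number;                    // assumed to be epoch milliseconds
}

interface StateDoc {
  entities: Record<string, EntityRecord>;
  agents: Record<string, AgentRecord>;
  operations: OperationRecord[];
}

// Lists the IDs of all entities currently claimed by one agent.
function claimsHeldBy(doc: StateDoc, agentName: string): string[] {
  const ids: string[] = [];
  for (const id of Object.keys(doc.entities)) {
    if (doc.entities[id].claimedBy === agentName) ids.push(id);
  }
  return ids;
}

const exampleDoc: StateDoc = {
  entities: {
    "function::validateToken": {
      name: "validateToken", type: "function", file: "src/auth.ts",
      claimedBy: "claude-1", lastModifiedBy: null,
    },
    "function::processData": {
      name: "processData", type: "function", file: "src/lib.ts",
      claimedBy: null, lastModifiedBy: null,
    },
  },
  agents: {
    "claude-1": {
      name: "claude-1", branch: "feature-auth",
      lastHeartbeat: 0, workingOn: ["function::validateToken"],
    },
  },
  operations: [
    { kind: "claim", agent: "claude-1", entityId: "function::validateToken", at: 0 },
  ],
};

console.log(claimsHeldBy(exampleDoc, "claude-1")); // [ 'function::validateToken' ]
```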

This is saved as a binary file at .weave/state.automerge in your repo root. It's not committed to git — it's local coordination state, like .git/ itself.

Why advisory locks, not hard locks?

When Agent A "claims" an entity, it's a signal, not enforcement. Agent B can still edit the claimed entity if it wants to. The merge driver will still handle it correctly.

Why not enforce it?

  1. Agents crash. If Agent A holds a hard lock and crashes, the entity is stuck. Advisory locks + heartbeat timeouts handle this gracefully — stale claims are automatically released.
  2. The merge driver is the safety net. Even if two agents edit the same entity, weave-driver resolves it at merge time. Claims are an optimization to prevent conflicts, not a requirement.
  3. Optimistic concurrency works better. In practice, most agents cooperate. Penalizing the common case (no conflict) to prevent the rare case (actual conflict) is bad ergonomics.

The philosophy: Claims are like turn signals in traffic. They communicate intent. Other drivers (agents) generally respect them. But if someone ignores a signal, the road (merge driver) still handles it: you don't crash, you just merge.
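
The difference between a signal and enforcement can be sketched as follows. This is an illustration under assumed names (`Claims`, `claimEntity`, `recordEdit`), not weave's API: a claim can be refused when someone else already holds it, but an edit is never rejected because of a claim.

```typescript
// Hypothetical claim table: entity ID -> name of the claiming agent.
type Claims = Record<string, string>;

interface ClaimOutcome {
  claimed: boolean;
  holder: string; // whoever holds the claim after the call
}

// Claiming is first-come, first-served, but it only records intent.
function claimEntity(claims: Claims, entityId: string, agent: string): ClaimOutcome {
  const holder = claims[entityId];
  if (holder !== undefined && holder !== agent) {
    return { claimed: false, holder };
  }
  claims[entityId] = agent;
  return { claimed: true, holder: agent };
}

// Advisory semantics: the edit always goes through. The result only reports
// whether it overlaps another agent's claim, leaving resolution to merge time.
function recordEdit(
  claims: Claims,
  entityId: string,
  agent: string
): { allowed: true; overlapsClaimOf: string | null } {
  const holder = claims[entityId];
  const overlaps = holder !== undefined && holder !== agent;
  return { allowed: true, overlapsClaimOf: overlaps ? holder : null };
}

const claims: Claims = {};
const first = claimEntity(claims, "function::processData", "agent-a");
const second = claimEntity(claims, "function::processData", "agent-b");
const edit = recordEdit(claims, "function::processData", "agent-b");

console.log(first.claimed, second.claimed, second.holder); // true false agent-a
console.log(edit.allowed, edit.overlapsClaimOf);           // true agent-a
```

A hard lock would make `recordEdit` fail for agent-b. The advisory version surfaces the overlap instead and leaves the outcome to the merge driver.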

The coordination flow

Here's how it works end to end:

  1. Agent registers itself: weave_agent_register({agent_id: "claude-1", branch: "feature-auth"})
  2. Agent checks what's in a file: weave_extract_entities({file_path: "src/auth.ts"})
  3. Agent checks for existing claims: weave_who_is_editing({file_path: "src/auth.ts", entity_name: "validateToken"})
  4. If unclaimed, agent claims it: weave_claim_entity({...})
  5. Agent edits the code, periodically heartbeating
  6. Agent releases when done: weave_release_entity({...})

If an agent crashes mid-work, its heartbeat stops. After a configurable timeout, cleanup_stale_agents releases all its claims so other agents can take over.
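
The timeout behaviour described above can be sketched like this. The article names cleanup_stale_agents but does not show it, so the `AgentState` shape, the `cleanupStaleAgents` function, and the 60-second timeout below are all assumptions made for illustration.

```typescript
// Hypothetical per-agent state: last heartbeat time and currently held claims.
interface AgentState {
  lastHeartbeat: number; // epoch milliseconds (assumed unit)
  claims: string[];      // entity IDs
}

type Agents = Record<string, AgentState>;

// Records that an agent is still alive. Unknown agents are ignored.
function heartbeat(agents: Agents, agentId: string, now: number): void {
  const agent = agents[agentId];
  if (agent !== undefined) agent.lastHeartbeat = now;
}

// Releases the claims of every agent whose last heartbeat is older than
// timeoutMs, and returns the entity IDs that became free.
function cleanupStaleAgents(agents: Agents, now: number, timeoutMs: number): string[] {
  const released: string[] = [];
  for (const id of Object.keys(agents)) {
    const agent = agents[id];
    if (now - agent.lastHeartbeat > timeoutMs) {
      released.push(...agent.claims);
      agent.claims = [];
    }
  }
  return released;
}

const agents: Agents = {
  "claude-1": { lastHeartbeat: 0, claims: ["function::validateToken"] },
  "claude-2": { lastHeartbeat: 0, claims: ["function::processData"] },
};

// claude-2 keeps heartbeating; claude-1 has crashed and stays silent.
heartbeat(agents, "claude-2", 50000);

const released = cleanupStaleAgents(agents, 70000, 60000);
console.log(released);                  // [ 'function::validateToken' ]
console.log(agents["claude-2"].claims); // [ 'function::processData' ]
```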

Why not just use a database?

Three reasons:

  • No server needed. A CRDT is just a file. No PostgreSQL, no Redis, no Docker. .weave/state.automerge works on any machine, offline, instantly.
  • Mergeable by design. If two agents update the state at the same time, their Automerge documents can still be merged deterministically. Concurrent writes to a plain JSON file could silently overwrite each other or leave the file corrupted.
  • Compact and fast. Automerge uses a binary format. The state file is typically a few KB. Reads and writes are sub-millisecond.

tl;dr: A CRDT gives weave a "shared whiteboard" where agents can signal what they're working on. It's just a file, needs no server, and its copies can always be merged into a consistent state. Claims are advisory; the merge driver is the real safety net.