Architecture

gitmode is an npm package (import { createHandler } from "gitmode") that runs on Cloudflare’s edge infrastructure. It is built from the following components:

| Component | Role |
| --- | --- |
| Worker | HTTP router via createHandler(): matches git protocol and REST API routes, forwards to the Durable Object |
| RepoStore DO | Per-repo Durable Object with embedded SQLite for refs, HEAD, metadata, commit index |
| PackWorkerDO | Edge compute pool; each worker reads data from R2 and processes it locally (packfile assembly, diff, grep, worktree, tree walks) |
| R2 | Object storage for git objects bundled into ~2MB chunks, plus materialized worktree files |
| Zig WASM | Two modules: server (865KB, full git + libgit2) and client (83KB, SHA-1/zlib/delta only) |
| SSH Proxy | Node.js SSH-to-HTTP proxy for git clone ssh://... support (development only) |

A request flows through these components as follows:

  1. Git client sends HTTP request (e.g., git push) or SSH request via the SSH proxy
  2. Worker matches the URL pattern, extracts owner/repo, gets the RepoStore DO handle
  3. Worker forwards the request to the DO with action headers (x-action, x-repo-path)
  4. RepoStore DO handles the action:
    • For git protocol: delegates to handleInfoRefs, handleUploadPack, or handleReceivePack
    • For REST API: delegates to GitPorcelain methods via handleApiAction
  5. GitEngine reads/writes objects from R2 and refs from SQLite
  6. PackWorkerDO pool (compute-intensive operations): RepoStore fans out work to a dynamic pool of compute workers. Each worker reads data from R2 and processes it locally: packfile assembly, unified diffs, regex grep, worktree materialization, or tree walks. Pool size scales with workload, capped at a configurable maximum (default 20). The pool is used only when the PACK_WORKER binding is configured and the workload exceeds operation-specific thresholds.
  7. Zig WASM handles binary operations: SHA-1 hashing, zlib, packfile parsing, delta encoding
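
Steps 2 and 3 above can be sketched as a pure routing function. This is only an illustration: the URL shapes follow git's standard smart HTTP endpoints (info/refs, git-upload-pack, git-receive-pack), while the function names, the action strings, and the exact value format of x-repo-path are assumptions, not gitmode's actual internals.

```typescript
// Hypothetical route description; gitmode's real types may differ.
type Route = { owner: string; repo: string; action: string };

// Match a git smart-HTTP URL and extract owner/repo (step 2).
function matchGitRoute(method: string, url: string): Route | null {
  const { pathname, searchParams } = new URL(url);
  const m = pathname.match(
    /^\/([^/]+)\/([^/]+?)(?:\.git)?\/(info\/refs|git-upload-pack|git-receive-pack)$/
  );
  if (!m) return null;
  const [, owner, repo, tail] = m;

  if (tail === "info/refs") {
    // Ref advertisement: GET with ?service=git-upload-pack or git-receive-pack.
    if (method !== "GET") return null;
    const service = searchParams.get("service");
    if (service !== "git-upload-pack" && service !== "git-receive-pack") return null;
    return { owner, repo, action: `info-refs:${service}` }; // assumed action name
  }

  // Fetch and push bodies are always sent with POST.
  if (method !== "POST") return null;
  const action = tail === "git-upload-pack" ? "upload-pack" : "receive-pack";
  return { owner, repo, action };
}

// Build the action headers forwarded to the Durable Object (step 3).
function toActionHeaders(route: Route): Record<string, string> {
  return {
    "x-action": route.action,
    "x-repo-path": `${route.owner}/${route.repo}`, // assumed value format
  };
}
```

Keeping the matcher free of Worker-specific types makes it easy to unit-test outside the Cloudflare runtime; the Worker would then attach these headers to the request it forwards to the RepoStore DO.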

The RepoStore DO keeps its state in SQLite using the following schema:

-- Branch and tag refs
CREATE TABLE refs (
  name TEXT PRIMARY KEY, -- e.g. "heads/main", "tags/v1.0"
  sha  TEXT NOT NULL
);

-- HEAD pointer (symbolic ref or detached SHA)
CREATE TABLE head (
  id    INTEGER PRIMARY KEY CHECK (id = 1),
  value TEXT NOT NULL -- e.g. "ref: refs/heads/main" or a raw SHA
);

-- Repository metadata
CREATE TABLE repo_meta (
  id             INTEGER PRIMARY KEY CHECK (id = 1),
  owner          TEXT NOT NULL,
  name           TEXT NOT NULL,
  description    TEXT DEFAULT '',
  visibility     TEXT DEFAULT 'public',
  default_branch TEXT DEFAULT 'main',
  created_at     TEXT NOT NULL,
  updated_at     TEXT
);

-- Commit index for fast log queries
CREATE TABLE commits (
  sha       TEXT PRIMARY KEY,
  author    TEXT NOT NULL,
  message   TEXT NOT NULL,
  timestamp INTEGER NOT NULL
);

-- Object chunk index (maps SHA → R2 chunk for batch reads)
CREATE TABLE object_chunks (
  sha         TEXT PRIMARY KEY,
  chunk_key   TEXT NOT NULL,
  byte_offset INTEGER NOT NULL,
  byte_length INTEGER NOT NULL
);

-- File size cache (avoids R2 reads for stats endpoint)
CREATE TABLE file_sizes (
  sha  TEXT PRIMARY KEY,
  size INTEGER NOT NULL
);
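
To illustrate how the head and refs tables relate, the sketch below resolves HEAD to a commit SHA. It assumes ref names are stored without the refs/ prefix (as in the "heads/main" example in the schema comment) and uses an in-memory Map in place of the refs table; resolveHead is a hypothetical helper, not part of gitmode's API.

```typescript
// Resolve the single-row `head` value against the `refs` table.
// `refs` stands in for the SQLite table: name ("heads/main") -> sha.
function resolveHead(headValue: string, refs: Map<string, string>): string | null {
  const prefix = "ref: refs/";
  if (headValue.startsWith(prefix)) {
    // Symbolic ref: strip "ref: refs/" to get the key used in `refs`.
    const name = headValue.slice(prefix.length).trim();
    // A missing entry means the branch has no commits yet (unborn branch).
    return refs.get(name) ?? null;
  }
  // Detached HEAD: the value is itself a 40-character hex SHA-1.
  return /^[0-9a-f]{40}$/.test(headValue) ? headValue : null;
}
```

Returning null for an unborn branch lets a caller distinguish an empty repository from a lookup failure.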

Git objects are bundled into ~2MB chunks stored at {owner}/{repo}/chunks/{uuid}. Each chunk contains multiple zlib-compressed objects concatenated together. The object_chunks SQLite table maps each SHA to its chunk key, byte offset, and byte length for efficient extraction via R2 range reads.

Legacy loose objects (from before chunk storage) are stored at {owner}/{repo}/objects/{sha[0:2]}/{sha[2:]}. The engine falls back to loose object reads when an SHA is not found in the chunk index.
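
The lookup order described above (chunk index first, loose object as fallback) might look like the following. The row shape mirrors the object_chunks table and the loose key follows the documented path layout; locateObject and the Location type are illustrative names. The { offset, length } pair is shaped like the range option accepted by the R2 binding's get(), but no R2 call is made here.

```typescript
// One row of the object_chunks table, keyed by SHA in the index map.
interface ChunkRow {
  chunk_key: string;
  byte_offset: number;
  byte_length: number;
}

// Where to read an object from: a byte range of a chunk, or a loose object.
type ObjectLocation =
  | { kind: "chunk"; key: string; range: { offset: number; length: number } }
  | { kind: "loose"; key: string };

function locateObject(
  owner: string,
  repo: string,
  sha: string,
  index: Map<string, ChunkRow>
): ObjectLocation {
  const row = index.get(sha);
  if (row) {
    // Indexed object: read only its slice of the ~2MB chunk.
    return {
      kind: "chunk",
      key: row.chunk_key,
      range: { offset: row.byte_offset, length: row.byte_length },
    };
  }
  // Not in the chunk index: fall back to the legacy loose-object key.
  return {
    kind: "loose",
    key: `${owner}/${repo}/objects/${sha.slice(0, 2)}/${sha.slice(2)}`,
  };
}
```

Because the index stores offset and length per object, a reader never has to download or scan a whole chunk to extract one object.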

Object types:

| Type | Value | Description |
| --- | --- | --- |
| blob | 1 | File content |
| tree | 2 | Directory listing (mode + name + SHA entries) |
| commit | 3 | Commit metadata (tree, parents, author, message) |
| tag | 4 | Annotated tag (object, type, tag name, tagger, message) |

Each repository maps to exactly one Durable Object instance. All ref reads and writes go through the DO’s SQLite, which provides:

  • Strong consistency — reads always see the latest writes
  • Serialized updates — concurrent pushes are serialized by the DO
  • Atomic multi-ref updates — a single push can update multiple refs atomically

R2 objects are immutable (content-addressed by SHA-1), so they don’t require coordination.
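
A minimal sketch of an atomic multi-ref update follows, assuming each push command carries the old and new SHA for a ref (as git's receive-pack commands do) and that an all-zero SHA means "ref does not exist". The Map stands in for the refs table; in the DO the same check-then-apply would presumably run inside a single SQLite transaction. All names here are hypothetical.

```typescript
interface RefCommand {
  name: string;   // e.g. "heads/main"
  oldSha: string; // SHA the client believes the ref points at
  newSha: string; // SHA to move the ref to (all zeros = delete)
}

const ZERO_SHA = "0".repeat(40);

// Apply every command or none: if any ref has moved since the client
// last saw it, the whole batch is rejected and `refs` is left untouched.
function applyRefUpdates(
  refs: Map<string, string>,
  commands: RefCommand[]
): { ok: boolean; rejected: string[] } {
  const rejected = commands
    .filter((c) => (refs.get(c.name) ?? ZERO_SHA) !== c.oldSha)
    .map((c) => c.name);
  if (rejected.length > 0) return { ok: false, rejected };

  for (const c of commands) {
    if (c.newSha === ZERO_SHA) refs.delete(c.name);
    else refs.set(c.name, c.newSha);
  }
  return { ok: true, rejected: [] };
}
```

Because the DO serializes requests, no other push can interleave between the validation pass and the apply pass, which is what makes this simple two-phase approach safe.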

A push is processed in these steps:

  1. Client sends ref update commands + packfile
  2. Worker routes to RepoStore DO (one per repo, strongly consistent)
  3. Packfile unpack: objects are hashed and zlib-compressed in memory, then flushed to R2 in ~2MB chunks (batched every 200 objects). Each chunk is indexed in the SQLite object_chunks table.
  4. Refs updated in DO SQLite
  5. Commits indexed in DO SQLite (author, message, timestamp)
  6. File sizes cached in DO SQLite for stats endpoint
  7. Report-status returned to client
  8. Worktree materialization (async for >500 objects):
    • Uses an optimistic object cache: objects still in memory from the unpack step are reused, so no R2 re-reads are needed
    • Incremental mode (default): diffs old tree vs new tree, writes only changed/added files
    • Full mode (first push): reads blobs in 500-object batches, writes with 100-concurrent R2 PUTs
    • Yields between batches via setTimeout(0) so the DO can serve concurrent requests
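
The chunking in step 3 can be sketched as below: already-compressed objects are concatenated until a size target is reached, and one index row per object records where it landed. This is a simplified illustration that flushes on a byte target only (it ignores the 200-object batching), and packIntoChunks, newKey, and the interfaces are assumed names rather than gitmode's actual code.

```typescript
interface PackedObject { sha: string; data: Uint8Array } // zlib-compressed bytes
interface IndexRow { sha: string; chunk_key: string; byte_offset: number; byte_length: number }
interface Chunk { key: string; bytes: Uint8Array }

// Group objects into chunks of at most `targetBytes` (a single oversized
// object still gets its own chunk) and build the object_chunks rows.
function packIntoChunks(
  objects: PackedObject[],
  targetBytes: number,
  newKey: () => string // e.g. returns "{owner}/{repo}/chunks/{uuid}"
): { chunks: Chunk[]; index: IndexRow[] } {
  const chunks: Chunk[] = [];
  const index: IndexRow[] = [];
  let pending: PackedObject[] = [];
  let size = 0;

  const flush = (): void => {
    if (pending.length === 0) return;
    const key = newKey();
    const bytes = new Uint8Array(size);
    let offset = 0;
    for (const o of pending) {
      bytes.set(o.data, offset); // concatenate into the chunk buffer
      index.push({ sha: o.sha, chunk_key: key, byte_offset: offset, byte_length: o.data.length });
      offset += o.data.length;
    }
    chunks.push({ key, bytes });
    pending = [];
    size = 0;
  };

  for (const o of objects) {
    // Start a new chunk if adding this object would exceed the target.
    if (size > 0 && size + o.data.length > targetBytes) flush();
    pending.push(o);
    size += o.data.length;
  }
  flush();
  return { chunks, index };
}
```

Each returned chunk would be written to R2 with one PUT, and the index rows inserted into object_chunks, so a later read needs only the chunk key plus a byte range.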

See Performance for benchmark results and optimization details.

On every push, gitmode materializes the commit tree into R2 as plain files at {owner}/{repo}/worktrees/{branch}/{filepath}. This means:

  • The vinext UI reads files directly from R2 — no git decompression needed
  • Files are edge-cached by R2 for fast reads
  • CI/CD pipelines can read source files without git operations
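
As a rough sketch of incremental materialization, the helpers below build the R2 key for a worktree file and pick out the files that need to be written when a branch moves from one tree to another. Trees are assumed to be already flattened to a path-to-blob-SHA map; worktreeKey and changedOrAdded are hypothetical names, and handling of deleted files is left out because the text above does not describe it.

```typescript
// Flattened tree: file path -> blob SHA.
type FlatTree = Map<string, string>;

// R2 key for a materialized file, following the documented layout.
function worktreeKey(owner: string, repo: string, branch: string, filepath: string): string {
  return `${owner}/${repo}/worktrees/${branch}/${filepath}`;
}

// Paths whose blob SHA is new or different in `newTree`; only these
// need an R2 PUT in incremental mode.
function changedOrAdded(oldTree: FlatTree, newTree: FlatTree): string[] {
  const paths: string[] = [];
  newTree.forEach((sha, path) => {
    if (oldTree.get(path) !== sha) paths.push(path);
  });
  return paths.sort();
}
```

Comparing blob SHAs rather than file contents keeps the diff cheap: an unchanged file has the same SHA in both trees and is skipped without reading any blob.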