Architecture
Overview
gitmode is an npm package (import { createHandler } from "gitmode") that runs on Cloudflare’s edge infrastructure. It consists of the following components:
| Component | Role |
|---|---|
| Worker | HTTP router via createHandler() — matches git protocol and REST API routes, forwards to Durable Object |
| RepoStore DO | Per-repo Durable Object with embedded SQLite for refs, HEAD, metadata, commit index |
| PackWorkerDO | Edge compute pool — each worker reads data from R2 and processes locally (packfile assembly, diff, grep, worktree, tree walks) |
| R2 | Object storage for git objects bundled into ~2MB chunks, plus materialized worktree files |
| Zig WASM | Two modules: server (865KB, full git + libgit2) and client (83KB, SHA-1/zlib/delta only) |
| SSH Proxy | Node.js SSH-to-HTTP proxy for git clone ssh://... support (development only) |
Request Flow
- Git client sends HTTP request (e.g., git push) or SSH request via the SSH proxy
- Worker matches the URL pattern, extracts owner/repo, gets the RepoStore DO handle
- Worker forwards the request to the DO with action headers (x-action, x-repo-path)
- RepoStore DO handles the action:
  - For git protocol: delegates to handleInfoRefs, handleUploadPack, or handleReceivePack
  - For REST API: delegates to GitPorcelain methods via handleApiAction
- GitEngine reads/writes objects from R2 and refs from SQLite
- PackWorkerDO pool (compute-intensive operations): RepoStore fans out work to a dynamic pool of compute workers. Each worker reads data from R2 and processes locally: packfile assembly, unified diffs, regex grep, worktree materialization, or tree walks. Pool size scales with workload, capped at a configurable max (default 20). Activated when PACK_WORKER binding is configured and workload exceeds operation-specific thresholds.
- Zig WASM handles binary operations: SHA-1 hashing, zlib, packfile parsing, delta encoding
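The routing step above can be sketched as a pure function. This is a minimal illustration, not gitmode's actual router: the URL shapes follow the standard git smart HTTP endpoints, while the `/api/repos/` prefix, the `x-action` values, and the `owner/repo` format of `x-repo-path` are assumptions made for the example.

```typescript
// Hypothetical action names; the real x-action values are not given in this document.
type Action = "info-refs" | "upload-pack" | "receive-pack" | "api";

interface Route {
  owner: string;
  repo: string;
  action: Action;
}

// Match git smart HTTP paths (/{owner}/{repo}[.git]/...) or an assumed REST prefix.
function matchRoute(method: string, pathname: string): Route | null {
  const git = /^\/([^/]+)\/([^/]+?)(?:\.git)?\/(info\/refs|git-upload-pack|git-receive-pack)$/.exec(pathname);
  if (git) {
    const [, owner, repo, tail] = git;
    if (tail === "info/refs") {
      // Ref advertisement is a GET in the smart HTTP protocol.
      return method === "GET" ? { owner, repo, action: "info-refs" } : null;
    }
    // Both pack endpoints are POST-only.
    if (method !== "POST") return null;
    return { owner, repo, action: tail === "git-upload-pack" ? "upload-pack" : "receive-pack" };
  }
  // Assumed REST layout: /api/repos/{owner}/{repo}/...
  const api = /^\/api\/repos\/([^/]+)\/([^/]+)(?:\/.*)?$/.exec(pathname);
  if (api) return { owner: api[1], repo: api[2], action: "api" };
  return null;
}

// Build the headers attached before forwarding to the Durable Object.
// The "owner/repo" value format is an assumption.
function actionHeaders(route: Route): Record<string, string> {
  return {
    "x-action": route.action,
    "x-repo-path": `${route.owner}/${route.repo}`,
  };
}
```

In a Worker, the resulting `owner/repo` string would typically also serve as the name used to look up the per-repo Durable Object, so that every request for one repository reaches the same instance.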
Data Model
SQLite Tables (per DO)

    -- Branch and tag refs
    CREATE TABLE refs (
      name TEXT PRIMARY KEY,  -- e.g. "heads/main", "tags/v1.0"
      sha TEXT NOT NULL
    );

    -- HEAD pointer (symbolic ref or detached SHA)
    CREATE TABLE head (
      id INTEGER PRIMARY KEY CHECK (id = 1),
      value TEXT NOT NULL  -- e.g. "ref: refs/heads/main" or a raw SHA
    );

    -- Repository metadata
    CREATE TABLE repo_meta (
      id INTEGER PRIMARY KEY CHECK (id = 1),
      owner TEXT NOT NULL,
      name TEXT NOT NULL,
      description TEXT DEFAULT '',
      visibility TEXT DEFAULT 'public',
      default_branch TEXT DEFAULT 'main',
      created_at TEXT NOT NULL,
      updated_at TEXT
    );

    -- Commit index for fast log queries
    CREATE TABLE commits (
      sha TEXT PRIMARY KEY,
      author TEXT NOT NULL,
      message TEXT NOT NULL,
      timestamp INTEGER NOT NULL
    );

    -- Object chunk index (maps SHA → R2 chunk for batch reads)
    CREATE TABLE object_chunks (
      sha TEXT PRIMARY KEY,
      chunk_key TEXT NOT NULL,
      byte_offset INTEGER NOT NULL,
      byte_length INTEGER NOT NULL
    );

    -- File size cache (avoids R2 reads for stats endpoint)
    CREATE TABLE file_sizes (
      sha TEXT PRIMARY KEY,
      size INTEGER NOT NULL
    );

R2 Object Layout
Git objects are bundled into ~2MB chunks stored at {owner}/{repo}/chunks/{uuid}. Each chunk contains multiple zlib-compressed objects concatenated together. The object_chunks SQLite table maps each SHA to its chunk key, byte offset, and byte length for efficient extraction via R2 range reads.
Legacy loose objects (from before chunk storage) are stored at {owner}/{repo}/objects/{sha[0:2]}/{sha[2:]}. The engine falls back to loose object reads when an SHA is not found in the chunk index.
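To make the chunk layout concrete, here is a small sketch of how objects could be concatenated into a chunk, indexed, and later sliced back out. The row shape mirrors the object_chunks table; the function names (`buildChunk`, `extractObject`, `looseObjectKey`) are hypothetical, and an in-memory `Uint8Array` stands in for the R2 object, so no real range read happens here.

```typescript
// Row shape mirroring the object_chunks table.
interface ChunkIndexRow {
  sha: string;
  chunk_key: string;
  byte_offset: number;
  byte_length: number;
}

interface PackedObject {
  sha: string;
  data: Uint8Array; // already zlib-compressed bytes
}

// Concatenate objects into one chunk and record where each one lands.
function buildChunk(chunkKey: string, objects: PackedObject[]): { chunk: Uint8Array; rows: ChunkIndexRow[] } {
  const total = objects.reduce((n, o) => n + o.data.length, 0);
  const chunk = new Uint8Array(total);
  const rows: ChunkIndexRow[] = [];
  let offset = 0;
  for (const { sha, data } of objects) {
    chunk.set(data, offset);
    rows.push({ sha, chunk_key: chunkKey, byte_offset: offset, byte_length: data.length });
    offset += data.length;
  }
  return { chunk, rows };
}

// Local stand-in for a range read: return only the bytes of one object.
function extractObject(chunk: Uint8Array, row: ChunkIndexRow): Uint8Array {
  return chunk.subarray(row.byte_offset, row.byte_offset + row.byte_length);
}

// Fallback key for a legacy loose object: {owner}/{repo}/objects/{sha[0:2]}/{sha[2:]}.
function looseObjectKey(owner: string, repo: string, sha: string): string {
  return `${owner}/${repo}/objects/${sha.slice(0, 2)}/${sha.slice(2)}`;
}
```

Against real storage, the `byte_offset` and `byte_length` of a row would be passed as the range of the read, so only the requested object's bytes leave R2 rather than the full ~2MB chunk.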
Object types:
| Type | Value | Description |
|---|---|---|
| blob | 1 | File content |
| tree | 2 | Directory listing (mode + name + SHA entries) |
| commit | 3 | Commit metadata (tree, parents, author, message) |
| tag | 4 | Annotated tag (object, type, tag name, tagger, message) |
Consistency Model
Each repository maps to exactly one Durable Object instance. All ref reads and writes go through the DO’s SQLite, which provides:
- Strong consistency — reads always see the latest writes
- Serialized updates — concurrent pushes are serialized by the DO
- Atomic multi-ref updates — a single push can update multiple refs atomically
R2 objects are immutable (content-addressed by SHA-1), so they don’t require coordination.
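The atomic multi-ref guarantee can be illustrated with an all-or-nothing update over an in-memory map. This is a sketch under assumptions: the `Map` stands in for the refs table, the old-value check follows the usual git push convention of sending the expected old SHA with each command, and the real implementation presumably relies on a SQLite transaction inside the DO rather than this two-pass loop.

```typescript
// One ref command from a push; null means "ref does not exist" (create or delete).
interface RefUpdate {
  name: string; // e.g. "heads/main"
  oldSha: string | null;
  newSha: string | null;
}

// Apply every update or none. Safe here because the DO serializes pushes,
// so nothing can change the refs between the check pass and the write pass.
function applyRefUpdates(refs: Map<string, string>, updates: RefUpdate[]): boolean {
  // Pass 1: verify each ref still has the value the client expects.
  for (const u of updates) {
    const current = refs.get(u.name) ?? null;
    if (current !== u.oldSha) return false;
  }
  // Pass 2: all checks passed, so apply creates, updates, and deletes.
  for (const u of updates) {
    if (u.newSha === null) {
      refs.delete(u.name);
    } else {
      refs.set(u.name, u.newSha);
    }
  }
  return true;
}
```

If any single command is stale, no ref moves, which is what lets a push that updates a branch and a tag together either land completely or leave the repository untouched.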
Push Flow (Detailed)
- Client sends ref update commands + packfile
- Worker routes to RepoStore DO (one per repo, strongly consistent)
- Packfile unpack: Objects hashed + zlib-compressed in memory, then flushed to R2 in ~2MB chunks (batched every 200 objects). Each chunk indexed in SQLite object_chunks table.
- Refs updated in DO SQLite
- Commits indexed in DO SQLite (author, message, timestamp)
- File sizes cached in DO SQLite for stats endpoint
- Report-status returned to client
- Worktree materialization (async for >500 objects):
  - Uses optimistic object cache: in-memory objects from unpack, zero R2 re-reads
  - Incremental mode (default): diffs old tree vs new tree, writes only changed/added files
  - Full mode (first push): reads blobs in 500-object batches, writes with 100-concurrent R2 PUTs
  - Yields between batches via setTimeout(0) so DO can serve concurrent requests
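The incremental mode and the yielding between batches can be sketched as follows. The trees are assumed to be already flattened into path-to-blob-SHA maps, and `diffTrees` and `processInBatches` are hypothetical names. The list of removed paths is an addition for completeness; the steps above only mention writing changed and added files.

```typescript
// A tree flattened to "path → blob SHA" (assumed representation).
type FlatTree = Map<string, string>;

// Compare two flattened trees and report which paths need work.
function diffTrees(oldTree: FlatTree, newTree: FlatTree): { write: string[]; remove: string[] } {
  const write: string[] = [];
  const remove: string[] = [];
  for (const [path, sha] of newTree) {
    // Added (no old entry) or changed (different blob SHA).
    if (oldTree.get(path) !== sha) write.push(path);
  }
  for (const path of oldTree.keys()) {
    // Present before, gone now (assumption: stale files get cleaned up).
    if (!newTree.has(path)) remove.push(path);
  }
  return { write, remove };
}

// Run a handler over fixed-size batches, yielding to the event loop after each
// one so other queued requests get a chance to run.
async function processInBatches<T>(
  items: T[],
  batchSize: number,
  handle: (batch: T[]) => Promise<void>,
): Promise<number> {
  let batches = 0;
  for (let i = 0; i < items.length; i += batchSize) {
    await handle(items.slice(i, i + batchSize));
    batches++;
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
  }
  return batches;
}
```

For a push that touches a handful of files in a large repository, the diff keeps the number of R2 writes proportional to the change rather than to the size of the tree.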
See Performance for benchmark results and optimization details.
Worktree
On every push, gitmode materializes the commit tree into R2 as plain files at {owner}/{repo}/worktrees/{branch}/{filepath}. This means:
- The vinext UI reads files directly from R2 — no git decompression needed
- Files are edge-cached by R2 for fast reads
- CI/CD pipelines can read source files without git operations
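A consumer such as a CI job only needs to build the object key to read a file. The helper below is a hypothetical sketch of the key layout described above; how branch names containing slashes are encoded is not specified here, so the branch is inserted verbatim.

```typescript
// Build the R2 key for a materialized worktree file:
// {owner}/{repo}/worktrees/{branch}/{filepath}
function worktreeKey(owner: string, repo: string, branch: string, filepath: string): string {
  // Drop leading slashes so "/src/index.ts" and "src/index.ts" map to the same key.
  const clean = filepath.replace(/^\/+/, "");
  return `${owner}/${repo}/worktrees/${branch}/${clean}`;
}
```

For example, `worktreeKey("alice", "demo", "main", "src/index.ts")` yields `alice/demo/worktrees/main/src/index.ts`.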