Architecture

Overview

gitmode is an npm package (`import { createHandler } from "gitmode"`) that runs on Cloudflare’s edge infrastructure. It is built from the following components:

Component	Role
Worker	HTTP router via `createHandler()` — matches git protocol and REST API routes, forwards to Durable Object
RepoStore DO	Per-repo Durable Object with embedded SQLite for refs, HEAD, metadata, commit index
PackWorkerDO	Edge compute pool — each worker reads data from R2 and processes locally (packfile assembly, diff, grep, worktree, tree walks)
R2	Object storage for git objects bundled into ~2MB chunks, plus materialized worktree files
Zig WASM	Two modules: server (865KB, full git + libgit2) and client (83KB, SHA-1/zlib/delta only)
SSH Proxy	Node.js SSH-to-HTTP proxy for `git clone ssh://...` support (development only)

Request Flow

Git client sends HTTP request (e.g., git push) or SSH request via the SSH proxy
Worker matches the URL pattern, extracts owner/repo, gets the RepoStore DO handle
Worker forwards the request to the DO with action headers (x-action, x-repo-path)
RepoStore DO handles the action:
- For git protocol: delegates to handleInfoRefs, handleUploadPack, or handleReceivePack
- For REST API: delegates to GitPorcelain methods via handleApiAction
GitEngine reads/writes objects from R2 and refs from SQLite
PackWorkerDO pool (compute-intensive operations only): RepoStore fans out work to a dynamic pool of compute workers. Each worker reads its data from R2 and processes it locally: packfile assembly, unified diffs, regex grep, worktree materialization, or tree walks. The pool size scales with the workload and is capped at a configurable maximum (default 20). The pool is used only when the PACK_WORKER binding is configured and the workload exceeds an operation-specific threshold.
Zig WASM handles binary operations: SHA-1 hashing, zlib, packfile parsing, delta encoding
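
The routing step in the flow above can be sketched as a small pure function. This is an illustration, not gitmode's router: the action strings (`info-refs`, `upload-pack`, `receive-pack`), the `x-repo-path` value format, and the function name `routeGitRequest` are assumptions. Only the header names and the standard smart-HTTP git endpoints come from the text and the git protocol.

```typescript
// Hypothetical action names; the strings gitmode actually sends are not documented here.
type GitAction = "info-refs" | "upload-pack" | "receive-pack";

interface RoutedRequest {
  owner: string;
  repo: string;
  // Headers forwarded to the RepoStore DO (header names taken from the flow above).
  headers: Record<string, string>;
}

// Smart-HTTP git endpoints: /{owner}/{repo}[.git]/info/refs, /git-upload-pack, /git-receive-pack
const GIT_ROUTE =
  /^\/([^/]+)\/([^/]+?)(?:\.git)?\/(info\/refs|git-upload-pack|git-receive-pack)$/;

const ACTIONS: Record<string, GitAction> = {
  "info/refs": "info-refs",
  "git-upload-pack": "upload-pack",
  "git-receive-pack": "receive-pack",
};

function routeGitRequest(pathname: string): RoutedRequest | null {
  const match = GIT_ROUTE.exec(pathname);
  if (match === null) return null; // not a git protocol route

  const owner = match[1];
  const repo = match[2];
  const endpoint = match[3];
  return {
    owner,
    repo,
    headers: {
      "x-action": ACTIONS[endpoint],
      "x-repo-path": `${owner}/${repo}`, // assumed format
    },
  };
}
```

In a Worker, the returned `owner` and `repo` would typically be used to derive the Durable Object name, so that every request for the same repository reaches the same RepoStore instance.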

Data Model

SQLite Tables (per DO)

-- Branch and tag refs
CREATE TABLE refs (
  name TEXT PRIMARY KEY,  -- e.g. "heads/main", "tags/v1.0"
  sha TEXT NOT NULL
);

-- HEAD pointer (symbolic ref or detached SHA)
CREATE TABLE head (
  id INTEGER PRIMARY KEY CHECK (id = 1),
  value TEXT NOT NULL     -- e.g. "ref: refs/heads/main" or a raw SHA
);

-- Repository metadata
CREATE TABLE repo_meta (
  id INTEGER PRIMARY KEY CHECK (id = 1),
  owner TEXT NOT NULL,
  name TEXT NOT NULL,
  description TEXT DEFAULT '',
  visibility TEXT DEFAULT 'public',
  default_branch TEXT DEFAULT 'main',
  created_at TEXT NOT NULL,
  updated_at TEXT
);

-- Commit index for fast log queries
CREATE TABLE commits (
  sha TEXT PRIMARY KEY,
  author TEXT NOT NULL,
  message TEXT NOT NULL,
  timestamp INTEGER NOT NULL
);

-- Object chunk index (maps SHA → R2 chunk for batch reads)
CREATE TABLE object_chunks (
  sha TEXT PRIMARY KEY,
  chunk_key TEXT NOT NULL,
  byte_offset INTEGER NOT NULL,
  byte_length INTEGER NOT NULL
);

-- File size cache (avoids R2 reads for stats endpoint)
CREATE TABLE file_sizes (
  sha TEXT PRIMARY KEY,
  size INTEGER NOT NULL
);
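
To show how the `refs` and `head` tables relate, here is a minimal sketch of resolving HEAD to a commit SHA. It uses an in-memory `Map` as a stand-in for the `refs` table and assumes, based on the example comments in the schema, that ref names are stored without the `refs/` prefix while the HEAD value keeps it. The function name `resolveHead` is hypothetical.

```typescript
// In-memory stand-in for the refs table: name (e.g. "heads/main") -> sha.
type Refs = Map<string, string>;

const SYMBOLIC_PREFIX = "ref: refs/";

// Returns the commit SHA that HEAD points at, or null if it cannot be resolved.
function resolveHead(headValue: string, refs: Refs): string | null {
  if (headValue.startsWith(SYMBOLIC_PREFIX)) {
    // Symbolic ref: "ref: refs/heads/main" -> look up "heads/main".
    const name = headValue.slice(SYMBOLIC_PREFIX.length);
    const sha = refs.get(name);
    return sha === undefined ? null : sha; // null: branch has no commits yet
  }
  // Detached HEAD: the value itself is a 40-character hex SHA-1.
  return /^[0-9a-f]{40}$/.test(headValue) ? headValue : null;
}
```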

R2 Object Layout

Git objects are bundled into ~2MB chunks stored at {owner}/{repo}/chunks/{uuid}. Each chunk contains multiple zlib-compressed objects concatenated together. The object_chunks SQLite table maps each SHA to its chunk key, byte offset, and byte length for efficient extraction via R2 range reads.

Legacy loose objects (from before chunk storage) are stored at {owner}/{repo}/objects/{sha[0:2]}/{sha[2:]}. The engine falls back to loose object reads when an SHA is not found in the chunk index.
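
The lookup order described above (chunk index first, loose object as fallback) can be sketched as follows. The row shape mirrors the `object_chunks` table; the `ObjectLocation` type and the `locateObject` function are hypothetical names, and the in-memory `Map` stands in for the SQLite query. The `{ offset, length }` pair is shaped so it could be passed as the `range` option of an R2 `get()` call.

```typescript
// One row of the object_chunks table.
interface ChunkIndexRow {
  sha: string;
  chunk_key: string;
  byte_offset: number;
  byte_length: number;
}

// Where to read an object from: a byte range inside a chunk, or a whole loose object.
type ObjectLocation =
  | { kind: "chunk"; key: string; range: { offset: number; length: number } }
  | { kind: "loose"; key: string };

function locateObject(
  owner: string,
  repo: string,
  sha: string,
  index: Map<string, ChunkIndexRow>, // stand-in for a lookup in object_chunks
): ObjectLocation {
  const row = index.get(sha);
  if (row !== undefined) {
    return {
      kind: "chunk",
      key: row.chunk_key,
      range: { offset: row.byte_offset, length: row.byte_length },
    };
  }
  // Fallback: legacy loose object at {owner}/{repo}/objects/{sha[0:2]}/{sha[2:]}
  return {
    kind: "loose",
    key: `${owner}/${repo}/objects/${sha.slice(0, 2)}/${sha.slice(2)}`,
  };
}
```

Reading only the indexed byte range avoids downloading a full ~2MB chunk to extract a single object.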

Object types:

Type	Value	Description
blob	1	File content
tree	2	Directory listing (mode + name + SHA entries)
commit	3	Commit metadata (tree, parents, author, message)
tag	4	Annotated tag (object, type, tag name, tagger, message)

Consistency Model

Each repository maps to exactly one Durable Object instance. All ref reads and writes go through the DO’s SQLite, which provides:

Strong consistency — reads always see the latest writes
Serialized updates — concurrent pushes are serialized by the DO
Atomic multi-ref updates — a single push can update multiple refs atomically

R2 objects are immutable (content-addressed by SHA-1), so they don’t require coordination.
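
As an illustration of what an atomic multi-ref update means, the sketch below applies a push's ref commands all-or-nothing using compare-and-swap on the old SHA. It is not gitmode's implementation: the `Map` stands in for the `refs` table, and `applyRefUpdates` is a hypothetical name. The all-zero SHA for "ref does not exist" follows the git wire protocol convention.

```typescript
// In the git protocol, forty zeros mean "no such ref" (create or delete).
const ZERO_SHA = "0".repeat(40);

interface RefUpdate {
  name: string;   // e.g. "heads/main"
  oldSha: string; // value the client believes the ref currently has
  newSha: string; // value to set, or ZERO_SHA to delete
}

// Applies every update or none. Returns false if any ref moved since the client read it.
function applyRefUpdates(refs: Map<string, string>, updates: RefUpdate[]): boolean {
  // Phase 1: verify all expected old values before changing anything.
  for (const update of updates) {
    const stored = refs.get(update.name);
    const current = stored === undefined ? ZERO_SHA : stored;
    if (current !== update.oldSha) return false;
  }
  // Phase 2: all checks passed, so apply every update.
  for (const update of updates) {
    if (update.newSha === ZERO_SHA) {
      refs.delete(update.name);
    } else {
      refs.set(update.name, update.newSha);
    }
  }
  return true;
}
```

Because a Durable Object processes these steps without interleaving another push in between, the check-then-apply sequence needs no additional locking in this sketch.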

Push Flow (Detailed)

Client sends ref update commands + packfile
Worker routes to RepoStore DO (one per repo, strongly consistent)
Packfile unpack: Objects hashed + zlib-compressed in memory, then flushed to R2 in ~2MB chunks (batched every 200 objects). Each chunk indexed in SQLite object_chunks table.
Refs updated in DO SQLite
Commits indexed in DO SQLite (author, message, timestamp)
File sizes cached in DO SQLite for stats endpoint
Report-status returned to client
Worktree materialization (async for >500 objects):
- Uses optimistic object cache — in-memory objects from unpack, zero R2 re-reads
- Incremental mode (default): diffs old tree vs new tree, writes only changed/added files
- Full mode (first push): reads blobs in 500-object batches, writes with 100-concurrent R2 PUTs
- Yields between batches via setTimeout(0) so DO can serve concurrent requests
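
The incremental mode in the last step can be illustrated with a small tree diff. This sketch assumes both trees have already been flattened to a path-to-blob-SHA map; `FlatTree` and `planIncrementalWrite` are hypothetical names. The text above only mentions writing changed and added files, so the `remove` list for deleted paths is an assumption about how stale files might be handled.

```typescript
// A commit tree flattened to: file path -> blob SHA.
type FlatTree = Map<string, string>;

interface WritePlan {
  write: string[];  // paths that are new or whose blob SHA changed
  remove: string[]; // paths present before but absent now (assumed handling)
}

function planIncrementalWrite(oldTree: FlatTree, newTree: FlatTree): WritePlan {
  const write: string[] = [];
  const remove: string[] = [];

  // A path needs writing if it is new or its content hash differs.
  newTree.forEach((sha, path) => {
    if (oldTree.get(path) !== sha) write.push(path);
  });

  // A path needs removing if it no longer exists in the new tree.
  oldTree.forEach((_sha, path) => {
    if (!newTree.has(path)) remove.push(path);
  });

  return { write, remove };
}
```

Because blobs are content-addressed, comparing SHAs is enough to detect a change; unchanged files are skipped without reading their contents.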

See Performance for benchmark results and optimization details.

Worktree

On every push, gitmode materializes the commit tree into R2 as plain files at {owner}/{repo}/worktrees/{branch}/{filepath}. This means:

The vinext UI reads files directly from R2 — no git decompression needed
Files are edge-cached by R2 for fast reads
CI/CD pipelines can read source files without git operations
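
For consumers such as a UI or CI job, the key layout above translates into a simple key builder. This is a sketch under stated assumptions: `worktreeKey` is a hypothetical helper, not part of the gitmode API, and stripping leading slashes from the file path is an assumed normalization.

```typescript
// Builds the R2 key for a materialized file:
// {owner}/{repo}/worktrees/{branch}/{filepath}
function worktreeKey(owner: string, repo: string, branch: string, filepath: string): string {
  const relative = filepath.replace(/^\/+/, ""); // assumed: keys never contain a double slash
  return `${owner}/${repo}/worktrees/${branch}/${relative}`;
}
```

A reader with access to the bucket could then fetch `worktreeKey("alice", "demo", "main", "src/index.ts")` directly, with no git object parsing involved.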