Scaling

gitmode’s foundation — R2 for object storage, Durable Objects with SQLite for per-repo coordination — provides several inherent advantages:

| Property | How |
|---|---|
| Zero-ops global deployment | Cloudflare Workers run in 300+ edge locations. No servers to manage. |
| Per-repo isolation | Each repo gets its own DO with SQLite. A hot repo can’t affect others. No shared database contention. |
| Zero egress costs | R2 charges $0 for egress (S3 charges ~$0.09/GB). For git hosting, egress is the #1 cost — R2 eliminates it. |
| No binary dependencies | Zig WASM engine is fully portable. No libgit2 system install, no JVM. Deploy is wrangler deploy. |
| Content-addressed storage | Git objects in R2 are immutable (keyed by SHA-1), so they don’t require coordination. Only refs need consistency. |

A single Durable Object has hard limits imposed by the Cloudflare platform:

| Resource | Limit |
|---|---|
| Memory | ~128MB per DO instance |
| CPU | 30 seconds per request |
| SQLite | 10GB per DO |
| R2 object size | 5TB |

For most repositories (up to ~1-5GB of source code), a single RepoStore DO handles all operations comfortably. The current benchmarks demonstrate this:

| Workload | 100 files (796K) | 1K files (7.8MB) | 5K files (39MB) |
|---|---|---|---|
| Push | 182ms | 585ms | 3.4s |
| Clone | 107ms | 686ms | 3.1s |

The memory ceiling becomes relevant when building packfiles for very large repositories (~10K+ objects of incompressible data), where all compressed objects must be assembled in memory. As a rough illustration, 10,000 objects averaging ~12KB compressed already total ~120MB once assembled, right at the 128MB ceiling.

The per-DO memory limit is not an architectural limit — it’s a per-instance limit that can be overcome by distributing work across multiple DOs. This is a proven pattern on Cloudflare Workers.

Instead of one DO doing all the work, the RepoStore fans out to a pool of PackWorkerDO instances, each handling a bounded subset of objects:

```
Client ← Worker ← RepoStore (orchestrator)
                  ├── PackWorkerDO slot-0 (objects 0-499)
                  ├── PackWorkerDO slot-1 (objects 500-999)
                  ├── PackWorkerDO slot-2 (objects 1000-1499)
                  └── ...
```

Fan-out activates automatically when:

  • The PACK_WORKER binding is configured in wrangler.jsonc
  • The object count exceeds 200 (below this, local assembly is faster)

When PACK_WORKER is not configured, everything falls back to local assembly — fully backward compatible.
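
One way to picture this backward compatibility is in the Worker's environment type: the fan-out binding is simply optional. PACK_WORKER is the binding name used on this page; the other names below are illustrative placeholders, not the project's actual bindings.

```ts
// Sketch of the Worker env. Only PACK_WORKER is documented here;
// REPO_STORE and BUCKET are placeholder names for illustration.
interface Env {
  REPO_STORE: DurableObjectNamespace;    // per-repo RepoStore coordinator
  PACK_WORKER?: DurableObjectNamespace;  // optional: when absent, local assembly is used
  BUCKET: R2Bucket;                      // R2 bucket holding git objects
}
```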

The most memory-intensive operation is building a packfile for clone/fetch. With fan-out:

  1. RepoStore performs the BFS tree walk to collect object SHAs (lightweight — just SHAs, no decompression)
  2. lookupChunkMeta() queries SQLite to map each SHA to its R2 chunk key, byte offset, and byte length
  3. SHAs are split into batches of 500 and assigned round-robin to the PackWorkerDO pool (max 20 slots)
  4. Each PackWorkerDO receives object descriptors (SHA + R2 location), reads from R2 (up to 10 concurrent fetches), decompresses, re-compresses for packfile format, and returns a binary segment
  5. RepoStore gathers segments via Promise.allSettled, assembles the final packfile (PACK header + segments + SHA-1 trailer)
  6. Response wraps the packfile in sideband-64k framing
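
Taken together, the orchestration looks roughly like the sketch below. It is a simplified illustration of steps 3–5, not the actual buildPackfileFanout implementation; the ChunkMeta shape, the worker route URL, and the helper code are assumptions.

```ts
// Simplified sketch of steps 3–5. The real logic lives in buildPackfileFanout in
// src/packfile-builder.ts; the ChunkMeta field names, the route, and the helpers
// below are illustrative assumptions, not the project's actual API.
interface ChunkMeta { sha: string; chunkKey: string; offset: number; length: number }

const BATCH_SIZE = 500;
const MAX_SLOTS = 20;

async function fanoutSketch(
  metas: ChunkMeta[],                    // output of lookupChunkMeta() (step 2)
  packWorker: DurableObjectNamespace
): Promise<Uint8Array> {
  // Step 3: batches of 500, assigned round-robin across the slot pool.
  const batches: ChunkMeta[][] = [];
  for (let i = 0; i < metas.length; i += BATCH_SIZE) {
    batches.push(metas.slice(i, i + BATCH_SIZE));
  }

  // Step 4: each PackWorkerDO reads its objects from R2 and returns a binary segment.
  const results = await Promise.allSettled(
    batches.map(async (batch, i) => {
      const stub = packWorker.get(packWorker.idFromName(`pack-slot-${i % MAX_SLOTS}`));
      const resp = await stub.fetch("https://pack-worker/build", {
        method: "POST",
        body: JSON.stringify({ objects: batch }),
      });
      return new Uint8Array(await resp.arrayBuffer());
    })
  );

  // Step 5: failed batches are logged, not fatal; surviving segments are assembled.
  const segments: Uint8Array[] = [];
  for (const r of results) {
    if (r.status === "fulfilled") segments.push(r.value);
    else console.warn("pack batch failed:", r.reason);
  }

  // PACK header (magic, version 2, object count) + segments + SHA-1 trailer.
  const header = new Uint8Array(12);
  header.set([0x50, 0x41, 0x43, 0x4b]);                    // "PACK"
  new DataView(header.buffer).setUint32(4, 2);              // version
  new DataView(header.buffer).setUint32(8, metas.length);   // object count
  const body = concat([header, ...segments]);
  const trailer = new Uint8Array(await crypto.subtle.digest("SHA-1", body));
  return concat([body, trailer]);
}

function concat(parts: Uint8Array[]): Uint8Array {
  const out = new Uint8Array(parts.reduce((n, p) => n + p.length, 0));
  let offset = 0;
  for (const p of parts) { out.set(p, offset); offset += p.length; }
  return out;
}
```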

Each PackWorkerDO handles ~500 objects, staying well under the 128MB limit. Promise.allSettled provides fault tolerance — failed batches are logged but don’t crash the entire clone.
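
The receiving side can be pictured as a small DO whose fetch handler processes one batch. This is an illustrative sketch only: the binding name BUCKET and the request shape are assumptions, and the decompress/re-compress work that the real PackWorkerDO delegates to the WASM engine is elided.

```ts
// Illustrative worker-side sketch (not the actual class). ChunkMeta matches the
// shape used in the orchestration sketch above.
interface ChunkMeta { sha: string; chunkKey: string; offset: number; length: number }

export class PackWorkerDO {
  constructor(private state: DurableObjectState, private env: { BUCKET: R2Bucket }) {}

  async fetch(request: Request): Promise<Response> {
    const { objects } = (await request.json()) as { objects: ChunkMeta[] };
    const segments: Uint8Array[] = [];

    // Bounded concurrency: read up to 10 objects from R2 at a time.
    for (let i = 0; i < objects.length; i += 10) {
      const group = objects.slice(i, i + 10);
      const parts = await Promise.all(
        group.map(async (o) => {
          // Ranged read of just this object's bytes from its R2 chunk.
          const stored = await this.env.BUCKET.get(o.chunkKey, {
            range: { offset: o.offset, length: o.length },
          });
          if (!stored) throw new Error(`missing chunk ${o.chunkKey}`);
          // In the real flow these bytes are decompressed and re-encoded as a pack
          // entry by the WASM engine; this sketch forwards the stored bytes as-is.
          return new Uint8Array(await stored.arrayBuffer());
        })
      );
      segments.push(...parts);
    }
    return new Response(new Blob(segments), {
      headers: { "content-type": "application/octet-stream" },
    });
  }
}
```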

Worker DOs use deterministic naming (idFromName) for instance reuse:

```ts
// Fixed pool — DOs hibernate when idle (zero cost)
const id = packWorker.idFromName(`pack-slot-${slotIndex}`);
const worker = packWorker.get(id);
const resp = await worker.fetch(buildRequest);
```

| Property | Value |
|---|---|
| Pool size | Dynamic — scales with batch count, capped at configurable max (default 20, hard max 100) |
| Configuration | POOL_MAX_SLOTS env var, or PoolConfig.maxSlots in code |
| Batch size | 100–500 objects per worker depending on operation |
| Threshold | 200 (packfile/worktree), 10 (diff), 50 (grep) |
| Naming | pack-slot-{0..N} |
| Idle cost | $0 (hibernated DOs are free) |
| Reuse | Same slots across requests (warm WASM, warm R2 connections) |
| Fallback | When PACK_WORKER binding is absent, local processing |

| Operation | Fan-out? | Why |
|---|---|---|
| Clone/fetch (packfile build) | Yes | Memory-intensive: read + compress + assemble all objects |
| Diff with content | Yes | Read blob pairs from R2, compute unified diffs at the edge |
| Grep | Yes | Read blobs from R2, regex search with context at the edge |
| Worktree materialization | Yes | Read blobs and write raw content to worktree paths |
| Tree walks | Yes | Parse tree objects and return child SHAs for BFS |
| Push (packfile unpack) | No | Already streams with 200-object flush batches |
| Log / stats | No | SQLite queries, no large buffers |

The buildPackfile function in src/packfile-builder.ts handles the decision automatically:

```ts
export async function buildPackfile(
  engine: GitEngine,
  objectShas: string[],
  packWorker?: DurableObjectNamespace,
  poolConfig?: PoolConfig
): Promise<Uint8Array> {
  const maxSlots = poolConfig?.maxSlots;
  // Fan out only when binding exists AND enough objects to justify it
  if (packWorker && objectShas.length > FANOUT_THRESHOLD) {
    return buildPackfileFanout(engine, objectShas, packWorker, maxSlots);
  }
  // Otherwise: local assembly (fast for small repos)
  // ...
}
```
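
A call site might look like the following. Here this.engine and this.env are assumed names for the RepoStore's WASM engine handle and environment bindings; POOL_MAX_SLOTS comes from the configuration table above.

```ts
// Hypothetical call from the RepoStore orchestrator.
const pack = await buildPackfile(this.engine, objectShas, this.env.PACK_WORKER, {
  maxSlots: Number(this.env.POOL_MAX_SLOTS ?? 20),
});
```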

Fan-out coordination uses Promise.allSettled for fault tolerance — if a worker DO fails, the batch is logged but doesn’t crash the clone. Failed objects are simply missing from the response (git client will retry).

| | gitmode (R2 + DO) | Gitea (disk + PostgreSQL) | GitHub (custom) |
|---|---|---|---|
| Deployment | wrangler deploy | Docker + PostgreSQL | N/A (SaaS) |
| Global latency | Edge-native (300+ POPs) | Single region | Edge CDN |
| Max repo size | Unbounded with fan-out | Disk-limited | 5GB soft limit |
| Concurrent reads | Fan out across DO pool | Single process | Replicated |
| Concurrent writes | Serialized per repo (DO) | Parallel (DB row locks) | Parallel (Spokes) |
| Storage cost | $0.015/GB, $0 egress | Disk cost | Included |
| Memory per operation | 128MB per DO (fan out for more) | Process memory (GBs) | Process memory |
| Garbage collection | Not yet (planned) | git gc built-in | Automatic |
| Operational burden | Zero | Medium | N/A |

gitmode is strongest when:

  • You need global edge latency without managing infrastructure
  • Egress costs matter (hosting many public repos, CI cloning frequently)
  • Per-repo isolation is important (multi-tenant platforms)
  • You want a programmatic REST API alongside git protocol

Traditional git servers are stronger when:

  • Single-region deployment is acceptable
  • You need concurrent writes to the same repo at high throughput
  • Repos contain very large binary assets (>5GB)
  • You need built-in garbage collection and repacking

| Scale | R2 Storage | R2 Operations | DO Compute | Total/month |
|---|---|---|---|---|
| 100 repos, 100MB avg | $0.15 | ~$0.05 | ~$0.10 | ~$0.30 |
| 1K repos, 500MB avg | $7.50 | ~$2.00 | ~$5.00 | ~$15 |
| 10K repos, 1GB avg | $150 | ~$50 | ~$100 | ~$300 |

These estimates assume moderate activity (10 pushes + 50 clones per repo per month). The zero-egress property means costs scale with storage and writes, not reads — the opposite of S3-backed solutions.