Scaling

gitmode’s foundation — R2 for object storage, Durable Objects with SQLite for per-repo coordination — provides several inherent advantages:

| Property | How |
|---|---|
| Zero-ops global deployment | Cloudflare Workers run in 300+ edge locations. No servers to manage. |
| Per-repo isolation | Each repo gets its own DO with SQLite. A hot repo can’t affect others. No shared database contention. |
| Zero egress costs | R2 charges $0 for egress (S3 charges ~$0.09/GB). For git hosting, egress is the #1 cost — R2 eliminates it. |
| No binary dependencies | Zig WASM engine is fully portable. No libgit2 system install, no JVM. Deploy is wrangler deploy. |
| Content-addressed storage | Git objects in R2 are immutable (keyed by SHA-1), so they don’t require coordination. Only refs need consistency. |

A single Durable Object has hard limits imposed by the Cloudflare platform:

| Resource | Limit |
|---|---|
| Memory | ~128MB per DO instance |
| CPU | 30 seconds per request |
| SQLite | 10GB per DO |
| R2 object size | 5TB |

For most repositories (up to ~1-5GB of source code), a single RepoStore DO handles all operations comfortably. The current benchmarks demonstrate this:

| Workload | 100 files (796K) | 1K files (7.8MB) | 5K files (39MB) |
|---|---|---|---|
| Push | 182ms | 585ms | 3.4s |
| Clone | 107ms | 686ms | 3.1s |

The memory ceiling becomes relevant when building packfiles for very large repositories (~10K+ objects of incompressible data), where all compressed objects must be assembled in memory. As a rough illustration, 10,000 objects averaging ~12KB compressed already total ~120MB once assembled, right at the 128MB ceiling.

The per-DO memory limit is not an architectural limit — it’s a per-instance limit that can be overcome by distributing work across multiple DOs. This is a proven pattern on Cloudflare Workers.

Instead of one DO doing all the work, the RepoStore fans out to a pool of PackWorkerDO instances, each handling a bounded subset of objects:

```
Client ← Worker ← RepoStore (orchestrator)
                  ├── PackWorkerDO slot-0 (objects 0-499)
                  ├── PackWorkerDO slot-1 (objects 500-999)
                  ├── PackWorkerDO slot-2 (objects 1000-1499)
                  └── ...
```

Fan-out activates automatically when:

  • The PACK_WORKER binding is configured in wrangler.jsonc
  • The object count exceeds 200 (below this, local assembly is faster)

When PACK_WORKER is not configured, everything falls back to local assembly — fully backward compatible.
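
One way to picture this backward compatibility is in the Worker's environment type: the fan-out binding is simply optional. PACK_WORKER is the binding name used on this page; the other names below are illustrative placeholders, not the project's actual bindings.

```ts
// Sketch of the Worker env. Only PACK_WORKER is documented here;
// REPO_STORE and BUCKET are placeholder names for illustration.
interface Env {
  REPO_STORE: DurableObjectNamespace;    // per-repo RepoStore coordinator
  PACK_WORKER?: DurableObjectNamespace;  // optional: when absent, local assembly is used
  BUCKET: R2Bucket;                      // R2 bucket holding git objects
}
```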

The most memory-intensive operation is building a packfile for clone/fetch. With fan-out:

  1. RepoStore performs the BFS tree walk to collect object SHAs (lightweight — just SHAs, no decompression)
  2. lookupChunkMeta() queries SQLite to map each SHA to its R2 chunk key, byte offset, and byte length
  3. SHAs are split into batches of 500 and assigned round-robin to the PackWorkerDO pool (max 20 slots)
  4. Each PackWorkerDO receives object descriptors (SHA + R2 location), reads from R2 (up to 10 concurrent fetches), decompresses, re-compresses for packfile format, and returns a binary segment
  5. RepoStore gathers segments via Promise.allSettled, assembles the final packfile (PACK header + segments + SHA-1 trailer)
  6. Response wraps the packfile in sideband-64k framing
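
Taken together, the orchestration looks roughly like the sketch below. It is a simplified illustration of steps 3–5, not the actual buildPackfileFanout implementation; the ChunkMeta shape, the worker route URL, and the helper code are assumptions.

```ts
// Simplified sketch of steps 3–5. The real logic lives in buildPackfileFanout in
// src/packfile-builder.ts; the ChunkMeta field names, the route, and the helpers
// below are illustrative assumptions, not the project's actual API.
interface ChunkMeta { sha: string; chunkKey: string; offset: number; length: number }

const BATCH_SIZE = 500;
const MAX_SLOTS = 20;

async function fanoutSketch(
  metas: ChunkMeta[],                    // output of lookupChunkMeta() (step 2)
  packWorker: DurableObjectNamespace
): Promise<Uint8Array> {
  // Step 3: batches of 500, assigned round-robin across the slot pool.
  const batches: ChunkMeta[][] = [];
  for (let i = 0; i < metas.length; i += BATCH_SIZE) {
    batches.push(metas.slice(i, i + BATCH_SIZE));
  }

  // Step 4: each PackWorkerDO reads its objects from R2 and returns a binary segment.
  const results = await Promise.allSettled(
    batches.map(async (batch, i) => {
      const stub = packWorker.get(packWorker.idFromName(`pack-slot-${i % MAX_SLOTS}`));
      const resp = await stub.fetch("https://pack-worker/build", {
        method: "POST",
        body: JSON.stringify({ objects: batch }),
      });
      return new Uint8Array(await resp.arrayBuffer());
    })
  );

  // Step 5: failed batches are logged, not fatal; surviving segments are assembled.
  const segments: Uint8Array[] = [];
  for (const r of results) {
    if (r.status === "fulfilled") segments.push(r.value);
    else console.warn("pack batch failed:", r.reason);
  }

  // PACK header (magic, version 2, object count) + segments + SHA-1 trailer.
  const header = new Uint8Array(12);
  header.set([0x50, 0x41, 0x43, 0x4b]);                    // "PACK"
  new DataView(header.buffer).setUint32(4, 2);              // version
  new DataView(header.buffer).setUint32(8, metas.length);   // object count
  const body = concat([header, ...segments]);
  const trailer = new Uint8Array(await crypto.subtle.digest("SHA-1", body));
  return concat([body, trailer]);
}

function concat(parts: Uint8Array[]): Uint8Array {
  const out = new Uint8Array(parts.reduce((n, p) => n + p.length, 0));
  let offset = 0;
  for (const p of parts) { out.set(p, offset); offset += p.length; }
  return out;
}
```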

Each PackWorkerDO handles ~500 objects, staying well under the 128MB limit. Promise.allSettled provides fault tolerance — failed batches are logged but don’t crash the entire clone.
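
The receiving side can be pictured as a small DO whose fetch handler processes one batch. This is an illustrative sketch only: the binding name BUCKET and the request shape are assumptions, and the decompress/re-compress work that the real PackWorkerDO delegates to the WASM engine is elided.

```ts
// Illustrative worker-side sketch (not the actual class). ChunkMeta matches the
// shape used in the orchestration sketch above.
interface ChunkMeta { sha: string; chunkKey: string; offset: number; length: number }

export class PackWorkerDO {
  constructor(private state: DurableObjectState, private env: { BUCKET: R2Bucket }) {}

  async fetch(request: Request): Promise<Response> {
    const { objects } = (await request.json()) as { objects: ChunkMeta[] };
    const segments: Uint8Array[] = [];

    // Bounded concurrency: read up to 10 objects from R2 at a time.
    for (let i = 0; i < objects.length; i += 10) {
      const group = objects.slice(i, i + 10);
      const parts = await Promise.all(
        group.map(async (o) => {
          // Ranged read of just this object's bytes from its R2 chunk.
          const stored = await this.env.BUCKET.get(o.chunkKey, {
            range: { offset: o.offset, length: o.length },
          });
          if (!stored) throw new Error(`missing chunk ${o.chunkKey}`);
          // In the real flow these bytes are decompressed and re-encoded as a pack
          // entry by the WASM engine; this sketch forwards the stored bytes as-is.
          return new Uint8Array(await stored.arrayBuffer());
        })
      );
      segments.push(...parts);
    }
    return new Response(new Blob(segments), {
      headers: { "content-type": "application/octet-stream" },
    });
  }
}
```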

Worker DOs use deterministic naming (idFromName) for instance reuse:

```ts
// Fixed pool — DOs hibernate when idle (zero cost)
const id = packWorker.idFromName(`pack-slot-${slotIndex}`);
const worker = packWorker.get(id);
const resp = await worker.fetch(buildRequest);
```

| Property | Value |
|---|---|
| Pool size | Dynamic — scales with batch count, capped at configurable max (default 20, hard max 100) |
| Configuration | POOL_MAX_SLOTS env var, or PoolConfig.maxSlots in code |
| Batch size | 100–500 objects per worker depending on operation |
| Threshold | 200 (packfile/worktree), 10 (diff), 50 (grep) |
| Naming | pack-slot-{0..N} |
| Idle cost | $0 (hibernated DOs are free) |
| Reuse | Same slots across requests (warm WASM, warm R2 connections) |
| Fallback | When PACK_WORKER binding is absent, local processing |

| Operation | Fan-out? | Why |
|---|---|---|
| Clone/fetch (packfile build) | Yes | Memory-intensive: read + compress + assemble all objects |
| Diff with content | Yes | Read blob pairs from R2, compute unified diffs at the edge |
| Grep | Yes | Read blobs from R2, regex search with context at the edge |
| Worktree materialization | Yes | Read blobs and write raw content to worktree paths |
| Tree walks | Yes | Parse tree objects and return child SHAs for BFS |
| Push (packfile unpack) | No | Already streams with 200-object flush batches |
| Log / stats | No | SQLite queries, no large buffers |

The buildPackfile function in src/packfile-builder.ts handles the decision automatically:

```ts
export async function buildPackfile(
  engine: GitEngine,
  objectShas: string[],
  packWorker?: DurableObjectNamespace,
  poolConfig?: PoolConfig
): Promise<Uint8Array> {
  const maxSlots = poolConfig?.maxSlots;
  // Fan out only when binding exists AND enough objects to justify it
  if (packWorker && objectShas.length > FANOUT_THRESHOLD) {
    return buildPackfileFanout(engine, objectShas, packWorker, maxSlots);
  }
  // Otherwise: local assembly (fast for small repos)
  // ...
}
```
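
A call site might look like the following. Here this.engine and this.env are assumed names for the RepoStore's WASM engine handle and environment bindings; POOL_MAX_SLOTS comes from the configuration table above.

```ts
// Hypothetical call from the RepoStore orchestrator.
const pack = await buildPackfile(this.engine, objectShas, this.env.PACK_WORKER, {
  maxSlots: Number(this.env.POOL_MAX_SLOTS ?? 20),
});
```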

Fan-out coordination uses Promise.allSettled for fault tolerance — if a worker DO fails, the batch is logged but doesn’t crash the clone. Failed objects are simply missing from the response (git client will retry).

| | gitmode (R2 + DO) | Gitea (disk + PostgreSQL) | GitHub (custom) |
|---|---|---|---|
| Deployment | wrangler deploy | Docker + PostgreSQL | N/A (SaaS) |
| Global latency | Edge-native (300+ POPs) | Single region | Edge CDN |
| Max repo size | Unbounded with fan-out | Disk-limited | 5GB soft limit |
| Concurrent reads | Fan out across DO pool | Single process | Replicated |
| Concurrent writes | Serialized per repo (DO) | Parallel (DB row locks) | Parallel (Spokes) |
| Storage cost | $0.015/GB, $0 egress | Disk cost | Included |
| Memory per operation | 128MB per DO (fan out for more) | Process memory (GBs) | Process memory |
| Garbage collection | Not yet (planned) | git gc built-in | Automatic |
| Operational burden | Zero | Medium | N/A |

gitmode is strongest when:

  • You need global edge latency without managing infrastructure
  • Egress costs matter (hosting many public repos, CI cloning frequently)
  • Per-repo isolation is important (multi-tenant platforms)
  • You want a programmatic REST API alongside git protocol

Traditional git servers are stronger when:

  • Single-region deployment is acceptable
  • You need concurrent writes to the same repo at high throughput
  • Repos contain very large binary assets (>5GB)
  • You need built-in garbage collection and repacking

| Scale | R2 Storage | R2 Operations | DO Compute | Total/month |
|---|---|---|---|---|
| 100 repos, 100MB avg | $0.15 | ~$0.05 | ~$0.10 | ~$0.30 |
| 1K repos, 500MB avg | $7.50 | ~$2.00 | ~$5.00 | ~$15 |
| 10K repos, 1GB avg | $150 | ~$50 | ~$100 | ~$300 |

These estimates assume moderate activity (10 pushes + 50 clones per repo per month). The zero-egress property means costs scale with storage and writes, not reads — the opposite of S3-backed solutions.