
Benchmarks

  • Machine: Apple Silicon (M-series)
  • Tool: wrk -t2 -c10 -d10s
  • Workload: Hello world, JSON response, SIMD (Go calling Zig sum/dot/scale/minmax)
  • Runtime: wrangler dev (local miniflare) for the GoMode columns
|                    | Native Go | GoMode (Worker) | GoMode (DO) | Std Go WASM |
|--------------------|-----------|-----------------|-------------|-------------|
| GET / req/sec      | 80,715    | 3,764           | 1,586       | 614         |
| GET /json req/sec  | 83,182    | 3,715           | —           | 537         |
| GET /simd req/sec  | —         | 3,692           | 1,574       | —           |
| Latency (avg)      | 0.6ms     | 3.2ms           | 7.2ms       | 78ms        |
| Binary size        | native    | 79KB            | 79KB        | 3.0MB       |

Note: GoMode numbers are bottlenecked by wrangler dev (~3.7K req/sec ceiling). Production CF edge numbers will be significantly higher.

6.1x faster throughput than standard Go WASM, with a 38x smaller binary.

Standard Go produces a 3MB WASM binary with a heavy runtime (goroutine scheduler, full GC, large stdlib). GoMode produces a 79KB binary — TinyGo with gc=leaking, scheduler=none, and Zig linked directly in.

The /simd route calls Zig SIMD functions (sum, dot product, scale, minmax) from Go. Throughput is nearly identical to the plain hello world (3,692 vs 3,764 req/sec), which confirms the CGo linking adds effectively zero overhead: Zig functions become direct call instructions in the same WASM binary.

GoMode eliminates all serialization between JS and Go:

  1. JS writes request fields as zerobuf tagged values directly into WASM memory (32 bytes for 2 fields)
  2. Go reads at fixed offsets — no parsing
  3. Go writes response at fixed offsets — no serialization
  4. JS reads at fixed offsets — no parsing

Compare to standard Go WASM which requires JSON.stringify → encode → copy → json.Unmarshal → process → json.Marshal → copy → JSON.parse on every request.

Worker mode is 2.4x faster than DO mode because:

  • No DO queue — requests don’t serialize through a single instance
  • Horizontal scaling — CF spins up multiple isolates, each with its own cached WASM instance
  • Same warm latency — both pay ~1ms per request once WASM is initialized, so the gap comes entirely from concurrency

Use DO mode when you need state (WebSocket sessions, counters, etc.). Use Worker mode for stateless handlers.

These benchmarks run on wrangler dev (local miniflare), not production CF edge. The wrangler dev server itself caps around ~3.7K req/sec, so these numbers reflect the wrangler bottleneck — not GoMode’s actual throughput limit.

  • Production CF edge — deploy benchmarks (coming soon)
  • Columnar SIMD workloads — bulk data transforms with Zig SIMD
  • Async CF bindings — KV, R2, D1 via Asyncify (not integrated yet)
```sh
# Install wrk
brew install wrk

# Build the single WASM binary
npm run build

# Start the dev server
npm run dev

# Benchmark each route
wrk -t2 -c10 -d10s http://localhost:8787/        # Worker mode
wrk -t2 -c10 -d10s http://localhost:8787/json    # Worker JSON
wrk -t2 -c10 -d10s http://localhost:8787/simd    # Worker SIMD
wrk -t2 -c10 -d10s http://localhost:8787/do/     # DO mode
wrk -t2 -c10 -d10s http://localhost:8787/do/simd # DO SIMD
```