Architecture
Overview
PyMode compiles upstream CPython 3.13 to `wasm32-wasi` using `zig cc`, then runs it inside Cloudflare Workers via Durable Objects. Each request gets its own sandboxed Python runtime.
```
┌──────────────────────────────────────────────────────┐
│  Cloudflare Edge                                     │
│                                                      │
│  ┌───────────┐     ┌──────────────────────────────┐  │
│  │  Worker   │────▶│  PythonDO (Durable Object)   │  │
│  │(stateless)│     │                              │  │
│  └───────────┘     │  ┌────────────────────────┐  │  │
│                    │  │ python.wasm (CPython)  │  │  │
│                    │  │                        │  │  │
│                    │  │ on_fetch(req, env)     │  │  │
│                    │  │       │                │  │  │
│                    │  │       ▼                │  │  │
│                    │  │ WASM Host Imports      │  │  │
│                    │  └───────┬────────────────┘  │  │
│                    │          │                   │  │
│                    │    ┌─────▼─────┐             │  │
│                    │    │ Asyncify  │             │  │
│                    │    │ (suspend/ │             │  │
│                    │    │  resume)  │             │  │
│                    │    └─────┬─────┘             │  │
│                    └──────────┼───────────────────┘  │
│                               │                      │
│                    ┌──────────▼────────────────┐     │
│                    │ KV │ R2 │ D1 │ TCP │ HTTP │     │
│                    └───────────────────────────┘     │
└──────────────────────────────────────────────────────┘
```

Build Pipeline
Phase 1: Native Python
A native CPython 3.13 is built first; it serves as the "build Python" for cross-compilation.
Phase 2: WASM Cross-Compilation
CPython is cross-compiled to `wasm32-wasi` using `zig cc` wrappers:
| Component | Purpose |
|---|---|
| `zig-cc` | C compiler wrapper targeting `wasm32-wasi` with `-Os` |
| `zig-ar` | Static archiver |
| `zig-cpp` | C preprocessor |
| `config.site-wasi` | Pre-answers configure checks for WASI |
Key build flags:
- Target: `wasm32-wasi`
- Optimization: `-Os` (ReleaseSmall)
- Disabled: threads, shared libs, IPv6, pymalloc
- WASI emulation: `-lwasi-emulated-signal`, `-lwasi-emulated-getpid`, `-lwasi-emulated-process-clocks`
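The wrapper and flags above can be pictured with a small sketch. It assumes a hypothetical helper, `zig_cc_command`, that prepends the target and optimization flags to whatever arguments the build passes through. PyMode's real `zig-cc` wrapper is not shown in this document and may add further flags.

```python
# Hypothetical sketch of what a zig-cc style wrapper does; not PyMode's actual script.
TARGET = "wasm32-wasi"
WASI_EMULATION_LIBS = [
    "-lwasi-emulated-signal",
    "-lwasi-emulated-getpid",
    "-lwasi-emulated-process-clocks",
]


def zig_cc_command(args, linking=False):
    """Build the argv for one compiler invocation."""
    cmd = ["zig", "cc", "-target", TARGET, "-Os", *args]
    if linking:
        # The -l flags name libraries, so they are only added when linking.
        cmd += WASI_EMULATION_LIBS
    return cmd


print(" ".join(zig_cc_command(["-c", "Python/ceval.c", "-o", "ceval.o"])))
```

Keeping the flags in one wrapper means `configure` and `make` can treat it like any other C compiler.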
Phase 3: Asyncify
`wasm-opt --asyncify` instruments the binary so host imports can suspend and resume the WASM stack:
```
Python calls fetch()
  → WASM host import pymode.http_fetch
  → Asyncify unwinds the stack
  → JS awaits the actual HTTP fetch
  → Asyncify rewinds the stack
  → Python receives the response
```

This means Python code looks synchronous while the host performs async I/O.
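The suspend/resume idea can be illustrated in plain Python. The sketch below is only an analogy: it uses a generator's `yield` as the suspension point, whereas Asyncify rewrites the compiled binary so real handler code needs no `yield`. The names `fake_http_fetch`, `HOST_IMPORTS`, and `run` are hypothetical.

```python
import asyncio


async def fake_http_fetch(url):
    """Stand-in for the host's real async fetch (hypothetical)."""
    await asyncio.sleep(0)  # pretend to wait on the network
    return f"response from {url}"


HOST_IMPORTS = {"http_fetch": fake_http_fetch}


def handler():
    # Reads like blocking code: the yield marks where the stack is suspended.
    body = yield ("http_fetch", "https://example.com/")
    return body.upper()


async def run(handler_fn):
    """Drive the handler, awaiting each host call before resuming it."""
    gen = handler_fn()
    try:
        call = next(gen)  # run until the first host import ("unwind")
        while True:
            name, arg = call
            result = await HOST_IMPORTS[name](arg)  # host performs async I/O
            call = gen.send(result)  # resume where it stopped ("rewind")
    except StopIteration as done:
        return done.value


print(asyncio.run(run(handler)))
```

The handler never sees a coroutine or a callback; it receives the finished result at the point where it was suspended.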
Phase 4: Stdlib Bundle
`generate-stdlib-fs.py` packages the CPython stdlib and PyMode runtime into a TypeScript map that's embedded in the worker:
```typescript
export const stdlibFS: Record<string, string> = {
  "encodings/__init__.py": "...",
  "encodings/utf_8.py": "...",
  "json/__init__.py": "...",
  // ~90 stdlib modules
};
```

WASM Host Imports
Instead of JS interop or virtual filesystem hacks, PyMode uses direct WASM imports:
```c
__attribute__((import_module("pymode"), import_name("kv_get")))
int32_t pymode_kv_get(const char* key, int32_t key_len,
                      uint8_t* buf, int32_t buf_len);
```

The JavaScript host (PythonDO) provides these functions at WASM instantiation time. Python calls them through a C extension module (`_pymode`).
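To make the pointer-and-length signature concrete, here is a minimal simulation of what the host side of `kv_get` has to do: read the key out of guest memory, then copy the value into the guest's buffer. The `bytearray` stands in for WASM linear memory, and the return convention (bytes written, or `-1` when the key is missing) is an assumption; the document does not specify it.

```python
# Hypothetical host-side view of kv_get; the return-value convention is assumed.
memory = bytearray(64 * 1024)  # stand-in for one page of WASM linear memory
kv_store = {"greeting": b"hello"}  # stand-in for a KV namespace


def kv_get(key_ptr, key_len, buf_ptr, buf_len):
    """Read the key from guest memory and copy the value into the guest buffer."""
    key = bytes(memory[key_ptr:key_ptr + key_len]).decode("utf-8")
    value = kv_store.get(key)
    if value is None:
        return -1  # assumed "not found" sentinel
    n = min(len(value), buf_len)  # never write past the guest's buffer
    memory[buf_ptr:buf_ptr + n] = value[:n]
    return n  # number of bytes written


# Guest side: place the key at offset 0 and reserve a 16-byte buffer at offset 32.
key = b"greeting"
memory[0:len(key)] = key
written = kv_get(0, len(key), 32, 16)
print(written, bytes(memory[32:32 + written]))
```

A real KV lookup is asynchronous, so in PyMode a call like this would also pass through the Asyncify suspend/resume step described earlier.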
Import Namespaces
| Namespace | Functions |
|---|---|
| `wasi_snapshot_preview1` | WASI standard (`fd_read`, `fd_write`, `path_open`, etc.) |
| `pymode` | KV, R2, D1, TCP, HTTP, threading, dynamic loading |
| `asyncify` | Stack unwind/rewind control |
PythonDO (Durable Object)
Each request is handled by a Durable Object instance that:
- Instantiates `python.wasm` with WASI + pymode imports
- Writes the request as JSON to WASM stdin
- Calls `_start` (which runs `_handler.py` → `on_fetch()`)
- Reads the response from WASM stdout
- Manages async I/O via Asyncify during execution
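The stdin/stdout exchange in the steps above can be sketched as follows. This is a simplified model, not the contents of `_handler.py`: the JSON field names (`method`, `url`, `status`, `body`) and the dict-based request and response are assumptions made for illustration.

```python
import io
import json


def on_fetch(request, env):
    # Example handler; the dict-based request/response shape is assumed.
    return {"status": 200, "body": f"hello from {request['url']}"}


def run_handler(stdin, stdout, env=None):
    """Guest side: read one JSON request, call on_fetch, write one JSON response."""
    request = json.load(stdin)
    response = on_fetch(request, env)
    json.dump(response, stdout)


# Host side (simulated): write the request to "stdin", then read "stdout".
stdin = io.StringIO(json.dumps({"method": "GET", "url": "/ping"}))
stdout = io.StringIO()
run_handler(stdin, stdout)
print(json.loads(stdout.getvalue()))
```

Using the standard streams keeps the guest entry point an ordinary WASI program, so the same `_start` path serves every request.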
The DO holds:
- Active TCP connections (for connection pooling)
- HTTP response handles
- Thread/child DO results
- Dynamic loading state (`.wasm` side modules)
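Host imports like the `kv_get` signature shown earlier exchange only integers and pointers, so host-side objects such as connections and responses are commonly referred to by integer handles. The sketch below shows one such handle table; `HandleTable` is a hypothetical helper, and PyMode's actual bookkeeping may differ.

```python
class HandleTable:
    """Maps small integers to host-side objects (hypothetical helper)."""

    def __init__(self):
        self._objects = {}
        self._next = 1  # 0 is left free so it can mean "no handle"

    def add(self, obj):
        handle = self._next
        self._next += 1
        self._objects[handle] = obj
        return handle

    def get(self, handle):
        return self._objects[handle]

    def close(self, handle):
        # Drop the object so the host can release the underlying resource.
        return self._objects.pop(handle, None)


responses = HandleTable()
h = responses.add({"status": 200, "body": b"ok"})
print(h, responses.get(h)["status"])
responses.close(h)
```

The guest keeps only the integer; the object itself never leaves the host.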
Dynamic Loading
C extension packages (markupsafe, simplejson, etc.) are compiled to `.wasm` side modules. PyMode intercepts `dlopen`/`dlsym` calls:
```
import markupsafe
  → CPython calls dlopen("markupsafe/_speedups.so")
  → dynload_pymode.c intercepts
  → WASM host import pymode.dl_open("markupsafe/_speedups.wasm")
  → JS loads + instantiates the side module
  → dlsym("PyInit__speedups") returns the init function
  → CPython initializes the extension module
```

Multi-Worker RPC Architecture
The 10MB compressed bundle limit applies per worker (total of all JS, WASM, and static assets). For applications needing large packages (numpy, pandas, image processing), you can split across multiple workers connected via Service Bindings:
```
┌────────────────────────────────────────────────────────────┐
│  Cloudflare Edge (same colo, sub-ms latency)               │
│                                                            │
│  ┌──────────────────┐   Service    ┌────────────────────┐  │
│  │  PyMode Worker   │───Binding───▶│  Compute Worker    │  │
│  │  (~3MB gz)       │              │  (~8MB gz)         │  │
│  │                  │              │                    │  │
│  │  python.wasm     │     RPC      │  python.wasm +     │  │
│  │  on_fetch()      │◀─────────────│  numpy.wasm (zig)  │  │
│  │  routing, auth   │              │  heavy compute     │  │
│  └──────────────────┘              └────────────────────┘  │
│           │                                                │
│           │ Service Binding                                │
│           ▼                                                │
│  ┌────────────────────┐                                    │
│  │  Extension Worker  │                                    │
│  │  (~5MB gz)         │                                    │
│  │  C extensions as   │                                    │
│  │  WASM side modules │                                    │
│  └────────────────────┘                                    │
└────────────────────────────────────────────────────────────┘
```

How It Works
Each worker is a separate deployment with its own 10MB budget, 128MB memory, and 30s CPU time. Workers communicate via Service Bindings: direct worker-to-worker calls within the same Cloudflare colo, with no network overhead.
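Before splitting an application, it can help to estimate whether its bundle fits the per-worker budget at all. The sketch below is a rough local check, assuming the limit is measured on gzip-compressed bytes and that 10MB means 10 × 1024 × 1024; the platform's exact accounting may differ, and the helper names are hypothetical.

```python
import gzip

LIMIT_BYTES = 10 * 1024 * 1024  # assumed reading of the 10MB per-worker limit


def gzipped_size(files):
    """Sum the gzip-compressed size of each file's bytes."""
    return sum(len(gzip.compress(data)) for data in files.values())


def fits_in_one_worker(files, limit=LIMIT_BYTES):
    return gzipped_size(files) <= limit


# Toy bundle: real entries would be read from disk with Path.read_bytes().
bundle = {"worker.js": b"export default {};" * 100, "python.wasm": bytes(200_000)}
print(gzipped_size(bundle), fits_in_one_worker(bundle))
```

If the check fails, the largest entries are the natural candidates to move behind a Service Binding.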
```toml
# wrangler.toml for the PyMode handler worker
[[services]]
binding = "COMPUTE"
service = "compute-worker"
```

```python
# PyMode handler: fast routing, auth, template rendering
import json


def on_fetch(request, env):
    # NOTE: `payload` (the request data) and `Response` come from code not shown here.
    # Delegate heavy compute to a separate worker
    resp = env.COMPUTE.fetch(
        "http://internal/process",
        method="POST",
        body=json.dumps({"data": payload}),
    )
    result = resp.json()
    return Response.json(result)
```

When to Use Multi-Worker
| Scenario | Approach |
|---|---|
| Pure Python packages (jinja2, pyyaml, langchain-core) | Bundle in site-packages.zip |
| Small C extensions (markupsafe) | Pure Python fallback or WASM side module |
| Rust extensions (pydantic_core) | Compiled to WASM variant (python-pydantic-core.wasm) |
| C extensions (numpy) | Compiled to WASM variant (python-numpy.wasm) |
| Heavy compute (ML inference) | Use Workers AI (env.AI.run()) |
Comparison with Pyodide
| | PyMode | Pyodide (CF Python Workers) |
|---|---|---|
| CPython | Upstream 3.13 | Patched 3.12 fork |
| Compiler | zig cc | Emscripten |
| Target | wasm32-wasi | wasm32-emscripten |
| Binary size | 5.7MB (1.8MB gz) | 20MB (6.4MB gz) |
| Cold start | ~28ms (5ms with Wizer) | ~50ms (with snapshot) |
| Async I/O | Asyncify (transparent) | VFS trampoline |
| CF Bindings | WASM host imports | JS interop bridge |
| Status | Active | Limited beta |