# Lazy Evaluation
## Execution model

DataFrame methods like `.filter()`, `.sort()`, and `.limit()` do not execute anything. They build a `QueryDescriptor` — a plain object describing what to do. Execution happens only when you call a terminal method.
```ts
// Nothing executes here — just builds a descriptor
const query = qm.table("events")
  .filter("status", "eq", "active")
  .sort("created_at", "desc")
  .limit(100)

// Execution happens HERE
const result = await query.collect()
```
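The deferred-descriptor pattern above can be sketched in a few lines. This is an illustrative mock, not QueryMode's internals: the `Op` and `SketchFrame` names are invented here, and each chained call simply records an operation without running anything.

```ts
// Hypothetical sketch of a lazy builder: every method returns a new
// immutable object that only appends an operation description.
type Op =
  | { kind: "filter"; column: string; op: string; value: unknown }
  | { kind: "sort"; column: string; dir: "asc" | "desc" }
  | { kind: "limit"; n: number }

class SketchFrame {
  constructor(readonly table: string, readonly ops: Op[] = []) {}

  filter(column: string, op: string, value: unknown): SketchFrame {
    return new SketchFrame(this.table, [...this.ops, { kind: "filter", column, op, value }])
  }
  sort(column: string, dir: "asc" | "desc"): SketchFrame {
    return new SketchFrame(this.table, [...this.ops, { kind: "sort", column, dir }])
  }
  limit(n: number): SketchFrame {
    return new SketchFrame(this.table, [...this.ops, { kind: "limit", n }])
  }

  // A terminal method would hand this descriptor to an executor.
  describe() {
    return { table: this.table, ops: this.ops }
  }
}

const q = new SketchFrame("events")
  .filter("status", "eq", "active")
  .limit(100)

// Two chained calls produced two recorded ops; nothing executed.
console.log(q.describe().ops.length) // 2
```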
## Terminal methods

| Method | What it does | When to use |
|---|---|---|
| `.collect()` | Execute and return all matching rows | Default — most queries |
| `.exec()` | Alias for `.collect()` | Same behavior |
| `.first()` | Return the first matching row or `null` | Existence check or single lookup |
| `.count()` | Return the row count without materializing rows | Counting without data transfer |
| `.exists()` | Return `true` if any row matches | Cheapest existence check |
| `.lazy()` | Return a `LazyResultHandle` for paging | Large results, on-demand pages |
| `.stream()` | Yield `Row[]` batches via an `AsyncGenerator` | Processing rows without loading all into memory |
| `.cursor()` | Return an `AsyncIterable<Row[]>` for streaming | Like `.stream()`, but requires executor cursor support |
| `.explain()` | Return the query plan without executing | Debugging, inspecting pruning |
## Lazy result handle

`.lazy()` returns a `LazyResultHandle` that executes pages on demand:
```ts
const handle = await qm.table("events")
  .filter("status", "eq", "active")
  .sort("created_at", "desc")
  .lazy()

// Fetch page 0 (rows 0-99)
const page0 = await handle.page(0, 100)

// Fetch page 3 (rows 300-399)
const page3 = await handle.page(300, 100)

// Fetch a single row
const row42 = await handle.row(42)

// Full materialization if needed
const all = await handle.collect()
```

Each `.page()` call is a separate query execution with offset and limit. No state is held between pages — the handle re-executes the query each time. This means:
- Pages can be fetched in any order
- No memory accumulates between pages
- Sorted results are consistent if data doesn’t change
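The stateless-paging contract can be illustrated with a small mock. `PageSource` is invented here and is not QueryMode's `LazyResultHandle`; it just shows that each `page(offset, limit)` call re-runs the underlying query from scratch, so pages need no shared state and arrive in any order.

```ts
// Illustrative mock: page() re-executes the "query" (here, a sort)
// on every call, then slices out the requested window.
class PageSource<T> {
  constructor(private run: () => T[]) {}

  // offset/limit execution per call; nothing cached between pages
  page(offset: number, limit: number): T[] {
    return this.run().slice(offset, offset + limit)
  }
}

const data = [5, 3, 9, 1, 7]
const handle = new PageSource(() => [...data].sort((a, b) => a - b))

// Pages can be fetched in any order
const p1 = handle.page(2, 2) // [5, 7]
const p0 = handle.page(0, 2) // [1, 3]
```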
## Columnar pipeline and materialization

Internally, QueryMode defers `Row[]` creation as long as possible. Data flows through the pipeline in columnar format — column buffers and selection vectors — rather than as JavaScript objects.
Local execution (Node/Bun):

- `ScanOperator` reads column pages from disk into typed arrays
- Filters run via WASM SIMD directly on column buffers — no row objects created
- Aggregation, sort, and limit operate on columnar batches
- `Row[]` objects are materialized only at the final `collect()` / `stream()` boundary
Edge execution (Cloudflare Workers):

- `FragmentDO` scans pages and runs WASM queries, producing QMCB (QueryMode Columnar Binary) — a zero-copy columnar wire format
- QMCB `ArrayBuffer`s transfer over Worker RPC via structured clone (not JSON serialization)
- `QueryDO` receives QMCB and merges/sorts columnar batches without creating `Row[]`
- `Row[]` is materialized only at the HTTP response boundary via `columnarBatchToRows()`
This means a query that scans 1M rows but returns 100 never creates 1M JavaScript objects — filters eliminate rows at the column-buffer level, and only the 100 matching rows become `Row[]` at the end.
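The selection-vector idea can be sketched in plain TypeScript. The function names below are invented for illustration; the point is that filtering produces only an array of surviving row indices, and objects are built solely for those survivors at the end.

```ts
// Filter a column buffer into a selection vector of matching indices.
function filterGt(column: Float64Array, threshold: number): Uint32Array {
  const sel: number[] = []
  for (let i = 0; i < column.length; i++) {
    if (column[i] > threshold) sel.push(i) // record index only; no object yet
  }
  return Uint32Array.from(sel)
}

// Materialize row objects only for the selected indices.
function materialize(ids: Float64Array, values: Float64Array, sel: Uint32Array) {
  return Array.from(sel, (i) => ({ id: ids[i], value: values[i] }))
}

const ids = Float64Array.from([1, 2, 3, 4])
const values = Float64Array.from([10, 99, 5, 42])
const sel = filterGt(values, 20)           // indices [1, 3]
const rows = materialize(ids, values, sel) // only 2 objects ever created
```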
For custom operator pipelines, `drainPipeline()` is the function that exhausts an operator chain and materializes all output rows:
```ts
import { buildPipeline, drainPipeline } from "querymode"

const pipeline = buildPipeline(descriptor, fragmentSource)
const { rows, columns } = await drainPipeline(pipeline)
```
## Streaming iteration

`.stream()` works directly on the DataFrame — no `.lazy()` needed:
```ts
for await (const batch of qm.table("events").stream(500)) {
  // batch is Row[] with up to 500 rows
  process(batch)

  // Break early to stop fetching
  if (done) break
}
```

If the executor supports cursors, `.stream()` fetches batches incrementally. Otherwise it falls back to `.collect()` and yields slices — still useful for processing without holding all rows in your code at once.
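The collect-then-slice fallback behavior can be sketched as a small async generator. The `streamFallback` name is invented here; this is not QueryMode's implementation, just the shape of the fallback described above.

```ts
// Hypothetical sketch: when no cursor is available, run the full
// collect once, then yield fixed-size slices of the result.
async function* streamFallback<T>(
  collect: () => Promise<T[]>,
  batchSize: number
): AsyncGenerator<T[]> {
  const all = await collect()
  for (let i = 0; i < all.length; i += batchSize) {
    yield all.slice(i, i + batchSize)
  }
}

const batches: number[][] = []
for await (const batch of streamFallback(async () => [1, 2, 3, 4, 5], 2)) {
  batches.push(batch)
}
// batches is [[1, 2], [3, 4], [5]]
```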
`.stream()` is also available on `LazyResultHandle`:
```ts
const handle = await qm.table("events").lazy()

for await (const batch of handle.stream(500)) {
  process(batch)
}
```
## Cursor

`.cursor()` is the low-level streaming primitive. It requires an executor with cursor support (e.g., edge mode) and throws if none is available:
```ts
for await (const batch of qm.table("events").cursor({ batchSize: 1000 })) {
  await processBatch(batch)
}
```

Prefer `.stream()` unless you need to guarantee incremental fetching.
## Keyset pagination

For large sorted datasets, offset-based pagination gets slower as the offset grows (the engine must skip N rows). Keyset pagination uses the last seen value to start the next page:
```ts
// First page
const page1 = await qm.table("events")
  .sort("id", "asc")
  .limit(50)
  .collect()

// Next page — starts after the last id
const lastId = page1.rows[page1.rows.length - 1].id
const page2 = await qm.table("events")
  .sort("id", "asc")
  .after(lastId)
  .limit(50)
  .collect()
```

`.after(value)` translates to a `gt` filter on the sort column (or `lt` for descending sorts), which benefits from page-level skip. Every page is equally fast regardless of depth.
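The keyset contract can be shown on an in-memory sorted array. The real `.after(value)` runs inside the engine; the `keysetPage` helper below is invented for illustration and just applies the "strictly greater than the last seen value" rule.

```ts
// Sketch: start each page strictly after the last value seen,
// rather than skipping a growing offset.
function keysetPage(sorted: number[], after: number | null, limit: number): number[] {
  // "gt filter on the sort column"
  const start = after === null ? 0 : sorted.findIndex((v) => v > after)
  return start === -1 ? [] : sorted.slice(start, start + limit)
}

const ids = [1, 2, 3, 5, 8, 13, 21]
const page1 = keysetPage(ids, null, 3)                    // [1, 2, 3]
const page2 = keysetPage(ids, page1[page1.length - 1], 3) // [5, 8, 13]
```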