
Vector Search

QueryMode supports vector similarity search on embedding columns stored in Lance format. Searches use WASM SIMD for acceleration and HNSW or IVF-PQ indexes when available.

```typescript
// Search with a raw vector
const similar = await qm
  .table("images")
  .vector("embedding", queryVector, 10)
  .select("id", "title")
  .collect()
```

```typescript
// Search with text (requires encoder)
const similar = await qm
  .table("articles")
  .vector("embedding", "climate change solutions", 10, {
    encoder: async (text) => myModel.encode(text),
    metric: "cosine",
  })
  .collect()
```
| Parameter | Type | Description |
| --- | --- | --- |
| `column` | `string` | Column containing `Float32Array` embeddings |
| `queryVector` | `Float32Array \| string` | Query vector or text (text requires `encoder`) |
| `topK` | `number` | Number of nearest neighbors to return |
| `opts.metric` | `"cosine" \| "l2" \| "dot"` | Distance metric (default: `"cosine"`) |
| `opts.encoder` | `(text: string) => Promise<Float32Array>` | Text-to-vector encoder for string queries |
| `opts.nprobe` | `number` | IVF-PQ tuning: number of partitions to probe |
| `opts.efSearch` | `number` | HNSW tuning: search beam width |
```sql
SELECT id, title FROM articles
WHERE embedding NEAR [0.1, 0.2, 0.3, ...] TOPK 10
```

The NEAR operator performs vector similarity search. TOPK limits results to the K nearest neighbors.

| Metric | Description | Best for |
| --- | --- | --- |
| `cosine` | Cosine similarity (default) | Text embeddings, normalized vectors |
| `l2` | Euclidean distance | Spatial data, unnormalized vectors |
| `dot` | Dot product | Pre-normalized vectors |
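As a point of reference, the three metrics can be sketched in plain TypeScript. These are scalar loops for illustration only, not the engine's SIMD implementation:

```typescript
// Dot product: sum of element-wise products.
function dot(a: Float32Array, b: Float32Array): number {
  let s = 0
  for (let i = 0; i < a.length; i++) s += a[i] * b[i]
  return s
}

// Cosine similarity: dot product normalized by both vector lengths.
function cosine(a: Float32Array, b: Float32Array): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
}

// Euclidean (L2) distance.
function l2(a: Float32Array, b: Float32Array): number {
  let s = 0
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i]
    s += d * d
  }
  return Math.sqrt(s)
}

const u = new Float32Array([1, 0])
const v = new Float32Array([0, 1])
console.log(cosine(u, v)) // orthogonal vectors → 0
console.log(l2(u, v))     // √2 ≈ 1.414
console.log(dot(u, v))    // 0
```

Note that for vectors already normalized to unit length, `dot` and `cosine` rank results identically, which is why `dot` is the cheaper choice for pre-normalized embeddings.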

Without an index, QueryMode performs brute-force SIMD-accelerated distance computation across all vectors. Fast for datasets under ~100K vectors.
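Conceptually, the brute-force scan computes a distance per stored vector and keeps the K smallest. A minimal sketch (plain TypeScript, no SIMD; the helper names are illustrative):

```typescript
function l2(a: Float32Array, b: Float32Array): number {
  let s = 0
  for (let i = 0; i < a.length; i++) { const d = a[i] - b[i]; s += d * d }
  return Math.sqrt(s)
}

// O(n·dim) per query: distance to every vector, then keep the K nearest.
function bruteForceTopK(
  vectors: Float32Array[],
  query: Float32Array,
  k: number,
): { index: number; distance: number }[] {
  return vectors
    .map((v, index) => ({ index, distance: l2(v, query) }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k)
}

const db = [
  new Float32Array([0, 0]),
  new Float32Array([1, 1]),
  new Float32Array([5, 5]),
]
const hits = bruteForceTopK(db, new Float32Array([0.9, 0.9]), 2)
console.log(hits.map((h) => h.index)) // [1, 0]
```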

IVF-PQ (Inverted File with Product Quantization) indexes can be loaded from R2 for search:

  • IVF partitions vectors into clusters. At query time, only nprobe clusters are searched.
  • PQ compresses vectors into compact codes, reducing memory and I/O.

IVF-PQ indexes must be built externally (e.g. with LanceDB or FAISS) and stored in R2 alongside the data. QueryMode loads and searches them via the WASM engine. For indexes you can build directly in QueryMode, use HNSW.
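A toy illustration of the IVF half (quantization omitted): vectors are bucketed under their nearest centroid at build time, and a query scans only the `nprobe` closest partitions. The hand-picked centroids here stand in for k-means training:

```typescript
// Squared L2 distance (square root unnecessary for ranking).
function l2sq(a: Float32Array, b: Float32Array): number {
  let s = 0
  for (let i = 0; i < a.length; i++) { const d = a[i] - b[i]; s += d * d }
  return s
}

const centroids = [new Float32Array([0, 0]), new Float32Array([10, 10])]
const partitions: { id: number; vec: Float32Array }[][] = [[], []]

function add(id: number, vec: Float32Array) {
  // Assign each vector to its nearest centroid's partition.
  const c = l2sq(vec, centroids[0]) <= l2sq(vec, centroids[1]) ? 0 : 1
  partitions[c].push({ id, vec })
}

function search(query: Float32Array, k: number, nprobe: number) {
  // Rank partitions by centroid distance; scan only the top nprobe.
  const probed = centroids
    .map((c, i) => ({ i, d: l2sq(query, c) }))
    .sort((a, b) => a.d - b.d)
    .slice(0, nprobe)
  return probed
    .flatMap(({ i }) => partitions[i])
    .map((e) => ({ id: e.id, d: l2sq(e.vec, query) }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k)
}

add(0, new Float32Array([1, 1]))
add(1, new Float32Array([9, 9]))
add(2, new Float32Array([0, 3]))
const nearest = search(new Float32Array([1, 2]), 1, 1) // probes partition 0 only
console.log(nearest[0].id) // 0
```

With `nprobe: 1` only one of the two partitions is scanned; raising `nprobe` trades speed for recall, which is exactly the knob `opts.nprobe` exposes.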

For datasets where you need fast approximate nearest neighbor search with high recall, build an HNSW (Hierarchical Navigable Small World) index:

```typescript
import { HnswIndex } from "querymode"

// Build index
const index = new HnswIndex({
  dim: 128,
  metric: "cosine",
  M: 16, // max connections per node (default: 16)
  efConstruction: 200, // construction beam width (default: 200)
})

// Add vectors one at a time
index.add(0, vec0)
index.add(1, vec1)

// Or batch add from a contiguous Float32Array
const allVectors = new Float32Array(1000 * 128) // 1000 vectors, 128 dims
index.addBatch(allVectors, 128)

// Search
const { indices, scores } = index.search(queryVec, 10, /* efSearch */ 50)
// indices: Uint32Array of nearest neighbor IDs
// scores: Float32Array of distances (lower = more similar)
```
| Parameter | Default | Effect |
| --- | --- | --- |
| `M` | 16 | Higher = better recall, more memory. 12-48 typical. |
| `efConstruction` | 200 | Higher = better index quality, slower build. 100-400 typical. |
| `efSearch` | `topK` | Higher = better recall at query time, slower search. Set to 2-4x `topK`. |

HNSW indexes can be serialized to binary for storage (R2, disk) and deserialized on load:

```typescript
// Save
const binary: ArrayBuffer = index.serialize()
await bucket.put("indexes/embeddings.hnsw", binary)

// Load (R2 get returns null if the key is missing)
const data = await bucket.get("indexes/embeddings.hnsw")
if (!data) throw new Error("index not found")
const restored = HnswIndex.deserialize(await data.arrayBuffer(), "cosine")
const results = restored.search(queryVec, 10)
```
| | IVF-PQ | HNSW |
| --- | --- | --- |
| Speed | Fast (quantized distances) | Fast (graph traversal) |
| Memory | Low (compressed codes) | High (full vectors + graph) |
| Recall | Good with enough probes | Excellent |
| Build time | External (k-means training) | Incremental (add one at a time) |
| Build in QueryMode | No (load pre-built) | Yes (`HnswIndex`) |
| Best for | Large datasets (>1M vectors) | Medium datasets (<1M vectors) |
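A rough back-of-envelope for the memory row, using assumed sizes (float32 vectors, 4-byte link IDs, about 2·M graph links per node, one byte per PQ subquantizer) rather than figures measured from the engine:

```typescript
// HNSW keeps full vectors plus a neighbor graph in memory.
function hnswBytes(n: number, dim: number, M: number): number {
  return n * (dim * 4 /* float32 vector */ + 2 * M * 4 /* ~2·M link IDs */)
}

// PQ keeps only compact codes: one byte per subquantizer per vector.
function pqBytes(n: number, subquantizers: number): number {
  return n * subquantizers
}

const n = 1_000_000
console.log(hnswBytes(n, 128, 16) / 1e6) // 640 MB
console.log(pqBytes(n, 16) / 1e6)        // 16 MB
```

Even as a crude estimate, the ~40x gap explains the table's guidance: past a few million vectors, holding full vectors plus a graph in memory stops being practical.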

Vector search composes with all other DataFrame operations:

```typescript
const results = await qm
  .table("products")
  .filter("category", "eq", "electronics")
  .filter("price", "lt", 1000)
  .vector("embedding", queryVec, 20, { metric: "l2" })
  .select("id", "name", "price")
  .collect()
```

Vector search runs across all matching fragments. When combined with filters, both are applied during the scan phase — the query engine evaluates filter predicates and vector distances together to return filtered nearest neighbors.
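A sketch of that single-pass filtered scan in plain TypeScript (the row shape and field names here are illustrative, not the engine's internal representation):

```typescript
interface Row { id: number; price: number; embedding: Float32Array }

function l2(a: Float32Array, b: Float32Array): number {
  let s = 0
  for (let i = 0; i < a.length; i++) { const d = a[i] - b[i]; s += d * d }
  return Math.sqrt(s)
}

// A row contributes to the top-K only if it passes the predicate,
// so the result is the K nearest *filtered* rows, not a filter
// applied after the fact to an unfiltered top-K.
function filteredTopK(
  rows: Row[],
  predicate: (r: Row) => boolean,
  query: Float32Array,
  k: number,
): Row[] {
  return rows
    .filter(predicate)
    .map((r) => ({ r, d: l2(r.embedding, query) }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k)
    .map((x) => x.r)
}

const rows: Row[] = [
  { id: 1, price: 500, embedding: new Float32Array([0, 0]) },
  { id: 2, price: 1500, embedding: new Float32Array([0, 0]) }, // fails predicate
  { id: 3, price: 900, embedding: new Float32Array([3, 3]) },
]
const top = filteredTopK(rows, (r) => r.price < 1000, new Float32Array([0.1, 0]), 1)
console.log(top[0].id) // 1
```

The distinction matters: row 2 is the nearest vector overall, but it never enters the candidate set because it fails the price filter.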