
Vector Search

QueryMode supports vector similarity search on embedding columns stored in Lance format. Searches use WASM SIMD for acceleration and HNSW or IVF-PQ indexes when available.

```typescript
// Search with a raw vector
const similar = await qm
  .table("images")
  .vector("embedding", queryVector, 10)
  .select("id", "title")
  .collect()
```

```typescript
// Search with text (requires encoder)
const similar = await qm
  .table("articles")
  .vector("embedding", "climate change solutions", 10, {
    encoder: async (text) => myModel.encode(text),
    metric: "cosine",
  })
  .collect()
```
| Parameter | Type | Description |
| --- | --- | --- |
| `column` | `string` | Column containing `Float32Array` embeddings |
| `queryVector` | `Float32Array \| string` | Query vector or text (text requires `encoder`) |
| `topK` | `number` | Number of nearest neighbors to return |
| `opts.metric` | `"cosine" \| "l2" \| "dot"` | Distance metric (default: `"cosine"`) |
| `opts.encoder` | `(text: string) => Promise<Float32Array>` | Text-to-vector encoder for string queries |
| `opts.nprobe` | `number` | IVF-PQ tuning: number of partitions to probe |
| `opts.efSearch` | `number` | HNSW tuning: search beam width |
```sql
SELECT id, title FROM articles
WHERE embedding NEAR [0.1, 0.2, 0.3, ...] TOPK 10
```

The NEAR operator performs vector similarity search. TOPK limits results to the K nearest neighbors.

| Metric | Description | Best for |
| --- | --- | --- |
| `cosine` | Cosine similarity (default) | Text embeddings, normalized vectors |
| `l2` | Euclidean distance | Spatial data, unnormalized vectors |
| `dot` | Dot product | Pre-normalized vectors |
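As a point of reference, the three metrics can be sketched in plain TypeScript. These are scalar loops for illustration only, not the engine's SIMD implementation:

```typescript
// Dot product: sum of element-wise products.
function dot(a: Float32Array, b: Float32Array): number {
  let s = 0
  for (let i = 0; i < a.length; i++) s += a[i] * b[i]
  return s
}

// Cosine similarity: dot product normalized by both vector lengths.
function cosine(a: Float32Array, b: Float32Array): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
}

// Euclidean (L2) distance.
function l2(a: Float32Array, b: Float32Array): number {
  let s = 0
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i]
    s += d * d
  }
  return Math.sqrt(s)
}

const u = new Float32Array([1, 0])
const v = new Float32Array([0, 1])
console.log(cosine(u, v)) // orthogonal vectors → 0
console.log(l2(u, v))     // √2 ≈ 1.414
console.log(dot(u, v))    // 0
```

Note that for vectors already normalized to unit length, `dot` and `cosine` rank results identically, which is why `dot` is the cheaper choice for pre-normalized embeddings.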

Without an index, QueryMode performs brute-force SIMD-accelerated distance computation across all vectors. Fast for datasets under ~100K vectors.
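Conceptually, the brute-force scan computes a distance per stored vector and keeps the K smallest. A minimal sketch (plain TypeScript, no SIMD; the helper names are illustrative):

```typescript
function l2(a: Float32Array, b: Float32Array): number {
  let s = 0
  for (let i = 0; i < a.length; i++) { const d = a[i] - b[i]; s += d * d }
  return Math.sqrt(s)
}

// O(n·dim) per query: distance to every vector, then keep the K nearest.
function bruteForceTopK(
  vectors: Float32Array[],
  query: Float32Array,
  k: number,
): { index: number; distance: number }[] {
  return vectors
    .map((v, index) => ({ index, distance: l2(v, query) }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k)
}

const db = [
  new Float32Array([0, 0]),
  new Float32Array([1, 1]),
  new Float32Array([5, 5]),
]
const hits = bruteForceTopK(db, new Float32Array([0.9, 0.9]), 2)
console.log(hits.map((h) => h.index)) // [1, 0]
```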

IVF-PQ (Inverted File with Product Quantization) indexes can be loaded from R2 for search:

  • IVF partitions vectors into clusters. At query time, only nprobe clusters are searched.
  • PQ compresses vectors into compact codes, reducing memory and I/O.

IVF-PQ indexes must be built externally (e.g. with LanceDB or FAISS) and stored in R2 alongside the data. QueryMode loads and searches them via the WASM engine. For indexes you can build directly in QueryMode, use HNSW.
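A toy illustration of the IVF half (quantization omitted): vectors are bucketed under their nearest centroid at build time, and a query scans only the `nprobe` closest partitions. The hand-picked centroids here stand in for k-means training:

```typescript
// Squared L2 distance (square root unnecessary for ranking).
function l2sq(a: Float32Array, b: Float32Array): number {
  let s = 0
  for (let i = 0; i < a.length; i++) { const d = a[i] - b[i]; s += d * d }
  return s
}

const centroids = [new Float32Array([0, 0]), new Float32Array([10, 10])]
const partitions: { id: number; vec: Float32Array }[][] = [[], []]

function add(id: number, vec: Float32Array) {
  // Assign each vector to its nearest centroid's partition.
  const c = l2sq(vec, centroids[0]) <= l2sq(vec, centroids[1]) ? 0 : 1
  partitions[c].push({ id, vec })
}

function search(query: Float32Array, k: number, nprobe: number) {
  // Rank partitions by centroid distance; scan only the top nprobe.
  const probed = centroids
    .map((c, i) => ({ i, d: l2sq(query, c) }))
    .sort((a, b) => a.d - b.d)
    .slice(0, nprobe)
  return probed
    .flatMap(({ i }) => partitions[i])
    .map((e) => ({ id: e.id, d: l2sq(e.vec, query) }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k)
}

add(0, new Float32Array([1, 1]))
add(1, new Float32Array([9, 9]))
add(2, new Float32Array([0, 3]))
const nearest = search(new Float32Array([1, 2]), 1, 1) // probes partition 0 only
console.log(nearest[0].id) // 0
```

With `nprobe: 1` only one of the two partitions is scanned; raising `nprobe` trades speed for recall, which is exactly the knob `opts.nprobe` exposes.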

For datasets where you need fast approximate nearest neighbor search with high recall, build an HNSW (Hierarchical Navigable Small World) index:

```typescript
import { HnswIndex } from "querymode"

// Build index
const index = new HnswIndex({
  dim: 128,
  metric: "cosine",
  M: 16, // max connections per node (default: 16)
  efConstruction: 200, // construction beam width (default: 200)
})

// Add vectors one at a time
index.add(0, vec0)
index.add(1, vec1)

// Or batch add from a contiguous Float32Array
const allVectors = new Float32Array(1000 * 128) // 1000 vectors, 128 dims
index.addBatch(allVectors, 128)

// Search
const { indices, scores } = index.search(queryVec, 10, /* efSearch */ 50)
// indices: Uint32Array of nearest neighbor IDs
// scores: Float32Array of distances (lower = more similar)
```
| Parameter | Default | Effect |
| --- | --- | --- |
| `M` | 16 | Higher = better recall, more memory. 12-48 typical. |
| `efConstruction` | 200 | Higher = better index quality, slower build. 100-400 typical. |
| `efSearch` | `topK` | Higher = better recall at query time, slower search. Set to 2-4x `topK`. |

HNSW indexes can be serialized to binary for storage (R2, disk) and deserialized on load:

```typescript
// Save
const binary: ArrayBuffer = index.serialize()
await bucket.put("indexes/embeddings.hnsw", binary)

// Load (R2 get returns null if the key is missing)
const data = await bucket.get("indexes/embeddings.hnsw")
if (!data) throw new Error("index not found")
const restored = HnswIndex.deserialize(await data.arrayBuffer(), "cosine")
const results = restored.search(queryVec, 10)
```
| | IVF-PQ | HNSW |
| --- | --- | --- |
| Speed | Fast (quantized distances) | Fast (graph traversal) |
| Memory | Low (compressed codes) | High (full vectors + graph) |
| Recall | Good with enough probes | Excellent |
| Build time | External (k-means training) | Incremental (add one at a time) |
| Build in QueryMode | No (load pre-built) | Yes (`HnswIndex`) |
| Best for | Large datasets (>1M vectors) | Medium datasets (<1M vectors) |
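A rough back-of-envelope for the memory row, using assumed sizes (float32 vectors, 4-byte link IDs, about 2·M graph links per node, one byte per PQ subquantizer) rather than figures measured from the engine:

```typescript
// HNSW keeps full vectors plus a neighbor graph in memory.
function hnswBytes(n: number, dim: number, M: number): number {
  return n * (dim * 4 /* float32 vector */ + 2 * M * 4 /* ~2·M link IDs */)
}

// PQ keeps only compact codes: one byte per subquantizer per vector.
function pqBytes(n: number, subquantizers: number): number {
  return n * subquantizers
}

const n = 1_000_000
console.log(hnswBytes(n, 128, 16) / 1e6) // 640 MB
console.log(pqBytes(n, 16) / 1e6)        // 16 MB
```

Even as a crude estimate, the ~40x gap explains the table's guidance: past a few million vectors, holding full vectors plus a graph in memory stops being practical.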

Vector search composes with all other DataFrame operations:

```typescript
const results = await qm
  .table("products")
  .filter("category", "eq", "electronics")
  .filter("price", "lt", 1000)
  .vector("embedding", queryVec, 20, { metric: "l2" })
  .select("id", "name", "price")
  .collect()
```

Vector search runs across all matching fragments. When combined with filters, both are applied during the scan phase — the query engine evaluates filter predicates and vector distances together to return filtered nearest neighbors.
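A sketch of that single-pass filtered scan in plain TypeScript (the row shape and field names here are illustrative, not the engine's internal representation):

```typescript
interface Row { id: number; price: number; embedding: Float32Array }

function l2(a: Float32Array, b: Float32Array): number {
  let s = 0
  for (let i = 0; i < a.length; i++) { const d = a[i] - b[i]; s += d * d }
  return Math.sqrt(s)
}

// A row contributes to the top-K only if it passes the predicate,
// so the result is the K nearest *filtered* rows, not a filter
// applied after the fact to an unfiltered top-K.
function filteredTopK(
  rows: Row[],
  predicate: (r: Row) => boolean,
  query: Float32Array,
  k: number,
): Row[] {
  return rows
    .filter(predicate)
    .map((r) => ({ r, d: l2(r.embedding, query) }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k)
    .map((x) => x.r)
}

const rows: Row[] = [
  { id: 1, price: 500, embedding: new Float32Array([0, 0]) },
  { id: 2, price: 1500, embedding: new Float32Array([0, 0]) }, // fails predicate
  { id: 3, price: 900, embedding: new Float32Array([3, 3]) },
]
const top = filteredTopK(rows, (r) => r.price < 1000, new Float32Array([0.1, 0]), 1)
console.log(top[0].id) // 1
```

The distinction matters: row 2 is the nearest vector overall, but it never enters the candidate set because it fails the price filter.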