textsift
What this is
An npm package that runs openai/privacy-filter — a 1.5B-parameter / 50M-active MoE bidirectional token classifier for PII detection — entirely on the user's device. Per-platform GPU fast paths (Metal on macOS, Vulkan on Linux, Dawn on Windows, WebGPU in browsers); Zig + SIMD128 WASM as the no-GPU fallback. Apache 2.0.
The model is OpenAI's; the value of this package is the packaging:
- A native o200k-style BPE tokenizer in pure TypeScript. If your app doesn't already ship `@huggingface/transformers` for other models, that's a real bundle-size win.
- Per-platform native GPU backends — hand-written MSL on macOS, hand-written GLSL→SPIR-V on Linux, Tint→D3D12 on Windows, plus WGSL for browser WebGPU. All produce byte-identical span output.
- A WASM CPU path (Zig + SIMD128) that loads `model_q4f16.onnx` directly. The transformers.js / ORT-Web stack can't: ORT-Web's WASM bundle has no `MatMulNBits` / `GatherBlockQuantized` kernels for int4. Other JS runtimes (`onnxruntime-node`, `web-llm`, etc.) can in principle but don't ship out-of-the-box support for this model.
- Persistent OPFS caching of the 770 MB model weights in browsers (filesystem cache in Node), on by default.
- Streaming overloads of `detect()` and `redact()` — pass an `AsyncIterable<string>` to abort an LLM stream the moment a credit card or API key appears, render redacted text progressively, or front a model gateway (Cloudflare-Worker style) that has to forward chunk-by-chunk. See the sketch after this list.
- Custom rule engine + built-in `"secrets"` preset (JWT, GitHub PAT, AWS, Slack, OpenAI/Anthropic/Google/Stripe keys, PEM private-key headers).
Install
```sh
npm install textsift
```

One package, two import entry points + a CLI:

```ts
// Browser / Node-via-WASM (today)
import { PrivacyFilter } from "textsift/browser";

// Node native — auto-picks the platform's GPU fast path (Metal on
// macOS, Vulkan on Linux, Dawn on Windows). Falls back to WASM if no GPU.
import { PrivacyFilter } from "textsift";
```

```sh
# Same engine as a CLI — composes with shell, no clipboard dance
echo "Hi Alice, alice@example.com" | npx textsift redact
npx textsift table customers.csv --header --mode synth > clean.csv
```

See the CLI reference for every subcommand.
30-second example
```ts
import { PrivacyFilter } from "textsift/browser";

const filter = await PrivacyFilter.create();
const result = await filter.redact(
  "Hi, my name is John Smith and my email is john@example.com.",
);

console.log(result.redactedText);
// "Hi, my name is [private_person] and my email is [private_email]."
```

Measured numbers
Browser (M3 Pro, Chromium 147) — per-forward latency, median of 5 runs:
| Input length | textsift (WebGPU) | textsift (WASM MT) | transformers.js (WebGPU default) |
|---|---|---|---|
| ~7 tokens | 8.9 ms | 29.0 ms | 32.7 ms |
| ~25 tokens | 11.8 ms | 44.6 ms | 38.5 ms |
| ~80 tokens | 22.0 ms | 95.9 ms | 56.4 ms |
textsift WebGPU is 2.6–3.7× faster than transformers.js across every input length.
Node native (M2 Pro, Metal-direct) — synthetic-weight forward at production model dimensions:
| Tokens | textsift native |
|---|---|
| 7 | 5.2 ms |
| 32 | 10.8 ms |
| 80 | 23.8 ms |
Hand-written MSL beats Dawn’s WGSL→MSL codegen by ~1.9× on the same hardware.
Node native (Linux Intel Iris Xe, Vulkan-direct) — the differentiator:
| Tokens | textsift native | ONNX Runtime Node CPU | Speedup |
|---|---|---|---|
| 32 | 28 ms | ~800 ms | 28× |
That's GPU-accelerated PII filtering on Intel iGPUs, AMD APUs, and other non-NVIDIA hardware — without CUDA, without ROCm, without a driver dance. `npm install textsift` ships a vendored Vulkan-direct binary that talks to whatever Mesa-supported GPU is present.
Cold start: we don’t claim a speedup over transformers.js. The OPFS-vs-Cache-API gap is a storage decision, not an inference-engine one. See benchmarks for the full breakdown.
CPU fallback
If no GPU is available (Linux without Vulkan, Node in a sandbox, browsers without WebGPU + shader-f16), `import { PrivacyFilter } from "textsift"` automatically falls back to the WASM CPU path. textsift's WASM is a Zig + SIMD128 implementation that loads `model_q4f16.onnx` directly. transformers.js with `device: "wasm"` fails session creation on this model because ORT-Web is missing the int4 contrib kernels (`MatMulNBits` / `GatherBlockQuantized`), so within the standard JS-app ecosystem this is the only working CPU path. Other runtimes (e.g. `onnxruntime-node` with custom ops, `web-llm`) can load it in principle but require setup that textsift doesn't.
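If you already know the environment has no usable GPU (a CI sandbox, a locked-down serverless runtime), you can also import the WASM entry point directly instead of letting the root entry point probe and fall back. A minimal sketch using `textsift/browser`, which the install section notes also serves Node-via-WASM:

```ts
// Explicitly take the WASM CPU path; importing from "textsift" would reach
// the same place via automatic fallback when no GPU is found.
import { PrivacyFilter } from "textsift/browser";

const filter = await PrivacyFilter.create();
const result = await filter.redact("Contact bob@example.com for access.");
console.log(result.redactedText);
```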
Caveats
openai/privacy-filter is a detection aid, not an anonymization guarantee. Read the caveats page and OpenAI's model card before treating output as compliance-safe.