@yoch/frozenminisearch v1.3.0

FrozenMiniSearch

Memory-optimized, read-only full-text search for Node.js and browsers. FrozenMiniSearch keeps the serving API close to MiniSearch while using compact, immutable indexes for fixed corpora.

Try the demo application (Billboard Hot 100 search and auto-suggest in the browser).

Use it when your documents are built offline, shipped to production, and queried many times. For that workload, frozen indexes use ~98-99% less index RAM in the main benchmark set, save to compact binary snapshots, and load faster than MiniSearch JSON.

If you need live add, remove, or discard, use MiniSearch. If the corpus is fixed, this package is designed to keep the search experience familiar while making each serving replica much smaller.

Why FrozenMiniSearch?

FrozenMiniSearch is for the common production path where search data changes elsewhere, not inside the web process:

Build or import the index offline.
Save it as a compact binary snapshot.
Load it in many read-only Node.js processes.
Query with MiniSearch-style search, autoSuggest, filters, boosts, prefix/fuzzy search, wildcard, and AND / OR / AND_NOT.

Internally it replaces mutable JavaScript object graphs with packed radix postings, typed arrays, and columnar stored fields. The result is less flexible than MiniSearch, but much cheaper to keep resident.
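To give a feel for why typed arrays are cheaper to keep resident than object graphs, here is a minimal sketch of packed postings. It is an illustration only and not the library's actual layout: the names `packPostings` and `lookup` are hypothetical, and the radix structure for term lookup is replaced by a plain sorted array.

```javascript
// Object-graph form: every term owns a separate array on the heap.
const objectPostings = { zen: [2], art: [1, 2] }

// Packed form: all doc ids live in one Uint32Array, and a second
// Uint32Array records where each term's slice starts and ends.
function packPostings(postingsByTerm) {
  const terms = Object.keys(postingsByTerm).sort()
  const offsets = new Uint32Array(terms.length + 1)
  let total = 0
  terms.forEach((term, i) => {
    offsets[i] = total
    total += postingsByTerm[term].length
  })
  offsets[terms.length] = total

  const docIds = new Uint32Array(total)
  terms.forEach((term, i) => docIds.set(postingsByTerm[term], offsets[i]))
  return { terms, offsets, docIds }
}

// Returns a view into the shared buffer; nothing is copied.
function lookup(packed, term) {
  const i = packed.terms.indexOf(term)
  if (i === -1) return new Uint32Array(0)
  return packed.docIds.subarray(packed.offsets[i], packed.offsets[i + 1])
}

const packedExample = packPostings(objectPostings)
console.log(Array.from(lookup(packedExample, 'art'))) // [ 1, 2 ]
```

The packed form cannot grow in place, which mirrors the trade-off described above: less flexibility in exchange for a smaller, flatter memory footprint.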

Measured vs MiniSearch

Same corpora, same BM25-style queries, MiniSearch 7.2.0 as the reference.

Scenario	Docs	Index RAM	Binary size	Load time	Search p50
Divina, with stored text	14,097	0.3 vs 16.1 MB (~98% less)	~71% less	~54% faster	~19% faster
Divina, index only	14,097	0.2 vs 14.9 MB (~99% less)	~74% less	~82% faster	~17% faster
High-frequency terms	10,000	—	~92% less	~83% faster	~43% faster
Dense numeric ids	100,000	—	~73% less	~88% faster	~28% faster
Uint16 doc id boundary	65,535	—	~77% less	~91% faster	~45% faster

Across this full run, frozen is faster on 24/27 search cases. Divina inferno (exact, paired p50): mutable 14.9 µs → frozen 11.1 µs (-4 µs, ratio 0.74).

Numbers are from benchmarks/baselines/reference.json, captured 2026-06-21 on Node v24.16.0, 3 runs per scenario. Heap protocol v3 (isolated scenario processes, in-process trials, median+MAD on allowlisted scenarios) — trend, not exact accounting. Index RAM column shows — for scenarios outside the heap allowlist.
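The percentages and the ratio quoted above follow from simple arithmetic on the table values. The helper below (`percentLess` is a hypothetical name, not part of the package) reproduces them from the figures in this section.

```javascript
// Percentage reduction of `frozen` relative to the `mutable` baseline.
function percentLess(frozen, mutable) {
  return ((mutable - frozen) / mutable) * 100
}

// Index RAM, Divina with stored text: 0.3 MB vs 16.1 MB
console.log(percentLess(0.3, 16.1).toFixed(1)) // "98.1"

// Index RAM, Divina index only: 0.2 MB vs 14.9 MB
console.log(percentLess(0.2, 14.9).toFixed(1)) // "98.7"

// Paired p50 ratio for the exact-match case: 11.1 µs / 14.9 µs
console.log((11.1 / 14.9).toFixed(2)) // "0.74"
```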

Quick start

npm install @yoch/frozenminisearch

import FrozenMiniSearch from '@yoch/frozenminisearch'

const options = { fields: ['title', 'text'], storeFields: ['title'] }
const index = FrozenMiniSearch.fromDocuments(documents, options)

index.search('ishmael', { prefix: true })
index.autoSuggest('zen ar')

const buf = index.saveBinarySync()
const loaded = FrozenMiniSearch.loadBinarySync(buf, options)

For larger imports, use the incremental builder:

import FrozenMiniSearch, {
  createFrozenIndexBuilder,
  freezeFrozenIndexBuilder,
} from '@yoch/frozenminisearch'

const builder = createFrozenIndexBuilder(options, { estimatedDocumentCount: rows.length })
for (const doc of rows) builder.add(doc)
const index = freezeFrozenIndexBuilder(builder)

ESM and CommonJS are both supported on Node (main → CJS, module → ESM). For browsers and bundlers, use the dedicated browser entry (search, build, and async binary I/O):

import FrozenMiniSearch from '@yoch/frozenminisearch/browser'

const index = FrozenMiniSearch.fromDocuments(documents, options)
index.search('ishmael', { prefix: true })

// Load a zlib snapshot from CDN (Uint8Array)
const buf = new Uint8Array(await (await fetch('/index.frozen')).arrayBuffer())
const loaded = await FrozenMiniSearch.loadBinaryAsync(buf, options)

See the hosted demo or examples/plain_js_frozen/ locally (yarn demo:prepare then serve the repo root).

Usage

Basic usage

const documents = [
  { id: 1, title: 'Moby Dick', text: 'Call me Ishmael. Some years ago...', category: 'fiction' },
  { id: 2, title: 'Zen and the Art of Motorcycle Maintenance', text: 'I can see by my watch...', category: 'fiction' },
  // ...
]

const options = { fields: ['title', 'text'], storeFields: ['title', 'category'] }
const index = FrozenMiniSearch.fromDocuments(documents, options)

index.search('zen art motorcycle')
// => [{ id, title, category, score, match, ... }, ...]

Frozen indexes are read-only: there is no add, remove, or discard. To change the corpus, rebuild the index offline; for large imports, feed documents through createFrozenIndexBuilder and freeze once ingestion is done.

Search options

MiniSearch-style options work on search() and autoSuggest():

index.search('zen', { fields: ['title'] })
index.search('zen', { boost: { title: 2 } })
index.search('moto', { prefix: true })
index.search('ismael', { fuzzy: 0.2 })
index.search('zen', { filter: (result) => result.category === 'fiction' })
index.search('zen', { combineWith: 'AND' }) // OR, AND_NOT

// Or set default search options once, at build time
const indexWithDefaults = FrozenMiniSearch.fromDocuments(documents, {
  fields: ['title', 'text'],
  searchOptions: { prefix: true, fuzzy: 0.2 },
})

Wildcard and nested query combinations are supported (FrozenMiniSearch.wildcard, QueryCombination).
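As a sketch of how a filter and a nested query can be prepared as plain values, the snippet below defines a reusable predicate and a query tree. The nested shape follows MiniSearch's documented QueryCombination and is assumed to carry over unchanged; `inCategory` and `sampleResults` are hypothetical, and the filter only sees `category` if that field is listed in storeFields.

```javascript
// Predicate factory for the `filter` search option.
const inCategory = (category) => (result) => result.category === category

// Nested query: "motorcycle" AND ("zen" OR "art").
const nestedQuery = {
  combineWith: 'AND',
  queries: ['motorcycle', { combineWith: 'OR', queries: ['zen', 'art'] }],
}

// Stand-in results, shaped like the search() output shown earlier.
const sampleResults = [
  { id: 1, title: 'Moby Dick', category: 'fiction', score: 1.2 },
  { id: 3, title: 'Zen in the Art of Archery', category: 'non-fiction', score: 0.9 },
]

console.log(sampleResults.filter(inCategory('fiction')).map((r) => r.id)) // [ 1 ]
```

With a real index, the two values would be passed as `index.search(nestedQuery, { filter: inCategory('fiction') })`.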

Auto-suggestions

index.autoSuggest('zen ar')
// => [{ suggestion: 'zen archery art', terms: [...], score }, ...]

index.autoSuggest('neromancer', { fuzzy: 0.2 })
index.autoSuggest('zen ar', { filter: (result) => result.category === 'fiction' })

Field extraction

For nested or computed fields, pass extractField at index build time (and again when loading binary snapshots if you override defaults):

const options = {
  fields: ['title', 'author.name', 'pubYear'],
  extractField: (document, fieldName) => {
    if (fieldName === 'pubYear') {
      return document.pubDate?.getFullYear().toString()
    }
    return fieldName.split('.').reduce((doc, key) => doc && doc[key], document)
  },
}

The default extractor is available via FrozenMiniSearch.getDefault('extractField').
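To see what the custom extractor above returns, it can be called directly on a sample document without building an index. The `book` object is a hypothetical document made up for this illustration.

```javascript
const extractField = (document, fieldName) => {
  // Computed field: derive the year from a Date
  if (fieldName === 'pubYear') {
    return document.pubDate?.getFullYear().toString()
  }
  // Nested field: walk the dotted path, stopping at the first missing key
  return fieldName.split('.').reduce((doc, key) => doc && doc[key], document)
}

const book = {
  id: 1,
  title: 'Moby Dick',
  author: { name: 'Herman Melville' },
  pubDate: new Date(1851, 9, 18),
}

console.log(extractField(book, 'title'))            // Moby Dick
console.log(extractField(book, 'author.name'))      // Herman Melville
console.log(extractField(book, 'pubYear'))          // 1851
console.log(extractField({ id: 2 }, 'author.name')) // undefined
console.log(extractField({ id: 2 }, 'pubYear'))     // undefined
```

Missing values come back as undefined rather than throwing, so documents with partial data do not break the extractor.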

Tokenization

const options = {
  fields: ['title', 'text'],
  tokenize: (string, _fieldName) => string.split('-'),
  searchOptions: {
    tokenize: (string) => string.split(/[\s-]+/),
  },
}

FrozenMiniSearch.getDefault('tokenize') returns the built-in Unicode space/punctuation splitter. Only that exact function reference enables the fastest indexing path; equivalent wrappers use the general path.
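The two tokenizers in the options above split differently, which is easy to check by calling them directly. The second half of the sketch illustrates the exact-reference point: `builtin` is only a stand-in for the library default, not the real function.

```javascript
const indexTokenize = (string, _fieldName) => string.split('-')
const searchTokenize = (string) => string.split(/[\s-]+/)

// Index time: only hyphens separate tokens, so spaces stay inside a token
console.log(indexTokenize('zen-and-the-art')) // [ 'zen', 'and', 'the', 'art' ]
console.log(indexTokenize('Moby Dick'))       // [ 'Moby Dick' ]

// Search time: whitespace and hyphens both separate tokens
console.log(searchTokenize('zen art-motorcycle')) // [ 'zen', 'art', 'motorcycle' ]

// A wrapper produces the same tokens but is a different function object,
// so an identity check against the default would not match it.
const builtin = (text) => text.split(/\s+/) // stand-in for the library default
const wrapper = (text) => builtin(text)
console.log(wrapper === builtin) // false
```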

Term processing

const stopWords = new Set(['and', 'or', 'the'])

const options = {
  fields: ['title', 'text'],
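  processTerm: (term) => {
    // Lowercase first so capitalized stop words ("The") are also dropped
    const lower = term.toLowerCase()
    return stopWords.has(lower) ? null : lower
  },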
  searchOptions: {
    processTerm: (term) => term.toLowerCase(),
  },
}

FrozenMiniSearch.getDefault('processTerm') downcases terms (no stemming or stop-word list by default).
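The effect of a stop-word processTerm can be previewed on a token list. This sketch assumes, as in MiniSearch, that a term mapped to null is discarded; `processAll` is a hypothetical helper that imitates that step. The variant here lowercases before the stop-word lookup so that capitalized stop words are caught too.

```javascript
const stopWords = new Set(['and', 'or', 'the'])

const processTerm = (term) => {
  const lower = term.toLowerCase()
  return stopWords.has(lower) ? null : lower
}

// Imitates the indexing step: process each token, then drop discarded ones
const processAll = (tokens) => tokens.map(processTerm).filter((t) => t !== null)

console.log(processAll(['The', 'Art', 'and', 'Zen'])) // [ 'art', 'zen' ]
```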

Default helpers

FrozenMiniSearch.getDefault('tokenize')
FrozenMiniSearch.getDefault('processTerm')
FrozenMiniSearch.getDefault('extractField')
FrozenMiniSearch.getDefault('stringifyField')

Use these when wrapping a custom function and delegating to the library default, same as MiniSearch.getDefault.
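One way to wrap a default is to inject it into a factory, so the custom logic handles special cases and everything else is delegated. In this sketch `makeTokenizer` and the `tags` field are hypothetical, and `standInDefault` replaces the value that `FrozenMiniSearch.getDefault('tokenize')` would supply in real code.

```javascript
// Build a tokenizer that special-cases one field and delegates the rest.
const makeTokenizer = (defaultTokenize) => (text, fieldName) =>
  fieldName === 'tags'
    ? text.split(',').map((tag) => tag.trim())
    : defaultTokenize(text, fieldName)

// Stand-in for the library default, used here so the sketch runs on its own
const standInDefault = (text) => text.split(/\s+/)

const tokenize = makeTokenizer(standInDefault)
console.log(tokenize('zen, art', 'tags'))        // [ 'zen', 'art' ]
console.log(tokenize('call me ishmael', 'text')) // [ 'call', 'me', 'ishmael' ]
```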

Migration

For fixed corpora, most serving code can stay the same. Change how the index is built or loaded, then keep calling search, autoSuggest, has, and getStoredFields.

Default and named imports both work:

// ESM
import FrozenMiniSearch from '@yoch/frozenminisearch'
import { FrozenMiniSearch } from '@yoch/frozenminisearch'

// CommonJS
const FrozenMiniSearch = require('@yoch/frozenminisearch')
const { FrozenMiniSearch } = require('@yoch/frozenminisearch')

Build directly:

import FrozenMiniSearch from '@yoch/frozenminisearch'

const frozen = FrozenMiniSearch.fromDocuments(documents, options)

Or freeze an existing MiniSearch index:

import MiniSearch from 'minisearch'
import FrozenMiniSearch from '@yoch/frozenminisearch'

const mutable = new MiniSearch(options)
mutable.addAll(documents)

const frozen = FrozenMiniSearch.fromMiniSearch(mutable, options)
const fromJson = FrozenMiniSearch.fromJson(JSON.stringify(mutable), options)

MiniSearch is only needed if you still build mutable indexes. Frozen instances do not support live add, remove, or discard.

Search API (compatible with MiniSearch)

search(query, searchOptions?) — string, wildcard (FrozenMiniSearch.wildcard), or nested QueryCombination
autoSuggest(queryString, options?)
has(id), getStoredFields(id)
getDefault(optionName) — built-in tokenize, processTerm, extractField, stringifyField, …
saveBinarySync / loadBinarySync on Node (async variants too); browser entry supports async binary only (Uint8Array, raw / zlib / auto)

Custom tokenize and processTerm functions are not stored in snapshots; pass the same functions again when loading.
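Functions are code rather than data, so serialized formats generally cannot carry them; JSON shows the same limitation. A simple way to keep build and load in sync is a shared options factory, sketched below (`makeOptions` is a hypothetical helper, and the JSON call is only an analogy for the snapshot format).

```javascript
// Single source of truth for options, imported by both the build
// script and the serving process.
function makeOptions() {
  return {
    fields: ['title', 'text'],
    tokenize: (string) => string.split(/[\s-]+/),
    processTerm: (term) => term.toLowerCase(),
  }
}

// Serialization keeps the data and silently drops the functions
console.log(JSON.stringify(makeOptions())) // {"fields":["title","text"]}
```

The serving process would then pass the same factory output when loading, for example `FrozenMiniSearch.loadBinarySync(buf, makeOptions())`.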

See Usage above for examples.

Binary snapshots (Node)

Binary snapshots are the preferred production format on Node.js.

const buf = index.saveBinarySync()
const loaded = FrozenMiniSearch.loadBinarySync(buf, {}) // field names embedded in snapshot

Requires Node ≥ 20.
compression: 'auto' uses zlib when it shrinks the payload (portable on Node 20+ and in the browser build); falls back to raw when compression does not help.
Use explicit compression when you need a specific artifact:

const portable = index.saveBinarySync({ compression: 'zlib' }) // CDN / browser
const uncompressed = index.saveBinarySync({ compression: 'raw' })
const bestRatio = index.saveBinarySync({ compression: 'zstd' }) // Node 22.15+ only

Raw snapshots load in the browser without native compression APIs. zlib snapshots in the browser require CompressionStream / DecompressionStream. Browser binary I/O is async because it uses native browser stream APIs, but it still materializes the full compressed/decompressed payload in memory. zstd snapshots require Node 22.15+ (read/write on Node; not supported in the browser build).
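The sketch below shows the generic Web Streams pattern behind zlib handling in the browser, plus an 'auto'-style choice between zlib and raw. It is not the library's implementation or snapshot format: `pipeBytes`, `deflate`, `inflate`, and `encodeAuto` are hypothetical helpers. They rely only on CompressionStream, DecompressionStream, Blob, and Response, which are globals in current browsers and in Node 20.

```javascript
// Run bytes through a transform stream and collect the whole result.
// The API is async, yet the full output still ends up in memory.
async function pipeBytes(bytes, transform) {
  const stream = new Blob([bytes]).stream().pipeThrough(transform)
  return new Uint8Array(await new Response(stream).arrayBuffer())
}

// 'deflate' in the Compression Streams API is the zlib (RFC 1950) format
const deflate = (bytes) => pipeBytes(bytes, new CompressionStream('deflate'))
const inflate = (bytes) => pipeBytes(bytes, new DecompressionStream('deflate'))

// 'auto'-style decision: keep the compressed form only if it is smaller
async function encodeAuto(bytes) {
  const compressed = await deflate(bytes)
  return compressed.length < bytes.length
    ? { compression: 'zlib', payload: compressed }
    : { compression: 'raw', payload: bytes }
}

async function demo() {
  const repetitive = new TextEncoder().encode('frozen '.repeat(1000))
  const encoded = await encodeAuto(repetitive)
  const restored = await inflate(encoded.payload)
  console.log(encoded.compression, restored.length === repetitive.length) // zlib true

  // A tiny payload grows under zlib framing, so raw wins
  const tiny = await encodeAuto(new Uint8Array([1, 2, 3]))
  console.log(tiny.compression) // raw
}
demo()
```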

Benchmarks

See benchmarks/README.md.

npm run bench -- run --profile=vs-reference   # compare frozen vs minisearch
npm run bench:diff                            # regression vs reference.json
npm run bench:readme -- --from=benchmarks/baselines/latest.json

Development

yarn install
yarn test          # src/ + dev/parity/
yarn build
node scripts/verify-npm-pack.cjs

Parity tests compare against MiniSearch 7. Longer notes and performance work live under dev/docs/README.md and benchmarks/README.md.

Changelog & credits

See CHANGELOG.md.

MiniSearch — Luca Ongaro (MIT)
@yoch/frozenminisearch — memory-optimized frozen indexes and compact binary snapshots

Upstream docs: MiniSearch