Properties

category: reference
tags: [tasks, semantic-search, architecture]
last_updated: 2026-03-15
confidence: high

Semantic Search Architecture Issues

Current state

Search uses FAISS with ONNX MiniLM embeddings, running in-process in the gunicorn worker. This works for a single tenant; the dev wiki has 65 pages indexed.

Issues to address

1. Multi-tenant indexing (blocking)

The sync thread watches one wiki: whichever storage was set at startup. TenantResolver swaps storage per request, but the sync thread keeps the original reference, so it never sees the other wikis. Each wiki needs its own FAISS index directory and its own sync state. The reindex_all function also wipes and rebuilds the entire shared index, which would discard every wiki's vectors once more than one wiki is indexed.

Needed: per-wiki FAISS directories (/srv/data/faiss/{slug}/), per-wiki sync state, and either a single sync thread that iterates over all wikis or one thread per wiki.
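A minimal sketch of the per-wiki layout, assuming slugs are lowercase alphanumerics with hyphens and that a single loop visits each wiki in turn. The names index_dir_for, sync_all, and sync_one are hypothetical and do not come from the existing codebase; only the /srv/data/faiss/{slug}/ layout is taken from the note above.

```python
import re
from pathlib import Path

FAISS_ROOT = Path("/srv/data/faiss")  # root of the per-wiki layout described above
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9-]*")  # assumed slug format, not confirmed


def index_dir_for(slug: str, root: Path = FAISS_ROOT) -> Path:
    """Return the FAISS directory for one wiki, rejecting unsafe slugs."""
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid wiki slug: {slug!r}")
    return root / slug


def sync_all(slugs, sync_one, root: Path = FAISS_ROOT) -> dict:
    """Call sync_one(slug, index_dir) for every wiki and report per-wiki status."""
    results = {}
    for slug in slugs:
        try:
            sync_one(slug, index_dir_for(slug, root))
            results[slug] = "ok"
        except Exception as exc:
            # Keep going so one broken wiki cannot block the others.
            results[slug] = f"error: {exc}"
    return results
```

Validating the slug before joining it to the root keeps a malformed slug such as "../etc" from escaping the FAISS directory. A per-wiki reindex would then only need to clear index_dir_for(slug) rather than the shared index.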

2. In-process embedding risks

The ONNX model (~80MB) loads in the gunicorn worker. The sync thread is a daemon thread, so it is killed without cleanup when the worker exits on SIGTERM. If it is killed mid-write to the FAISS index, the index could be corrupted (a full reindex on next start recovers it, but that is slow).

Options:

Separate embedding worker process (like ChromaDB was, but lighter)
Queue-based: page saves write to a queue (SQLite reindex_queue table already in schema), worker process reads and embeds
Graceful shutdown handler in sync thread
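The graceful-shutdown option could look roughly like the sketch below. It combines two ideas: a stop event that lets the loop finish its current pass before exiting, and a write-to-temp-then-rename step so a kill mid-write leaves the previous index file intact. SyncLoop, atomic_write_bytes, and sync_once are hypothetical names, and the index is treated as opaque bytes here rather than going through FAISS's own writer.

```python
import os
import tempfile
import threading


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())
        # The rename is atomic on POSIX when both paths share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


class SyncLoop:
    """Hypothetical sync loop that can be asked to stop between passes."""

    def __init__(self, sync_once, interval: float = 60.0):
        self._sync_once = sync_once
        self._interval = interval
        self._stop = threading.Event()
        # Deliberately not a daemon thread, so it is not killed mid-pass.
        self._thread = threading.Thread(target=self._run, name="faiss-sync")

    def _run(self) -> None:
        while not self._stop.is_set():
            self._sync_once()
            self._stop.wait(self._interval)  # returns early once stop() is called

    def start(self) -> None:
        self._thread.start()

    def stop(self, timeout: float = 10.0) -> bool:
        """Signal the loop to exit; return True if the thread finished in time."""
        self._stop.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()
```

stop() would need to be called from whatever shutdown path the worker has, for example a gunicorn worker_exit server hook. Whether that hook gets enough time before the worker is force-killed depends on the configured graceful timeout, so the atomic rename remains the safety net.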

3. Sync frequency

Sync currently runs every 60 seconds by polling the git HEAD SHA. In a multi-tenant setup with many wikis, polling every wiki every 60 seconds doesn't scale. A queue (the reindex_queue table, fed by the page_saved hook) would be more efficient, because work happens only when a page actually changes.
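As a rough illustration of the queue approach, the sketch below uses SQLite from the standard library. The column layout of reindex_queue is an assumption (the real schema is not shown in this note), and enqueue, drain, and embed_page are hypothetical names.

```python
import sqlite3
import time

# Assumed schema; the real reindex_queue columns may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS reindex_queue (
    wiki_slug TEXT NOT NULL,
    page_path TEXT NOT NULL,
    queued_at REAL NOT NULL,
    PRIMARY KEY (wiki_slug, page_path)
)
"""


def enqueue(conn, wiki_slug, page_path):
    """Meant to be called from a page_saved hook; repeated saves collapse to one row."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO reindex_queue (wiki_slug, page_path, queued_at) "
            "VALUES (?, ?, ?)",
            (wiki_slug, page_path, time.time()),
        )


def drain(conn, embed_page, batch_size=50):
    """Embed up to batch_size queued pages, oldest first; return the number processed."""
    rows = conn.execute(
        "SELECT wiki_slug, page_path, queued_at FROM reindex_queue "
        "ORDER BY queued_at, wiki_slug, page_path LIMIT ?",
        (batch_size,),
    ).fetchall()
    done = 0
    for wiki_slug, page_path, queued_at in rows:
        embed_page(wiki_slug, page_path)
        with conn:
            # Delete only if the page was not saved again while it was being embedded.
            conn.execute(
                "DELETE FROM reindex_queue "
                "WHERE wiki_slug = ? AND page_path = ? AND queued_at = ?",
                (wiki_slug, page_path, queued_at),
            )
        done += 1
    return done
```

Keying the table on (wiki_slug, page_path) means a burst of saves to one page costs a single embedding. If embed_page raises, the row stays in the queue and is retried on the next drain.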

Not blocking launch

Semantic search works for the dev wiki. Multi-tenant indexing is needed before opening to users with multiple wikis. The in-process risks and sync frequency are optimization concerns for later.