Properties

category: reference
tags: [tasks, semantic-search, architecture]
last_updated: 2026-03-20
confidence: high

Semantic Search Architecture — IMPLEMENTED

Implementation (2026-03-20)

Semantic search is fully implemented and operational. All issues listed below have been resolved.

FAISS backend with IndexFlatIP, per-wiki indexes under /srv/data/faiss/{slug}/
ONNX MiniLM-L6-v2 embeddings (ChromaDB bundled model)
Multi-tenant via BackendRegistry — lazy per-wiki index creation
Synchronous lifecycle hooks — HookListener registers page_saved, page_deleted, page_renamed hooks that trigger immediate FAISS upsert/delete. No background worker or queue.
Auto-reindex on first wiki access back-fills existing pages
reindex_all is per-wiki scoped
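
The lazy per-wiki creation, first-access back-fill, and per-wiki reindex_all listed above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the FlatIPIndex class is a hypothetical pure-Python stand-in for a FAISS IndexFlatIP (it ranks by inner product over unit-length vectors), and the load_pages and embed callables, the get method, and the _fill helper are all assumed names. Only BackendRegistry, reindex_all, and the /srv/data/faiss/{slug}/ layout come from this page.

```python
import math
from dataclasses import dataclass, field
from pathlib import Path

FAISS_ROOT = Path("/srv/data/faiss")  # per-wiki indexes live under {root}/{slug}/


def normalize(vec):
    """Scale to unit length so that inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


@dataclass
class FlatIPIndex:
    """Hypothetical in-memory stand-in for a flat inner-product index."""
    index_dir: Path
    vectors: dict = field(default_factory=dict)  # page name -> unit vector

    def upsert(self, page, vec):
        self.vectors[page] = normalize(vec)

    def delete(self, page):
        self.vectors.pop(page, None)

    def search(self, query, k=5):
        # Brute-force scan: score every stored vector, keep the top k.
        q = normalize(query)
        scored = [(sum(a * b for a, b in zip(q, v)), page)
                  for page, v in self.vectors.items()]
        return [page for _, page in sorted(scored, reverse=True)[:k]]


class BackendRegistry:
    """Sketch of a registry holding one index per wiki slug."""

    def __init__(self, load_pages, embed):
        self._load_pages = load_pages  # assumed: slug -> iterable of (page, text)
        self._embed = embed            # assumed: text -> vector
        self._indexes = {}

    def get(self, slug):
        index = self._indexes.get(slug)
        if index is None:
            # Lazy creation: nothing is built until a wiki is first accessed.
            index = self._indexes[slug] = FlatIPIndex(FAISS_ROOT / slug)
            self._fill(index, slug)  # back-fill existing pages
        return index

    def reindex_all(self, slug):
        """Rebuild only the named wiki's index; other wikis are untouched."""
        index = self._indexes.setdefault(slug, FlatIPIndex(FAISS_ROOT / slug))
        self._fill(index, slug)

    def _fill(self, index, slug):
        index.vectors.clear()
        for page, text in self._load_pages(slug):
            index.upsert(page, self._embed(text))
```

In this sketch, calling registry.get("some-wiki") twice loads that wiki's pages only once, and reindex_all("other-wiki") never touches the first wiki's vectors.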

Each wiki has its own FAISS directory (/srv/data/faiss/{slug}/), and BackendRegistry manages all per-wiki state.

Synchronous lifecycle hooks replace the daemon sync thread. This removes the risk of mid-write corruption caused by SIGTERM killing a background thread.

Hook-based updates run immediately on page save, delete, and rename. Nothing polls for changes.
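
As an illustration of the synchronous hook flow described above, here is a minimal sketch. The hook names page_saved, page_deleted, and page_renamed and the HookListener name come from this page; the HookBus class, the handler signatures, and the plain dict used in place of a FAISS index are assumptions made for the example. Moving the stored vector on rename is also an assumption, since a real system may need to re-embed if the title is part of the indexed text.

```python
from collections import defaultdict


class HookBus:
    """Hypothetical synchronous event bus: fire() runs handlers inline."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, event, handler):
        self._handlers[event].append(handler)

    def fire(self, event, **payload):
        # Handlers run in the caller's thread and finish before fire() returns,
        # so there is no queue and no background worker.
        for handler in self._handlers[event]:
            handler(**payload)


class HookListener:
    """Keeps one wiki's index in step with page lifecycle events."""

    def __init__(self, bus, index, embed):
        self.index = index  # dict: page name -> vector (stand-in for FAISS)
        self.embed = embed  # assumed: text -> vector
        bus.register("page_saved", self.on_saved)
        bus.register("page_deleted", self.on_deleted)
        bus.register("page_renamed", self.on_renamed)

    def on_saved(self, page, text):
        self.index[page] = self.embed(text)  # upsert: insert or overwrite

    def on_deleted(self, page):
        self.index.pop(page, None)  # deleting an unknown page is a no-op

    def on_renamed(self, old, new):
        if old in self.index:
            self.index[new] = self.index.pop(old)
```

Because each handler completes before fire() returns, the index already reflects a save by the time the save call finishes. The trade-off is that embedding time is added to the save path.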

Not blocking: current corpus sizes are well within the estimated ~10K-chunk threshold. This can be revisited if the corpus grows significantly.
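
This page does not say what the ~10K-chunk threshold is measured against. Purely for a sense of scale, the sketch below estimates the raw vector storage and per-query work of a flat (brute-force) index at that size. It assumes 384-dimensional float32 embeddings, which is the output size of MiniLM-L6-v2. The function names are hypothetical.

```python
def flat_index_bytes(num_chunks, dim=384, bytes_per_float=4):
    """Raw vector storage for a flat index: one float32 per dimension per chunk."""
    return num_chunks * dim * bytes_per_float


def multiply_adds_per_query(num_chunks, dim=384):
    """A flat index compares the query against every stored vector."""
    return num_chunks * dim


size = flat_index_bytes(10_000)
print(f"{size / 1e6:.2f} MB")           # 15.36 MB
print(multiply_adds_per_query(10_000))  # 3840000
```

Both figures grow linearly with chunk count under these assumptions.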

Tasks/Semantic_Search_Multi_Tenant — multi-tenant implementation details
Design/Async_Embedding_Pipeline — original FAISS + MiniLM design (AWS, archived)