Semantic Search Architecture

---
category: reference
tags: [tasks, semantic-search, architecture]
last_updated: 2026-03-20
confidence: high
---

# Semantic Search Architecture — IMPLEMENTED

## Implementation (2026-03-20)

Semantic search is fully implemented and operational. Issues 1 to 3 below are resolved; issue 4 (FAISS sidecar scalability) is deferred and not blocking.

### Architecture
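- **FAISS backend** with `IndexFlatIP` (exact, brute-force inner-product search), per-wiki indexes under `/srv/data/faiss/{slug}/`
- **ONNX MiniLM-L6-v2 embeddings** (the `all-MiniLM-L6-v2` ONNX model bundled with ChromaDB, 384-dimensional vectors)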
- **Multi-tenant via `BackendRegistry`** — lazy per-wiki index creation
- **Synchronous lifecycle hooks** — `HookListener` registers `page_saved`, `page_deleted`, `page_renamed` hooks that trigger immediate FAISS upsert/delete. No background worker or queue.
- **Auto-reindex** on first wiki access back-fills existing pages
- `reindex_all` is per-wiki scoped
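
The lazy per-wiki registry and the synchronous hooks described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `BackendRegistry`, `HookListener`, and the three hook names come from this page, while `WikiIndex`, `toy_embed`, and all method signatures are assumptions made for the example. A plain dictionary with brute-force inner-product scoring stands in for the FAISS index, and a word counter stands in for the MiniLM embedding model.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, List, Tuple

FAISS_ROOT = Path("/srv/data/faiss")  # root directory named on this page


@dataclass
class WikiIndex:
    """Hypothetical in-memory stand-in for one wiki's FAISS index."""
    path: Path
    vectors: Dict[str, List[float]] = field(default_factory=dict)

    def upsert(self, page: str, vector: List[float]) -> None:
        self.vectors[page] = vector

    def delete(self, page: str) -> None:
        self.vectors.pop(page, None)

    def search(self, query: List[float], k: int = 5) -> List[Tuple[str, float]]:
        # Score every stored vector by inner product and keep the top k,
        # the same exhaustive strategy a flat inner-product index uses.
        scored = [(page, sum(a * b for a, b in zip(query, vec)))
                  for page, vec in self.vectors.items()]
        return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


class BackendRegistry:
    """Creates each wiki's index lazily, on first access."""

    def __init__(self, root: Path = FAISS_ROOT) -> None:
        self._root = root
        self._indexes: Dict[str, WikiIndex] = {}

    def get(self, slug: str) -> WikiIndex:
        if slug not in self._indexes:
            self._indexes[slug] = WikiIndex(path=self._root / slug)
        return self._indexes[slug]


class HookListener:
    """Runs index updates inline in the calling thread: no worker, no queue."""

    def __init__(self, registry: BackendRegistry,
                 embed: Callable[[str], List[float]]) -> None:
        self._registry = registry
        self._embed = embed

    def page_saved(self, slug: str, page: str, text: str) -> None:
        self._registry.get(slug).upsert(page, self._embed(text))

    def page_deleted(self, slug: str, page: str) -> None:
        self._registry.get(slug).delete(page)

    def page_renamed(self, slug: str, old: str, new: str) -> None:
        index = self._registry.get(slug)
        vector = index.vectors.pop(old, None)
        if vector is not None:
            index.upsert(new, vector)


def toy_embed(text: str) -> List[float]:
    # Placeholder for the real embedding model: counts two keywords.
    return [float(text.count("cat")), float(text.count("dog"))]


registry = BackendRegistry()
hooks = HookListener(registry, toy_embed)
hooks.page_saved("docs", "Pets", "cat cat dog")
hooks.page_saved("docs", "Dogs", "dog dog")
hooks.page_renamed("docs", "Dogs", "Canines")
results = registry.get("docs").search(toy_embed("dog"), k=1)
print(results)
```

Because each hook call returns only after the index has been updated, a search issued right after a save already sees the new page. Each wiki slug gets its own `WikiIndex`, so updates to one wiki never touch another wiki's vectors.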

### REST API
- `GET /api/v1/semantic-search` — query semantic search
- `POST /api/v1/reindex` — trigger full reindex for a wiki
- `GET /api/v1/reindex/status` — check reindex progress
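
As a rough illustration of how a client might address the search endpoint, the sketch below builds a request URL with the standard library. Only the endpoint path comes from this page. The query parameter names (`q`, `wiki`, `limit`) and the base URL are hypothetical placeholders, since the request schema is not documented here.

```python
from urllib.parse import urlencode, urljoin

SEARCH_PATH = "/api/v1/semantic-search"  # endpoint path listed above


def build_search_url(base_url: str, query: str, wiki: str, limit: int = 10) -> str:
    # NOTE: `q`, `wiki`, and `limit` are assumed parameter names, not the
    # documented schema; adjust them to match the real API.
    params = urlencode({"q": query, "wiki": wiki, "limit": limit})
    return f"{urljoin(base_url, SEARCH_PATH)}?{params}"


# The host and port here are placeholders for wherever the service runs.
url = build_search_url("http://localhost:8000", "index rebuild", "docs")
print(url)
```

The resulting URL can be passed to any HTTP client, for example `urllib.request.urlopen`.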

### MCP integration
- `semantic_search` MCP tool calls the REST API

### Tests
- Tests exist and pass

## Resolved and deferred issues

### 1. Multi-tenant indexing — RESOLVED
Per-wiki FAISS directories (`/srv/data/faiss/{slug}/`), per-wiki state managed by `BackendRegistry`.

### 2. In-process embedding risks — RESOLVED
Synchronous lifecycle hooks replace the daemon sync thread. This removes the risk of a SIGTERM killing a background thread mid-write and corrupting the index.

### 3. Sync frequency — RESOLVED
Hook-based updates are immediate on page save/delete/rename. No polling.

### 4. FAISS sidecar scalability — DEFERRED
Not blocking. Current corpus sizes are well within the estimated ~10K chunk threshold. Revisit if the corpus grows close to that size.
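
For a sense of scale, the raw vector storage of a flat index at that threshold can be estimated with simple arithmetic. This is a back-of-the-envelope sketch: it assumes 384-dimensional float32 vectors (the output width of `all-MiniLM-L6-v2`) and counts only the vectors themselves, not metadata or sidecar files. It does not claim to reproduce how the ~10K estimate was derived.

```python
EMBEDDING_DIM = 384     # output width of all-MiniLM-L6-v2
BYTES_PER_FLOAT32 = 4   # flat FAISS indexes store uncompressed float32 vectors


def flat_index_bytes(num_chunks: int, dim: int = EMBEDDING_DIM) -> int:
    """Raw vector storage for a flat (uncompressed) index."""
    return num_chunks * dim * BYTES_PER_FLOAT32


def multiply_adds_per_query(num_chunks: int, dim: int = EMBEDDING_DIM) -> int:
    """A flat index compares the query against every stored vector."""
    return num_chunks * dim


threshold_bytes = flat_index_bytes(10_000)
print(f"{threshold_bytes / 1_000_000:.2f} MB")   # 15.36 MB
print(multiply_adds_per_query(10_000))           # 3840000
```

Both figures grow linearly with the number of chunks, which is why a flat index stays practical at this size and only needs revisiting for a much larger corpus.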

## Related
- [[Tasks/Semantic_Search_Multi_Tenant]] — multi-tenant implementation details
- [[Design/Async_Embedding_Pipeline]] — original FAISS + MiniLM design (AWS, archived)