2026-03-16 18:06:44 Claude (MCP):
[mcp] [design] Add deployment notes from review findings: reindex, sidecar growth, escaping, deploy order
Design/Semantic_Search_V2.md ..
@@ 158,6 158,16 @@
**Reindexing:** Changes 1-3 require a full reindex after deployment. The new chunk boundaries and metadata fields are only populated for newly indexed content. `POST /api/v1/reindex` handles this.
+
## Deployment notes
+
+
**Reindex is mandatory.** Deploy the new `otterwiki-semantic-search` code, then immediately `POST /api/v1/reindex` on each wiki instance. Until the reindex completes, search results for old-format chunks (those missing `section`, `section_path`, `total_chunks`, and `page_word_count`) will carry `None` for those fields. The search layer reads them with `.get()`, so it won't crash, but the enriched MCP formatting will degrade silently.
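
Because the degradation is silent, a consumer may want to check whether results still come from pre-reindex chunks. The following is a minimal sketch, assuming search results are plain dicts and that a reindexed chunk always populates `total_chunks` and `page_word_count`. The names `REQUIRED_AFTER_REINDEX`, `is_old_format`, and `count_stale` are hypothetical and not part of `otterwiki-semantic-search`.

```python
# Hypothetical helper: these names are illustrative, not part of the real codebase.
# Assumption: after a reindex, every chunk carries these two numeric fields.
# `section` and `section_path` are left out because content before the first
# heading might legitimately have no section.
REQUIRED_AFTER_REINDEX = ("total_chunks", "page_word_count")


def is_old_format(result: dict) -> bool:
    """Return True if the result looks like it came from a pre-reindex chunk."""
    return any(result.get(field) is None for field in REQUIRED_AFTER_REINDEX)


def count_stale(results: list[dict]) -> int:
    """Count results that still lack the new metadata."""
    return sum(1 for result in results if is_old_format(result))
```

A non-zero `count_stale` after deployment would suggest the reindex has not finished (or was never triggered) on that instance.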
+
+
**FAISS sidecar growth.** New metadata fields add ~160 bytes per chunk to `embeddings.json`. For a 10,000-chunk index, the sidecar grows from ~1.4MB to ~2.9MB. The FAISS backend loads the full sidecar into memory on startup and re-serializes it on every upsert, so both memory use and write cost scale with sidecar size. This is acceptable for current corpus sizes but worth monitoring for large multi-tenant deployments.
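
The growth estimate is simple arithmetic and can be reproduced for other index sizes. This is a rough sketch using the approximate figures quoted above; `estimate_sidecar_bytes` is a hypothetical helper, and real sizes depend on heading lengths and JSON encoding.

```python
def estimate_sidecar_bytes(current_bytes: int, chunks: int, added_per_chunk: int = 160) -> int:
    """Estimate the sidecar size after adding the new metadata fields.

    added_per_chunk defaults to the ~160 bytes per chunk quoted in the note.
    """
    return current_bytes + chunks * added_per_chunk


# 10,000 chunks at ~160 bytes each adds about 1.6 MB to a ~1.4 MB sidecar.
estimated = estimate_sidecar_bytes(1_400_000, 10_000)
print(f"{estimated / 1_000_000:.1f} MB")  # prints "3.0 MB"
```

With both inputs being approximations, this lands close to the ~2.9MB figure given in the note.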
+
+
**Heading content in results.** `section`, `section_path`, and the `[prefix]` in `text`/`snippet` contain raw heading text from wiki pages. Consumers rendering these fields as HTML must escape them. The API returns JSON (`Content-Type: application/json`), so the API layer itself is safe.
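
For consumers that do render these fields as HTML, escaping can be applied just before rendering. The sketch below uses the standard library's `html.escape` and assumes results are plain dicts. `escape_result_fields` is a hypothetical helper, not part of the API.

```python
import html

# Fields named in the note above as carrying raw heading text.
HEADING_FIELDS = ("section", "section_path", "text", "snippet")


def escape_result_fields(result: dict, fields: tuple = HEADING_FIELDS) -> dict:
    """Return a copy of the result with string-valued heading fields HTML-escaped."""
    escaped = dict(result)
    for field in fields:
        value = escaped.get(field)
        # Old-format chunks may carry None here; leave non-strings untouched.
        if isinstance(value, str):
            escaped[field] = html.escape(value)
    return escaped
```

Returning a copy keeps the original JSON payload intact for consumers that do not render HTML.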
+
+
**Deploy order.** `otterwiki-semantic-search` must be deployed and reindexed before the `otterwiki-mcp` changes are useful. The MCP `section` parameter on `read_note` is independent (it parses content client-side), but `format_semantic_results` expects the new result fields, which only appear after the reindex.
+
## What this design does NOT address
- **Embedding model upgrade.** MiniLM-L6-v2's 256-token window is a real constraint but adequate for ~150-word chunks with header prefixes. A model upgrade (to 512-token context) would allow larger chunks and is worth evaluating separately.