Semantic Search
The most useful kind of search for a long-term writing project isn’t keyword search. It’s meaning search. You have a vague recollection of writing something about a specific idea, you don’t remember the exact words you used, and you want to find it anyway. Keyword search fails here: you type the thing you think you said, and you don’t find your own work because you actually said it differently. This is the default failure of every text search tool that works on literal string matching, and it gets worse the larger your project grows. By the time you have 300,000 words of manuscript and 500 Legendry entries, trying to find “that passage about the cost of magic” via Ctrl+F is genuinely painful.
The solution is semantic search, an approach that indexes text by its meaning (represented as a high-dimensional vector) rather than by the literal words, so that queries find matches based on conceptual similarity instead of exact string match. This is what Ishvana does under the hood for every major search surface, and the technology that powers it is interesting enough to be worth a dedicated page.
The core idea
Every piece of text Ishvana cares about (lore entries, research bookmarks, chat history, document excerpts, outline nodes) gets converted into a vector embedding when it’s created or updated. An embedding is a list of floating-point numbers (Ishvana uses 1024 dimensions, produced by the bundled multilingual-e5-large-instruct model) that represents the meaning of the text. Two pieces of text that mean similar things have similar embeddings: their vectors point in similar directions in the high-dimensional space. Two pieces of text that mean different things have different embeddings.
When you search, your query gets converted to an embedding using the same model. Ishvana then looks for indexed embeddings that are nearest to your query embedding in vector space. The nearest matches are the most semantically similar content to what you asked for.
The crucial property of this system: you don’t have to share any actual words with the content you’re searching for. A query like “the relationship between magic and personal cost” can find a lore entry about “power has a price” even though the two share almost no literal words. They mean similar things, so their embeddings are close, so the search finds it.
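To make "close in vector space" concrete, here is a minimal sketch of nearest-match ranking by cosine similarity. The three-dimensional vectors are hand-made stand-ins for real 1024-dimensional embeddings, and the function and variable names are illustrative assumptions, not Ishvana's code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors invented for illustration; a real model would produce these.
toy_embeddings = {
    "power has a price": [0.9, 0.1, 0.2],
    "harvest festival customs": [0.1, 0.9, 0.3],
}

# Stand-in for the embedding of
# "the relationship between magic and personal cost".
query_vector = [0.8, 0.2, 0.1]

# Rank indexed texts by how closely their vectors align with the query.
ranked = sorted(
    toy_embeddings,
    key=lambda text: cosine_similarity(query_vector, toy_embeddings[text]),
    reverse=True,
)
print(ranked[0])
```

The query and the top match share no words here; only the direction of their vectors decides the ranking.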
ChromaDB
The vector database Ishvana uses is called ChromaDB. It’s an open-source library designed specifically for storing and querying vector embeddings. It lives as a file-based database in data/chromadb/ (same data directory as everything else), and it’s accessed by Ishvana’s backend whenever a search involves vectors.
ChromaDB handles three things:
- Storing vectors. When you add a lore entry or a research bookmark, its embedding gets stored in the appropriate ChromaDB collection.
- Querying vectors. Given a query vector, ChromaDB returns the N nearest vectors in the collection, ranked by similarity. Results include the original text and metadata alongside a distance score (smaller means more similar).
- Managing collections. Different content types live in different collections — lore entries in one, research bookmarks in another, document chunks in a third. Queries can be scoped to specific collections or run across all of them.
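The three responsibilities above can be sketched with a small in-memory stand-in. This is deliberately not ChromaDB's API: `ToyCollection`, its method names, and the two-dimensional vectors are assumptions made up for illustration, and a real vector database adds persistence and faster nearest-neighbor indexing.

```python
import math
from dataclasses import dataclass, field

@dataclass
class ToyCollection:
    """In-memory stand-in for one vector collection (not ChromaDB's API)."""
    records: dict = field(default_factory=dict)  # id -> (vector, text, metadata)

    def add(self, record_id, vector, text, metadata=None):
        """Store a vector with its source text and metadata."""
        self.records[record_id] = (vector, text, metadata or {})

    def query(self, query_vector, n_results=3):
        """Return the n nearest records as (distance, id, text, metadata)."""
        scored = [
            (math.dist(query_vector, vector), record_id, text, metadata)
            for record_id, (vector, text, metadata) in self.records.items()
        ]
        return sorted(scored, key=lambda row: row[0])[:n_results]

# One collection per content type keeps scoped searches separate.
collections = {"lore": ToyCollection(), "research": ToyCollection()}
collections["lore"].add("lore-1", [0.9, 0.1], "Power has a price", {"type": "concept"})
collections["lore"].add("lore-2", [0.1, 0.9], "Harvest festival", {"type": "event"})
collections["research"].add("bm-1", [0.8, 0.2], "Medieval logistics", {"type": "bookmark"})

# A "lore only" search never touches the research collection.
hits = collections["lore"].query([1.0, 0.0], n_results=1)
```

Each hit carries its distance, ID, text, and metadata, so the caller can display the result and link back to the source entry.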
Why ChromaDB specifically rather than one of the alternatives (Weaviate, Qdrant, Pinecone, FAISS): because it’s file-based and embedded. There’s no separate database server to run. The whole thing lives as a folder on your disk, same as SQLite. That means ChromaDB inherits the same “local-first, no daemon” properties as the rest of Ishvana’s data storage. No network. No running service. No setup beyond “Ishvana’s backend includes the ChromaDB library.”
Local embedding model
Ishvana’s embedding model runs locally. It doesn’t send your text to a cloud API to generate embeddings. This matters for three reasons:
- Privacy. Your manuscript, your lore, your research notes — all of it gets embedded on your machine, not a third party’s server. Even the search index is yours alone.
- Cost. Cloud embedding APIs are cheap but not free, and a large project has a lot of embeddings. A 500-entry Legendry with rich section content might have thousands of embeddings. A research library with hundreds of bookmarks adds more. Running the embedding model locally costs zero dollars in API fees.
- Speed. Local embeddings run in tens of milliseconds for a short query. Cloud embeddings add round-trip latency. For interactive search that should feel instant, local is the only option.
The specific model used is intfloat/multilingual-e5-large-instruct, a transformer model that produces 1024-dimensional embeddings. Its base is XLM-RoBERTa-large, pretrained on CC-100 only (2.5 TB of filtered Common Crawl across 100 languages), with no BookCorpus, no scraped fiction, and no copyrighted-novel corpus anywhere in its training pipeline. That last detail matters for commercial-use authors: the embedding pipeline can’t have memorized prose it shouldn’t have, because none of that prose was ever in the training data. The model is MIT-licensed and bundled with the app. (Earlier Ishvana releases shipped MiniLM-L6 and all-mpnet-base-v2; the current model was swapped in on 2026-04-25, when an audit found that the prior model’s base inherited BookCorpus through RoBERTa pretraining.)
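One practical detail of instruct-style E5 models is that queries are wrapped in a short task instruction before embedding, while indexed documents are embedded as plain text. The sketch below shows that formatting step only. The task wording is a hypothetical example, and whether Ishvana formats its queries exactly this way is an assumption.

```python
def format_query(task_description: str, query: str) -> str:
    """Wrap a search query in the instruction template used by e5-instruct models."""
    return f"Instruct: {task_description}\nQuery: {query}"

def format_document(text: str) -> str:
    """Indexed documents are embedded as-is, with no instruction prefix."""
    return text

# Hypothetical task description; the wording Ishvana uses is not documented here.
TASK = "Given a writer's search query, retrieve relevant passages from their notes"

query_text = format_query(TASK, "the cost of magic")
document_text = format_document("Power has a price.")
print(query_text)
```

The asymmetry matters: the query and the document go through the same model, but only the query carries the instruction.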
What gets indexed
Several categories of content get indexed for semantic search:
- Lore entries. Every entry in your Legendry — title, summary, sections, tags — gets embedded and stored in the lore collection. When you search the Legendry, the query runs against these embeddings.
- Research bookmarks. Every page you save as a Smart Bookmark gets its content embedded, along with its title and generated summary. The research panel’s semantic search queries this collection.
- Document chunks. When you have a long manuscript, Ishvana breaks it into smaller chunks (usually paragraph-level or scene-level) and embeds each chunk. Search within your manuscript queries this collection.
- YouTube transcripts. Transcribed YouTube content gets embedded at the segment level.
- Entity mentions. Entity references detected by the entity extractor get embedded with their surrounding context.
Each collection has its own namespace in ChromaDB, so a search scoped to “lore only” doesn’t accidentally pull in research bookmarks or document chunks.
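As an illustration of the document-chunk category, here is a minimal sketch of paragraph-level chunking. It assumes paragraphs are separated by blank lines and packs them into chunks under a character budget. The function name and the `max_chars` default are assumptions, not Ishvana's actual chunker.

```python
def chunk_paragraphs(text: str, max_chars: int = 1200) -> list[str]:
    """Split on blank lines, then pack paragraphs into chunks up to max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            # The budget would be exceeded: close the chunk and start a new one.
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

manuscript = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
chunks = chunk_paragraphs(manuscript, max_chars=40)

# Each chunk would get its own ID and its own embedding.
chunk_ids = [f"chunk-{i}" for i in range(len(chunks))]
```

Chunking on paragraph boundaries keeps each embedded unit a coherent piece of prose, which tends to give more meaningful vectors than cutting at a fixed character offset.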
Hybrid search: keyword + semantic
Pure semantic search has a specific failure mode: it’s sometimes too abstract. A query like “Crimson Hyenas” is a literal name search, and the author wants entries that literally contain “Crimson Hyenas,” not entries that are conceptually adjacent to “outlaw bands.” Pure semantic search might rank “Heaven’s Fall” (another outlaw faction) higher than the actual Crimson Hyenas entry, which is wrong.
Ishvana’s default search is hybrid — it runs both a keyword search and a semantic search, and combines the results with a weighted ranking. Exact name matches get a strong boost. Semantic matches get a softer boost. The final ranked list has both the literal matches at the top and the meaning-adjacent matches below.
The weighting is tuned so that a query that’s clearly a name search (short, proper-noun-heavy) favors keyword matches. A query that’s clearly a concept search (longer, descriptive, no proper nouns) favors semantic matches. You don’t explicitly pick which mode to use — the search infers from the query shape.
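The query-shape idea can be sketched as a weighted blend of two score maps. Everything specific here is an assumption for illustration: the capitalization heuristic, the 0.2 and 0.8 weights, and the function names are invented, and Ishvana's tuned ranking is likely more involved.

```python
def semantic_weight(query: str) -> float:
    """Guess how much to trust semantic scores from the shape of the query."""
    words = query.split()
    if not words:
        return 0.5
    capitalized = sum(1 for w in words if w[0].isupper())
    # Short and mostly capitalized looks like a name search.
    looks_like_name = len(words) <= 3 and capitalized / len(words) >= 0.5
    return 0.2 if looks_like_name else 0.8

def hybrid_rank(query, keyword_scores, semantic_scores):
    """Blend two {doc_id: score in [0, 1]} maps into one ranked list of IDs."""
    w = semantic_weight(query)
    doc_ids = set(keyword_scores) | set(semantic_scores)
    blended = {
        doc_id: (1 - w) * keyword_scores.get(doc_id, 0.0)
        + w * semantic_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    return sorted(blended, key=blended.get, reverse=True)

# Toy scores: semantic search alone would put "heavens-fall" first.
keyword_scores = {"crimson-hyenas": 1.0}
semantic_scores = {"heavens-fall": 0.82, "crimson-hyenas": 0.78}

ranking = hybrid_rank("Crimson Hyenas", keyword_scores, semantic_scores)
```

With the name-shaped query, the exact keyword match dominates the blend, so the literal entry outranks the conceptually adjacent one.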
When semantic search shines
A few specific use cases where semantic is meaningfully better than keyword:
- Thematic lookups across a large corpus. “What are the scenes in my manuscript where a character is grieving?” Keyword search can’t find these because “grief” doesn’t necessarily appear in the text. Semantic search finds scenes with grief-adjacent language (mourning, loss, absence, emptiness) even when the word “grief” isn’t there.
- Concept recall across research. “Find the bookmark I saved about medieval logistics.” You don’t remember the exact title or the exact author. Semantic search finds it anyway if the content was about that topic.
- Cross-entity conceptual lookups. “What lore entries touch on the idea of sacrifice being central to power?” This is a theme query across your world, and semantic search can surface every character, location, event, and concept that has sacrifice-as-power language even when the exact phrase isn’t in any of them.
- Similar content detection. “Find lore entries that are similar to this one.” Used internally by the Lore ML pipeline to cluster entries, detect near-duplicates, and suggest missing cross-links.
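Near-duplicate detection, one of the similar-content uses, can be sketched as a pairwise similarity check against a threshold. The entry IDs, toy vectors, and the 0.95 threshold are assumptions for illustration; how the Lore ML pipeline actually clusters entries is not described here.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def near_duplicates(embeddings, threshold=0.95):
    """Return (id_a, id_b, similarity) for every pair at or above threshold."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        similarity = cosine(vec_a, vec_b)
        if similarity >= threshold:
            pairs.append((id_a, id_b, round(similarity, 3)))
    return pairs

# Toy vectors invented for illustration.
entries = {
    "blood-magic": [0.90, 0.10, 0.30],
    "blood-magic-draft": [0.88, 0.12, 0.31],
    "river-trade": [0.10, 0.95, 0.20],
}
duplicate_pairs = near_duplicates(entries)
```

Comparing every pair grows quadratically with the number of entries, which is fine for a few hundred lore entries but would call for a nearest-neighbor query per entry on a much larger corpus.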
When semantic search fails
Honest limitations:
- Very short queries. A one-word query doesn’t carry much meaning for the embedding model, so the matches can be random. Short queries work better as keyword searches.
- Specific proper nouns. “Where did I mention Kent Musa” is a name search, not a concept search. Semantic doesn’t add value and sometimes adds noise. The hybrid ranking usually handles this correctly, but not always.
- Negation. “Find lore entries that don’t mention magic” doesn’t work — semantic search returns similarity-ranked results, not inverse matches. Use filters instead.
- No recency awareness. ChromaDB doesn’t natively rank by date. If two lore entries are equally relevant semantically but one was created last week and the other two years ago, recency isn’t automatically a tiebreaker. You’d have to sort after retrieval.
- Embedding model drift. The embeddings are computed with a specific model at indexing time. If Ishvana’s default embedding model changes in a future version, older embeddings belong to a different model’s vector space, so comparing them against new query embeddings gives unreliable results. When this happens, a one-time re-indexing is required. Ishvana handles it automatically on version upgrade, but it takes time.
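Two of the workarounds above, filtering for negation and sorting after retrieval for recency, can be sketched as post-processing on a list of hits. The hit structure, field names, and rounding precision are assumptions made for this example.

```python
from datetime import date

def exclude_tag(hits, tag):
    """Handle 'does not mention X' as a metadata filter, not a similarity query."""
    return [hit for hit in hits if tag not in hit["tags"]]

def rank_with_recency(hits, precision=2):
    """Sort by similarity rounded to `precision`; newer entries win ties."""
    return sorted(
        hits,
        key=lambda hit: (round(hit["similarity"], precision), hit["created"]),
        reverse=True,
    )

# Toy retrieval results; the first two are effectively tied on similarity.
hits = [
    {"id": "old-oath", "similarity": 0.812, "created": date(2024, 3, 1), "tags": ["magic"]},
    {"id": "new-oath", "similarity": 0.809, "created": date(2026, 4, 1), "tags": ["politics"]},
    {"id": "grain-tax", "similarity": 0.640, "created": date(2025, 1, 15), "tags": ["economy"]},
]

ranked_ids = [hit["id"] for hit in rank_with_recency(hits)]
non_magic_ids = [hit["id"] for hit in exclude_tag(hits, "magic")]
```

Rounding the similarity before sorting is what turns "nearly equal" scores into a tie that the creation date can break.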
How embeddings get generated
The workflow for a new piece of content:
- Content is created (you add a lore entry, save a bookmark, write a new document).
- The backend runs the embedding model on the relevant text (entry title + summary + sections for a lore entry; page title + content + generated summary for a bookmark; chunk text for a document chunk).
- The resulting vector gets stored in the appropriate ChromaDB collection with a reference back to the source (entry ID, bookmark ID, chunk ID).
- Future searches against that collection can find this content via vector similarity.
The whole process happens in the background as a non-blocking operation. You add a lore entry and the UI shows the entry immediately — the embedding happens asynchronously and the entry becomes searchable within a few seconds. If you search for the entry before the embedding has been generated, it won’t appear in semantic results (only in keyword results), but it will appear as soon as the background embedding job completes.
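The non-blocking behavior described above can be sketched with a queue and a worker thread. This is an illustrative assumption about the general pattern, not Ishvana's backend: `fake_embed` is a placeholder for the real model, and the dictionary stands in for the vector collection.

```python
import queue
import threading

def fake_embed(text):
    """Placeholder for the real embedding model (assumption for this sketch)."""
    return [float(len(text)), float(text.count(" "))]

class BackgroundIndexer:
    """Accepts content immediately and embeds it on a worker thread."""

    def __init__(self):
        self.pending = queue.Queue()
        self.index = {}  # entry_id -> vector; stands in for the vector collection
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, entry_id, text):
        """Return right away; the embedding is computed in the background."""
        self.pending.put((entry_id, text))

    def _worker(self):
        while True:
            entry_id, text = self.pending.get()
            self.index[entry_id] = fake_embed(text)
            self.pending.task_done()

    def wait_until_idle(self):
        """Block until every queued item has been embedded."""
        self.pending.join()

indexer = BackgroundIndexer()
indexer.submit("lore-42", "Power has a price")
indexer.wait_until_idle()  # a real UI would not block; this is only for the demo
searchable = "lore-42" in indexer.index
```

Between `submit` returning and the worker finishing, the entry exists but has no vector yet, which matches the window in which it shows up only in keyword results.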