hallouminate
A markdown corpus indexer for LLMs to build and query their own per-repo wikis. Hallouminate stores markdown verbatim on disk, embeds it with fastembed, indexes the embeddings in LanceDB, and exposes a small MCP surface so an LLM can author and search a per-repo knowledge base without leaving its agent loop.
The filesystem is the source of truth; LanceDB rows are derived and refreshed
automatically when an LLM writes via add_markdown, or in bulk via
hallouminate index. Code files (.rs, .toml, …) can also be indexed as
text for semantic search, but hallouminate does no structural analysis — it’s
a wiki indexer that happens to tolerate code, not a code-intelligence tool.
Why a daemon
A long-lived local daemon owns the LanceDB ground directory, per-corpus mutation locks, and config resolution. The CLI and the stdio MCP server both talk to it over a Unix domain socket — one owner, no cross-process LanceDB races. See Architecture for the full picture.
Where to go next
- Installation — install the binary and register the MCP server with your agent.
- CLI reference — every subcommand and its flags.
- MCP surface — the nine tools an LLM calls to author and search wikis.
- Configuration — the XDG baseline, repo-layer merge, and embedding-model options.
- Dogfooding — this repo maintains its own wiki with hallouminate; here’s how to read it.
License
MIT — see LICENSE.
Installation
The fastest path is the prebuilt-binary installer — no Rust toolchain, no
protoc, no compile. Build from source with cargo only if your platform has no
prebuilt, or you want a development checkout.
Prebuilt binary (recommended)
Downloads a prebuilt hallouminate for your platform and adds it to your PATH:
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/paulnsorensen/hallouminate/releases/latest/download/hallouminate-installer.sh | sh
Verify it:
hallouminate --version
Prebuilt binaries are published for each release:
| Platform | Target |
|---|---|
| macOS, Apple Silicon | aarch64-apple-darwin |
| Linux, arm64 | aarch64-unknown-linux-gnu (glibc ≥ 2.39) |
| Linux, x86-64 | x86_64-unknown-linux-gnu (glibc ≥ 2.39) |
Re-run the one-liner any time to upgrade to the latest release. On an Intel Mac, on Windows, or on Linux with an older glibc, build from source with cargo (see the sections that follow).
From crates.io
Builds from source, so it works on any platform with a Rust toolchain — at the cost of compiling native dependencies (a few minutes).
Prerequisites
- A Rust toolchain (cargo): see https://rustup.rs.
- protoc (the Protocol Buffers compiler): the lancedb build needs it.
  - macOS: brew install protobuf
  - Debian/Ubuntu: apt install protobuf-compiler
cargo install hallouminate --locked
The binary installs to ~/.cargo/bin/hallouminate (make sure that’s on your
PATH).
From source
Same prerequisites as crates.io. Clone and build:
git clone https://github.com/paulnsorensen/hallouminate.git
cd hallouminate
cargo build --release
The binary lands in target/release/hallouminate.
Register the MCP server
Point your agent at hallouminate serve — the stdio MCP server that exposes
the wiki tools. With Claude Code:
claude mcp add hallouminate -- hallouminate serve
hallouminate serve auto-spawns the daemon if none is running, so there’s no
separate process to manage.
Bootstrap a config
hallouminate config init # scaffold the XDG baseline config
hallouminate config validate # confirm it parses
See Configuration for what goes in the config and how the
XDG baseline merges with a repo-layer .hallouminate/config.toml.
Claude Code skill pack
A Claude Code plugin ships in this repo under
plugins/hallouminate.
It installs hallouminate and bootstraps your first wiki interactively:
/plugin marketplace add paulnsorensen/hallouminate
/plugin install hallouminate@hallouminate
/hallouminate:install
/hallouminate:install installs the binary, registers the MCP server, then asks
where and how your first wiki should live (Socratic style) before scaffolding,
indexing, and committing it with git.
CLI reference
hallouminate is a single binary. The CLI, the MCP server, and the daemon are
all the same executable; CLI subcommands dial the daemon over a Unix domain
socket (see Architecture).
| Command | Purpose |
|---|---|
| hallouminate serve | Run the stdio MCP server (auto-spawns the daemon if down). |
| hallouminate index [--corpus NAME] | Bulk (re)index one corpus, or every configured corpus. |
| hallouminate ground "<query>" [flags] | Semantic search from the CLI. |
| hallouminate daemon <run\|stop\|restart\|status> | Manage the long-lived daemon. |
| hallouminate config <init\|show\|validate\|download> | Inspect or scaffold config. |
| hallouminate hook <install\|uninstall> | Manage the per-repo discovery hook. |
hallouminate --version prints the version; hallouminate --help and
hallouminate <command> --help print usage for any subcommand.
serve
hallouminate serve
Starts the stdio MCP server an agent connects to. It is stateless beyond its
tool router and a startup-captured working directory — every tool call dials
the daemon. If no daemon is running, serve spawns one.
index
hallouminate index # rebuild every configured corpus
hallouminate index --corpus repo:hallouminate:wiki
Use this when files were touched outside hallouminate. Writes that go through
add_markdown already auto-reindex just the changed file.
ground
hallouminate ground "how does the daemon work"
hallouminate ground "socket protocol" --corpus repo:hallouminate:wiki --format json-pretty
| Flag | Effect |
|---|---|
| --corpus NAME | Corpus to search (defaults to the wiki for the current repo). |
| --format outline\|json\|json-pretty | Output shape. outline (default) is a ripgrep-style digest. |
| --full | Return full chunk bodies instead of snippets. |
| --top-files N | Number of files to roll up. |
| --chunks-per-file N | Chunks to include per file. |
| --limit N | Hard cap on returned chunks. |
| --snippet-chars N | Snippet length when not using --full. |
daemon
hallouminate daemon run # run in the foreground
hallouminate daemon status # is one running?
hallouminate daemon stop
hallouminate daemon restart
The daemon is the single owner of the LanceDB ground directory. Restart it
after editing the baseline config; repo-layer edits take effect on the
next request without a restart. --config PATH overrides the baseline config
path.
config
hallouminate config init # scaffold the XDG baseline config
hallouminate config show # print the effective merged config for this cwd
hallouminate config validate # parse + flag unknown top-level keys
hallouminate config download # pre-fetch the configured embedding model
See Configuration.
hook
hallouminate hook install [--repo PATH]
hallouminate hook uninstall [--repo PATH]
Installs or removes a per-repo discovery hook. --repo PATH targets a repo
other than the current directory.
Socket override
--socket PATH on index, ground, and the other client subcommands points
at a specific daemon socket. Otherwise the socket is resolved from
HALLOUMINATE_SOCKET, then $XDG_RUNTIME_DIR, then the cache dir — see
Architecture.
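The lookup order above can be sketched in a few lines of Python. This is an illustration of the documented precedence only, not the code in the binary: the function name `resolve_socket_path` is hypothetical, and treating an empty variable the same as an unset one is an assumption.

```python
import os
from pathlib import Path


def resolve_socket_path(env=None):
    """Illustrative only: mirror the documented socket lookup order."""
    env = os.environ if env is None else env

    # 1. Explicit per-process override.
    override = env.get("HALLOUMINATE_SOCKET")
    if override:
        return Path(override)

    # 2. Default when a runtime dir exists.
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    if runtime_dir:
        return Path(runtime_dir) / "hallouminate" / "daemon.sock"

    # 3. Cache-dir fallback: ${XDG_CACHE_HOME:-~/.cache}.
    cache_home = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(cache_home) / "hallouminate" / "daemon.sock"
```

A --socket PATH flag would simply bypass this function and use the given path directly.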
MCP surface
hallouminate serve starts a stdio MCP server. It is stateless beyond its
tool router and a startup-captured working directory; every tool call dials
the local daemon over a Unix domain socket, and serve auto-spawns the daemon
if none is up.
Default corpus
Read-side tools (ground, list_files, list_tree) that omit corpus
default to the wiki for the repository containing the daemon’s working
directory — repo:<NAME>:wiki for the deepest [[repository]] whose path
is an ancestor of the cwd. When the cwd sits under no configured repo, the
caller must name a corpus explicitly.
The mutating tools (add_markdown, delete_markdown) and read_markdown
always require an explicit corpus, to avoid accidental writes to the
wrong wiki or ambiguous reads.
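As a rough illustration of the "deepest ancestor wins" rule, here is a small Python sketch. The function name, the dict-of-repositories input, and the use of POSIX paths are assumptions made for the example; only the rule itself and the repo:NAME:wiki naming come from the text above.

```python
from pathlib import PurePosixPath


def default_wiki_corpus(cwd, repositories):
    """Pick repo:<NAME>:wiki for the deepest configured repo containing cwd.

    `repositories` maps a repository name to its root path (hypothetical shape).
    """
    cwd_path = PurePosixPath(cwd)
    best = None  # (name, root_path) of the deepest match so far
    for name, root in repositories.items():
        root_path = PurePosixPath(root)
        is_ancestor = cwd_path == root_path or root_path in cwd_path.parents
        if is_ancestor and (best is None or len(root_path.parts) > len(best[1].parts)):
            best = (name, root_path)
    if best is None:
        # Matches the documented behavior: the caller must name a corpus.
        raise ValueError("cwd is under no configured repository; pass corpus explicitly")
    return f"repo:{best[0]}:wiki"
```

With a monorepo at /src/mono and a nested repo at /src/mono/tools/tool, a cwd inside the nested repo resolves to the nested repo's wiki rather than the monorepo's.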
The nine tools
list_corpora
Every corpus the daemon knows about — explicit [[corpus]] entries plus
derived repo:NAME:wiki and repo:NAME:corpus corpora. No params. Call this
first to learn what’s available.
list_files
The files currently visible in a corpus, honoring its paths/globs/exclude
rules. Param: corpus (defaults to wiki-for-cwd). Returns an array of
{path, absolute_path}.
list_tree
The same files as list_files, grouped into a {path, absolute_path, files, subdirs} tree. Subdirs with no markdown beneath them are pruned. Use this for
progressive disclosure — navigate the wiki without reading every index.md
first. Param: corpus (defaults to wiki-for-cwd).
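To make the relationship between list_files and list_tree concrete, the sketch below folds a flat list of relative paths into a nested tree. It is a simplified assumption of the shape: absolute_path is omitted, and subdirs is a dict keyed by directory name, which may not match the real response. Because the tree is built only from files that exist in the listing, directories with nothing beneath them never appear, which is one simple way to get the pruning described above.

```python
def build_tree(paths):
    """Group relative file paths into a nested {path, files, subdirs} tree."""
    root = {"path": "", "files": [], "subdirs": {}}
    for rel_path in sorted(paths):
        parts = rel_path.split("/")
        node = root
        # Walk (and create) one node per directory component.
        for depth, dirname in enumerate(parts[:-1]):
            node = node["subdirs"].setdefault(
                dirname,
                {"path": "/".join(parts[: depth + 1]), "files": [], "subdirs": {}},
            )
        node["files"].append(rel_path)
    return root
```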
ground
Semantic search. Embeds the query with the configured embedding model
(default snowflake/snowflake-arctic-embed-s), retrieves top chunks from
LanceDB, and rolls up per-file with breadcrumb context. Params: query
(required), corpus, top_files, chunks_per_file, limit, snippet_chars.
Returns a ripgrep-style outline in content and the full structured response
in structuredContent.docs.
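The per-file roll-up can be pictured with the following Python sketch. How hallouminate actually ranks files and how limit interacts with the other two knobs is not stated here, so the choices in the code (rank files by their best chunk score, treat limit as a cap on the total number of chunks) are assumptions, as are the function name and the chunk dict shape.

```python
from collections import defaultdict


def roll_up(chunks, top_files, chunks_per_file, limit):
    """Group scored chunks by file, keeping the best files and chunks.

    Each chunk is a dict with at least "path" and "score" (hypothetical shape).
    """
    by_file = defaultdict(list)
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        by_file[chunk["path"]].append(chunk)

    # Assumption: a file ranks by its single best chunk.
    ranked = sorted(by_file.items(), key=lambda item: item[1][0]["score"], reverse=True)

    docs, remaining = [], limit
    for path, file_chunks in ranked[:top_files]:
        if remaining <= 0:
            break
        kept = file_chunks[: min(chunks_per_file, remaining)]
        remaining -= len(kept)
        docs.append({"path": path, "chunks": kept})
    return docs
```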
add_markdown
Atomic-write a markdown file to the corpus’ first configured root, then refresh
just that file’s LanceDB rows. For repo:*:wiki corpora it also rebuilds the
link list inside each ancestor index.md between the
<!-- HALLOUMINATE:INDEX-START --> / <!-- HALLOUMINATE:INDEX-END -->
markers — scaffolding a missing index.md, preserving prose outside the
markers, and leaving marker-less files alone. Params: corpus, path,
content, overwrite (default false). Symlinks and parent-dir escapes are
rejected by the sandbox. Returns advisory lint warnings (empty-destination
links, empty mermaid blocks, heading-level jumps) without blocking the write.
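The marker-bounded rewrite of index.md can be sketched as follows. Only the two marker strings come from the text above; the function name, the (title, href) input, and the exact bullet format of the generated links are assumptions for illustration.

```python
START = "<!-- HALLOUMINATE:INDEX-START -->"
END = "<!-- HALLOUMINATE:INDEX-END -->"


def rebuild_index_block(index_md, links):
    """Replace the text between the markers with a fresh link list.

    `links` is a list of (title, href) pairs. Prose outside the markers is
    preserved, and a file without both markers is returned unchanged.
    """
    start = index_md.find(START)
    end = index_md.find(END)
    if start == -1 or end == -1 or end < start:
        return index_md  # marker-less (or malformed) files are left alone

    listing = "\n".join(f"- [{title}]({href})" for title, href in links)
    head = index_md[: start + len(START)]
    tail = index_md[end:]
    return f"{head}\n{listing}\n{tail}"
```

Keeping the generated region inside explicit markers is what lets hand-written prose in the same index.md survive repeated rebuilds.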
read_markdown
Verbatim UTF-8 contents of a file in a corpus. Params: corpus, path. Use
this before add_markdown { overwrite: true } to inspect current content.
delete_markdown
Unlink a file from the corpus’ first root and prune its rows from the index.
Irreversible. For repo:*:wiki corpora it also re-walks the ancestor
index.mds so they no longer link to the deleted file. Params: corpus,
path.
index
Bulk (re)build the LanceDB index for one or all corpora. Param: corpus
(optional; omit to rebuild every configured corpus). Use this when files were
touched outside hallouminate.
get_footnote
Resolve a single citation: the footnote target for a page’s #footnote_number.
Params: corpus (defaults to wiki-for-cwd, same as ground), page (the
wiki page’s relative path), footnote_number (the label after ^ — "1" for
[^1], "note" for [^note]). Use this to expand one footnote without reading
the whole page.
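For intuition, resolving a footnote label against a page's text might look like the Python sketch below. It assumes the common markdown footnote-definition syntax ([^label]: target on its own line) and a hypothetical function name; the real tool's parsing rules and return shape may differ.

```python
import re


def find_footnote(page_text, label):
    """Return the definition text for [^label], or None if it is not defined."""
    pattern = re.compile(
        r"^\[\^" + re.escape(label) + r"\]:[ \t]*(.*)$",
        re.MULTILINE,
    )
    match = pattern.search(page_text)
    return match.group(1).strip() if match else None
```

The label is passed without the caret, matching the footnote_number parameter described above ("1" for [^1], "note" for [^note]).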
Conventions for LLM authors
Markdown is stored verbatim — hallouminate imposes no schema. The convention the indexer counts on:
- One topic per file. The chunker splits on H1/H2/H3 headings.
- First non-blank line is # Title. The H1 is the breadcrumb root for every chunk and the gloss in the parent index.md link list.
- File stem matches the slug: lowercase, kebab-case, .md.
- Idempotent writes: add_markdown rejects existing files unless overwrite: true; read_markdown first so you don't clobber blind.
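To show why the heading conventions matter, here is a minimal Python sketch of heading-based chunking with breadcrumbs. It is not the project's chunker: the function name, the " > " breadcrumb separator, and the decision to ignore fenced code blocks and token limits are all simplifying assumptions.

```python
import re

# Matches H1-H3 only; "####" and deeper stay inside the current chunk.
HEADING = re.compile(r"^(#{1,3})\s+(.*\S)\s*$")


def chunk_by_headings(markdown):
    """Split markdown on H1/H2/H3 and tag each chunk with its heading trail."""
    chunks = []
    trail = []  # stack of (level, title) for the current position
    body = []

    def flush():
        text = "\n".join(body).strip()
        if text:
            breadcrumb = " > ".join(title for _, title in trail)
            chunks.append({"breadcrumb": breadcrumb, "text": text})
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or deeper level before descending.
            while trail and trail[-1][0] >= level:
                trail.pop()
            trail.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks
```

Because every breadcrumb starts at the H1, a page whose first line is not # Title would produce chunks with no root context.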
Error mapping
| Daemon variant | JSON-RPC code | Meaning |
|---|---|---|
| InvalidParams | -32602 | Caller input failures (bad corpus name, unsafe path, missing arg). |
| Internal | -32603 | Server / transport faults, including “daemon unavailable”. |
When the daemon is unreachable, calls return -32603 — the MCP server does
not fall back to opening a local LanceDB handle, since that’s exactly the
multi-process race the daemon exists to prevent.
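The mapping in the table is small enough to write down directly. In this Python sketch the two codes are the standard JSON-RPC 2.0 values listed above; the function name and the choice to treat any unrecognized variant as Internal are assumptions.

```python
JSON_RPC_CODES = {
    "InvalidParams": -32602,  # caller input failures
    "Internal": -32603,       # server / transport faults
}


def to_json_rpc_error(variant, message):
    """Build a JSON-RPC error object for a daemon error variant."""
    # Assumption: unknown variants are reported as internal errors.
    code = JSON_RPC_CODES.get(variant, JSON_RPC_CODES["Internal"])
    return {"code": code, "message": message}
```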
Configuration
Config lives at $XDG_CONFIG_HOME/hallouminate/config.toml
(~/.config/hallouminate/config.toml by default). Two layers combine per
request: an XDG baseline loaded once at daemon boot, and a repo layer
discovered fresh on every request from the client’s working directory.
hallouminate config init # scaffold the baseline
hallouminate config show # the effective merged config for this cwd
hallouminate config validate # parse + flag unknown top-level keys
Sections
| Section | Holds |
|---|---|
| [[corpus]] | Explicit named corpora (name, paths, globs, exclude rules). |
| [[repository]] | Repo declarations; each derives repo:NAME:wiki and repo:NAME:corpus. |
| [search] | Read-side defaults (top_files_default, chunks_per_file_default, …). |
| [embeddings] | Embedding model and toggle (below). |
| [watch] | File-watch settings. |
| [storage] | Ground-directory location. |
The XDG baseline vs the repo layer
The baseline owns explicit [[corpus]] entries, [[repository]]
declarations, and the [search]/[embeddings]/[watch]/[storage]
defaults. It is loaded once at daemon startup — change it and restart the
daemon.
The repo layer is <repo>/.hallouminate/config.toml, found by walking up
from the cwd to the first .git boundary. It overrides scalars and adds
repo-local corpora, and is re-read on every request — so repo-layer edits take
effect without a daemon restart. The repo layer is required: a CLI
invocation from a directory with no ancestor .hallouminate/config.toml
errors out. An empty file satisfies the check.
A repo declares itself like this repo does:
[[repository]]
name = "hallouminate"
path = "."
path = "." resolves against the repo root (the parent of .hallouminate/),
so the wiki lands at <repo>/.hallouminate/wiki and is searchable as
repo:hallouminate:wiki from any checkout.
Merge rules
- Array entries ([[corpus]], [[repository]]): repo entries append after baseline entries; duplicate names error.
- Scalars: the repo wins if it sets a non-default value; conflicting non-default values error and name both source paths.
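One reading of these two rules, sketched in Python. The function names, the error type, and the example default value used in the test are hypothetical; the sketch also assumes that a baseline and repo value that are both non-default but equal do not count as a conflict.

```python
class ConfigMergeError(ValueError):
    """Raised when the baseline and repo layers cannot be merged."""


def merge_named_entries(baseline, repo, kind):
    """Append repo entries after baseline entries; duplicate names are an error."""
    merged = list(baseline) + list(repo)
    seen = set()
    for entry in merged:
        if entry["name"] in seen:
            raise ConfigMergeError(f"duplicate {kind} name: {entry['name']}")
        seen.add(entry["name"])
    return merged


def merge_scalar(key, default, baseline_value, repo_value, baseline_path, repo_path):
    """Resolve one scalar setting across the two layers."""
    if repo_value == default:
        return baseline_value  # repo did not override anything
    if baseline_value == default or baseline_value == repo_value:
        return repo_value  # repo wins with its non-default value
    raise ConfigMergeError(
        f"{key}: {baseline_path} sets {baseline_value!r} "
        f"but {repo_path} sets {repo_value!r}"
    )
```

Erroring on conflicting non-default scalars, rather than silently letting one layer win, surfaces the case where two people configured the same knob differently.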
Embeddings
Dense embeddings are on by default, using
snowflake/snowflake-arctic-embed-s. The first index run downloads that model,
and search then fuses its vector signal with lexical search.
Supported models
All embed to 384-dim vectors. Omitting embeddings.model selects the default.
| Model | Notes |
|---|---|
| snowflake/snowflake-arctic-embed-s | Default. English, symmetric retrieval. |
| BAAI/bge-small-en-v1.5 | English, symmetric retrieval. |
| intfloat/multilingual-e5-small | Multilingual, asymmetric retrieval; no quantized variant. |
Turning embeddings off
Run lexically only — full-text search + ripgrep + rerank, no embedding model downloaded (just the tokenizer used for chunking):
[embeddings]
enabled = false
Changing the embedding mode (or model) for a ground directory already indexed under a different mode trips the store’s mismatch guard on the next run. Delete the ground directory and re-index to rebuild:
rm -rf ~/.local/share/hallouminate/ground
hallouminate index
To pre-fetch the model so the first index doesn’t pay the download cost:
hallouminate config download
Paths at a glance
| What | Default |
|---|---|
| Baseline config | $XDG_CONFIG_HOME/hallouminate/config.toml |
| Repo-layer config | <repo>/.hallouminate/config.toml |
| Ground (LanceDB) directory | ~/.local/share/hallouminate/ground |
| Model cache | ~/.cache/hallouminate/fastembed |
| Daemon socket | $XDG_RUNTIME_DIR/hallouminate/daemon.sock (cache-dir fallback) |
Architecture
Hallouminate uses a Sliced Bread
layout — vertical slices with public APIs at slice boundaries, no cross-slice
peeks at internals. Three top-level concerns under src/.
Layers
src/app/ — orchestration
The application layer composes domain logic with adapters. It owns the
clap-derived CLI (cli.rs), the long-lived daemon/, config parsing and the
XDG/repo-layer merge (config.rs), logging, and XDG path resolution. App
depends on domain and adapters; it does not own pure logic.
src/domain/ — pure logic
No I/O beyond filesystem walks and hashing. Slices: corpus/ (chunker, walker,
hasher, sandbox, snippet), embeddings/ (the fastembed wrapper), ground/
(search orchestration and result formatting), indexer/ (plan/apply/write),
plus shared types in common.rs. Domain has no dependency on app or adapters.
src/adapters/ — external systems
lance.rs is the LanceDB vector-storage adapter; mcp/ is the rmcp-based
stdio MCP server. Adapters depend on domain for types, but not on app.
The dependency direction is adapters → domain ← app → adapters: domain is
the stable core, app composes adapter implementations with domain
orchestration.
Why there’s a daemon
LanceDB does not support concurrent writer processes against the same table.
If a CLI index and an MCP add_markdown both opened LanceDB directly, they
would race on table mutations. The daemon is the single owner of the LanceDB
ground directory, the per-corpus mutation locks, and the repository registry.
Every other caller — CLI subcommand, MCP tool, future agent — dials the daemon
over a Unix domain socket.
Socket location
The socket path resolves in this order (src/app/daemon/socket.rs):
1. HALLOUMINATE_SOCKET env var: per-process override.
2. $XDG_RUNTIME_DIR/hallouminate/daemon.sock: the default when a runtime dir exists.
3. ${XDG_CACHE_HOME:-~/.cache}/hallouminate/daemon.sock: fallback.
The daemon takes a flock on <socket>.lock to enforce single-instance
ownership; a second daemon on the same socket errors out.
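The single-instance guard follows a familiar Unix pattern, sketched here in Python with the standard fcntl module. This is a generic illustration of a non-blocking exclusive flock, not the daemon's code; the function name and error message are hypothetical.

```python
import fcntl
import os


def acquire_single_instance(lock_path):
    """Take an exclusive, non-blocking flock on lock_path.

    Returns the open file descriptor; keep it open for the life of the
    process, since closing it releases the lock.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        raise RuntimeError(f"another daemon already holds {lock_path}")
    return fd
```

A daemon following this pattern would call it with the socket path plus a .lock suffix before binding the socket. The kernel drops the lock when the process exits, so a crashed daemon does not leave a stale lock behind.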
Wire protocol
JSON-lines over the socket: one request line in, one response line out, then
the connection closes. The request carries the client’s cwd, which the
daemon walks on every request to discover the active repo-layer config and
merge it with the boot baseline. That’s how one daemon serves many repos with
different configs.
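A client for a one-line-in, one-line-out protocol like this can be very small. The Python sketch below shows the framing only; the envelope field names (op, cwd, params) are hypothetical placeholders, since the actual request schema is not described here.

```python
import json
import os
import socket


def build_request(op, params, cwd=None):
    """Hypothetical envelope: the field names are illustrative, not the real schema."""
    return {"op": op, "cwd": cwd or os.getcwd(), "params": params}


def round_trip(conn, request):
    """Send one JSON line and read exactly one JSON line back."""
    conn.sendall(json.dumps(request).encode("utf-8") + b"\n")
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = conn.recv(4096)
        if not chunk:
            break  # peer closed the connection
        buf += chunk
    return json.loads(buf)


def call_daemon(socket_path, request):
    """Open a fresh connection per request, as the protocol closes after one reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as conn:
        conn.connect(socket_path)
        return round_trip(conn, request)
```

Sending the cwd with every request is what lets a single daemon resolve a different repo-layer config for each caller.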
Mutating ops (index, add_markdown, delete_markdown) take the per-corpus
mutation lock and then a global write-lane semaphore, in that order. Read ops
skip both and run concurrently.
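The value of a fixed lock order is easiest to see in code. This Python sketch uses threading primitives to model the two levels; the class name, method names, and the default of one write lane are assumptions, and the daemon itself is a Rust program with its own synchronization types.

```python
import threading
from contextlib import contextmanager


class WriteCoordinator:
    """Model of the documented order: per-corpus lock first, then the write lane."""

    def __init__(self, write_lanes=1):  # lane count is an assumption
        self._registry_guard = threading.Lock()
        self._corpus_locks = {}
        self._write_lane = threading.BoundedSemaphore(write_lanes)

    def _lock_for(self, corpus):
        # Create each corpus lock lazily, exactly once.
        with self._registry_guard:
            return self._corpus_locks.setdefault(corpus, threading.Lock())

    @contextmanager
    def mutating(self, corpus):
        with self._lock_for(corpus):   # 1. per-corpus mutation lock
            with self._write_lane:     # 2. global write-lane semaphore
                yield
```

Because every mutating path acquires the two in the same order, two writers can never each hold one and wait on the other. Read operations would not call mutating() at all.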
Entry points
- src/main.rs: process entry; calls hallouminate::app::run().
- src/lib.rs: library facade for tests and downstream callers.
- src/app.rs: top-level run(): parses the CLI and dispatches.
A living example
This repo’s own architecture notes are also maintained as a hallouminate wiki
at .hallouminate/wiki/ — see Dogfooding. The wiki entries
carry file:line and commit citations that this page summarizes.
Dogfooding: our own wiki
Hallouminate maintains its own wiki with hallouminate. The repo declares
itself as a [[repository]], so its knowledge base is searchable as the
repo:hallouminate:wiki corpus from any checkout. The wiki lives at
.hallouminate/wiki/
and is the canonical, durable record of how the project actually works — the
source these docs are distilled from.
Two corpora, two lifecycles:
| Where | Indexed as | Lifecycle | Holds |
|---|---|---|---|
| .hallouminate/wiki/ | repo:hallouminate:wiki | durable across sessions | architecture, conventions, protocols, “why this design” notes |
| .cheese/ | cheese-local | transient per-task | per-task agent reports |
What’s in the wiki
The entries are written for an LLM working in the repo — they carry
file:line and commit citations these human-facing docs summarize:
- architecture — the sliced-bread layout and dependency direction.
- mcp-surface — the nine MCP tools, params, and error mapping.
- daemon-and-cli — why there’s a daemon, the JSON-line socket protocol, the CLI surface, and the lock order.
- corpus-walker — gitignore-aware corpus walking and the explicit-root opt-in.
- config-layering — the XDG baseline plus repo-layer merge.
- wiki-conventions — how to author entries without contradicting the indexer.
Read it the way an LLM would
If you have hallouminate installed and the MCP server registered, an agent working in this repo queries the wiki directly:
list_tree { corpus: "repo:hallouminate:wiki" }
ground { corpus: "repo:hallouminate:wiki", query: "why is there a daemon" }
read_markdown { corpus: "repo:hallouminate:wiki", path: "daemon-and-cli.md" }
From the CLI:
hallouminate ground "socket resolution order" --corpus repo:hallouminate:wiki
Keeping it current
The repo’s AGENTS.md
instructs every coding agent to refresh the wiki after a change lands on
main — but only when the change altered durable knowledge (architecture,
conventions, protocols, the MCP tool surface, a “why this design” decision).
Routine bug fixes and transient per-task output stay out; that’s what
.cheese/ is for.
Updates go through the MCP (read_markdown → add_markdown with
overwrite: true), not raw file edits, so the LanceDB index and the ancestor
index.md link lists stay in sync. When the wiki is edited on disk directly,
re-sync with:
hallouminate index --corpus repo:hallouminate:wiki
That loop — author through the tool, search through the tool, keep the index honest — is the product proving itself on its own source.