hallouminate

A markdown corpus indexer for LLMs to build and query their own per-repo wikis. Hallouminate stores markdown verbatim on disk, embeds it with fastembed, indexes the embeddings in LanceDB, and exposes a small MCP surface so an LLM can author and search a per-repo knowledge base without leaving its agent loop.

The filesystem is the source of truth; LanceDB rows are derived and refreshed automatically when an LLM writes via add_markdown, or in bulk via hallouminate index. Code files (.rs, .toml, …) can also be indexed as text for semantic search, but hallouminate does no structural analysis — it’s a wiki indexer that happens to tolerate code, not a code-intelligence tool.

Why a daemon

A long-lived local daemon owns the LanceDB ground directory, per-corpus mutation locks, and config resolution. The CLI and the stdio MCP server both talk to it over a Unix domain socket — one owner, no cross-process LanceDB races. See Architecture for the full picture.

Where to go next

  • Installation — install the binary and register the MCP server with your agent.
  • CLI reference — every subcommand and its flags.
  • MCP surface — the nine tools an LLM calls to author and search wikis.
  • Configuration — the XDG baseline, repo-layer merge, and embedding-model options.
  • Dogfooding — this repo maintains its own wiki with hallouminate; here’s how to read it.

License

MIT — see LICENSE.

Installation

The fastest path is the prebuilt-binary installer: no Rust toolchain, no protoc, no compile step. Build from source with cargo only if your platform has no prebuilt binary or you want a development checkout.

The installer downloads a prebuilt hallouminate for your platform and adds it to your PATH:

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/paulnsorensen/hallouminate/releases/latest/download/hallouminate-installer.sh | sh

Verify it:

hallouminate --version

Prebuilt binaries are published for each release:

Platform              Target
macOS, Apple Silicon  aarch64-apple-darwin
Linux, arm64          aarch64-unknown-linux-gnu (glibc ≥ 2.39)
Linux, x86-64         x86_64-unknown-linux-gnu (glibc ≥ 2.39)

Re-run the one-liner any time to upgrade to the latest release. On an Intel Mac, Windows, or an older glibc, build from source with cargo as described below.

From crates.io

Builds from source, so it works on any platform with a Rust toolchain — at the cost of compiling native dependencies (a few minutes).

Prerequisites

  • A Rust toolchain (cargo) — see https://rustup.rs.
  • protoc (the Protocol Buffers compiler) — the lancedb build needs it.
    • macOS: brew install protobuf
    • Debian/Ubuntu: apt install protobuf-compiler

cargo install hallouminate --locked

The binary installs to ~/.cargo/bin/hallouminate (make sure that’s on your PATH).

From source

Same prerequisites as crates.io. Clone and build:

git clone https://github.com/paulnsorensen/hallouminate.git
cd hallouminate
cargo build --release

The binary lands in target/release/hallouminate.

Register the MCP server

Point your agent at hallouminate serve — the stdio MCP server that exposes the wiki tools. With Claude Code:

claude mcp add hallouminate -- hallouminate serve

hallouminate serve auto-spawns the daemon if none is running, so there’s no separate process to manage.

Bootstrap a config

hallouminate config init       # scaffold the XDG baseline config
hallouminate config validate   # confirm it parses

See Configuration for what goes in the config and how the XDG baseline merges with a repo-layer .hallouminate/config.toml.

Claude Code skill pack

A Claude Code plugin ships in this repo under plugins/hallouminate. It installs hallouminate and bootstraps your first wiki interactively:

/plugin marketplace add paulnsorensen/hallouminate
/plugin install hallouminate@hallouminate
/hallouminate:install

/hallouminate:install installs the binary, registers the MCP server, then asks where and how your first wiki should live (Socratic style) before scaffolding, indexing, and committing it with git.

CLI reference

hallouminate is a single binary. The CLI, the MCP server, and the daemon are all the same executable; CLI subcommands dial the daemon over a Unix domain socket (see Architecture).

Command                                            Purpose
hallouminate serve                                 Run the stdio MCP server (auto-spawns the daemon if down).
hallouminate index [--corpus NAME]                 Bulk (re)index one corpus, or every configured corpus.
hallouminate ground "<query>" [flags]              Semantic search from the CLI.
hallouminate daemon <run|stop|restart|status>      Manage the long-lived daemon.
hallouminate config <init|show|validate|download>  Inspect or scaffold config.
hallouminate hook <install|uninstall>              Manage the per-repo discovery hook.

hallouminate --version prints the version; hallouminate --help and hallouminate <command> --help print usage for any subcommand.

serve

hallouminate serve

Starts the stdio MCP server an agent connects to. It is stateless beyond its tool router and a startup-captured working directory — every tool call dials the daemon. If no daemon is running, serve spawns one.

index

hallouminate index               # rebuild every configured corpus
hallouminate index --corpus repo:hallouminate:wiki

Use this when files were touched outside hallouminate. Writes that go through add_markdown already auto-reindex just the changed file.

ground

hallouminate ground "how does the daemon work"
hallouminate ground "socket protocol" --corpus repo:hallouminate:wiki --format json-pretty

Flag                               Effect
--corpus NAME                      Corpus to search (defaults to the wiki for the current repo).
--format outline|json|json-pretty  Output shape. outline (default) is a ripgrep-style digest.
--full                             Return full chunk bodies instead of snippets.
--top-files N                      Number of files to roll up.
--chunks-per-file N                Chunks to include per file.
--limit N                          Hard cap on returned chunks.
--snippet-chars N                  Snippet length when not using --full.

daemon

hallouminate daemon run        # run in the foreground
hallouminate daemon status     # is one running?
hallouminate daemon stop
hallouminate daemon restart

The daemon is the single owner of the LanceDB ground directory. Restart it after editing the baseline config; repo-layer edits take effect on the next request without a restart. --config PATH overrides the baseline config path.

config

hallouminate config init       # scaffold the XDG baseline config
hallouminate config show       # print the effective merged config for this cwd
hallouminate config validate   # parse + flag unknown top-level keys
hallouminate config download   # pre-fetch the configured embedding model

See Configuration.

hook

hallouminate hook install [--repo PATH]
hallouminate hook uninstall [--repo PATH]

Installs or removes a per-repo discovery hook. --repo PATH targets a repo other than the current directory.

Socket override

--socket PATH on index, ground, and the other client subcommands points at a specific daemon socket. Otherwise the socket is resolved from HALLOUMINATE_SOCKET, then $XDG_RUNTIME_DIR, then the cache dir — see Architecture.

MCP surface

hallouminate serve starts a stdio MCP server. It is stateless beyond its tool router and a startup-captured working directory; every tool call dials the local daemon over a Unix domain socket, and serve auto-spawns the daemon if none is up.

Default corpus

Read-side tools (ground, list_files, list_tree) that omit corpus default to the wiki for the repository containing the daemon’s working directory — repo:<NAME>:wiki for the deepest [[repository]] whose path is an ancestor of the cwd. When the cwd sits under no configured repo, the caller must name a corpus explicitly.

The mutating tools (add_markdown, delete_markdown) and read_markdown always require an explicit corpus, to avoid accidental writes to the wrong wiki or ambiguous reads.
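
The default-corpus rule for read-side tools can be sketched as a small lookup. This is an illustration only, not hallouminate's implementation: the function name default_wiki_corpus and the name-to-path mapping are assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Optional


def default_wiki_corpus(cwd: str, repositories: dict[str, str]) -> Optional[str]:
    """Pick repo:<NAME>:wiki for the deepest repository path that contains cwd.

    `repositories` maps a repository name to its absolute path (hypothetical
    shape). Returns None when cwd sits under no configured repository, in
    which case the caller has to name a corpus explicitly.
    """
    here = PurePosixPath(cwd)
    best_name = None
    best_depth = -1
    for name, root in repositories.items():
        root_path = PurePosixPath(root)
        # A repository matches when its path is cwd itself or an ancestor of it.
        if here == root_path or root_path in here.parents:
            depth = len(root_path.parts)
            if depth > best_depth:
                best_name, best_depth = name, depth
    return None if best_name is None else f"repo:{best_name}:wiki"
```

With nested repositories, the innermost one wins; a cwd outside every repository yields None rather than a guess.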

The nine tools

list_corpora

Every corpus the daemon knows about — explicit [[corpus]] entries plus derived repo:NAME:wiki and repo:NAME:corpus corpora. No params. Call this first to learn what’s available.

list_files

The files currently visible in a corpus, honoring its paths/globs/exclude rules. Param: corpus (defaults to wiki-for-cwd). Returns an array of {path, absolute_path}.

list_tree

The same files as list_files, grouped into a {path, absolute_path, files, subdirs} tree. Subdirs with no markdown beneath them are pruned. Use this for progressive disclosure — navigate the wiki without reading every index.md first. Param: corpus (defaults to wiki-for-cwd).

ground

Semantic search. Embeds the query with the configured embedding model (default snowflake/snowflake-arctic-embed-s), retrieves top chunks from LanceDB, and rolls up per-file with breadcrumb context. Params: query (required), corpus, top_files, chunks_per_file, limit, snippet_chars. Returns a ripgrep-style outline in content and the full structured response in structuredContent.docs.
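
How top_files, chunks_per_file, and limit might interact during the per-file roll-up can be sketched as follows. The ranking rule (files ordered by their best-scoring chunk) and the order in which the caps apply are assumptions for illustration; the source does not specify them.

```python
from collections import defaultdict


def roll_up(hits, top_files, chunks_per_file, limit):
    """Group (path, score, text) hits by file and trim to the three caps."""
    by_file = defaultdict(list)
    for path, score, text in hits:
        by_file[path].append((score, text))
    # Rank files by their best-scoring chunk (assumed ranking rule).
    ranked = sorted(
        by_file.items(),
        key=lambda item: max(score for score, _ in item[1]),
        reverse=True,
    )
    result = []
    remaining = limit  # hard cap on the total number of chunks returned
    for path, chunks in ranked[:top_files]:
        if remaining <= 0:
            break
        best = sorted(chunks, reverse=True)[: min(chunks_per_file, remaining)]
        remaining -= len(best)
        result.append((path, [text for _, text in best]))
    return result
```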

add_markdown

Atomic-write a markdown file to the corpus’ first configured root, then refresh just that file’s LanceDB rows. For repo:*:wiki corpora it also rebuilds the link list inside each ancestor index.md between the <!-- HALLOUMINATE:INDEX-START --> / <!-- HALLOUMINATE:INDEX-END --> markers — scaffolding a missing index.md, preserving prose outside the markers, and leaving marker-less files alone. Params: corpus, path, content, overwrite (default false). Symlinks and parent-dir escapes are rejected by the sandbox. Returns advisory lint warnings (empty-destination links, empty mermaid blocks, heading-level jumps) without blocking the write.
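
The marker behaviour described above can be sketched as a pure string transform. This is a minimal illustration, not the tool's code: the function name rebuild_index_block and the link-line format are assumptions, and scaffolding a missing index.md is left out.

```python
START = "<!-- HALLOUMINATE:INDEX-START -->"
END = "<!-- HALLOUMINATE:INDEX-END -->"


def rebuild_index_block(index_md: str, links: list[str]) -> str:
    """Replace the text between the markers with a fresh link list.

    Prose outside the markers is preserved; a file without both markers
    is returned unchanged.
    """
    start = index_md.find(START)
    end = index_md.find(END)
    if start == -1 or end == -1 or end < start:
        return index_md  # marker-less file: leave it alone
    head = index_md[: start + len(START)]
    tail = index_md[end:]
    body = "\n".join(links)
    return f"{head}\n{body}\n{tail}"
```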

read_markdown

Verbatim UTF-8 contents of a file in a corpus. Params: corpus, path. Use this before add_markdown { overwrite: true } to inspect current content.

delete_markdown

Unlink a file from the corpus’ first root and prune its rows from the index. Irreversible. For repo:*:wiki corpora it also re-walks the ancestor index.md files so they no longer link to the deleted file. Params: corpus, path.

index

Bulk (re)build the LanceDB index for one or all corpora. Param: corpus (optional; omit to rebuild every configured corpus). Use this when files were touched outside hallouminate.

get_footnote

Resolve a single citation: the footnote target for a page’s #footnote_number. Params: corpus (defaults to wiki-for-cwd, same as ground), page (the wiki page’s relative path), footnote_number (the label after ^: "1" for [^1], "note" for [^note]). Use this to expand one footnote without reading the whole page.
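
To make the label convention concrete, here is a rough sketch of looking up a footnote definition by label. It assumes single-line definitions of the form [^label]: text; the function name find_footnote is hypothetical, and the real tool may handle more cases.

```python
import re
from typing import Optional


def find_footnote(markdown: str, label: str) -> Optional[str]:
    """Return the text of the `[^label]: ...` definition, or None if absent."""
    pattern = re.compile(
        r"^\[\^" + re.escape(label) + r"\]:[ \t]*(.*)$",
        re.MULTILINE,
    )
    match = pattern.search(markdown)
    return match.group(1).strip() if match else None
```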

Conventions for LLM authors

Markdown is stored verbatim — hallouminate imposes no schema. The convention the indexer counts on:

  • One topic per file. The chunker splits on H1/H2/H3 headings.
  • First non-blank line is # Title. The H1 is the breadcrumb root for every chunk and the gloss in the parent index.md link list.
  • File stem matches the slug — lowercase, kebab-case, .md.
  • Idempotent writes. add_markdown rejects existing files unless overwrite: true; read_markdown first so you don’t clobber blind.
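
The heading conventions above matter because of how chunks get their breadcrumbs. The following is a simplified sketch of heading-based chunking, not the actual chunker: the " > " separator and the function name chunk_by_headings are assumptions, and fenced code blocks containing # lines are not handled.

```python
import re

# Matches H1 to H3 only; deeper headings stay in the body of the current chunk.
HEADING = re.compile(r"^(#{1,3})\s+(.*)$")


def chunk_by_headings(markdown: str) -> list[tuple[str, str]]:
    """Split on H1/H2/H3 and pair each chunk with its breadcrumb trail."""
    chunks = []
    trail = []  # heading titles from the H1 down to the current heading
    body = []

    def flush():
        text = "\n".join(body).strip()
        if text:
            chunks.append((" > ".join(trail), text))
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at this level or deeper, then record the new one.
            del trail[level - 1:]
            trail.append(match.group(2).strip())
        else:
            body.append(line)
    flush()
    return chunks
```

Under this sketch, a file whose first line is its H1 gives every chunk a breadcrumb rooted at that title, which is why the title convention is worth following.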

Error mapping

Daemon variant  JSON-RPC code  Meaning
InvalidParams   -32602         Caller input failures (bad corpus name, unsafe path, missing arg).
Internal        -32603         Server / transport faults, including “daemon unavailable”.

When the daemon is unreachable, calls return -32603 — the MCP server does not fall back to opening a local LanceDB handle, since that’s exactly the multi-process race the daemon exists to prevent.

Configuration

Config lives at $XDG_CONFIG_HOME/hallouminate/config.toml (~/.config/hallouminate/config.toml by default). Two layers combine per request: an XDG baseline loaded once at daemon boot, and a repo layer discovered fresh on every request from the client’s working directory.

hallouminate config init       # scaffold the baseline
hallouminate config show       # the effective merged config for this cwd
hallouminate config validate   # parse + flag unknown top-level keys

Sections

Section         Holds
[[corpus]]      Explicit named corpora (name, paths, globs, exclude rules).
[[repository]]  Repo declarations; each derives repo:NAME:wiki and repo:NAME:corpus.
[search]        Read-side defaults (top_files_default, chunks_per_file_default, …).
[embeddings]    Embedding model and toggle (below).
[watch]         File-watch settings.
[storage]       Ground-directory location.

The XDG baseline vs the repo layer

The baseline owns explicit [[corpus]] entries, [[repository]] declarations, and the [search]/[embeddings]/[watch]/[storage] defaults. It is loaded once at daemon startup — change it and restart the daemon.

The repo layer is <repo>/.hallouminate/config.toml, found by walking up from the cwd to the first .git boundary. It overrides scalars and adds repo-local corpora, and is re-read on every request — so repo-layer edits take effect without a daemon restart. The repo layer is required: a CLI invocation from a directory with no ancestor .hallouminate/config.toml errors out. An empty file satisfies the check.
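
The discovery walk can be pictured roughly like this. It is a sketch under stated assumptions: the function name find_repo_config is hypothetical, and treating a directory that contains .git as the boundary is one reading of "the first .git boundary".

```python
from pathlib import Path
from typing import Optional


def find_repo_config(cwd: Path) -> Optional[Path]:
    """Walk up from cwd; stop after the first directory that contains .git."""
    for directory in [cwd, *cwd.parents]:
        candidate = directory / ".hallouminate" / "config.toml"
        if candidate.is_file():
            return candidate  # an empty file still counts
        if (directory / ".git").exists():
            return None  # reached the repo boundary without finding a config
    return None
```

A None result corresponds to the error case described above, where no ancestor .hallouminate/config.toml exists.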

A repo declares itself like this repo does:

[[repository]]
name = "hallouminate"
path = "."

path = "." resolves against the repo root (the parent of .hallouminate/), so the wiki lands at <repo>/.hallouminate/wiki and is searchable as repo:hallouminate:wiki from any checkout.

Merge rules

  • Array entries ([[corpus]], [[repository]]) — repo entries append after baseline entries; duplicate names error.
  • Scalars — the repo wins if it sets a non-default value; conflicting non-default values error and name both source paths.
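
One way to read the two merge rules is sketched below. The helper names (merge_named, merge_scalar, MergeError) are hypothetical, and the scalar rule reflects an interpretation: a non-default repo value wins over a default baseline, while two different non-default values are a conflict.

```python
class MergeError(Exception):
    """Raised when the baseline and repo layers cannot be combined."""


def merge_named(baseline, repo, baseline_src, repo_src):
    """Append repo entries after baseline entries; duplicate names are an error."""
    merged = list(baseline)
    seen = {entry["name"] for entry in baseline}
    for entry in repo:
        if entry["name"] in seen:
            raise MergeError(
                f"duplicate name {entry['name']!r} in {baseline_src} and {repo_src}"
            )
        seen.add(entry["name"])
        merged.append(entry)
    return merged


def merge_scalar(key, baseline, repo, default, baseline_src, repo_src):
    """Non-default repo value wins; two different non-default values conflict."""
    if repo == default:
        return baseline
    if baseline != default and baseline != repo:
        raise MergeError(
            f"{key}: {baseline!r} ({baseline_src}) conflicts with {repo!r} ({repo_src})"
        )
    return repo
```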

Embeddings

Dense embeddings are on by default, using snowflake/snowflake-arctic-embed-s. On first index hallouminate downloads that model and fuses its vector signal with lexical search.

Supported models

All embed to 384-dim vectors. Omitting embeddings.model selects the default.

Model                               Notes
snowflake/snowflake-arctic-embed-s  Default. English, symmetric retrieval.
BAAI/bge-small-en-v1.5              English, symmetric retrieval.
intfloat/multilingual-e5-small      Multilingual, asymmetric retrieval; no quantized variant.

Turning embeddings off

Run lexically only — full-text search + ripgrep + rerank, no embedding model downloaded (just the tokenizer used for chunking):

[embeddings]
enabled = false

Changing the embedding mode (or model) for a ground directory already indexed under a different mode trips the store’s mismatch guard on the next run. Delete the ground directory and re-index to rebuild:

rm -rf ~/.local/share/hallouminate/ground
hallouminate index

To pre-fetch the model so the first index doesn’t pay the download cost:

hallouminate config download

Paths at a glance

What                        Default
Baseline config             $XDG_CONFIG_HOME/hallouminate/config.toml
Repo-layer config           <repo>/.hallouminate/config.toml
Ground (LanceDB) directory  ~/.local/share/hallouminate/ground
Model cache                 ~/.cache/hallouminate/fastembed
Daemon socket               $XDG_RUNTIME_DIR/hallouminate/daemon.sock (cache-dir fallback)

Architecture

Hallouminate uses a Sliced Bread layout: vertical slices with public APIs at slice boundaries, no cross-slice peeks at internals. Three top-level concerns live under src/.

Layers

src/app/ — orchestration

The application layer composes domain logic with adapters. It owns the clap-derived CLI (cli.rs), the long-lived daemon/, config parsing and the XDG/repo-layer merge (config.rs), logging, and XDG path resolution. App depends on domain and adapters; it does not own pure logic.

src/domain/ — pure logic

No I/O beyond filesystem walks and hashing. Slices: corpus/ (chunker, walker, hasher, sandbox, snippet), embeddings/ (the fastembed wrapper), ground/ (search orchestration and result formatting), indexer/ (plan/apply/write), plus shared types in common.rs. Domain has no dependency on app or adapters.

src/adapters/ — external systems

lance.rs is the LanceDB vector-storage adapter; mcp/ is the rmcp-based stdio MCP server. Adapters depend on domain for types, but not on app.

The dependency direction is adapters → domain ← app → adapters: domain is the stable core, app composes adapter implementations with domain orchestration.

Why there’s a daemon

LanceDB does not support concurrent writer processes against the same table. If a CLI index and an MCP add_markdown both opened LanceDB directly, they would race on table mutations. The daemon is the single owner of the LanceDB ground directory, the per-corpus mutation locks, and the repository registry. Every other caller — CLI subcommand, MCP tool, future agent — dials the daemon over a Unix domain socket.

Socket location

The socket path resolves in this order (src/app/daemon/socket.rs):

  1. HALLOUMINATE_SOCKET env var — per-process override.
  2. $XDG_RUNTIME_DIR/hallouminate/daemon.sock — the default when a runtime dir exists.
  3. ${XDG_CACHE_HOME:-~/.cache}/hallouminate/daemon.sock — fallback.

The daemon takes a flock on <socket>.lock to enforce single-instance ownership; a second daemon on the same socket errors out.
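
The three-step resolution order listed above translates into a short function. This is a sketch rather than a port of src/app/daemon/socket.rs: the function name resolve_socket_path is an assumption, and the environment is passed in as a plain dict so the logic is easy to exercise.

```python
from pathlib import Path


def resolve_socket_path(env: dict[str, str]) -> Path:
    """Apply the documented order: override, runtime dir, then cache dir."""
    override = env.get("HALLOUMINATE_SOCKET")
    if override:
        return Path(override)
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    if runtime_dir:
        return Path(runtime_dir) / "hallouminate" / "daemon.sock"
    cache_home = env.get("XDG_CACHE_HOME")
    # Fall back to ~/.cache when XDG_CACHE_HOME is unset or empty.
    base = Path(cache_home) if cache_home else Path.home() / ".cache"
    return base / "hallouminate" / "daemon.sock"
```

Calling resolve_socket_path(dict(os.environ)) after importing os would apply the same order to the current process environment.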

Wire protocol

JSON-lines over the socket: one request line in, one response line out, then the connection closes. The request carries the client’s cwd, which the daemon walks on every request to discover the active repo-layer config and merge it with the boot baseline. That’s how one daemon serves many repos with different configs.
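
A client for a one-line-in, one-line-out exchange could look roughly like this. The field names op, cwd, and params are hypothetical (the source says only that the request carries the client's cwd), and call_daemon is an illustrative name, not part of hallouminate.

```python
import json
import socket


def call_daemon(sock: socket.socket, op: str, cwd: str, params: dict) -> dict:
    """Send one JSON request line and read exactly one JSON response line."""
    request = {"op": op, "cwd": cwd, "params": params}  # hypothetical field names
    sock.sendall(json.dumps(request).encode("utf-8") + b"\n")
    with sock.makefile("rb") as reader:
        line = reader.readline()
    return json.loads(line)
```

Against a real daemon, the caller would first create an AF_UNIX stream socket, connect it to the resolved socket path, and open a fresh connection per request, since the connection closes after each response.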

Mutating ops (index, add_markdown, delete_markdown) take the per-corpus mutation lock and then a global write-lane semaphore, in that order. Read ops skip both and run concurrently.

Entry points

  • src/main.rs — process entry; calls hallouminate::app::run().
  • src/lib.rs — library facade for tests and downstream callers.
  • src/app.rs — top-level run(): parses the CLI and dispatches.

A living example

This repo’s own architecture notes are also maintained as a hallouminate wiki at .hallouminate/wiki/ — see Dogfooding. The wiki entries carry file:line and commit citations that this page summarizes.

Dogfooding: our own wiki

Hallouminate maintains its own wiki with hallouminate. The repo declares itself as a [[repository]], so its knowledge base is searchable as the repo:hallouminate:wiki corpus from any checkout. The wiki lives at .hallouminate/wiki/ and is the canonical, durable record of how the project actually works — the source these docs are distilled from.

Two corpora, two lifecycles:

Where                Indexed as              Lifecycle                Holds
.hallouminate/wiki/  repo:hallouminate:wiki  durable across sessions  architecture, conventions, protocols, “why this design” notes
.cheese/             cheese-local            transient per-task       per-task agent reports

What’s in the wiki

The entries are written for an LLM working in the repo — they carry file:line and commit citations these human-facing docs summarize:

  • architecture — the sliced-bread layout and dependency direction.
  • mcp-surface — the nine MCP tools, params, and error mapping.
  • daemon-and-cli — why there’s a daemon, the JSON-line socket protocol, the CLI surface, and the lock order.
  • corpus-walker — gitignore-aware corpus walking and the explicit-root opt-in.
  • config-layering — the XDG baseline plus repo-layer merge.
  • wiki-conventions — how to author entries without contradicting the indexer.

Read it the way an LLM would

If you have hallouminate installed and the MCP server registered, an agent working in this repo queries the wiki directly:

list_tree   { corpus: "repo:hallouminate:wiki" }
ground      { corpus: "repo:hallouminate:wiki", query: "why is there a daemon" }
read_markdown { corpus: "repo:hallouminate:wiki", path: "daemon-and-cli.md" }

From the CLI:

hallouminate ground "socket resolution order" --corpus repo:hallouminate:wiki

Keeping it current

The repo’s AGENTS.md instructs every coding agent to refresh the wiki after a change lands on main — but only when the change altered durable knowledge (architecture, conventions, protocols, the MCP tool surface, a “why this design” decision). Routine bug fixes and transient per-task output stay out; that’s what .cheese/ is for.

Updates go through the MCP (read_markdown → add_markdown with overwrite: true), not raw file edits, so the LanceDB index and the ancestor index.md link lists stay in sync. When the wiki is edited on disk directly, re-sync with:

hallouminate index --corpus repo:hallouminate:wiki

That loop — author through the tool, search through the tool, keep the index honest — is the product proving itself on its own source.