Query planning¶
Run this before routing. The goal is to know what answer would close the question, not to get an answer immediately.
Five steps¶
- Restate the decision being supported. What action does the user take after this research? Different deliverables (decision, spec input, code change) call for different sources.
- Extract constraints. Dates, versions, repo scope, languages, geographies, deal-breakers. These become filters in the routing block.
- Clarify only if it changes the source plan. Ask at most one question, and only when missing context would route to a different source set. "Which version of Next.js?" โ yes if Next.js routing differs by major. "What's your deadline?" โ no, doesn't change sources.
- Decompose into 2-5 focused subqueries. Multi-faceted questions ("competitive landscape of X") fan out badly when sent as one query. Each subquery should be a thing a search engine could answer in one page.
- Name stop criteria. Before fetching, write down what "done" looks like: "two authoritative sources agree", "vendor docs explicitly answer the API question", "no source contradicts the claim within last 3 months". Stop when met; don't keep gathering.
Tavily query construction¶
Pulled from tavily-best-practices/references/search.md
(https://github.com/tavily-ai/skills/blob/main/skills/tavily-best-practices/references/search.md):
- Keep queries under 400 characters. Short query, not a long-form prompt.
- Don't compose one giant query โ break it into the 2-5 subqueries from step 4 and run them in parallel. In the Claude Code harness, parallelism means a single assistant turn with multiple tool calls; sequential turns serialise.
- Include constraints in the query: company names, framework versions, geographies, year. Search engines reward concrete keywords.
- Pick the right depth:
basicfor general lookups (default),advancedfor precision-sensitive questions,fastwhen latency matters. - Filter freshness at the API, not after:
time_range="month"for current facts,start_date/end_datefor absolute windows. - Filter authority at the API:
include_domains=["arxiv.org","github.com","sec.gov"]for trusted sources,exclude_domains=["reddit.com","quora.com"]for noise. - Don't ask
tavily_searchfor raw bodies. Leaveinclude_raw_content=false; pass the surviving URLs totavily_extract(query=โฆ)instead. Cheaper, lower noise, and aligns withcontext-isolation.md.
Subquery decomposition examples¶
Bad (one big query):
"competitive landscape of AI code assistants in 2026 including market share, pricing, key differentiators, customer segments, and recent product moves"
Good (5 focused subqueries):
- "AI code assistant market share 2026"
- "Cursor vs GitHub Copilot vs Claude Code pricing 2026"
- "AI code assistant key differentiators 2026"
- "AI code assistant enterprise customer segments 2026"
- "Cursor product launches 2026" / "Copilot product launches 2026" / "Claude Code product launches 2026"
Run them in parallel, score-filter, then extract the top URLs per subquery.
Stop criteria template¶
Pin down before routing:
PLAN
- Decision: <what the user does next>
- Constraints: <versions, dates, scope, language>
- Subqueries: 1) <q1> 2) <q2> 3) <q3>
- Done when: <concrete signal>
- Source priority: <vendor docs > โฆ > GitHub examples>
The routing block (in routing.md) consumes this directly.
When to skip planning¶
A planning step that's longer than the answer wastes context. Skip the full Plan emission for:
- Single-fact lookups ("what's the latest stable Node version").
- Single-file local questions ("where does this repo wire up auth").
- Questions the user has already decomposed.
Always plan when:
- The question has more than one moving part (X and Y, before/after, multiple criteria).
- The question is comparative.
- "Latest", "current", or "best" is in the question.
- The deliverable is a report rather than a fact.