HBDS: How AI Agents Search Typed Documents
HBDS (Hierarchical Budget-Constrained Descent Search) is a search algorithm that navigates typed knowledge hierarchies under strict token budgets. It works in four phases: ORIENT (classify the query), DESCEND (beam search through the tree), SYNTHESIZE (combine results), and WRITE BACK (persist new knowledge). HBDS requires typed documents — it cannot operate on flat text.
Every AI agent has a context window — a fixed number of tokens it can work with. The knowledge it needs is usually much larger than that window. HBDS is the algorithm that decides what to read within that budget. It navigates a tree of typed documents, scoring relevance at each level, checking costs before reading, and falling back to summaries when the budget is tight.
The Problem HBDS Solves
An AI agent working on your codebase has access to hundreds of documents. Its context window might be 200,000 tokens. The total documentation might be 2,000,000 tokens.
The agent cannot read everything. It must choose. Current approaches:
| Approach | How It Works | The Problem |
|---|---|---|
| RAG | Chunk docs, embed, retrieve top-k similar chunks | Flat — no hierarchy, arbitrary boundaries, no budget control |
| Agentic tool use | LLM decides what to grep/read based on reasoning | No formal budget tracking, no guaranteed coverage |
| Graph RAG | Extract entities from flat text, build knowledge graph | Structure is inferred — inference errors propagate |
| RAPTOR | Cluster chunks, summarize into tree, search top-down | Tree is auto-generated from flat text, no typed blocks |
All of these either operate on flat structure or infer structure from flat text at retrieval time. None use structure declared in the document format itself. HBDS does.
The Four Phases
HBDS works in four sequential phases. Each phase consumes tokens from a monotonically decreasing budget. The context window IS the execution budget.
Phase 1: ORIENT
Goal: Classify the query and select where to start searching.
The agent reads a lightweight root index — a table of contents for the entire knowledge tree. One LLM call classifies the query and picks the most relevant top-level domains.
Query: "How does authentication work in this app?"
Root Index:
project/ (backend, frontend, infra — 450K tokens)
general/ (TypeScript, AWS, patterns — 800K tokens)
agent-memories/ (session logs — 200K tokens)
Classification: project_local
Selected domain: project/backend/
Confidence: 0.92
Cost: 600 tokens
If the root index contains a keyword match with >0.9 confidence, HBDS skips the entire descent and reads the matching leaf directly. This handles queries like "What is the Stripe price ID?" in under a second.
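The fast path can be sketched as a simple keyword scan over the root index. This is a minimal sketch, not the real implementation: `IndexEntry` and the "any exact keyword match counts as high confidence" rule are illustrative assumptions.

```rust
/// Hypothetical root-index entry used by the ORIENT fast path.
struct IndexEntry {
    id: &'static str,
    keywords: &'static [&'static str],
}

/// Direct-hit check: if the query contains one of a node's declared
/// keywords, skip the descent and jump straight to that node.
/// (Sketch: an exact keyword match stands in for >0.9 confidence.)
fn direct_hit(query: &str, index: &[IndexEntry]) -> Option<&'static str> {
    let q = query.to_lowercase();
    index
        .iter()
        .find(|e| e.keywords.iter().any(|k| q.contains(*k)))
        .map(|e| e.id)
}
```

Because the check is pure string matching against declared metadata, it costs no LLM call at all.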
Phase 2: DESCEND
Goal: Navigate the tree to find relevant leaf nodes, within budget.
This is the core of the algorithm — an iterative beam search with budget awareness. At each level of the tree:
- Read child manifests — summaries, keywords, and token estimates
- Score candidates — the LLM rates each child's relevance (0.0 to 1.0)
- Select beam — keep the top-k most relevant (default beam width: 3)
- Check budget — verify the budget can afford the node before reading
- Read or degrade — full content if budget allows, summary if not
- Expand — if a node has children, add them to the candidate queue
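The score-prune-select step of a single descent level can be sketched as follows. In the real algorithm the scores come from an LLM call; here they are supplied directly, and `Candidate` is an illustrative type, not the actual schema.

```rust
/// Minimal candidate record for one descent level.
struct Candidate {
    id: &'static str,
    score: f64, // LLM-assigned relevance, 0.0 to 1.0
}

/// "Score -> prune -> select beam": drop candidates below the prune
/// threshold, then keep the top `beam_width` by score.
fn select_beam(
    mut cands: Vec<Candidate>,
    beam_width: usize,
    prune_threshold: f64,
) -> Vec<Candidate> {
    cands.retain(|c| c.score >= prune_threshold);
    cands.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
    cands.truncate(beam_width);
    cands
}
```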
Query: "How does authentication work?"
Budget remaining: 28,000 tokens
Level 0 - project/ children:
backend/ score: 0.95 -> beam
frontend/ score: 0.35 -> pruned
infra/ score: 0.20 -> pruned
Level 1 - backend/ children:
auth/ score: 0.98, 4,200 tokens -> beam
billing/ score: 0.12 -> pruned
data/ score: 0.30 -> pruned
Level 2 - auth/ is a leaf node
Budget check: 4,200 <= 25,400 -> AFFORDABLE
Read full content
Budget remaining: 21,200 tokens
Result: auth/ content (relevance: 0.98, 4,200 tokens)
This step is impossible on flat text. Before reading a leaf node, HBDS checks
can_afford(node) — comparing the node's declared token count against the
remaining budget. In flat text, there is no declared token count. The agent must
read the document to find out how large it is. Every read is a blind commitment.
Graceful Degradation
When the budget is insufficient for full content, HBDS doesn't skip the node.
It falls back to the node's ::summary block — a compact, high-signal
representation.
Node: infrastructure/deployment-guide/
Full content: 12,000 tokens
Summary: 180 tokens
Budget remaining: 3,500 tokens
Decision: 12,000 > 3,500 -> cannot afford
Fallback: read summary only (180 tokens)
Result stored as PARTIAL (relevance discounted 50%)
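The read-or-degrade decision above can be sketched as one function. The `Retrieval` enum and the exact 0.5 discount factor mirror the example; treat the shape as illustrative.

```rust
/// What the agent actually reads for a node, given the remaining budget.
enum Retrieval {
    Full { tokens: usize },
    Partial { tokens: usize }, // ::summary only
}

/// Full content if affordable, otherwise fall back to the summary and
/// discount the node's relevance by 50% to mark the result as PARTIAL.
fn read_or_degrade(
    content_tokens: usize,
    summary_tokens: usize,
    available: usize,
    relevance: f64,
) -> (Retrieval, f64) {
    if content_tokens <= available {
        (Retrieval::Full { tokens: content_tokens }, relevance)
    } else {
        (Retrieval::Partial { tokens: summary_tokens }, relevance * 0.5)
    }
}
```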
Phase 3: SYNTHESIZE
Goal: Combine all retrieved knowledge into coherent context.
Takes every result from DESCEND, sorts by relevance, and packs into the remaining budget. A single LLM call produces:
| Output | Purpose |
|---|---|
| Answer | Coherent synthesis of retrieved knowledge |
| Confidence | 0-1 rating based on source quality |
| Gaps | Aspects the sources don't cover |
| New insights | Deductions not in any source |
| Sources used | Node IDs for citation |
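The packing step before the synthesis call can be sketched as a greedy loop over the declared token counts. This is a sketch under the assumption that packing is greedy by relevance; `Retrieved` is an illustrative type.

```rust
struct Retrieved {
    id: &'static str,
    relevance: f64,
    tokens: usize, // declared cost of the retrieved content
}

/// Greedy packing for SYNTHESIZE: sort results by relevance, then include
/// each one whose declared token cost still fits in the remaining budget.
fn pack(mut results: Vec<Retrieved>, mut budget: usize) -> Vec<&'static str> {
    results.sort_by(|a, b| b.relevance.partial_cmp(&a.relevance).unwrap());
    let mut packed = Vec::new();
    for r in results {
        if r.tokens <= budget {
            budget -= r.tokens;
            packed.push(r.id);
        }
    }
    packed
}
```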
Phase 4: WRITE BACK
Goal: Persist new knowledge generated during the search.
The synthesis phase often produces new insights — deductions that emerge from combining multiple sources. HBDS writes these back to the knowledge tree.
New insight: "The auth system depends on a post-confirmation
trigger that creates the user's initial workspace. This
dependency is not documented in either doc."
Placement: project/backend/auth/ (append)
Trust score: 0.6 (agent-generated)
Source: "agent-write-back"
Over time, the knowledge tree self-improves through use. This is only possible because typed documents provide the structure to organize into.
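A write-back record matching the example above might look like this. The field names are illustrative, not a fixed schema; the 0.6 default and `"agent-write-back"` source come from the example.

```rust
/// An agent-generated insight queued for insertion into the tree.
struct WriteBack {
    target_path: String,
    insight: String,
    trust_score: f64,
    source: &'static str,
}

/// Agent-generated knowledge enters the tree at a deliberately low trust
/// score; later verification events raise it (see Trust Scores below).
fn write_back(target_path: &str, insight: &str) -> WriteBack {
    WriteBack {
        target_path: target_path.to_string(),
        insight: insight.to_string(),
        trust_score: 0.6, // agent-generated default
        source: "agent-write-back",
    }
}
```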
The Budget System
The entire context window is divided into zones:
Total: 200,000 tokens
System prompt [FIXED] 3,000
Task description [FIXED] 2,000
Search budget [FLEXIBLE] 29,000 (~15%)
    ORIENT reserve 1,000
    DESCEND reserve 25,000
    SYNTHESIZE reserve 3,000
Task work [FLEXIBLE] 156,200 (~78%)
Write-back [FLEXIBLE] 5,800 (~3%)
Output reserve [FIXED] 4,000
If search completes early, unused budget transfers to task work. Search never steals from the task zone.
struct BudgetTracker {
    total: usize,    // full allotment for this search
    consumed: usize, // tokens already spent
    reserved: usize, // held back for later phases (e.g. SYNTHESIZE)
}

impl BudgetTracker {
    fn available(&self) -> usize {
        self.total - self.consumed - self.reserved
    }

    // Pre-read check: a node's declared cost vs. what is left.
    fn can_afford(&self, node: &TreeNode) -> bool {
        node.content_tokens <= self.available()
    }

    fn can_afford_subtree(&self, node: &TreeNode) -> bool {
        node.subtree_token_estimate <= self.available()
    }

    // `label` identifies the phase or node for the budget report.
    fn consume(&mut self, n: usize, label: &str) {
        assert!(n <= self.available(), "Budget exceeded at {label}");
        self.consumed += n;
    }
}
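A minimal usage sketch, replaying the numbers from the DESCEND trace earlier (the definitions are repeated in reduced form so the snippet stands alone; `TreeNode` is shrunk to the one field used here, and the `label` parameter is omitted):

```rust
struct TreeNode {
    content_tokens: usize, // declared cost, read before any content
}

struct BudgetTracker {
    total: usize,
    consumed: usize,
    reserved: usize,
}

impl BudgetTracker {
    fn available(&self) -> usize {
        self.total - self.consumed - self.reserved
    }
    fn can_afford(&self, node: &TreeNode) -> bool {
        node.content_tokens <= self.available()
    }
    fn consume(&mut self, n: usize) {
        assert!(n <= self.available(), "Budget exceeded");
        self.consumed += n;
    }
}
```

With a 29,000-token search budget, 600 tokens spent in ORIENT, and 3,000 reserved for SYNTHESIZE, the tracker reports 25,400 available, affords the 4,200-token auth leaf, and lands at 21,200 after the read.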
Budget Report
Every HBDS search produces a full accounting report:
{
"totalBudget": 29000,
"consumed": {
"orient": { "readIndex": 180, "classify": 420, "total": 600 },
"descend": {
"depth0": { "score": 380, "expand": 150, "total": 530 },
"depth1": { "score": 350, "readLeaves": 4200, "total": 4550 }
},
"synthesize": { "total": 1800 },
"writeback": { "total": 450 },
"searchTotal": 7930
},
"unused": 21070,
"searchEfficiency": {
"nodesScored": 8,
"nodesRead": 1,
"tokensPerUsefulResult": 7930,
"searchFractionActual": 0.04
}
}
In this example, the search used only 4% of the total budget to find the right knowledge. The remaining 96% is available for the actual task.
The Knowledge Tree Schema
Every node in the HBDS knowledge tree:
{
"id": "proj.backend.auth",
"path": "/project/backend/auth",
"parentId": "proj.backend",
"type": "leaf",
"summary": "Auth setup using OPAQUE-KE with post-confirmation triggers.",
"keywords": ["auth", "opaque", "login", "jwt"],
"relevanceHints": [
"How does authentication work?",
"How do users sign up?"
],
"contentTokens": 4200,
"metadata": {
"version": 3,
"trustScore": 0.92,
"accessCount": 14,
"source": "manual"
}
}
| Field | Why It Exists | HBDS Phase |
|---|---|---|
| summary | Cheap proxy; enables graceful degradation | DESCEND fallback |
| keywords | Fast direct-hit matching without LLM scoring | ORIENT fast path |
| relevanceHints | Example queries this node answers | DESCEND scoring |
| contentTokens | Pre-read budget verification | DESCEND can_afford |
| trustScore | Quality signal — decays over time | DESCEND ranking |
| accessCount | Popularity signal for tree optimization | WRITE BACK |
Trust Scores
Trust is not static. It decays over time and is boosted by verification:
| Event | Effect |
|---|---|
| Time passes | Decays by 0.3%/day (half-life ~231 days) |
| Agent verifies (reads and confirms) | +0.05 |
| Human verifies | +0.15 |
| Cross-referenced by another node | +0.03 |
| Falls below 0.10 | Flagged for review or archived |
During search, low-trust nodes receive a penalty:
effectiveScore = rawScore * trustScore. Nodes past their TTL are excluded.
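Both rules can be sketched in a few lines, assuming the 0.3%/day decay is multiplicative (which matches the stated ~231-day half-life):

```rust
/// Multiplicative trust decay at 0.3% per day.
/// Half-life = ln 2 / -ln 0.997, roughly 231 days.
fn decayed_trust(trust: f64, days: f64) -> f64 {
    trust * 0.997_f64.powf(days)
}

/// Ranking penalty applied during DESCEND.
fn effective_score(raw_score: f64, trust: f64) -> f64 {
    raw_score * trust
}
```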
Configuration
| Parameter | Default | What It Controls |
|---|---|---|
| beamWidth | 3 | Branches explored per descent level |
| maxDepth | 5 | Maximum tree depth |
| pruneThreshold | 0.2 | Minimum score to keep a candidate |
| depthDecayFactor | 0.85 | Score penalty per descent level |
| searchFraction | 0.15 | Fraction of total budget for search |
| minimumUsefulBudget | 2000 | Stop search below this token count |
| enableWriteBack | true | Allow knowledge tree updates |
| enableDirectHit | true | Allow fast-path keyword matching |
For research-heavy tasks, increase searchFraction to 0.40. For execution-heavy
tasks where context is already in the prompt, reduce to 0.05.
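The table above maps directly onto a config type with defaults. This is a sketch; the snake_case field names are assumed translations of the parameter names, and `research_preset` is an illustrative helper for the 0.40 tuning mentioned above.

```rust
/// Tunable HBDS parameters, with the defaults from the table above.
struct HbdsConfig {
    beam_width: usize,
    max_depth: usize,
    prune_threshold: f64,
    depth_decay_factor: f64,
    search_fraction: f64,
    minimum_useful_budget: usize,
    enable_write_back: bool,
    enable_direct_hit: bool,
}

impl Default for HbdsConfig {
    fn default() -> Self {
        HbdsConfig {
            beam_width: 3,
            max_depth: 5,
            prune_threshold: 0.2,
            depth_decay_factor: 0.85,
            search_fraction: 0.15,
            minimum_useful_budget: 2_000,
            enable_write_back: true,
            enable_direct_hit: true,
        }
    }
}

/// Research-heavy preset: spend far more of the budget on search.
fn research_preset() -> HbdsConfig {
    HbdsConfig { search_fraction: 0.40, ..HbdsConfig::default() }
}
```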
Why Flat Text Cannot Support HBDS
| HBDS Capability | Required Property | Markdown? | SurfDoc? |
|---|---|---|---|
| Classify query against domain index | P2: Typed cross-references | No | Yes |
| Score candidate relevance | P1: Declared block types | No | Yes |
| Check budget before reading | P4: Pre-computed token costs | No | Yes |
| Fall back to summaries | P1 + P3: Types + priority | No | Yes |
| Type-aware synthesis | P1: Block types | No | Yes |
| Place write-back knowledge | P2: Typed references | No | Yes |
You cannot retrofit these properties onto Markdown without making it something other than Markdown. The properties are not presentation — they are structure. The format IS the architecture.
Storage Options
| Option | Scale | Best For |
|---|---|---|
| File-based (Markdown tree) | < 1,000 nodes | Getting started, Git-friendly |
| Hybrid (files + SQLite index) | 1,000-10,000 nodes | Single-user production |
| Database (PostgreSQL) | 10,000+ nodes | Multi-user teams |
Start file-based. Migrate to hybrid when keyword search gets slow. Use database for team-shared knowledge stores.
Provider Agnosticism
The algorithm is universal. The LLM provider is swappable.
| Component | Universal? |
|---|---|
| Knowledge tree schema | Yes |
| HBDS algorithm logic | Yes |
| Budget tracking | Yes |
| Token counting | Partially — tokenizer varies |
| LLM calls | Prompt format is universal; API format varies |
| Context window size | Provider-specific |
HBDS auto-adapts: larger windows allow wider beams. Models with windows under 32K enter summary-only mode.
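One way that adaptation could look, as a sketch: the 32K summary-only cutoff is from the text, but the specific beam-width thresholds here are illustrative assumptions.

```rust
/// Map a provider's context window to (beam width, summary-only mode).
/// Thresholds other than the 32K cutoff are illustrative.
fn adapt(window_tokens: usize) -> (usize, bool) {
    let summary_only = window_tokens < 32_000;
    let beam_width = match window_tokens {
        0..=31_999 => 1,        // tiny window: narrow beam, summaries only
        32_000..=127_999 => 3,  // default beam
        _ => 5,                 // large window: explore wider
    };
    (beam_width, summary_only)
}
```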
Summary
HBDS is a search algorithm for a world where:
- Context windows are finite and always will be
- Knowledge worth querying is effectively infinite
- Flat text provides no structure for algorithms to navigate
- Typed documents declare the structure algorithms need
The algorithm navigates typed knowledge hierarchies under strict token budgets, uses beam search with pre-read budget verification, degrades gracefully to summaries, and self-improves through write-back.
The format is the architecture. The algorithm is the proof.
See also: Typed Structure Over Flat Text — why AI agents need traversable documents, and the five failure modes of flat text that typed formats eliminate.