HBDS: How AI Agents Search Typed Documents
HBDS (Hierarchical Budget-Constrained Descent Search) is a search algorithm that navigates typed knowledge hierarchies under strict token budgets. It works in four phases: ORIENT (classify the query), DESCEND (beam search through the tree), SYNTHESIZE (combine results), and WRITE BACK (persist new knowledge). HBDS requires typed documents — it cannot operate on flat text.
Every AI agent has a context window — a fixed number of tokens it can work with. The knowledge it needs is usually much larger than that window. HBDS is the algorithm that decides what to read within that budget. It navigates a tree of typed documents, scoring relevance at each level, checking costs before reading, and falling back to summaries when the budget is tight.
The Problem HBDS Solves
An AI agent working on your codebase has access to hundreds of documents. Its context window might be 200,000 tokens. The total documentation might be 2,000,000 tokens.
The agent cannot read everything. It must choose. Current approaches:
| Approach | How It Works | The Problem |
|---|---|---|
| RAG | Chunk docs, embed, retrieve top-k similar chunks | Flat — no hierarchy, arbitrary boundaries, no budget control |
| Agentic tool use | LLM decides what to grep/read based on reasoning | No formal budget tracking, no guaranteed coverage |
| Graph RAG | Extract entities from flat text, build knowledge graph | Structure is inferred — inference errors propagate |
| RAPTOR | Cluster chunks, summarize into tree, search top-down | Tree is auto-generated from flat text, no typed blocks |
All of these either operate on flat structure or infer structure from flat text at retrieval time. None use structure declared in the document format itself. HBDS does.
The Four Phases
HBDS works in four sequential phases. Each phase consumes tokens from a monotonically decreasing budget. The context window IS the execution budget.
Phase 1: ORIENT
Goal: Classify the query and select where to start searching.
The agent reads a lightweight root index — a table of contents for the entire knowledge tree. One LLM call classifies the query and picks the most relevant top-level domains.
Query: "How does authentication work in this app?"
Root Index:
project/ (backend, frontend, infra — 450K tokens)
general/ (TypeScript, AWS, patterns — 800K tokens)
agent-memories/ (session logs — 200K tokens)
Classification: project_local
Selected domain: project/backend/
Confidence: 0.92
Cost: 600 tokens
If the root index contains a keyword match with >0.9 confidence, HBDS skips the entire descent and reads the matching leaf directly. This handles queries like "What is the Stripe price ID?" in under a second.
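The fast path can be sketched as a simple keyword scan over the root index. This is a minimal sketch, not the real implementation: `IndexEntry` and the "any exact keyword match counts as high confidence" rule are illustrative assumptions.

```rust
/// Hypothetical root-index entry used by the ORIENT fast path.
struct IndexEntry {
    id: &'static str,
    keywords: &'static [&'static str],
}

/// Direct-hit check: if the query contains one of a node's declared
/// keywords, skip the descent and jump straight to that node.
/// (Sketch: an exact keyword match stands in for >0.9 confidence.)
fn direct_hit(query: &str, index: &[IndexEntry]) -> Option<&'static str> {
    let q = query.to_lowercase();
    index
        .iter()
        .find(|e| e.keywords.iter().any(|k| q.contains(*k)))
        .map(|e| e.id)
}
```

Because the check is pure string matching against declared metadata, it costs no LLM call at all.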
Phase 2: DESCEND
Goal: Navigate the tree to find relevant leaf nodes, within budget.
This is the core of the algorithm — an iterative beam search with budget awareness. At each level of the tree:
- Read child manifests — summaries, keywords, and token estimates
- Score candidates — the LLM rates each child's relevance (0.0 to 1.0)
- Select beam — keep the top-k most relevant (default beam width: 3)
- Check budget — verify the budget can afford the node before reading
- Read or degrade — full content if budget allows, summary if not
- Expand — if a node has children, add them to the candidate queue
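The score-prune-select step of a single descent level can be sketched as follows. In the real algorithm the scores come from an LLM call; here they are supplied directly, and `Candidate` is an illustrative type, not the actual schema.

```rust
/// Minimal candidate record for one descent level.
struct Candidate {
    id: &'static str,
    score: f64, // LLM-assigned relevance, 0.0 to 1.0
}

/// "Score -> prune -> select beam": drop candidates below the prune
/// threshold, then keep the top `beam_width` by score.
fn select_beam(
    mut cands: Vec<Candidate>,
    beam_width: usize,
    prune_threshold: f64,
) -> Vec<Candidate> {
    cands.retain(|c| c.score >= prune_threshold);
    cands.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
    cands.truncate(beam_width);
    cands
}
```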
Query: "How does authentication work?"
Budget remaining: 28,000 tokens
Level 0 - project/ children:
backend/ score: 0.95 -> beam
frontend/ score: 0.35 -> pruned
infra/ score: 0.20 -> pruned
Level 1 - backend/ children:
auth/ score: 0.98, 4,200 tokens -> beam
billing/ score: 0.12 -> pruned
data/ score: 0.30 -> pruned
Level 2 - auth/ is a leaf node
Budget check: 4,200 <= 25,400 -> AFFORDABLE
Read full content
Budget remaining: 21,200 tokens
Result: auth/ content (relevance: 0.98, 4,200 tokens)
This step is impossible on flat text. Before reading a leaf node, HBDS checks
can_afford(node) — comparing the node's declared token count against the
remaining budget. In flat text, there is no declared token count. The agent must
read the document to find out how large it is. Every read is a blind commitment.
Graceful Degradation
When the budget is insufficient for full content, HBDS doesn't skip the node.
It falls back to the node's ::summary block — a compact, high-signal
representation.
Node: infrastructure/deployment-guide/
Full content: 12,000 tokens
Summary: 180 tokens
Budget remaining: 3,500 tokens
Decision: 12,000 > 3,500 -> cannot afford
Fallback: read summary only (180 tokens)
Result stored as PARTIAL (relevance discounted 50%)
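The read-or-degrade decision above can be sketched as one function. The `Retrieval` enum and the exact 0.5 discount factor mirror the example; treat the shape as illustrative.

```rust
/// What the agent actually reads for a node, given the remaining budget.
enum Retrieval {
    Full { tokens: usize },
    Partial { tokens: usize }, // ::summary only
}

/// Full content if affordable, otherwise fall back to the summary and
/// discount the node's relevance by 50% to mark the result as PARTIAL.
fn read_or_degrade(
    content_tokens: usize,
    summary_tokens: usize,
    available: usize,
    relevance: f64,
) -> (Retrieval, f64) {
    if content_tokens <= available {
        (Retrieval::Full { tokens: content_tokens }, relevance)
    } else {
        (Retrieval::Partial { tokens: summary_tokens }, relevance * 0.5)
    }
}
```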
Phase 3: SYNTHESIZE
Goal: Combine all retrieved knowledge into coherent context.
Takes every result from DESCEND, sorts by relevance, and packs into the remaining budget. A single LLM call produces:
| Output | Purpose |
|---|---|
| Answer | Coherent synthesis of retrieved knowledge |
| Confidence | 0-1 rating based on source quality |
| Gaps | Aspects the sources don't cover |
| New insights | Deductions not in any source |
| Sources used | Node IDs for citation |
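The packing step before the synthesis call can be sketched as a greedy loop over the declared token counts. This is a sketch under the assumption that packing is greedy by relevance; `Retrieved` is an illustrative type.

```rust
struct Retrieved {
    id: &'static str,
    relevance: f64,
    tokens: usize, // declared cost of the retrieved content
}

/// Greedy packing for SYNTHESIZE: sort results by relevance, then include
/// each one whose declared token cost still fits in the remaining budget.
fn pack(mut results: Vec<Retrieved>, mut budget: usize) -> Vec<&'static str> {
    results.sort_by(|a, b| b.relevance.partial_cmp(&a.relevance).unwrap());
    let mut packed = Vec::new();
    for r in results {
        if r.tokens <= budget {
            budget -= r.tokens;
            packed.push(r.id);
        }
    }
    packed
}
```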
Phase 4: WRITE BACK
Goal: Persist new knowledge generated during the search.
The synthesis phase often produces new insights — deductions that emerge from combining multiple sources. HBDS writes these back to the knowledge tree.
New insight: "The auth system depends on a post-confirmation
trigger that creates the user's initial workspace. This
dependency is not documented in either doc."
Placement: project/backend/auth/ (append)
Trust score: 0.6 (agent-generated)
Source: "agent-write-back"
Over time, the knowledge tree self-improves through use. This is only possible because typed documents provide the structure to organize into.
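A write-back record matching the example above might look like this. The field names are illustrative, not a fixed schema; the 0.6 default and `"agent-write-back"` source come from the example.

```rust
/// An agent-generated insight queued for insertion into the tree.
struct WriteBack {
    target_path: String,
    insight: String,
    trust_score: f64,
    source: &'static str,
}

/// Agent-generated knowledge enters the tree at a deliberately low trust
/// score; later verification events raise it (see Trust Scores below).
fn write_back(target_path: &str, insight: &str) -> WriteBack {
    WriteBack {
        target_path: target_path.to_string(),
        insight: insight.to_string(),
        trust_score: 0.6, // agent-generated default
        source: "agent-write-back",
    }
}
```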
The Budget System
The entire context window is divided into zones:
Total: 200,000 tokens
System prompt [FIXED] 3,000
Task description [FIXED] 2,000
Search budget [FLEXIBLE] 29,000 (~15%)
    ORIENT reserve 1,000
    DESCEND reserve 25,000
    SYNTHESIZE reserve 3,000
Task work [FLEXIBLE] 156,200 (~78%)
Write-back [FLEXIBLE] 5,800 (~3%)
Output reserve [FIXED] 4,000
If search completes early, unused budget transfers to task work. Search never steals from the task zone.
struct BudgetTracker {
    total: usize,    // full allotment for this search
    consumed: usize, // tokens already spent
    reserved: usize, // held back for later phases (e.g. SYNTHESIZE)
}

impl BudgetTracker {
    fn available(&self) -> usize {
        self.total - self.consumed - self.reserved
    }

    // Pre-read check: a node's declared cost vs. what is left.
    fn can_afford(&self, node: &TreeNode) -> bool {
        node.content_tokens <= self.available()
    }

    fn can_afford_subtree(&self, node: &TreeNode) -> bool {
        node.subtree_token_estimate <= self.available()
    }

    // `label` identifies the phase or node for the budget report.
    fn consume(&mut self, n: usize, label: &str) {
        assert!(n <= self.available(), "Budget exceeded at {label}");
        self.consumed += n;
    }
}
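A minimal usage sketch, replaying the numbers from the DESCEND trace earlier (the definitions are repeated in reduced form so the snippet stands alone; `TreeNode` is shrunk to the one field used here, and the `label` parameter is omitted):

```rust
struct TreeNode {
    content_tokens: usize, // declared cost, read before any content
}

struct BudgetTracker {
    total: usize,
    consumed: usize,
    reserved: usize,
}

impl BudgetTracker {
    fn available(&self) -> usize {
        self.total - self.consumed - self.reserved
    }
    fn can_afford(&self, node: &TreeNode) -> bool {
        node.content_tokens <= self.available()
    }
    fn consume(&mut self, n: usize) {
        assert!(n <= self.available(), "Budget exceeded");
        self.consumed += n;
    }
}
```

With a 29,000-token search budget, 600 tokens spent in ORIENT, and 3,000 reserved for SYNTHESIZE, the tracker reports 25,400 available, affords the 4,200-token auth leaf, and lands at 21,200 after the read.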
Budget Report
Every HBDS search produces a full accounting report:
{
"totalBudget": 29000,
"consumed": {
"orient": { "readIndex": 180, "classify": 420, "total": 600 },
"descend": {
"depth0": { "score": 380, "expand": 150, "total": 530 },
"depth1": { "score": 350, "readLeaves": 4200, "total": 4550 }
},
"synthesize": { "total": 1800 },
"writeback": { "total": 450 },
"searchTotal": 7930
},
"unused": 21070,
"searchEfficiency": {
"nodesScored": 8,
"nodesRead": 1,
"tokensPerUsefulResult": 7930,
"searchFractionActual": 0.04
}
}
In this example, the search used only 4% of the total budget to find the right knowledge. The remaining 96% is available for the actual task.
The Knowledge Tree Schema
Every node in the HBDS knowledge tree:
{
"id": "proj.backend.auth",
"path": "/project/backend/auth",
"parentId": "proj.backend",
"type": "leaf",
"summary": "Auth setup using OPAQUE-KE with post-confirmation triggers.",
"keywords": ["auth", "opaque", "login", "jwt"],
"relevanceHints": [
"How does authentication work?",
"How do users sign up?"
],
"contentTokens": 4200,
"metadata": {
"version": 3,
"trustScore": 0.92,
"accessCount": 14,
"source": "manual"
}
}
| Field | Why It Exists | HBDS Phase |
|---|---|---|
| summary | Cheap proxy; enables graceful degradation | DESCEND fallback |
| keywords | Fast direct-hit matching without LLM scoring | ORIENT fast path |
| relevanceHints | Example queries this node answers | DESCEND scoring |
| contentTokens | Pre-read budget verification | DESCEND can_afford |
| trustScore | Quality signal — decays over time | DESCEND ranking |
| accessCount | Popularity signal for tree optimization | WRITE BACK |
Trust Scores
Trust is not static. It decays over time and is boosted by verification:
| Event | Effect |
|---|---|
| Time passes | Decays by 0.3%/day (half-life ~231 days) |
| Agent verifies (reads and confirms) | +0.05 |
| Human verifies | +0.15 |
| Cross-referenced by another node | +0.03 |
| Falls below 0.10 | Flagged for review or archived |
During search, low-trust nodes receive a penalty:
effectiveScore = rawScore * trustScore. Nodes past their TTL are excluded.
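Both rules can be sketched in a few lines, assuming the 0.3%/day decay is multiplicative (which matches the stated ~231-day half-life):

```rust
/// Multiplicative trust decay at 0.3% per day.
/// Half-life = ln 2 / -ln 0.997, roughly 231 days.
fn decayed_trust(trust: f64, days: f64) -> f64 {
    trust * 0.997_f64.powf(days)
}

/// Ranking penalty applied during DESCEND.
fn effective_score(raw_score: f64, trust: f64) -> f64 {
    raw_score * trust
}
```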
Configuration
| Parameter | Default | What It Controls |
|---|---|---|
| beamWidth | 3 | Branches explored per descent level |
| maxDepth | 5 | Maximum tree depth |
| pruneThreshold | 0.2 | Minimum score to keep a candidate |
| depthDecayFactor | 0.85 | Score penalty per descent level |
| searchFraction | 0.15 | Fraction of total budget for search |
| minimumUsefulBudget | 2000 | Stop search below this token count |
| enableWriteBack | true | Allow knowledge tree updates |
| enableDirectHit | true | Allow fast-path keyword matching |
For research-heavy tasks, increase searchFraction to 0.40. For execution-heavy
tasks where context is already in the prompt, reduce to 0.05.
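The table above maps directly onto a config type with defaults. This is a sketch; the snake_case field names are assumed translations of the parameter names, and `research_preset` is an illustrative helper for the 0.40 tuning mentioned above.

```rust
/// Tunable HBDS parameters, with the defaults from the table above.
struct HbdsConfig {
    beam_width: usize,
    max_depth: usize,
    prune_threshold: f64,
    depth_decay_factor: f64,
    search_fraction: f64,
    minimum_useful_budget: usize,
    enable_write_back: bool,
    enable_direct_hit: bool,
}

impl Default for HbdsConfig {
    fn default() -> Self {
        HbdsConfig {
            beam_width: 3,
            max_depth: 5,
            prune_threshold: 0.2,
            depth_decay_factor: 0.85,
            search_fraction: 0.15,
            minimum_useful_budget: 2_000,
            enable_write_back: true,
            enable_direct_hit: true,
        }
    }
}

/// Research-heavy preset: spend far more of the budget on search.
fn research_preset() -> HbdsConfig {
    HbdsConfig { search_fraction: 0.40, ..HbdsConfig::default() }
}
```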
Why Flat Text Cannot Support HBDS
| HBDS Capability | Required Property | Markdown? | SurfDoc? |
|---|---|---|---|
| Classify query against domain index | P2: Typed cross-references | No | Yes |
| Score candidate relevance | P1: Declared block types | No | Yes |
| Check budget before reading | P4: Pre-computed token costs | No | Yes |
| Fall back to summaries | P1 + P3: Types + priority | No | Yes |
| Type-aware synthesis | P1: Block types | No | Yes |
| Place write-back knowledge | P2: Typed references | No | Yes |
You cannot retrofit these properties onto Markdown without making it something other than Markdown. The properties are not presentation — they are structure. The format IS the architecture.
Storage Options
| Option | Scale | Best For |
|---|---|---|
| File-based (Markdown tree) | < 1,000 nodes | Getting started, Git-friendly |
| Hybrid (files + SQLite index) | 1,000-10,000 nodes | Single-user production |
| Database (PostgreSQL) | 10,000+ nodes | Multi-user teams |
Start file-based. Migrate to hybrid when keyword search gets slow. Use database for team-shared knowledge stores.
Provider Agnosticism
The algorithm is universal. The LLM provider is swappable.
| Component | Universal? |
|---|---|
| Knowledge tree schema | Yes |
| HBDS algorithm logic | Yes |
| Budget tracking | Yes |
| Token counting | Partially — tokenizer varies |
| LLM calls | Prompt format is universal; API format varies |
| Context window size | Provider-specific |
HBDS auto-adapts: larger windows allow wider beams. Models with windows under 32K enter summary-only mode.
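One way that adaptation could look, as a sketch: the 32K summary-only cutoff is from the text, but the specific beam-width thresholds here are illustrative assumptions.

```rust
/// Map a provider's context window to (beam width, summary-only mode).
/// Thresholds other than the 32K cutoff are illustrative.
fn adapt(window_tokens: usize) -> (usize, bool) {
    let summary_only = window_tokens < 32_000;
    let beam_width = match window_tokens {
        0..=31_999 => 1,        // tiny window: narrow beam, summaries only
        32_000..=127_999 => 3,  // default beam
        _ => 5,                 // large window: explore wider
    };
    (beam_width, summary_only)
}
```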
Summary
HBDS is a search algorithm for a world where:
- Context windows are finite and always will be
- Knowledge worth querying is effectively infinite
- Flat text provides no structure for algorithms to navigate
- Typed documents declare the structure algorithms need
The algorithm navigates typed knowledge hierarchies under strict token budgets, uses beam search with pre-read budget verification, degrades gracefully to summaries, and self-improves through write-back.
The format is the architecture. The algorithm is the proof.
See also: Typed Structure Over Flat Text — why AI agents need traversable documents, and the five failure modes of flat text that typed formats eliminate.