HBDS: How AI Agents Search Typed Documents

HBDS (Hierarchical Budget-Constrained Descent Search) is a search algorithm that navigates typed knowledge hierarchies under strict token budgets. It works in four phases: ORIENT (classify the query), DESCEND (beam search through the tree), SYNTHESIZE (combine results), and WRITE BACK (persist new knowledge). HBDS requires typed documents — it cannot operate on flat text.

Info: What is HBDS for?

Every AI agent has a context window — a fixed number of tokens it can work with. The knowledge it needs is usually much larger than that window. HBDS is the algorithm that decides what to read within that budget. It navigates a tree of typed documents, scoring relevance at each level, checking costs before reading, and falling back to summaries when the budget is tight.


The Problem HBDS Solves

An AI agent working on your codebase has access to hundreds of documents. Its context window might be 200,000 tokens. The total documentation might be 2,000,000 tokens.

The agent cannot read everything. It must choose. Current approaches:

| Approach | How It Works | The Problem |
| --- | --- | --- |
| RAG | Chunk docs, embed, retrieve top-k similar chunks | Flat — no hierarchy, arbitrary boundaries, no budget control |
| Agentic tool use | LLM decides what to grep/read based on reasoning | No formal budget tracking, no guaranteed coverage |
| Graph RAG | Extract entities from flat text, build knowledge graph | Structure is inferred — inference errors propagate |
| RAPTOR | Cluster chunks, summarize into tree, search top-down | Tree is auto-generated from flat text, no typed blocks |

All of these either operate on flat structure or infer structure from flat text at retrieval time. None use structure declared in the document format itself. HBDS does.


The Four Phases

HBDS works in four sequential phases. Each phase consumes tokens from a monotonically decreasing budget. The context window IS the execution budget.

Phase 1: ORIENT

Goal: Classify the query and select where to start searching.

The agent reads a lightweight root index — a table of contents for the entire knowledge tree. One LLM call classifies the query and picks the most relevant top-level domains.

Query: "How does authentication work in this app?"

Root Index:
  project/          (backend, frontend, infra — 450K tokens)
  general/          (TypeScript, AWS, patterns — 800K tokens)
  agent-memories/   (session logs — 200K tokens)

Classification: project_local
Selected domain: project/backend/
Confidence: 0.92
Cost: 600 tokens
Tip: Fast path

If the root index contains a keyword match with >0.9 confidence, HBDS skips the entire descent and reads the matching leaf directly. This handles queries like "What is the Stripe price ID?" in under a second.
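As a minimal sketch, the fast path amounts to a keyword-confidence check against the index. The types, the scoring rule, and the example leaf path below are illustrative assumptions, not the reference implementation.

```rust
/// One entry in the lightweight root index (illustrative fields only).
struct IndexEntry {
    keywords: Vec<&'static str>,
    leaf_path: &'static str,
}

/// Returns a leaf to read directly if some entry's keywords match the
/// query with confidence above the 0.9 threshold; otherwise the full
/// ORIENT -> DESCEND pipeline runs.
fn direct_hit<'a>(query: &str, index: &'a [IndexEntry]) -> Option<&'a str> {
    let q = query.to_lowercase();
    index
        .iter()
        .map(|e| {
            // Naive confidence: fraction of the entry's keywords present.
            let hits = e.keywords.iter().filter(|k| q.contains(*k)).count();
            (hits as f64 / e.keywords.len() as f64, e.leaf_path)
        })
        .filter(|(conf, _)| *conf > 0.9)
        .max_by(|a, b| a.0.partial_cmp(&b.0).unwrap())
        .map(|(_, path)| path)
}
```

No LLM call is needed on this path, which is why it answers lookup-style queries in under a second.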

Phase 2: DESCEND

Goal: Navigate the tree to find relevant leaf nodes, within budget.

This is the core of the algorithm — an iterative beam search with budget awareness. At each level of the tree:

  1. Read child manifests — summaries, keywords, and token estimates
  2. Score candidates — the LLM rates each child's relevance (0.0 to 1.0)
  3. Select beam — keep the top-k most relevant (default beam width: 3)
  4. Check budget — verify the budget can afford the node before reading
  5. Read or degrade — full content if budget allows, summary if not
  6. Expand — if a node has children, add them to the candidate queue

Query: "How does authentication work?"
Budget remaining: 28,000 tokens

Level 0 - project/ children:
  backend/     score: 0.95  -> beam
  frontend/    score: 0.35  -> pruned
  infra/       score: 0.20  -> pruned

Level 1 - backend/ children:
  auth/        score: 0.98, 4,200 tokens  -> beam
  billing/     score: 0.12  -> pruned
  data/        score: 0.30  -> pruned

Level 2 - auth/ is a leaf node
  Budget check: 4,200 <= 25,400  -> AFFORDABLE
  Read full content
  Budget remaining: 21,200 tokens

Result: auth/ content (relevance: 0.98, 4,200 tokens)
Warning: Pre-read budget verification

This step is impossible on flat text. Before reading a leaf node, HBDS checks can_afford(node) — comparing the node's declared token count against the remaining budget. In flat text, there is no declared token count. The agent must read the document to find out how large it is. Every read is a blind commitment.
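One level of the descent loop can be sketched as beam selection followed by the pre-read budget check. The names and types here are assumptions for illustration, not the reference implementation.

```rust
/// A scored child node, as seen through its manifest (illustrative).
struct Candidate {
    path: String,
    score: f64,          // LLM relevance rating, 0.0..=1.0
    content_tokens: usize,
}

/// Keep the top `beam_width` candidates above `prune_threshold`, then
/// split the beam into affordable full reads and summary fallbacks.
fn descend_level(
    mut children: Vec<Candidate>,
    beam_width: usize,
    prune_threshold: f64,
    budget_remaining: usize,
) -> (Vec<Candidate>, Vec<Candidate>) {
    children.retain(|c| c.score >= prune_threshold);
    children.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
    children.truncate(beam_width);
    // Pre-read budget verification: the declared token count decides
    // whether a node gets a full read or a summary fallback.
    children
        .into_iter()
        .partition(|c| c.content_tokens <= budget_remaining)
}
```

The key property is that the partition happens before any content is read: the declared `content_tokens` makes every read a known cost instead of a blind commitment.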

Graceful Degradation

When the budget is insufficient for full content, HBDS doesn't skip the node. It falls back to the node's ::summary block — a compact, high-signal representation.

Node: infrastructure/deployment-guide/
Full content: 12,000 tokens
Summary: 180 tokens
Budget remaining: 3,500 tokens

Decision: 12,000 > 3,500  -> cannot afford
Fallback: read summary only (180 tokens)
Result stored as PARTIAL (relevance discounted 50%)
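The degradation decision itself is a three-way branch; this sketch uses illustrative names and the 50% relevance discount from the example above.

```rust
/// Outcome of attempting to read a node under a budget (illustrative).
enum Retrieved {
    Full { tokens: usize, relevance: f64 },
    /// Summary-only read; relevance is discounted 50%.
    Partial { tokens: usize, relevance: f64 },
}

fn read_or_degrade(
    content_tokens: usize,
    summary_tokens: usize,
    relevance: f64,
    budget_remaining: usize,
) -> Option<Retrieved> {
    if content_tokens <= budget_remaining {
        Some(Retrieved::Full { tokens: content_tokens, relevance })
    } else if summary_tokens <= budget_remaining {
        // Fall back to the node's ::summary block.
        Some(Retrieved::Partial { tokens: summary_tokens, relevance: relevance * 0.5 })
    } else {
        None // cannot afford even the summary
    }
}
```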

Phase 3: SYNTHESIZE

Goal: Combine all retrieved knowledge into coherent context.

This phase takes every result from DESCEND, sorts by relevance, and packs the best results into the remaining budget. A single LLM call produces:

| Output | Purpose |
| --- | --- |
| Answer | Coherent synthesis of retrieved knowledge |
| Confidence | 0-1 rating based on source quality |
| Gaps | Aspects the sources don't cover |
| New insights | Deductions not in any source |
| Sources used | Node IDs for citation |
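The packing step is a greedy fill in relevance order; this sketch is illustrative and the names are assumptions.

```rust
/// A retrieved result waiting to be synthesized (illustrative fields).
struct SearchResult {
    node_id: String,
    relevance: f64,
    tokens: usize,
}

/// Sort by relevance, then greedily keep every result that still fits
/// in the remaining synthesis budget.
fn pack_for_synthesis(mut results: Vec<SearchResult>, budget: usize) -> Vec<SearchResult> {
    results.sort_by(|a, b| b.relevance.partial_cmp(&a.relevance).unwrap());
    let mut used = 0;
    results
        .into_iter()
        .filter(|r| {
            if used + r.tokens <= budget {
                used += r.tokens;
                true
            } else {
                false
            }
        })
        .collect()
}
```

A greedy fill is good enough here because relevance, not token count, is the ranking signal; a lower-relevance result never displaces a higher one.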

Phase 4: WRITE BACK

Goal: Persist new knowledge generated during the search.

The synthesis phase often produces new insights — deductions that emerge from combining multiple sources. HBDS writes these back to the knowledge tree.

New insight: "The auth system depends on a post-confirmation
trigger that creates the user's initial workspace. This
dependency is not documented in either doc."

Placement: project/backend/auth/ (append)
Trust score: 0.6 (agent-generated)
Source: "agent-write-back"

Over time, the knowledge tree self-improves through use. This is only possible because typed documents provide the structure to organize into.
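A write-back record might be modeled as below; the field names mirror the example values above and are illustrative, not a prescribed schema.

```rust
/// A new insight queued for persistence into the knowledge tree
/// (illustrative sketch).
struct WriteBack {
    target_path: String, // node to append to, e.g. "project/backend/auth/"
    insight: String,
    trust_score: f64,    // agent-generated knowledge starts low
    source: String,
}

fn new_insight(target_path: &str, insight: &str) -> WriteBack {
    WriteBack {
        target_path: target_path.to_string(),
        insight: insight.to_string(),
        trust_score: 0.6, // default trust for agent-generated content
        source: "agent-write-back".to_string(),
    }
}
```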


The Budget System

The entire context window is divided into zones:

Total: 200,000 tokens
  System prompt         [FIXED]       3,000
  Task description      [FIXED]       2,000
  Search budget         [FLEXIBLE]    29,000  (~15%)
    ORIENT reserve                    1,000
    DESCEND reserve                   25,000
    SYNTHESIZE reserve                3,000
  Task work             [FLEXIBLE]    156,000 (78%)
  Write-back            [FLEXIBLE]    6,000   (3%)
  Output reserve        [FIXED]       4,000

If search completes early, unused budget transfers to task work. Search never steals from the task zone.

struct BudgetTracker {
    total: usize,
    consumed: usize,
    reserved: usize,
}

impl BudgetTracker {
    fn available(&self) -> usize {
        self.total - self.consumed - self.reserved
    }

    fn can_afford(&self, node: &TreeNode) -> bool {
        node.content_tokens <= self.available()
    }

    fn can_afford_subtree(&self, node: &TreeNode) -> bool {
        node.subtree_token_estimate <= self.available()
    }

    fn consume(&mut self, n: usize, label: &str) {
        assert!(n <= self.available(), "Budget exceeded at {label}");
        self.consumed += n;
    }
}

Budget Report

Every HBDS search produces a full accounting report:

{
  "totalBudget": 29000,
  "consumed": {
    "orient": { "readIndex": 180, "classify": 420, "total": 600 },
    "descend": {
      "depth0": { "score": 380, "expand": 150, "total": 530 },
      "depth1": { "score": 350, "readLeaves": 4200, "total": 4550 }
    },
    "synthesize": { "total": 1800 },
    "writeback": { "total": 450 },
    "searchTotal": 7930
  },
  "unused": 21070,
  "searchEfficiency": {
    "nodesScored": 8,
    "nodesRead": 1,
    "tokensPerUsefulResult": 7930,
    "searchFractionActual": 0.04
  }
}

In this example, the search used 7,930 tokens — only 4% of the full 200K context window — to find the right knowledge. Everything unused rolls over to the actual task.


The Knowledge Tree Schema

Every node in the HBDS knowledge tree declares a manifest like this:

{
  "id": "proj.backend.auth",
  "path": "/project/backend/auth",
  "parentId": "proj.backend",
  "type": "leaf",
  "summary": "Auth setup using OPAQUE-KE with post-confirmation triggers.",
  "keywords": ["auth", "opaque", "login", "jwt"],
  "relevanceHints": [
    "How does authentication work?",
    "How do users sign up?"
  ],
  "contentTokens": 4200,
  "metadata": {
    "version": 3,
    "trustScore": 0.92,
    "accessCount": 14,
    "source": "manual"
  }
}

| Field | Why It Exists | HBDS Phase |
| --- | --- | --- |
| summary | Cheap proxy; enables graceful degradation | DESCEND fallback |
| keywords | Fast direct-hit matching without LLM scoring | ORIENT fast path |
| relevanceHints | Example queries this node answers | DESCEND scoring |
| contentTokens | Pre-read budget verification | DESCEND can_afford |
| trustScore | Quality signal — decays over time | DESCEND ranking |
| accessCount | Popularity signal for tree optimization | WRITE BACK |

Trust Scores

Trust is not static. It decays over time and is boosted by verification:

| Event | Effect |
| --- | --- |
| Time passes | Decays by 0.3%/day (half-life ~231 days) |
| Agent verifies (reads and confirms) | +0.05 |
| Human verifies | +0.15 |
| Cross-referenced by another node | +0.03 |
| Falls below 0.10 | Flagged for review or archived |

During search, low-trust nodes receive a penalty: effectiveScore = rawScore * trustScore. Nodes past their TTL are excluded.
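The decay and penalty rules can be expressed directly; the constant follows from the 0.3%/day figure in the table, and the function names are illustrative.

```rust
/// Multiplicative daily trust decay: 0.3% per day, which gives a
/// half-life of roughly 231 days.
const DAILY_DECAY: f64 = 0.997;

fn decayed_trust(initial_trust: f64, days_since_verified: f64) -> f64 {
    initial_trust * DAILY_DECAY.powf(days_since_verified)
}

/// During DESCEND, low-trust nodes are penalized proportionally.
fn effective_score(raw_score: f64, trust: f64) -> f64 {
    raw_score * trust
}
```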


Configuration

| Parameter | Default | What It Controls |
| --- | --- | --- |
| beamWidth | 3 | Branches explored per descent level |
| maxDepth | 5 | Maximum tree depth |
| pruneThreshold | 0.2 | Minimum score to keep a candidate |
| depthDecayFactor | 0.85 | Score penalty per descent level |
| searchFraction | 0.15 | Fraction of total budget for search |
| minimumUsefulBudget | 2000 | Stop search below this token count |
| enableWriteBack | true | Allow knowledge tree updates |
| enableDirectHit | true | Allow fast-path keyword matching |

For research-heavy tasks, increase searchFraction to 0.40. For execution-heavy tasks where context is already in the prompt, reduce to 0.05.
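As a sketch, the defaults above map onto a configuration struct (field names adapted to Rust conventions; this is illustrative, not the reference implementation).

```rust
/// HBDS tuning knobs, mirroring the defaults in the table above.
struct HbdsConfig {
    beam_width: usize,
    max_depth: usize,
    prune_threshold: f64,
    depth_decay_factor: f64,
    search_fraction: f64,
    minimum_useful_budget: usize,
    enable_write_back: bool,
    enable_direct_hit: bool,
}

impl Default for HbdsConfig {
    fn default() -> Self {
        Self {
            beam_width: 3,
            max_depth: 5,
            prune_threshold: 0.2,
            depth_decay_factor: 0.85,
            search_fraction: 0.15,
            minimum_useful_budget: 2_000,
            enable_write_back: true,
            enable_direct_hit: true,
        }
    }
}
```

Struct-update syntax makes the task-specific overrides a one-liner, e.g. `HbdsConfig { search_fraction: 0.40, ..HbdsConfig::default() }` for a research-heavy run.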


Why Flat Text Cannot Support HBDS

| HBDS Capability | Required Property | Markdown? | SurfDoc? |
| --- | --- | --- | --- |
| Classify query against domain index | P2: Typed cross-references | No | Yes |
| Score candidate relevance | P1: Declared block types | No | Yes |
| Check budget before reading | P4: Pre-computed token costs | No | Yes |
| Fall back to summaries | P1 + P3: Types + priority | No | Yes |
| Type-aware synthesis | P1: Block types | No | Yes |
| Place write-back knowledge | P2: Typed references | No | Yes |

You cannot retrofit these properties onto Markdown without making it something other than Markdown. The properties are not presentation — they are structure. The format IS the architecture.


Storage Options

| Option | Scale | Best For |
| --- | --- | --- |
| File-based (Markdown tree) | < 1,000 nodes | Getting started, Git-friendly |
| Hybrid (files + SQLite index) | 1,000-10,000 nodes | Single-user production |
| Database (PostgreSQL) | 10,000+ nodes | Multi-user teams |

Start file-based. Migrate to hybrid when keyword search gets slow. Use database for team-shared knowledge stores.


Provider Agnosticism

The algorithm is universal. The LLM provider is swappable.

| Component | Universal? |
| --- | --- |
| Knowledge tree schema | Yes |
| HBDS algorithm logic | Yes |
| Budget tracking | Yes |
| Token counting | Partially — tokenizer varies |
| LLM calls | Prompt format is universal; API format varies |
| Context window size | Provider-specific |

HBDS auto-adapts: larger windows allow wider beams. Models with windows under 32K enter summary-only mode.
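That adaptation could look like the sketch below. Only the under-32K summary-only rule comes from the text; the beam-width breakpoints are assumptions chosen for illustration.

```rust
/// Map a provider's context window to (beam_width, summary_only_mode).
/// The 32K summary-only cutoff is from the text; the wider-window
/// breakpoints are illustrative assumptions.
fn adapt(context_window: usize) -> (usize, bool) {
    let summary_only = context_window < 32_000;
    let beam_width = match context_window {
        0..=31_999 => 1,
        32_000..=127_999 => 2,
        _ => 3,
    };
    (beam_width, summary_only)
}
```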


Summary

HBDS is a search algorithm for a world where knowledge is far larger than any context window.

The algorithm navigates typed knowledge hierarchies under strict token budgets, uses beam search with pre-read budget verification, degrades gracefully to summaries, and self-improves through write-back.

The format is the architecture. The algorithm is the proof.

Tip: Read the thesis

Typed Structure Over Flat Text — why AI agents need traversable documents, and the five failure modes of flat text that typed formats eliminate.