Part II: The Protocol
Chapter 4: Six Tools
In 1977, the architect Christopher Alexander and his co-authors published A Pattern Language, a catalogue of 253 design patterns for buildings and towns. The book's argument was not that architects should memorize 253 patterns. It was that good design recurs -- that the same solutions to the same problems appear across different scales and contexts, and that naming them makes them easier to recognize, teach, and apply. The patterns ranged from urban planning ("City Country Fingers") to room layout ("The Flow Through Rooms") to the placement of a window seat. Each was a named, composable solution to a recurring problem.
What Alexander discovered, and what software engineers rediscovered twenty years later when they adapted his framework for code, is that the value of a pattern library is not in its size. It is in the coverage-to-complexity ratio. A small set of well-chosen patterns that together cover the full space of common problems is more useful than a large set that covers the same space redundantly or inconsistently. The goal is completeness with economy.
BFS-QL has six tools. The choice is not arbitrary and not conservative -- it is the result of asking, for every candidate tool, whether it covers something the others do not, and whether the space it covers is one an LLM actually needs.
Why Six
The full space of what an LLM needs to do with a knowledge graph can be decomposed into six operations, each distinct, together exhaustive:
Orientation. The LLM arrives at a graph it has never seen. It does not
know what kinds of entities the graph contains, what relationships are
represented, or how they are named. Before it can navigate, it needs a map.
This is describe_schema.
Resolution. The LLM has a name -- a drug, a disease, an author. It
needs the canonical ID that the graph uses for that entity. Names are
ambiguous; canonical IDs are not. The operation of mapping a name to an ID
is fundamental and cannot be collapsed into traversal without introducing
the hallucination problem Chapter 2 described. This is search_entities.
Traversal. The LLM has a seed -- one or more canonical IDs. It wants
to know what they connect to. This is the core operation, the one that
makes graph knowledge accessible. Everything else is setup or follow-up.
This is bfs_query.
Expansion. The traversal returns stubs -- lightweight placeholders for
nodes that were present in the topology but did not warrant full metadata.
The LLM sees that something is there and wants to know what it is. For a
single stub, this is describe_entity.
Batch expansion. A single bfs_query call typically surfaces several
stubs worth inspecting. Calling describe_entity on each in sequence
means one round-trip per entity: the LLM issues a call, waits, reads the
result, decides to expand the next stub, and repeats. Each round-trip
carries the full overhead of a tool invocation in an LLM session -- not
a database round-trip, but a model reasoning step. In practice, Claude Code
flagged this explicitly as friction: sequential single-entity expansion is
slow and accumulates latency when several stubs warrant attention.
describe_entities accepts a list of IDs and returns full records for all
of them in a single call. It is not a convenience alias for a loop; it is
the operation that makes batch expansion a first-class primitive rather
than an emergent pattern the model has to construct.
Intersection. The LLM has a set of seeds and wants to know what is
common to all of them -- not the union of their neighborhoods, but the
nodes reachable from every seed simultaneously. bfs_query returns the
union; the LLM cannot reliably do the set intersection itself over hundreds
of nodes. This is intersect_subgraphs.
Orient, resolve, traverse, expand, batch-expand, intersect. The protocol
has grown by one each time a real gap appeared -- intersect_subgraphs when
multi-seed reasoning proved unreliable without it, describe_entities when
sequential single-entity expansion proved too costly. Candidate additions
like "find shortest path" or "list all entities of type X" reduce to
compositions of existing tools without material cost, or add query-oriented
answers rather than navigational handles. The bar is real demonstrated need,
not speculation.
The Session Workflow
The six tools define a natural sequence that a well-behaved LLM follows against any BFS-QL graph:
1. describe_schema()
   → learn entity types, predicates, graph description
2. search_entities(name, node_types=[...])
   → resolve a name to one or more canonical IDs
   → pass node_types to avoid noise in results
   → inspect entity_type to disambiguate if needed
3. bfs_query(seeds, max_hops, ...)
   → traverse from the resolved ID
   → start with topology_only=True for large graphs, OR
   → use exclude_node_types=["paper","author"], min_mentions=2
     for a concept-only result on literature graphs
   → use node_types and predicates to focus metadata detail
4. describe_entities([id, id, ...])
   → expand any stubs that warrant closer inspection
   → batch multiple IDs in a single call
Step 1 may be redundant if the BFS-QL server injects schema into tool
descriptions at startup -- in that case the LLM may skip the explicit
describe_schema call. Steps 3 and 4 are iterative: the output of one
bfs_query call identifies stubs that motivate describe_entities calls,
which may motivate further bfs_query calls seeded at newly discovered
nodes. The workflow is a loop, not a pipeline.
This matters for how the tools were designed. Each tool must be callable
in any order, with the outputs of earlier calls serving as inputs to later
ones. bfs_query takes canonical IDs -- which search_entities produces.
describe_entity takes canonical IDs -- which appear in bfs_query results.
The interface is compositional by construction.
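One pass through the loop can be sketched against an in-memory stand-in. Everything except the tool names and the order of calls is an assumption made for illustration: the `FakeGraph` class, its toy IDs, the undirected traversal, and the return shapes are hypothetical and do not describe the BFS-QL server's actual payloads.

```python
# Toy stand-in for a BFS-QL server. Data and return shapes are illustrative
# assumptions, not the real protocol payloads.
class FakeGraph:
    def __init__(self):
        self.names = {"desmopressin": "drug:1", "cushing syndrome": "disease:1"}
        self.types = {"drug:1": "drug", "disease:1": "disease", "paper:1": "paper"}
        self.edges = [("drug:1", "TREATS", "disease:1"),
                      ("paper:1", "MENTIONS", "drug:1")]

    def describe_schema(self):
        return {"entity_types": sorted(set(self.types.values())),
                "predicates": sorted({p for _, p, _ in self.edges})}

    def search_entities(self, name, node_types=None):
        eid = self.names.get(name.lower())
        if eid is None or (node_types and self.types[eid] not in node_types):
            return []
        return [{"id": eid, "entity_type": self.types[eid]}]

    def bfs_query(self, seeds, max_hops=1, topology_only=False):
        # Edges are followed in both directions (an assumption of this sketch).
        seen, frontier = set(seeds), set(seeds)
        for _ in range(max_hops):
            reached = set()
            for s, _, o in self.edges:
                if s in frontier:
                    reached.add(o)
                if o in frontier:
                    reached.add(s)
            frontier = reached - seen
            seen |= frontier
        return {"nodes": [{"id": n, "entity_type": self.types[n]}
                          for n in sorted(seen)]}

    def describe_entities(self, ids):
        return [{"id": i, "entity_type": self.types[i], "metadata": {}}
                for i in ids]


def run_session(graph, name, wanted_type):
    schema = graph.describe_schema()                                  # 1. orient
    if wanted_type not in schema["entity_types"]:
        return []
    hits = graph.search_entities(name, node_types=[wanted_type])      # 2. resolve
    if not hits:
        return []
    seed = hits[0]["id"]
    survey = graph.bfs_query([seed], max_hops=1, topology_only=True)  # 3. traverse
    stubs = [n["id"] for n in survey["nodes"] if n["id"] != seed]
    return graph.describe_entities(stubs)                             # 4. batch-expand


records = run_session(FakeGraph(), "Desmopressin", "drug")
print([r["id"] for r in records])
```

Each step consumes only what an earlier step produced: the schema validates the type filter, the search yields the seed, and the survey yields the IDs passed to the batch expansion.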
What Is Not a Tool
The choice of six tools is also a choice of what not to include. Some candidates worth examining:
A "shortest path" tool. Useful for certain graph analyses. Not needed
for LLM reasoning, which doesn't navigate to specific destinations -- it
explores neighborhoods. An LLM that needs to know whether two entities are
connected can issue a multi-hop bfs_query and inspect the result. The
two-step answer is not materially worse than a dedicated tool, and adding
the tool adds one more surface for the LLM to reason about.
A "list all entities of type X" tool. The medlit demo has 119 disease
entities. A tool that returns all of them is not useful to an LLM trying
to reason; it is a context flood. The right operation is bfs_query from
a relevant seed with node_types=["disease"], which returns the disease
entities that are connected to something the LLM already cares about.
Relevance is structural, not taxonomic.
A "count" tool. Useful for human analysts building dashboards. Not useful for LLM reasoning. An LLM that receives "there are 119 disease entities" has not learned anything it can act on. The count tells it nothing about which diseases matter, how they connect, or what the graph structure implies about the domain.
The pattern in all three cases is the same: the candidate tool answers a query-oriented question rather than a traversal-oriented one. It gives the LLM a fact rather than a navigational handle. BFS-QL is designed for navigation. The six tools reflect that.
Chapter 5: describe_schema -- Self-Orienting Graphs
In the early days of the web, connecting to a new API meant reading its
documentation. The documentation was a separate artifact -- a PDF, a wiki
page, a sequence of example curl commands -- maintained by humans, often
out of sync with the actual API, and unavailable to the software that needed
it. A client that wanted to know what endpoints were available had to be
told by a human who had read the docs.
This was not a fundamental limitation. Roy Fielding's REST
dissertation, published in 2000, included
hypermedia as a first-class constraint: a well-designed REST API should
carry, in its responses, the information a client needs to navigate it.
Links, not documentation. The API tells you what it can do; you don't need
to be told separately. This principle -- that interfaces should be
self-describing -- has become standard in modern API design. OpenAPI
specifications, GraphQL introspection, FastAPI's /docs endpoint: all are
expressions of the same idea.
describe_schema is BFS-QL's implementation of
this principle for knowledge graphs. An LLM connecting to a graph it has
never seen -- a private Fuseki instance, a domain-specific SPARQL endpoint,
a kgraph-derived Postgres store for a hospital's clinical data -- needs to
know what entity types and predicates exist before it can construct a
meaningful query. In the SPARQL world, this required reading documentation.
In BFS-QL, it requires one tool call.
What It Returns
A describe_schema response contains three things:
- graph_description: A human-readable string describing the graph and its domain -- what the data represents, where it came from, what kinds of questions it is meant to answer. This is provided by the graph operator when the BFS-QL server is configured. A well-written description tells the LLM whether this is the right graph for its current question.
- entity_types: The complete list of valid entity type names in the graph. These are exactly the values the LLM can pass as node_types in a bfs_query call. Not approximate names, not documentation -- the actual strings the query engine understands.
- predicates: The complete list of valid predicate names. These are exactly the values the LLM can pass as predicates in a bfs_query call.
The medlit graph, for example, returns 19 entity types and 16 predicates.
After one call, the LLM knows that drug, disease, and procedure are
valid node types -- and that protein and enzyme are also present, which
tells it something about the level of mechanistic detail in the graph. It
knows that TREATS, CAUSES, and INHIBITS are valid predicates -- and
that CITES and AUTHORED are also present, which tells it that the graph
includes bibliographic structure alongside clinical knowledge.
This is orientation in the strict sense. The LLM knows what it is looking at before it starts navigating.
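As a rough illustration, a response of this shape could be held as a plain dictionary and used to check filter values before a query is issued. The three field names come from the list above and the description string is the one quoted for the medlit graph later in this chapter; the abbreviated value lists and the `invalid_filters` helper are assumptions made for the example.

```python
# Abbreviated example: the real medlit schema lists 19 entity types and
# 16 predicates. Only values named in the text are shown here.
schema = {
    "graph_description": "36 PubMed papers on Cushing disease and related endocrinology.",
    "entity_types": ["drug", "disease", "procedure", "protein", "enzyme"],
    "predicates": ["TREATS", "CAUSES", "INHIBITS", "CITES", "AUTHORED"],
}


def invalid_filters(schema, node_types=(), predicates=()):
    """Return the filter values the schema does not list (hypothetical helper)."""
    bad_types = [t for t in node_types if t not in schema["entity_types"]]
    bad_preds = [p for p in predicates if p not in schema["predicates"]]
    return bad_types + bad_preds


# "gene" is not in the abbreviated list above, so it is reported as invalid.
print(invalid_filters(schema, node_types=["drug", "gene"], predicates=["TREATS"]))
```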
Two Delivery Modes
The describe_schema tool can be called explicitly or made unnecessary
through a second mechanism: schema injection.
At startup, the BFS-QL server calls entity_types() and predicates()
on the backend and holds the results in memory. If the schema is small
enough -- the implementation uses a threshold of 20 entity types and 30
predicates -- the server injects the valid values directly into the
bfs_query tool description. The LLM reads the tool description before
it calls the tool, so it arrives at bfs_query already knowing what
node_types and predicates values are valid. No explicit describe_schema
call required.
This is a zero-cost optimization for small schemas. The LLM doesn't spend a tool call on orientation; the orientation is already embedded in the interface.
The tradeoff is tool description size. A graph with 19 entity types and
16 predicates adds roughly 200 characters to the bfs_query description --
negligible. A graph with 200 entity types and 500 predicates would make the
tool description unwieldy and consume context before the LLM has done
anything. Above the threshold, injection is suppressed and explicit calling
is the path.
Both modes are supported transparently. The server chooses based on schema size. The LLM's behavior is the same either way: it starts a session knowing the schema, whether that knowledge came from injection or from a tool call.
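The injection decision can be sketched as a small function. The thresholds of 20 and 30 are the ones quoted above; the function name, the placeholder description wording, and the choice to treat the thresholds as inclusive and to require both to hold are assumptions of this sketch, not details confirmed by the implementation.

```python
MAX_INJECTED_ENTITY_TYPES = 20   # thresholds quoted in the text
MAX_INJECTED_PREDICATES = 30

# Placeholder wording; the real tool description is not reproduced here.
BASE_DESCRIPTION = "Breadth-first traversal from one or more seed IDs."


def build_bfs_query_description(entity_types, predicates):
    """Append valid filter values to the tool description when the schema is small."""
    small = (len(entity_types) <= MAX_INJECTED_ENTITY_TYPES
             and len(predicates) <= MAX_INJECTED_PREDICATES)
    if not small:
        # Large schema: keep the description short and point at the explicit call.
        return BASE_DESCRIPTION + " Call describe_schema for valid filter values."
    return (BASE_DESCRIPTION
            + " Valid node_types: " + ", ".join(entity_types) + "."
            + " Valid predicates: " + ", ".join(predicates) + ".")


print(build_bfs_query_description(["drug", "disease"], ["TREATS", "CAUSES"]))
```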
The graph_description as a First-Class Signal
The graph description is worth more attention than it usually receives. In the medlit example, it reads: "36 PubMed papers on Cushing disease and related endocrinology." That sentence tells an LLM several things that affect how it should reason:
- The corpus is small (36 papers). Claims that seem universal may be specific to this literature.
- The domain is focused (Cushing disease). Entities and relationships outside that domain are unlikely to be well-represented.
- The data source is biomedical literature. Relationships have provenance and carry confidence scores.
A graph operator deploying BFS-QL should treat the description as they
would treat a system prompt: an opportunity to shape how the LLM approaches
the data. "This graph contains inferred relationships; verify important
claims against source documents." "The entity type provisional indicates
entities whose canonical IDs could not be resolved." "Predicates are
directional; TREATS runs from drug to disease, not the reverse."
The server instructions mechanism serves a similar function. BFS-QL's
server sends a block of instructions to the LLM at session initialization,
before any tool calls. These instructions can include graph-specific
guidance that doesn't fit in the tool descriptions -- in the medlit
deployment, for example, the instructions note that entity IDs beginning
with prov: are provisional artifacts from the ingestion pipeline, carry
no external canonical meaning, and should be treated as anonymous
placeholders. Without that note, an LLM might waste reasoning cycles
wondering what a provisional ID like
prov:2e02b663d97c45499d4ce644abf81b8a refers to.
Self-description is not just schema. It is everything the graph operator knows about the data that the LLM would benefit from knowing before it starts.
Chapter 6: The Query Model
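The core of BFS-QL is a single query structure with nine flat parameters: five core ones (seeds, max_hops, node_types, predicates, topology_only) and four that filter or paginate the result (exclude_node_types, min_mentions, limit, offset). Understanding why each parameter is present -- and why others are not -- is the key to using the protocol well and to implementing it correctly.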
The Parameters
seeds is a list of canonical entity IDs. This is the starting point
of the traversal. Multiple seeds are supported because many useful questions
are inherently relational: not "what connects to this entity?" but "what do
these two entities have in common?" A multi-seed query issues a single BFS
from all seeds simultaneously and returns their combined neighborhood,
deduplicated. The LLM doesn't need to issue separate queries and merge the
results manually.
max_hops is an integer controlling traversal depth. A value of 1
returns only immediate neighbors; 2 returns neighbors of neighbors; and so
on up to a maximum of 5. The practical guidance is to start at 1 and expand
only if the first result doesn't contain what you need. A 2-hop traversal
from a well-connected node in the medlit graph returns 84 nodes and 99
edges. A 3-hop traversal from the same node would return most of the graph.
Depth is a context budget decision, not a correctness decision -- the graph
is the same either way.
node_types is an optional list of entity type names. Nodes whose type
matches receive full metadata in the response. Nodes whose type does not
match are returned as stubs -- present in the result with their ID and type,
but no metadata. Omitting node_types gives full metadata for all nodes,
which is appropriate when the graph is small or when the LLM needs
comprehensive information. Providing node_types focuses the context budget
on what matters.
predicates is an optional list of predicate names. Edges whose
predicate matches receive full metadata in the response, including confidence
scores, source documents, and provenance. Edges whose predicate does not
match are returned as bare subject-predicate-object triples. The behavior
is symmetric with node_types: topology is always present, detail is
selectively paid for.
topology_only is a boolean that, when true, suppresses all metadata
from the response. Every node is returned as a bare ID and type; every edge
as a bare subject-predicate-object triple. No node metadata, no edge
metadata, no provenance. The response is pure structural skeleton.
exclude_node_types is an optional list of entity type names to remove
entirely from the result. Unlike node_types (which demotes non-matching
nodes to stubs but keeps them), exclude_node_types removes the specified
types and all edges that touch them. The topology is no longer guaranteed
complete when this parameter is used -- that is the point. Use it to
suppress high-volume types that dominate large traversals without adding
conceptual value. The canonical use case is exclude_node_types=["paper",
"author"] on a concept-oriented query: papers and authors are the
connective tissue of a literature-derived graph and account for the majority
of nodes in a deep traversal, but an LLM reasoning about disease mechanisms
rarely needs them.
min_mentions is an optional integer (default 1, no filtering) that
removes nodes whose total_mentions field in metadata is below the
threshold, along with all edges touching them. This suppresses
low-confidence provisional entities that appear in only one or two source
documents and are structurally present but semantically unreliable. Nodes
without a total_mentions field are always included regardless of
threshold, so the filter is safe on backends that do not populate it. Note
that min_mentions filters the result, not the traversal -- a
low-mention node can still serve as a bridge to high-mention nodes at deeper
hops, but it will not appear in the returned result.
limit and offset are optional integers for paginating large
results. limit caps the number of nodes returned; offset skips the
first N nodes. Together they allow an LLM to page through a large
neighborhood without requesting everything at once. node_count and
edge_count always reflect the full traversal regardless of pagination, so
the LLM can see the total size and decide whether to request more pages.
Edges are filtered to those with both endpoints in the returned
node window, so each page is a self-consistent subgraph. When neither
parameter is specified, the full result is returned unchanged.
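The interaction of max_hops, node_types, and pagination can be illustrated with a toy traversal. This is a minimal sketch under stated assumptions, not the server's implementation: the graph data, the undirected edge-following, the BFS discovery order used for paging, and the dictionary shapes are all hypothetical.

```python
from collections import deque

# Toy graph: node id -> (type, metadata); edges are (subject, predicate, object).
NODES = {
    "drug:a": ("drug", {"name": "A"}),
    "disease:x": ("disease", {"name": "X"}),
    "disease:y": ("disease", {"name": "Y"}),
    "gene:g": ("gene", {"name": "G"}),
}
EDGES = [
    ("drug:a", "TREATS", "disease:x"),
    ("drug:a", "TARGETS", "gene:g"),
    ("gene:g", "ASSOCIATED_WITH", "disease:y"),
]


def bfs_query(seeds, max_hops=1, node_types=None, limit=None, offset=0):
    # Breadth-first search bounded by max_hops, following edges in both directions.
    depth = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        cur = queue.popleft()
        if depth[cur] == max_hops:
            continue
        for s, _, o in EDGES:
            if cur in (s, o):
                nxt = o if cur == s else s
                if nxt not in depth:
                    depth[nxt] = depth[cur] + 1
                    queue.append(nxt)
    reached = list(depth)  # insertion order is BFS discovery order
    edges = [e for e in EDGES if e[0] in depth and e[2] in depth]

    # Pagination selects a window of nodes; the counts still describe the
    # full traversal.
    window = reached[offset:] if limit is None else reached[offset:offset + limit]
    nodes = []
    for nid in window:
        ntype, meta = NODES[nid]
        if node_types is None or ntype in node_types:
            nodes.append({"id": nid, "entity_type": ntype, "metadata": meta})
        else:
            nodes.append({"id": nid, "entity_type": ntype})  # stub: ID and type only
    in_window = set(window)
    return {
        "node_count": len(reached),
        "edge_count": len(edges),
        "nodes": nodes,
        "edges": [e for e in edges if e[0] in in_window and e[2] in in_window],
    }


first_page = bfs_query(["drug:a"], max_hops=2, node_types=["drug"], limit=2)
print(first_page["node_count"], len(first_page["nodes"]))
```

In this sketch a non-matching node is demoted to a stub rather than dropped, and each page carries only the edges whose endpoints both fall inside it.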
The Flat Format
All of these parameters are passed as a flat JSON object. There is no nesting, no sub-query structure, no boolean expression language. The query either specifies seeds, a depth, and optional filters, or it doesn't. This flatness is a deliberate choice.
Query languages like SPARQL and GraphQL support arbitrarily nested structures because they need to -- they are designed to express complex constraints precisely. BFS-QL is not designed for precise constraint expression. It is designed for reliable generation by a language model. Every level of nesting in a query format is an opportunity for the model to make a structural error -- a misplaced bracket, a wrong level of indentation, a filter applied at the wrong scope. A flat format has no levels. The model either provides the parameter or it doesn't.
This is not a limitation on expressiveness. The parameters described above cover the full space of what BFS-QL needs to express. The flatness is expressiveness appropriate to the operation.
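For illustration, a query in this format might look like the dictionary below, where every value is either a scalar or a list of scalars. The parameter names and the desmopressin ID (RxNorm:3251) appear in the text; the particular combination of values and the `is_flat` check are assumptions made for the example.

```python
import json

query = {
    "seeds": ["RxNorm:3251"],
    "max_hops": 1,
    "node_types": ["drug", "disease"],
    "predicates": ["TREATS"],
    "topology_only": False,
}


def is_flat(obj):
    """True when every value is a scalar or a list of scalars (no nested objects)."""
    scalar = (str, int, float, bool, type(None))
    return all(
        isinstance(v, scalar)
        or (isinstance(v, list) and all(isinstance(x, scalar) for x in v))
        for v in obj.values()
    )


print(json.dumps(query))
print(is_flat(query))
```

A nested filter object such as `{"filter": {"type": "drug"}}` would fail the same check, which is the kind of structure the flat format rules out.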
Context Budget Management
The central design constraint of the query model is the context window. Every token in the response consumes context budget; too many tokens degrade reasoning. The query parameters are the mechanism for managing that budget.
The recommended query progression reflects this:
First: topology survey. Call bfs_query with topology_only=True
and max_hops=2.
This returns the complete structural skeleton of the
neighborhood -- every node and edge -- at minimum token cost. For the
medlit desmopressin example, this is 14,000 characters for 84 nodes and
99 edges. The LLM can read the full topology and identify what matters
before committing context budget to metadata.
Second: selective expansion. Call describe_entities with the IDs of
the nodes the topology survey identified as significant. This retrieves full
metadata for multiple nodes in a single call. The LLM pays for exactly the
information it has decided it needs, and nothing else. (The single-node
describe_entity remains available for one-off lookups; use
describe_entities when expanding several stubs at once.)
Third: targeted re-query. If a follow-up traversal is needed -- perhaps
the topology survey revealed an unexpected cluster that warrants its own
exploration -- issue a new bfs_query with node_types and predicates
filters focused on what matters. The third query is more expensive than the
first but more targeted: it retrieves full metadata only for the entity
types and predicates the LLM has decided are relevant.
This progression from cheap-and-broad to expensive-and-targeted is the working set principle in practice. The first query establishes the topological working set. The second and third queries fill in detail selectively.
Alternative for concept-dense graphs. On large literature-derived graphs, a topology survey at max_hops=2 may itself exceed the context budget -- hundreds of paper and author nodes dominate the result. In this case, skip the topology survey and issue a direct concept-only query: a bfs_query with exclude_node_types=["paper", "author"] and min_mentions=2, starting at max_hops=1.
This returns only concept entities (diseases, genes, drugs, pathways, etc.)
with 2 or more corpus mentions -- high-signal nodes with full metadata --
in a single in-band response. The breast cancer 1-hop query on the
graphwright corpus returns 73 nodes and 86 edges this way, compared to
1,347 nodes in the unfiltered 2-hop result. Use max_hops=1 as the default
and expand to 2 only if the 1-hop result is too sparse.
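A concept-only query of this kind, and the result filtering it implies, might be sketched as follows. The parameter names, the breast cancer ID (MeSH:D001943), and the rule that nodes without total_mentions are always kept come from the text; the exact query used for the figures above is not shown in the source, and the `filter_result` helper and node shapes are hypothetical.

```python
concept_query = {
    "seeds": ["MeSH:D001943"],   # breast cancer, the ID quoted in the text
    "max_hops": 1,
    "exclude_node_types": ["paper", "author"],
    "min_mentions": 2,
}


def filter_result(nodes, edges, exclude_node_types=(), min_mentions=1):
    """Drop excluded types and low-mention nodes, plus every edge touching them."""
    kept = {}
    for n in nodes:
        if n["entity_type"] in exclude_node_types:
            continue
        mentions = n.get("metadata", {}).get("total_mentions")
        # Nodes without total_mentions are always kept, as the text specifies.
        if mentions is not None and mentions < min_mentions:
            continue
        kept[n["id"]] = n
    kept_edges = [e for e in edges if e[0] in kept and e[2] in kept]
    return list(kept.values()), kept_edges


# Toy traversal result (hypothetical data).
nodes = [
    {"id": "d1", "entity_type": "disease", "metadata": {"total_mentions": 5}},
    {"id": "p1", "entity_type": "paper", "metadata": {"total_mentions": 9}},
    {"id": "g1", "entity_type": "gene", "metadata": {"total_mentions": 1}},
    {"id": "x1", "entity_type": "drug", "metadata": {}},
]
edges = [("d1", "MENTIONED_IN", "p1"), ("d1", "ASSOCIATED_WITH", "g1"),
         ("x1", "TREATS", "d1")]

kept_nodes, kept_edges = filter_result(
    nodes, edges,
    exclude_node_types=concept_query["exclude_node_types"],
    min_mentions=concept_query["min_mentions"],
)
print([n["id"] for n in kept_nodes], kept_edges)
```

The filter runs on the traversal's output rather than inside the traversal, which matches the note above that min_mentions filters the result, not the traversal.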
Multi-Seed Queries
The multi-seed case deserves more attention than it typically receives, because it is the natural form for a large class of clinically and scientifically interesting questions.
bfs_query with multiple seeds returns the union of their neighborhoods,
deduplicated. This is useful for many questions: "What connects this disease
to this gene?" returns the combined neighborhood of both seeds, and the
structural answer -- the nodes that appear in both halves of the union --
is present in the result for an LLM to inspect. For small result sets, this
works well.
For larger graphs, union-and-inspect becomes unreliable. When each seed's
1-hop neighborhood contains hundreds of nodes, asking the LLM to identify
which nodes appear in both is structured bookkeeping that language models
do poorly -- they miss nodes, conflate similar IDs, and produce inconsistent
results. This is the problem intersect_subgraphs solves: it returns only
the nodes within k hops of every seed, without the LLM performing any
manual set operations.
The medlit example illustrates the bfs_query case. A 1-hop multi-seed
query from desmopressin (RxNorm:3251) and Cushing syndrome (MeSH:D003480)
returns 35 nodes and 37 edges. Of those, exactly two nodes are in the
direct neighborhood of both seeds: PMC11128938, the paper that co-describes
both entities, and DBPedia:Cushing's_disease, the specific disease subtype
that desmopressin treats. For a 36-paper graph at 1-hop depth, the LLM can
inspect the union reliably. For a larger graph or deeper traversal,
intersect_subgraphs is the right tool.
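The difference between the two operations can be shown on a toy adjacency map shaped loosely like the medlit example, with two nodes shared between the seeds. This is a sketch only: the node names are hypothetical, the graph is treated as undirected, and whether the real intersect_subgraphs counts a seed as part of its own neighborhood is an assumption here.

```python
def neighborhood(adj, seed, k):
    """Nodes within k hops of seed (seed included), by breadth-first expansion."""
    seen, frontier = {seed}, {seed}
    for _ in range(k):
        frontier = {n for f in frontier for n in adj.get(f, ())} - seen
        seen |= frontier
    return seen


def union_neighborhood(adj, seeds, k):
    # What a multi-seed bfs_query returns: everything near any seed.
    return set().union(*(neighborhood(adj, s, k) for s in seeds))


def intersect_neighborhoods(adj, seeds, k):
    # What intersect_subgraphs returns: only nodes near every seed.
    # Requires at least one seed.
    return set.intersection(*(neighborhood(adj, s, k) for s in seeds))


# Symmetric adjacency for an undirected toy graph.
adj = {
    "drug": {"paper", "subtype", "gene"},
    "syndrome": {"paper", "subtype", "symptom"},
    "paper": {"drug", "syndrome"},
    "subtype": {"drug", "syndrome"},
    "gene": {"drug"},
    "symptom": {"syndrome"},
}

print(sorted(union_neighborhood(adj, ["drug", "syndrome"], 1)))
print(sorted(intersect_neighborhoods(adj, ["drug", "syndrome"], 1)))
```

With six nodes the shared pair is easy to pick out of the union by eye; the set intersection does the same bookkeeping exactly when each neighborhood holds hundreds of nodes.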
What the Response Contains
A BFS-QL response contains:
- seeds: The seed IDs used. Included for reference -- in a multi-turn session, the LLM may need to recall which seeds were used for a given result.
- max_hops: The depth used.
- node_count and edge_count: Total counts. These are useful for calibrating follow-up queries -- a result with 200 nodes warrants a more targeted re-query than a result with 15.
- nodes: A list of node records. Each is either a full Node (with metadata) or a stub EntityStub (ID and type only), depending on whether its type matched node_types.
- edges: A list of edge records. Each is either a full EdgeWithMetadata (with confidence, source documents, and provenance) or a bare Edge (subject, predicate, object only), depending on whether its predicate matched predicates.
- schema_summary: The entity types and predicates actually present in this result subgraph, regardless of the filters applied. See the next section.
One design choice worth noting: stub nodes are always included. If a
disease node is present in the topology but node_types=["drug"], the
disease node appears as a stub -- ID and type, no metadata. It is not
omitted. Unless exclude_node_types, min_mentions, or pagination is used
to remove nodes deliberately, the topology is always complete. This is the
separation of topology from presentation that Chapter 3 argued for:
filtering with node_types and predicates controls detail level, not
presence.
Schema Discovery in Results
Every BFS-QL query response includes a schema_summary field containing the
entity types and predicates actually present in that result subgraph.
This applies to both bfs_query and intersect_subgraphs.
This is a first-class feature, not an implementation detail.

```json
"schema_summary": {
  "entity_types_found": ["disease", "drug", "gene", "paper"],
  "predicates_found": ["associated_with", "targets", "treats"]
}
```
The value of schema_summary is especially clear in two situations.
Large or open-world graphs. describe_schema may return
comprehensive=False when the graph is too large to enumerate entity
types and predicates exhaustively -- a Wikidata endpoint, for instance,
has thousands of predicates that cannot all be listed upfront. In this
case, the LLM cannot know what filters are valid before issuing a query.
schema_summary solves the problem by reporting the vocabulary
actually present in the neighborhood. After a topology_only survey,
the LLM can read schema_summary and use those values as node_types
and predicates filters in a targeted follow-up query. No documentation
needed, no guessing at predicate names.
Paginated results. When limit and offset are used to page through
a large traversal, schema_summary always reflects the full traversal,
not just the current page. The LLM sees the complete vocabulary of the
neighborhood even if it is only reading a window of nodes. This matters
because the decision about which types and predicates to filter on should
be made with knowledge of the whole subgraph, not just the first page.
schema_summary closes the loop that describe_schema opens. Together
they ensure an LLM always has valid filter values available, whether from
the static schema at startup or from the live vocabulary of a result.
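How a summary of this kind could be derived, and why it should be computed before pagination, can be sketched in a few lines. The field names entity_types_found and predicates_found are taken from the example above; the helper names, the sorted ordering, and the node and edge shapes are assumptions of this sketch.

```python
def schema_summary(nodes, edges):
    """Collect the vocabulary actually present in a result subgraph."""
    return {
        "entity_types_found": sorted({n["entity_type"] for n in nodes}),
        "predicates_found": sorted({pred for _, pred, _ in edges}),
    }


def paginate(nodes, edges, limit, offset=0):
    """Return one page, with schema_summary computed over the full traversal."""
    summary = schema_summary(nodes, edges)   # computed before windowing
    page = nodes[offset:offset + limit]
    ids = {n["id"] for n in page}
    return {
        "nodes": page,
        "edges": [e for e in edges if e[0] in ids and e[2] in ids],
        "schema_summary": summary,
    }


# Toy traversal result (hypothetical data).
nodes = [
    {"id": "d", "entity_type": "disease"},
    {"id": "x", "entity_type": "drug"},
    {"id": "g", "entity_type": "gene"},
]
edges = [("x", "treats", "d"), ("x", "targets", "g")]

page = paginate(nodes, edges, limit=2)
print(page["schema_summary"])
```

The first page here holds only the disease and drug nodes, yet its summary still reports the gene type and the targets predicate, so a follow-up filter can be chosen with the whole neighborhood in view.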
Name Disambiguation in search_entities
search_entities accepts a node_types parameter that restricts results
to entities of the specified types. This exists to address a common
disambiguation problem.
Common scientific terms match multiple entity types. "Breast cancer"
matches the disease concept (MeSH:D001943) and also dozens of papers
whose titles contain the phrase. When an LLM calls search_entities to
resolve a disease name, it typically wants the disease concept, not the
papers. Without node_types, the results may be dominated by papers; the
disease entity may not appear in the top results at all.
```python
search_entities("breast cancer", node_types=["disease"])