Part IV: The Bigger Picture¶
Every knowledge graph that uses canonical IDs correctly is automatically composable with every other one that does the same. This is an emergent property of anchoring to shared authorities — nobody designed MeSH, HGNC, RxNorm, and UniProt as an LLM interoperability layer, but that is what they have quietly become.
The companion volume Knowledge Graphs from Unstructured Text argues for canonical identity as a quality and provenance concern. The companion volume The Identity Server develops the full architecture of how canonical identity is achieved and maintained. This part is where that investment pays off: the shared identifiers are the bridges between graphs, and BFS-QL is the interface that traverses them.
The LLM is the reasoner. BFS-QL is the interface. Shared canonical IDs are the bridges. All three pieces are available right now.
Chapter 13: Composing Graphs¶
Start a Claude Code session. Add two MCP servers: one for a kgraph-derived
Postgres graph of recent endocrinology literature, one for DBpedia. The model sees twelve tools: the six BFS-QL tools under the
bfs-ql. prefix and the same six under the dbpedia. prefix.
The servers are identically structured. The model does
not know or care that one is backed by Postgres and the other by Virtuoso.
It knows only that each gives it six tools for navigating a graph.
This is not a specially engineered federation capability. No protocol extension is required. No shared schema is negotiated in advance. The two servers are independent; they know nothing about each other. What makes them composable is not the protocol -- it is the identifiers.
Identity Bridging¶
Desmopressin in the kgraph Postgres graph has the canonical ID RxNorm:3251.
Desmopressin in DBpedia has the URI
<http://dbpedia.org/resource/Desmopressin>,
which the SPARQL backend normalizes to DBpedia:Desmopressin.
These are
different identifiers -- the graphs use different ID schemes. But both
entities carry an RxNorm property. An LLM that knows to look for it
can recognize that RxNorm:3251 in one graph and RxNorm:3251 in the
DBpedia record refer to the same compound.
When graphs share a canonical ID scheme -- both use RxNorm for drugs, both
use MeSH for diseases -- bridging is automatic. The LLM queries the first
graph, finds RxNorm:3251, uses that ID as a seed in the second graph's
bfs_query, and traverses the boundary. No mapping table. No federation
protocol. The shared ID is the bridge.
When graphs use different ID schemes, bridging requires a step: take the
entity's label from the first graph ("desmopressin"), call search_entities
in the second graph with that label, inspect the results, and pick the
right match. This is the same disambiguation step the LLM performs at
the start of any session. The difference is that it is now cross-graph.
Composability is proportional to shared canonical identity. Two graphs that both use RxNorm for drugs and MeSH for diseases can be traversed as a single logical graph for any query that stays within those domains. Two graphs with entirely bespoke ID schemes can be bridged only by label matching, which is slower and more ambiguous. The degree of composability is not a property of the BFS-QL protocol. It is a property of the graphs.
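The two bridging paths described above can be sketched with in-memory stand-ins. Everything here is illustrative: `ToyGraph`, `has_entity`, the `xrefs` field, and the `bridge` helper are hypothetical names, not part of the BFS-QL library. Only the idea of calling `search_entities` in the destination graph comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    """In-memory stand-in for one graph server (hypothetical, for illustration)."""
    name: str
    # entity id -> {"label": str, "xrefs": {scheme: id}}
    entities: dict = field(default_factory=dict)

    def has_entity(self, entity_id: str) -> bool:
        return entity_id in self.entities

    def search_entities(self, query: str) -> list[str]:
        """Return ids whose label equals the query or whose xrefs contain it."""
        hits = []
        for entity_id, record in self.entities.items():
            label_match = query.lower() == record["label"].lower()
            xref_match = query in record.get("xrefs", {}).values()
            if label_match or xref_match:
                hits.append(entity_id)
        return hits


def bridge(entity_id: str, label: str, target: ToyGraph) -> tuple[str, list[str]]:
    """Find candidate seeds in `target` for an entity known from another graph."""
    # Case 1: both graphs use the same ID scheme, so the ID is the bridge.
    if target.has_entity(entity_id):
        return "shared-id", [entity_id]
    # Case 2: different scheme, but the target records the ID as a property.
    by_xref = target.search_entities(entity_id)
    if by_xref:
        return "xref", by_xref
    # Case 3: bespoke scheme; fall back to label matching and disambiguation.
    return "label", target.search_entities(label)


# Toy data, invented for this example.
pharma = ToyGraph("pharma", {"RxNorm:3251": {"label": "desmopressin"}})
dbpedia = ToyGraph("dbpedia", {
    "DBpedia:Desmopressin": {"label": "Desmopressin",
                             "xrefs": {"rxnorm": "RxNorm:3251"}},
})
bespoke = ToyGraph("bespoke", {"local:42": {"label": "desmopressin"}})

via_shared_id = bridge("RxNorm:3251", "desmopressin", pharma)
via_xref = bridge("RxNorm:3251", "desmopressin", dbpedia)
via_label = bridge("RxNorm:3251", "desmopressin", bespoke)
```

The label fallback returns a list rather than a single ID on purpose: picking the right match is a judgment the caller still has to make.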
What the LLM Actually Sees¶
In a session with two BFS-QL servers connected, the model sees something like this in its tool list:
bfs-ql.describe_schema()    -- medlit Postgres graph
bfs-ql.search_entities(query, ...)
bfs-ql.bfs_query(seeds, ...)
bfs-ql.describe_entity(id)
bfs-ql.describe_entities([id, ...])
bfs-ql.intersect_subgraphs(seeds, k, ...)
dbpedia.describe_schema()    -- DBpedia SPARQL endpoint
dbpedia.search_entities(query, ...)
dbpedia.bfs_query(seeds, ...)
dbpedia.describe_entity(id)
dbpedia.describe_entities([id, ...])
dbpedia.intersect_subgraphs(seeds, k, ...)
The server name prefix is the only differentiator. The tool signatures
are identical. The session workflow is identical. A query that begins in
the kgraph graph -- orient, resolve desmopressin to RxNorm:3251, traverse
2 hops -- can continue in DBpedia by using RxNorm:3251 (or the label
"desmopressin") as the query for dbpedia.search_entities. The model
bridges graphs the same way a human researcher bridges databases: by
carrying a known identifier across sources.
The research literature graph knows what papers say about desmopressin -- which studies, which findings, which patient populations, which confidence scores. The encyclopedic backbone knows what desmopressin is -- its pharmacological class, its mechanism of action, its related compounds, its place in the drug taxonomy. Together they give the LLM both the frontier and the foundation. Neither graph has both. The composition does.
The Canonical ID Argument, Revisited¶
The companion volume argues for canonical IDs as a quality concern: an entity that is anchored to a MeSH term or RxNorm code is unambiguous, verifiable, and connected to a community of expert judgment. Here, the same argument appears as a composition argument: that anchoring is what makes the entity bridgeable across graphs.
The two arguments are not separate. They are the same observation from different vantage points. A canonical ID is not just a unique key for deduplication. It is a pointer into a shared epistemic commons -- the accumulated judgment of a community about how to name and classify things in a domain. When two graphs both point to that commons, they become connected through it, without any bilateral coordination.
The biomedical, legal, chemistry, and geography communities built their identifier infrastructures -- MeSH, RxNorm, HGNC, ChEBI, PubChem, Wikidata, GeoNames -- over decades for their own internal purposes: literature indexing, regulatory compliance, compound tracking, geographic reference. They were not building an interoperability layer for LLM reasoning. But that is what they built, as a side effect of building a shared commons. The emergent property was always latent in the infrastructure. BFS-QL makes it accessible.
Chapter 14: The Server Is Not the Point¶
It is easy, when building infrastructure, to mistake the infrastructure for the product. The MCP server starts, the tools register, the LLM connects -- and the system works. It is tempting to call this the achievement. It is not. The achievement is what happens next: an LLM reasoning over a knowledge graph and reaching conclusions it could not reach from any single document.
The server is not the point. The graph is the point. The server exists to make the graph accessible. If the graph is not worth serving -- if its entities are poorly extracted, its relationships are hallucinated, its canonical IDs are inconsistent -- then a flawless MCP server delivers nothing. The interface contract is only as valuable as what it connects to.
Graph Quality as the Upstream Constraint¶
The BFS-QL interface exposes whatever the graph contains. It does not
validate, filter, or improve graph content. A relationship that was
extracted with low confidence appears in bfs_query results just as a
high-confidence relationship does -- the confidence score is metadata,
not a filter. An entity that was improperly deduplicated appears as two
nodes where there should be one: the canonical representative and a
stub pointing at it. A canonical ID that was assigned incorrectly
connects the entity to the wrong place in the epistemic commons.
These are upstream concerns. They belong to the extraction and curation pipeline -- to the tools and processes covered in Knowledge Graphs from Unstructured Text, and specifically to its Chapter 9, which covers diagnostic queries for assessing graph health: entity type distribution, predicate coverage, deduplication quality, ID resolution rates. The question "is this graph worth serving?" should be answered before the MCP server is provisioned, not after.
This is not a criticism of BFS-QL's design. It is the correct division of labor. An interface that tries to compensate for graph quality issues -- by silently filtering low-confidence edges, by resolving deduplication conflicts on the fly, by guessing canonical IDs -- would be doing the wrong work at the wrong layer. The interface should be transparent. The graph owner is responsible for what it contains.
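A minimal sketch of that division of labor, assuming a simple edge record with a `confidence` field. The `Edge` type, the function names, and the sample data are all invented for illustration and do not reflect the library's actual types. The interface returns everything; any thresholding is an explicit decision made by the caller.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    subject: str
    predicate: str
    object: str
    confidence: float  # metadata carried with the edge, not a filter


def serve_edges(graph_edges: list[Edge]) -> list[Edge]:
    """Transparent interface: return every stored edge, confidence attached."""
    return list(graph_edges)


def caller_side_filter(edges: list[Edge], min_confidence: float) -> list[Edge]:
    """The caller, not the interface, decides what counts as trustworthy."""
    return [e for e in edges if e.confidence >= min_confidence]


# Invented sample data.
stored = [
    Edge("ex:drug-1", "treats", "ex:disease-1", 0.95),
    Edge("ex:drug-1", "treats", "ex:disease-2", 0.30),
]

served = serve_edges(stored)                # both edges come back
trusted = caller_side_filter(served, 0.8)   # the caller narrows them explicitly
```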
What "Active Contract" Means¶
The phrase active contract distinguishes BFS-QL from a passive data pipe. A passive pipe -- a REST endpoint that returns graph data on demand -- is indifferent to how its output is used. It has no opinion about session workflow, query order, or what the caller should do with stubs. It just serves data.
BFS-QL has opinions. The tool descriptions guide the LLM toward a specific
workflow: orient first, resolve names before traversing, start with
topology before requesting full metadata, use describe_entity for
expansion rather than re-querying. The server instructions warn about
prov: provisional IDs. The describe_schema tool is designed to be
called at session start, not on demand. The topology_only flag exists
because the server anticipates that full metadata is often unnecessary
for the first traversal.
These are not protocol features. They are epistemic scaffolding -- design choices that encode knowledge about how LLMs reason over graphs and what patterns of use lead to good outcomes. An LLM that follows the intended workflow reaches better conclusions faster, with less context waste, than one that queries arbitrarily. The contract is active because the server is not neutral about outcomes.
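The workflow order described above (orient, resolve, traverse topology first, then expand) can be illustrated against a toy in-memory server. The method names mirror the tool names in the text, and `topology_only` is the flag the text mentions. The `max_hops` parameter name, the return shapes, and the sample data are assumptions made for this sketch only.

```python
class ToyServer:
    """In-memory stand-in for a graph server (hypothetical return shapes)."""

    def __init__(self, nodes, edges):
        self.nodes = nodes  # id -> {"label": str, "type": str, ...}
        self.edges = edges  # list of (subject, predicate, object)

    def describe_schema(self):
        return {
            "entity_types": sorted({m["type"] for m in self.nodes.values()}),
            "predicates": sorted({p for _, p, _ in self.edges}),
        }

    def search_entities(self, query):
        q = query.lower()
        return [i for i, m in self.nodes.items() if q in m["label"].lower()]

    def describe_entity(self, entity_id):
        return self.nodes[entity_id]

    def bfs_query(self, seeds, max_hops=1, topology_only=False):
        visited, frontier, reached_edges = set(seeds), set(seeds), []
        for _ in range(max_hops):
            next_frontier = set()
            for edge in self.edges:
                s, _, o = edge
                if s in frontier or o in frontier:
                    if edge not in reached_edges:
                        reached_edges.append(edge)
                    for node in (s, o):
                        if node not in visited:
                            visited.add(node)
                            next_frontier.add(node)
            frontier = next_frontier
        if topology_only:
            node_view = sorted(visited)  # ids only, no metadata
        else:
            node_view = {n: self.nodes[n] for n in sorted(visited)}
        return {"nodes": node_view, "edges": reached_edges}


# Invented sample data.
server = ToyServer(
    nodes={
        "ex:drug-1": {"label": "Desmopressin", "type": "Drug"},
        "ex:disease-1": {"label": "Diabetes insipidus", "type": "Disease"},
        "ex:gene-1": {"label": "AVPR2", "type": "Gene"},
    },
    edges=[
        ("ex:drug-1", "treats", "ex:disease-1"),
        ("ex:gene-1", "associated_with", "ex:disease-1"),
    ],
)

schema = server.describe_schema()                       # 1. orient
seed = server.search_entities("desmopressin")[0]        # 2. resolve the name
topology = server.bfs_query([seed], max_hops=2,
                            topology_only=True)         # 3. shape first
detail = server.describe_entity("ex:gene-1")            # 4. expand selectively
```

The point of step 3 is that the first traversal returns only identifiers and edges; metadata is fetched in step 4 for the few nodes that turn out to matter.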
Minimal, Predictable, Describable¶
The three properties that make an LLM tool interface work -- minimal, predictable, describable -- are not independent. A larger surface area is harder to describe accurately. An unpredictable interface (one whose behavior depends on state, ordering, or undocumented invariants) is harder to reason about. A surface that is both large and unpredictable is practically unusable by a language model, which is why SPARQL fails as an LLM interface despite being a powerful and well-designed query language.
BFS-QL has six tools. Each does one thing. Each has the same behavior every time it is called with the same arguments. Each is described in a tool docstring that fits in a few sentences. The model can hold the entire interface in its working context simultaneously. It does not need to reason about which tool to use; the session workflow makes the order explicit. It does not need to consult documentation mid-session; the tool descriptions are self-contained.
These properties are not accidents of the current implementation. They
are design constraints that shaped every decision in Parts II and III.
The six-tool surface emerged from asking which operations are truly
distinct. The stub/full model emerged from asking how to keep response
size predictable. The topology_only mode emerged from asking what the
minimum useful response is. Minimal, predictable, and describable are
not virtues added at the end -- they are the criteria by which each
design choice was evaluated.
Chapter 15: Open Source and the SaaS Layer¶
The BFS-QL library is open source. This is not a purely ideological choice, though there are ideological reasons for it. It is a strategic choice informed by the specific position BFS-QL occupies: a developer tool that bridges existing infrastructure (knowledge graphs, SPARQL endpoints, Postgres databases) to a new capability (LLM reasoning over structured data). Developer tools in this position benefit from open source in ways that consumer products do not.
What the Library Gives You¶
The open-source library provides:
- The GraphDbInterface ABC and all backend implementations (Postgres, SPARQL, Neo4j)
- The BFS traversal engine, stub/full filtering, topology mode
- The CachedGraphDb caching wrapper
- The create_server() function and the six-tool MCP server
- The bfs-ql serve CLI command for local deployment
- The test suite and integration test infrastructure
This is a complete, functional implementation. A developer with an existing knowledge graph can clone the library, implement a backend for their store (or use an existing one), and have a working MCP server in hours. The library requires no license fee, no account, no API key.
The intended users of the raw library are developers, researchers, and organizations with existing infrastructure: a company that runs its own SPARQL endpoint, a research group with a Postgres graph of domain literature, a developer building a domain-specific LLM application who wants to add graph reasoning without depending on an external service. These users have the technical capability to self-host and the privacy or control requirements that make a hosted service unattractive.
What the Hosted Service Adds¶
The open-source library solves the single-graph, single-tenant, developer-operated case. The hosted service solves everything else:
Provisioning. Connecting a new SPARQL endpoint or Postgres database to a hosted BFS-QL server is a configuration operation, not a deployment operation. The user provides a connection string; the service handles the rest.
Multi-tenancy. Multiple users, multiple graphs, isolated sessions. The library has no concept of tenants; the service does.
Schema discovery. For unknown or public endpoints (DBpedia, Wikidata, ChEMBL), the service can probe the endpoint, build a partial schema, and configure the BFS-QL server automatically. The library requires the user to know their schema in advance.
Managed caching. The library's CachedGraphDb is session-scoped and
in-memory. The hosted service can maintain persistent, cross-session caches
for frequently queried graphs, reducing latency and backend load.
Private endpoint support. An organization with a knowledge graph behind a firewall can configure the hosted service as a proxy, exposing the BFS-QL interface to external LLM clients without exposing the raw endpoint.
Uptime and query accounting. The library has no monitoring, no rate limiting, no audit log. The service does.
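To make the "session-scoped and in-memory" caching mentioned under Managed caching concrete, here is a minimal memoizing wrapper. It is a sketch of the general pattern, not the library's CachedGraphDb: the `SessionCache` and `CountingBackend` classes and the single-argument `edges_from` signature are assumptions for illustration.

```python
class CountingBackend:
    """Toy backend that counts how often it is actually queried."""

    def __init__(self, adjacency):
        self._adjacency = adjacency
        self.calls = 0

    def edges_from(self, node_id):
        self.calls += 1
        return list(self._adjacency.get(node_id, []))


class SessionCache:
    """Memoize edges_from for the lifetime of one object (one session)."""

    def __init__(self, backend):
        self._backend = backend
        self._edges = {}
        self.hits = 0
        self.misses = 0

    def edges_from(self, node_id):
        if node_id in self._edges:
            self.hits += 1
        else:
            self.misses += 1
            # Store an immutable copy so callers cannot corrupt the cache.
            self._edges[node_id] = tuple(self._backend.edges_from(node_id))
        return self._edges[node_id]


backend = CountingBackend({"ex:a": [("ex:a", "treats", "ex:b")]})
cache = SessionCache(backend)
first = cache.edges_from("ex:a")
second = cache.edges_from("ex:a")  # served from memory, backend untouched
```

Because the cache lives inside the wrapper object, it disappears when the session ends. A persistent, cross-session cache of the kind described above would need external storage and an invalidation policy, which this sketch does not attempt.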
The Elastic Playbook¶
Elastic (Elasticsearch) open-sourced its search engine and built a multi-hundred-million-dollar business selling the managed service: Elastic Cloud. MongoDB did the same. HashiCorp did the same with Terraform and Vault. The pattern is well-established: open source the library, sell the managed service.
The reasons it works in these cases are the same reasons it works for BFS-QL:
First, the library is genuinely useful on its own. Open-source projects that are deliberately crippled to push users toward the paid service develop reputational problems. The library should be complete. Users who self-host should not feel they are getting an inferior product.
Second, self-hosting has real costs. Setting up and maintaining a production service -- monitoring, scaling, reliability, security patching -- is expensive even when the software is free. The managed service eliminates those costs. Users who value their time over control will pay for it.
Third, the open-source library is the top of the funnel. Developers who discover BFS-QL through the library, build something with it, and then need to scale or simplify operations are the natural customers for the hosted service. The library is marketing; the service is revenue.
Why Open Source Is Right for the Library¶
Beyond strategy, open source is correct for BFS-QL's library for reasons specific to its position.
Community backends. The eight-method interface is a specification. A community of users implementing backends for their preferred graph stores -- TigerGraph, Amazon Neptune, TerminusDB, RDFLib, ArangoDB -- extends BFS-QL's reach without requiring the core team to maintain implementations for every possible backend. Open source is the mechanism for this.
Credibility with technical audiences. The primary users of the library are developers and researchers who will inspect the source code before deploying it. Open source means they can. A black-box library that asks them to trust its behavior without inspection is a harder sell to a technical audience that has other options.
The kgraph referral path. The companion volume (Knowledge Graphs from Unstructured Text) is itself directed at technical practitioners who are building knowledge graphs. Its readers are exactly the users who will want to serve those graphs through BFS-QL. An open-source library that readers can install and run immediately, following the book's instructions, is a better companion to the book than a hosted service that requires account creation before the reader can try the first example.
Chapter 16: What Comes Next¶
BFS-QL as it exists today solves a specific problem: making a single knowledge graph accessible to an LLM through a minimal, well-defined interface. The solution is complete in the sense that it works -- the medlit demo graph, the DBpedia SPARQL endpoint, any kgraph-derived Postgres database can be served, connected, and queried. But complete does not mean finished. Several directions are visible from here.
Multi-Graph Federation¶
The composition model in Chapter 13 is manual: the LLM navigates across
graphs by carrying identifiers and calling search_entities in the
destination graph. This works, but it requires the LLM to be aware of
which graph to query at each step, to manage the bridging operation
explicitly, and to handle the case where a canonical ID in one graph does
not resolve in another.
A federation layer would make this automatic. Given a set of registered BFS-QL graphs and a query, the federation layer would identify which graphs are relevant, issue parallel BFS queries, and merge the results -- resolving identity conflicts using shared canonical IDs and presenting the union as a single logical graph. The LLM would see one set of six tools, not N sets.
The technical foundation for this exists. Shared canonical IDs provide the
merge key. The GraphDbInterface ABC provides the interface that every
backend already implements. The CachedGraphDb wrapper provides the
session-scoped caching that would extend naturally to a cross-graph scope.
What does not yet exist is the federation engine itself: the identity
resolution pass, the union semantics for conflicting metadata, the query
routing logic.
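One piece of that missing engine, the merge keyed on shared canonical IDs, can be sketched in a few lines. This is a hypothetical illustration, not a design the source commits to: the input shape, the function name, and the first-writer-wins policy with recorded conflicts are all assumptions.

```python
def merge_by_canonical_id(results_by_graph):
    """Union node records from several graphs, keyed by shared canonical ID.

    results_by_graph: {graph_name: {canonical_id: {field: value}}}
    Returns (merged, conflicts). The first graph to supply a field wins;
    disagreements are recorded rather than silently resolved.
    """
    merged = {}
    conflicts = []
    for graph_name, nodes in results_by_graph.items():
        for canonical_id, fields in nodes.items():
            record = merged.setdefault(
                canonical_id, {"sources": [], "fields": {}}
            )
            record["sources"].append(graph_name)
            for key, value in fields.items():
                if key not in record["fields"]:
                    record["fields"][key] = value
                elif record["fields"][key] != value:
                    conflicts.append(
                        (canonical_id, key, record["fields"][key], value)
                    )
    return merged, conflicts


# Invented sample results from two graphs.
merged, conflicts = merge_by_canonical_id({
    "literature": {
        "RxNorm:3251": {"label": "desmopressin", "paper_count": 12},
    },
    "encyclopedia": {
        "RxNorm:3251": {"label": "Desmopressin", "drug_class": "hormone analog"},
        "ex:other": {"label": "unrelated"},
    },
})
```

Even this toy version surfaces the open question the text names: what the union semantics should be when two graphs disagree about the same field.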
Schema-Aware Query Optimization¶
BFS-QL currently fetches all edges from all nodes in the frontier at each hop, then applies stub/full filtering to the results. For a well-connected graph with many predicates, most of those edges may be irrelevant to the query -- they will become stubs regardless. The traversal still issues the backend calls, still acquires pool connections, still processes the rows.
Schema-aware optimization would use the predicates filter to prune
traversal before it happens. If the caller specifies
predicates=["treats", "inhibits"], the backend's edges_from query
can include a WHERE predicate IN (...) clause, eliminating irrelevant
edges at the database level rather than after retrieval. The
GraphDbInterface would need to support optional predicate hints, or
a specialized edges_from_filtered method could be added for backends
that can use it efficiently.
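The predicate push-down idea can be sketched as follows. To keep the example self-contained it uses Python's built-in `sqlite3` rather than Postgres, and the `edges` table layout and the optional `predicates` argument on `edges_from` are assumptions, not the library's actual schema or signature.

```python
import sqlite3


def edges_from(conn, node_id, predicates=None):
    """Fetch outgoing edges, filtering by predicate inside the database."""
    sql = "SELECT subject, predicate, object FROM edges WHERE subject = ?"
    params = [node_id]
    if predicates:
        # One bound placeholder per predicate; values are never interpolated.
        placeholders = ", ".join("?" for _ in predicates)
        sql += f" AND predicate IN ({placeholders})"
        params.extend(predicates)
    sql += " ORDER BY predicate, object"
    return conn.execute(sql, params).fetchall()


# Invented sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [
        ("ex:drug-1", "treats", "ex:disease-1"),
        ("ex:drug-1", "inhibits", "ex:protein-1"),
        ("ex:drug-1", "mentioned_in", "ex:paper-1"),
        ("ex:drug-1", "mentioned_in", "ex:paper-2"),
    ],
)

all_edges = edges_from(conn, "ex:drug-1")
relevant = edges_from(conn, "ex:drug-1", predicates=["treats", "inhibits"])
```

With the filter in the WHERE clause, the irrelevant `mentioned_in` rows never leave the database, which is the saving the paragraph above describes.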
This optimization matters most for large, densely connected graphs where the majority of edges are structural noise for any given query. For the medlit demo graph (99 edges total), it is irrelevant. For a Wikidata subgraph or a large pharmaceutical compound graph, it could reduce traversal time by an order of magnitude.
Richer Traversal Primitives¶
BFS is the right primitive for LLM-driven graph exploration today, for the reasons Chapter 3 argues: it starts from what the model knows, expands outward, and produces a bounded, interpretable result. But BFS is not the only useful traversal primitive, and as LLM context windows grow and models become better at structured reasoning, richer operations become viable.
Shortest-path queries -- "what is the shortest connection between compound
A and disease B?" -- are structurally different from BFS but expressible
as a composition of BFS calls. A dedicated shortest_path tool would be
more efficient and more explicit. Aggregate queries -- "which entity type
appears most frequently in the 2-hop neighborhood of this seed?" -- are
currently inexpressible in BFS-QL; they require the LLM to aggregate
BFS results manually, which is error-prone for large result sets.
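For comparison, here is what a dedicated shortest-path primitive might look like as a standalone function. It is a plain breadth-first search with parent tracking and a hop limit, written against an arbitrary `neighbors` callable. Nothing in it is BFS-QL API: the function name, the `max_hops` parameter, and the sample graph are assumptions for illustration.

```python
from collections import deque


def shortest_path(neighbors, start, goal, max_hops=4):
    """Return the first (hence shortest) path found by BFS, or None."""
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop budget
        for nxt in neighbors(node):
            if nxt in parents:
                continue
            parents[nxt] = node
            if nxt == goal:
                # Walk parent links back to the start, then reverse.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append((nxt, depth + 1))
    return None


# Invented undirected sample graph with a short and a long route.
adjacency = {
    "ex:compound-A": ["ex:paper-1", "ex:protein-1"],
    "ex:paper-1": ["ex:compound-A", "ex:paper-2"],
    "ex:paper-2": ["ex:paper-1", "ex:disease-B"],
    "ex:protein-1": ["ex:compound-A", "ex:disease-B"],
    "ex:disease-B": ["ex:protein-1", "ex:paper-2"],
}


def lookup(node):
    return adjacency.get(node, [])


path = shortest_path(lookup, "ex:compound-A", "ex:disease-B")
too_far = shortest_path(lookup, "ex:compound-A", "ex:disease-B", max_hops=1)
```

Unlike a sequence of bfs_query calls, this stops as soon as the goal is reached and returns only the path, not the whole neighborhood.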
These are not arguments to add these tools now. The six-tool surface is
correct for current LLM capabilities and context constraints. The argument
is that the eight-method ABC is designed to remain stable as the surface
above it evolves: new tools can be added to the MCP server without
changing any backend implementation. The ABC is the stable layer;
the tools are the variable layer. intersect_subgraphs is an example of
how this works in practice: it was added to the server layer without any
changes to the three existing backends.
The Interface Contract as the Stable Layer¶
This is worth stating directly. Backends and LLMs will both evolve. Postgres will add new features. SPARQL endpoints will grow. Neo4j's query language will change. LLMs will get better at structured reasoning, larger context windows, more reliable tool use. Any of these changes might motivate changes to how BFS-QL works.
The eight-method ABC is designed to absorb these changes without
propagating them. A new backend for a new graph store implements the
same eight methods it would have implemented today. A new LLM with
larger context windows gets larger BFS results from the same bfs_query
tool; the interface does not change. A new traversal primitive can be added
as a seventh tool on the MCP server; existing backends continue to work
unchanged because the new tool is implemented in the server layer in
terms of the existing eight methods.
The design principle -- all intelligence in the server layer, all backend-specific logic behind the ABC -- is what makes this possible. The boundary between server and backend is not just an organizational choice. It is the line along which the system can evolve without breaking the parts that work.
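The layering can be shown in miniature. The sketch below uses a one-method abstract class as a stand-in for the eight-method ABC, two differently structured backends, and a server-layer operation written only in terms of that one method. `GraphBackend`, both backend classes, and `shared_neighborhood` are hypothetical names, and the operation is not claimed to match how the library implements any of its tools.

```python
from abc import ABC, abstractmethod


class GraphBackend(ABC):
    """Reduced stand-in for the backend ABC (one method instead of eight)."""

    @abstractmethod
    def edges_from(self, node_id):
        """Return outgoing edges as (subject, predicate, object) tuples."""


class DictBackend(GraphBackend):
    def __init__(self, adjacency):
        self._adjacency = adjacency

    def edges_from(self, node_id):
        return list(self._adjacency.get(node_id, []))


class TripleListBackend(GraphBackend):
    def __init__(self, triples):
        self._triples = triples

    def edges_from(self, node_id):
        return [t for t in self._triples if t[0] == node_id]


def k_hop_neighborhood(backend, seed, k):
    """Server-layer helper: nodes reachable from seed within k outgoing hops."""
    reached, frontier = {seed}, {seed}
    for _ in range(k):
        frontier = {
            obj for node in frontier for _, _, obj in backend.edges_from(node)
        } - reached
        reached |= frontier
    return reached


def shared_neighborhood(backend, seeds, k):
    """Server-layer operation: nodes within k hops of every seed."""
    neighborhoods = [k_hop_neighborhood(backend, s, k) for s in seeds]
    return set.intersection(*neighborhoods) if neighborhoods else set()


# Invented sample data, stored two different ways.
triples = [
    ("a", "p", "c"), ("a", "p", "e"), ("b", "q", "c"), ("c", "r", "d"),
]
dict_backend = DictBackend({
    "a": [("a", "p", "c"), ("a", "p", "e")],
    "b": [("b", "q", "c")],
    "c": [("c", "r", "d")],
})
list_backend = TripleListBackend(triples)

from_dict = shared_neighborhood(dict_backend, ["a", "b"], k=2)
from_list = shared_neighborhood(list_backend, ["a", "b"], k=2)
```

Neither backend class knows `shared_neighborhood` exists, which is the property the paragraph above relies on: the new operation works on both without either being touched.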
What Would Have to Change¶
It is worth asking what would have to happen for BFS-QL to become inadequate as an interface. Several scenarios are plausible:
LLM-native graph reasoning. If future LLMs could natively reason over graph structures -- not just flat text, but labeled property graphs with adjacency semantics -- then the BFS-QL interface might be bypassed in favor of direct graph access. This seems distant; the transformer architecture has no graph inductive bias, and graph-native reasoning would require architectural changes beyond what current scaling achieves.
Context windows that eliminate the working-set constraint. If context windows grew to the point where dumping an entire knowledge graph was computationally reasonable, the argument for selective BFS traversal would weaken. At ten million tokens, you could load DBpedia. But the quadratic attention cost means this is not merely a hardware scaling problem -- it is structural. The working-set constraint does not go away with larger windows; it becomes more expensive, not less.
A better traversal primitive. If a different graph access pattern -- not BFS, not random walk, not shortest path, but something not yet named -- turned out to be more natural for LLM-driven reasoning, BFS-QL would need to change. This is possible. BFS is motivated by the analogy to how LLMs construct reasoning chains, but analogies are not proofs.
None of these scenarios is imminent. What this analysis reveals is that BFS-QL's longevity depends most on the durability of the working-set constraint. If context windows remain scarce relative to graph size -- which the transformer architecture strongly suggests they will -- then selective, bounded traversal from known seeds will remain the right model. The interface built around that model will remain correct.