Chapter 14: The Server Is Not the Point

It is easy, when building infrastructure, to mistake the infrastructure for the product. The MCP server starts, the tools register, the LLM connects -- and the system works. It is tempting to call this the achievement. It is not. The achievement is what happens next: an LLM reasoning over a knowledge graph and reaching conclusions it could not reach from any single document.

The server is not the point. The graph is the point. The server exists to make the graph accessible. If the graph is not worth serving -- if its entities are poorly extracted, its relationships are hallucinated, its canonical IDs are inconsistent -- then a flawless MCP server delivers nothing. The interface contract is only as valuable as what it connects to.

Graph Quality as the Upstream Constraint

The BFS-QL interface exposes whatever the graph contains. It does not validate, filter, or improve graph content. A relationship extracted with low confidence appears in bfs_query results exactly as a high-confidence relationship does -- the confidence score is metadata, not a filter. An improperly deduplicated entity appears as two nodes, one the canonical representative and the other a stub pointing at it. An incorrectly assigned canonical ID connects the entity to the wrong place in the epistemic commons.
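The distinction between metadata and filter can be made concrete with a small sketch. Everything here is illustrative: the Edge shape, the serve_edges function, and the caller-side filter_by_confidence helper are hypothetical names, not part of BFS-QL. The sketch assumes only that each edge carries a numeric confidence score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    # Hypothetical edge shape; the real result format may differ.
    subject: str
    predicate: str
    obj: str
    confidence: float


def serve_edges(edges):
    """Transparent interface: return every edge with its confidence attached."""
    return list(edges)


def filter_by_confidence(edges, threshold):
    """Caller-side policy: the consumer decides which edges to trust."""
    return [e for e in edges if e.confidence >= threshold]


graph_edges = [
    Edge("drug:A", "treats", "disease:X", 0.95),
    Edge("drug:A", "causes", "disease:Y", 0.30),
]

served = serve_edges(graph_edges)             # both edges, nothing dropped
trusted = filter_by_confidence(served, 0.5)   # the caller applies its own cut
```

The interface layer returns both edges. Any thresholding happens where the policy is visible, either upstream during curation or downstream in the consumer.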

These are upstream concerns. They belong to the extraction and curation pipeline -- to the tools and processes covered in Knowledge Graphs from Unstructured Text, and specifically to its Chapter 9, which covers diagnostic queries for assessing graph health: entity type distribution, predicate coverage, deduplication quality, ID resolution rates. The question "is this graph worth serving?" should be answered before the MCP server is provisioned, not after.
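As a rough illustration of what such pre-provisioning checks might compute, the sketch below covers three of the listed diagnostics over plain Python dictionaries. The record layout (id, type, predicate keys) is an assumption made for this example, as is the convention that an ID beginning with prov: is unresolved. These are not the diagnostic queries from the referenced chapter.

```python
from collections import Counter


def entity_type_distribution(entities):
    """Count entities per type; a heavily skewed count can hint at extraction bias."""
    return Counter(e["type"] for e in entities)


def predicate_coverage(edges, expected_predicates):
    """Report which expected predicates occur at least once in the graph."""
    seen = {e["predicate"] for e in edges}
    return {p: (p in seen) for p in expected_predicates}


def id_resolution_rate(entities, provisional_prefix="prov:"):
    """Fraction of entities whose ID is not provisional (assumed prefix convention)."""
    if not entities:
        return 0.0
    resolved = sum(1 for e in entities if not e["id"].startswith(provisional_prefix))
    return resolved / len(entities)


# Hypothetical sample data, for illustration only.
entities = [
    {"id": "ent:1", "type": "Drug"},
    {"id": "ent:2", "type": "Condition"},
    {"id": "ent:3", "type": "Condition"},
    {"id": "prov:4", "type": "Drug"},
]
edges = [{"subject": "ent:1", "predicate": "treats", "object": "ent:2"}]

distribution = entity_type_distribution(entities)
coverage = predicate_coverage(edges, ["treats", "causes"])
resolution = id_resolution_rate(entities)
```

What counts as a healthy value for any of these numbers depends on the domain. The point of the sketch is only that the checks run against the graph itself, before any server is involved.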

This is not a criticism of BFS-QL's design. It is the correct division of labor. An interface that tries to compensate for graph quality issues -- by silently filtering low-confidence edges, by resolving deduplication conflicts on the fly, by guessing canonical IDs -- would be doing the wrong work at the wrong layer. The interface should be transparent. The graph owner is responsible for what it contains.

What "Active Contract" Means

The phrase "active contract" distinguishes BFS-QL from a passive data pipe. A passive pipe -- a REST endpoint that returns graph data on demand -- is indifferent to how its output is used. It has no opinion about session workflow, query order, or what the caller should do with stubs. It just serves data.

BFS-QL has opinions. The tool descriptions guide the LLM toward a specific workflow: orient first, resolve names before traversing, start with topology before requesting full metadata, and use describe_entity for expansion rather than re-querying. The server instructions warn about provisional IDs, which carry the prov: prefix. The describe_schema tool is designed to be called at session start, not on demand. The topology_only flag exists because the server anticipates that full metadata is often unnecessary for the first traversal.
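The intended call order can be sketched against an in-memory stand-in. The names describe_schema, bfs_query, describe_entity, and topology_only come from the text above. Everything else is hypothetical: the FakeGraphClient class, the resolve_entity method name, the argument shapes, and the return formats are assumptions for illustration, not the real tool signatures.

```python
class FakeGraphClient:
    """In-memory stand-in for an MCP client; records the order of tool calls."""

    def __init__(self):
        self.calls = []
        self._names = {"aspirin": "ent:1"}
        self._neighbors = {"ent:1": ["ent:2", "ent:3"]}
        self._details = {
            "ent:2": {"id": "ent:2", "name": "headache", "type": "Condition"},
        }

    def describe_schema(self):
        self.calls.append("describe_schema")
        return {"entity_types": ["Drug", "Condition"], "predicates": ["treats"]}

    def resolve_entity(self, name):
        # Hypothetical name for the name-resolution step.
        self.calls.append("resolve_entity")
        return self._names.get(name.lower())

    def bfs_query(self, seed, topology_only=False):
        self.calls.append("bfs_query")
        ids = self._neighbors.get(seed, [])
        if topology_only:
            return [{"id": i} for i in ids]  # stubs only, no metadata
        return [self._details.get(i, {"id": i}) for i in ids]

    def describe_entity(self, entity_id):
        self.calls.append("describe_entity")
        return self._details.get(entity_id, {"id": entity_id})


def run_session(client, name):
    """Orient, resolve, traverse topology first, then expand one stub."""
    schema = client.describe_schema()
    seed = client.resolve_entity(name)
    if seed is None:
        return None
    stubs = client.bfs_query(seed, topology_only=True)
    expanded = client.describe_entity(stubs[0]["id"]) if stubs else None
    return {"schema": schema, "seed": seed, "stubs": stubs, "expanded": expanded}


client = FakeGraphClient()
result = run_session(client, "Aspirin")
```

In this sketch the ordering lives in run_session. In the design described above, the same ordering is conveyed to the model through tool descriptions and server instructions rather than enforced in code.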

These are not protocol features. They are epistemic scaffolding -- design choices that encode knowledge about how LLMs reason over graphs and what patterns of use lead to good outcomes. An LLM that follows the intended workflow reaches better conclusions faster, with less context waste, than one that queries arbitrarily. The contract is active because the server is not neutral about outcomes.

Minimal, Predictable, Describable

The three properties that make an LLM tool interface work -- minimal, predictable, describable -- are not independent. A larger surface area is harder to describe accurately. An unpredictable interface (one whose behavior depends on state, ordering, or undocumented invariants) is harder to reason about. A surface that is both large and unpredictable is practically unusable by a language model, which is why SPARQL fails as an LLM interface despite being a powerful and well-designed query language.

BFS-QL has six tools. Each does one thing. Each has the same behavior every time it is called with the same arguments. Each is described in a tool docstring that fits in a few sentences. The model can hold the entire interface in its working context simultaneously. It does not need to reason about which tool to use; the session workflow makes the order explicit. It does not need to consult documentation mid-session; the tool descriptions are self-contained.

These properties are not accidents of the current implementation. They are design constraints that shaped every decision in Parts II and III. The six-tool surface emerged from asking which operations are truly distinct. The stub/full model emerged from asking how to keep response size predictable. The topology_only mode emerged from asking what the minimum useful response is. Minimal, predictable, and describable are not virtues added at the end -- they are the criteria by which each design choice was evaluated.