Chapter 9: Identity During Querying
search_entities and the Identity Server
BFS-QL's search_entities tool accepts a natural-language string and returns
a list of matching entity IDs. Under the hood, this is an identity server
operation: embed the query string, search for nearby entity vectors in the
identity server's database, and return the canonical IDs of the matching entities.
The caller -- the LLM using the BFS-QL interface -- does not know that it is calling the identity server. It provides a string and receives IDs. The identity server provides the matching. This is the correct abstraction: the query layer is responsible for traversal, the identity server is responsible for resolution.
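The embed, search, return sequence can be sketched as follows. This is a minimal illustration, not the identity server's actual implementation: the hashed-trigram toy_embed function stands in for a real embedding model, the in-memory dictionary stands in for the vector database, and the register method and the "EX:" IDs are hypothetical names invented for the example.

```python
import hashlib
import math

DIM = 64  # hypothetical vector size for this toy example


def toy_embed(text):
    """Stand-in for a real embedding model: hashed character trigrams, L2-normalized."""
    vec = [0.0] * DIM
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        gram = padded[i:i + 3]
        bucket = int(hashlib.md5(gram.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    # Both vectors are unit length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))


class IdentityServer:
    def __init__(self):
        self._vectors = {}  # canonical ID -> entity vector

    def register(self, canonical_id, name):
        # Hypothetical ingest-side helper: embed the entity name once and store it.
        self._vectors[canonical_id] = toy_embed(name)

    def search_entities(self, query, limit=5):
        # 1. Embed the query string.
        q = toy_embed(query)
        # 2. Rank stored entity vectors by similarity to the query.
        ranked = sorted(
            self._vectors,
            key=lambda cid: cosine_similarity(q, self._vectors[cid]),
            reverse=True,
        )
        # 3. Return canonical IDs only; vectors never leave the server.
        return ranked[:limit]


server = IdentityServer()
server.register("EX:0001", "aspirin")
server.register("EX:0002", "ibuprofen")
results = server.search_entities("aspirin", limit=1)
```

The caller passes a string and gets back a list of IDs. Nothing about the vector representation appears in the interface.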
Embeddings Are an Identity Server Concern
The Postgres/pgvector backend described in BFS-QL Chapter 10 notes that "embedding model consistency between ingest and query time must be explicit metadata, not convention." The identity server satisfies this requirement by owning all embeddings.
Because the identity server manages both ingest-time embedding (during entity
creation and the embedding-similarity stage of the lookup chain) and query-time
embedding (during search_entities), it guarantees consistency without any
coordination between the ingestion pipeline and the query layer. The query layer
calls search_entities with a string, and the identity server embeds it with the
same model it used during ingest, so the cosine distances are meaningful.
The embedding model, vector dimensions, and distance metric are internal implementation details of the identity server. The ingestion pipeline does not know which embedding model is in use. The query layer does not know. Only the identity server knows, and it is consistent because it is a single service.
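One way to picture this encapsulation is a service object that holds the embedding configuration privately and routes both ingest and query through the same internal embed step. The sketch below is an assumption-laden illustration: EmbeddingSpec, create_entity, the letter-count embedding, and the "EX:" IDs are all hypothetical, and a real server would call an actual model and a vector store.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingSpec:
    # Explicit metadata describing how vectors were produced (hypothetical shape).
    model_name: str
    dimensions: int
    metric: str


def letter_counts(text):
    """Toy embedding: counts of the letters a-z. Stands in for a real model."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 if norm == 0 else 1.0 - dot / norm


class IdentityServer:
    def __init__(self, spec, embed_fn):
        self._spec = spec          # private: callers never see the model choice
        self._embed_fn = embed_fn
        self._rows = {}            # canonical ID -> (spec used, vector)

    def _embed(self, text):
        # Single embedding path shared by ingest and query.
        vec = self._embed_fn(text)
        if len(vec) != self._spec.dimensions:
            raise ValueError("embedding does not match the declared dimensions")
        return vec

    def create_entity(self, canonical_id, name):
        # Ingest time: stamp each stored vector with the spec that produced it.
        self._rows[canonical_id] = (self._spec, self._embed(name))

    def search_entities(self, query, limit=5):
        # Query time: same _embed, so distances are comparable.
        q = self._embed(query)
        scored = []
        for cid, (spec, vec) in self._rows.items():
            if spec != self._spec:
                # Would fire if the model were swapped without re-embedding.
                raise RuntimeError("stored vector came from a different embedding spec")
            scored.append((cosine_distance(q, vec), cid))
        scored.sort()
        return [cid for _, cid in scored[:limit]]


server = IdentityServer(EmbeddingSpec("toy-letter-counts", 26, "cosine"), letter_counts)
server.create_entity("EX:A", "aspirin")
server.create_entity("EX:B", "warfarin")
matches = server.search_entities("asprin")
```

Neither create_entity nor search_entities takes a model name, a dimension count, or a metric as an argument, which is the point: the ingestion pipeline and the query layer have nothing to agree on.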
Cross-Graph Composition
When a BFS-QL client has connections to two graphs -- one built from research papers, one built from clinical trial records -- and both graphs anchor their entities to the same authorities, the client can traverse from one graph to the other using canonical IDs as bridges.
This traversal requires no special protocol support in BFS-QL. The client calls
bfs_query on the first graph starting from a canonical ID. The response
includes the canonical IDs of neighboring entities. The client calls
search_entities on the second graph with those IDs. If the second graph
contains entities with matching canonical IDs, the traversal crosses graphs.
The identity server is why this works. Both graphs used the same authorities. Both graphs anchored their entities to those authorities. The shared IDs are the bridges. The identity server made them available; BFS-QL traverses them.
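The client-side flow can be sketched with two small in-memory graphs. Everything here is illustrative: ToyGraph, cross_graph_neighbors, and the "EX:" IDs are hypothetical, and the toy search_entities does an exact canonical-ID lookup rather than the embedding search a real deployment would perform.

```python
from collections import deque


class ToyGraph:
    """In-memory stand-in for one BFS-QL graph, keyed by canonical IDs."""

    def __init__(self, edges):
        self._adj = {}
        for a, b in edges:
            self._adj.setdefault(a, set()).add(b)
            self._adj.setdefault(b, set()).add(a)

    def bfs_query(self, start, max_depth=1):
        # Return canonical IDs reachable within max_depth hops, excluding start.
        if start not in self._adj:
            return []
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue
            for neighbor in sorted(self._adj[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    found.append(neighbor)
                    queue.append((neighbor, depth + 1))
        return found

    def search_entities(self, query):
        # Toy behavior: exact match on a canonical ID (an assumption of this sketch).
        return [query] if query in self._adj else []


def cross_graph_neighbors(first, second, start_id):
    # Step 1: traverse the first graph from a canonical ID.
    bridges = []
    for cid in first.bfs_query(start_id):
        # Step 2: look for the same canonical ID in the second graph.
        bridges.extend(second.search_entities(cid))
    # Step 3: continue the traversal in the second graph from each bridge.
    return {cid: second.bfs_query(cid) for cid in bridges}


papers = ToyGraph([("EX:paper-1", "EX:drug-7"), ("EX:paper-1", "EX:gene-3")])
trials = ToyGraph([("EX:drug-7", "EX:trial-42")])
crossed = cross_graph_neighbors(papers, trials, "EX:paper-1")
```

Here only EX:drug-7 exists in both graphs, so it is the single bridge. EX:gene-3 has no counterpart in the trials graph and the traversal simply stops there.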