Chapter 15: Open Source and the SaaS Layer\index{open source}\index{SaaS}

The BFS-QL library is open source. This is not a purely ideological choice, though there are ideological reasons for it. It is a strategic choice informed by the specific position BFS-QL occupies: a developer tool that bridges existing infrastructure (knowledge graphs, SPARQL endpoints, Postgres databases) to a new capability (LLM reasoning over structured data). Developer tools in this position benefit from open source in ways that consumer products do not.

What the Library Gives You

The open-source library provides:

  • The GraphDbInterface ABC and all backend implementations (Postgres, SPARQL, Neo4j)
  • The BFS traversal engine, stub/full filtering, topology mode
  • The CachedGraphDb caching wrapper
  • The create_server() function and the six-tool MCP server
  • The bfs-ql serve CLI command for local deployment
  • The test suite and integration test infrastructure

This is a complete, functional implementation. A developer with an existing knowledge graph can clone the repository, implement a backend for their store (or use an existing one), and have a working MCP server in hours. Using the library requires no paid license, no account, and no API key.

The intended users of the raw library are developers, researchers, and organizations with existing infrastructure: a company that runs its own SPARQL endpoint, a research group with a Postgres graph of domain literature, a developer building a domain-specific LLM application who wants to add graph reasoning without depending on an external service. These users have the technical capability to self-host and the privacy or control requirements that make a hosted service unattractive.

What the Hosted Service Adds

The open-source library solves the single-graph, single-tenant, developer-operated case. The hosted service solves everything else:

Provisioning. Connecting a new SPARQL endpoint or Postgres database to a hosted BFS-QL server is a configuration operation, not a deployment operation. The user provides a connection string; the service handles the rest.

Multi-tenancy. Multiple users, multiple graphs, isolated sessions. The library has no concept of tenants; the service does.

Schema discovery. For unknown or public endpoints (DBpedia, Wikidata, ChEMBL), the service can probe the endpoint, build a partial schema, and configure the BFS-QL server automatically. The library requires the user to know their schema in advance.
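Probing of this kind usually comes down to a few aggregate SPARQL queries. The sketch below is an assumption about how such a probe might be built, not a description of the hosted service's implementation. The function names and the default limit are hypothetical. It only constructs the query strings and leaves the HTTP call to whichever SPARQL client is in use.

```python
# Hypothetical sketch: query builders for probing an unknown SPARQL endpoint.
# None of these names come from BFS-QL; they are illustrative only.


def class_probe_query(limit: int = 50) -> str:
    """Build a query for the most common rdf:type values, with counts."""
    return (
        "SELECT ?cls (COUNT(?s) AS ?n) "
        "WHERE { ?s a ?cls } "
        "GROUP BY ?cls ORDER BY DESC(?n) "
        f"LIMIT {limit}"
    )


def predicate_probe_query(class_iri: str, limit: int = 50) -> str:
    """Build a query for the predicates used on instances of one class."""
    return (
        "SELECT ?p (COUNT(*) AS ?n) "
        f"WHERE {{ ?s a <{class_iri}> ; ?p ?o }} "
        "GROUP BY ?p ORDER BY DESC(?n) "
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(class_probe_query(limit=10))
    print(predicate_probe_query("http://example.org/Drug", limit=10))
```

Full aggregates over a very large public endpoint may time out, which is one reason a discovered schema is likely to be partial rather than exhaustive.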

Managed caching. The library's CachedGraphDb is session-scoped and in-memory. The hosted service can maintain persistent, cross-session caches for frequently queried graphs, reducing latency and backend load.
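The difference between a session-scoped cache and a cross-session cache can be shown with a small sketch. The class below is hypothetical and is not part of BFS-QL or the hosted service. It assumes cached values are JSON-serializable and uses SQLite from the Python standard library as a stand-in for whatever persistent store a real service would use.

```python
import json
import sqlite3
from typing import Any, Callable


class PersistentCache:
    """Hypothetical cross-session cache backed by SQLite (not part of BFS-QL)."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)"
        )

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        row = self._db.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            # Cache hit: the backend is not touched.
            return json.loads(row[0])
        # Cache miss: ask the backend once, then store the result on disk.
        value = compute()
        self._db.execute(
            "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self._db.commit()
        return value
```

Because the entries live on disk, a second process or a later session that opens the same file sees the results of the first. The sketch has no expiry or invalidation, which a cache in front of a changing graph would need.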

Private endpoint support. An organization with a knowledge graph behind a firewall can configure the hosted service as a proxy, exposing the BFS-QL interface to external LLM clients without exposing the raw endpoint.

Uptime and query accounting. The library has no monitoring, no rate limiting, and no audit log. The service provides all three.
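As a rough illustration of what query accounting involves, the sketch below combines a per-tenant fixed-window rate limit with an append-only audit log. Every name in it (QueryMeter, allow, the tenant and tool strings) is hypothetical. It is one simple way to do this, not a description of the hosted service.

```python
import time
from collections import defaultdict
from typing import Callable


class QueryMeter:
    """Hypothetical per-tenant accounting with a fixed-window rate limit."""

    def __init__(
        self,
        max_per_window: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._clock = clock
        self._window_start: dict[str, float] = {}
        self._window_count: dict[str, int] = defaultdict(int)
        # Each entry: (timestamp, tenant, tool name, allowed?)
        self.audit_log: list[tuple[float, str, str, bool]] = []

    def allow(self, tenant: str, tool: str) -> bool:
        now = self._clock()
        start = self._window_start.get(tenant)
        # Start a fresh window on first use or once the old one has elapsed.
        if start is None or now - start >= self.window_seconds:
            self._window_start[tenant] = now
            self._window_count[tenant] = 0
        allowed = self._window_count[tenant] < self.max_per_window
        if allowed:
            self._window_count[tenant] += 1
        # Record every attempt, including rejected ones.
        self.audit_log.append((now, tenant, tool, allowed))
        return allowed
```

A server would call allow() before dispatching each tool call and reject the call when it returns False. The injectable clock exists only to make the behavior easy to test.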

The Elastic Playbook

Elastic open-sourced its search engine, Elasticsearch, and built a multi-hundred-million-dollar business selling the managed service: Elastic Cloud. MongoDB did the same with Atlas. HashiCorp did the same with Terraform and Vault. The pattern is well-established: open-source the core software, sell the managed service.

The reasons it works in these cases are the same reasons it works for BFS-QL:

First, the library is genuinely useful on its own. Open-source projects that are deliberately crippled to push users toward the paid service develop reputational problems. The library should be complete. Users who self-host should not feel they are getting an inferior product.

Second, self-hosting has real costs. Setting up and maintaining a production service -- monitoring, scaling, reliability, security patching -- is expensive even when the software is free. The managed service eliminates those costs. Users who value their time over control will pay for it.

Third, the open-source library is the top of the funnel. Developers who discover BFS-QL through the library, build something with it, and then need to scale or simplify operations are the natural customers for the hosted service. The library is marketing; the service is revenue.

Why Open Source Is Right for the Library

Beyond strategy, open source is correct for BFS-QL's library for reasons specific to its position.

Community backends. The eight-method interface is a specification. A community of users implementing backends for their preferred graph stores -- TigerGraph, Amazon Neptune, TerminusDB, RDFLib, ArangoDB -- extends BFS-QL's reach without requiring the core team to maintain implementations for every possible backend. Open source is the mechanism for this.
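The sketch below shows why an interface-as-specification makes community backends cheap to write. It does not reproduce GraphDbInterface: the stand-in ABC here has two methods with invented names, where the real interface has eight, and the traversal function is a simplified illustration rather than the library's BFS engine.

```python
from abc import ABC, abstractmethod
from collections import deque


class MiniGraphInterface(ABC):
    """Stand-in for GraphDbInterface; the method names here are invented."""

    @abstractmethod
    def get_node(self, node_id: str) -> dict: ...

    @abstractmethod
    def get_neighbors(self, node_id: str) -> list[str]: ...


class DictBackend(MiniGraphInterface):
    """A toy 'community backend' over a plain adjacency dict."""

    def __init__(self, adjacency: dict[str, list[str]]) -> None:
        self._adj = adjacency

    def get_node(self, node_id: str) -> dict:
        return {"id": node_id, "degree": len(self._adj.get(node_id, []))}

    def get_neighbors(self, node_id: str) -> list[str]:
        return list(self._adj.get(node_id, []))


def bfs(db: MiniGraphInterface, start: str, max_hops: int) -> list[str]:
    """Breadth-first traversal written only against the interface."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in db.get_neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append((nxt, depth + 1))
    return order
```

The traversal never imports or inspects DictBackend. A backend for a different store would replace only that class, which is the property that lets outside contributors add stores without touching the core.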

Credibility with technical audiences. The primary users of the library are developers and researchers who will inspect the source code before deploying it. Open source means they can. A black-box library that asks them to trust its behavior without inspection is a harder sell to a technical audience that has other options.

The kgraph referral path. The companion volume (Knowledge Graphs from Unstructured Text) is itself directed at technical practitioners who are building knowledge graphs. Its readers are exactly the users who will want to serve those graphs through BFS-QL. An open-source library that readers can install and run immediately, following the book's instructions, is a better companion to the book than a hosted service that requires account creation before the reader can try the first example.