Chapter 2: A Brief History of Knowledge Representation
The Idea That Wouldn't Die
There is a fantasy at the heart of computing that is almost as old as computers themselves: the machine that doesn't just store and retrieve facts but understands them. Not a filing cabinet you query with the right syntax. Not a search engine that hands you links and wishes you luck. A machine that knows things the way a person knows things -- that can draw on what it understands about a subject and tell you something true and useful in return.
This fantasy has motivated some of the most ambitious projects in the history of computer science, and it has stalled, repeatedly, in the same place. Not at the reasoning end -- researchers got surprisingly far at encoding the logic of a domain. The wall was always at the other end: getting knowledge in. Turning the vast, ambiguous record of what humans know -- written in papers and case notes and specifications, in the imprecise medium of natural language -- into something a machine could actually reason from.
Meaning Is Relational
The intellectual lineage of the knowledge graph runs through two mid-twentieth-century ideas that turned out to be more right than their authors could fully demonstrate at the time.
Marvin Minsky's 1974 paper "A Framework for Representing Knowledge" [@minsky1974framework] argued that knowledge isn't a list of facts -- it's a web of structured relationships. When you walk into a restaurant you don't reason from first principles; you retrieve a pre-existing frame with slots for host, menu, food, check, tip, and fill in the details from observation. The relationships are the knowledge. A node in isolation is just a label; a node embedded in a typed graph of relationships to other nodes is a concept, with context, with implications, with a place in a web of meaning.
Douglas Hofstadter's argument in Gödel, Escher, Bach [@hofstadter1979geb] sharpened this: meaning isn't a property of individual symbols but of symbol systems -- of the relationships and transformations between symbols. "BRCA1" as a string of characters means nothing. It means something because of its typed relationships to other nodes: it encodes a protein, it increases risk of breast cancer, it interacts with other genes. The meaning is in the web, not in the label. A sufficiently rich relational representation doesn't just store knowledge -- it participates in reasoning over it.
The knowledge graph as built today is the realization of what both were pointing at: a rigorous, computable, queryable structure where entities have typed relationships and the graph itself carries meaning. The difference is that Minsky and Hofstadter were working at the level of cognitive theory. We are building infrastructure.
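To make "typed relationships" concrete, here is a minimal sketch of the BRCA1 example as a set of subject-predicate-object triples. It is only an illustration, not a production design: the predicate names (`encodes`, `increases_risk_of`, `interacts_with`), the helper functions, and the choice of BARD1 as the interacting gene are assumptions made for this sketch.

```python
from collections import defaultdict

# Illustrative triples based on the BRCA1 example in the text.
# Predicate names and the BARD1 partner are assumptions for this sketch.
TRIPLES = [
    ("BRCA1", "encodes", "BRCA1 protein"),
    ("BRCA1", "increases_risk_of", "breast cancer"),
    ("BRCA1", "interacts_with", "BARD1"),
]


def build_graph(triples):
    """Index triples by subject so each node carries its typed outgoing edges."""
    outgoing = defaultdict(list)
    for subject, predicate, obj in triples:
        outgoing[subject].append((predicate, obj))
    return outgoing


def neighbors(graph, node, predicate=None):
    """Return the objects linked from `node`, optionally filtered by edge type."""
    return [
        obj
        for pred, obj in graph.get(node, [])
        if predicate is None or pred == predicate
    ]


graph = build_graph(TRIPLES)
print(neighbors(graph, "BRCA1"))                       # ['BRCA1 protein', 'breast cancer', 'BARD1']
print(neighbors(graph, "BRCA1", "increases_risk_of"))  # ['breast cancer']
print(neighbors(graph, "BARD1"))                       # [] (a label with no edges yet)
```

The last call shows the point in miniature: until "BARD1" has edges of its own, the graph can say nothing about it beyond its name.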
The Bottleneck Was Always Extraction
The decades that followed produced ambitious attempts to build on these foundations. Expert systems in the 1970s and 1980s encoded domain knowledge as explicit rules -- MYCIN could outperform medical residents on bacterial infection diagnosis; XCON saved DEC tens of millions of dollars a year configuring computer systems. Cyc attempted to hand-encode common sense at scale, accumulating millions of assertions over decades. The Semantic Web envisioned machine-readable linked data published across the entire web. Google's Knowledge Graph demonstrated the value of structured entity knowledge at production scale, built from curated databases and encyclopedias.
Every approach hit the same wall. Expert systems couldn't capture the tacit knowledge experts exercise without noticing. Cyc's hand-encoding phase -- the phase that was supposed to precede the self-learning phase -- never ended. The Semantic Web couldn't solve the adoption problem: structuring content for others' benefit costs more than it returns to the publisher. And Google, with essentially unlimited engineering resources, found it easier to rely on human-curated structured sources than to extract reliably from unstructured text. The bottleneck was never the reasoning. It was always getting knowledge in from natural language prose.
LLMs Change the Economics
Large language models don't solve every problem in this space, but they dissolve the specific bottleneck that stopped everything else. The marginal cost of a new extraction task dropped from months of domain adaptation and annotation work to a prompt that describes what you're looking for. A cardiologist can review an extraction prompt, understand what it's asking for, and suggest improvements -- without understanding machine learning. Schema changes require editing the prompt, not retraining a model. The cycle from "I want to extract this relationship" to "I have a working extractor" is measured in hours.
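A rough sketch of what "editing the prompt, not retraining a model" can look like: the schema is held as plain data, and the extraction prompt is generated from it. Everything here is hypothetical and chosen for illustration, including `RelationSpec`, `build_extraction_prompt`, the relation names, and the requested output format. The sketch deliberately stops before any model call, since that depends on the provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationSpec:
    """One relationship type the extractor should look for (hypothetical schema)."""
    name: str
    subject_type: str
    object_type: str
    description: str


def build_extraction_prompt(relations, passage):
    """Render the schema and a passage into a plain-language extraction prompt."""
    lines = [
        "Extract every relationship of the types listed below from the passage.",
        "Return one JSON object per line with keys: subject, predicate, object.",
        "",
        "Relationship types:",
    ]
    for rel in relations:
        lines.append(
            f"- {rel.name} ({rel.subject_type} -> {rel.object_type}): {rel.description}"
        )
    lines += ["", "Passage:", passage]
    return "\n".join(lines)


# Illustrative schema; a domain expert can read and correct these descriptions.
schema = [
    RelationSpec("treats", "Drug", "Condition",
                 "the drug is used as a therapy for the condition"),
]

# A schema change is a data edit: add one entry and regenerate the prompt.
extended_schema = schema + [
    RelationSpec("contraindicated_with", "Drug", "Condition",
                 "the drug should not be given to patients with the condition"),
]

passage = "Beta blockers are commonly prescribed for hypertension."
print(build_extraction_prompt(extended_schema, passage))
```

Because the prompt is ordinary text, the reviewer mentioned above needs no machine learning background to check that each description says what they mean.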
That is the only thing that changed. It changes everything. The rest of this book is about how to build what it makes possible.