Field notes

The day we stopped saying "AI" and started saying "retrieval"

A short note on a vocabulary switch we made internally — and the reason a one-word change settled three months of recurring product debates.

Bo Bergstrom, 3 min read

Sometime around the end of last year, we stopped using the word “AI” in internal product conversations and started saying “retrieval” instead. It sounds like a stylistic preference. It wasn’t. It changed how we made decisions.

Here is what was happening before. A product debate would open with “should the AI also do X” — where X might be summarizing a long RFP, drafting a section, scoring a bid, suggesting win themes. The conversation that followed was always abstract. AI is a capability with no defined boundary. You can ask whether a capability with no boundary should do anything, and the answer is “maybe, depending on cost and quality and whether it’s a good idea.” Three engineers, four opinions. Repeat in two weeks.

The same debate, replayed with the word “retrieval,” collapses to a different shape. “Should the retrieval pipeline also do X” forces an immediate concrete question: what’s the corpus, what’s the query, what’s the relevance signal, what’s the failure mode, what does refusing look like. “AI” lets you skip those questions. “Retrieval” makes them the first questions.
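Those questions can be made literal in a function signature. A minimal sketch (the names, the dictionary-shaped corpus, and the term-overlap scoring heuristic are all hypothetical stand-ins, not our actual pipeline):

```python
from dataclasses import dataclass

@dataclass
class RetrievalResult:
    passages: list          # ranked passages, best first
    scores: list            # the relevance signal, one score per passage
    refused: bool = False   # the failure mode is explicit, not implied
    reason: str = ""

def retrieve(query: str, corpus: dict, min_score: float = 0.2) -> RetrievalResult:
    """Toy relevance signal: fraction of query terms present in a passage."""
    terms = set(query.lower().split())
    scored = []
    for passage in corpus.values():
        words = set(passage.lower().split())
        score = len(terms & words) / max(len(terms), 1)
        scored.append((score, passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    kept = [(s, p) for s, p in scored if s >= min_score]
    if not kept:
        # refusing looks like this: an empty result with a stated reason
        return RetrievalResult([], [], refused=True, reason="no passage above floor")
    return RetrievalResult([p for _, p in kept], [s for s, _ in kept])
```

The point of the sketch is that corpus, query, relevance signal, and refusal all have to appear somewhere in the code; "AI" never forces any of them to.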

We didn’t do this for branding. We did it because internal arguments were stuck.

The word “AI” carries two failure modes in product talk. The first is that it makes the system sound more capable than it is — the listener pictures a thoughtful agent, not a probabilistic generator over a context window. The second is that it makes the system sound more autonomous than it is — the listener pictures a thing that decides, not a function call you control. Both fictions hurt design. You build worse interfaces when you imagine your software as a colleague instead of as a pipeline you can wire up wrong.

Switching to “retrieval” demoted the system in our heads. It is a function. It takes a query and a corpus and returns ranked passages. A model on top of those passages is also a function. Both are bounded, instrumentable, and can be told to refuse. Once we talked about them that way, we started instrumenting them that way. The retrieval floor that gates drafting in our grounded-AI pledge enforcement is a direct consequence of that vocabulary switch — it would not have existed if we were still arguing about whether “the AI” should “be more careful.”
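The floor itself is a few lines once you talk about the system as a gated function. A hedged sketch, assuming a scalar relevance score per retrieved passage (the threshold value and names are illustrative, not our production configuration):

```python
RETRIEVAL_FLOOR = 0.4  # hypothetical threshold, tuned per corpus in practice

def gate_draft(scores: list[float], floor: float = RETRIEVAL_FLOOR):
    """Decide whether drafting may run. Refusing is a first-class outcome."""
    top = max(scores, default=0.0)  # empty retrieval counts as zero relevance
    if top < floor:
        return False, f"refusing to draft: top relevance {top:.2f} below floor {floor}"
    return True, "ok"
```

Because the gate is a function, it can be instrumented: every refusal carries a reason string you can log and count, which is what "be more careful" never gave us.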

The marketing site mostly still says “grounded AI.” That’s a deliberate concession to the search terms our customers actually use. “Grounded retrieval-augmented generation” is the technically accurate phrase, and approximately nobody types it into a search bar. So externally we keep “AI” as the entry word and earn the right to clarify in the post.

Internally, the two of us who write code together still trade the word. When somebody says “the AI” in a planning conversation, the other one will say, “do you mean retrieval, draft, or verify?” — because those three are the three actual systems, and an answer that doesn’t pick one is an answer that hasn’t thought about it.
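The discipline of picking one of the three can be sketched as three separate calls (everything here is a toy stand-in, not our systems): a feature request has to name which function it changes.

```python
def retrieve(query):
    """Query + corpus -> ranked passages (stub corpus for illustration)."""
    return ["passage about " + query]

def draft(passages):
    """Passages -> candidate text grounded in them."""
    return "Draft grounded in: " + "; ".join(passages)

def verify(text, passages):
    """Text + passages -> did the draft actually stay grounded?"""
    return all(p in text for p in passages)

passages = retrieve("win themes")
text = draft(passages)
grounded = verify(text, passages)
```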

That three-way categorization is doing real work. The product category is full of tools whose marketing says "AI" and whose engineering does not distinguish the three calls. That is the failure pattern the Stanford hallucination paper documents from one angle and the practitioner reviews document from another.

Vocabulary precedes architecture. We started writing better software the moment we stopped letting ourselves get away with the abstract noun.