The RFP software category is broken in three specific ways
An opinionated walk through three concrete failure modes in the current RFP software category — generic AI, opaque pricing, and rotting libraries — with citations to the public reviews and research that back each one up.
I want to be precise about why I think the RFP software category needs a rebuild, because “the category is broken” is the kind of sentence a founder writes when they want to sell you something. So let me name three specific things that are wrong, with citations.
Each one is fixable. That is the only reason I am writing this post instead of going to do something easier with my life.
One — the AI drafts from training data, not from your knowledge base
Open any current G2 or Capterra page for the two largest vendors in this category. Read the AI-feature reviews. The pattern is consistent enough to be its own meme. From a Loopio review on Capterra: “Magic doesn’t work well. The answers are usually wrong.” The AI feature works on basic, frequently-asked questions and falls apart on anything nuanced — which is most of what an RFP is.
The reason it falls apart is structural, not a tuning problem. A drafting feature that pulls from a generic model’s training data will write fluent answers about your category, not about your company. It cannot cite the paragraph in your own past response that contained the precise phrasing your customer’s procurement team approved last quarter, because it never read your past response. Even when retrieval is layered on top, citations don’t guarantee the claim is supported — Stanford HAI’s study of commercial legal RAG found 17 to 33% hallucination rates in tools that did retrieve. The drafted sentence cited a passage. The passage didn’t actually back up the sentence.
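To make that concrete, here is a toy sketch of the pattern the Stanford study measured. Everything in it is hypothetical: `retrieve` and `generate` are names I made up, and the word-overlap retriever stands in for a real embedding search. The point is structural. The citation is attached after retrieval, and nothing ever checks whether the passage actually supports the sentence.

```python
def retrieve(question: str, kb: list[str]) -> str:
    """Toy lexical retriever: return the KB passage sharing the most
    words with the question. Real systems use embeddings; the failure
    mode downstream is the same."""
    overlap = lambda p: len(set(question.lower().split()) & set(p.lower().split()))
    return max(kb, key=overlap)

def draft_with_citation(question: str, kb: list[str], generate) -> tuple[str, str]:
    passage = retrieve(question, kb)
    sentence = generate(question, passage)  # fluent, and free to drift past the passage
    return sentence, passage                # citation attached; support never verified

kb = ["Our platform encrypts customer data at rest with AES-256."]
# Stand-in for an LLM: it answers from training data, merely *near* the passage.
generate = lambda q, p: "We encrypt customer data at rest and in transit with AES-256."
print(draft_with_citation("How is customer data encrypted?", kb, generate))
```

The printed answer carries a citation and still claims something the source never said (“in transit”). That is the gap the 17 to 33% is measuring.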
Practitioners feel this on the ground before they read the academic paper. An autorfp.ai review summary of Loopio reports the same thing in plainer language: response quality degrades sharply when the content library isn’t actively maintained, and the AI surfaces outdated suggestions with the same confidence as fresh ones.
This is fixable. The fix is to design the drafting engine so that it cannot produce a sentence unless a span in your KB entails it — a refusal at draft time, not a check after. We wrote up how we enforce that earlier this week.
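Here is a minimal sketch of what that refusal could look like, with loud caveats: the KB is assumed to be pre-chunked into spans, `entails()` is a crude token-overlap stand-in for a real NLI model, and none of these names come from our codebase. The shape is what matters. The gate runs before a sentence is accepted, and a sentence with no supporting span is dropped, not flagged for later.

```python
from dataclasses import dataclass

@dataclass
class Span:
    doc_id: str  # which past response or policy doc the span came from
    text: str

def entails(premise: str, hypothesis: str) -> bool:
    """Stand-in for a real entailment model: accept only if nearly all
    of the sentence's words already appear in the KB span."""
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    return len(h & p) / max(len(h), 1) >= 0.9

def supporting_span(sentence: str, kb: list[Span]) -> Span | None:
    return next((s for s in kb if entails(s.text, sentence)), None)

def draft(candidates: list[str], kb: list[Span]) -> list[tuple[str, Span]]:
    """Keep only candidate sentences that some KB span entails. The
    refusal happens here, at draft time, so an unsupported sentence
    never reaches the document with a citation stapled to it."""
    accepted = []
    for sentence in candidates:
        span = supporting_span(sentence, kb)
        if span is not None:
            accepted.append((sentence, span))  # every kept sentence carries its source
    return accepted
```

Contrast with the sketch above: the citation is now a precondition for the sentence existing, not an accessory attached afterward.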
Two — pricing is opaque, and the floor is high
The two largest vendors are quote-only. Every G2 and Capterra page for both products has the same gray box where pricing should be. You request a demo, an SDR qualifies your seat count and revenue band, and a number lands in your inbox a week later. The number is large.
I am not going to publish a specific competitor’s price floor in a blog post — those numbers move and I’d rather link than paraphrase. But the pattern is documented across review sites: enterprise-tier seat commitments measured in five figures per seat per year, with twelve-month minimums, are normal in this category. Smaller proposal shops — the ones that are losing 30 hours a week to DDQs and security questionnaires — read those quotes and bounce.
This is fixable. The fix is to publish a price next to the feature list and let the reader self-qualify. We do that. So do a handful of newer entrants. Quote-only pricing is a posture, not a fact about the work, and the posture is starting to look dated.
Three — knowledge bases rot, and the AI on top of them rots with them
Every proposal vendor sells a knowledge base. Every knowledge base goes stale within months unless someone owns its hygiene. The freshness problem isn’t a marketing concern; it’s the actual product.
The autorfp.ai review summary describes this pattern explicitly: when a content library isn’t actively maintained, an expensive proposal tool turns into “an overpriced document repository.” The same theme appears in Responsive reviews on G2 — search results that surface the wrong block, the right block buried under twelve close-but-wrong matches, the suggested answer that contradicts the current product reality because it was written for a version two releases ago.
The category’s response to this has been to ship more dashboards: content health scores, last-reviewed-on badges, owner assignments. None of them solve the underlying problem, which is that keeping the library true is human work, and the tooling around it should be designed to make that work cheap rather than to put a metric on its absence.
This is fixable. The fix has three parts, and it’s the piece of our roadmap I think we have the strongest feel for. Versioned content blocks with explicit “valid as of” markers, so a reviewer can see at a glance when a claim was last verified. Inline drafting that surfaces the source block and its age, so a writer notices staleness in the flow of work. And a post-mortem step that writes back into the KB after every bid, because if the corpus only ever grows from documents but never from what the team learned, it’s a one-way pipe and it ages badly.
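For the first two parts, here is a sketch of the data shape. To be clear about what is and isn’t real: this is illustrative, not our schema, and the names (`ContentBlock`, `valid_as_of`, `write_back`) are invented for this post. The load-bearing ideas are that the verification date is distinct from the edit date, and that the post-mortem write-back supersedes a block rather than silently overwriting it.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class ContentBlock:
    block_id: str
    text: str
    valid_as_of: date  # when the claim was last *verified*, not last edited
    version: int = 1
    superseded: list[str] = field(default_factory=list)

    def is_stale(self, max_age: timedelta = timedelta(days=90)) -> bool:
        # What inline drafting would surface next to a suggestion, so the
        # writer sees the age of the source block in the flow of work.
        return date.today() - self.valid_as_of > max_age

    def write_back(self, new_text: str, verified_on: date) -> None:
        """Post-mortem step after a bid: keep the old text as history,
        bump the version, and reset the verification clock."""
        self.superseded.append(self.text)
        self.text = new_text
        self.version += 1
        self.valid_as_of = verified_on
```

The third part is just `write_back` called from the post-bid review: the moment wins and losses can revise blocks, the corpus stops being a one-way pipe.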
What I am not arguing
I am not arguing the incumbents are bad companies. They built a category that didn’t exist 15 years ago. They have engineers I’d hire tomorrow. The Loopio comparison page on our marketing site says, in our own copy, “Where Loopio wins: battle-tested extraction at enterprise scale” — that’s true and worth saying. See the comparison for what we think they do well and where the gap actually lives.
What I am arguing is that the three failures above are not minor product roughness. They are the consequence of architecture choices that made sense in 2015 and don’t anymore.
The thesis
A proposal tool is a knowledge product. The center of gravity is the corpus, not the workflow. If the corpus is fresh, structured, and citable end-to-end, the drafting becomes a side effect: pleasant, productive, mostly correct, easy to fix when it isn’t. If the corpus is stale, opaque, and uncited, no model upgrade fixes the underlying problem.
I think the next decade of this category belongs to whoever takes that seriously. We’re trying. So are some others. Either way, the three things above are the bar to clear.