Field notes

The win-loss database schema, explained

The five tables behind PursuitAgent's win-loss intelligence feature — proposal, theme, outcome, debrief, and theme cluster — and the joins that walk a claim back to the knowledge-base block that sourced it. With the SQL.

The PursuitAgent engineering team · 5 min read · Engineering

We shipped the win-loss dashboard this month. This post is the schema behind it — five tables, a handful of foreign keys, and the join that lets a debrief comment walk back to the exact answer block that failed.

The design goal: every finding from a debrief should be reducible to a change in the knowledge base. That constraint is what determined the schema.

The five tables

-- One row per submitted proposal.
create table proposal (
  id              uuid primary key,
  company_id      uuid not null references company(id),
  rfp_id          uuid references rfp(id),
  buyer_name      text not null,
  submitted_at    timestamptz not null,
  outcome         text check (outcome in ('pending','won','lost','no_decision','withdrawn')),
  outcome_known_at timestamptz,
  deal_value_usd  numeric(14,2),
  created_at      timestamptz not null default now()
);

-- One row per win theme asserted in the proposal.
create table proposal_theme (
  id              uuid primary key,
  proposal_id     uuid not null references proposal(id) on delete cascade,
  label           text not null,
  assertion       text not null,
  source_block_id uuid references kb_block(id),
  status          text check (status in ('asserted','supported','contradicted','retired')),
  created_at      timestamptz not null default now()
);

-- One row per structured outcome signal.
create table proposal_outcome (
  id              uuid primary key,
  proposal_id     uuid not null references proposal(id) on delete cascade,
  source          text check (source in ('buyer_debrief','buyer_email','internal','gao','public_record')),
  stated_reason   text,
  evidence_url    text,
  captured_at     timestamptz not null default now()
);

-- One row per debrief comment. The thing that drives KB edits.
create table debrief_note (
  id              uuid primary key,
  proposal_id     uuid not null references proposal(id) on delete cascade,
  author_user_id  uuid references app_user(id),
  note            text not null,
  severity        text check (severity in ('observation','concern','blocker')),
  target_block_id uuid references kb_block(id),
  suggested_edit  text,
  edit_status     text check (edit_status in ('proposed','accepted','declined','deferred')),
  created_at      timestamptz not null default now()
);

-- One row per observed "these themes ran together across bids" cluster.
create table theme_cluster (
  id              uuid primary key,
  company_id      uuid not null references company(id),
  centroid        vector(1536) not null,
  label           text,
  member_count    int not null,
  win_rate        numeric(5,2),
  updated_at      timestamptz not null default now()
);

Five tables. None of them are clever. The cleverness is in what they connect.

The central join

Every debrief_note has a target_block_id. That’s the join we kept fighting to keep. It’s how a reviewer’s comment — “this encryption answer is written against a previous posture, not the current one” — becomes a proposed edit to a specific KB block. Without it, a debrief is prose. With it, a debrief is a work queue.
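That work queue falls directly out of the schema. A minimal sketch, using only columns defined above — every proposed edit that nobody has accepted, declined, or deferred yet, oldest first:

```sql
-- The debrief-as-work-queue view: targeted findings whose
-- suggested edits are still awaiting a decision.
select d.id, d.proposal_id, d.target_block_id, d.severity, d.suggested_edit
from debrief_note d
where d.edit_status = 'proposed'
  and d.target_block_id is not null
order by d.created_at;
```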

The foreign key is nullable on purpose. Some debrief comments don’t point at a block — they point at a missing section, or at a process problem, or at a theme that was never in the KB to begin with. We didn’t want to force every note through a block reference because it would push writers to pick the closest-wrong answer. Nullable means “this is a finding without a target,” which is a real category.
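The untargeted notes are still queryable as their own population — a quick way to see how much of the debrief signal is pointing at gaps rather than at existing blocks:

```sql
-- Findings without a target: notes that flag a missing section
-- or a process problem rather than a specific KB block.
select severity, count(*) as findings
from debrief_note
where target_block_id is null
group by severity
order by findings desc;
```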

Themes are versioned against blocks

proposal_theme.source_block_id is the other important join. Every win theme we ship points at the KB block(s) that sourced it. When a theme is marked contradicted after a debrief, we can walk from the theme to the block, and from the block to every other proposal that cited it. That’s how one debrief finding propagates.
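The propagation walk is one self-join. A sketch, where :theme_id is a bind-parameter placeholder for the contradicted theme:

```sql
-- From one contradicted theme, find every other proposal that cited
-- the same source block: the blast radius of a single debrief finding.
select p.id as proposal_id, p.buyer_name, t.label, t.status
from proposal_theme contradicted
join proposal_theme t
  on t.source_block_id = contradicted.source_block_id
 and t.id <> contradicted.id
join proposal p on p.id = t.proposal_id
where contradicted.id = :theme_id
  and contradicted.source_block_id is not null;
```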

This matters because win themes in most shops live as slogans in a cover letter. Ours live as assertions with evidence, and the evidence is trackable. We wrote about the assertion-plus-evidence structure in the win-themes field guide a year ago.

Outcomes have sources

proposal_outcome.source distinguishes between “the buyer said this in a debrief” (buyer_debrief), “the buyer wrote this in an email” (buyer_email), “the internal team inferred this” (internal), “we read this in a GAO decision” (gao), and “we found it in a public record” (public_record).

The distinction is load-bearing. An internal inference is useful but soft. A buyer debrief quote is hard evidence. A GAO decision is public and citable. A product that treats all five as the same signal builds a dashboard that lies confidently.
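Because source is a structured column, the ranking can live in the query instead of in someone's head. The weights below are hypothetical — the point is only that the column makes weighting possible at all:

```sql
-- Rank outcome signals by evidentiary weight (illustrative weights).
select o.proposal_id, o.source, o.stated_reason,
       case o.source
         when 'buyer_debrief' then 3
         when 'gao'           then 3
         when 'buyer_email'   then 2
         when 'public_record' then 2
         when 'internal'      then 1
       end as evidence_weight
from proposal_outcome o
order by evidence_weight desc, o.captured_at desc;
```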

Clusters are vectors

theme_cluster stores a centroid embedding over the themes we’ve asserted across all bids. That’s how we surface “you’ve asserted this win theme 14 times; it won 6 of 14” without forcing humans to tag themes by hand. The clustering post lands tomorrow.

We store win_rate as a materialized column because the cluster membership is the thing that moves; the win rate of historical members is stable and worth caching.
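A sketch of that refresh, assuming a theme_cluster_member join table (not shown in the schema above) linking clusters to the themes assigned to them:

```sql
-- Recompute cached win_rate and member_count per cluster.
-- theme_cluster_member(cluster_id, theme_id) is a hypothetical
-- membership table implied by, but not part of, the DDL above.
update theme_cluster c
set member_count = sub.n,
    win_rate     = sub.rate,
    updated_at   = now()
from (
  select m.cluster_id,
         count(*) as n,
         round(100.0 * count(*) filter (where p.outcome = 'won')
               / nullif(count(*) filter (where p.outcome in ('won','lost')), 0),
               2) as rate
  from theme_cluster_member m
  join proposal_theme t on t.id = m.theme_id
  join proposal p on p.id = t.proposal_id
  group by m.cluster_id
) sub
where c.id = sub.cluster_id;
```

Pending and no-decision outcomes are excluded from the denominator, so the rate only reflects decided bids.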

Indexes that matter

create index on proposal (company_id, submitted_at desc);
create index on proposal (outcome, outcome_known_at) where outcome in ('won','lost');
create index on debrief_note (proposal_id);
create index on debrief_note (target_block_id) where target_block_id is not null;
create index on proposal_theme (source_block_id) where source_block_id is not null;
create index on theme_cluster using hnsw (centroid vector_cosine_ops);

The partial index on debrief_note (target_block_id) is what the dashboard query hits constantly — “show me all debrief findings that point at this block, across all proposals.”
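That query is one filter on the indexed column, with :block_id as a bind-parameter placeholder:

```sql
-- All debrief findings against one KB block, newest first.
-- Served by the partial index on debrief_note (target_block_id).
select d.proposal_id, d.severity, d.note, d.suggested_edit, d.edit_status
from debrief_note d
where d.target_block_id = :block_id
order by d.created_at desc;
```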

What we deliberately didn’t model

No lesson table. A “lesson” sounds right but it’s a narrative layer on top of the evidence, and every team we’ve watched builds it differently. We expose the evidence and let the UI render the narrative per-team.

No competitor table, yet. The first version of the dashboard asked “who did you lose to” and the data came back too noisy to be useful. We’ll add it when we can connect it to a signal that actually moves KB blocks. Today, it would just be a column nobody trusts.

No price column on proposal_outcome. Price claims come in through stated_reason and get parsed after the fact. The lose-on-price piece explains why we’re skeptical of price as a structured signal.

Where this goes next

Tomorrow Sarah writes about what to capture in this schema — the 18 fields per bid most teams skip 12 of. Thursday I’ll write about the clustering over proposal_theme that surfaces repeat themes. The week after, we cover the debrief-to-KB-edit linkage that debrief_note.target_block_id enables.

The schema is not the product. The product is the discipline of filling it in.

Sources

  1. Shipped — win-loss pair capture
  2. KB schema evolution, year one