Field notes

Shipped: auto-generated compliance matrix from ingested RFPs

Stage 4 of the RFP pipeline — the compliance matrix — is now one click away from intake. Drop a PDF in, get a matrix out. Here's what shipped and where it still needs a human.

PursuitAgent · Engineering · 3 min read

We shipped auto-generated compliance matrices yesterday. One click from intake. The relevant page is /platform/rfp-analysis; this post is the engineering note.

What landed

When an RFP is ingested, the analysis pipeline now extracts every requirement clause and emits a structured compliance matrix as part of the intake artifact. Previously, the compliance step was a separate run a user had to invoke after intake completed. Now it’s part of the pipeline.

The matrix has one row per extracted requirement, with columns for:

  • Requirement ID — generated, stable across re-runs.
  • Source location — page and paragraph in the original RFP.
  • Verb — shall, must, will, should, may, or the inferred mode.
  • Category — Technical, Management, Past Performance, Pricing, Compliance, Other.
  • Owner — populated by rules, editable by the user.
  • Response section — empty until a draft is wired in.
  • Status — unaddressed, addressed, acknowledged-no-response.
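The row shape above can be sketched as a small data model. Names are illustrative, and the hash-based ID scheme is an assumption about how "stable across re-runs" might be achieved, not the shipped implementation:

```python
from dataclasses import dataclass
import hashlib

@dataclass
class MatrixRow:
    requirement_id: str          # generated, stable across re-runs
    source_page: int             # location in the original RFP
    source_paragraph: int
    verb: str                    # shall / must / will / should / may / inferred
    category: str                # Technical, Management, Past Performance, ...
    owner: str = ""              # populated by rules, editable by the user
    response_section: str = ""   # empty until a draft is wired in
    status: str = "unaddressed"  # unaddressed | addressed | acknowledged-no-response

def stable_id(page: int, paragraph: int, clause: str) -> str:
    """One way to keep IDs stable across re-runs: hash the normalized
    clause text plus its source location (illustrative, not the shipped scheme)."""
    normalized = " ".join(clause.lower().split())
    digest = hashlib.sha256(f"{page}:{paragraph}:{normalized}".encode()).hexdigest()
    return f"REQ-{digest[:8]}"
```

Hashing normalized text plus location means a re-run over the same PDF reproduces the same IDs, so user-entered fields like Owner can be re-attached by ID.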

The matrix is rendered in the analysis UI and exportable as XLSX or CSV. It is also wired into the drafting tab — when a writer opens a section, the matrix rows that map to that section are visible in the side panel.

What this maps to

Stage 4 of the eight-stage RFP pipeline is Compliance. Sarah’s post on the pipeline names the failure mode this ship is meant to address: “Compliance matrices are built late. They’re built by copy-pasting the RFP text into Excel, which takes a day. By the time the matrix is ready, the first drafters are already three sections deep into a response.”

The intent of this ship is to compress that day to a few seconds and to make the matrix the first artifact a drafter sees, not the last.

How it works

Three components.

Requirement extraction. A pass over the RFP text identifies clauses with compliance verbs (shall, must, will provide, is required to, plus a hand-curated list of softer modalities). Each clause is normalized — split on conjunctions where the buyer wrote one sentence with two requirements, joined where they split one requirement across two sentences.
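A minimal sketch of that extraction pass, assuming a regex verb list. It handles only the conjunction-split case (the sentence-join step is omitted), and the patterns are an illustrative subset of the hand-curated list:

```python
import re

# Illustrative subset of the compliance-verb list; the shipped list is larger.
COMPLIANCE_VERBS = re.compile(
    r"\b(shall|must|will provide|is required to|should|may)\b", re.IGNORECASE
)

def extract_clauses(text: str) -> list[str]:
    """Return sentences that contain a compliance verb, splitting on
    conjunctions so one buyer sentence with two requirements yields two rows."""
    clauses = []
    for sentence in re.split(r"(?<=[.;])\s+", text):
        if not COMPLIANCE_VERBS.search(sentence):
            continue
        # Naive conjunction split: "... shall do X and shall do Y" -> two clauses.
        parts = re.split(r"\band\s+(?=shall\b|must\b)", sentence, flags=re.IGNORECASE)
        clauses.extend(p.strip() for p in parts if COMPLIANCE_VERBS.search(p))
    return clauses
```

The lookahead split keeps the second verb attached to its clause, so each part still matches the verb pattern on its own.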

Categorization. The extracted clauses are passed through a classifier that assigns each to a category (Technical, Management, etc.). The classifier reads the clause and a window of surrounding context and emits a single label. We tuned it against a held-out set of 8 RFPs (federal, state, and commercial) annotated by hand. Current accuracy on the held-out set is in the high 80s on category, mid 90s on verb identification.

Diff against addenda. When an addendum is uploaded against the same RFP record, the matrix re-runs and surfaces a diff — added requirements highlighted in green, modified in yellow, deleted in red. The diff is shown in the matrix tab and a notification fires for whoever owns that record.
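The diff itself reduces to comparing two matrices keyed by requirement ID. A sketch, assuming IDs stay stable across re-runs while clause text may change:

```python
def diff_matrices(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Diff two matrices keyed by requirement ID (value = clause text).
    Added maps to green, modified to yellow, deleted to red in the UI."""
    return {
        "added":    [rid for rid in new if rid not in old],
        "deleted":  [rid for rid in old if rid not in new],
        "modified": [rid for rid in new if rid in old and new[rid] != old[rid]],
    }
```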

Where it needs a human

Two known limits.

Implicit requirements. Some RFPs encode requirements in tables or attachments without compliance verbs, pointed to by a parent clause such as “The vendor must comply with the requirements in Appendix A.” The pipeline catches the parent clause but doesn’t always recurse into the appendix’s tabular requirements. We surface the parent as a row and ask the user to expand it manually until the recursion works on more attachment formats.

Conditional requirements. Clauses like “if the vendor is proposing a cloud solution, the vendor shall provide…” are extracted as a single requirement. The conditional logic isn’t represented. A user has to manually note the condition or split the row.

Where to find it

Open any ingested RFP in the analysis UI. The Compliance tab populates automatically. New RFPs get the matrix on intake; existing ingested RFPs need a one-click re-analyze to populate.

Documentation is at /platform/rfp-analysis. Issue tracker is open if the matrix misses a clause that you’d expect it to catch — share the page number and the clause text and we’ll add it to the eval set.

Sources

  1. PursuitAgent — RFP Analysis
  2. Sarah Smith — The 8-stage RFP response pipeline