Field notes

The citation density target per section

Why executive summaries get two citations per paragraph and technical sections get five per page. The rationale for citation density as a section-level target, and what happens to drafts that fall below it.

The PursuitAgent engineering team · 5 min read · Engineering

Citations are not a uniform good. A paragraph with one citation per sentence is hard to read; a paragraph with no citations is hard to trust. The right density depends on the section, the audience, and the type of claim being made. We set per-section targets in the drafter and refuse to ship sections that fall below them.

This post is the engineering note on how the targets are set, why they vary, and what the drafter does when a section can’t hit the floor.

The section-level targets

Five section types, five targets (a configuration sketch follows the list):

  • Executive summary — 2 citations per paragraph (floor 1).
  • Technical approach — 5 citations per page (floor 3).
  • Past performance — 4 citations per case (floor 2; the four typically cover project name, customer reference, contract value, and outcome).
  • Compliance and certifications — 1 citation per claim (floor 1; every certification claim has to cite the attestation).
  • Cost narrative — 2 citations per paragraph (floor 1).
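
For concreteness, here is a minimal sketch of those targets as configuration. The names (Unit, DensityTarget, DENSITY_TARGETS) are illustrative, not our actual schema.

```python
from dataclasses import dataclass
from enum import Enum

class Unit(Enum):
    PER_PARAGRAPH = "paragraph"
    PER_PAGE = "page"
    PER_CASE = "case"
    PER_CLAIM = "claim"

@dataclass(frozen=True)
class DensityTarget:
    target: float   # citations per unit the drafter aims for
    floor: float    # below this, the section does not ship
    unit: Unit

# The five targets from the list above; field names are illustrative.
DENSITY_TARGETS = {
    "executive_summary":  DensityTarget(target=2, floor=1, unit=Unit.PER_PARAGRAPH),
    "technical_approach": DensityTarget(target=5, floor=3, unit=Unit.PER_PAGE),
    "past_performance":   DensityTarget(target=4, floor=2, unit=Unit.PER_CASE),
    "compliance":         DensityTarget(target=1, floor=1, unit=Unit.PER_CLAIM),
    "cost_narrative":     DensityTarget(target=2, floor=1, unit=Unit.PER_PARAGRAPH),
}
```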

The targets are not arbitrary. They came out of an analysis of which claims in past responses got flagged at gold-team review and which didn’t. Sections where reviewers flagged unsourced claims had measurably lower density than sections where they didn’t. The targets are calibrated to the density level above which gold-team flags become rare.
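
The calibration can be pictured as a threshold search over historical review data: bucket past sections by observed density and take the lowest density at which the gold-team flag rate, in that bucket and every denser one, stays under a tolerance. This is an illustrative reconstruction, not our calibration code; the bucket width and tolerance are assumptions.

```python
from collections import defaultdict

def calibrate_target(sections, flag_tolerance=0.05, bucket_width=0.5):
    """Find the lowest citation density at which gold-team flags become rare.

    sections: iterable of (density, was_flagged) pairs from past reviews.
    """
    buckets = defaultdict(lambda: [0, 0])   # density bucket -> [flagged, total]
    for density, was_flagged in sections:
        b = round(density / bucket_width) * bucket_width
        buckets[b][0] += int(was_flagged)
        buckets[b][1] += 1

    # Walk buckets from densest down; the candidate keeps dropping as long as
    # the flag rate stays under tolerance, and stops at the first bucket that fails.
    candidate = None
    for b in sorted(buckets, reverse=True):
        flagged, total = buckets[b]
        if flagged / total <= flag_tolerance:
            candidate = b
        else:
            break
    return candidate
```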

Why exec summaries get two and technical sections get five

The exec summary is the most-read section in a proposal and the one with the lowest tolerance for visible citation noise. Two citations per paragraph hits the trust floor without making the document look like a footnoted legal brief. The reader knows the substantive claims are sourced; the prose still reads as prose.

Technical approach sections have a different audience and a different content profile. The technical evaluator is looking for specifics — protocols, certifications, architectural decisions, performance numbers. Each specific is a candidate for verification. Five citations per page is what it takes to source the specifics without leaving any of them dangling. The technical reader is not put off by visible citations; they are reassured by them.

What happens at the floor

The drafter checks density at section completion. A section that meets the target ships to the verify panel for human review. A section below the target but above the floor ships with a warning — the writer sees a marker indicating which paragraphs are under-cited and is asked to either add citations or rewrite the under-cited claims as non-substantive prose.

A section below the floor does not ship. The drafter refuses and routes the section to the reviewer with a list of the unsourced claims and the candidate KB blocks the engine considered for each. The reviewer either supplies a citation, edits the claim to remove the unsourced portion, or accepts the refusal and rewrites the section.
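
Put together, the gate at section completion is a three-way decision. The sketch below assumes a hypothetical section object exposing its citation count, unit count, and unsourced claims, plus the targets table from the config sketch above; none of these names are our production API.

```python
from dataclasses import dataclass, field

@dataclass
class GateResult:
    status: str                                     # "ship" | "ship_with_warning" | "refuse"
    under_cited: list = field(default_factory=list)
    unsourced: list = field(default_factory=list)

def density_gate(section, targets):
    """Three-way gate run at section completion (illustrative shape only)."""
    spec = targets[section.section_type]
    density = section.citation_count / max(section.unit_count(spec.unit), 1)

    if density >= spec.target:
        # Meets the target: ships straight to the verify panel for human review.
        return GateResult(status="ship")

    if density >= spec.floor:
        # Between floor and target: ships with a warning marking the under-cited
        # paragraphs for the writer to cite or rewrite as non-substantive prose.
        return GateResult(status="ship_with_warning",
                          under_cited=section.paragraphs_below(spec.target))

    # Below the floor: refuse, and route to the reviewer with the unsourced
    # claims and the candidate KB blocks the engine considered for each.
    return GateResult(status="refuse",
                      unsourced=[(c.text, c.candidate_kb_blocks)
                                 for c in section.unsourced_claims()])
```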

This connects to the broader refusal pattern in the Grounded-AI Pledge — the drafter will not ship content that fails verification, period. Citation density is one of three dimensions the verifier checks, alongside per-claim verification and numeric claim verification.

Why density matters more than absolute count

A 60-page proposal with 200 citations sounds well-sourced. If 180 of the citations are clustered in two appendices and the executive summary has zero, the proposal is structurally under-cited where it matters. Per-section density is the right unit; aggregate count is misleading.
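
A toy version of the same arithmetic, with made-up counts:

```python
# Made-up counts: 200 citations in total, but 180 of them sit in two appendices.
citations = {
    "executive_summary": 0,     # 6 paragraphs
    "technical_approach": 20,   # 18 pages
    "appendix_a": 95,
    "appendix_b": 85,
}

print(sum(citations.values()))             # 200 -- the aggregate looks healthy
print(citations["executive_summary"] / 6)  # 0.0 -- zero density where it matters most
```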

The Stanford HAI paper on commercial legal RAG tools made a related point: citation presence does not equal claim support. A heavily cited document where the citations don’t actually support the claims is worse than a sparsely cited document where they do. Density targets are necessary but not sufficient — the per-claim verifier is what closes the gap between “has a citation” and “the citation supports the claim.”
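
To make "necessary but not sufficient" concrete: the density check and the per-claim check compose, and a claim only counts as verified when both hold. The supports() callable below is a stand-in for whatever entailment check the per-claim verifier actually runs; the claim and KB shapes are assumptions for illustration.

```python
def claim_is_verified(claim, kb, supports):
    """A claim passes only if it has a citation AND the cited block supports it."""
    if not claim.citations:
        return False                         # this is what the density check catches
    return any(supports(kb[c], claim.text) for c in claim.citations)
```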

What we are still tuning

Two open questions.

The cost narrative target is conservative. Cost narratives typically draw from a small number of source documents (the cost worksheet, the price book, prior contract pricing) and don’t have the citation surface of a technical narrative. We may relax the target for cost sections and rely more heavily on numeric claim verification, which catches the dollar figures regardless of whether they are inline-cited.
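
Numeric claim verification, for dollar figures at least, can be sketched as pulling the figures out of the drafted prose and checking each against the values in the cost sources. The regex, scale handling, and tolerance below are assumptions, not the production verifier.

```python
import re

DOLLAR = re.compile(r"\$\s?([\d,]+(?:\.\d+)?)\s*(million|M|K|thousand)?", re.IGNORECASE)
_SCALE = {"million": 1e6, "m": 1e6, "k": 1e3, "thousand": 1e3, None: 1}

def extract_dollar_figures(text):
    """Pull dollar figures out of drafted prose, normalised to plain floats."""
    figures = []
    for amount, scale in DOLLAR.findall(text):
        value = float(amount.replace(",", "")) * _SCALE[scale.lower() if scale else None]
        figures.append(value)
    return figures

def unverified_figures(text, source_values, rel_tol=0.005):
    """Return drafted dollar figures that match nothing in the cost sources."""
    return [f for f in extract_dollar_figures(text)
            if not any(abs(f - s) <= rel_tol * max(s, 1) for s in source_values)]
```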

The exec summary floor is high. A floor of one citation per paragraph means the opening paragraph — which is by design about the buyer’s situation, not about our company — has to cite something. In practice this works because most exec-summary openings reference a specific buyer-side fact (“you flagged X as priority one in section 3.2 of the RFP”) that is itself a citable source. But the rule occasionally produces awkward prose. We are watching it.

Measurement

We track section-level density as a metric per response and per tenant. The dashboard shows, for each shipped response, the density per section and the percentage of sections that hit their target on the first draft. We do not aggregate across tenants — every tenant’s KB has its own shape and the metric is meaningful only against that KB’s own history.

The metric we watch most closely is “first-draft-passes-density.” A section that hits its target on the first draft is a section where the KB had the evidence the drafter needed, organized in a way the retriever could find. A section that fails first-draft density and requires a writer to add citations manually is a signal that either the KB block exists but isn’t tagged for retrieval, or the block doesn’t exist at all and a new one needs creating. Either case is a backlog item for the SME pipeline (see Sarah’s SME collaboration series for the ticketing pattern that handles those gaps).
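
The metric itself is simple to state. A minimal sketch, assuming each shipped response records, per section, its type and its density at first draft, with the targets table from the config sketch above:

```python
def first_draft_pass_rate(sections, targets):
    """Share of sections in a shipped response that met their density target
    on the first draft, before any manual citation work."""
    if not sections:
        return 0.0
    passed = sum(1 for s in sections
                 if s["first_draft_density"] >= targets[s["section_type"]].target)
    return passed / len(sections)
```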

For the broader picture of how the verify pipeline works, the canonical post is the inline verify button. For why we treat refusals as a product feature rather than a failure, see the refusal pattern in code.

Sources

  1. PursuitAgent Grounded-AI Pledge
  2. Stanford HAI — Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools