The DDQ evidence-provenance API

An external auditor asks: “this DDQ answer about your SOC 2 scope — where did the specific claim about exclusion of our XYZ subsidiary come from?” The right answer is a link. A link to the exact KB block, the version of that block that was live when the DDQ was answered, and the specific span that supported the claim.

We’ve had this in the UI for a while. What we shipped this week is a read-only HTTP API that lets an auditor follow the same trail without logging into our product. This post covers the endpoint set, the auth model, and what we tightened before we let it touch production.

Why an API and not just the UI

Two reasons. First, some customers’ auditors don’t want to learn a new product — they want the data in a format they can feed into whatever their audit workflow uses. Second, for long-tail evidence reviews, an API lets the auditor script a walkthrough of 40 questions at once rather than clicking through each.

We’d resisted shipping this for a year. The risk is obvious: a public-facing endpoint into evidence data needs hard auth and hard rate limiting, and getting either wrong exposes a specific customer’s KB. So we built it read-only, scoped to single bids, and behind a time-boxed token.

The three endpoints

All three live under /api/v1/ddq/:bidId/provenance. All three require a provenance token issued by the bid owner.

// GET /api/v1/ddq/:bidId/provenance/answers
// Returns the answer list for a DDQ, with per-answer provenance pointers.
type AnswerSummary = {
  answerId: string;
  question: string;
  answer: string;
  citedBlockIds: string[];
  generatedAt: string; // ISO 8601
};

// GET /api/v1/ddq/:bidId/provenance/answers/:answerId
// Full detail for one answer. Claim-by-claim breakdown.
type AnswerDetail = {
  answerId: string;
  question: string;
  answer: string;
  claims: Array<{
    claim: string;
    sourceBlockId: string;
    sourceBlockVersion: string;
    sourceSpanStart: number;
    sourceSpanEnd: number;
    entailmentVerifiedAt: string;
  }>;
};

// GET /api/v1/ddq/:bidId/provenance/blocks/:blockId
// The source block, at the version cited. Read-only snapshot.
type BlockSnapshot = {
  blockId: string;
  version: string;
  content: string;
  capturedAt: string;
  approvedBy: string | null;
};

No write endpoints. No discovery endpoints — you cannot list all bids, or all blocks, or all customers. Every query is scoped to a specific bid ID the token already knows about.
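To make the scripted walkthrough concrete, here is a minimal client sketch that follows the three endpoints for one bid and collects the evidence span behind each claim. Several details are assumptions, not documented behavior: the bearer-token header, the answers endpoint returning a bare JSON array, span offsets being character indices with an exclusive end, and the names getJson, walkProvenance, and EvidenceRow, which are hypothetical.

```typescript
// Response shapes, trimmed to the fields this sketch reads.
type AnswerSummary = { answerId: string; question: string; citedBlockIds: string[] };
type Claim = {
  claim: string;
  sourceBlockId: string;
  sourceBlockVersion: string;
  sourceSpanStart: number;
  sourceSpanEnd: number;
};
type AnswerDetail = { answerId: string; claims: Claim[] };
type BlockSnapshot = { blockId: string; version: string; content: string };

// Hypothetical transport: takes a URL and headers, resolves to parsed JSON.
type GetJson = (url: string, headers: Record<string, string>) => Promise<unknown>;

type EvidenceRow = {
  answerId: string;
  claim: string;
  blockId: string;
  version: string;
  evidence: string;
};

async function walkProvenance(
  getJson: GetJson,
  baseUrl: string,
  bidId: string,
  token: string,
): Promise<EvidenceRow[]> {
  const root = `${baseUrl}/api/v1/ddq/${encodeURIComponent(bidId)}/provenance`;
  // ASSUMPTION: the token travels as a bearer token; the post does not say.
  const headers = { Authorization: `Bearer ${token}` };

  // ASSUMPTION: the list endpoint returns a bare array of summaries.
  const answers = (await getJson(`${root}/answers`, headers)) as AnswerSummary[];
  const blockCache = new Map<string, BlockSnapshot>();
  const rows: EvidenceRow[] = [];

  for (const summary of answers) {
    const detailUrl = `${root}/answers/${encodeURIComponent(summary.answerId)}`;
    const detail = (await getJson(detailUrl, headers)) as AnswerDetail;

    for (const c of detail.claims) {
      // Fetch each cited block once per walkthrough.
      let block = blockCache.get(c.sourceBlockId);
      if (!block) {
        const blockUrl = `${root}/blocks/${encodeURIComponent(c.sourceBlockId)}`;
        block = (await getJson(blockUrl, headers)) as BlockSnapshot;
        blockCache.set(c.sourceBlockId, block);
      }
      rows.push({
        answerId: detail.answerId,
        claim: c.claim,
        blockId: c.sourceBlockId,
        version: c.sourceBlockVersion,
        // ASSUMPTION: offsets are character indices, end exclusive.
        evidence: block.content.slice(c.sourceSpanStart, c.sourceSpanEnd),
      });
    }
  }
  return rows;
}
```

In practice getJson would wrap whatever HTTP client the auditor already uses. Injecting it keeps the walkthrough logic testable without network access.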

The auth model

A provenance token is issued by the bid owner for a specific bid and a specific auditor email. The token has three scoped attributes:

Bid ID. The token works only against this bid’s provenance endpoints. A request for any other bid returns 404.
Auditor identity. The token carries the auditor’s email and organization. Every request logs the auditor’s identity server-side; the customer sees the access log in their admin panel.
Expiration. Tokens expire 30 days from issue. There is no renewal flow — a new audit requires a new token. This is intentional. Rotation is cheap and permanent access is a liability.

The token is a JWT signed with a per-customer key, so verifying the signature needs no server-side session state. The one stateful step is revocation: every request is checked against a deny-list, so a revoked token gets a 401 on its next call, and the customer sees the revocation in their admin panel.

Provenance tokens are signed with keys separate from the one behind the product’s main auth tokens. If we ever have to rotate a provenance signing key during an incident, the rotation doesn’t log out every user.
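The per-request checks described in this section can be sketched as a pure function that runs after the JWT signature has been verified. The claim names, the tokenId used as the deny-list key (in a real JWT this would plausibly be the jti claim), and the 401 for an expired token are assumptions. The post only specifies 401 for a revoked token and 404 for a bid outside the token's scope.

```typescript
// Hypothetical claim layout; the post says only that the token carries a bid ID,
// the auditor's email and organization, and an expiration.
type ProvenanceClaims = {
  tokenId: string; // assumed unique ID, used as the deny-list key
  bidId: string;
  auditorEmail: string;
  auditorOrg: string;
  expiresAtMs: number; // epoch milliseconds
};

type AuthResult =
  | { status: 200 }
  | { status: 401; reason: "revoked" | "expired" }
  | { status: 404 };

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// Tokens expire 30 days from issue; there is no renewal.
function expiryFor(issuedAtMs: number): number {
  return issuedAtMs + THIRTY_DAYS_MS;
}

// Assumes the signature was already verified with the per-customer key.
function authorize(
  claims: ProvenanceClaims,
  requestedBidId: string,
  denyList: ReadonlySet<string>,
  nowMs: number,
): AuthResult {
  // Revocation is checked on every request, so it takes effect on the next call.
  if (denyList.has(claims.tokenId)) return { status: 401, reason: "revoked" };
  // ASSUMPTION: an expired token is also a 401; the post does not say.
  if (nowMs >= claims.expiresAtMs) return { status: 401, reason: "expired" };
  // Any bid other than the one the token was issued for looks like it does not exist.
  if (claims.bidId !== requestedBidId) return { status: 404 };
  return { status: 200 };
}
```

Returning 404 rather than 403 for an out-of-scope bid fits the no-discovery rule: the response does not confirm whether the other bid exists.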

What we hardened before shipping

Three things the first internal prototype got wrong and the shipped version handles.

Rate limiting. The prototype had none. An auditor whose script accidentally loops has the same footprint as a malicious actor who scrapes. We added a per-token rate limit of 60 requests per minute, and a per-customer rate limit of 600 requests per minute across all of that customer’s tokens. Both are generous for an audit workflow and still cap how much a runaway or hostile script can pull.
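As an illustration of how the two limits compose, here is a small in-memory sketch using fixed one-minute windows. The post does not say which limiting algorithm is used, so the fixed-window counter is a stand-in, and the names FixedWindowCounter and makeProvenanceLimiter are hypothetical. The one point it does try to capture is that a request is counted only if it fits under both limits.

```typescript
type Bucket = { start: number; count: number };

// Counts events per key in fixed windows of windowMs milliseconds.
class FixedWindowCounter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly buckets = new Map<string, Bucket>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns the live bucket for a key, starting a new window if the old one has elapsed.
  private current(key: string, nowMs: number): Bucket {
    const existing = this.buckets.get(key);
    if (existing && nowMs - existing.start < this.windowMs) return existing;
    const fresh: Bucket = { start: nowMs, count: 0 };
    this.buckets.set(key, fresh);
    return fresh;
  }

  hasRoom(key: string, nowMs: number): boolean {
    return this.current(key, nowMs).count < this.limit;
  }

  record(key: string, nowMs: number): void {
    this.current(key, nowMs).count += 1;
  }
}

// Builds an allow/deny function with the limits from the post:
// 60 requests per minute per token, 600 per minute per customer.
function makeProvenanceLimiter(): (tokenId: string, customerId: string, nowMs: number) => boolean {
  const perToken = new FixedWindowCounter(60, 60_000);
  const perCustomer = new FixedWindowCounter(600, 60_000);
  return (tokenId, customerId, nowMs) => {
    // Check both limits before counting, so a rejected request consumes neither quota.
    if (!perToken.hasRoom(tokenId, nowMs) || !perCustomer.hasRoom(customerId, nowMs)) {
      return false;
    }
    perToken.record(tokenId, nowMs);
    perCustomer.record(customerId, nowMs);
    return true;
  };
}
```

A production limiter would need shared state across servers and probably a sliding window. This version only shows the two-tier check.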

Content redaction. Some KB blocks include content that should not leave the customer’s security perimeter. Personnel data, in particular, can appear in evidence blocks about team composition. The API honors a block-level redactInExternalAccess flag; redacted blocks return a 403 with a reason code rather than content. The customer sets this flag per block in their admin UI. Blocks are not redacted by default. We chose explicit opt-in over safe-by-default because a silently redacted block is a worse audit experience than a visible 403.
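A minimal sketch of the redaction check, assuming the flag is stored alongside the block. The StoredBlock shape, the serveBlock name, and the reason-code string are hypothetical; the post says only that a redacted block returns a 403 with a reason code instead of content.

```typescript
type BlockSnapshot = {
  blockId: string;
  version: string;
  content: string;
  capturedAt: string;
  approvedBy: string | null;
};

// Hypothetical storage shape: the snapshot plus the customer-controlled flag.
type StoredBlock = BlockSnapshot & { redactInExternalAccess: boolean };

type BlockResponse =
  | { status: 200; body: BlockSnapshot }
  | { status: 403; body: { blockId: string; reason: string } };

function serveBlock(block: StoredBlock): BlockResponse {
  if (block.redactInExternalAccess) {
    // The auditor learns the block exists and why it is withheld, but gets no content.
    // ASSUMPTION: the reason-code value is invented for this example.
    return { status: 403, body: { blockId: block.blockId, reason: "REDACTED_BY_CUSTOMER" } };
  }
  // Drop the internal flag before the snapshot leaves the API.
  const { redactInExternalAccess: _flag, ...snapshot } = block;
  return { status: 200, body: snapshot };
}
```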

Block version freezing. The API returns the block at the version cited when the DDQ was answered, not the current version. This is the feature that took the most engineering thought. If the block has been edited since the answer was generated, the auditor sees what the answer was based on at the time it was written, not what the KB says today. The versioning mechanics are in the KB block versioning deep dive.
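The core of version freezing is that the lookup is keyed on the version recorded with the answer, never on "latest". The sketch below illustrates that idea only, over an in-memory version history. The BlockVersion shape and the snapshotForAnswer name are hypothetical, and the real mechanics are the ones described in the versioning deep dive.

```typescript
type BlockVersion = {
  blockId: string;
  version: string;
  content: string;
  capturedAt: string; // ISO 8601
};

// Returns the snapshot an answer cited, or undefined if that version is not in the history.
// It deliberately does not fall back to the newest version.
function snapshotForAnswer(
  versions: readonly BlockVersion[],
  blockId: string,
  citedVersion: string,
): BlockVersion | undefined {
  return versions.find((v) => v.blockId === blockId && v.version === citedVersion);
}
```

The claim records returned by the answer-detail endpoint carry sourceBlockId and sourceBlockVersion, which are the two keys a lookup like this needs.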

What the API doesn’t do

No search. No bulk export. No write-back. No authentication via the customer’s SSO (tokens are signed by our infrastructure; SSO integration is a different surface area and not one we wanted to take on in v1).

Two of these are on the candidate list for v2. SSO is probably not — the scope boundary of “read-only tokens issued per bid” is easier to reason about than a full identity-federated surface, and for a quarterly audit workflow, token-per-audit is fine.

The rollout

We shipped to two design-partner customers in early February. Both run SOC 2 audits with external firms. Both reported that the API cut their evidence-review cycle from weeks to days. Both also asked for the bulk export we didn’t ship; we’re considering it for v2.

Docs for the three endpoints are in the product admin panel under API → Provenance. A public documentation page goes up next week.