Blog · Tag

latency.

4 posts in this archive.

Draft latency, a year on: 45s P95 to 28s

A year of draft-latency work. What moved P95 from 45 seconds to 28, which changes cost quality and which cost money, and the three tradeoffs we chose not to take.

The PursuitAgent engineering team · Apr 15, 2026

Engineering

The SLA on draft generation: 45 seconds, 95th percentile

The operational target we hold draft generation to, why it's 45 seconds and not 30 or 90, and the specific things we do to hold the number under peak federal-FY-Q2 load.

The PursuitAgent engineering team · Mar 16, 2026

Engineering

Draft autocomplete latency, end to end

Typing lag, inference queue, streaming output. The three budgets that add up to the 240ms P95 we hold ourselves to, and what happens when any one of them slips.

The PursuitAgent engineering team · Jan 26, 2026

Engineering

Our retrieval latency budget, explained

Where the milliseconds go in a single retrieval call: embedding lookup, vector search, reranker, hybrid merge, payload hydration. P50 120ms, P95 400ms, and what we cut to get there.

The PursuitAgent engineering team · Jun 16, 2025
