Blog · Tag
llm-infra.
2 posts in this archive.
Engineering
Draft autocomplete latency, end to end
Typing lag, inference queue, streaming output. The three budgets that add up to the 240ms P95 we hold ourselves to, and what happens when any one of them slips.
The PursuitAgent engineering team
Engineering
Caching the draft step
How we cache partial drafts across proposals without introducing stale-answer risk. The cache key design, invalidation rules, and the directional cost impact we measured internally.
The PursuitAgent engineering team