Engineering
Retrieval eval pipeline
The practical reading path for PursuitAgent's retrieval evaluation work: gold-set construction, retrieval metrics, and quarterly regression checks.
Retrieval eval pillar
The public explanation of gold sets, graded relevance, and retrieval quality gates.
Quarterly RAG eval report
A snapshot of how the evaluation metrics are read and where they can mislead.
Testing retrieval gold sets
How the benchmark set is curated, reviewed, and kept difficult enough to matter.