Engineering

Retrieval eval pipeline

The practical reading path for PursuitAgent's retrieval evaluation work: gold-set construction, retrieval metrics, and quarterly regression checks.

Retrieval eval pillar

The public explanation of gold sets, graded relevance, and retrieval quality gates.

A snapshot of how the evaluation metrics are read and where they can mislead.

How the benchmark set is curated, reviewed, and kept difficult enough to matter.