embeddings.

6 posts in this archive.

Engineering

Embedding evaluation, revisited

What we measure differently from 12 months ago. How the gold set grew, which metrics earned their spot in CI, and which ones we quietly retired.

The PursuitAgent engineering team
Engineering

Clustering win themes across 200 past bids

How we cluster win-theme assertions across a corpus of past proposals to surface repeat themes: where the signal is real, and where the clustering is just noise dressed as insight.

The PursuitAgent engineering team
Engineering

Migrating to Gemini Embedding v3, the safe way

A dual-index backfill and a staged cutover across two weeks. How we evaluated retrieval deltas before the switch, what we watched for during the cutover, and the one metric that gated the final flip.

The PursuitAgent engineering team
Engineering

Per-customer embedding tenancy, explained

How tenant isolation works at the vector level in PursuitAgent. Why we use Postgres row-level security on pgvector as the default, where shared embedding spaces would be cheaper, and the trade-offs we are not willing to take.

The PursuitAgent engineering team
Engineering

Semantic deduplication of KB blocks at ingest

How we merge near-duplicate KB blocks at ingest time using embedding similarity, the threshold we settled on after testing four values, and the trade-off we accept by tuning toward over-merging.

The PursuitAgent engineering team
Engineering

Embedding model selection: why Gemini Embedding 2 for proposals

A teardown of how we evaluated four embedding models — Gemini Embedding 2, OpenAI text-embedding-3-large, Cohere embed-v4, and Voyage — for a proposal corpus, and the methodology that drove the choice.

The PursuitAgent research team
