Engineering.
109 posts in this archive.
The system-health check at year one
Four dashboards we watch every morning. What each one caught this quarter — and the one that nearly missed a regression.
Shipped: auto-generated post-mortem themes
After 200 debriefs the themes cluster predictably. The clustering is now automatic — here's what ships, what it does, and what it refuses to do.
Block schema v3: merging KB blocks and evidence atoms
The schema change that let DDQ evidence live in the same store as proposal answers. What we split, what we merged, and the migration that took a week longer than we planned.
Draft latency, a year on: 45s P95 to 28s
A year of draft-latency work. What moved P95 from 45 seconds to 28, which changes cost quality and which cost money, and the three tradeoffs we chose not to take.
Query understanding, a year on: where the model won
A year of hand-written query-rewrite rules versus LLM-based query rewriting on RFP questions. Which side won, where the hand-written rules still beat the model, and what the hybrid looks like now.
Hallucination rate: a year-in measurement update
How we measure hallucination rate on grounded drafts, what the number looks like a year in, what moved it since the early baseline, and where the number lives in production for customers to see.
Shipped: grounded-summary export with inline sources
The export path customers have asked for since month one. Executive summary exports now carry the inline citations as hyperlinks in the DOCX and PDF outputs, with an appendix that lists every evidence source in order.
A year of ingest pipeline, condensed
Forty changes to the ingest pipeline across a year of shipping. The five that actually mattered, the ones that didn't, and what the pattern says about where to spend the next year's ingest budget.
Draft attribution in exports: PDF, DOCX, HTML
Inline citations have to survive the export. How the rendering preserves citation anchors across the three export formats, where each format makes it hard, and the specific decisions we made to keep the attribution auditable.
New models, quarterly eval: Sonnet 4.6, GPT-5.2, Gemini 3.1 Pro
An internal eval across three current-generation models for our specific workloads — drafting, claim verification, extraction. What moved, where we switched defaults, and why one workload still sits on a year-old model.
Compliance extraction, revisited
The grammar we moved to for requirements extraction, why we stopped treating 'shall' as a single class, and the evaluation showing a 38% drop in false-positive requirements.
In preview: per-answer quality score with breakdown
A four-dimensional quality score — clarity, grounding, compliance, brevity — is rolling out in preview on drafted answers. How the score is computed, where it lives in the UI, and what it changes about the review pass.
March reliability incidents, documented
Two incidents on the platform this month — one degradation, one full outage. What triggered each, how long they ran, what the user impact was, and the specific changes we made after.
The SLA on draft generation: 45 seconds, 95th percentile
The operational target we hold draft generation to, why it's 45 seconds and not 30 or 90, and the specific things we do to hold the number under peak federal-FY-Q2 load.
Our infra spend for proposal workloads, year one
Database, inference, storage, and ops. What we spent running proposal workloads over our first year, broken down by category. Where the cost curves matched our expectations, and where they surprised us.
RAG for past-performance reference selection
How the retriever picks the three best past-performance references out of 180 for a given scope. Not cosine similarity on a paragraph — structured retrieval over multiple facets with a scorer that knows what a good reference looks like.
Shipped: bulk RFP ingest with duplicate detection
A short changelog entry. Bulk ingest of 10 RFPs in a minute, with block-level duplicate detection so the same clauses across multiple RFPs don't double-count in your KB.
The claim-verification cost profile, stage by stage
Per-claim verification is the defense against citation hallucination. It also costs real money. A breakdown of token costs at each stage of the verification pipeline, with the numbers we actually see in production.
Reviewer feedback routing: comment to block to KB
How we close the loop from an inline comment on a draft paragraph to a versioned edit on the KB block that generated it. The routing is boring; the discipline it enforces is the whole game.
The DDQ evidence-provenance API
External auditors can now walk from a DDQ answer back to the source evidence without opening the KB. The endpoints, the auth model, and what we hardened before shipping.
The prompt test suite, an update
300 tests across our drafting and verification prompts. What they cover, what they miss, which ones still flake, and how we keep the flaky ones from becoming the reason we stop running CI.
One year of grounded retrieval: what changed, what didn't
The engineering companion to the founder retrospective. A year of build-log posts, condensed: what the retrieval stack looks like now, how verification evolved, what the gold set became, and what's still unsolved.
Embedding evaluation, revisited
What we measure differently from 12 months ago. How the gold set grew, which metrics earned their spot in CI, and which ones we quietly retired.
Grounded AI for win-theme discovery
How we surface candidate win themes from a corpus of 80 winning proposals without inventing them. The retrieval pattern, the entailment guard, and where the system refuses rather than guesses.
Linking debrief notes to the specific answer blocks that failed
How every debrief comment becomes a KB-block edit suggestion. The two-pass linker that walks from a free-text comment to the exact block that sourced the failed answer, with the SQL and the heuristics.
Shipped: the win-loss dashboard with debrief capture
We shipped the win-loss dashboard last week. It's the feature behind this month's series. Debrief capture, theme clustering, and KB write-back, all wired to the schema.
Clustering win themes across 200 past bids
How we cluster win-theme assertions across a corpus of past proposals to surface repeat themes, where the signal is real, and where the clustering is just noise dressed as insight.
The win-loss database schema, explained
The five tables behind PursuitAgent's win-loss intelligence feature: proposal, theme, outcome, debrief, and the join back to the RFP block that sourced the claim. With the SQL.
A pgvector migration postmortem
An index rebuild that cost us 90 minutes of degraded search across a handful of tenants. What we changed in the runbook, and the piece of the migration we wish we had rehearsed.
Draft autocomplete latency, end to end
Typing lag, inference queue, streaming output. The three budgets that add up to the 240ms P95 we hold ourselves to, and what happens when any one of them slips.
In preview: proposal templates v2 with custom sections
Template inheritance replaces copy. Sections defined once, customized per tenant, updated in one place. In preview behind the templates feature flag while the marketed Proposal Builder surface catches up.
Migrating to Gemini Embedding v3, the safe way
A dual-index backfill and a staged cutover across two weeks. How we evaluated retrieval deltas before the switch, what we watched for during the cutover, and the one metric that gated the final flip.
Caching the draft step
How we cache partial drafts across proposals without introducing stale-answer risk. The cache key design, invalidation rules, and the directional cost impact we measured internally.
Retrieval over Slack history: what works, what's too sharp
An experiment with RAG over customer-Slack channel history. Three useful retrieval patterns, two failure modes that led us to gate the feature behind explicit capture flags, and the operational guardrails.
What we learned analyzing 90 days of search logs
Three patterns in the KB-search query logs we did not expect, and one UX change we made because of the findings. Notes from a quarterly log review, written in the build-log spirit.
Shipped: keyboard-first KB search
A small thing that power users have been asking for since month three. Keyboard-first KB search with a reachable command palette, typed result filtering, and no mouse required to open, navigate, and act on a chunk.
When two citations disagree: how the draft resolves it
Two KB chunks say different things about the same claim. The conflict-resolution logic that decides which one the drafted answer cites — when to prefer newer, when to prefer higher-authority, and when to refuse.
Observability for drafting: traces, logs, and replays
How we debug a bad draft six weeks after the fact. The three-layer observability stack — request traces, retrieval logs, and deterministic replays — that makes post-hoc drafting issues tractable.
Grounded-AI regressions we caught in year one
Four regressions in our grounded-drafting pipeline this year. How we caught each one, how long it took to roll back, and the one we did not catch in time. Engineering notes, not a victory lap.
The prompt library behind grounded drafting
Seven named prompts, one kill-switch registry, a versioning scheme, and the governance pattern we use to keep prompt sprawl from becoming an outage. Engineering notes on how we actually run prompts in production.
Backup and restore for a KB that contains embeddings
Point-in-time restore, vector consistency, and why we run a full restore drill once a month. The engineering notes on backing up a knowledge base that is half relational and half vector.
Prompt versioning in production, the boring way
Git, tags, eval gates. How we roll a prompt change without breaking drafts in flight, and why the boring version is the one that actually works.
Block reuse tracking: the metric that matters
Which KB blocks got used, in what proposals, and what they correlated with winning. How we instrument reuse, what the numbers told us, and where the signal turns into noise.
Shipped: export to .docx with formatting fidelity
The boring feature that unblocks federal submissions. What formatting fidelity actually means, what we preserved, and what we couldn't.
Detecting ungrounded spans in drafts, line by line
A per-sentence classifier that flags which spans in a drafted RFP answer lack source coverage in the retrieved context. What it costs, what it catches, and what it still misses.
Retrieval eval snapshot, December 2025
Quarter four retrieval evaluation numbers against our held-out RFP and DDQ corpus. What moved since September, what's still stuck, and which regressions we're not yet fixing.
The async drafting worker pool, explained
How 40 concurrent draft sections get written without saturating the LLM budget or crashing the rate limiter. The worker pool, the budget enforcer, and the retry ladder.
In preview: the post-mortem pipeline at submit close
The moment a proposal marks won or lost, PursuitAgent opens a structured post-mortem against the captured pair. In preview while the post-mortem template, reflection, and write-back loop mature toward general availability.
Multi-tenant DDQ templates across customer accounts
How one SOC 2 answer shape generalizes across many customer tenants without leaking tenant-specific facts. The separation between template structure and tenant content, explained.
Tuning pgvector HNSW for proposal workloads
M, ef_construction, ef_search — the three knobs that decide retrieval latency and recall in a pgvector HNSW index. What we chose for PursuitAgent and why.
Confidence-threshold tuning for DDQ auto-answer
Where we set the confidence bar for auto-answering a DDQ question. The precision/recall trade-off, explained with our own data and the number we actually use for security questionnaires.
Turning a SOC 2 PDF into 140 KB blocks
The ingest, the extraction, the linking. A worked trace of how a SOC 2 Type II report becomes the set of KB blocks that DDQ answers cite — with the real pgvector row shape at the end.
In preview: auto-attachment of evidence on DDQ answers
Auto-attachment of evidence PDFs — SOC 2, pentest, policy documents — to DDQ answers that cite them. In preview for design-partner tenants while DDQ workflows mature toward general availability.
The evidence vault: where SOC 2 PDFs live and how they cite
How a DDQ answer citing 'SOC 2 report, section CC6.1' actually finds the right PDF, serves it to the right buyer, and keeps the audit trail. The storage, access, and audit layer underneath.
The DDQ evidence-attachment API
How buyer-side evidence-request fields get auto-populated from a KB evidence vault. The schema, the matching logic, and the human-in-the-loop step we will not remove.
Per-customer embedding tenancy, explained
How tenant isolation works at the vector level in PursuitAgent. Why we use Postgres row-level security on pgvector as the default, where shared embedding spaces would be cheaper, and the trade-offs we are not willing to take.
Shipped: bulk edit for answer blocks, with undo
A real-world request we dragged our feet on for nine months. Bulk edit is now in the product, with version-aware undo and a confirmation flow that prevents the silent overwrite that made us nervous in the first place.
Citation UI: three designs we tried, two we kept
How we render inline citations next to grounded-AI output. Three UX experiments — footnote chips, side-pane evidence cards, and inline hover popovers — and what we learned about which ones reviewers actually use.
KB block versioning: the five-year commit history
How a KB block evolves across 18 proposals, three approvals, and one rollback. The data model behind block versioning, why we keep every prior version, and the queries that make it useful.
The background job queue for proposal processing
How Hatchet orchestrates the ingest, classify, draft, and verify stages of a proposal response. The four stages, the retry policies, the dead-letter handling, and the one place we deliberately chose synchronous over async.
Hallucination monitoring in production
The metrics we watch weekly: per-claim refusal rate, citation-mismatch rate, and the human-graded sample. What we do when each one moves, and the threshold values that trigger an alert.
Semantic deduplication of KB blocks at ingest
How we merge near-duplicate KB blocks at ingest time using embedding similarity, the threshold we settled on after testing four values, and the trade-off we accept by tuning toward over-merging.
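As a rough sketch of the threshold-based merge the post above describes: a greedy pass keeps a block only if its embedding is dissimilar to everything kept so far. The 0.92 threshold here is an illustrative value, not the one the post settled on, and the plain-Python cosine stands in for whatever the real pipeline uses.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def dedupe_blocks(blocks, threshold=0.92):
    """Greedy near-duplicate merge: keep a block only if its embedding
    is below `threshold` similarity to every block kept so far.
    `blocks` is a list of (block_id, embedding) pairs."""
    kept = []
    for block_id, vec in blocks:
        if all(cosine(vec, kept_vec) < threshold for _, kept_vec in kept):
            kept.append((block_id, vec))
    return [block_id for block_id, _ in kept]
```

Note the ordering dependence: earlier blocks win ties, which is one way a pipeline "tunes toward over-merging" — later near-duplicates are silently absorbed by whichever block arrived first.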
In preview: the retrieval-eval dashboard, publicly visible
Our internal retrieval evaluation dashboard is going public in preview. Real gold-set numbers, real regressions, updated nightly. Here is what is on it and what we deliberately left out.
Our retrieval eval, quarterly report
A quarter of running our retrieval evaluation harness against a frozen gold set: the regressions we caught, the two changes that actually moved precision, and the metric we stopped reporting because it lied.
Security questionnaires: linking answers to evidence
How a SOC 2 attestation PDF becomes a citation source for DDQ answers. The ingest pipeline, the per-control extraction, and the per-claim linking that makes 'yes' answers verifiable instead of theatrical.
The citation density target per section
Why executive summaries get two citations per paragraph and technical sections get five. The rationale for citation density as a section-level target, and what happens to drafts that fall below it.
Numeric claim extraction and verification
How we parse numbers from drafts — percentages, dollar figures, head counts, dates — and check each one against a KB source before the sentence ships. The pipeline, the regex floor, the LLM ceiling, and what we still get wrong.
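A minimal sketch of the "regex floor" that post mentions: tagged patterns for the numeric claim types it lists. The patterns and category names here are assumptions for illustration; the production patterns will differ, and anything the regexes miss is what the LLM ceiling exists to catch.

```python
import re

# Illustrative patterns only; each tags the kind of numeric claim found
# so it can be checked against a KB source before the sentence ships.
NUMERIC_PATTERNS = [
    ("percent", re.compile(r"\b\d+(?:\.\d+)?\s?%")),
    ("dollar", re.compile(r"\$\s?\d[\d,]*(?:\.\d+)?(?:\s?(?:million|billion))?")),
    ("count", re.compile(r"\b\d[\d,]*\s+(?:employees|engineers|staff|FTEs)\b")),
    ("year", re.compile(r"\b(?:19|20)\d{2}\b")),
]

def extract_numeric_claims(sentence):
    """Return (kind, matched_text) pairs for every numeric claim found."""
    claims = []
    for kind, pattern in NUMERIC_PATTERNS:
        for match in pattern.finditer(sentence):
            claims.append((kind, match.group(0)))
    return claims
```

For example, "We retained 98% of customers and grew to 45 employees in 2023." yields a percent, a head count, and a year, each of which becomes a separate verification task.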
Cost control for RAG: daily budgets, fallback models, burn alerts
How we keep RAG spend predictable per tenant. Daily budgets, model-tier fallbacks, and burn-rate alerts before the bill spikes — with the dashboard and the rules.
Shipped: answer-block tagging for win/loss cross-reference
A small change with a downstream payoff. Every answer block now carries tags for buyer, sector, theme, and outcome — wiring the foundation for the win/loss cross-reference work next month.
How the draft packet is generated, line by line
The prompt, the retrieval context, and the output template that produce an SME draft packet. A worked example from a real-shaped RFP question to a ready-to-review answer.
The SME draft packet, generated automatically
What we ship to an SME alongside the question so they can answer in five minutes instead of fifty. The packet's components, the retrieval that builds it, and the design choices that keep the SME out of our tool.
In preview: SME-ask tickets with SLA timers
Every SME ask creates a ticket with a deadline derived from the bid date. Open tickets show on the proposal dashboard. Missed SLAs flag; they don't auto-chase. In preview alongside the Proposal Builder section-assignment surface.
The SME Slack bot: architecture and boundaries
How the PursuitAgent SME bot asks for input, what it does with the answer, and what it deliberately refuses to do. A short tour of the boundary between the bot and the human.
KB schema evolution, year one
Four migrations we ran on the knowledge-base blocks table, three we rolled back, and what the schema looks like now. A field report on schema discipline.
Retrieval evaluation, part 2: dealing with numeric claims
Why numeric facts break vanilla retrieval and the two tactics — hybrid search and numeric-claim isolation — that fix it. Continuation of the eval series.
Confidence scores for grounded drafts, explained
What '82% confident' means in our drafting engine, how it's computed from retrieval and entailment signals, and where it leads the reviewer.
Streaming drafts over SSE, with citations inline
How we stream draft output to the browser while keeping citation integrity intact. The architecture, the failure modes, and the part we got wrong twice.
In preview: question router v2 with confidence scores
DDQ questions now route with a confidence score in preview. High-confidence routes auto-draft from the KB; low-confidence routes to human review with a typed reason for the routing call.
How we curate the retrieval gold set
120 questions, three annotators, a disagreement-resolution protocol. The recipe behind the held-out set we evaluate every retrieval pipeline change against — and the parts we plan to open-source.
Retrieval over diagrams, not just text
How we index D2 code and diagram descriptions so an architecture question can ground to a specific figure. The pipeline, the failure modes, and the citation surface for a diagram source.
The answer provenance graph in the KB
Every block in the knowledge base tracks source, author, approver, and last-used-in. The provenance graph isn't bookkeeping — it's a product surface. Here's what it stores and what it powers.
Shipped: answer-block inheritance across projects
When you edit an approved KB block, in-flight proposals inherit the change without overwriting their local edits. Here's how the merge resolves and what we did about the conflict cases.
The cost per response, broken down to the penny
Embedding calls, retrieval compute, draft tokens, verifier tokens, storage. The unit cost structure of a single drafted RFP answer, with a worked example. We publish the unit economics, not customer costs.
Shipped: win/loss pair capture at submit time
When a proposal hits submit, PursuitAgent now captures the response, the disposition, and the structured post-mortem inputs into a single record. The first piece of the win/loss intelligence loop is in production.
Ingesting a 300-question security questionnaire
A 300-question security questionnaire is a throughput problem, not a writing problem. The ingest pipeline has five stages: extract, classify, dedupe against the last one, retrieve, assemble. Here is what each one does and where it costs.
Query rewriting for RFP questions with implicit context
Most RFP questions retrieve poorly because they assume context the corpus does not carry. Query rewriting turns 'describe your approach' into a retrieval string that hits. Examples, the rewrite chain, and the cost tradeoff.
The chunk size ablation: 256, 512, 1024 tokens on RFP text
We ran the same retrieval pipeline at three chunk sizes against our RFP-text gold set. Directional results, the tradeoffs that surfaced, and why we don't ship a single global chunk size.
Shipped: answer-block diff view for reviewers
When a drafted answer is regenerated against a different KB block, reviewers now see a side-by-side of the previous version and the current version. Shipped this week.
Our eval harness, on the command line
A walkthrough of the dev loop for retrieval changes — one command to baseline, one command to re-run, one to diff. The CLI ergonomics that keep us from tuning by feel.
Shipped: the inline verify button in drafts
Hover any drafted sentence in the proposal builder and a verify button surfaces the source block, the entailment trace, and the timestamp of the last KB update. Shipped this week.
How we evaluate retrieval quality on our own corpus
Our gold set, the metrics we track, the eval harness on a laptop, the regression-guard CI job, and the directional numbers we'll publicly stand behind. Long read.
Inside the ingest pipeline: parse, extract, index
How a PDF becomes searchable KB blocks. LlamaParse for parsing, structural-plus-semantic extraction, pgvector indexing with HNSW. Where each stage wins and where it falls over.
Shipped: multi-doc RFP ingest with attachment dependencies
RFPs ship as bundles. The scoring rubric, the technical appendix, the pricing workbook. The Analyzer now ingests all of them as one pursuit, with dependencies tracked between them.
Our retrieval latency budget, explained
Where the milliseconds go in a single retrieval call: embedding lookup, vector search, reranker, hybrid merge, payload hydration. P50 120ms, P95 400ms, and what we cut to get there.
Shipped: KB block freshness alerts in review
Stale KB content now flags itself before a reviewer touches it. The freshness signal sits inline on every drafted answer that cites a block past its refresh date.
Hybrid search: dense embeddings plus BM25 for proposals
Pure dense retrieval misses on numeric identifiers, product names, and SOC codes. Pure BM25 misses on paraphrase. The blend ratio we use, how we tune it, and the test set that catches regressions.
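A minimal sketch of one common way to blend the two score lists: min-max normalize each side, then take a weighted sum. The alpha of 0.6 is a placeholder, not the blend ratio the post describes, and real systems often use rank-based fusion instead of score normalization.

```python
def normalize(scores):
    """Min-max normalize a {doc_id: score} map into [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def hybrid_merge(dense, sparse, alpha=0.6):
    """Blend dense (embedding) and sparse (BM25) scores.
    alpha weights the dense side; docs missing from one retriever
    simply score 0 on that side."""
    d = normalize(dense)
    s = normalize(sparse)
    docs = set(d) | set(s)
    blended = {doc: alpha * d.get(doc, 0.0) + (1 - alpha) * s.get(doc, 0.0)
               for doc in docs}
    return sorted(docs, key=lambda doc: blended[doc], reverse=True)
```

The normalization step matters because raw BM25 scores and cosine similarities live on incompatible scales; blending them without it silently lets one retriever dominate.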
Shipped: auto-generated compliance matrix from ingested RFPs
Stage 4 of the RFP pipeline — the compliance matrix — is now one click away from intake. Drop a PDF in, get a matrix out. Here's what shipped and where it still needs a human.
Grounded Retrieval 101, Part 4: what we're still wrong about
The closing post of the Grounded Retrieval 101 series. Three failure modes we have not solved — numeric precision, compound claims, synonym drift — with the test cases that surface them and what we are doing about each.
How the citation rendering stack works
From a retrieval hit to a verify button next to a sentence, in four components. The plumbing behind every cited claim PursuitAgent ships, and why we render the source inline instead of in a footnote.
Shipped: content-block freshness scores
Every KB block now carries a freshness score that decays as the source ages, drifts from the company's current marketing language, or contradicts a more recent block. Stale citations get caught at draft time.
Testing retrieval: gold sets, precision@k, and why BLEU lies for proposals
Surface-form metrics like BLEU and ROUGE rate proposal text by token overlap. Token overlap is a poor proxy for whether the answer is actually right. Here's the eval stack we use instead.
Shipped: diagram-aware extraction via Gemini 2.5 Flash
System architecture diagrams are now first-class KB blocks. We extract them with Gemini 2.5 Flash, store the description as text and the structure as D2 code, and retrieve both.
Our chunking pipeline, end to end
Five stages between an uploaded PDF and a retrievable KB block: parse, structural split, semantic rechunk, overlap, and index. Where each one fails and why we kept the boundaries.
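One of those five stages, the overlap pass, can be sketched in a few lines: each chunk gets the tail of its predecessor prepended so claims that straddle a boundary stay retrievable. The whitespace tokenization and 32-token tail are illustrative simplifications; the real pipeline splits on structure first and counts model tokens.

```python
def overlap_chunks(chunks, overlap_tokens=32):
    """Prepend the tail of the previous chunk to each chunk so a claim
    split across a boundary is still retrievable from either side.
    A sketch of one pipeline stage, using whitespace tokens."""
    out = []
    prev_tail = []
    for chunk in chunks:
        tokens = chunk.split()
        out.append(" ".join(prev_tail + tokens))
        prev_tail = tokens[-overlap_tokens:]
    return out
```

The cost of the stage is duplication: every overlapped token is embedded and stored twice, which is part of why overlap size is a tuned knob rather than a free win.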
In preview: per-block source permissions
Per-block permissions in the KB — control which teams or roles can read, draft from, or edit each content block, with audit trails on every read. In preview while the marketed KB surface catches up.
Grounded Retrieval 101, Part 2: why citations don't guarantee groundedness
A citation tells you which passage was retrieved. It does not tell you whether the cited passage actually supports the generated claim. Part 2 of the Grounded Retrieval series — the entailment gap, and what closes it.
Embedding model selection: why Gemini Embedding 2 for proposals
A teardown of how we evaluated four embedding models — Gemini Embedding 2, OpenAI text-embedding-3-large, Cohere embed-v4, and Voyage — for a proposal corpus, and the methodology that drove the choice.
Shipped: content-block versioning in the KB
Every content block in your knowledge base now keeps an immutable version history. Drafts that cited a block stay tied to the exact text they cited, even after the block is edited.
Grounded Retrieval 101, Part 1: what RAG is and why it still hallucinates
RAG in three sentences, then the hard part: why retrieval-augmented generation still produces fabricated answers, and what the academic and practitioner literature says about it. Part 1 of a four-part series.
How we chunk proposals for retrieval
Fixed-window chunking loses at headers, table cells, and numeric clauses. This post walks through the structural-plus-semantic chunking strategy we run on past proposals and KB content blocks, with code.