Field notes

End-of-year product priorities, in public

What we're building in December and January, why those and not the alternatives. An honest account of the four bets we're making and the three we deliberately deferred.

Bo Bergstrom · 6 min read

We write our roadmap in public, mostly. Not every specific ship, but the shape of what we’re investing in and why. This is what December and January look like for us. Four bets we’re making, three we deliberately deferred, and the honest reasoning on each.

The reason to write this in public: we’ve learned more from customers pushing back on roadmap posts than we’ve learned from any other single channel. If you’re a customer or a prospect and something here is wrong, I want to know.

The four bets

Bet 1 — Reflection in the win/loss loop

Pair capture shipped in July. The post-mortem pipeline shipped this month. Reflection — the step that reads a completed post-mortem and produces candidate KB write-backs — is next. The plumbing is mostly there. The work remaining is evaluation: which proposed KB updates are good enough to surface to a human for review, which ones should be auto-discarded, where the precision/recall trade-off sits.
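To make the evaluation question concrete, here is a minimal sketch of the kind of triage gate described above. Everything in it is hypothetical — the `CandidateUpdate` shape, the `support` score, and the threshold values are illustrative assumptions, not our actual implementation; the point is that the precision/recall trade-off lives in two tunable thresholds:

```python
from dataclasses import dataclass

@dataclass
class CandidateUpdate:
    block_id: str        # KB block the reflection step proposes to update
    proposed_text: str   # candidate write-back
    support: float       # hypothetical model-estimated grounding in the post-mortem, 0..1

SURFACE_THRESHOLD = 0.7  # at or above: queue for human review
DISCARD_THRESHOLD = 0.3  # below: auto-discard

def triage(candidates):
    """Route reflection candidates: surface for review, hold, or auto-discard."""
    surface, hold, discard = [], [], []
    for c in candidates:
        if c.support >= SURFACE_THRESHOLD:
            surface.append(c)
        elif c.support < DISCARD_THRESHOLD:
            discard.append(c)
        else:
            hold.append(c)  # ambiguous zone: revisit as thresholds are tuned
    return surface, hold, discard
```

Raising `SURFACE_THRESHOLD` buys precision at the cost of recall; the evaluation work is deciding where that dial should sit.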

Why this bet is first: it’s the piece that closes the compounding loop. Every other thing we build is linear — retrieval quality, drafting speed, DDQ automation — and linear is fine. Reflection is quadratic. Every bid makes every future bid slightly better. Missing it means our product is a better version of what Loopio and Responsive already ship; having it means our product is doing something they aren’t.

Bet 2 — Evidence-grounded DDQ answers, not just cited ones

A DDQ answer today ships with citations to the KB blocks it drew from. That’s good; it’s table stakes for a grounded product. It’s not enough. A DDQ buyer wants to see the actual evidence — the SOC 2 page, the pen-test letter, the encryption policy extract — not a citation to a block that summarizes them.

We’re wiring evidence attachments into the answer path. When a DDQ answer is drafted, the system pulls the citable blocks and the source documents those blocks were derived from. The buyer gets an answer with inline citations and inline evidence. The work we shipped on evidence-vault-architecture and shipped-evidence-attachment-auto is the foundation; December is wiring it end-to-end through the DDQ drafting flow.
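The resolution step is simple in shape: walk the citations on an answer, follow each block back to its source document, and attach each document once. A minimal sketch, assuming a hypothetical `block_index` mapping blocks to their source documents and a `doc_store` of attachable evidence:

```python
def attach_evidence(answer_citations, block_index, doc_store):
    """Resolve cited KB blocks to the source documents they were derived from.

    Returns each source document once, in citation order, so a buyer sees
    the SOC 2 report itself, not just the block that summarizes it.
    """
    evidence, seen = [], set()
    for block_id in answer_citations:
        doc_id = block_index[block_id]["source_doc"]
        if doc_id not in seen:
            seen.add(doc_id)
            evidence.append(doc_store[doc_id])
    return evidence
```

The deduplication matters: several blocks often derive from the same source document, and the buyer should get one attachment, not five copies.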

Bet 3 — Freshness as a first-class product surface

A product that drafts from a KB has to own the KB’s condition. We ship freshness scores and alerts, but the integration points — where those scores show up during drafting, during reviews, during the post-mortem — are uneven. December’s work is making freshness a consistent surface: when a drafter is pulling a block, they see its freshness state; when a reviewer is reading a draft, they see which sections are grounded in stale blocks; when a post-mortem is being written, stale blocks that appeared in the response are automatically flagged for the rewrite queue.
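The freshness state a drafter or reviewer sees can be modeled as a simple age bucketing over each block's last-verified date. This is an illustrative sketch, not our scoring model — the bucket names and the 90/180-day cutoffs are assumptions for the example:

```python
from datetime import date

def freshness_state(last_verified: date, today: date,
                    fresh_days: int = 90, stale_days: int = 180) -> str:
    """Bucket a KB block by the age of its last verification.

    'fresh'  -> safe to draft from
    'aging'  -> draftable, but flagged to the reviewer
    'stale'  -> flagged for the rewrite queue
    """
    age = (today - last_verified).days
    if age <= fresh_days:
        return "fresh"
    if age <= stale_days:
        return "aging"
    return "stale"
```

The product work is less the function than the plumbing: the same state has to show up at draft time, review time, and post-mortem time without the three surfaces disagreeing.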

This one isn’t glamorous. It’s plumbing through the rest of the product. Skipping it is how KB rot becomes a product-wide problem; investing in it is how we keep the grounded-AI posture honest. Stanford HAI’s legal-RAG research — 17-33% hallucination rates even with retrieval — is partly about the gap between “has citations” and “has credible citations.” Freshness is a piece of the credibility story.

Bet 4 — The onboarding path for teams under 20 people

We’ve been too slow on this. Our self-serve signup flow works, but the gap between “signed up” and “actually ingested a KB and shipped a first draft” is too long. Teams under 20 people don’t have the staffing slack to absorb a multi-week onboarding, and our activation curve reflects it. We lose prospects in the first ten days who would otherwise have stuck around.

The January work: a guided first-bid flow that takes a new tenant from signup to drafting their first section in under a business day. No professional services required. Built on the ingestion and classification work we’ve shipped; the gap is UX and guardrails, not infrastructure.

The three deferrals

Deferred 1 — The marketplace of third-party KB connectors

We get asked for Notion, Confluence, SharePoint, and Google Drive connectors regularly. They would be useful. They would not be the highest-leverage use of a quarter. A connector is, in our experience, a six-week engagement per source (authentication, permission mapping, incremental sync, content normalization). Doing four of them means one quarter of engineering spent on something that isn’t differentiating — most competitors have these connectors already.

We will ship them. We will not ship them before reflection, evidence-grounded DDQs, freshness plumbing, and the onboarding fix. If you’re evaluating us and your binding constraint is that your KB lives in Notion and you want a direct sync, we may be the wrong product for you this quarter. That’s an honest answer rather than a roadmap promise I can’t keep.

Deferred 2 — Competitor-feature parity projects

Customers regularly ask us for two or three features from the incumbent products (Loopio, Responsive) that are essentially feature-parity asks. We’re not shipping any of them in December-January. Our differentiation is in the shape of the product, not in the checklist parity of the feature list. Shipping to a feature matrix is how we end up as a cheaper version of the incumbent; shipping to a thesis is how we end up as a different product. The thesis is: grounded drafting, on a KB the team trusts, with a compounding loop. Features that don’t serve that thesis don’t ship.

Deferred 3 — The mobile app

We get asked. We’re not building one. Proposal work is desk work. The 2% of the workflow that happens on a phone (approval pings, status checks) can be served by email and by a mobile-responsive web view, which we already have. A native mobile app is a full team-quarter. Not worth it against the four bets above.

What this list is missing

Things I think about and haven’t committed to:

  • Open-source pieces. We use a lot of open-source infrastructure and have considered open-sourcing parts of our extraction pipeline. It hasn’t reached the top of the list, and I don’t want to commit to it in public without the headroom to follow through.
  • Public benchmarks. We have internal retrieval evaluations; publishing them as a benchmark that the category could adopt is appealing. It’s also a months-long editorial effort that I haven’t budgeted for.
  • An integration with the ATS/CRM side. Some customers want their CRM to drive bid/no-bid automatically. I think it’s the right long-term direction. I don’t think we’re ready to build it in the next 60 days.

How to argue with this

If you’re a customer reading this and your binding constraint is on the deferred list, I want to hear from you. The deferral calculus is based on the aggregate of customer conversations over the last quarter; individual cases can shift the weights. If you’re a prospect and a bet above is specifically the shape you’re buying for, that’s also useful — it tells us we’re pointed at the right things. Either way, the feedback loop is where the roadmap gets better.

For the product thesis this roadmap is working inside, see the eight-stage pipeline pillar and the positioning on the platform page. For the engineering posture underneath, see the build-log posts from earlier in November — pgvector HNSW tuning, the async drafting worker pool, multi-tenant DDQ templates.

Sources

  1. PursuitAgent — platform page
  2. APMP Body of Knowledge — proposal technology
  3. Stanford HAI — Legal RAG hallucinations