Color-team review: the full playbook
The operational companion to our color-team essay. Step-by-step procedures for pink, red, gold, and white teams — when to run each, who attends, the rubrics they apply, and the templates that make the discipline portable.
This is the operational companion to Sarah’s color-team review essay from earlier this month and her broader argument that the color-team discipline is worth keeping but not worth importing wholesale. That essay made the case for the discipline. This post is the playbook — the step-by-step procedure for running each review on a real bid, with rubrics you can copy.
The Shipley tradition uses four named reviews: pink (structural, ~30% drafted), red (content, ~80% drafted), gold (final go/no-go, ~95% drafted), and white (post-submission retrospective). Each one has a job. Most teams that get this wrong are not failing on intent — they are running pink as red, red as gold, or gold as a copyedit. The point of the playbook is that the four reviews are not the same review repeated four times. Each one is a different question, asked at a different draft completeness, by a different group of people.
When to use this playbook
Use the full four-stage cadence when:
- The bid value is seven figures or above. The math on review hours pays out.
- The team is large enough that drafters and reviewers can be different humans. Three writers and three independent reviewers is the minimum at which review stays independent of drafting.
- The procurement uses compliance-weighted scoring (best-value tradeoff with explicit factor weights, or any LPTA where the technical threshold is non-trivial). The rubric structure of color reviews maps directly onto the buyer’s rubric.
For smaller bids — sub-seven-figure commercial work, simple LPTA commodity bids, RFIs that won’t lead to an immediate award — compress the cadence. The Bid Lab makes the case that the full Shipley ceremony “actually slows things down instead of helping” for 10-person shops, and we agree. The compression we recommend: combine pink and red into a single mid-draft structural-and-content review at ~60% drafted, and keep gold and white as full reviews. That gets you three review touchpoints instead of four, and it keeps the highest-leverage stages (gold and white) intact.
The rest of this post assumes you are running the full cadence. The compressed-cadence variant takes the pink rubric items, merges them into the red rubric, and runs the combined review at the 60% mark.
Pink team — structural
Timing. When the response is approximately 30% drafted. In practice: the executive summary is in first draft, the technical approach has section headings and bullet outlines under each heading, the management approach has a draft team org chart, and the cost volume has a structure even if numbers aren’t final. No section is finished. Many sections are stubs. That is the point.
Who attends. Two to three reviewers, none of whom are drafters on this bid. The proposal manager runs the review but does not score it. Useful reviewers: a proposal manager from another active bid, a capture lead with knowledge of this buyer, and one technical lead who is not writing the technical volume.
The “not drafters” rule matters. Drafters reviewing their own structure default to defending it. Reviewers who have never seen this draft can ask “why is the technical approach in section 4 instead of section 2?” without ego attached.
The rubric — 12 items. Score each as PASS / FAIL / NEEDS WORK. A FAIL on any item triggers a written remediation plan with a deadline before red team.
- Does the response section structure exactly match the compliance matrix order? (Drift here is the single most common pink-team finding.)
- Is every “shall,” “must,” “will provide,” and “describe” from the RFP mapped to a response section? (Use the matrix; spot-check ten random requirements.)
- Are page-limit allocations set per section, summing to the RFP page limit minus a 5% buffer?
- Does the executive summary lead with the buyer’s stated outcome — not the vendor’s qualifications?
- Are the three to five capture-plan win themes named in the executive summary draft?
- Does the technical approach section have a defensible win-theme thread visible in the section outline (not just in the exec summary)?
- Is past performance scoped to citations the team can actually retrieve, with named contracts and named buyer-side references?
- Is the cost volume structure aligned with the RFP’s required cost-narrative format?
- Are required attachments (certifications, forms, key personnel resumes) accounted for as separate work items with named owners?
- Is the submission portal mechanism documented with a named human responsible for the submit click?
- Is the gold-team date set, and is it at least 48 hours before submission?
- Does the section ownership map have one named drafter per section, with a named backup?
Output format. A single page. Each item gets one row. Each FAIL gets a one-sentence remediation, an owner, and a date. The pink-team output is filed as a comment in the proposal record, not delivered as a deck. Reviews that produce decks don’t get read.
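The gate above — any FAIL triggers a written remediation with an owner and a date before red team — can be sketched as a small data model. This is an illustrative sketch, not the schema of any particular tool; every name in it is made up:

```python
from dataclasses import dataclass

VERDICTS = {"PASS", "FAIL", "NEEDS WORK"}

@dataclass
class RubricItem:
    question: str
    verdict: str = "NEEDS WORK"
    remediation: str = ""   # one sentence, required on FAIL
    owner: str = ""
    due: str = ""           # e.g. "red-team minus 5 days"

def unresolved_fails(items):
    """FAILs missing a remediation, owner, or date block progression to red team."""
    return [i for i in items
            if i.verdict == "FAIL" and not (i.remediation and i.owner and i.due)]

items = [
    RubricItem("Structure matches compliance-matrix order?", "PASS"),
    RubricItem("Every shall/must mapped to a section?", "FAIL",
               remediation="Map 8 unmapped Attachment B shall-clauses",
               owner="PM", due="red-team minus 7 days"),
]
assert not unresolved_fails(items)  # safe to proceed only when this holds
```

One row per item, one remediation per FAIL — the model enforces exactly what the one-page output format asks for.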
Example findings. From recent reviews on bids in the PursuitAgent product:
- “Section 4.2 (technical approach) currently leads with the vendor’s company history. Move buyer-outcome lead from exec summary into 4.2 opening — owner: writer-A, due: red-team minus 5 days.”
- “Compliance matrix has 8 unmapped requirements (shall-clauses in Attachment B). Map by red team — owner: PM, due: red-team minus 7 days.”
- “Win theme #2 (‘lowest implementation risk’) has no evidence thread in the technical approach outline. Add a paragraph in 4.4 with the named past-contract reference — owner: capture lead, due: red-team minus 3 days.”
Typical duration. 90 minutes for the review meeting, plus 30 to 60 minutes per reviewer for pre-read. Total reviewer time for a three-person panel: roughly six to eight hours. The proposal manager spends another two hours scheduling, distributing materials, and writing up findings. On a four-week response, this is one half-day on the calendar.
The single biggest pink-team mistake is starting it too early. If the structure is empty (an outline of section names with no content underneath), the reviewers can only react to titles. They will produce a deck of style-guide nits and structural opinions that don’t survive contact with content. Hold the review until each section has at least bullets under each heading and a draft win-theme paragraph. Better to delay pink by 48 hours than to run it on an empty skeleton.
Red team — content
Timing. When the response is approximately 80% drafted. Every section has a complete first draft. The cost volume has draft numbers (final pricing may still be moving, but the structure and the per-line rationale exist). The executive summary has been through one editorial pass. Past performance citations are written with named references. Attachments are at least listed and some are drafted.
Who attends. Three to five reviewers. At least one SME per major section (if the bid has technical, management, and cost volumes, you want one technical SME, one operations SME with management-approach context, and one finance reviewer for cost). Plus a content lead (often the proposal manager from another bid or an outside reviewer) and an executive sponsor for high-value bids.
The SMEs are the ones who matter most. Red team is when the technical credibility of the response gets stress-tested by people who could actually deliver the work. A red-team SME who reads the technical approach and says “we cannot do this in the timeline we just promised” has saved the team a guaranteed loss or, worse, a guaranteed delivery failure. Lohfeld Consulting has argued that proposal managers spend more time chasing SMEs for input than building strategy — red team is the moment that investment pays back, because the SME’s sign-off becomes a delivery commitment, not a writing favor.
The rubric — 15 items. Score PASS / FAIL / NEEDS WORK. FAILs at red team are more serious than at pink — there are now fewer days to fix them.
- Is every win theme present in every major section, with evidence (not just assertion)?
- Is every factual claim about the vendor traceable to a source the reviewer can verify in one click?
- Are technical approach numbers (timelines, FTE counts, throughput) deliverable by the team that wrote them?
- Are past-performance citations relevant to this scope, with named buyer references reachable for verification?
- Does the executive summary’s second sentence carry weight? (First-sentence hooks land or don’t; second sentences are where most exec summaries leak credibility.)
- Is every “shall” in the RFP answered with language the evaluator can map back to the requirement?
- Are management-approach commitments — staffing, escalation, governance — consistent with the staffing plan and the cost volume?
- Does the cost narrative justify each line item against the technical approach, not against generic language?
- Are key personnel resumes specific to this scope, not generic CVs?
- Has every win theme been swap-tested? (If you can replace the vendor name with a competitor’s and the win theme still makes sense, the win theme is too generic. See the swap test post.)
- Are differentiators tied to discriminators? (If the response claims a capability, can the team prove a competitor cannot match it?)
- Are red-flag items from the procurement-side reading captured? (See the procurement-lead reading discipline.)
- Is the response readable at the procurement-lead’s pace? (Average evaluator gives a section 4 to 7 minutes. Sections that require longer to absorb lose score.)
- Are graphics, tables, and visuals load-bearing — would removing them weaken the response — or are they decorative?
- Are compliance-matrix attestations actually true? (Every “we comply with X” must be defensible by someone in the room.)
Output format. A line-item review document with three columns: finding, severity (S1 / S2 / S3), and owner. S1 findings block submission. S2 findings need a written response and a fix plan before gold team. S3 findings get logged for the white-team retrospective and may or may not be addressed for this bid.
The proposal manager consolidates findings within four hours of the review ending. Findings get assigned to drafters that day. The remediation window between red team and gold team is typically five to seven calendar days — every hour of that window is consumed.
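The severity rules above reduce to a mechanical gate: S1s block submission, S2s need a fix plan before gold, S3s get logged for white team. A minimal sketch of that triage, with hypothetical finding text — not the behavior of any particular tool:

```python
from collections import Counter

def triage(findings):
    """findings: list of (description, severity) pairs, severity in {"S1","S2","S3"}.
    Applies the red-team gate: S1 blocks, S2 needs a pre-gold fix plan, S3 is logged."""
    counts = Counter(sev for _, sev in findings)
    return {
        "blocks_submission": counts["S1"] > 0,
        "needs_fix_plan_before_gold": counts["S2"],
        "logged_for_white_team": counts["S3"],
    }

findings = [
    ("Timeline in 4.3 undeliverable at proposed FTE count", "S1"),
    ("Win theme 2 asserted without evidence in section 5", "S2"),
    ("Resume formatting inconsistent across key personnel", "S3"),
]
# triage(findings)["blocks_submission"] → True: one open S1 stops the bid
```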
Common failure modes.
- Red team that doesn’t surface S1s. A red team that produces only S2 and S3 findings either had the wrong reviewers or the wrong rubric. The most common cause: SMEs were invited but did not pre-read; they spent the review hour reading the draft for the first time and produced surface-level reactions instead of substantive technical critique. Fix by requiring pre-read before the meeting, not during it.
- The deadline-defense reflex. A reviewer raises an S1 (“this technical approach won’t deliver in the proposed timeline”) and the room responds with “we don’t have time to rewrite that, we’ll handle it post-award.” This is deferral dressed up as pragmatism. Either the issue is real (in which case shipping unchanged guarantees failure or scope renegotiation) or it isn’t (in which case the SME’s reasoning needs to be answered, not deferred). Defer rarely; rewrite often.
- Reviewers who copyedit instead of review. A red-team reviewer who returns 200 line edits and zero structural findings has reviewed at the wrong altitude. Tell reviewers explicitly at the kickoff: red team is for content, not copy. Copy editing happens in a separate pass after gold.
- The drafter who attends their own red team. Writers want to be in the room to defend their work. This is exactly why they should not be. The proposal manager attends; the drafters get the findings document and have until gold team to respond.
Typical duration. Three hours for the review meeting if pre-read is enforced. Add 60 to 90 minutes per reviewer for pre-read. Total time across a four-person panel: roughly 16 to 20 hours. The proposal manager spends another four to six hours consolidating and assigning remediation.
Gold team — win themes and compliance
Timing. When the response is approximately 95% drafted. Every red-team S1 and S2 is closed or has a written deferral with executive sponsor sign-off. The cost volume has final numbers. The executive summary has been through three editorial passes. Attachments are complete and named per the buyer’s required convention.
Who attends. A small group: the proposal manager, the executive sponsor (the senior commercial owner who made the bid/no-bid call), the compliance lead, and at most one or two outside reviewers who have not seen this draft. Three to four people. No drafters.
Gold team is small on purpose. By this point, the bulk of the editorial work is done. Gold is the final integrity check before submission — it is not the moment to surface new content. Reviewers who want to add new material at gold are reviewers who should have been at red.
The rubric — 8 items. This rubric is structured as pass/fail only. A FAIL on any of the eight items blocks submission until resolved.
- Compliance completeness. Every requirement in the compliance matrix has a non-null response pointer. Every “shall” is answered. The matrix itself is attached to the response if the buyer requires it.
- Citation integrity. Every factual claim about the vendor — capabilities, past performance, certifications, financial standing — is traceable to a source in the company’s records. No fabricated specifics. (This is the rubric item that grounded AI is supposed to make easy. See our grounded-AI pledge.)
- Win-theme presence. Every named win theme from the capture plan appears in every major section, not just the exec summary. (Reviewers spot-check three sections at random and verify theme presence with evidence.)
- Buyer-outcome alignment. The executive summary’s first paragraph leads with the buyer’s stated outcome, not the vendor’s qualifications. The same outcome is referenced in the closing paragraph of the technical approach and in the cost-narrative opening.
- Submission-format compliance. Page count, font, margins, file format, and naming convention exactly match the RFP’s submission instructions. The portal mechanism has been dry-run by the named submit-clicker.
- Pricing integrity. The cost volume’s totals match the cost-narrative’s section subtotals. The cost narrative justifies each line against the technical approach. Discounts and assumptions are documented where the RFP allows.
- Attachments and signatures. Every required attachment is present and current-dated. Every required signature is in place. Cover letter is on letterhead. Required certifications are not expired.
- Submission timing. At least 24 hours remain between gold-team conclusion and the scheduled submission. Submission is scheduled for at least eight hours before the deadline (not 30 minutes before).
Output format. A go/no-go decision document. The executive sponsor signs the go decision in writing. If any rubric item is a FAIL, the submission is delayed until the FAIL is resolved or the bid is withdrawn. There is no “we’ll fix it after submit” at gold team.
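Because the gold rubric is binary, the go decision reduces to a single all-pass check. A sketch under that assumption — the item names are illustrative shorthand for the eight items above, not identifiers from any tool:

```python
GOLD_RUBRIC = [
    "compliance_completeness", "citation_integrity", "win_theme_presence",
    "buyer_outcome_alignment", "submission_format_compliance",
    "pricing_integrity", "attachments_and_signatures", "submission_timing",
]

def go_no_go(results: dict) -> bool:
    """GO only when every one of the eight items is an explicit pass.
    A missing or False entry is a FAIL — there is no NEEDS WORK at gold."""
    return all(results.get(item) is True for item in GOLD_RUBRIC)

results = {item: True for item in GOLD_RUBRIC}
assert go_no_go(results)
results["citation_integrity"] = False
assert not go_no_go(results)  # one FAIL blocks submission
```

Note the deliberate asymmetry: an item nobody scored counts as a FAIL, which is exactly the “no ‘we’ll fix it after submit’” rule in code form.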
Pre-submit go/no-go. The gold team output is the formal go/no-go for submission. This is not a ceremonial check — it is the moment when the executive sponsor accepts accountability for the response as shipped. If the sponsor is uncomfortable with any rubric item, the response does not go out. The hard cases are real: a gold team that surfaces a citation-integrity FAIL with 36 hours on the clock is a hard conversation, but it is a conversation that has to happen at gold rather than after the buyer reads the response.
Typical duration. 90 minutes. Gold team is short because the rubric is binary and the work is mostly verification, not deliberation. Pre-read is mandatory and brief — the executive sponsor and compliance lead read the full response in advance. The meeting is the verification, not the reading.
White team — post-submission
Timing. Within 14 days of the buyer’s award decision (win or loss). The team’s memory is freshest in the two weeks after submission; by week three, the next bid has displaced this one and the white team becomes a calendar event nobody attends.
If the buyer’s decision takes longer than 30 days — which happens often in federal procurement and not infrequently in commercial — run an interim white team within 14 days of submission anyway, on what the team learned about its own process. Then run a follow-up debrief-driven white team after the award decision lands.
Who attends. The drafters, the capture lead, the proposal manager, and (when possible) someone from the buyer’s evaluation panel via debrief notes. The executive sponsor attends for high-value bids. SMEs are invited but optional.
The drafters’ attendance matters most. White team is where the people who wrote the response find out which parts of their work moved the rubric and which parts did not. It is the only stage at which writers get structured feedback on their own output, and that feedback is the input to next quarter’s draft quality.
The rubric. White team is less rubric-driven than the upstream reviews because the question is open-ended: what did we learn? Still, structure helps. We use a six-question template:
- What did we propose? A one-paragraph summary of the offer as submitted, including the three to five win themes and the headline price.
- What were we graded on? The published evaluation criteria, plus any debrief-derived information about how the panel actually weighted them.
- What did the panel say worked? From the debrief: specific strengths the evaluators called out. From internal review: which content blocks, win themes, or structural choices the team is proudest of regardless of award outcome.
- What did the panel say did not work? From the debrief: specific weaknesses. From internal review: known compromises the team made under deadline pressure that may have cost score.
- What KB updates does this generate? Concrete write-back actions: content blocks to update, past-performance citations to refresh, win themes to retire or promote, capture-plan templates to revise.
- What process change does this generate? Operational lessons: pink team that started too early, red team that ran too short, SME availability that broke down, portal-submission near-misses. Each one becomes a backlog item.
Output format. A post-mortem document filed in the proposal record, with the KB updates and process changes broken out as separate action items with named owners and dates. The KB updates are the most important part of the document — they are how the loss or win compounds into the next bid. A white team that produces no KB updates is a white team whose findings will not survive contact with next quarter’s deadlines.
The post-mortem write-back is not optional. The product thesis behind any compounding proposal function — see the eight-stage pipeline post — is that every bid feeds the next. White team is the stage where that compounding either happens or doesn’t.
Common failure modes.
- The team holds the meeting and writes nothing back. Lessons live in heads, not in the corpus. Three months later, a different drafter writes the same too-generic past-performance citation that the white team flagged in July.
- The buyer debrief is skipped. “The procurement officer wouldn’t tell us anything useful.” This is sometimes true and often a function of how the team asked. Procurement officers will, in our experience, tell you exactly why you lost if asked correctly. (See the procurement-lead reading post for the framing.) Skipping the debrief skips the highest-density learning input.
- The white team becomes a blame meeting. When the loss is recent, the room defaults to defending or attacking. The proposal manager has to actively steer toward the rubric. Specific people did specific things; the question is what the team does next quarter, not who to fault.
- The action items have no follow-through. This is the most common failure mode in proposal-shop post-mortems generally — and the Leulu and Co essay on this is worth reading. The fix is to put white-team action items into the same backlog as next quarter’s process improvements, with named owners and review dates.
Typical duration. Two hours for the meeting, plus three to five hours of preparation by the proposal manager (debrief notes, internal review draft, KB-update list). Total time per white team: roughly eight hours. For a team running 15 to 25 bids a year, that is three to five full weeks of proposal-manager time annually — and the most consequential weeks the function spends.
The review calendar
The single highest-leverage operational decision is setting all four review dates at kickoff, before any drafting happens. This sounds obvious. It is the thing teams skip more than any other process step.
At kickoff:
- Mark the submission deadline on the calendar.
- Subtract two business days for submission buffer. That is the gold-team date.
- Subtract another five to seven business days. That is the red-team date.
- Subtract another ten to fifteen business days. That is the pink-team date.
- Schedule the white team for two weeks after the expected award decision (with an interim white team at submission + 14 days if the award is far out).
These dates go in calendar invites with named reviewers, before the first writing assignment is given. Once the calendar is set, the rule is: the dates do not move because drafters are behind. If a section is not ready for pink, the reviewers review what exists and the missing-content findings become S1 remediation items. If a section is not ready for red, the same rule applies. The reviews are fixed points; the drafting fits around them.
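The kickoff arithmetic above is simple enough to sketch as a date calculation. A minimal sketch, assuming weekends are the only non-business days (a real calendar also needs holidays) and using the midpoints of the spacing ranges; the helper names are made up:

```python
from datetime import date, timedelta

def minus_business_days(d: date, n: int) -> date:
    """Step back n business days (Mon-Fri), skipping weekends."""
    while n > 0:
        d -= timedelta(days=1)
        if d.weekday() < 5:  # 0-4 = Mon-Fri
            n -= 1
    return d

def review_calendar(deadline: date) -> dict:
    gold = minus_business_days(deadline, 2)  # two-day submission buffer
    red = minus_business_days(gold, 6)       # midpoint of the 5-7 day range
    pink = minus_business_days(red, 12)      # midpoint of the 10-15 day range
    return {"pink": pink, "red": red, "gold": gold, "deadline": deadline}

cal = review_calendar(date(2025, 3, 31))  # a Monday deadline
```

Run once at kickoff, then put the resulting dates in invites with named reviewers. The point of computing them mechanically is the rule that follows: the dates do not move.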
The non-negotiable spacing matters because each review needs at least 48 hours between conclusion and the next milestone (writing checkpoint, gold team, submission) for findings to be addressed. Reviews compressed into the final 24 hours before submission produce findings that nobody acts on, which means the review was theatre.
Tooling
This playbook is product-neutral. It can be run on a wiki and a spreadsheet. The tooling that helps, regardless of vendor:
- A compliance matrix that updates as addenda land — so pink team can score against the current state, not the original RFP.
- Inline comment threading on the response document — so red-team findings stay attached to the specific paragraph they apply to, not lost in a separate findings deck.
- Citation tracking on every factual claim — so gold team’s citation-integrity check is a verification, not a research project.
- A KB write-back path from white team — so post-mortem findings update the corpus the next bid draws from.
PursuitAgent’s proposal builder is built around this exact review cadence — color-team rubrics live next to the draft, citation integrity is enforced at the block level, and the white-team write-back is a one-click action that promotes approved content blocks into the active KB. Other tools can do parts of this. The discipline is what matters; the tooling is downstream.
Closing
The four reviews are not the same review repeated four times. Each one asks a different question at a different completeness, and each one fails in characteristic ways when run against the wrong rubric. Pink is structural; red is content; gold is integrity; white is compounding. Run them in order, with the right people, at the right cadence, and the proposal function gets better with every bid that ships through it.
For the argument behind the discipline, see Sarah’s color-team review essay from earlier this month and the broader eight-stage pipeline pillar that places color-team review in context. This post is the operational handbook those essays point at — keep it open while you run your next four reviews, and write back in 90 days with what the playbook got wrong on a real bid. We will update it.