Source: frameworks/kit-scope-discipline/04-kit-scope-discipline-quality.md

04 — QUALITY: Scope Discipline Kit

Pass threshold: 90 / 100

When to run: After every scope-check doc generation (Mode 1) and after every kit improvement (Mode 2), before either is shared with the advisor or committed to the client repo.

Also required: The pre-build validation gate in file 01 must pass before any production starts. This QC gate runs after production.


Scope Definition Quality (25 points)

The scope-check doc's in-scope and out-of-scope sections are the document's foundation. If they're wrong, everything downstream misdirects.

| # | Check | Points |
|---|---|---|
| 1 | "What's In Scope (Explicit)" contains verb-led, specific items sourced directly from the signed SOW. No "advisory services" or other vague umbrella terms. | 6 |
| 2 | Every in-scope item traces to a contracted commitment: the advisor can say "this is what we agreed to" and produce the SOW evidence. | 4 |
| 3 | "What's Explicitly Out of Scope" is split into three subsections (Lane violations / Lever violations / Historical out-of-scope asks). All three subsections are present, even if one has fewer entries. | 4 |
| 4 | Lane violations list at least 3 concrete items specific to this client's industry, vendor mix, or operational context. | 4 |
| 5 | Lever violations list at least 2 items, each naming who has the lever (e.g., "forcing vendor behavior: the vendor has the lever"). | 4 |
| 6 | Historical out-of-scope asks cite at least 2 specific past incidents with dates, if session history exists. If no session history exists (brand-new client), the section is flagged explicitly ("populates after first Mode 3 update") rather than left blank without explanation. | 3 |

Trigger Specificity (20 points)

Triggers are what make the doc usable in real time. Vague triggers mean the advisor can't recognize the pattern in the moment.

| # | Check | Points |
|---|---|---|
| 7 | At least 3 concrete triggers listed, each naming a specific person, topic, or circumstance. | 6 |
| 8 | Zero generic triggers ("when the client emails," "when there's a problem," "when something urgent comes up"). | 4 |
| 9 | Each trigger names the likely out-of-scope ask that follows the signal, not just the signal itself. | 4 |
| 10 | Each trigger includes a dated pattern observation ("first documented YYYY-MM-DD") when evidence exists. | 3 |
| 11 | Triggers align with the Historical section: patterns observed there become triggers named here. | 3 |

Scripted Response Voice (20 points)

Templates that sound wrong produce robo-boundaries the advisor won't use. Voice is load-bearing.

| # | Check | Points |
|---|---|---|
| 12 | Five scripted response variants provided, keyed to the 2x2 matrix quadrants plus the ACTIVE BLEED case (small favor / out of lane / no lever / scope expansion / active bleed). | 5 |
| 13 | All scripted responses pass the voice requirements in file 02 (direct, warm, concrete, time-bounded when applicable, no emoji, no corporate hedging, no therapy language). | 6 |
| 14 | Each scripted response has a clear "Use when" note explaining which ask pattern triggers it. | 4 |
| 15 | Time caps are explicit (e.g., "30 minutes") in any accept-as-favor response. Never implied with soft language like "I'll see what I can do." | 3 |
| 16 | The advisor has reviewed the scripted responses for voice accuracy before the doc ships. Required when the kit is run by someone other than the advisor. | 2 |

Content Integrity (15 points)

The doc must contain only what's factually grounded. Fabrication erodes advisor trust.

| # | Check | Points |
|---|---|---|
| 17 | The doc contains only advisor-internal framing. No polished diplomatic language that would read as client-facing when the goal is candid scoping. | 4 |
| 18 | Dollar amounts or hour counts cited are factual, drawn from actual logged time or stated bleed. Not estimated, not rounded up for effect. | 4 |
| 19 | Historical out-of-scope asks are documented only where evidence exists. No speculation about future bleed or hypothetical asks. | 4 |
| 20 | No personality critique of client stakeholders beyond what's directly relevant to the scope pattern. | 3 |

Forbidden Term Scan (10 points — Blocking)

Any hit in this category is a hard fail regardless of total score. Voice consistency is structural.

| # | Check | Points |
|---|---|---|
| 21 | Zero instances of "boundaries" used as a generic noun in the therapeutic sense ("healthy boundaries," "boundary work"). Structural use ("the boundary between in-scope and out-of-scope") is acceptable. | 2 |
| 22 | Zero instances of forbidden therapy language from file 02: "toxic," "doormat," "people-pleaser," "taking advantage of," "self-care," "protect your energy," "energy vampire," "set healthy boundaries," "just say no," "firm but kind." | 4 |
| 23 | Zero instances of corporate hedging: "circle back," "align on," "pursuant to," "per our discussion," "touch base," "at the end of the day," "going forward." | 2 |
| 24 | Zero instances of filler phrases that weaken directness: "I just wanted to," "sorry to bother," "if that's okay," "I was thinking maybe." | 2 |

Blocking rule: Any hit on checks 21-24 = automatic revise. The doc does not ship with forbidden terms present.
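
The scan in checks 21-24 is mechanical enough to sketch in code. The following is a minimal Python sketch, not the kit's actual tooling: the phrase lists are copied from checks 22-24, while the function name `scan_forbidden_terms` and the dictionary layout are hypothetical. Check 21 depends on context, so the sketch only matches the two therapeutic phrasings the check quotes and leaves every other use of "boundaries" to a human read.

```python
import re

# Phrases copied from checks 22-24. The dict layout is an illustrative assumption.
FORBIDDEN = {
    22: ["toxic", "doormat", "people-pleaser", "taking advantage of", "self-care",
         "protect your energy", "energy vampire", "set healthy boundaries",
         "just say no", "firm but kind"],
    23: ["circle back", "align on", "pursuant to", "per our discussion",
         "touch base", "at the end of the day", "going forward"],
    24: ["I just wanted to", "sorry to bother", "if that's okay",
         "I was thinking maybe"],
}

# Check 21 is context-dependent: only the two quoted therapeutic phrasings are
# matched here. Structural uses of "boundary" are deliberately not flagged.
THERAPEUTIC_BOUNDARY = ["healthy boundaries", "boundary work"]


def scan_forbidden_terms(text):
    """Return a list of (check_number, phrase, line_number) hits."""
    checks = dict(FORBIDDEN)
    checks[21] = THERAPEUTIC_BOUNDARY
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        # Normalize curly apostrophes so "that's" matches either way.
        normalized = line.replace("\u2019", "'")
        for check, phrases in sorted(checks.items()):
            for phrase in phrases:
                # Whole-phrase match, case-insensitive.
                pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
                if re.search(pattern, normalized, flags=re.IGNORECASE):
                    hits.append((check, phrase, line_number))
    return hits
```

Under the blocking rule, any non-empty result means the doc goes back for revision, whatever its total score.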


Structural Completeness (10 points)

The doc's structure must match the golden example exactly. Structural deviations break the advisor's ability to scan.

| # | Check | Points |
|---|---|---|
| 25 | All required sections present in order: Header / What's In Scope / What's Explicitly Out of Scope (all three subsections) / Scope Triggers / Scripted Responses (5 variants) / Current Status / Cap Tracking / Review Log. | 3 |
| 26 | Filename matches convention: clients/[client-name]/scope-check.md exactly (lowercase, no date suffix, no version tag). One per client, updated in place. | 2 |
| 27 | Review Log entry added with today's date and a locked event type (CREATED, HELD, MISSED, EXPANDED, CAP HIT, REVIEWED). | 2 |
| 28 | Cap Tracking table is present even if empty; a single row reading "No active favors with caps at this time" is acceptable. The table itself is never deleted. | 2 |
| 29 | Status flag at top of doc is set correctly based on state (ACTIVE BLEED when applicable, omitted for STABLE). | 1 |
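
Checks 25-27 can be partly automated. The sketch below is one possible Python helper, under stated assumptions: client folder names are assumed to use lowercase letters, digits, and hyphens (the source only says "lowercase"), section headings are assumed to arrive as a list of strings, and the "Header" item from check 25 is left out because it is the top of the doc rather than a named section. All function and constant names are hypothetical.

```python
import re

# Assumption: client folder names are lowercase letters, digits, and hyphens.
FILENAME_RE = re.compile(r"clients/[a-z0-9]+(?:-[a-z0-9]+)*/scope-check\.md")

# Locked event types from check 27.
EVENT_TYPES = {"CREATED", "HELD", "MISSED", "EXPANDED", "CAP HIT", "REVIEWED"}

# Named sections from check 25, in required order ("Header" omitted, see lead-in).
REQUIRED_SECTIONS = [
    "What's In Scope",
    "What's Explicitly Out of Scope",
    "Scope Triggers",
    "Scripted Responses",
    "Current Status",
    "Cap Tracking",
    "Review Log",
]


def filename_ok(path):
    """True only for clients/<client-name>/scope-check.md with no suffix or tag."""
    return FILENAME_RE.fullmatch(path) is not None


def sections_in_order(headings):
    """True if every required section appears, in the required order.

    Headings match by prefix, so "What's In Scope (Explicit)" counts.
    Extra headings between required ones are tolerated.
    """
    position = 0
    for heading in headings:
        if (position < len(REQUIRED_SECTIONS)
                and heading.startswith(REQUIRED_SECTIONS[position])):
            position += 1
    return position == len(REQUIRED_SECTIONS)
```

These helpers cover only the mechanical parts of the category; whether the three out-of-scope subsections and the five response variants are present still needs a read of the doc.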

Scoring and Revision Protocol

Score calculation:

  1. Total points across 29 checks = 100.
  2. Pass threshold = 90.
  3. Forbidden term hits (checks 21-24) override total score — any hit blocks delivery regardless.
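
The arithmetic above can be expressed as a short Python sketch. The category maxima come from the section headings in this file; the function name `qc_verdict` and the shape of its return value are illustrative assumptions, not part of the kit.

```python
# Category maxima as stated in the section headings; they sum to 100.
CATEGORY_MAX = {
    "Scope Definition Quality": 25,
    "Trigger Specificity": 20,
    "Scripted Response Voice": 20,
    "Content Integrity": 15,
    "Forbidden Term Scan": 10,
    "Structural Completeness": 10,
}
PASS_THRESHOLD = 90


def qc_verdict(earned, forbidden_hits):
    """Total the score and apply the forbidden-term override.

    earned: points awarded per category (same keys as CATEGORY_MAX).
    forbidden_hits: number of hits on checks 21-24.
    """
    for name, points in earned.items():
        if not 0 <= points <= CATEGORY_MAX[name]:
            raise ValueError(f"{name}: {points} is outside 0..{CATEGORY_MAX[name]}")
    total = sum(earned.values())
    blocked = forbidden_hits > 0  # any hit blocks delivery regardless of total
    return {
        "total": total,
        "blocked": blocked,
        "passed": total >= PASS_THRESHOLD and not blocked,
    }
```

For example, a doc that scores 98 with one forbidden-term hit comes back with `passed` set to `False`, because the block overrides the total.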

If score < 90:

  1. Identify the lowest-scoring category first.
  2. Fix the issues in that category.
  3. Re-run the full QC.
  4. Repeat until 90+ AND zero forbidden-term hits.
  5. If after two revision attempts the score remains below 90, escalate to the advisor with the specific gaps flagged, rather than shipping a marginal doc.
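
The revision loop can be sketched the same way. This Python sketch makes one assumption the text leaves open: "lowest-scoring category" is read as the lowest share of available points earned, since the categories have different maxima. The function names are hypothetical.

```python
def lowest_category(earned, category_max):
    """Pick the category with the lowest share of its points earned.

    Assumption: "lowest-scoring" means lowest ratio, not lowest raw points.
    """
    return min(earned, key=lambda name: earned[name] / category_max[name])


def next_step(total, forbidden_hits, revisions_done, threshold=90, max_revisions=2):
    """Decide what happens after a full QC run."""
    if total >= threshold and forbidden_hits == 0:
        return "ship"
    if total < threshold and revisions_done >= max_revisions:
        # Still below threshold after two attempts: hand the gaps to the advisor.
        return "escalate"
    return "revise"
```

Escalation here applies only to the score, as in step 5; a doc that clears 90 but still carries forbidden terms keeps returning `"revise"`.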

Required-Revise Triggers (Mode 1 blocking gates — separate from the 100-point scoring):

These are conditions that produce an automatic revise regardless of how the rest of the doc looks:

  1. SOW wasn't read. The "What's In Scope" section cannot be sourced without the SOW. If the doc shipped without SOW evidence, revise.
  2. Triggers are generic. If checks 7-9 all fail (no concrete triggers), revise with the 10-min trigger interview before re-generating.
  3. Scripted responses are in a generic voice, not the advisor's. If check 16 fails, revise with advisor voice calibration.
  4. Historical section contains speculation. If check 19 fails (the section lists items that have not actually happened), revise by removing them.
  5. Any forbidden term is present. See checks 21-24.

Common Failure Modes

Documented failure patterns and the Mode 2 or Mode 3 updates that address them. This section grows as production experience surfaces new gaps.

| Failure | What Happens | How to Fix (Mode 2 or Mode 3 Update) |
|---|---|---|
| Vague triggers ("when the client emails") | Advisor can't use the doc in real time because the signal isn't recognizable at the moment of decision. | Mode 3 update: add topic-specificity ("when [stakeholder] emails about [specific recurring topic]"). If the trigger generation rule in file 05 keeps producing vague triggers, tighten the rule. |
| Scripted responses in wrong voice | Advisor won't use them and writes their own every time, which means the kit added no value. Over time, advisor stops opening the doc. | Mode 2 update: capture the advisor's actual preferred wording from a recent scope-held instance. Replace template. Also check file 02 voice requirements are being enforced in file 05. |
| Doc written with client-appropriate politeness | Doc reads diplomatic, loses the candid utility that makes it useful in private. | Mode 2 update: regenerate with file 02 voice requirements enforced. Also check file 01's content filtering rules are being followed. |
| Speculative out-of-scope items | Doc reads as catastrophizing or paranoid. Advisor stops trusting it because it doesn't match their experience with the client. | Mode 3: remove speculation. Re-anchor in documented incidents only. |
| Status flags not set when ACTIVE BLEED exists | Advisor doesn't triage the doc with urgency; bleed continues. | Mode 3: set status flag. Enforce via file 05 output skill logic: if logged unpaid work exceeds the threshold, flag automatically. |
| Doc not updated after scope-held or scope-missed event | Doc becomes stale; kit's self-improvement loop breaks. | Mode 2: add a calendar reminder pattern to the kit's handoff ("review scope-check.md before each monthly session, update after"). If this is chronic, the kit isn't being operationalized; surface in advisor review. |
| Cap set without a "Trigger at Cap" action | Favor hit cap, but no next step was defined, so bleed resumes automatically. | Mode 2: enforce the Trigger-at-Cap column as required in the output skill. |
| Applied to self-facing work | Kit used on advisor's own positioning, brand, or internal productization, which is a different behavioral pattern. | Mode 2: add explicit "NOT This" in file 00 start-here (already in place). If the error keeps happening, surface in advisor review; this may signal a need for a separate self-facing production kit. |
| Doc contains forbidden terms | Voice inconsistency undermines the kit's structural framing. The advisor reads "boundaries" in their own doc and flinches. | Mode 2: run the forbidden-term scan (checks 21-24) before any generated doc ships. If specific forbidden terms keep appearing, tighten the rule in file 02. |
| Generic / client-agnostic scope-check shipped | Advisor treats the doc as a checklist, not a tool. Opens it once, never again. | Block the build via Gap Protocol (file 01). Run the 10-min interview first. Do not ship a generic doc. |
| Historical section shipped without evidence citations | Advisor can't verify claims in the doc, starts to doubt it. | Mode 3: add citations (session recap date, email reference, logged hours). If no evidence, remove the entry. |
| More than 7 triggers listed | Doc becomes overwhelming; advisor can't hold all triggers in working memory during a live moment. | Mode 3: consolidate or prioritize. Keep 3 to 7 triggers; retire ones that haven't fired in 90 days. |
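
The trigger-count guidance in the last row (3 to 7 triggers, retire any that haven't fired in 90 days) lends itself to a small date check. This is a minimal Python sketch assuming each trigger has a recorded last-fired date; the data shape and function names are hypothetical, and a trigger that fired exactly 90 days ago is kept.

```python
from datetime import date, timedelta


def trigger_count_ok(count, minimum=3, maximum=7):
    """True when the doc lists between 3 and 7 triggers inclusive."""
    return minimum <= count <= maximum


def triggers_to_retire(last_fired, today, max_idle_days=90):
    """Return trigger names whose last firing is more than max_idle_days ago.

    last_fired: dict mapping trigger name to the date it last fired.
    """
    cutoff = today - timedelta(days=max_idle_days)
    return [name for name, fired in last_fired.items() if fired < cutoff]
```

Retiring a trigger is still a judgment call for the advisor; the check only surfaces candidates.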

Self-QC Scoring Procedure (For Kit Operator)

  1. Generate the scope-check doc (and proposal scope section if Mode 1 includes it).
  2. Read through once, holistically. Does it feel usable in a live client moment?
  3. Run through the 29 checks above. Score each category.
  4. Run the forbidden-term scan (checks 21-24). Any hit = automatic revise.
  5. Total the score. If < 90, identify the lowest category, fix, re-run.
  6. Check the Required-Revise Triggers. Any hit = revise regardless of total score.
  7. Present the final score alongside the doc when delivering. Include which checks lost points and why.
  8. If below 90 after two revision attempts, escalate to the advisor with specific gaps flagged.