Source: frameworks/kit-scope-discipline/04-kit-scope-discipline-quality.md

04 — QUALITY: Scope Discipline Kit

Pass threshold: 90 / 100
When to run: After every scope-check doc generation (Mode 1) and after every kit improvement (Mode 2), before either is shared with the advisor or committed to the client repo.
Also required: The pre-build validation gate in file 01 must pass before any production starts. This QC gate runs after production.

Scope Definition Quality (25 points)

The scope-check doc's in-scope and out-of-scope sections are the document's foundation. If they're wrong, everything downstream misdirects.

#	Check	Points
1	"What's In Scope (Explicit)" contains verb-led, specific items sourced directly from the signed SOW. No "advisory services" or other vague umbrella terms.	6
2	Every in-scope item is traceable to the contract: the advisor can say "this is what we agreed to" and produce the SOW evidence.	4
3	"What's Explicitly Out of Scope" is split into three subsections (Lane violations / Lever violations / Historical out-of-scope asks). All three subsections are present, even if one has fewer entries.	4
4	Lane violations list at least 3 concrete items specific to this client's industry, vendor mix, or operational context.	4
5	Lever violations list at least 2 items naming who has the lever for each (e.g., "forcing vendor behavior — the vendor has the lever").	4
6	Historical out-of-scope asks cite at least 2 specific past incidents with dates, if session history exists. If no session history exists (brand-new client), the section is flagged explicitly ("populates after first Mode 3 update") rather than left blank without explanation.	3

Trigger Specificity (20 points)

Triggers are what make the doc usable in real time. Vague triggers mean the advisor can't recognize the pattern in the moment.

#	Check	Points
7	At least 3 concrete triggers listed, each naming a specific person, topic, or circumstance.	6
8	Zero generic triggers ("when the client emails," "when there's a problem," "when something urgent comes up").	4
9	Each trigger names the likely out-of-scope ask that follows the signal — not just the signal itself.	4
10	Each trigger includes a dated pattern observation ("first documented YYYY-MM-DD") when evidence exists.	3
11	Triggers align with the Historical section — patterns observed there become triggers named here.	3
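
Check 10 lends itself to a mechanical pre-check. The sketch below assumes each trigger is available as a plain string (a hypothetical representation; the kit does not specify one) and only confirms that a "first documented YYYY-MM-DD" note carries a real calendar date. Whether the date matches the evidence still needs a human read.

```python
import re
from datetime import date

# Matches the "first documented YYYY-MM-DD" wording from check 10.
DATED_PATTERN = re.compile(r"first documented (\d{4})-(\d{2})-(\d{2})")


def has_valid_dated_observation(trigger_text: str) -> bool:
    """Return True if the trigger cites a real calendar date in the check-10 format."""
    match = DATED_PATTERN.search(trigger_text)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # raises ValueError for impossible dates
    except ValueError:
        return False
    return True
```

A trigger that fails this helper is not automatically a failed check: check 10 only applies when evidence exists.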

Scripted Response Voice (20 points)

Templates that sound wrong produce robo-boundaries the advisor won't use. Voice is load-bearing.

#	Check	Points
12	Five scripted response variants provided, keyed to the 2x2 matrix quadrants + the ACTIVE BLEED case (small favor / out of lane / no lever / scope expansion / active bleed).	5
13	All scripted responses pass the voice requirements in file 02 (direct, warm, concrete, time-bounded when applicable, no emoji, no corporate hedging, no therapy language).	6
14	Each scripted response has a clear "Use when" note explaining which ask pattern triggers it.	4
15	Time caps are explicit (e.g., "30 minutes") in any accept-as-favor response. Never implied with soft language like "I'll see what I can do."	3
16	The advisor has reviewed the scripted responses for voice accuracy before the doc ships. Required when the kit is run by someone other than the advisor.	2
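
Check 15 can be approximated with a pattern match. This is a rough sketch, assuming a cap is written as a number followed by a time unit and that the only soft phrase screened is the one quoted in check 15; the function name and the phrase list are illustrative, not part of the kit.

```python
import re

# An explicit cap: a number followed by a time unit, e.g. "30 minutes".
EXPLICIT_CAP = re.compile(r"\b\d+\s*(?:minutes?|mins?|hours?|hrs?)\b", re.IGNORECASE)

# Soft phrasing quoted in check 15; extend this tuple as needed (assumption).
SOFT_PHRASES = ("i'll see what i can do",)


def favor_response_has_explicit_cap(response: str) -> bool:
    """Return True if an accept-as-favor response states a concrete time cap."""
    # Normalize curly apostrophes so the phrase match works on pasted text.
    lowered = response.lower().replace("\u2019", "'")
    if any(phrase in lowered for phrase in SOFT_PHRASES):
        return False
    return EXPLICIT_CAP.search(response) is not None
```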

Content Integrity (15 points)

The doc must contain only what's factually grounded. Fabrication erodes advisor trust.

#	Check	Points
17	The doc contains only advisor-internal framing. No polished diplomatic language that would read as client-facing when the goal is candid scoping.	4
18	Dollar amounts or hour counts cited are factual — drawn from actual logged time or stated bleed. Not estimated, not rounded up for effect.	4
19	Historical out-of-scope asks are documented only where evidence exists. No speculation about future bleed or hypothetical asks.	4
20	No personality critique of client stakeholders beyond what's directly relevant to the scope pattern.	3

Forbidden Term Scan (10 points — Blocking)

Any hit in this category is a hard fail regardless of total score. Voice consistency is structural.

#	Check	Points
21	Zero instances of "boundaries" used as a generic noun in the therapeutic sense ("healthy boundaries," "boundary work"). Structural use ("the boundary between in-scope and out-of-scope") is acceptable.	2
22	Zero instances of forbidden therapy language from file 02: "toxic," "doormat," "people-pleaser," "taking advantage of," "self-care," "protect your energy," "energy vampire," "set healthy boundaries," "just say no," "firm but kind."	4
23	Zero instances of corporate hedging: "circle back," "align on," "pursuant to," "per our discussion," "touch base," "at the end of the day," "going forward."	2
24	Zero instances of filler phrases that weaken directness: "I just wanted to," "sorry to bother," "if that's okay," "I was thinking maybe."	2

Blocking rule: Any hit on checks 21-24 = automatic revise. The doc does not ship with forbidden terms present.
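
The scan in checks 21-24 can be sketched as a phrase lookup. The term lists below are copied from the checks above; the function name and the whole-phrase matching rule are assumptions. Check 21 is only approximated by its two quoted examples, because separating therapeutic from structural use of "boundaries" needs a human read.

```python
import re

# Phrases keyed by check number, taken from checks 21-24.
FORBIDDEN_TERMS = {
    21: ["healthy boundaries", "boundary work"],  # quoted examples only
    22: ["toxic", "doormat", "people-pleaser", "taking advantage of",
         "self-care", "protect your energy", "energy vampire",
         "set healthy boundaries", "just say no", "firm but kind"],
    23: ["circle back", "align on", "pursuant to", "per our discussion",
         "touch base", "at the end of the day", "going forward"],
    24: ["i just wanted to", "sorry to bother", "if that's okay",
         "i was thinking maybe"],
}


def scan_forbidden_terms(doc_text: str) -> list[tuple[int, str]]:
    """Return (check_number, term) for every forbidden phrase found."""
    # Lowercase and normalize curly apostrophes before matching.
    lowered = doc_text.lower().replace("\u2019", "'")
    hits = []
    for check, terms in FORBIDDEN_TERMS.items():
        for term in terms:
            # Match whole phrases only, so "toxic" does not fire inside "nontoxic".
            if re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", lowered):
                hits.append((check, term))
    return hits
```

Under the blocking rule, any non-empty result means the doc is revised before it ships.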

Structural Completeness (10 points)

The doc's structure must match the golden example exactly. Structural deviations break the advisor's ability to scan.

#	Check	Points
25	All required sections present in order: Header / What's In Scope / What's Explicitly Out of Scope (all three subsections) / Scope Triggers / Scripted Responses (5 variants) / Current Status / Cap Tracking / Review Log.	3
26	Filename matches convention: `clients/[client-name]/scope-check.md` exactly — lowercase, no date suffix, no version tag. One per client, updated in place.	2
27	Review Log entry added with today's date and a locked event type (CREATED, HELD, MISSED, EXPANDED, CAP HIT, REVIEWED).	2
28	Cap Tracking table is present even if empty — single row with "No active favors with caps at this time" is acceptable. Table itself does not get deleted.	2
29	Status flag at top of doc is set correctly based on state (ACTIVE BLEED when applicable, omitted for STABLE).	1
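
Checks 26 and 27 are mechanical enough to sketch in code. The example assumes client folder names use only lowercase letters, digits, and hyphens, which the kit does not state; the event types are the locked list from check 27.

```python
import re

# Check 26: clients/[client-name]/scope-check.md, lowercase, no date or version suffix.
# The allowed character set for [client-name] is an assumption.
FILENAME_PATTERN = re.compile(r"clients/[a-z0-9][a-z0-9-]*/scope-check\.md")

# Check 27: the locked Review Log event types.
LOCKED_EVENT_TYPES = {"CREATED", "HELD", "MISSED", "EXPANDED", "CAP HIT", "REVIEWED"}


def filename_ok(path: str) -> bool:
    """Return True if the path matches the scope-check naming convention exactly."""
    return FILENAME_PATTERN.fullmatch(path) is not None


def event_type_ok(event_type: str) -> bool:
    """Return True if the Review Log event type is one of the locked values."""
    return event_type in LOCKED_EVENT_TYPES
```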

Scoring and Revision Protocol

Score calculation:

Total points across 29 checks = 100.
Pass threshold = 90.
Forbidden term hits (checks 21-24) override total score — any hit blocks delivery regardless.
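
As a minimal sketch, the calculation can be expressed as below. The category maxima come from the section headings above; the function name and the input shape (a dict of earned points per category plus a count of forbidden-term hits) are assumptions for illustration.

```python
# Maximum points per category, from the section headings above.
CATEGORY_MAX = {
    "Scope Definition Quality": 25,
    "Trigger Specificity": 20,
    "Scripted Response Voice": 20,
    "Content Integrity": 15,
    "Forbidden Term Scan": 10,
    "Structural Completeness": 10,
}
PASS_THRESHOLD = 90


def qc_verdict(category_scores: dict[str, int], forbidden_hits: int) -> tuple[int, bool]:
    """Return (total, passed). Any forbidden-term hit blocks delivery."""
    for name, earned in category_scores.items():
        if not 0 <= earned <= CATEGORY_MAX[name]:
            raise ValueError(f"{name}: {earned} is outside 0..{CATEGORY_MAX[name]}")
    total = sum(category_scores.values())
    # The override: a passing total is not enough if any forbidden term is present.
    passed = total >= PASS_THRESHOLD and forbidden_hits == 0
    return total, passed
```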

If score < 90:

Identify the lowest-scoring category first.
Fix the issues in that category.
Re-run the full QC.
Repeat until 90+ AND zero forbidden-term hits.
If after two revision attempts the score remains below 90, escalate to the advisor with the specific gaps flagged, rather than shipping a marginal doc.
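
The loop above can be sketched as follows. `score_doc` and `revise_doc` are hypothetical callables standing in for the operator's scoring and fixing steps, and "lowest-scoring category" is read here as the lowest share of available points, which is one possible interpretation.

```python
PASS_THRESHOLD = 90


def lowest_category(scores: dict[str, tuple[int, int]]) -> str:
    """Pick the category with the lowest earned/maximum ratio (an assumption)."""
    return min(scores, key=lambda name: scores[name][0] / scores[name][1])


def run_revision_protocol(doc, score_doc, revise_doc, max_revisions: int = 2):
    """Score, revise the weakest category, and re-score; escalate after two revisions.

    score_doc(doc) -> ({category: (earned, maximum)}, forbidden_hits)
    revise_doc(doc, category) -> revised doc
    """
    for attempt in range(max_revisions + 1):
        scores, forbidden_hits = score_doc(doc)  # always re-run the full QC
        total = sum(earned for earned, _ in scores.values())
        if total >= PASS_THRESHOLD and forbidden_hits == 0:
            return "SHIP", doc, total
        if attempt == max_revisions:
            # Still failing after the allowed revisions: hand the gaps to the advisor.
            return "ESCALATE", doc, total
        doc = revise_doc(doc, lowest_category(scores))
```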

Required-Revise Triggers (Mode 1 blocking gates — separate from the 100-point scoring):

These are conditions that produce an automatic revise regardless of how the rest of the doc looks:

SOW wasn't read. The "What's In Scope" section cannot be sourced without the SOW. If the doc shipped without SOW evidence, revise.
Triggers are generic. If checks 7-9 all fail (no concrete triggers), revise with the 10-min trigger interview before re-generating.
Scripted responses are in a generic voice, not the advisor's. If check 16 fails, revise with advisor voice calibration.
Historical section contains speculation. If check 19 fails (the section documents items that haven't happened), revise by removing them.
Any forbidden term is present. See checks 21-24.
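
These gates sit outside the point total, so they can be evaluated separately. A minimal sketch, assuming the operator records whether the SOW was read and which check numbers failed (the function name and inputs are illustrative):

```python
def required_revise_reasons(sow_read: bool, failed_checks: set[int],
                            forbidden_hits: int) -> list[str]:
    """Return the blocking gates that fired; an empty list means none did."""
    reasons = []
    if not sow_read:
        reasons.append("SOW wasn't read")
    if {7, 8, 9} <= failed_checks:  # this gate fires only when all three fail
        reasons.append("Triggers are generic")
    if 16 in failed_checks:
        reasons.append("Scripted responses are in a generic voice")
    if 19 in failed_checks:
        reasons.append("Historical section contains speculation")
    if forbidden_hits > 0 or failed_checks & {21, 22, 23, 24}:
        reasons.append("Forbidden term present")
    return reasons
```

Any non-empty result means revise, whatever the total score says.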

Common Failure Modes

Documented failure patterns and the Mode 2 or Mode 3 updates that address them. This section grows as production experience surfaces new gaps.

Failure	What Happens	How to Fix (Mode 2 or Mode 3 Update)
Vague triggers ("when the client emails")	Advisor can't use the doc in real time because the signal isn't recognizable at the moment of decision.	Mode 3 update: add topic-specificity ("when [stakeholder] emails about [specific recurring topic]"). If the trigger generation rule in file 05 keeps producing vague triggers, tighten the rule.
Scripted responses in wrong voice	Advisor won't use them — writes their own every time — which means the kit added no value. Over time, advisor stops opening the doc.	Mode 2 update: capture the advisor's actual preferred wording from a recent scope-held instance. Replace template. Also check file 02 voice requirements are being enforced in file 05.
Doc written with client-appropriate politeness	Doc reads diplomatic, loses the candid utility that makes it useful in private.	Mode 2 update: regenerate with file 02 voice requirements enforced. Also check file 01's content filtering rules are being followed.
Speculative out-of-scope items	Doc reads as catastrophizing or paranoid. Advisor stops trusting it because it doesn't match their experience with the client.	Mode 3: remove speculation. Re-anchor in documented incidents only.
Status flags not set when ACTIVE BLEED exists	Advisor doesn't triage the doc with urgency; bleed continues.	Mode 3: set status flag. Enforce via file 05 output skill logic: if logged unpaid time exceeds the threshold, flag automatically.
Doc not updated after scope-held or scope-missed event	Doc becomes stale; kit's self-improvement loop breaks.	Mode 2: add a calendar reminder pattern to the kit's handoff ("review scope-check.md before each monthly session, update after"). If this is chronic, the kit isn't being operationalized — surface in advisor review.
Cap set without a "Trigger at Cap" action	Favor hit cap, but no next step was defined — bleed resumes automatically.	Mode 2: enforce the Trigger-at-Cap column as required in the output skill.
Applied to self-facing work	Kit used on advisor's own positioning, brand, or internal productization — which is a different behavioral pattern.	Mode 2: add explicit "NOT This" in file 00 start-here (already in place). If the error keeps happening, surface in advisor review — this may signal a need for a separate self-facing production kit.
Doc contains forbidden terms	Voice inconsistency undermines the kit's structural framing. The advisor reads "boundaries" in their own doc and flinches.	Mode 2: run the forbidden-term scan (checks 21-24) before any generated doc ships. If specific forbidden terms keep appearing, tighten the rule in file 02.
Generic / client-agnostic scope-check shipped	Advisor treats the doc as a checklist, not a tool. Opens it once, never again.	Block the build via Gap Protocol (file 01). Run the 10-min interview first. Do not ship a generic doc.
Historical section shipped without evidence citations	Advisor can't verify claims in the doc, starts to doubt it.	Mode 3: add citations (session recap date, email reference, logged hours). If no evidence, remove the entry.
More than 7 triggers listed	Doc becomes overwhelming; advisor can't hold all triggers in working memory during a live moment.	Mode 3: consolidate or prioritize. Keep between 3 and 7 triggers; retire ones that haven't fired in 90 days.
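
The 90-day retirement rule in the last row can be sketched as a date comparison. The example assumes the operator keeps a last-fired date per trigger (a hypothetical record; the kit does not define one) and treats a trigger as stale only when it last fired more than 90 days ago.

```python
from datetime import date, timedelta


def triggers_to_retire(last_fired: dict[str, date], today: date,
                       max_idle_days: int = 90) -> list[str]:
    """Return trigger names that have not fired within the idle window."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(name for name, fired in last_fired.items() if fired < cutoff)


def trigger_count_ok(trigger_count: int) -> bool:
    """Return True if the doc lists between 3 and 7 triggers."""
    return 3 <= trigger_count <= 7
```

Retiring stale triggers can push the count below three, so it is worth re-checking the count afterwards.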

Self-QC Scoring Procedure (For Kit Operator)

Generate the scope-check doc (and proposal scope section if Mode 1 includes it).
Read through once, holistically. Does it feel usable in a live client moment?
Run through the 29 checks above. Score each category.
Run the forbidden-term scan (checks 21-24). Any hit = automatic revise.
Total the score. If < 90, identify the lowest category, fix, re-run.
Check the Required-Revise Triggers. Any hit = revise regardless of total score.
Present the final score alongside the doc when delivering. Include which checks lost points and why.
If below 90 after two revision attempts, escalate to the advisor with specific gaps flagged.
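
The delivery step (final score plus the checks that lost points and why) could be assembled along these lines. The tuple layout and the report wording are assumptions; the kit does not prescribe a report format.

```python
def format_score_report(results: list[tuple[int, int, int, str]]) -> str:
    """Build a short report from (check_number, earned, maximum, reason) tuples."""
    total = sum(earned for _, earned, _, _ in results)
    available = sum(maximum for _, _, maximum, _ in results)
    lines = [f"QC score: {total} / {available}"]
    for number, earned, maximum, reason in results:
        if earned < maximum:  # list only the checks that lost points
            lines.append(f"Check {number}: {earned}/{maximum} ({reason})")
    return "\n".join(lines)
```

With all 29 checks supplied, the first line reads as a score out of 100.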