Source: frameworks/kit-builder/04-kit-builder-quality.md

04 — QUALITY: Kit Builder

Pass threshold: 90 / 100

When to run: After producing any kit (Mode 1 or Mode 3), or after improving a kit (Mode 2). Run against the kit you just produced, not the deliverable the kit will eventually produce.


Completeness (20 points)

| # | Check | Points |
|---|---|---|
| 1 | All files present with correct naming: NN-[kit-name]-[file-type].md (or .html). File count matches the kit type decision (6 standard, more if justified, fewer if lightweight) | 4 |
| 2 | Directory created at content/frameworks/[kit-name]/ | 2 |
| 3 | Start-here (00) has: what this is, audience, format, operating modes, file inventory, "does NOT do" section | 4 |
| 4 | Context (01) has: inputs per mode, validation rules, priority hierarchy | 3 |
| 5 | Terminology (02) has: locked terms, forbidden terms, visual states (if applicable) | 3 |
| 6 | Golden example (03) exists as one of: fully populated deliverable, structural reference to a live example, or explicit placeholder awaiting first deployment | 4 |
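
The naming half of check 1 can be automated. The sketch below is one possible approach, not part of the kit: the function names are hypothetical, and it assumes the [file-type] segment is a lowercase, hyphenated slug (the rubric does not say so explicitly).

```python
import re


def filename_pattern(kit_name: str) -> re.Pattern:
    """Build the NN-[kit-name]-[file-type].md (or .html) pattern for one kit."""
    # Assumption: [file-type] is a lowercase, hyphenated slug such as "start-here".
    return re.compile(
        rf"^\d{{2}}-{re.escape(kit_name)}-[a-z0-9]+(?:-[a-z0-9]+)*\.(?:md|html)$"
    )


def misnamed_files(kit_name: str, filenames: list[str]) -> list[str]:
    """Return the filenames that break the naming convention, in input order."""
    pattern = filename_pattern(kit_name)
    return [name for name in filenames if not pattern.match(name)]
```

Calling `misnamed_files("kit-builder", names)` with the names in a kit directory returns an empty list when every file follows the convention. File count still needs a human decision, since it depends on the kit type.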

Quality Gate Design (15 points)

| # | Check | Points |
|---|---|---|
| 7 | Quality gate format is justified (point-scored, pass/fail, checklist, or interactive HTML; see QC Format Decision Guide in 01-context) | 3 |
| 8 | Checks organized by category (not a flat list) | 2 |
| 9 | Every check is specific and testable: no vague language like "appropriate" or "good" | 4 |
| 10 | Common failure modes section present with: what goes wrong, what happens, how to fix | 3 |
| 11 | If point-scored: point total adds to 100 with explicit pass threshold. If pass/fail: blocking failures identified. If checklist: ship criteria defined. | 3 |
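
Check 9 can get a rough first pass from a word scan. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the term set contains only the two words the rubric names, so extend it for your own kit.

```python
import re

# The rubric names "appropriate" and "good"; anything beyond these is your call.
VAGUE_TERMS = {"appropriate", "good"}


def vague_terms_in(check_text: str, vague_terms: set[str] = VAGUE_TERMS) -> list[str]:
    """Return the vague terms found in one quality check, in order of appearance."""
    words = re.findall(r"[a-z]+", check_text.lower())
    return [word for word in words if word in vague_terms]
```

A scan like this only flags candidates. It also flags checks that quote a vague term on purpose, so a person still decides whether each hit is a real problem.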

Output Skill Quality (20 points)

| # | Check | Points |
|---|---|---|
| 12 | Output skill restates scope and required inputs (standalone-readable) | 3 |
| 13 | Content rules are numbered and specific | 3 |
| 14 | For HTML outputs: reusable component templates provided. For markdown: section specifications with format requirements | 3 |
| 15 | Full template or structural skeleton provided (in output skill, not in golden example) | 4 |
| 16 | Delivery checklist at the end (pre-ship gate) | 2 |
| 17 | Production steps are in the right order: you can follow them top to bottom | 2 |
| 18 | External QC dependencies documented if applicable (copy-qc.md, sentence-editor.md, brand QC) | 3 |

Self-Improvement Loop (15 points)

| # | Check | Points |
|---|---|---|
| 19 | Start-here includes Mode 2 (Improve) as an operating mode | 3 |
| 20 | Mode 2 trigger is clear: QC failure, manual changes, or system suggestions | 3 |
| 21 | Quality gate has a "Common Failure Modes" section (even if empty initially) | 3 |
| 22 | Change log convention documented (HTML comment or markdown section) | 2 |
| 23 | Output skill includes "update the golden example" instruction for post-production changes | 2 |
| 24 | Start-here self-improvement loop references the three questions: Did I change anything? Did QC miss something? Should the kit do more? | 2 |

Consistency with Vault Conventions (15 points)

| # | Check | Points |
|---|---|---|
| 25 | File naming follows NN-[kit-name]-[file-type].md (or .html for golden examples and interactive QC) | 3 |
| 26 | Kit name is lowercase, hyphenated | 2 |
| 27 | Does not duplicate logic from existing kits; references them instead | 3 |
| 28 | Relationship to other kits documented in start-here | 2 |
| 29 | Terminology is consistent with other vault kits (same terms for same concepts) | 3 |
| 30 | Voice matches vault conventions: direct, specific, no jargon, no fluff | 2 |

Accuracy (15 points)

| # | Check | Points |
|---|---|---|
| 31 | Golden example content matches what the output skill would produce (or placeholder is justified) | 4 |
| 32 | Quality checks are actually testable against the golden example (no checks that the example would fail) | 3 |
| 33 | Context inputs match what the output skill actually needs to run | 3 |
| 34 | Terminology definitions match how terms are actually used across all kit files | 3 |
| 35 | Operating mode descriptions match what the output skill actually does | 2 |
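
The six category maximums above sum to 100, and the pass threshold is 90. The sketch below shows one way to tally a run. The function name is hypothetical, and two behaviours are assumptions: a score of exactly 90 passes, and a category left out of the input scores 0.

```python
# Category maximums as stated in the section headings of this rubric.
CATEGORY_MAX = {
    "Completeness": 20,
    "Quality Gate Design": 15,
    "Output Skill Quality": 20,
    "Self-Improvement Loop": 15,
    "Consistency with Vault Conventions": 15,
    "Accuracy": 15,
}
PASS_THRESHOLD = 90  # Assumption: the threshold is inclusive.


def score_kit(earned: dict[str, int]) -> tuple[int, bool]:
    """Sum per-category scores and compare the total with the pass threshold."""
    for category, points in earned.items():
        if category not in CATEGORY_MAX:
            raise ValueError(f"Unknown category: {category}")
        if not 0 <= points <= CATEGORY_MAX[category]:
            raise ValueError(
                f"{category}: {points} is outside 0..{CATEGORY_MAX[category]}"
            )
    # Assumption: a category missing from `earned` counts as 0 points.
    total = sum(earned.get(category, 0) for category in CATEGORY_MAX)
    return total, total >= PASS_THRESHOLD
```

For example, a kit that loses all 4 points on check 6 and all 3 points on check 18 earns 16 in Completeness and 17 in Output Skill Quality, for a total of 93, which passes.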

Common Failure Modes

| Failure | What Happens | How to Fix |
|---|---|---|
| Template disguised as golden example | File 03 has {{PLACEHOLDER}} tags instead of real content | Replace with a fully populated deliverable. Templates go in the output skill, not file 03. Exception: placeholder with structural specs is valid for kits awaiting first deployment. |
| Vague quality checks | "Tone is appropriate" instead of specific, testable criteria | Rewrite: "Zero instances of [specific forbidden terms]. Every [section type] follows [specific pattern]." |
| Output skill can't stand alone | File 05 references inputs or terms defined only in other files without restating them | Add scope section and required inputs list to file 05. It should be runnable without reading 00-02 first. |
| Missing "does NOT do" | Start-here doesn't set boundaries, so the kit gets used for things it wasn't designed for | Add explicit scope boundaries. List 3-5 things that are adjacent but not this kit's job. |
| Orphan kit | Kit has no documented relationship to other kits | Add relationship section to start-here. Every kit either derives from, references, coordinates with, or extends at least one other kit. |
| QC without failure modes | Quality gate scores outputs but doesn't document lessons learned | Add "Common Failure Modes" section, even if initially empty. This is where Mode 2 improvements land. |
| One-mode kit | Start-here defines only creation, no improvement path | Add Mode 2 (Improve) at minimum. Every kit must have a path to get better over time. |
| Forced 6-file structure | Kit has 6 files when it needs 9 (missing consultant methodology, gap protocol) or 2 (bloated with unnecessary files) | Match file count to kit type. Read existing kits of similar complexity before deciding. |
| Wrong QC format | Kit uses 100-point scoring when pass/fail with blocking failures would be more appropriate (or vice versa) | Evaluate: are some errors disqualifying regardless of score? → blocking failures. Are quality dimensions relatively weighted? → point-scored. Is the output simple? → checklist. |
| Missing external QC dependencies | Kit produces copy-heavy output but doesn't reference copy-qc.md or voice checks | Check whether the output includes prose, client-facing language, or brand voice. If yes, add external QC pass to the quality gate and output skill. |
| Shallow pattern extraction | Kit builder read 1-2 reference kits instead of all kits of similar type before building | Read all kits in the vault before producing any new kit. What looks like a universal pattern from 2 examples may be a local decision. |
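
The "template disguised as golden example" failure is easy to detect mechanically. The sketch below is a hypothetical helper, assuming placeholder tags always use the double-brace form shown in the failure description.

```python
import re

# Assumption: placeholders are double-braced names, e.g. {{PLACEHOLDER}}.
PLACEHOLDER_TAG = re.compile(r"\{\{\s*[A-Za-z0-9_ -]+\s*\}\}")


def placeholder_tags(text: str) -> list[str]:
    """Return every {{...}} tag found in the text of a golden example."""
    return PLACEHOLDER_TAG.findall(text)
```

A non-empty result on file 03 is a prompt to look closer, not an automatic fail: a kit awaiting first deployment may legitimately ship a placeholder with structural specs.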