04 — QUALITY: Kit Builder
Pass threshold: 90 / 100

When to run: After producing any kit (Mode 1 or Mode 3), or after improving a kit (Mode 2). Run against the kit you just produced, not the deliverable the kit will eventually produce.
Completeness (20 points)
| # | Check | Points |
|---|---|---|
| 1 | All files present with correct naming: NN-[kit-name]-[file-type].md (or .html). File count matches the kit type decision (6 standard, more if justified, fewer if lightweight) | 4 |
| 2 | Directory created at content/frameworks/[kit-name]/ | 2 |
| 3 | Start-here (00) has: what this is, audience, format, operating modes, file inventory, "does NOT do" section | 4 |
| 4 | Context (01) has: inputs per mode, validation rules, priority hierarchy | 3 |
| 5 | Terminology (02) has: locked terms, forbidden terms, visual states (if applicable) | 3 |
| 6 | Golden example (03) exists as one of: fully populated deliverable, structural reference to a live example, or explicit placeholder awaiting first deployment | 4 |
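
Check 6 and the "Template disguised as golden example" failure mode both depend on whether file 03 still contains unfilled template tags. A minimal sketch of that scan, assuming tags use the `{{NAME}}` form with upper-case names. The function names are hypothetical and not part of the kit.

```python
import re

# Assumed tag syntax: double curly braces around an upper-case name,
# e.g. {{PLACEHOLDER}}. Adjust the pattern if the vault uses another form.
PLACEHOLDER_TAG = re.compile(r"\{\{\s*[A-Z0-9_ ]+\s*\}\}")


def find_placeholder_tags(text: str) -> list[str]:
    """Return every unfilled template tag found in the golden example text."""
    return PLACEHOLDER_TAG.findall(text)


def looks_like_template(text: str) -> bool:
    """True when the text still contains template tags instead of real content."""
    return bool(find_placeholder_tags(text))


if __name__ == "__main__":
    sample = "Welcome, {{CLIENT_NAME}}. Your plan: {{PLAN}}."
    print(find_placeholder_tags(sample))
```

A match does not fail check 6 by itself: an explicit placeholder awaiting first deployment is still valid, so treat hits as a prompt for review rather than an automatic deduction.
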
Quality Gate Design (15 points)
| # | Check | Points |
|---|---|---|
| 7 | Quality gate format is justified (point-scored, pass/fail, checklist, or interactive HTML — see QC Format Decision Guide in 01-context) | 3 |
| 8 | Checks organized by category (not a flat list) | 2 |
| 9 | Every check is specific and testable — no vague language like "appropriate" or "good" | 4 |
| 10 | Common failure modes section present with: what goes wrong, what happens, how to fix | 3 |
| 11 | If point-scored: points sum to 100 and an explicit pass threshold is stated. If pass/fail: blocking failures identified. If checklist: ship criteria defined. | 3 |
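
Check 9 can be partly mechanized. The sketch below flags quality-gate checks that contain the two vague words named in check 9; the word list is meant to be extended per vault, and the function name is hypothetical.

```python
import re

# "appropriate" and "good" come from check 9. Any further entries are
# a local decision and should be added deliberately.
VAGUE_WORDS = ("appropriate", "good")


def find_vague_checks(checks: list[str], vague_words=VAGUE_WORDS) -> list[tuple[int, str]]:
    """Return (check number, vague word) for each check containing a vague word."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, vague_words)) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for number, text in enumerate(checks, start=1):
        match = pattern.search(text)
        if match:
            hits.append((number, match.group(1).lower()))
    return hits


if __name__ == "__main__":
    gate = ["Tone is appropriate", "Zero instances of forbidden terms"]
    print(find_vague_checks(gate))
```

A clean scan only shows the listed words are absent. It does not prove a check is testable, so the manual read for check 9 still applies.
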
Output Skill Quality (20 points)
| # | Check | Points |
|---|---|---|
| 12 | Output skill restates scope and required inputs (standalone-readable) | 3 |
| 13 | Content rules are numbered and specific | 3 |
| 14 | For HTML outputs: reusable component templates provided. For markdown: section specifications with format requirements | 3 |
| 15 | Full template or structural skeleton provided (in output skill, not in golden example) | 4 |
| 16 | Delivery checklist at the end (pre-ship gate) | 2 |
| 17 | Production steps are listed in execution order: following them top to bottom requires no backtracking | 2 |
| 18 | External QC dependencies documented if applicable (copy-qc.md, sentence-editor.md, brand QC) | 3 |
Self-Improvement Loop (15 points)
| # | Check | Points |
|---|---|---|
| 19 | Start-here includes Mode 2 (Improve) as an operating mode | 3 |
| 20 | Mode 2 triggers are named explicitly: QC failure, manual changes, or system suggestions | 3 |
| 21 | Quality gate has a "Common Failure Modes" section (even if empty initially) | 3 |
| 22 | Change log convention documented (HTML comment or markdown section) | 2 |
| 23 | Output skill includes "update the golden example" instruction for post-production changes | 2 |
| 24 | Start-here self-improvement loop references the three questions: Did I change anything? Did QC miss something? Should the kit do more? | 2 |
Consistency with Vault Conventions (15 points)
| # | Check | Points |
|---|---|---|
| 25 | File naming follows NN-[kit-name]-[file-type].md (or .html for golden examples and interactive QC) | 3 |
| 26 | Kit name is lowercase, hyphenated | 2 |
| 27 | Does not duplicate logic from existing kits — references them instead | 3 |
| 28 | Relationship to other kits documented in start-here | 2 |
| 29 | Terminology is consistent with other vault kits (same terms for same concepts) | 3 |
| 30 | Voice matches vault conventions: direct, specific, no jargon, no fluff | 2 |
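
Checks 25 and 26 reduce to pattern matching. A minimal sketch, assuming kit names and file types use only lowercase letters, digits, and single hyphens (the rubric states "lowercase, hyphenated" for the kit name only; the rest is an assumption). Function names are hypothetical.

```python
import re

# Lowercase words joined by single hyphens, e.g. "kit-builder".
SLUG = r"[a-z0-9]+(?:-[a-z0-9]+)*"
KIT_NAME = re.compile(SLUG)


def is_valid_kit_name(kit_name: str) -> bool:
    """Check 26: kit name is lowercase and hyphenated."""
    return KIT_NAME.fullmatch(kit_name) is not None


def is_valid_file_name(file_name: str, kit_name: str) -> bool:
    """Check 25: file name follows NN-[kit-name]-[file-type].md or .html."""
    if not is_valid_kit_name(kit_name):
        return False
    pattern = re.compile(
        r"\d{2}-" + re.escape(kit_name) + "-" + SLUG + r"\.(?:md|html)"
    )
    return pattern.fullmatch(file_name) is not None


if __name__ == "__main__":
    for name in ("04-kit-builder-quality.md", "4-kit-builder-quality.md"):
        print(name, is_valid_file_name(name, "kit-builder"))
```

The sketch accepts `.html` for any file type. Restricting it to golden examples and interactive QC, as check 25 describes, would need the file-type names used in the vault.
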
Accuracy (15 points)
| # | Check | Points |
|---|---|---|
| 31 | Golden example content matches what the output skill would produce (or placeholder is justified) | 4 |
| 32 | Every quality check can be run against the golden example, and the example passes each one | 3 |
| 33 | Context inputs match what the output skill actually needs to run | 3 |
| 34 | Terminology definitions match how terms are actually used across all kit files | 3 |
| 35 | Operating mode descriptions match what the output skill actually does | 2 |
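
The six categories above sum to 100 points against the pass threshold of 90. A minimal sketch of the tally, assuming a score of exactly 90 passes (the rubric does not say whether the threshold is inclusive). Function names are hypothetical.

```python
# Category maximums as listed in this quality gate.
CATEGORY_MAX = {
    "Completeness": 20,
    "Quality Gate Design": 15,
    "Output Skill Quality": 20,
    "Self-Improvement Loop": 15,
    "Consistency with Vault Conventions": 15,
    "Accuracy": 15,
}
PASS_THRESHOLD = 90  # Assumed inclusive: 90 passes, 89 does not.


def total_score(earned: dict[str, int]) -> int:
    """Sum earned points after checking each value against its category maximum."""
    for category, points in earned.items():
        if category not in CATEGORY_MAX:
            raise KeyError(f"Unknown category: {category}")
        if not 0 <= points <= CATEGORY_MAX[category]:
            raise ValueError(f"{category}: {points} is outside 0-{CATEGORY_MAX[category]}")
    return sum(earned.values())


def passes(earned: dict[str, int]) -> bool:
    """True when the kit meets the pass threshold."""
    return total_score(earned) >= PASS_THRESHOLD


if __name__ == "__main__":
    example = dict(CATEGORY_MAX, Accuracy=12)  # Full marks except Accuracy.
    print(total_score(example), passes(example))
```
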
Common Failure Modes
| Failure | What Happens | How to Fix |
|---|---|---|
| Template disguised as golden example | File 03 has {{PLACEHOLDER}} tags instead of real content | Replace with a fully populated deliverable. Templates go in the output skill, not file 03. Exception: placeholder with structural specs is valid for kits awaiting first deployment. |
| Vague quality checks | "Tone is appropriate" instead of specific, testable criteria | Rewrite: "Zero instances of [specific forbidden terms]. Every [section type] follows [specific pattern]." |
| Output skill can't stand alone | File 05 references inputs or terms defined only in other files without restating them | Add a scope section and a required inputs list to file 05. It should be runnable without reading files 00-02 first. |
| Missing "does NOT do" | Start-here doesn't set boundaries, so the kit gets used for things it wasn't designed for | Add explicit scope boundaries. List 3-5 things that are adjacent but not this kit's job. |
| Orphan kit | Kit has no documented relationship to other kits | Add relationship section to start-here. Every kit either derives from, references, coordinates with, or extends at least one other kit. |
| QC without failure modes | Quality gate scores outputs but doesn't document lessons learned | Add "Common Failure Modes" section — even if initially empty. This is where Mode 2 improvements land. |
| One-mode kit | Start-here defines only creation, no improvement path | Add Mode 2 (Improve) at minimum. Every kit must have a path to get better over time. |
| Forced 6-file structure | Kit has 6 files when it needs 9 (missing consultant methodology, gap protocol) or 2 (bloated with unnecessary files) | Match file count to kit type. Read existing kits of similar complexity before deciding. |
| Wrong QC format | Kit uses 100-point scoring when pass/fail with blocking failures would be more appropriate (or vice versa) | Evaluate: are some errors disqualifying regardless of score? → blocking failures. Are quality dimensions relatively weighted? → point-scored. Is the output simple? → checklist. |
| Missing external QC dependencies | Kit produces copy-heavy output but doesn't reference copy-qc.md or voice checks | Check whether the output includes prose, client-facing language, or brand voice. If yes, add external QC pass to the quality gate and output skill. |
| Shallow pattern extraction | Kit builder read 1-2 reference kits instead of all kits of similar type before building | Read all kits in the vault before producing any new kit. What looks like a universal pattern from 2 examples may be a local decision. |
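
The three questions in the "Wrong QC format" row can be sketched as a decision function. The order of precedence below is an assumption, as is the "undecided" fallback; the interactive HTML format from check 7 is not covered because the row gives no rule for it. The function name is hypothetical.

```python
def suggest_qc_format(
    has_disqualifying_errors: bool,
    has_weighted_dimensions: bool,
    output_is_simple: bool,
) -> str:
    """Map the three evaluation questions to a suggested QC format."""
    # Some errors disqualify regardless of score: use blocking failures.
    if has_disqualifying_errors:
        return "pass/fail"
    # Quality dimensions carry relative weights: use point scoring.
    if has_weighted_dimensions:
        return "point-scored"
    # Simple output: a checklist is enough.
    if output_is_simple:
        return "checklist"
    # None of the questions apply: consult the QC Format Decision Guide in 01-context.
    return "undecided"


if __name__ == "__main__":
    print(suggest_qc_format(False, True, False))
```

The result is a starting point for the justification that check 7 asks for, not a replacement for it.
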