# Test-Driven Implementation Plans
Implementation plans are natural language documents, structured for AI consumption, that define what to build and how to verify it. Test cases come first. There is no pseudocode. Every step is written in plain English with explicit dependencies and success criteria.
## The insight

AI models parse natural language better than pseudocode.
Pseudocode sits in an uncanny valley — it looks like code but isn’t code. When an AI reads pseudocode, it faces an ambiguity: is this meant to be implemented literally, or is it a conceptual sketch? Different models interpret this differently. The same pseudocode produces different implementations across sessions.
Natural language with clear structure removes this ambiguity entirely. “Create a function that accepts a user ID and returns the user’s profile, or throws a NotFoundError if the user does not exist” is unambiguous. The AI knows exactly what to build. The implementation details (naming, error types, return types) are decided by the AI based on the project’s existing patterns.
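For instance, an AI working in a TypeScript codebase might turn that sentence into something like the sketch below. Everything here other than the sentence itself is an assumption: the `User` shape, the in-memory `Map`, and the `NotFoundError` class stand in for whatever patterns the project already uses.

```typescript
// Illustrative sketch only: the User shape, the in-memory store, and the
// NotFoundError class are assumptions standing in for project conventions.
interface User {
  id: string;
  name: string;
}

class NotFoundError extends Error {}

const users = new Map<string, User>([
  ['user-123', { id: 'user-123', name: 'Test User' }],
]);

// "Accepts a user ID and returns the user's profile, or throws a
// NotFoundError if the user does not exist."
function getProfile(userId: string): User {
  const user = users.get(userId);
  if (user === undefined) {
    throw new NotFoundError(`No user with id ${userId}`);
  }
  return user;
}
```

The plain-English sentence fixed the contract; the naming and error type were left for the AI to decide, which is exactly the division of labour the plan format relies on.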
## Why no pseudocode

| Pseudocode problem | Natural language solution |
|---|---|
| Ambiguous syntax — is `getData()` literal or conceptual? | “Fetch the data from the database” is always conceptual |
| Implies implementation — `for i in range(n)` suggests a loop pattern | “Process each item” lets the AI choose the right pattern |
| Language-specific — pseudocode leans toward one language | Natural language is language-agnostic |
| False precision — looks exact but isn’t | Natural language is explicitly approximate |
| Stale quickly — pseudocode is harder to update than prose | Prose is easy to revise |
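To make the second row of the table concrete: given “Process each item”, an AI in a TypeScript project is free to pick the idiomatic pattern rather than transliterating an index loop. The `applyDiscount` helper and the prices here are hypothetical.

```typescript
// "Process each item" leaves the pattern to the AI; applyDiscount and the
// prices array are hypothetical illustrations.
const prices = [100, 250, 40];

function applyDiscount(price: number): number {
  return price / 2; // hypothetical 50% discount
}

// Idiomatic choice: map, rather than a literal `for i in range(n)` translation
const discounted = prices.map(applyDiscount);
```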
## Test-driven means tests come first

The implementation plan lists test cases before implementation steps. This is deliberate:
- Tests define the contract. Before writing any code, the plan establishes what “correct” means.
- AI writes better code when tests exist first. The implementation targets the test cases, not the AI’s interpretation of requirements.
- Tests are verifiable. “The function returns the user profile” is a requirement. “Calling `getProfile('user-123')` returns `{ id: 'user-123', name: 'Test User' }`” is a test case.
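The distinction matters because the test-case phrasing translates directly into an executable check. A minimal sketch, using a plain throw-based assertion; `getProfile` is stubbed here, whereas in a real project it would be imported from the module under test:

```typescript
// The test case above, written as a runnable check. This getProfile is a
// stand-in stub; the real one would come from the module under test.
function getProfile(id: string): { id: string; name: string } {
  return { id, name: 'Test User' }; // stub implementation
}

// "Calling getProfile('user-123') returns { id: 'user-123', name: 'Test User' }"
const result = getProfile('user-123');
if (result.id !== 'user-123' || result.name !== 'Test User') {
  throw new Error('TC failed: unexpected profile');
}
```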
## Structure of a good implementation plan

### 1. Goal statement
Section titled “1. Goal statement”What is being built and why. One to three sentences maximum.
```markdown
## Goal

Add rate limiting to the API gateway. Requests exceeding 100 per minute
per API key should receive a 429 response with a Retry-After header.
This prevents abuse and protects downstream services from overload.
```

### 2. Success criteria
Measurable, testable conditions. These are the acceptance criteria — when all are met, the task is done.
```markdown
## Success Criteria

- [ ] Requests within rate limit succeed normally (200)
- [ ] Request 101 within a 60-second window returns 429
- [ ] 429 response includes Retry-After header with seconds remaining
- [ ] Rate limit state persists across server restarts
- [ ] Rate limit is per API key, not per IP address
- [ ] Existing tests continue to pass
```

### 3. Test cases
Written before implementation steps. Each test case has a name, setup conditions, action, and expected result.
```markdown
## Test Cases

### TC-1: Request within limit succeeds
- Setup: Clean rate limit state
- Action: Send 1 request with valid API key
- Expected: 200 response, normal body

### TC-2: Request at exact limit succeeds
- Setup: Clean rate limit state
- Action: Send 100 requests with same API key within 60 seconds
- Expected: All return 200

### TC-3: Request exceeding limit is rejected
- Setup: Clean rate limit state
- Action: Send 101 requests with same API key within 60 seconds
- Expected: First 100 return 200, request 101 returns 429

### TC-4: Retry-After header is accurate
- Setup: Exceed rate limit at T+30s of the window
- Action: Read Retry-After header from 429 response
- Expected: Value is approximately 30 (seconds remaining in window)

### TC-5: Different API keys have independent limits
- Setup: Clean rate limit state
- Action: Send 100 requests with key-A, then 1 request with key-B
- Expected: All requests succeed (key-B has its own counter)

### TC-6: Rate limit resets after window expires
- Setup: Exceed rate limit
- Action: Wait 60 seconds, send another request
- Expected: 200 response (new window)

### TC-7: State survives server restart
- Setup: Send 50 requests, restart server
- Action: Send 51 more requests
- Expected: Request 101 overall returns 429
```

### 4. Implementation steps
Natural language, ordered, with dependencies noted. No pseudocode. Each step describes what to do, not how to do it at the code level.
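Step 2 of the example plan below, for instance, could be realized as a fixed-window counter. This sketch is one possible reading, not the plan's mandated implementation: it uses an in-memory `Map` for brevity (the plan itself requires a persistent store so state survives restarts), and the names `checkRateLimit`, `Entry`, and `store` are illustrative assumptions.

```typescript
// Illustrative fixed-window sketch of the rate-limit logic described in
// Step 2. In-memory Map for brevity; the plan requires persistent storage.
const LIMIT = 100;
const WINDOW_MS = 60_000;

interface Entry { windowStart: number; count: number; }
const store = new Map<string, Entry>(); // keyed by API key

// Returns null when the request is allowed, or the Retry-After value
// (in whole seconds) when the limit is exceeded.
function checkRateLimit(apiKey: string, now: number): number | null {
  const entry = store.get(apiKey);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    // Window expired (or no entry yet): reset the counter, start a new window
    store.set(apiKey, { windowStart: now, count: 1 });
    return null;
  }
  entry.count += 1;
  if (entry.count > LIMIT) {
    // Seconds remaining in the current window (matches TC-4)
    return Math.ceil((entry.windowStart + WINDOW_MS - now) / 1000);
  }
  return null;
}
```

Note that the sketch covers TC-4 through TC-6 directly: the Retry-After value is derived from the window start, separate keys get separate entries, and an expired window resets the counter.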
```markdown
## Implementation Steps

### Step 1: Add rate limit storage

Create a persistent store for rate limit counters. Each entry tracks
an API key, a window start timestamp, and a request count.

Depends on: nothing (new component)

### Step 2: Create rate limit middleware

Create middleware that runs before route handlers. It should:

- Extract the API key from the request header
- Look up or create the rate limit entry for that key
- Increment the counter
- If the counter exceeds 100 and the window has not expired, return 429 with a Retry-After header
- If the window has expired, reset the counter and start a new window

Depends on: Step 1

### Step 3: Register middleware on the API gateway

Add the rate limit middleware to the gateway's middleware chain,
before authentication middleware (rate limiting should apply even
to requests with invalid keys).

Depends on: Step 2

### Step 4: Write and run tests

Implement the test cases from the Test Cases section above.
Run all tests, including the existing test suite.

Depends on: Steps 1-3

### Step 5: Update API documentation

Document the rate limiting behaviour, including the 429 response
format and Retry-After header, in the API reference.

Depends on: Step 4 (confirms final behaviour)
```

### 5. Validation criteria
How to know the task is complete. This maps to the VALIDATE phase of Jimmy’s Workflow.
```markdown
## Validation

- All 7 test cases pass
- Existing test suite passes with no regressions
- Manual test: send 101 rapid requests via curl, confirm 429 on the last
- Rate limit persists after restarting the server
- Confidence: HIGH if all above pass
```

## How the AI CLI tool consumes these plans
The implementation plan is a document — typically a markdown file in the project. When the coding session begins, the AI reads the plan and executes it using Jimmy’s Workflow:
“Read the implementation plan at `./docs/plans/rate-limiting.md` and execute it using Jimmy’s Workflow”
The AI then:
- PRE-FLIGHT — Reads the plan, checks that all dependencies are available, confirms requirements are clear
- IMPLEMENT — Follows the implementation steps in order, writing tests first (from the Test Cases section), then implementation code
- VALIDATE — Runs the test suite, checks against the success criteria
- CHECKPOINT — Reports confidence level and completion status
The plan is the single source of truth. The AI does not invent requirements. It does not skip steps. It implements what the plan says.
## Why this is AI-optimised

Implementation plans structured this way map directly to Principle 5.5 (AI-Optimized Documentation):
| Principle 5.5 sub-principle | How implementation plans apply it |
|---|---|
| Structured data over prose | Sections with clear headings, lists, explicit labels |
| Explicit context | Goal statement explains what and why; dependencies are noted |
| Cause-effect relationships | “If counter exceeds 100 and window has not expired, return 429” |
| Machine-readable formats | Consistent structure: Goal, Success Criteria, Test Cases, Steps, Validation |
| Searchable content | Test case IDs (TC-1, TC-2), step numbers, clear heading hierarchy |
| Version-stamped | Plans include dates when relevant |
| Cross-referenced | Dependencies between steps are explicit |
The plan is structured data that happens to be readable by humans. It is designed to be consumed by an AI implementation agent — and it works because AI excels at following explicit, structured instructions.
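Pulled together, the recurring structure is a reusable skeleton. The section names below come from the examples earlier on this page; the comment placeholders are illustrative.

```markdown
## Goal
<!-- What is being built and why; one to three sentences -->

## Success Criteria
- [ ] <!-- measurable, testable condition -->

## Test Cases
### TC-1: <name>
- Setup: ...
- Action: ...
- Expected: ...

## Implementation Steps
### Step 1: <name>
<!-- what to do, in natural language; no pseudocode -->
Depends on: <!-- prior steps, or "nothing" -->

## Validation
- <!-- how to confirm completion; maps to the VALIDATE phase -->
```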
## Common mistakes

| Mistake | Why it fails | Fix |
|---|---|---|
| Writing implementation steps before test cases | AI implements to the steps, not to the tests — no verification contract | Always write test cases first |
| Using pseudocode in steps | Ambiguous — AI doesn’t know if it’s literal or conceptual | Use natural language exclusively |
| Vague success criteria (“it should work”) | AI declares success prematurely | Measurable, testable conditions only |
| Skipping dependency notes | AI implements steps out of order or misses prerequisites | Note dependencies on every step |
| Over-specifying implementation details | Constrains AI from using project-appropriate patterns | Describe what, not how |
| Combining multiple features in one plan | Harder to validate, harder to checkpoint | One feature per plan |
## Further reading

- Documentation-First Development — Why documentation comes before code
- Jimmy’s Workflow v2.1 — The validation system that executes these plans
- 11 Core Principles — Principle 5.5 (AI-Optimized Documentation)
- Documentation Standards — The 7 principles for structured documentation