
# Test-Driven Implementation Plans

Implementation plans are natural language documents, structured for AI consumption, that define what to build and how to verify it. Test cases come first. There is no pseudocode. Every step is written in plain English with explicit dependencies and success criteria.

AI models parse natural language better than pseudocode.

Pseudocode sits in an uncanny valley — it looks like code but isn’t code. When an AI reads pseudocode, it faces an ambiguity: is this meant to be implemented literally, or is it a conceptual sketch? Different models interpret this differently. The same pseudocode produces different implementations across sessions.

Natural language with clear structure removes this ambiguity entirely. “Create a function that accepts a user ID and returns the user’s profile, or throws a NotFoundError if the user does not exist” is unambiguous. The AI knows exactly what to build. The implementation details (naming, error types, return types) are decided by the AI based on the project’s existing patterns.
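For illustration, here is one implementation an AI might produce from that sentence. Every name here (`getProfile`, `NotFoundError`, the in-memory `users` map) is hypothetical, chosen by the implementer to match the project's patterns rather than dictated by the plan:

```typescript
// Hypothetical error type and data source; a real project would use its own.
class NotFoundError extends Error {}

interface UserProfile {
  id: string;
  name: string;
}

const users = new Map<string, UserProfile>([
  ["user-123", { id: "user-123", name: "Test User" }],
]);

// "Accepts a user ID and returns the user's profile,
// or throws a NotFoundError if the user does not exist."
function getProfile(userId: string): UserProfile {
  const profile = users.get(userId);
  if (!profile) throw new NotFoundError(`No user with id ${userId}`);
  return profile;
}
```

The natural language spec fixed the contract; the code-level decisions stayed with the implementer.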

| Pseudocode problem | Natural language solution |
| --- | --- |
| Ambiguous syntax: is `getData()` literal or conceptual? | "Fetch the data from the database" is always conceptual |
| Implies implementation: `for i in range(n)` suggests a loop pattern | "Process each item" lets the AI choose the right pattern |
| Language-specific: pseudocode leans toward one language | Natural language is language-agnostic |
| False precision: looks exact but isn't | Natural language is explicitly approximate |
| Stale quickly: pseudocode is harder to update than prose | Prose is easy to revise |

The implementation plan lists test cases before implementation steps. This is deliberate:

  1. Tests define the contract. Before writing any code, the plan establishes what “correct” means.
  2. AI writes better code when tests exist first. The implementation targets the test cases, not the AI’s interpretation of requirements.
  3. Tests are verifiable. “The function returns the user profile” is a requirement. “Calling getProfile('user-123') returns { id: 'user-123', name: 'Test User' }” is a test case.

The goal states what is being built and why, in one to three sentences maximum.

```markdown
## Goal
Add rate limiting to the API gateway. Requests exceeding 100 per minute
per API key should receive a 429 response with a Retry-After header.
This prevents abuse and protects downstream services from overload.
```

Success criteria are measurable, testable conditions. They are the acceptance criteria: when all are met, the task is done.

```markdown
## Success Criteria
- [ ] Requests within rate limit succeed normally (200)
- [ ] Request 101 within a 60-second window returns 429
- [ ] 429 response includes Retry-After header with seconds remaining
- [ ] Rate limit state persists across server restarts
- [ ] Rate limit is per API key, not per IP address
- [ ] Existing tests continue to pass
```

Test cases are written before the implementation steps. Each test case has a name, setup conditions, an action, and an expected result.

```markdown
## Test Cases

### TC-1: Request within limit succeeds
- Setup: Clean rate limit state
- Action: Send 1 request with valid API key
- Expected: 200 response, normal body

### TC-2: Request at exact limit succeeds
- Setup: Clean rate limit state
- Action: Send 100 requests with same API key within 60 seconds
- Expected: All return 200

### TC-3: Request exceeding limit is rejected
- Setup: Clean rate limit state
- Action: Send 101 requests with same API key within 60 seconds
- Expected: First 100 return 200, request 101 returns 429

### TC-4: Retry-After header is accurate
- Setup: Exceed rate limit at T+30s of the window
- Action: Read Retry-After header from 429 response
- Expected: Value is approximately 30 (seconds remaining in window)

### TC-5: Different API keys have independent limits
- Setup: Clean rate limit state
- Action: Send 100 requests with key-A, then 1 request with key-B
- Expected: All requests succeed (key-B has its own counter)

### TC-6: Rate limit resets after window expires
- Setup: Exceed rate limit
- Action: Wait 60 seconds, send another request
- Expected: 200 response (new window)

### TC-7: State survives server restart
- Setup: Send 50 requests, restart server
- Action: Send 51 more requests
- Expected: Request 101 overall returns 429
```

Implementation steps are natural language, ordered, with dependencies noted. No pseudocode. Each step describes what to do, not how to do it at the code level.

```markdown
## Implementation Steps

### Step 1: Add rate limit storage
Create a persistent store for rate limit counters. Each entry tracks
an API key, a window start timestamp, and a request count.
Depends on: nothing (new component)

### Step 2: Create rate limit middleware
Create middleware that runs before route handlers. It should:
- Extract the API key from the request header
- Look up or create the rate limit entry for that key
- Increment the counter
- If the counter exceeds 100 and the window has not expired, return 429
  with a Retry-After header
- If the window has expired, reset the counter and start a new window
Depends on: Step 1

### Step 3: Register middleware on the API gateway
Add the rate limit middleware to the gateway's middleware chain,
before authentication middleware (rate limiting should apply even
to requests with invalid keys).
Depends on: Step 2

### Step 4: Write and run tests
Implement the test cases from the Test Cases section above.
Run all tests including existing test suite.
Depends on: Steps 1-3

### Step 5: Update API documentation
Document the rate limiting behaviour, including the 429 response
format and Retry-After header, in the API reference.
Depends on: Step 4 (confirms final behaviour)
```
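Note what the plan leaves open. Step 2 says what the middleware decides, not how. As a sketch of the core logic an implementer might arrive at (the fixed-window strategy, the `Map` standing in for the persistent store of Step 1, and all names here are assumptions, not part of the plan):

```typescript
// Fixed-window counter per API key. In Step 1 this would live in a
// persistent store; an in-memory Map stands in for illustration.
interface WindowEntry {
  windowStart: number; // epoch seconds
  count: number;
}

const LIMIT = 100;
const WINDOW_SECONDS = 60;
const store = new Map<string, WindowEntry>();

// Returns { allowed: true }, or { allowed: false, retryAfter } where
// retryAfter is the seconds remaining in the current window (TC-4).
function checkRateLimit(
  apiKey: string,
  now: number, // epoch seconds
): { allowed: true } | { allowed: false; retryAfter: number } {
  let entry = store.get(apiKey);
  // If no entry exists or the window has expired, start a new window.
  if (!entry || now - entry.windowStart >= WINDOW_SECONDS) {
    entry = { windowStart: now, count: 0 };
    store.set(apiKey, entry);
  }
  entry.count += 1;
  if (entry.count > LIMIT) {
    return {
      allowed: false,
      retryAfter: entry.windowStart + WINDOW_SECONDS - now,
    };
  }
  return { allowed: true };
}
```

A real middleware would read the API key from the request header and translate `allowed: false` into a 429 response with the `Retry-After` header; persistence (Step 1) and registration order (Step 3) live elsewhere. The plan constrains the observable behaviour, and the test cases verify it.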

Validation describes how to know the task is complete. This maps to the VALIDATE phase of Jimmy’s Workflow.

```markdown
## Validation
- All 7 test cases pass
- Existing test suite passes with no regressions
- Manual test: send 101 rapid requests via curl, confirm 429 on the last
- Rate limit persists after restarting the server
- Confidence: HIGH if all above pass
```

The implementation plan is a document — typically a markdown file in the project. When the coding session begins, the AI reads the plan and executes it using Jimmy’s Workflow:

“Read the implementation plan at ./docs/plans/rate-limiting.md and execute it using Jimmy’s Workflow”

The AI then:

  1. PRE-FLIGHT — Reads the plan, checks that all dependencies are available, confirms requirements are clear
  2. IMPLEMENT — Follows the implementation steps in order, writing tests first (from the Test Cases section), then implementation code
  3. VALIDATE — Runs the test suite, checks against the success criteria
  4. CHECKPOINT — Reports confidence level and completion status

The plan is the single source of truth. The AI does not invent requirements. It does not skip steps. It implements what the plan says.

Implementation plans structured this way map directly to Principle 5.5 (AI-Optimized Documentation):

| Principle 5.5 sub-principle | How implementation plans apply it |
| --- | --- |
| Structured data over prose | Sections with clear headings, lists, explicit labels |
| Explicit context | Goal statement explains what and why; dependencies are noted |
| Cause-effect relationships | "If counter exceeds 100 and window has not expired, return 429" |
| Machine-readable formats | Consistent structure: Goal, Success Criteria, Test Cases, Steps, Validation |
| Searchable content | Test case IDs (TC-1, TC-2), step numbers, clear heading hierarchy |
| Version-stamped | Plans include dates when relevant |
| Cross-referenced | Dependencies between steps are explicit |

The plan is structured data that happens to be readable by humans. It is designed to be consumed by an AI implementation agent — and it works because AI excels at following explicit, structured instructions.

| Mistake | Why it fails | Fix |
| --- | --- | --- |
| Writing implementation steps before test cases | AI implements to the steps, not to the tests; no verification contract | Always write test cases first |
| Using pseudocode in steps | Ambiguous: AI doesn't know if it's literal or conceptual | Use natural language exclusively |
| Vague success criteria ("it should work") | AI declares success prematurely | Measurable, testable conditions only |
| Skipping dependency notes | AI implements steps out of order or misses prerequisites | Note dependencies on every step |
| Over-specifying implementation details | Constrains the AI from using project-appropriate patterns | Describe what, not how |
| Combining multiple features in one plan | Harder to validate, harder to checkpoint | One feature per plan |