Research

Research findings from real-world production testing — not theoretical benchmarks, not synthetic evaluations. Every finding in this section was validated against a production database of 3,833 articles across 29 RSS sources during October 2025.

| Finding | Impact | Confidence |
| --- | --- | --- |
| Structured workflow eliminates the quality gap between model tiers | Use cheaper models with workflow, save 67% on API costs | HIGH — production validated |
| Orchestrator + specialist pattern reduces cost 40-60% | Multi-model architecture with parallel execution | HIGH — production validated |
| Different AI architectures catch different blind spots | Use multiple AI systems for review, not just one | MEDIUM — observed pattern |
| Page | Summary |
| --- | --- |
| Haiku 4.5 Findings | How smaller models match or exceed larger model quality when given explicit workflow structure. 1.8x faster, 67% cheaper, 5% better quality. |
| Orchestrator Pattern | The Orchestrator + Specialist architecture for cost-effective multi-model AI development. 50%+ cost reduction with no quality loss. |
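
The Orchestrator + Specialist pattern can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the model names, per-call prices, and the `call_model` stub are all assumptions, and the "saving" figure only shows how the arithmetic behind a 50%+ reduction works out.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-call prices for illustration only; real prices vary
# by provider, model version, and token counts.
PRICE_PER_CALL = {"sonnet": 0.030, "haiku": 0.010}

def call_model(model: str, prompt: str) -> str:
    # Stand-in for a real API call; returns a placeholder result.
    return f"{model}:{prompt}"

def orchestrate(task: str, subtasks: list[str]) -> dict:
    """Orchestrator + specialist: one larger model plans the work,
    then cheaper specialist models execute subtasks in parallel."""
    plan = call_model("sonnet", f"plan:{task}")  # single orchestrator call
    with ThreadPoolExecutor() as pool:           # specialists run in parallel
        results = list(pool.map(lambda s: call_model("haiku", s), subtasks))
    cost = PRICE_PER_CALL["sonnet"] + len(subtasks) * PRICE_PER_CALL["haiku"]
    baseline = (1 + len(subtasks)) * PRICE_PER_CALL["sonnet"]  # all-Sonnet
    return {"plan": plan, "results": results, "saving": 1 - cost / baseline}

out = orchestrate("summarise feed", ["extract", "classify", "tag"])
```

With three specialist subtasks at these assumed prices, the mix costs half of an all-large-model baseline, which is where a 50%+ reduction can come from without touching quality on the subtasks the cheaper model handles well.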

All findings were produced through comparative analysis under controlled conditions:

  • Control variable: Jimmy’s Workflow v2.1 as the structured workflow system
  • Test environment: A content processing platform with a production database (3,833 articles, 29 RSS sources)
  • Comparison method: Same tasks executed by different model tiers, with and without structured workflow, measuring speed, cost, quality, and reliability
  • Quality assessment: Scored on a 1-5 scale across multiple dimensions (accuracy, completeness, workflow compliance)
  • Models tested: Claude Haiku 4.5, Claude Sonnet 4.5, Gemini Pro
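
The comparison method above can be sketched as a small scoring harness. The dimension names come from the rubric described here; the configuration labels and example scores are purely illustrative, not the evaluation code or data actually used.

```python
from statistics import mean

# Rubric dimensions from the methodology above, each scored 1-5.
DIMENSIONS = ("accuracy", "completeness", "workflow_compliance")

def score_run(scores: dict) -> float:
    """Average one run's 1-5 scores across the rubric dimensions."""
    assert set(scores) == set(DIMENSIONS)
    assert all(1 <= v <= 5 for v in scores.values())
    return mean(scores.values())

def compare(runs: dict) -> dict:
    """Mean quality per configuration (e.g. model tier x workflow on/off)."""
    return {cfg: mean(score_run(r) for r in rs) for cfg, rs in runs.items()}

# Hypothetical example scores, for illustration only.
results = compare({
    "haiku+workflow": [
        {"accuracy": 5, "completeness": 4, "workflow_compliance": 5},
    ],
    "sonnet-baseline": [
        {"accuracy": 4, "completeness": 4, "workflow_compliance": 4},
    ],
})
```

Averaging per-dimension scores like this is what lets the same task, run under different configurations, be compared on a single quality number alongside speed and cost.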

The research question was straightforward: does explicit workflow structure change the cost-quality equation for AI model selection?

The answer was yes — decisively.

These findings should be interpreted with the following constraints in mind:

  • Small sample sizes — The Haiku findings are based on a 7-query test suite. Results are directionally strong but not statistically rigorous at scale.
  • Single domain — All testing was performed on content processing tasks (article analysis, metadata extraction, classification). Results may not generalise to other domains such as code generation or creative writing.
  • Specific model versions — Tested against Claude Haiku 4.5 and Claude Sonnet 4.5 as available in October 2025. Model capabilities change with updates.
  • Quality assessment subjectivity — Quality scores were assigned by a single evaluator. No inter-rater reliability was established.
  • Workflow-specific — Results depend on Jimmy’s Workflow as the structured system. Other workflow frameworks may produce different results.

These are production observations, not peer-reviewed research. They are useful for informing architecture decisions, not for making universal claims about model capability.