Audit MAP Methodology

The Audit MAP Execution Patterns document contains operational knowledge from running 4 real source code audits across 4 risk domains (health, children’s education, financial, cryptographic). It fills the gap between designing an audit and executing one well.

Every audit surfaced execution problems that the design methodology didn’t anticipate:

| Problem discovered | Which audit | Fix |
| --- | --- | --- |
| 22+ file paths wrong in pass 1 | Crypto protocol audit | Recon-first session architecture |
| Context window exhausted on 8 lenses | Crypto protocol audit | Multi-pass execution |
| Validation treated as checkbox theater | Financial audit | Validation must quote evidence |
| Inherited findings couldn’t be re-verified | Crypto + Financial | Carry-forward confidence penalties |
| Combined findings worse than individual | Financial audit | Attack chain analysis |
| Context compaction lost earlier reasoning | Crypto + Financial | Compaction recovery protocol |
| All confidence levels clustered at HIGH | Multiple audits | Confidence calibration field rules |
| Severity assumed public deployment | Financial audit | Deployment context modifiers |

Before any audit pass executes, Session 0 (recon) reads the actual codebase and updates the pass MAPs in place with verified information.

Session 0: RECON
├── Discover available tools (MCP servers, codebase access)
├── Read actual file structure
├── Get real dependency versions from lockfiles
├── Verify each file path in pass MAPs
├── UPDATE pass MAPs in place with correct details
└── Output: pass MAPs are verified, not templates

Recon does NOT execute lenses, file findings, or make audit judgments. It’s infrastructure.

Why separate: If recon and Pass 1 share a session, path discovery consumes context that Pass 1 needs for deep analysis.
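
A minimal sketch of the path-verification step, assuming a pass MAP lists the files each lens will read. The `PassMap` shape and `verifyPassMap` helper are illustrative, not part of the methodology:

```ts
import { existsSync } from "node:fs";
import { join } from "node:path";

// Hypothetical shape of a pass MAP entry: each lens lists the files it will read.
interface PassMap {
  pass: number;
  lenses: { name: string; files: string[] }[];
}

// Recon step: verify every path before any audit pass runs.
// Returns the broken paths so the MAP can be corrected in place.
function verifyPassMap(map: PassMap, repoRoot: string): string[] {
  const broken: string[] = [];
  for (const lens of map.lenses) {
    for (const file of lens.files) {
      if (!existsSync(join(repoRoot, file))) {
        broken.push(`${lens.name}: ${file}`);
      }
    }
  }
  return broken;
}
```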

Whether an audit needs multiple passes depends on lens count:

| Lens count | Decision |
| --- | --- |
| 1-5 | Single pass |
| 6 | Consider multi-pass |
| 7+ | Multi-pass recommended |
| 8+ | Multi-pass mandatory |

Each pass is self-contained with its own pre-flight, Finding Contract, validation, and checkpoint. The agent executing Pass 2 optionally reads the Pass 1 checkpoint; it never reads the Pass 1 MAP.

Session 0: Recon (verify paths, discover tools)
Session 1: Pass 1 (execute lenses, write checkpoint)
Session 2: Pass 2 (execute lenses, write checkpoint)
Session N: Synthesis (read all checkpoints, produce final report)

When grouping lenses into passes:

  • Group tightly-coupled lenses (crypto + zero-knowledge share source files)
  • Separate distinct attack surfaces
  • Put regression checks in the last pass
  • 2-3 lenses per pass is ideal; 4 is the maximum
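
For illustration, a pass plan for an 8-lens audit following these heuristics might look like the sketch below (the lens names are invented):

```ts
// Hypothetical pass plan for an 8-lens audit.
// Tightly-coupled lenses share a pass; regression checks go last.
const passPlan = [
  { pass: 1, lenses: ["crypto-primitives", "zero-knowledge"] }, // share source files
  { pass: 2, lenses: ["auth", "session-management", "input-validation"] },
  { pass: 3, lenses: ["key-storage", "logging"] },
  { pass: 4, lenses: ["regression-checks"] },                   // always the last pass
];

// Sanity check against the sizing heuristic: 2-3 lenses per pass, 4 maximum.
for (const p of passPlan) {
  if (p.lenses.length > 4) throw new Error(`Pass ${p.pass} exceeds the 4-lens maximum`);
}
```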

Every finding must include these 10 fields:

| Field | Requirement |
| --- | --- |
| id | Unique (e.g., SEC-001) |
| lens | Which lens produced it |
| decision | What is wrong |
| severity | CRITICAL / HIGH / MEDIUM / LOW |
| confidence | HIGH / MEDIUM / LOW |
| reasoning | Minimum 2 points |
| alternatives_rejected | Minimum 1 |
| weaknesses_acknowledged | Minimum 1 |
| evidence | File, line, code snippet |
| remediation_hint | Direction for fix |

See also: JSON Sidecar Pattern for machine-readable output of findings.
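
A minimal sketch of the contract as a TypeScript type, assuming the sidecar uses the field names from the table verbatim (the exact schema may differ):

```ts
// Sketch of the 10-field Finding Contract; field names follow the table above.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";
type Confidence = "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  id: string;                        // unique, e.g. "SEC-001"
  lens: string;                      // which lens produced it
  decision: string;                  // what is wrong
  severity: Severity;
  confidence: Confidence;
  reasoning: string[];               // minimum 2 points
  alternatives_rejected: string[];   // minimum 1
  weaknesses_acknowledged: string[]; // minimum 1
  evidence: { file: string; line: number; snippet: string };
  remediation_hint: string;          // direction for fix
}
```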

When a later pass references findings from an earlier pass:

| Original confidence | Inherited without re-verification |
| --- | --- |
| HIGH | MEDIUM |
| MEDIUM | LOW |
| LOW | stays LOW |

Tag inherited findings: “Inherited from Pass N — not re-verified in this session.”

Never copy finding text as evidence. If carrying forward, the evidence must be freshly gathered.
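
The penalty table reduces to a one-level downgrade with a LOW floor; a minimal sketch:

```ts
type Confidence = "HIGH" | "MEDIUM" | "LOW";

// Carry-forward penalty: one level down per inheritance, never below LOW.
function inheritConfidence(original: Confidence): Confidence {
  switch (original) {
    case "HIGH":   return "MEDIUM";
    case "MEDIUM": return "LOW";
    case "LOW":    return "LOW"; // stays LOW
  }
}
```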

Validation must quote evidence, not tick checkboxes.

Bad (theater):

All findings have >=2 reasoning points — PASS

Good (actual gate):

All findings have >=2 reasoning points — PASS
Spot-checked: SEC-003 has 3 reasoning points.
VAULT-001 has 2 (minimum). CHAIN-002 has 4.
Lowest count: 2. Contract met.
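
One way to make this gate mechanical rather than a checkbox is to have validation compute and quote the actual counts. A sketch, using a hypothetical helper over findings shaped like the contract above:

```ts
// Evidence-quoting validation gate: report actual counts, not a bare PASS.
interface FindingLike { id: string; reasoning: string[]; }

function validateReasoning(findings: FindingLike[]): string {
  const counts = findings.map(f => ({ id: f.id, n: f.reasoning.length }));
  const lowest = Math.min(...counts.map(c => c.n));
  const detail = counts.map(c => `${c.id} has ${c.n} reasoning points`).join(". ");
  const verdict = lowest >= 2 ? "Contract met." : "Contract VIOLATED.";
  return `${detail}. Lowest count: ${lowest}. ${verdict}`;
}
```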

Calibration red flags to check during validation:

| Condition | Action |
| --- | --- |
| All findings same confidence level | STOP — recalibrate |
| HIGH confidence > 50% | Suspicious — re-examine each |
| Zero positive observations | Suspicious — look for what’s done right too |
| Dynamic tests available but not run | Quality gap — must note |
The confidence calibration field rules:

  1. Domain knowledge gaps cap at MEDIUM — if you lack domain knowledge for a technology, findings against it cap at MEDIUM
  2. Depth budget scales with priority — CRITICAL lenses must exhaust verification paths before settling on MEDIUM
  3. Dual verification for HIGH — requires both code evidence AND API/behaviour verification
  4. Dynamic tests upgrade confidence — grep, cargo test, and npm audit results can upgrade specific findings
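
The two distribution red flags from the table above are easy to check mechanically; a sketch:

```ts
// Red-flag checks on the confidence distribution of a finished pass.
interface Calibrated { confidence: "HIGH" | "MEDIUM" | "LOW"; }

function calibrationWarnings(findings: Calibrated[]): string[] {
  const warnings: string[] = [];
  if (new Set(findings.map(f => f.confidence)).size === 1) {
    warnings.push("All findings share one confidence level: STOP and recalibrate.");
  }
  const high = findings.filter(f => f.confidence === "HIGH").length;
  if (high > findings.length / 2) {
    warnings.push("Over 50% HIGH confidence: re-examine each HIGH finding.");
  }
  return warnings;
}
```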
Deployment context modifiers (applied to severity):

| Deployment | Modifier |
| --- | --- |
| Public internet, production | No modifier (baseline) |
| Private network | Consider -1 for network-exposure findings |
| Preprod / staging | Consider -1 for operational findings |
| Air-gapped | -1 for all network-exposure findings |
| Multi-tenant production | Consider +1 for isolation findings |

Never apply downward modifiers to: auth bypass, crypto correctness, data integrity, design flaws.
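
A sketch of the guard rule, assuming a modifier is a one-level shift on the severity ladder (the category names are illustrative):

```ts
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";
const LADDER: Severity[] = ["LOW", "MEDIUM", "HIGH", "CRITICAL"];

// Categories that must never be downgraded, whatever the deployment.
const NEVER_DOWNGRADE = new Set([
  "auth-bypass", "crypto-correctness", "data-integrity", "design-flaw",
]);

// Apply a deployment modifier (-1, 0, or +1 severity steps) with the guard.
function applyModifier(severity: Severity, category: string, modifier: -1 | 0 | 1): Severity {
  if (modifier < 0 && NEVER_DOWNGRADE.has(category)) return severity; // guard holds
  const i = LADDER.indexOf(severity) + modifier;
  return LADDER[Math.min(LADDER.length - 1, Math.max(0, i))];
}
```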

After all lenses complete, scan for combinations:

| Pattern | Example |
| --- | --- |
| Auth bypass + privileged action | Anyone can trigger destructive operations |
| Input validation gap + dangerous sink | Injection vulnerability |
| Key exposure + encrypted data | Plaintext recovery |
| State inconsistency + financial action | Double-spend |

Attack chains go in synthesis. They reference findings by ID — individual findings don’t change.
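
A sketch of how a chain might be recorded in synthesis, referencing findings by ID only (the shape and IDs are assumptions, not from the methodology):

```ts
// Attack chains live in synthesis and point at findings by ID;
// the referenced findings themselves are never modified.
interface AttackChain {
  id: string;               // e.g. "CHAIN-001"
  findings: string[];       // IDs of the combined findings
  combined_severity: string;
  narrative: string;        // how the pieces compose into an attack
}

const doubleSpend: AttackChain = {
  id: "CHAIN-001",
  findings: ["STATE-004", "PAY-002"], // illustrative IDs
  combined_severity: "CRITICAL",
  narrative: "State inconsistency plus a financial action enables a double-spend.",
};
```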

The methodology includes severity uplift tables for:

  • Children’s systems — COPPA, parental consent, audio recording
  • Financial systems — Payment integrity, blockchain irrecoverability, webhook trust
  • Cryptographic protocols — Nonce reuse, key material, timing side-channels, human cryptographer required
  • Health systems — GDPR special category, consent, biometric data

The complete execution patterns document (800+ lines with all tables, examples, and the quality checklist for MAP designers) is at audit-map-execution-patterns.md.