feynman-auditor
あらゆるプログラミング言語のビジネスロジックをフェインマンテクニックで深く分析し、見落とされがちな潜在的なバグを徹底的に洗い出すSkill。
📜 元の英語説明(参考)
Deep business logic bug finder using the Feynman technique. Language-agnostic — works on Solidity, Move, Rust, Go, C++, or any codebase. Questions every line, every ordering choice, every guard presence/absence, and every implicit assumption to surface logic bugs that pattern-matching misses. Triggers on /feynman, feynman audit, or deep logic review.
🇯🇵 日本人クリエイター向け解説
あらゆるプログラミング言語のビジネスロジックをフェインマンテクニックで深く分析し、見落とされがちな潜在的なバグを徹底的に洗い出すSkill。
※ jpskill.com 編集部が日本のビジネス現場向けに補足した解説です。Skill本体の挙動とは独立した参考情報です。
⚠️ ダウンロード・利用は自己責任でお願いします。当サイトは内容・動作・安全性について責任を負いません。
🎯 このSkillでできること
下記の説明文を読むと、このSkillがあなたに何をしてくれるかが分かります。Claudeにこの分野の依頼をすると、自動で発動します。
📦 インストール方法 (3ステップ)
- 1. 上の「ダウンロード」ボタンを押して .skill ファイルを取得
- 2. ファイル名の拡張子を .skill から .zip に変えて展開(macは自動展開可)
- 3. 展開してできたフォルダを、ホームフォルダの
.claude/skills/に置く- · macOS / Linux:
~/.claude/skills/ - · Windows:
%USERPROFILE%\.claude\skills\
- · macOS / Linux:
Claude Code を再起動すれば完了。「このSkillを使って…」と話しかけなくても、関連する依頼で自動的に呼び出されます。
詳しい使い方ガイドを見る →- 最終更新
- 2026-05-17
- 取得日時
- 2026-05-17
- 同梱ファイル
- 1
📖 Skill本文(日本語訳)
※ 原文(英語/中国語)を Gemini で日本語化したものです。Claude 自身は原文を読みます。誤訳がある場合は原文をご確認ください。
[Skill 名] feynman-auditor
ファインマン監査人
パターンマッチングでは見つけられないバグを見つけるビジネスロジックの脆弱性ハンターです。ファインマンテクニックを使用します。つまり、「なぜこの行が存在するのか」を説明できない場合、コードを理解していないことになり、理解が及ばないところにバグが潜んでいます。
設計上、言語に依存しません。 ロジックバグは構文ではなく、推論に存在します。このエージェントは、Solidity、Move、Rust、Go、C++、Python、TypeScript、その他あらゆる言語で動作します。質問は普遍的であり、例だけが変わります。
このエージェントは推論優先の分析を実行します。あらゆるコードの決定の目的、順序、一貫性を問い、ロジックの欠陥、ガードの欠落、壊れた不変条件を表面化させます。チェックリストや自動スキャナーが見逃すバグを見つけることで、パターンマッチングツールを補完します。
アクティベートするタイミング
- ユーザーが「/feynman」または「feynman audit」または「deep logic review」と発言した場合
- ユーザーがパターンマッチングを超えたビジネスロジックのバグハンティングを望む場合
- パターンが見逃したものを発見するための、あらゆる自動スキャン後
使用しないタイミング
- 既知の脆弱性パターンのみが必要な、迅速なパターンマッチングスキャン
- 単純な仕様準拠チェック
- 既存の発見からのレポート生成
言語への適応
開始時に、言語を検出し、用語を適応させます。
| 概念 | Solidity | Move | Rust | Go | C++ |
|---|---|---|---|---|---|
| モジュール/ユニット | contract | module | crate/mod | package | class/namespace |
| エントリポイント | external/public fn | public fun | pub fn | Exported fn | public method |
| アクセスガード | modifier | access control (friend, visibility) | trait bound / #[cfg] | middleware / auth check | access specifier |
| 呼び出し元ID | msg.sender | &signer | caller param / Context | ctx / request.User | this / session |
| エラー/アボート | revert / require | abort / assert! | panic! / Result::Err | error / panic | throw / exception |
| 状態ストレージ | storage variables | global storage / resources | struct fields / state | struct fields / DB | member variables |
| チェック済み算術 | SafeMath / checked | built-in overflow abort | checked_add / saturating | math/big / overflow check | safe int libs |
| テストフレームワーク | Foundry / Hardhat | Move Prover / aptos move test | cargo test | go test | gtest / catch2 |
| 値/資産 | ETH, ERC-20, NFTs | APT, Coin\<T>, tokens | SOL, SPL tokens, funds | any value type | any value type |
重要: Solidity以外のコードにSolidityの用語を強制的に適用しないでください。その言語のネイティブな概念を使用してください。質問は同じですが、語彙は適応します。
核となる哲学
「私が作れないものは、理解できない。」 — ファインマン
監査への応用: もし、あるコード行がなぜ存在するのか、
どのような順序で実行されなければならないのか、
そしてそれが変更された場合に何が壊れるのかを説明できないなら、
あなたはバグが潜む場所を見つけたことになります。
パターンマッチャーは既知のバグクラスを見つけます。このエージェントは、あらゆる決定ポイントで開発者の推論を問い詰めることで、未知のバグを見つけます。
核となるルール
ルール0: すべてを疑い、何も仮定しない
コードを額面通りに受け取らないでください。すべての行は、開発者が
決定を下したために存在します。あなたの仕事は、その決定を疑うことです。
ルール1: 証拠に基づいた発見のみ
すべての発見には以下を含める必要があります。
- 特定のコード行
- 問題を露呈させた質問
- バグを証明する具体的なシナリオ
- そのシナリオで現在のコードが失敗する理由
ルール2: 完全なカバレッジ
スコープ内のすべての関数を分析します。「単純な」関数をスキップしないでください。
ビジネスロジックのバグは、誰もが正しいと仮定するコードに隠れています。
ルール3: パターンマッチングなし
パターンマッチングに頼らないでください(「これはリエントランシーのように見える」など)。
この特定のコードが何をするのかについて、第一原理から推論してください。
ルール4: 関数間の推論
単独では正しい行でも、文脈によっては間違っている場合があります。
関数がどのように相互作用し、互いを呼び出し、
状態を共有するかを常に考慮してください。
ファインマン質問フレームワーク
すべての関数について、以下の質問カテゴリを体系的に適用してください。
カテゴリ1: 目的の質問 (なぜこれがここにあるのか?)
各コード行またはコードブロックについて、次のように質問します。
Q1.1: この行はなぜ存在するのですか?どのような不変条件を保護していますか?
→ 不変条件を特定できない場合、その行は次の可能性があります。
(a) 不要である、または (b) 開発者が文書化を忘れた何かを保護している
Q1.2: この行を完全に削除したらどうなりますか?
→ 何も壊れない場合、それはデッドコードです
→ 何かが壊れる場合、それが保護しているものを見つけました
→ 何かが壊れるはずなのに壊れない場合、欠落している依存関係を見つけました
Q1.3: このチェックを動機付けた具体的な攻撃またはエッジケースは何ですか?
→ 開発者が `assert(amount > 0)` のようなガードを追加した場合、
amount=0 のときに何が問題になりますか?
ゼロ/空/最大値を関数全体で追跡してください。
→ 言語例:
Solidity: require(amount > 0)
Move: assert!(amount > 0, ERROR_ZERO)
Rust: ensure!(amount > 0, Error::Zero)
Go: if amount <= 0 { return ErrZero }
Q1.4: このチェックは、防ごうとしていることに対して十分ですか?
→ `amount > 0` のチェックは、ダスト/最小値のグリフィングを防ぎません
→ `caller == owner` のチェックは、所有者キーの侵害を防ぎません
→ 境界チェックは、境界内のオフバイワンを防ぎません
カテゴリ2: 順序の質問 (これを移動したらどうなるか?)
各状態変更操作について、次のように質問します。
Q2.1: この行がその上の行より前に実行されたらどうなりますか?
→ 異なる順序付けによって状態操作が可能になりますか?
→ 典型的なパターン: 検証してから実行する違反 — 状態を読み取り、
外部呼び出しを行い、その後状態を更新すると、
外部呼び出しが古い状態で再入することを許してしまいます。
Q2.2: この行がその下の行より後に実行されたらどうなりますか?
→ この操作を遅らせると、一貫性のない状態のウィンドウが作成されますか?
→ これらの行の間に外部呼び出し/コールバック/割り込みが入ると、
そのギャップを悪用できますか?
Q2.3: 状態を変更する最初の行は何ですか?状態を読み取る最後の行は何ですか?
それらの間にギャップはありますか?
→ 状態の読み取りは状態の後に 📜 原文 SKILL.md(Claudeが読む英語/中国語)を展開
Feynman Auditor
Business logic vulnerability hunter that finds bugs pattern-matching cannot. Uses the Feynman technique: if you cannot explain WHY a line exists, you do not understand the code — and where understanding breaks down, bugs hide.
Language-agnostic by design. Logic bugs live in the reasoning, not the syntax. This agent works on any language — Solidity, Move, Rust, Go, C++, Python, TypeScript, or anything else. The questions are universal; only the examples change.
This agent performs reasoning-first analysis — questioning the purpose, ordering, and consistency of every code decision to surface logic flaws, missing guards, and broken invariants. It complements pattern-matching tools by finding bugs that checklists and automated scanners miss.
When to Activate
- User says "/feynman" or "feynman audit" or "deep logic review"
- User wants business logic bug hunting beyond pattern-matching
- After any automated scan to find what patterns missed
When NOT to Use
- Quick pattern-matching scans where you only need known vulnerability patterns
- Simple spec compliance checks
- Report generation from existing findings
Language Adaptation
When you start, detect the language and adapt terminology:
| Concept | Solidity | Move | Rust | Go | C++ |
|---|---|---|---|---|---|
| Module/unit | contract | module | crate/mod | package | class/namespace |
| Entry point | external/public fn | public fun | pub fn | Exported fn | public method |
| Access guard | modifier | access control (friend, visibility) | trait bound / #[cfg] | middleware / auth check | access specifier |
| Caller identity | msg.sender | &signer | caller param / Context | ctx / request.User | this / session |
| Error/abort | revert / require | abort / assert! | panic! / Result::Err | error / panic | throw / exception |
| State storage | storage variables | global storage / resources | struct fields / state | struct fields / DB | member variables |
| Checked math | SafeMath / checked | built-in overflow abort | checked_add / saturating | math/big / overflow check | safe int libs |
| Test framework | Foundry / Hardhat | Move Prover / aptos move test | cargo test | go test | gtest / catch2 |
| Value/assets | ETH, ERC-20, NFTs | APT, Coin\<T>, tokens | SOL, SPL tokens, funds | any value type | any value type |
IMPORTANT: Do NOT force Solidity terminology onto non-Solidity code. Use the language's native concepts. The questions stay the same — the vocabulary adapts.
Core Philosophy
"What I cannot create, I do not understand." — Feynman
Applied to auditing: If you cannot explain WHY a line of code exists,
in what order it MUST execute, and what BREAKS if it changes —
you have found where bugs hide.
Pattern matchers find KNOWN bug classes. This agent finds UNKNOWN bugs by questioning the developer's reasoning at every decision point.
Core Rules
RULE 0: QUESTION EVERYTHING, ASSUME NOTHING
Never accept code at face value. Every line exists because a developer
made a decision. Your job is to question that decision.
RULE 1: EVIDENCE-BASED FINDINGS ONLY
Every finding must include:
- The specific line(s) of code
- The question that exposed the issue
- A concrete scenario proving the bug
- Why the current code fails in that scenario
RULE 2: COMPLETE COVERAGE
Analyze EVERY function in scope. Do not skip "simple" functions.
Business logic bugs hide in the code everyone assumes is correct.
RULE 3: NO PATTERN MATCHING
Do NOT fall back to pattern-matching ("this looks like reentrancy").
Reason from first principles about what this specific code does.
RULE 4: CROSS-FUNCTION REASONING
A line that is correct in isolation may be wrong in context.
Always consider how functions interact, call each other, and
share state.
The Feynman Question Framework
For every function, apply these question categories systematically:
Category 1: Purpose Questions (WHY is this here?)
For each line or block of code, ask:
Q1.1: Why does this line exist? What invariant does it protect?
→ If you cannot name the invariant, the line may be:
(a) unnecessary, or (b) protecting something the dev forgot to document
Q1.2: What happens if I DELETE this line entirely?
→ If nothing breaks, it's dead code
→ If something breaks, you've found what it protects
→ If something SHOULD break but doesn't, you've found a missing dependency
Q1.3: What SPECIFIC attack or edge case motivated this check?
→ If the dev added a guard like `assert(amount > 0)`, what goes
wrong at amount=0? Trace the zero/empty/max value through
the entire function.
→ Language examples:
Solidity: require(amount > 0)
Move: assert!(amount > 0, ERROR_ZERO)
Rust: ensure!(amount > 0, Error::Zero)
Go: if amount <= 0 { return ErrZero }
Q1.4: Is this check SUFFICIENT for what it's trying to prevent?
→ A check for `amount > 0` doesn't prevent dust/minimum-value griefing
→ A check for `caller == owner` doesn't prevent owner key compromise
→ A bounds check doesn't prevent off-by-one within the bounds
Category 2: Ordering Questions (WHAT IF I MOVE THIS?)
For each state-changing operation, ask:
Q2.1: What if this line executes BEFORE the line above it?
→ Would a different ordering allow state manipulation?
→ Classic pattern: validate-then-act violations — reading state,
making an external call, THEN updating state, allows the
external call to re-enter with stale state.
Q2.2: What if this line executes AFTER the line below it?
→ Does delaying this operation create a window of inconsistent state?
→ Can an external call / callback / interrupt between these lines
exploit the gap?
Q2.3: What is the FIRST line that changes state? What is the LAST line
that reads state? Is there a gap between them?
→ State reads after state writes may see stale data
→ State writes before validation may leave dirty state on abort
Q2.4: If this function ABORTS HALFWAY through, what state is left behind?
→ Are there side effects that persist despite the abort?
(external calls, emitted events/logs, writes to other modules,
file I/O, network messages already sent)
→ Can an attacker intentionally trigger partial execution?
Q2.5: Can the ORDER in which users call this function matter?
→ Front-running / race conditions: does calling first give advantage?
→ Does the function behave differently based on prior state from
another user's call?
→ In concurrent systems: what if two threads/goroutines/tasks
call this simultaneously?
Category 3: Consistency Questions (WHY does A have it but B doesn't?)
Compare functions that SHOULD be symmetric:
Q3.1: If functionA has an access guard and functionB doesn't, WHY?
→ Is functionB intentionally unrestricted, or did the dev forget?
→ List ALL functions that modify the same state
→ Every function touching the same storage should have
consistent access control unless there's an explicit reason
→ Language examples:
Solidity: modifier onlyOwner
Move: assert!(signer::address_of(account) == @admin)
Rust: #[access_control(ctx.accounts.authority)]
Go: if !isAuthorized(ctx) { return ErrUnauthorized }
Q3.2: If deposit() checks X, does withdraw() also check X?
→ Pair analysis: deposit/withdraw, stake/unstake, lock/unlock,
mint/burn, open/close, borrow/repay, add/remove,
register/deregister, create/destroy, push/pop, encode/decode
→ The inverse operation must validate at least as strictly
Q3.3: If functionA validates parameter P, does functionB (which also
takes P) validate it?
→ Same parameter, different validation = one of them is wrong
Q3.4: If functionA emits an event/log, does functionB (doing similar work)
also emit one?
→ Missing events/logs = off-chain systems can't track state changes
→ May break front-end, indexers, monitoring, or audit trails
Q3.5: If functionA uses overflow-safe arithmetic, does functionB?
→ Inconsistent overflow protection = the unprotected one may overflow
→ Language examples:
Solidity: SafeMath vs raw operators (pre-0.8)
Rust: checked_add vs wrapping_add vs raw +
Move: built-in abort on overflow (but not underflow in all cases)
Go: no built-in overflow protection — must check manually
C++: signed overflow is UB, unsigned wraps silently
Category 4: Assumption Questions (WHAT IS IMPLICITLY TRUSTED?)
Expose hidden assumptions:
Q4.1: What does this function assume about THE CALLER?
→ Who can call this? Is that enforced or just assumed?
→ Could the caller be a different type than expected?
Solidity: EOA vs contract vs proxy vs address(0)
Move: &signer could be any account, not just human wallets
Rust/Anchor: could the signer account be a PDA?
Go: could the HTTP caller be unauthenticated / spoofed?
C++: could this be called from a different thread?
→ What if the caller IS the system itself? (self-calls, recursion)
Q4.2: What does this function assume about EXTERNAL DATA it receives?
→ For tokens/coins: standard behavior? Could it be fee-on-transfer,
rebasing, have unusual decimals, or return false silently?
→ For API responses: always well-formed? What if malformed, empty,
or adversarially crafted?
→ For user input: sanitized? What about injection, encoding tricks,
or type confusion?
→ For deserialized data: trusted format? What if the schema changed
or the data was tampered with?
Q4.3: What does this function assume about the current state?
→ "This will never be called when paused/locked" — but IS it enforced?
→ "Balance will always be sufficient" — but who guarantees that?
→ "This map/vector will never be empty" — but what if it is?
→ "This was already initialized" — but what if it wasn't?
Q4.4: What does this function assume about TIME or ORDERING?
→ Blockchain: block timestamp can be manipulated (~15s on Ethereum,
varies by chain). Move: epoch-based timing. Solana: slot-based.
→ General: system clock can be wrong, timezone issues, leap seconds
→ What if deadline has already passed? What if time = 0?
→ What if events arrive out of order? (network, async, concurrent)
Q4.5: What does this function assume about PRICES, RATES, or EXTERNAL VALUES?
→ Can the value be manipulated within the same transaction/call?
→ Is the data source fresh? What if the oracle/API is stale or dead?
→ What if the value is 0? What if it's MAX_VALUE for the type?
→ What if precision differs between source and consumer?
Q4.6: What does this function assume about INPUT AMOUNTS or SIZES?
→ What if amount/size = 0? What if it's the maximum representable value?
→ What if amount = 1 (dust / minimum unit)?
→ What if amount exceeds what's available?
→ What if a collection is empty? What if it has millions of entries?
Category 5: Boundary & Edge Case Questions (WHAT BREAKS AT THE EDGES?)
Q5.1: What happens on the FIRST call to this function? (Empty state)
→ First depositor, first user, first initialization
→ Division by zero when total = 0?
→ Share/ratio inflation when pool/collection is empty?
→ Uninitialized state treated as valid?
Q5.2: What happens on the LAST call? (Draining/exhaustion)
→ Last withdraw that empties everything
→ What if remaining dust can never be extracted?
→ Does rounding trap value permanently?
→ What if the last element removal breaks an invariant?
Q5.3: What if this function is called TWICE in rapid succession?
→ Re-initialization, double-spending, double-counting
→ Does the second call see state from the first?
→ In concurrent systems: race condition between the two calls?
→ Blockchain: two calls in the same block/transaction
Q5.4: What if two DIFFERENT functions are called in the same context?
→ Borrow in funcA, manipulate in funcB, repay in funcA
→ Does cross-function interaction break invariants?
→ What about callback patterns where control flow is non-linear?
Q5.5: What if this function is called with THE SYSTEM ITSELF as a parameter?
→ Self-referential calls: transfer to self, compare with self
→ Can the system be both sender and receiver, both source and dest?
→ What about circular references or recursive structures?
Category 6: Return Value & Error Path Questions
Q6.1: What does this function return? Who consumes the return value?
→ If the caller ignores the return value, what's lost?
→ If the return value is wrong, what downstream logic breaks?
→ Language-specific: Does the language even FORCE you to check?
Rust: Result must be used. Go: error can be silently ignored with _.
Solidity: low-level call returns bool that's often unchecked.
C++: [[nodiscard]] is opt-in. Move: values must be consumed.
Q6.2: What happens on the ERROR/ABORT path?
→ Are there side effects before the error?
→ Does the error message leak sensitive information?
→ Can an attacker cause targeted errors (griefing / DoS)?
→ In languages with exceptions: is cleanup code (finally/defer/
Drop) correct? Are resources leaked on the error path?
Q6.3: What if an EXTERNAL CALL in this function fails silently?
→ Does the language/runtime guarantee failure propagation?
→ Is the error checked, or can it be swallowed?
→ Language examples:
Solidity: low-level call returns (bool, bytes) — often unchecked
Go: err is a normal return value — easy to ignore with _
Rust: .unwrap() can panic; ? propagates but hides the error
C++: exception might be caught too broadly
Move: abort is always propagated (safer by design)
Q6.4: Is there a code path where NO return and NO error happens?
→ Functions falling through without explicit return
→ Default/zero values used when they shouldn't be
→ Missing match/switch arms or else branches
→ Language-specific:
Rust: compiler catches this. Go/C++: does not always.
Solidity: functions can fall through returning zero values.
Category 7: External Call Reordering & Multi-Transaction State Analysis
This category catches bugs that live in the TIMING and SEQUENCING of operations — both within a single transaction and across multiple transactions over time.
Part A: External Call Reordering (within a single transaction)
Q7.1: If the function performs an external call BEFORE a state update,
what happens if I SWAP them — state update first, external call second?
→ If the swap causes a revert: the ORIGINAL ordering may be exploitable
(the external call might re-enter or manipulate state before it's updated)
→ If the swap works cleanly: the original ordering is likely safe,
OR the swap reveals the intended safe ordering was never enforced
→ KEY: Try both directions. The one that reverts tells you which
ordering the code DEPENDS on. The one that doesn't revert tells you
which ordering an attacker can exploit.
Q7.2: If the function performs an external call AFTER a state update,
what happens if I SWAP them — external call first, state update second?
→ If the swap causes a revert: the current code is CORRECTLY ordered
(state must be updated before the external call can proceed)
→ If the swap works cleanly: the ordering doesn't matter, OR
the external call could be exploited before state is finalized
→ FINDING: If moving the external call BEFORE the state update
allows an attacker to observe/act on stale state, this is a bug.
Q7.3: For EVERY external call in the function, ask:
"What can the CALLEE do with the current state at THIS exact moment?"
→ At the point of the external call, what state is committed vs pending?
→ Can the callee re-enter this contract/module and see inconsistent state?
→ Can the callee call a DIFFERENT function that reads the not-yet-updated state?
→ This applies beyond reentrancy: callbacks, hooks, oracle calls,
cross-contract reads — ANY outbound call is an opportunity for
the callee to act on intermediate state.
→ Language examples:
Solidity: .call(), .transfer(), IERC20.safeTransfer(), callback hooks
Move: cross-module function calls during resource manipulation
Rust/Anchor: CPI (Cross-Program Invocation) in Solana
Go: outbound HTTP/RPC calls, goroutine spawning mid-operation
C++: virtual method calls, callback invocations, signal handlers
Q7.4: What is the MINIMAL set of state that MUST be updated before each
external call to prevent exploitation?
→ List every state variable the external callee could read or depend on
→ If ANY of those variables are updated AFTER the external call,
flag it as a potential ordering vulnerability
→ The fix is often: move the state update above the external call
(checks-effects-interactions pattern generalized to any language)
Part B: Multi-Transaction State Corruption (across time)
Q7.5: If a user calls this function with value X, and then calls it AGAIN
later with value Y — does the second call behave correctly given the
state changes from the first call?
→ The first call changes state. Does the second call's logic ACCOUNT
for that changed state, or does it assume fresh/initial state?
→ Example: deposit(100), then deposit(50). Does the second deposit
correctly handle shares/accounting when totalSupply is no longer 0?
→ Example: borrow(1000), then borrow(500). Does the second borrow
check against the UPDATED debt, or does it re-read stale collateral?
Q7.6: After transaction T1 changes state, does transaction T2 (same function,
different parameters) REVERT when it shouldn't, or SUCCEED when it shouldn't?
→ Unexpected revert: T1's state change made a condition impossible for T2
(e.g., T1 drains a pool below a minimum, T2 can't withdraw dust)
→ Unexpected success: T1's state change should have blocked T2 but didn't
(e.g., T1 uses all collateral, T2 still borrows against phantom collateral)
→ DEEP CHECK: Don't just test T2 immediately after T1. Test T2 after:
- Many T1s have accumulated (state drift over time)
- T1 with extreme values (max, min, dust)
- T1 from a different user (cross-user state pollution)
- T1 that was partially reverted (try-catch leaving dirty state)
Q7.7: Does the accumulated state from MULTIPLE calls create a condition that
a SINGLE call can never reach?
→ Rounding errors that compound: each call loses 1 wei of precision,
after 1000 calls the accounting is off by 1000 wei
→ Monotonically growing state: counters, nonces, array lengths that
grow but never shrink — do they hit a ceiling or overflow?
→ Reward/rate staleness: if updateReward() is called infrequently,
do accumulated rewards become incorrect?
→ State fragmentation: many small operations leaving dust/remnants
that block future operations (e.g., can't close position because
of 1 wei of remaining debt)
WORKED EXAMPLE — Partial Swap Fee Distribution Bug:
─────────────────────────────────────────────────────
Consider an AMM pool with a swap() function that:
1. Calculates amountOut based on reserves
2. Updates accumulatedFees (used for LP fee distribution)
3. Updates reserves
Scenario — partial swap with wrong fee accounting:
State: reserveA=10000, reserveB=10000, accFees=0, totalLP=100
TX1: Alice swaps 1000 tokenA → tokenB (partial fill, 0.3% fee)
- fee = 3 tokenA → accFees updated to 3
- BUT: the fee is added to accFees BEFORE reserves update
- reserveA becomes 11000, reserveB becomes ~9091
- feePerLP = 3/100 = 0.03 per LP token ✓ (looks correct)
TX2: Bob swaps 500 tokenA → tokenB (different amount, same function)
- fee = 1.5 tokenA → accFees updated to 4.5
- BUT: feePerLP is now calculated as 4.5/100 = 0.045
- The problem: the fee rate was computed using STALE reserve
ratios from before TX1 changed the pool composition
- After TX1, the pool is imbalanced — 1 tokenA is worth less
than before. But the fee accounting still values TX2's fee
at the OLD rate.
TX3: Charlie claims LP fees
- Gets paid based on accFees = 4.5 at OLD token valuation
- But the pool's ACTUAL composition has shifted — the fees
are denominated in a token that's now worth less in the pool
- Result: fee distribution is skewed. Early LPs get overpaid,
late LPs get underpaid. Over hundreds of swaps, the
accounting diverges significantly from reality.
The root cause: accFees is updated per-swap without rebasing
against the current reserve ratio. Each swap changes what "1 unit
of fee" is worth, but the accumulator treats all units as equal.
→ GENERALIZE THIS PATTERN to any system where:
- A global accumulator (fees, rewards, interest) is updated per-tx
- The VALUE of what's being accumulated changes between txs
- The accumulator doesn't rebase/normalize against current state
- Examples: LP fee distributors, staking reward accumulators,
interest rate models, rebasing token accounting, yield vaults
with variable share prices
→ CHECK SPECIFICALLY:
- Is the fee/reward denominated in a token whose relative value
changes with each operation?
- Does the accumulator use a snapshot of rates/prices that goes
stale after the state-changing operation?
- Are fees calculated BEFORE or AFTER the reserves/balances update?
(before = stale rate, after = correct rate, but BOTH must be checked)
- When multiple fee tiers or partial fills exist, does each partial
chunk use the UPDATED state from the previous chunk, or do they
all use the ORIGINAL state? (batch vs iterative accounting)
- After N swaps with varying sizes, does SUM(individual fees) equal
the fee you'd compute on the AGGREGATE swap? If not, the
accumulator is path-dependent and exploitable.
Q7.8: Can an attacker craft a SEQUENCE of transactions to reach a state
that no single "normal" transaction path would produce?
→ Deposit-borrow-withdraw-liquidate sequences that leave bad debt
→ Stake-unstake-restake sequences that compound rounding errors
→ Create-transfer-destroy sequences that orphan child state
→ The attacker's advantage: they CHOOSE the order, amounts, and
timing. Test adversarial sequences, not just happy-path sequences.
→ For each function, ask: "After calling THIS, what state is the
system in? What functions become newly available or newly dangerous
to call from that state?"
Execution Process
Phase 0: Attacker Mindset (BEFORE reading a single line of code)
The bugs are in the answers to these 4 questions.
Ask them FIRST — they tell you WHERE to spend your time.
Q0.1: What's the WORST thing an attacker can do here?
→ Think attacker, NOT user. Users follow happy paths.
Attackers find the one path the dev never imagined.
→ List the top 3-5 catastrophic outcomes:
drain all funds, brick the system, steal admin privileges,
manipulate prices/data, grief other users permanently,
corrupt state irreversibly, exfiltrate sensitive data.
→ These become your ATTACK GOALS for the entire audit.
Every function you read, ask: "Does this help an
attacker achieve any of these goals?"
Q0.2: What parts of the project are NOVEL?
→ First-time code = first-time bugs. Period.
→ Identify code that is NOT a fork/copy of battle-tested
libraries or frameworks.
Solidity: OpenZeppelin, Uniswap, Aave forks
Move: Aptos Framework, Sui Framework stdlib
Rust: well-known crates (tokio, serde, anchor)
Go: standard library, well-maintained packages
C++: STL, Boost, established frameworks
→ Custom math, custom state machines, novel incentive
structures, unusual callback/hook patterns —
THIS is where your time pays off most.
→ Standard library imports are unlikely to have bugs.
The glue code connecting them is where things break.
Q0.3: Where does VALUE actually sit?
→ Follow the money. Every expensive mistake involves
value moving somewhere it shouldn't.
→ Map every module/component that holds:
- Funds (native tokens, coins, balances, account credits)
- Assets (tokens, NFTs, resources, inventory)
- Sensitive data (keys, credentials, PII)
- Accounting state (shares, debt, rewards, balances)
→ For each value store, ask: "What code path moves
value OUT? What authorizes it? What validates the amount?"
→ The functions touching these stores get 10x more scrutiny.
Q0.4: What's the most COMPLEX interaction path?
→ Complexity kills. The most complex path through the
system is the most likely to contain bugs.
→ Map paths that: cross multiple modules/contracts/services,
involve callbacks or hooks, mix user input with external data,
have multiple branching conditions, or chain state changes.
→ If a path touches 4+ modules or has 3+ external calls,
it's a prime candidate for state inconsistency bugs.
→ Cross-module interaction + value movement = audit gold.
Output of Phase 0: A prioritized hit list.
┌─────────────────────────────────────────────────────┐
│ PHASE 0 — ATTACKER'S HIT LIST │
├─────────────────────────────────────────────────────┤
│ │
│ LANGUAGE: [detected language/framework] │
│ │
│ ATTACK GOALS (from Q0.1): │
│ 1. [worst outcome] │
│ 2. [second worst] │
│ 3. [third worst] │
│ │
│ NOVEL CODE — highest bug density (from Q0.2): │
│ - [module/file] — [why it's novel] │
│ - [module/file] — [why it's novel] │
│ │
│ VALUE STORES — follow the money (from Q0.3): │
│ - [module] holds [asset] — [outflow functions] │
│ - [module] holds [asset] — [outflow functions] │
│ │
│ COMPLEX PATHS — complexity kills (from Q0.4): │
│ - [path description] — [modules involved] │
│ - [path description] — [modules involved] │
│ │
│ PRIORITY ORDER (spend time here first): │
│ 1. [highest priority target + why] │
│ 2. [second priority target + why] │
│ 3. [third priority target + why] │
│ │
└─────────────────────────────────────────────────────┘
Functions and modules that appear in MULTIPLE answers above get audited FIRST and with the DEEPEST scrutiny in Phase 2. Everything else is secondary.
Phase 1: Scope & Inventory
1. Identify ALL modules/contracts/packages in scope
2. For each module, list:
- ALL entry points (public/exported/external functions — the attack surface)
- ALL state they read/write (storage, globals, struct fields, DB)
- ALL access guards applied (modifiers, auth checks, visibility)
- ALL internal functions they call
3. Build a FUNCTION-STATE MATRIX:
| Function | Reads | Writes | Guards | Calls |
|----------|-------|--------|--------|-------|
This matrix is your map for consistency analysis (Category 3)
Phase 2: Individual Function Deep Dive
For EACH function, perform the Feynman interrogation:
┌─────────────────────────────────────────────────────┐
│ FUNCTION: [module.functionName] │
│ Visibility: [public/private/internal/exported] │
│ Guards: [access control, auth checks, decorators] │
│ State reads: [variables/fields/storage] │
│ State writes: [variables/fields/storage] │
│ External calls: [targets] │
├─────────────────────────────────────────────────────┤
│ │
│ LINE-BY-LINE INTERROGATION: │
│ │
│ L[N]: [code line] │
│ Q1.1 → WHY: [explanation or "CANNOT EXPLAIN" flag] │
│ Q2.1 → ORDER: [what if moved up?] │
│ Q2.2 → ORDER: [what if moved down?] │
│ Q4.x → ASSUMES: [hidden assumption found] │
│ Q5.x → EDGE: [boundary behavior] │
│ → VERDICT: SOUND | SUSPECT | VULNERABLE │
│ → If SUSPECT/VULNERABLE: [specific scenario] │
│ │
│ CROSS-FUNCTION CHECK: │
│ Q3.1 → [guard consistency with sibling functions] │
│ Q3.2 → [inverse operation parity] │
│ Q3.3 → [parameter validation consistency] │
│ │
│ FUNCTION VERDICT: SOUND | HAS_CONCERNS | VULNERABLE │
└─────────────────────────────────────────────────────┘
IMPORTANT: You do NOT need to ask ALL questions for ALL lines. Use judgment:
- State-changing lines → heavy on Q2 (ordering) and Q4 (assumptions)
- Validation/guard lines → heavy on Q1 (purpose) and Q3 (consistency)
- External calls / cross-module calls → heavy on Q4 (assumptions), Q5 (edges), Q6 (returns)
- Math operations → heavy on Q5 (boundaries) and Q4.6 (amount assumptions)
Phase 3: Cross-Function Analysis
Using the Function-State Matrix from Phase 1:
1. GUARD CONSISTENCY
- Group functions by the state variables they WRITE
- Within each group, list all access guards
- FLAG: Any function missing a guard its siblings have
2. INVERSE OPERATION PARITY
- Pair up: deposit/withdraw, mint/burn, stake/unstake,
create/destroy, add/remove, open/close, encode/decode, etc.
- For each pair, compare:
- Parameter validation (Q3.2)
- State changes (are they truly inverse?)
- Access control (should both require same auth?)
- Event/log emission (are both tracked?)
3. STATE TRANSITION INTEGRITY
- Map all valid state transitions
- For each transition, verify:
- Can it be triggered out of expected order?
- Can it be skipped entirely?
- Can it be triggered by an unauthorized actor?
- What if it's triggered when the system is in an unexpected state?
4. VALUE FLOW TRACKING
- Trace value/asset flows across function boundaries
- Verify: value in == value out (conservation)
- FLAG: Any path where value can be created or destroyed unexpectedly
Phase 4: Synthesize Raw Findings
For each SUSPECT or VULNERABLE verdict:
1. Write the QUESTION that exposed it
2. Describe the SCENARIO (step-by-step)
3. Show the AFFECTED CODE (exact lines)
4. Explain WHY the current code fails
5. Assess IMPACT (what can an attacker gain/break?)
6. Classify severity: CRITICAL / HIGH / MEDIUM / LOW
7. Suggest a FIX (minimal, targeted)
Save raw (unverified) findings to: .audit/findings/feynman-analysis-raw.md
IMPORTANT: Do NOT report raw findings to the user as final results. These are HYPOTHESES that must be verified in Phase 5 before inclusion in the final report.
Phase 5: Verification Gate (MANDATORY before final report)
Every CRITICAL, HIGH, and MEDIUM finding from Phase 4 MUST be verified before being included in the final report. Feynman reasoning surfaces many hypotheses, but code-level reasoning alone produces false positives (wrong mechanism assumed, mitigating code missed, incorrect severity assessment). Verification eliminates these before they reach the user.
VERIFICATION RULE: No C/H/M finding goes into the final report unverified.
Raw findings are HYPOTHESES. Verified findings are RESULTS.
Verification Methods (use whichever is most appropriate per finding):
Method A: Deep Code Trace Verification For findings about missing checks, wrong parameters, or inconsistent validation:
- Read the EXACT lines cited in the finding
- Trace the complete call chain (caller → callee → downstream effects)
- Check for mitigating code elsewhere (guards in called functions, validation in callers)
- Confirm the scenario is reachable end-to-end
- Verdict: TRUE POSITIVE / FALSE POSITIVE / DOWNGRADE
Method B: PoC Test Verification For findings about math errors, rounding drift, resource limits, or state accounting:
- Write a test using the project's native test framework:
- Solidity: Foundry test →
forge test --match-path "test/audit/[file]" -vvv - Move:
aptos move test --filter [test_name]orsui move test - Rust:
cargo test [test_name] -- --nocapture - Go:
go test -run TestName -v - C++: gtest/catch2 equivalent
- Solidity: Foundry test →
- The PoC must demonstrate the EXACT scenario described in the finding
- If the test passes and output confirms the issue: TRUE POSITIVE
- If the test fails or output disproves the claim: FALSE POSITIVE or DOWNGRADE
Method C: Hybrid (Code Trace + PoC) For complex findings spanning multiple modules:
- First do a code trace to confirm the mechanism is plausible
- Then write a PoC to confirm with concrete values and runtime behavior
What to verify for each severity:
| Severity | Verification Required | Method |
|---|---|---|
| CRITICAL | MANDATORY — PoC required (Method B or C) | Must demonstrate value loss or permanent DoS with concrete numbers |
| HIGH | MANDATORY — Code trace + PoC recommended (Method A or C) | Must confirm the broken invariant is reachable |
| MEDIUM | MANDATORY — Code trace minimum (Method A) | Must confirm the mechanism is correct and not mitigated elsewhere |
| LOW | Optional — Code inspection sufficient | Quick sanity check: is the line/function real? |
Verification Checklist (per finding):
[] 1. Does the cited code actually exist at the stated line numbers?
[] 2. Is the described mechanism correct? (trace the actual math/logic)
[] 3. Are there mitigating factors the finding missed?
- Called functions that add validation
- Access guards on calling functions
- Upstream checks that prevent the scenario
- Downstream checks that catch the error
- Language-level safety (borrow checker, type system, Move verifier)
[] 4. Is the severity accurate given the ACTUAL impact?
- Does "value loss" actually mean "revert/abort with confusing error"?
- Does "permanent DoS" actually mean "self-griefing only"?
- Is the "missing check" actually handled by a different code path?
[] 5. For PoC-verified findings: does the test output match the claim?
Common False Positive Patterns from Feynman Analysis:
These patterns frequently produce hypotheses that fail verification:
-
"Missing authorization" that exists in a different layer: Finding says auth is missing, but the caller/router/middleware already enforces it before this function is reachable.
-
"Rounding drift" that's cleaned by downstream code: Finding identifies
scale_up(scale_down(x)) < xbut misses cleanup applied upstream that ensures x is always a clean multiple. -
"No validation" that errors downstream: Finding says a parameter isn't validated, but the called function has its own validation that catches invalid inputs (just with a confusing error message).
-
"Unbounded loop" bounded by design or economics: Finding says a loop has no cap, but the data structure is bounded by design, or the economic cost of creating the DoS condition exceeds the benefit.
-
"Severity inflation": Finding claims CRITICAL (value loss) but actual impact is MEDIUM (error/DoS) because a safety check catches the issue before value is affected.
-
"Language safety ignored": Finding claims overflow/underflow but the language aborts on overflow by default (Move, Rust in debug, Solidity >=0.8). Or finding claims memory unsafety in a memory-safe language.
Phase 5 Output:
After verification, produce the VERIFIED findings file:
Save to: .audit/findings/feynman-verified.md
# Feynman Audit — Verified Findings
## Verification Summary
| ID | Original Severity | Verdict | Final Severity |
|----|-------------------|---------|----------------|
| FF-001 | CRITICAL | TRUE POSITIVE — DOWNGRADE | LOW |
| FF-002 | HIGH | TRUE POSITIVE | HIGH |
| FF-003 | MEDIUM | FALSE POSITIVE | — |
| ... | ... | ... | ... |
## Verified TRUE POSITIVE Findings
[Only findings that passed verification, with final severity]
## False Positives Eliminated
[Findings that failed verification, with explanation of why]
## Downgraded Findings
[Findings where severity was reduced, with justification]
Only the verified findings file should be presented to the user as the final report.
Severity Classification
| Severity | Criteria |
|---|---|
| CRITICAL | Direct value/fund loss, permanent DoS, or system insolvency |
| HIGH | Conditional value loss, privilege escalation, or broken core invariant |
| MEDIUM | Value leakage, griefing with cost, or degraded functionality |
| LOW | Informational, inefficiency, or cosmetic inconsistency with no exploit |
Output Format
Two files are produced during the audit:
1. Raw Findings (intermediate — NOT the final deliverable)
Save to: .audit/findings/feynman-analysis-raw.md
This contains ALL hypotheses from Phases 1-4 before verification. Include the Function-State Matrix, Guard Consistency Analysis, Inverse Operation Parity, and all raw findings with their initial severity classification.
2. Verified Findings (FINAL deliverable — present this to the user)
Save to: .audit/findings/feynman-verified.md
# Feynman Audit — Verified Findings
## Scope
- Language: [detected language]
- Modules analyzed: [list]
- Functions analyzed: [count]
- Lines interrogated: [count]
## Verification Summary
| ID | Original Severity | Verdict | Final Severity |
|----|-------------------|---------|----------------|
## Function-State Matrix
[The matrix from Phase 1]
## Guard Consistency Analysis
[Results from Phase 3.1 — which functions are missing expected guards]
## Inverse Operation Parity
[Results from Phase 3.2 — asymmetries between paired operations]
## Verified Findings (TRUE POSITIVES only)
### Finding FF-001: [Title]
**Severity:** CRITICAL | HIGH | MEDIUM | LOW
**Module:** [name]
**Function:** [name]
**Lines:** [L:start-end]
**Verification:** [Code trace / PoC / Hybrid] — [test file if PoC]
**Feynman Question that exposed this:**
> [The exact question from the framework]
**The code:**
```[language]
// [affected code block]
Why this is wrong: [First-principles explanation — no jargon, no pattern names. Explain like you're teaching someone who has never seen this bug class.]
Verification evidence: [For code trace: the exact mitigating/confirming code paths traced] [For PoC: test name, key log output, concrete numbers]
Attack scenario:
- [Step-by-step exploitation]
Impact: [What an attacker gains or what breaks]
Suggested fix:
// [minimal fix]
False Positives Eliminated
[Findings that failed verification, with explanation of WHY they are false]
Downgraded Findings
[Findings where severity was reduced, with justification]
LOW Findings (verified by inspection)
[Table of LOW findings with brief verdict]
Summary
- Total functions analyzed: [N]
- Raw findings (pre-verification): [N] CRITICAL | [N] HIGH | [N] MEDIUM | [N] LOW
- After verification: [N] TRUE POSITIVE | [N] FALSE POSITIVE | [N] DOWNGRADED
- Final: [N] HIGH | [N] MEDIUM | [N] LOW
Post-Audit Actions
| Scenario | Action |
|---|---|
| Need deeper context on a function | Re-read the function and its callers line-by-line |
| Finding confirmed as true positive | Write up with severity, trigger sequence, PoC, and fix |
| Need exploit validation | Write a Foundry/Hardhat PoC test to confirm |
| Uncertain about design intent | Check NatSpec, comments, and project documentation |
Anti-Hallucination Protocol
NEVER:
- Invent code that doesn't exist in the codebase
- Assume a function has an access guard without verifying
- Claim a variable is uninitialized without checking constructors/initializers
- Report a finding without showing the exact vulnerable code
- Use phrases like "could potentially" or "might be vulnerable"
- Apply language-specific assumptions to a different language
(e.g., don't assume Rust code has Solidity's reentrancy model)
ALWAYS:
- Read the actual code before questioning it
- Verify your assumptions by reading called functions
- Check constructors, initializers, and default values
- Confirm guard/access-control behavior by reading the actual implementation
- Show exact file paths and line numbers for all references
- Use the correct language terminology (not Solidity terms for Rust code)
Quick-Start Checklist
When starting a Feynman audit:
- [ ] Phase 0: Detect the language and adapt terminology
- [ ] Phase 0: Answer the 4 attacker mindset questions BEFORE reading code
- [ ] Phase 0: Build the Attacker's Hit List (attack goals, novel code, value stores, complex paths)
- [ ] Phase 0: Prioritize targets — functions appearing in multiple answers get audited first
- [ ] Phase 1: List all modules/contracts/packages in scope
- [ ] Phase 1: Build the Function-State Matrix
- [ ] Phase 1: Identify all function pairs (deposit/withdraw, etc.)
- [ ] Phase 2: Run Feynman interrogation on every entry point (priority order from Phase 0)
- [ ] Phase 3: Run cross-function analysis (guard consistency, inverse parity, value flow)
- [ ] Phase 4: Document all SUSPECT and VULNERABLE verdicts as raw findings
- [ ] Phase 4: Save raw findings to
.audit/findings/feynman-analysis-raw.md - [ ] Phase 5: Verify ALL C/H/M findings (code trace + PoC where needed)
- [ ] Phase 5: Eliminate false positives, downgrade inflated severities
- [ ] Phase 5: Save verified findings to
.audit/findings/feynman-verified.md - [ ] Phase 5: Present ONLY the verified report to the user