multi-ai-code-review
Multi-perspective code review using Claude, Gemini, and Codex as specialized agents. 5-dimensional analysis (security, performance, maintainability, correctness, style) with LLM-as-judge consensus, quality scoring, and CI/CD integration. Use when reviewing PRs, auditing code quality, preparing production releases, or establishing code review workflows.
Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads, extracts, and installs the skill automatically.
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o multi-ai-code-review.zip https://jpskill.com/download/9444.zip && unzip -o multi-ai-code-review.zip && rm multi-ai-code-review.zip
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/9444.zip -OutFile "$d\multi-ai-code-review.zip"; Expand-Archive "$d\multi-ai-code-review.zip" -DestinationPath $d -Force; ri "$d\multi-ai-code-review.zip"
When it finishes, restart Claude Code. The skill then activates automatically when you ask in plain language, for example "review this PR".
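💾 Manual download (if the commands are difficult)
1. Press the blue button below to download multi-ai-code-review.zip
2. Double-click the ZIP file to extract it; this creates a multi-ai-code-review folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code
⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety.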
🎯 What this Skill can do
The description below explains what this Skill does for you. It activates automatically when you ask Claude for work in this area.
📦 Installation (3 steps)
1. Press the "Download" button above to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
   - macOS / Linux: ~/.claude/skills/
   - Windows: %USERPROFILE%\.claude\skills\
Restart Claude Code to finish. The skill is invoked automatically for related requests, even if you do not say "use this Skill to ...".
- Last updated: 2026-05-18
- Retrieved: 2026-05-18
- Bundled files: 1
📖 Original SKILL.md as read by Claude
This is the original text (English or Chinese) written for the AI (Claude) to read. Japanese translations are being added over time.
# Multi-AI Code Review
## Overview
multi-ai-code-review provides comprehensive code review using multiple AI models as specialized agents, each analyzing code from a different perspective. It is based on 2024-2025 best practices for AI-assisted code review.
Purpose: Multi-perspective code quality assessment using AI ensemble with human oversight
Pattern: Task-based (5 independent review dimensions + orchestration)
Key Principles (validated by tri-AI research):
- Multi-Agent Architecture - Specialized agents for each review dimension
- LLM-as-Judge Consensus - Flag issues only when 2+ models agree
- Progressive Severity - Critical → High → Medium → Low prioritization
- Human-in-Loop - AI suggests, human decides
- Quality Gates - Block merges for critical unresolved issues
- Actionable Feedback - Every comment has What/Where/Why/How
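The "Actionable Feedback" principle can be sketched as a small data structure. This is a minimal illustration only: the `Finding` class, its field names, and the `is_actionable` check are hypothetical and not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One review comment in What/Where/Why/How form (hypothetical structure)."""
    what: str                 # the problem, stated in one line
    where: str                # location as "file:line"
    why: str                  # the risk or cost if left unfixed
    how: str                  # the concrete change that resolves it
    severity: str = "Medium"  # Critical, High, Medium, or Low

    def is_actionable(self) -> bool:
        """True only when all four parts are non-empty."""
        return all(part.strip() for part in (self.what, self.where, self.why, self.how))

    def render(self) -> str:
        """Format the finding in the style of the report examples in this document."""
        return (
            f"[{self.where}] ({self.severity}) {self.what}\n"
            f"  - Why: {self.why}\n"
            f"  - Fix: {self.how}"
        )


finding = Finding(
    what="SQL Injection vulnerability",
    where="auth.py:45",
    why="User input directly in query",
    how="Use parameterized queries",
    severity="Critical",
)
print(finding.render())
```

Rejecting findings where `is_actionable()` is false is one way to keep vague comments out of the final report.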
Quality Targets:
- False Positive Rate: <15%
- Fix Acceptance Rate: >40%
- Review Turnaround: <5 minutes
- Bug Catch Rate: >30% pre-production
## When to Use
Use multi-ai-code-review when:
- Reviewing pull requests (any size)
- Auditing code quality before release
- Establishing consistent code review standards
- Security auditing code changes
- Performance profiling changes
- Technical debt assessment
- Onboarding reviews (mentorship mode)
When NOT to Use:
- Trivial changes (typos, comments only)
- Automated dependency updates (use dependabot labels)
- Generated code (migrations, scaffolds)
## Prerequisites
### Required
- Code to review (diff, file, or directory)
- At least one AI available (Claude required, Gemini/Codex optional)
### Recommended
- Gemini CLI for web research and fast analysis
- Codex CLI for deep code reasoning
- Git repository context
### Integration
- GitHub Actions (optional, for CI/CD)
- Pre-commit hooks (optional, for local checks)
## Review Dimensions
### 5-Dimensional Analysis
| Dimension | Agent | Focus | Weight |
|---|---|---|---|
| Security | Security Specialist | OWASP Top 10, secrets, injection | 25% |
| Performance | Performance Engineer | Complexity, memory, latency | 20% |
| Maintainability | Architect | Patterns, modularity, DRY | 25% |
| Correctness | QA Engineer | Logic, edge cases, tests | 20% |
| Style | Nitpicker | Naming, formatting, conventions | 10% |
### Severity Levels
| Level | Action | Examples |
|---|---|---|
| Critical | Block merge | SQL injection, exposed secrets, data loss |
| High | Require fix | Race conditions, missing auth, memory leaks |
| Medium | Suggest fix | Code duplication, missing tests, complexity |
| Low | Optional | Style issues, naming, minor refactors |
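The severity table can be read as a lookup from level to required action, where the most severe finding decides the outcome. The sketch below assumes that reading; the `Severity` enum and `required_action` helper are illustrative names, not part of the skill.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels from the table, ordered so that max() picks the worst."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Actions copied from the Severity Levels table.
ACTIONS = {
    Severity.CRITICAL: "Block merge",
    Severity.HIGH: "Require fix",
    Severity.MEDIUM: "Suggest fix",
    Severity.LOW: "Optional",
}


def required_action(severities):
    """Return the action for the most severe level present, or None if empty."""
    if not severities:
        return None
    return ACTIONS[max(severities)]


print(required_action([Severity.LOW, Severity.HIGH]))
```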
## Operations
### Operation 1: Quick Security Scan
**Time**: 2-5 minutes
**Automation**: 80%
**Purpose**: Fast security-focused review
**Process**:
1. **Scan for Critical Issues**:
Review this code for security vulnerabilities:
- SQL injection
- XSS vulnerabilities
- Hardcoded secrets/API keys
- Authentication bypasses
- Authorization flaws
- Input validation gaps
- Insecure dependencies
Code: [PASTE CODE OR DIFF]
For each issue found, provide:
- Severity (Critical/High/Medium)
- Location (file:line)
- Description (what's wrong)
- Fix (specific code change)
2. **Validate with Gemini** (optional):
gemini -p "Verify these security findings. Are any false positives? [PASTE CLAUDE FINDINGS]
Code context: [PASTE RELEVANT CODE]"
3. **Output**: Security report with consensus findings
---
### Operation 2: Comprehensive PR Review
**Time**: 10-30 minutes
**Automation**: 60%
**Purpose**: Full multi-dimensional review
**Process**:
**Step 1: Gather Context**
```bash
# Get PR diff
git diff main...HEAD > /tmp/pr_diff.txt
# Identify affected areas
grep -E '^(\+\+\+|---)' /tmp/pr_diff.txt | head -20
```
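If the affected files are needed as a list for later steps, the same header lines can be parsed programmatically. This is a rough sketch assuming a standard unified diff with git's default `a/` and `b/` prefixes; `changed_files` is a hypothetical helper. Like the `grep` above, it can be fooled by an added content line that itself begins with `++ `.

```python
def changed_files(diff_text: str) -> list[str]:
    """Return paths taken from the '+++ b/<path>' headers of a unified git diff."""
    files = []
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            if path == "/dev/null":      # the file was deleted in this diff
                continue
            if path.startswith("b/"):    # git's default destination prefix
                path = path[2:]
            if path not in files:
                files.append(path)
    return files


sample = """\
diff --git a/auth.py b/auth.py
--- a/auth.py
+++ b/auth.py
@@ -1 +1 @@
-old
+new
diff --git a/old.py b/old.py
--- a/old.py
+++ /dev/null
"""
print(changed_files(sample))
```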
**Step 2: Run Parallel Agent Reviews**
Use the Task tool to launch parallel agents:
Launch 3 parallel review agents:
Agent 1 (Security):
"Review this diff for security issues. Focus on:
- OWASP Top 10 vulnerabilities
- Authentication/authorization
- Input validation
- Secrets exposure
Diff: [DIFF]"
Agent 2 (Maintainability):
"Review this diff for maintainability. Focus on:
- Design patterns used correctly
- Code duplication (DRY)
- Modularity and cohesion
- Documentation quality
Diff: [DIFF]"
Agent 3 (Correctness):
"Review this diff for correctness. Focus on:
- Logic errors
- Edge cases not handled
- Test coverage gaps
- Error handling
Diff: [DIFF]"
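Outside Claude Code's Task tool, the same fan-out can be approximated with a thread pool. The sketch below is illustrative only: `run_agent` is a hypothetical stand-in that makes no model call, and `build_prompt` and `review_in_parallel` are assumed names. Swap in your own client function for real reviews.

```python
from concurrent.futures import ThreadPoolExecutor

# Focus areas taken from the three agent prompts above.
FOCUS = {
    "Security": "OWASP Top 10 vulnerabilities, authentication/authorization, "
                "input validation, secrets exposure",
    "Maintainability": "design patterns, code duplication (DRY), "
                       "modularity and cohesion, documentation quality",
    "Correctness": "logic errors, unhandled edge cases, "
                   "test coverage gaps, error handling",
}


def build_prompt(dimension: str, diff: str) -> str:
    """Assemble the review prompt for one dimension."""
    return f"Review this diff for {dimension.lower()}. Focus on: {FOCUS[dimension]}\nDiff: {diff}"


def run_agent(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; replace with your own client."""
    return f"(no findings; prompt was {len(prompt)} characters)"


def review_in_parallel(diff: str, agent=run_agent) -> dict[str, str]:
    """Run one agent per dimension concurrently and collect outputs by dimension."""
    with ThreadPoolExecutor(max_workers=len(FOCUS)) as pool:
        futures = {dim: pool.submit(agent, build_prompt(dim, diff)) for dim in FOCUS}
        return {dim: fut.result() for dim, fut in futures.items()}


results = review_in_parallel("+ password = request.password")
for dimension, output in results.items():
    print(dimension, "->", output)
```

Threads suit this case because each agent call would spend most of its time waiting on a network response.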
**Step 3: Orchestrate & Deduplicate**
Synthesize findings from all agents:
[PASTE ALL AGENT OUTPUTS]
Tasks:
1. Remove duplicate findings
2. Rank by severity (Critical > High > Medium > Low)
3. Group by file
4. Generate summary table
5. Create final report with consensus issues only
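Tasks 1 to 3 of the synthesis prompt (deduplicate, rank, group by file) are mechanical enough to sketch in code. The version below assumes each finding is a dict with `file`, `line`, `severity`, and `title` keys, which is a hypothetical shape. It treats two findings as duplicates only when file, line, and case-insensitive title all match; real agent output would likely need looser matching.

```python
from collections import defaultdict

SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def synthesize(findings):
    """Deduplicate findings, rank them by severity, and group them by file."""
    unique = {}
    for f in findings:
        key = (f["file"], f["line"], f["title"].strip().lower())
        kept = unique.get(key)
        # On a duplicate, keep whichever copy has the more severe rating.
        if kept is None or SEVERITY_ORDER[f["severity"]] < SEVERITY_ORDER[kept["severity"]]:
            unique[key] = f
    ranked = sorted(
        unique.values(),
        key=lambda f: (SEVERITY_ORDER[f["severity"]], f["file"], f["line"]),
    )
    grouped = defaultdict(list)
    for f in ranked:
        grouped[f["file"]].append(f)
    return dict(grouped)


raw = [
    {"file": "api.py", "line": 12, "severity": "Medium", "title": "Missing tests"},
    {"file": "auth.py", "line": 45, "severity": "High", "title": "SQL injection"},
    {"file": "auth.py", "line": 45, "severity": "Critical", "title": "sql injection"},
]
report = synthesize(raw)
for path, items in report.items():
    print(path, [(i["line"], i["severity"]) for i in items])
```

Files appear in the order of their most severe finding, so the riskiest file is listed first.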
**Step 4: Generate Report**
Output format:
## PR Review Summary
| File | Risk | Issues | Critical | High | Medium |
|------|------|--------|----------|------|--------|
| auth.py | High | 3 | 1 | 2 | 0 |
| api.py | Medium | 2 | 0 | 1 | 1 |
### Critical Issues (Block Merge)
1. **[auth.py:45]** SQL Injection vulnerability
- Why: User input directly in query
- Fix: Use parameterized queries
### High Issues (Require Fix)
...
### Consensus Score: 73/100
- Security: 65/100
- Performance: 80/100
- Maintainability: 70/100
- Correctness: 75/100
- Style: 85/100
### Operation 3: LLM-as-Judge Tribunal
**Time**: 5-15 minutes
**Automation**: 70%
**Purpose**: High-confidence findings through consensus
**Process**:
1. **Run Code Through Multiple Models**:
Claude Analysis:
Analyze this code for issues. Rate severity 1-10 for each:
[CODE]
Gemini Analysis (via CLI):
gemini -p "Analyze this code for issues. Rate severity 1-10 for each:
[CODE]"
Codex Analysis (via CLI):
codex "Analyze this code for issues. Rate severity 1-10 for each:
[CODE]"
2. **Calculate Consensus**:
Given these analyses from 3 AI models:
Claude: [FINDINGS]
Gemini: [FINDINGS]
Codex: [FINDINGS]
Identify issues where at least 2 models agree:
1. List consensus findings
2. Average severity scores
3. Note any disagreements
4. Final verdict for each issue
3. **Output**: High-confidence issue list (at least 2 of 3 models agree)
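The consensus step can also be computed directly once each model's findings are in a structured form. The sketch below assumes every model reports issues under an identical key with a 1-10 severity, which is a strong simplification, since real findings would need fuzzy matching. The `consensus` function and its input shape are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def consensus(analyses, min_agree=2):
    """Keep issues reported by at least `min_agree` models.

    `analyses` maps a model name to {issue_key: severity 1-10}.
    """
    votes = defaultdict(dict)
    for model, issues in analyses.items():
        for key, severity in issues.items():
            votes[key][model] = severity
    results = []
    for key, by_model in votes.items():
        if len(by_model) >= min_agree:
            results.append({
                "issue": key,
                "agreement": f"{len(by_model)}/{len(analyses)}",
                "avg_severity": round(mean(by_model.values()), 1),
                "dissenting": sorted(set(analyses) - set(by_model)),
            })
    # Highest average severity first.
    return sorted(results, key=lambda r: -r["avg_severity"])


analyses = {
    "Claude": {"auth.py:67 plaintext password": 10, "auth.py:45 no rate limit": 7},
    "Gemini": {"auth.py:67 plaintext password": 9},
    "Codex": {"auth.py:67 plaintext password": 9, "auth.py:45 no rate limit": 6,
              "auth.py:12 naming": 2},
}
for row in consensus(analyses):
    print(row)
```

In this made-up input, the naming issue is dropped because only one model reported it, and the rate-limit issue is kept with Gemini noted as dissenting.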
### Operation 4: Mentorship Review
**Time**: 15-30 minutes
**Automation**: 40%
**Purpose**: Educational code review for learning
**Process**:
Review this code in mentorship mode. For a developer learning [LANGUAGE/FRAMEWORK]:
Code: [CODE]
For each finding:
1. **What's the issue** (be encouraging, not critical)
2. **Why it matters** (explain the underlying concept)
3. **How to improve** (show before/after with explanation)
4. **Learn more** (link to relevant documentation)
Also highlight:
- What was done well
- Good patterns to continue using
- Growth opportunities
Tone: Supportive and educational, never condescending.
### Operation 5: Pre-Release Audit
**Time**: 30-60 minutes
**Automation**: 50%
**Purpose**: Comprehensive review before production
**Process**:
1. **Full Codebase Scan**:
   # Identify all changes since last release
   git diff v1.0.0...HEAD --stat
   git log v1.0.0...HEAD --oneline
2. **Security Deep Dive**:
   - Run all security checks
   - Verify no new vulnerabilities
   - Check dependency updates
   - Audit secrets management
3. **Performance Review**:
   - Identify potential bottlenecks
   - Review database queries
   - Check for N+1 problems
   - Validate caching strategies
4. **Test Coverage**:
   - Verify test coverage targets
   - Check critical path coverage
   - Validate edge case tests
5. **Generate Release Report**:
## Pre-Release Audit: v1.1.0
Security Clearance: PASS ✓
- No critical vulnerabilities
- All high issues resolved
- Secrets audit: Clean
Performance Assessment: PASS ✓
- No new N+1 queries
- Response time within SLA
- Memory usage stable
Test Coverage: 82% (target: 80%)
- Critical paths: 95%
- Edge cases: 78%
Release Recommendation: APPROVED
---
## Multi-AI Coordination
### Agent Assignment Strategy
| Task | Primary | Verification | Speed |
|------|---------|--------------|-------|
| Security scan | Claude | Gemini | Fast |
| Architecture review | Claude | Codex | Medium |
| Logic validation | Codex | Claude | Medium |
| Style checking | Gemini | Claude | Fast |
| Performance analysis | Claude | Codex | Medium |
### Coordination Commands
**Launch Multi-Agent Review**:
```bash
# Using Task tool for parallel execution
# Each agent reviews independently, orchestrator synthesizes
```
**Gemini Quick Check**:
gemini -p "Quick security scan of this code: [CODE]"
**Codex Deep Analysis**:
codex "Analyze this code architecture and suggest improvements: [CODE]"
## CI/CD Integration
### GitHub Actions Workflow
# .github/workflows/ai-review.yml
name: Multi-AI Code Review
on: [pull_request]
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Get PR Diff
        run: |
          git diff origin/main...HEAD > pr_diff.txt
      - name: Claude Review
        uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          model: "claude-sonnet-4-5-20250929"
          review_level: "detailed"
      - name: Post Summary
        uses: actions/github-script@v7
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## AI Review Summary\n${process.env.REVIEW_SUMMARY}`
            })
### Quality Gate Configuration
# Block merge for critical issues
quality_gates:
  critical_issues: 0      # Must be zero
  high_issues: 3          # Max allowed
  coverage_minimum: 80    # Percent
  score_minimum: 70       # Out of 100
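A gate configuration like this still needs something to evaluate it. The sketch below shows one possible check, assuming the thresholds are loaded into a plain dict and the counts, coverage, and score come from an earlier review step. `check_gates` and its parameters are hypothetical names.

```python
# Thresholds mirroring the quality_gates configuration above.
QUALITY_GATES = {
    "critical_issues": 0,     # must be zero
    "high_issues": 3,         # max allowed
    "coverage_minimum": 80,   # percent
    "score_minimum": 70,      # out of 100
}


def check_gates(critical, high, coverage, score, gates=QUALITY_GATES):
    """Return a list of failed gates; an empty list means the merge may proceed."""
    failures = []
    if critical > gates["critical_issues"]:
        failures.append(f"critical issues: {critical} > {gates['critical_issues']}")
    if high > gates["high_issues"]:
        failures.append(f"high issues: {high} > {gates['high_issues']}")
    if coverage < gates["coverage_minimum"]:
        failures.append(f"coverage: {coverage}% < {gates['coverage_minimum']}%")
    if score < gates["score_minimum"]:
        failures.append(f"score: {score} < {gates['score_minimum']}")
    return failures


failed = check_gates(critical=1, high=2, coverage=82, score=59)
print("BLOCK MERGE" if failed else "PASS", failed)
```

In CI, a non-empty result would typically translate into a non-zero exit code so the job fails.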
## Quality Scoring
### Scoring Formula
Overall = (Security × 0.25) + (Performance × 0.20) +
(Maintainability × 0.25) + (Correctness × 0.20) +
(Style × 0.10)
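As a worked instance of the formula, the sketch below computes the weighted overall score. The weights are held as integer percentages to avoid floating-point drift, and the per-dimension scores are made-up illustration values; `overall_score` is a hypothetical helper.

```python
# Weights from the scoring formula, as integer percentages (they sum to 100).
WEIGHTS = {
    "Security": 25,
    "Performance": 20,
    "Maintainability": 25,
    "Correctness": 20,
    "Style": 10,
}


def overall_score(scores):
    """Weighted average of the five dimension scores, each on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(scores[dim] * weight for dim, weight in WEIGHTS.items()) / 100


# Made-up scores for illustration only.
example = {
    "Security": 90,
    "Performance": 80,
    "Maintainability": 70,
    "Correctness": 60,
    "Style": 50,
}
print(overall_score(example))  # (2250 + 1600 + 1750 + 1200 + 500) / 100 = 73.0
```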
### Grade Mapping
| Score | Grade | Status |
|---|---|---|
| ≥90 | A | Excellent - Ship it |
| 80-89 | B | Good - Minor fixes |
| 70-79 | C | Acceptable - Address issues |
| 60-69 | D | Needs work - Significant fixes |
| <60 | F | Failing - Major revision needed |
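The grade table translates directly into a threshold lookup. The table leaves fractional scores such as 89.5 unspecified; this sketch assumes each band runs from its lower bound up to the next band's lower bound, so 89.5 is a B. The `grade` function is an illustrative name.

```python
def grade(score: float) -> tuple[str, str]:
    """Map an overall score (0-100) to the letter grade and status from the table."""
    bands = [
        (90, "A", "Excellent - Ship it"),
        (80, "B", "Good - Minor fixes"),
        (70, "C", "Acceptable - Address issues"),
        (60, "D", "Needs work - Significant fixes"),
    ]
    for floor, letter, status in bands:
        if score >= floor:
            return letter, status
    return "F", "Failing - Major revision needed"


for value in (95, 89.5, 70, 59.9):
    print(value, grade(value))
```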
## Anti-Patterns to Detect
- Hardcoded Secrets - API keys, passwords in code
- SQL Injection - Unparameterized queries
- XSS Vulnerabilities - Unsanitized output
- Race Conditions - Unprotected shared state
- Memory Leaks - Unclosed resources
- N+1 Queries - Loop database calls
- Dead Code - Unreachable branches
- God Objects - Classes doing too much
- Copy-Paste Code - Duplicated logic
- Missing Error Handling - Unhandled exceptions
## Example Review Session
User: Review this PR for my authentication module
Claude: I'll perform a comprehensive multi-dimensional review.
[Launches parallel agents for security, maintainability, correctness]
## PR Review: Authentication Module
### Critical Issues (1)
1. **[auth.py:67]** Password stored in plaintext
- Severity: Critical
- Consensus: 3/3 models agree
- Fix: Use bcrypt hashing
```python
# Before
user.password = request.password
# After
import bcrypt
user.password = bcrypt.hashpw(request.password.encode(), bcrypt.gensalt())
```
### High Issues (2)
1. **[auth.py:45]** No rate limiting on login endpoint
2. **[auth.py:89]** JWT secret hardcoded
### Quality Score: 59/100 (Grade: F)
- Security: 35/100 (Critical issues)
- Performance: 70/100
- Maintainability: 65/100
- Correctness: 60/100
- Style: 80/100
### Recommendation: BLOCK MERGE
Resolve critical security issues before merging.
---
## Related Skills
- **multi-ai-testing**: Generate tests for reviewed code
- **multi-ai-verification**: Validate fixes
- **multi-ai-implementation**: Implement suggested fixes
- **codex-review**: Codex-specific review patterns
- **review-multi**: Skill-specific reviews
---
## References
- `references/security-checklist.md` - OWASP Top 10 checklist
- `references/performance-patterns.md` - Performance anti-patterns
- `references/ci-cd-integration.md` - Full CI/CD setup guide