
litreview

Academic literature orientation skill that searches papers via Consensus, builds a strategic search plan using PICO (default) or SPIDER / Decomposition / hybrid as fallbacks, and synthesizes findings into a professionally formatted Word document (.docx) research guide. Grill-me intake (research question specificity + framework hint + tentative depth) before the recon search; a second forcing checkpoint after Phase 2 confirms framework + sub-areas + depth before searches consume budget. Configurable depth (5/10/20 queries) controls coverage vs. speed. Output is a 'launching pad' — not a finished review, but an orientation guide that lets a researcher dive in confidently. Triggers: 'litreview on [topic]', 'literature review on [topic]', 'I'm starting a literature review on X', 'I'm writing a paper on X', 'help me research X', 'I'm doing research on X', 'can you help me research X'. Do NOT trigger for single one-off paper searches where the user just wants a quick list — that's a plain Consensus search.

⚡ Recommended: install with a single command (about 60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). Download, extraction, and placement are fully automatic.

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o litreview.zip https://jpskill.com/download/21985.zip && unzip -o litreview.zip && rm litreview.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/21985.zip -OutFile "$d\litreview.zip"; Expand-Archive "$d\litreview.zip" -DestinationPath $d -Force; ri "$d\litreview.zip"

When it finishes, restart Claude Code. The skill then triggers automatically when you ask in plain language, for example "litreview on [topic]".

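💾 Prefer to download manually (for those who find commands difficult)

1. Press the blue button below to download litreview.zip
2. Double-click the ZIP file to extract it; this creates a litreview folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code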

⬇ Download as .zip (recommended) ⬇ .skill format (for advanced users) Original source ↗

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the skill.

🎯 What this Skill can do

Read the Skill's description to see what it does for you. When you ask Claude for help in this area, it triggers automatically.

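📦 How to install (3 steps)

1. Press the "Download" button above to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. You do not need to say "use this Skill..."; it is invoked automatically for relevant requests.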


Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 7

📜 Original SKILL.md (the English source that Claude reads)

Litreview — Academic Literature Orientation

Portability: Requires a Consensus MCP connection, Node.js with docx package for document generation, and (in CLI) bash_tool. Works in Claude Code CLI natively. In Claude.ai with Consensus MCP + Code Execution, the workflow is supported.

Produce a launching pad — not a finished literature review, but an orientation document that gives a researcher entering an unfamiliar field everything they need to start reading and searching with confidence. Think: what a generous colleague who knows the field would tell you over coffee.

Agent Integrity Rules (Research-Pack Convention)

Inherited from the research-pack convention; locked verbatim per PR #657's cross-skill consistency audit.

Source discipline. Only cite Consensus-returned papers from THIS session. Training knowledge labeled [Not from Consensus — model knowledge] and excluded from cited count. Sparse results stated explicitly, never silently filled.
Counting discipline. Three numbers tracked: searches executed / unique papers received (deduplicated) / papers cited. Every cited paper has a retrievable Consensus URL from this session. Use scripts/citation_tracker.py for deterministic counts.
Tool constraints. Consensus per-query cap depends on plan tier. Detect at first search, report at checkpoint. Rate limit is 1 query/sec — sequential execution mandatory.
Retry policy. On failure → wait 3s → retry once → log. After 3 consecutive failures: stop, alert user, share what was collected.
Plan-tier detection. Parse first-search response for "Showing top 10" / "upgrade" → free tier (10/search). 20 returned → Pro (20/search). Calculate theoretical ceiling and surface at checkpoint so user can recalibrate.

See references/search_budget_allocation.md for the sequential-execution rationale + plan-tier signals.
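
The plan-tier detection rule above can be sketched as a small function. This is a minimal illustration, not the skill's own code: the function name, the "unknown" fallback, and the assumption that the first response is available as plain text plus a result count are all hypothetical.

```python
def detect_plan_tier(response_text: str, result_count: int) -> tuple[str, int]:
    """Return (tier, per_search_cap) inferred from the first search response.

    Hypothetical helper: only the two signals named in the rules are checked.
    """
    lowered = response_text.lower()
    # Signal 1: the response mentions the free-tier cap or an upgrade prompt.
    if "showing top 10" in lowered or "upgrade" in lowered:
        return ("free", 10)
    # Signal 2: a full page of 20 results indicates Pro.
    if result_count >= 20:
        return ("pro", 20)
    # Neither signal: report that the tier is unknown instead of guessing.
    return ("unknown", result_count)
```

Returning "unknown" when neither signal is present keeps the checkpoint report honest, in the same spirit as the rule against hardcoding the tier.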

Error Handling

Failure	Behavior
Consensus rate-limit hit	Wait 3s, retry once, log outcome
Search returns 0 results	Note explicitly; "either niche terminology or genuine gap"; never silently fill
Plan-tier cap detected	Log tier; report at checkpoint; surface in audit
3 consecutive failures	Stop searching, alert user, share what's collected, ask how to proceed
Sub-area returns thin results (<5 papers)	Flag in audit; suggest manual PubMed/Scholar supplementation
User wants to adjust sub-areas	Update table, re-confirm before searching
DOCX validation fails	Unpack XML, fix, repack
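
The retry rows of the table above could be implemented along these lines. This is a sketch under stated assumptions: `search_fn` stands in for whatever callable performs one Consensus query, the class and exception names are hypothetical, and "3 consecutive failures" is read as three queries in a row that failed even after their single retry.

```python
import time
from typing import Any, Callable, Optional


class SearchHalted(Exception):
    """Raised once three queries in a row have failed, retries included."""


class RetryingSearcher:
    def __init__(self, search_fn: Callable[[str], Any], wait_s: float = 3.0,
                 max_consecutive_failures: int = 3,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.search_fn = search_fn          # hypothetical: performs one query
        self.wait_s = wait_s                # wait before the single retry
        self.max_consecutive_failures = max_consecutive_failures
        self.sleep = sleep                  # injectable so tests need not wait
        self.consecutive_failures = 0
        self.log: list[tuple[str, int, str]] = []

    def run(self, query: str) -> Optional[Any]:
        for attempt in (1, 2):              # first try, then exactly one retry
            try:
                result = self.search_fn(query)
            except Exception as exc:
                self.log.append((query, attempt, f"error: {exc}"))
                if attempt == 1:
                    self.sleep(self.wait_s)
                continue
            self.log.append((query, attempt, "ok"))
            self.consecutive_failures = 0   # any success resets the streak
            return result
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.max_consecutive_failures:
            raise SearchHalted("3 consecutive failures: stop and alert the user")
        return None
```

The caller would catch `SearchHalted`, share what has been collected so far, and ask the user how to proceed.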

Phase 0: Grill-Me Intake (3 forcing questions, one at a time)

Each question carries explicit "why I'm asking". Stop condition: max 3 before Phase 1.

Q1 (root) — Research question specificity

State the research question in 1–2 sentences. Specific is better — "How do LLMs perform on clinical reasoning tasks compared to physicians?" beats "AI in medicine". Vague questions produce vague reviews.

Why I'm asking: The reconnaissance search hinges on precise terminology. Vague questions produce thin recon results that don't yield a useful framework breakdown.

Refuse mush. Re-ask once with examples if user is too broad. If still vague, deliver with explicit "broad-scope orientation, not depth review" caveat.

Q2 (depends on Q1) — Framework hint

Framework — pick one or say "you pick":

PICO (Population / Intervention / Comparison / Outcome — most clinical questions)

SPIDER (Sample / Phenomenon / Design / Evaluation / Research-type — social/qualitative)

Decomposition (Problem / Solution / Evaluation / Limitations — technology-focused)

Hybrid (you pick which components from which framework)

You pick — analyze Q1 and recommend

Why I'm asking: PICO is the default for ~70% of clinical questions but maps poorly to qualitative work or technology evaluation. Picking upfront saves the recon search from suggesting a misaligned framework.

Forcing choice with default ("you pick"). The skill surfaces its own framework recommendation after the recon search so user can override. Use scripts/framework_recommender.py for the heuristic.

See references/framework_selection.md for PICO / SPIDER / Decomposition canon.
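
To make the "you pick" path concrete, here is a toy keyword heuristic. It is not the logic of scripts/framework_recommender.py, which is not shown here; the keyword lists, the tie-to-Hybrid rule, and the function name are illustrative assumptions. Only the PICO default comes from the text above.

```python
# Illustrative keyword lists (assumptions, not taken from the bundled script).
FRAMEWORK_KEYWORDS = {
    "PICO": ("patient", "treatment", "clinical", "trial", "therapy", "outcome"),
    "SPIDER": ("experience", "perception", "attitude", "qualitative", "interview"),
    "Decomposition": ("algorithm", "system", "architecture", "benchmark", "model"),
}


def recommend_framework(question: str) -> str:
    """Suggest a framework by counting keyword hits in the research question."""
    text = question.lower()
    scores = {
        name: sum(keyword in text for keyword in keywords)
        for name, keywords in FRAMEWORK_KEYWORDS.items()
    }
    best = max(scores.values())
    if best == 0:
        return "PICO"  # no signal at all: fall back to the stated default
    winners = [name for name, score in scores.items() if score == best]
    # A tie between frameworks is treated as a hint toward a hybrid mapping.
    return winners[0] if len(winners) == 1 else "Hybrid"
```

Whatever the heuristic returns is only a recommendation; the user can still override it at the post-Phase-2 checkpoint.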

Q3 (depends on Q1) — Tentative depth

Tentative depth — pick one. Final confirmation comes after the framework breakdown:

Quick scan (5 searches)

Standard review (10 searches)

Deep dive (20 searches)

Why I'm asking: I ask this twice — once now to calibrate the recon search emphasis, once after the framework breakdown to confirm. Tentative answer affects which sub-areas to surface first; final answer drives search budget allocation.

Forcing choice. Re-asked at the post-Phase-2 checkpoint after the user has seen the framework breakdown.

Stop condition: 3 questions max before Phase 1. The post-Phase-2 checkpoint is its own grill-me moment (framework table + sub-area-adjustment + depth-reconfirmation).

Phase 1: Initial Reconnaissance

One broad Consensus search to map themes, terminology, methodological distinctions.

Query: broad version of Q1 (terminology variants are okay; first search casts wide)
Record: citation_tracker.py --action record_search --session NAME --query "..."
Record received count: citation_tracker.py --action record_papers_received --session NAME --count N
Detect plan tier from response: "Showing top 10" / "upgrade" → free; 20 returned → Pro
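
The three counts that the recording steps feed can be pictured with an in-memory sketch. This does not reproduce citation_tracker.py or its JSON layout; the class, its method names, and the example.org URLs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SessionCounts:
    """Hypothetical in-memory stand-in for the three-count audit."""
    searches: list[str] = field(default_factory=list)
    received_urls: set[str] = field(default_factory=set)
    cited_urls: set[str] = field(default_factory=set)

    def record_search(self, query: str, paper_urls: list[str]) -> None:
        self.searches.append(query)
        # A set deduplicates papers that come back from several searches.
        self.received_urls.update(paper_urls)

    def cite(self, url: str) -> None:
        # Source discipline: only papers received this session may be cited.
        if url not in self.received_urls:
            raise ValueError(f"not returned in this session: {url}")
        self.cited_urls.add(url)

    def summary(self) -> tuple[int, int, int]:
        """(searches executed, unique papers received, papers cited)."""
        return (len(self.searches), len(self.received_urls), len(self.cited_urls))
```

Rejecting a citation whose URL was never received enforces the "this session only" rule mechanically, without relying on memory.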

Synthesize for the checkpoint:

Themes that surfaced
Terminology variations (e.g., "LLM" vs "large language model" vs "GPT-style model")
Methodological distinctions (clinical trials vs benchmark eval vs case study)
Coverage gaps (sub-questions absent from recon results)

Phase 2: Framework Selection + Sub-area Generation

Choose framework (from Q2 OR override based on recon):

PICO — most clinical questions (~70% default)
SPIDER — social / qualitative
Decomposition — technology focus (Problem / Solution / Evaluation / Limitations)
Hybrid — explicit cross-framework mapping

Generate 4-5 sub-area questions mapped to framework components. Each becomes a targeted Phase 3 search.

Checkpoint (grill-me forcing-options moment)

After Phase 2, halt and present:

3-4 sentence recon summary

What themes surfaced
Terminology landscape
Evidence landscape characterization

Framework breakdown table

Framework Component	How It Maps to This Topic	Proposed Sub-area to Explore
(Component 1)	...	Sub-area 1
(Component 2)	...	Sub-area 2
(Component 3)	...	Sub-area 3
(Component 4)	...	Sub-area 4
Cross-cutting theme	...	Sub-area 5

Depth re-confirmation (forcing choice)

Surface the practical constraint: detected plan tier + theoretical ceiling.

Quick scan (5 searches × ~10 results each = ~50 papers max on the free tier)
Standard review (10 searches × ~10 = ~100 papers)
Deep dive (20 searches × ~10 = ~200 papers)
On Pro (20 results per search), each ceiling doubles.
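
The ceiling arithmetic is just searches times per-search cap. A small sketch, with hypothetical dictionary and function names, shows the values for both tiers mentioned in this document:

```python
# Search counts per depth and per-search caps per tier, as stated in this guide.
DEPTH_SEARCHES = {"quick scan": 5, "standard review": 10, "deep dive": 20}
TIER_CAP = {"free": 10, "pro": 20}


def theoretical_ceiling(depth: str, tier: str) -> int:
    """Upper bound on papers received, before deduplication."""
    return DEPTH_SEARCHES[depth] * TIER_CAP[tier]


if __name__ == "__main__":
    for depth in DEPTH_SEARCHES:
        print(depth, {tier: theoretical_ceiling(depth, tier) for tier in TIER_CAP})
```

The number of unique papers will usually be lower than the ceiling, because the same paper can be returned by several searches.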

Sub-area forcing options

"Looks good — proceed with these sub-areas"
"Adjust: add sub-area on [X]"
"Adjust: remove and replace [Y] with [Z]"
"Restart with different framework"

Why I'm asking (the rationale)

A wrong framework or sub-area set wastes the search budget. This is the last cheap moment to correct course.

Wait for user response before Phase 3. Refuse to start Phase 3 without explicit user choice.

Phase 3: Targeted Searches

Sequential (1 query/sec), budget per depth tier. See references/search_budget_allocation.md for full canon.

Quick scan (5 searches)

5 sub-area searches (one per sub-area)
Skip era-gated + review-specific

Standard review (10 searches)

5 sub-area searches
2 review article searches (top 2 sub-areas): "systematic review [topic]" / "meta-analysis [topic]"
2 era-gated searches on the most important sub-area: one with year_max: 2015, one with year_min: 2021
1 follow-up on highest-cited paper using its key terms + year_min after publication

Deep dive (20 searches)

5 sub-area searches
5 review article searches (one per sub-area)
4 era-gated searches (top 2 sub-areas, old + new each)
3 follow-ups on top 3 highest-cited papers
3 spare for emerging threads (surprising findings to chase)

Throughout: 1 q/sec rate limit. Sequential. Confirm response before next call. Record each via citation_tracker.py.
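
The budgets above can be written down as data and checked against their stated totals. This is a sketch: the dictionary keys and function names are hypothetical, and `year_max` / `year_min` are used only as they appear in the text, without assuming anything else about the Consensus query format.

```python
# Per-depth allocation of searches by type, copied from the lists above.
SEARCH_BUDGETS = {
    "quick scan": {"sub_area": 5},
    "standard review": {"sub_area": 5, "review_article": 2,
                        "era_gated": 2, "follow_up": 1},
    "deep dive": {"sub_area": 5, "review_article": 5, "era_gated": 4,
                  "follow_up": 3, "spare": 3},
}
EXPECTED_TOTALS = {"quick scan": 5, "standard review": 10, "deep dive": 20}


def check_budget(depth: str) -> int:
    """Return the total for a depth tier, failing loudly if it drifts."""
    total = sum(SEARCH_BUDGETS[depth].values())
    if total != EXPECTED_TOTALS[depth]:
        raise ValueError(f"{depth}: allocated {total}, expected {EXPECTED_TOTALS[depth]}")
    return total


def era_gated_queries(query: str) -> list[dict]:
    """One 'old' and one 'new' search for the same sub-area query."""
    return [
        {"query": query, "year_max": 2015},
        {"query": query, "year_min": 2021},
    ]
```

Keeping the allocation in one table makes it easy to confirm that a change to one search type does not silently break the 5/10/20 totals.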

Cross-Search Intelligence

Three trackers across ALL search results — run scripts/cross_search_aggregator.py --session NAME after Phase 3 completes:

Repeat-hit papers — same paper appearing in 3+ sub-area searches = likely foundational
Recurring authors — same author in multiple searches = dominant research group; top 3-5 most frequent matter
Citation-per-year heuristic — a 2023 paper with 150 citations >> 2008 paper with 150 citations. Use for seminal-work identification.

These feed the "Start Here" + "Key Research Groups" + "Bibliography" DOCX sections.
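
The three trackers can be illustrated with a few lines of Python. This is not the code of scripts/cross_search_aggregator.py; the input shape (a mapping from search name to a list of papers with `url` and `authors` keys) and the function names are assumptions made for the example.

```python
from collections import Counter


def repeat_hits(searches: dict[str, list[dict]], min_searches: int = 3) -> list[str]:
    """URLs of papers that appear in at least `min_searches` different searches."""
    counts: Counter = Counter()
    for papers in searches.values():
        counts.update({paper["url"] for paper in papers})  # once per search
    return sorted(url for url, n in counts.items() if n >= min_searches)


def recurring_authors(searches: dict[str, list[dict]], top_n: int = 5) -> list[str]:
    """Authors seen in more than one search, most frequent first."""
    counts: Counter = Counter()
    for papers in searches.values():
        seen = set()
        for paper in papers:
            seen.update(paper["authors"])
        counts.update(seen)  # count each author once per search
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [author for author, n in ranked[:top_n] if n > 1]


def citations_per_year(citations: int, year: int, current_year: int) -> float:
    """Citation count normalized by the paper's age (at least one year)."""
    return citations / max(current_year - year, 1)
```

With `current_year=2026`, a 2023 paper with 150 citations scores 50 per year, while a 2008 paper with the same count scores about 8.3, which matches the intuition in the heuristic above.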

Phase 4: DOCX Research Guide

Generate via Node.js + docx library. 8 sections (see references/docx_8_sections.md for full spec):

1. Topic Overview: single tight paragraph (4-6 sentences)
2. Start Here (Priority Reading Order): 5-7 papers ordered best recent review → foundational → 2-3 frontier → gap/controversy. Each: hyperlinked title + authors/year + 1-sentence contribution + 1-sentence "what to look for"
3. How the Field Got Here: chronological narrative (1-2 paragraphs) + timeline table (5-8 milestones: Year / Milestone / Significance) + terminology evolution note
4. Sub-area Guides (one per sub-area, 4 parts each)
- 4a. What the Research Shows (2-3 sentence synthesis with inline citations)
- 4b. Key Papers (3-5 hyperlinked papers with citation count, year, 1-sentence importance)
- 4c. Key Search Terms (6-10 keywords, synonyms, MeSH, historical terms)
- 4d. Boolean Search Strings (2-3 ready-to-paste strings)
5. Key Research Groups: top 3-5 authors/groups with affiliations, sub-area coverage, representative paper link (from cross-search aggregator)
6. Open Questions & Gaps: three categories (methodological / population-context / conceptual-theoretical). Each gap explains why it matters.
7. Bibliography: alphabetical by first author. Every entry has a clickable "View on Consensus" link. Every inline citation matches a bibliography entry.
8. Audit Log: search summary table (#, query, filters, papers returned, status), counts block, coverage notes including detected tier and theoretical ceiling

DOCX Technical Requirements

Document the key docx library patterns:

Page: US Letter, 1-inch margins
Lists: LevelFormat.BULLET (never unicode bullets)
Hyperlinks: ExternalHyperlink with style: "Hyperlink", full URL (never truncated)
Tables: dual widths (columnWidths + cell width), ShadingType.CLEAR
Validation step after save (python scripts/office/validate.py output.docx)

Reference the docx skill for setup patterns and best practices.

Output

research_guide_<topic-slug>_<YYYY-MM-DD>.docx

Plus:

Chat summary block: "Saved: <path>. Audit: N searches / M unique papers / K cited. Plan tier: <tier>."
Audit log printed inline if user asks for it
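
The output filename pattern can be produced with a short helper. The slug rule below (lowercase, runs of non-alphanumeric characters collapsed to hyphens) is an assumption, since the source does not define how `<topic-slug>` is derived; the function names are hypothetical.

```python
import re
from datetime import date


def topic_slug(topic: str) -> str:
    """Lowercase the topic and collapse anything non-alphanumeric to '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return slug or "untitled"  # assumed fallback for an empty topic


def guide_filename(topic: str, on: date) -> str:
    """Build research_guide_<topic-slug>_<YYYY-MM-DD>.docx."""
    return f"research_guide_{topic_slug(topic)}_{on.isoformat()}.docx"
```

For example, `guide_filename("LLMs in Clinical Reasoning", date(2026, 5, 18))` gives `research_guide_llms-in-clinical-reasoning_2026-05-18.docx`.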

Tooling

Script	Role
`scripts/citation_tracker.py`	JSON-backed three-count audit at `~/.litreview_sessions/<session>.json`
`scripts/framework_recommender.py`	Heuristic PICO/SPIDER/Decomposition suggestion from research question
`scripts/cross_search_aggregator.py`	Repeat-hits + recurring-authors + citation-per-year ranking after Phase 3

References

references/framework_selection.md — PICO / SPIDER / Decomposition canon (7+ sources)
references/search_budget_allocation.md — depth tiers + cross-search intelligence + sequential execution rationale (7+ sources)
references/docx_8_sections.md — research guide DOCX spec + technical requirements (7+ sources)

Anti-Patterns To Reject

Parallelizing Consensus calls
Skipping the interactive checkpoint (running all searches without user confirmation)
Padding thin results with training knowledge
Defaulting to non-PICO framework without justification
Citing papers in chat that didn't come from Consensus this session
Hardcoding plan tier instead of detecting from first response
Skipping era-gated searches in standard/deep budgets
Skipping cross-search intelligence (repeat-hits, recurring authors)
Truncating Consensus URLs in hyperlinks

Version: 1.0.0
Source spec: megaprompts/09-litreview-megaprompt.md
Build pattern: Path B (direct conversion). Sibling of pulse (research-pack shape).

Bundled files

※ List of files included in the ZIP. In addition to `SKILL.md` itself, it may contain reference material, samples, and scripts.

📄 SKILL.md (14,459 bytes)
📎 references/docx_8_sections.md (11,835 bytes)
📎 references/framework_selection.md (9,235 bytes)
📎 references/search_budget_allocation.md (9,744 bytes)
📎 scripts/citation_tracker.py (10,264 bytes)
📎 scripts/cross_search_aggregator.py (9,910 bytes)
📎 scripts/framework_recommender.py (9,199 bytes)