paper-fetch

A Skill that downloads a paper's PDF from its DOI, title, or URL through legal open-access sources such as Unpaywall, and never resorts to illegal means such as Sci-Hub.

📜 Original English description (for reference)

Use when the user wants to download a paper PDF from a DOI, title, or URL via legal open-access sources. Tries Unpaywall, arXiv, bioRxiv/medRxiv, PubMed Central, and Semantic Scholar in order. Never uses Sci-Hub or paywall bypass.

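⚡ Recommended: install with a single command (about 60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads the archive, extracts it, and places the files automatically.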

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o paper-fetch.zip https://jpskill.com/download/10333.zip && unzip -o paper-fetch.zip && rm paper-fetch.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/10333.zip -OutFile "$d\paper-fetch.zip"; Expand-Archive "$d\paper-fetch.zip" -DestinationPath $d -Force; ri "$d\paper-fetch.zip"

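When the command finishes, restart Claude Code. After that, just ask in plain language, for example "download the PDF for this DOI", and the Skill is triggered automatically.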

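💾 Manual download (if you would rather not use the command line)

1. Click the download button on this page to get paper-fetch.zip
2. Double-click the ZIP file to extract it; this creates a paper-fetch folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code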

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.

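📦 Installing from the .skill file (3 steps)

1. Click the download button on this page to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Put the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. You do not need to say "use this Skill…"; it is invoked automatically for relevant requests.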

Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1

📜 Original SKILL.md (the text Claude reads)

paper-fetch

Fetch the legal open-access PDF for a paper given a DOI (or title). Tries multiple OA sources in priority order and stops at the first hit.

Resolution order

1. Unpaywall: https://api.unpaywall.org/v2/{doi}?email=$UNPAYWALL_EMAIL, read best_oa_location.url_for_pdf (skipped if UNPAYWALL_EMAIL is not set)
2. Semantic Scholar: https://api.semanticscholar.org/graph/v1/paper/DOI:{doi}?fields=openAccessPdf,externalIds
3. arXiv: if externalIds.ArXiv is present, https://arxiv.org/pdf/{arxiv_id}.pdf
4. PubMed Central OA: if a PMCID is present, https://www.ncbi.nlm.nih.gov/pmc/articles/{pmcid}/pdf/
5. bioRxiv / medRxiv: if the DOI prefix is 10.1101, query https://api.biorxiv.org/details/{server}/{doi} for the latest version's PDF URL
6. Otherwise → report failure with title/authors so the user can request the paper via ILL (interlibrary loan)

If only a title is given, resolve to a DOI first via Semantic Scholar search_paper_by_title (asta MCP) or Crossref.
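The resolution order above can be sketched as a handful of pure helpers that build the lookup URLs and pull a PDF link out of each response. This is a minimal illustration, not the code in scripts/fetch.py: the function names are hypothetical, and the `url` key inside `openAccessPdf` is an assumption about the Semantic Scholar response shape that the text above does not spell out.

```python
from typing import Optional


def candidate_lookups(doi: str, email: Optional[str] = None) -> list:
    """Return (source, API URL) pairs in the documented priority order."""
    lookups = []
    if email:  # Unpaywall is skipped when UNPAYWALL_EMAIL is not set
        lookups.append(
            ("unpaywall", f"https://api.unpaywall.org/v2/{doi}?email={email}")
        )
    lookups.append((
        "semantic_scholar",
        f"https://api.semanticscholar.org/graph/v1/paper/DOI:{doi}"
        "?fields=openAccessPdf,externalIds",
    ))
    return lookups


def pdf_from_unpaywall(payload: dict) -> Optional[str]:
    """Read best_oa_location.url_for_pdf, tolerating a null location."""
    location = payload.get("best_oa_location") or {}
    return location.get("url_for_pdf")


def pdf_from_semantic_scholar(payload: dict) -> Optional[str]:
    """Prefer the direct OA link, then fall back to an arXiv ID if present."""
    open_access = payload.get("openAccessPdf") or {}
    if open_access.get("url"):  # assumed key name, see lead-in
        return open_access["url"]
    arxiv_id = (payload.get("externalIds") or {}).get("ArXiv")
    if arxiv_id:
        return f"https://arxiv.org/pdf/{arxiv_id}.pdf"
    return None


def pmc_pdf_url(pmcid: str) -> str:
    """PubMed Central PDF URL for a known PMCID."""
    return f"https://www.ncbi.nlm.nih.gov/pmc/articles/{pmcid}/pdf/"


def biorxiv_lookups(doi: str) -> list:
    """bioRxiv/medRxiv detail URLs, only for DOIs with the 10.1101 prefix."""
    if not doi.startswith("10.1101/"):
        return []
    return [
        f"https://api.biorxiv.org/details/{server}/{doi}"
        for server in ("biorxiv", "medrxiv")
    ]
```

A caller would walk these in order and stop at the first helper that returns a URL, which mirrors the "stops at the first hit" behavior described at the top.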

Usage

python scripts/fetch.py (<DOI> | --batch FILE) [--out DIR] [--dry-run] [--format json|text]

Flags

Flag	Default	Description
`doi`	—	DOI to fetch (positional, e.g. `10.1038/s41586-020-2649-2`)
`--batch FILE`	—	File with one DOI per line for bulk download
`--out DIR`	`pdfs`	Output directory
`--dry-run`	off	Resolve sources without downloading; preview the PDF URL and filename
`--format`	`json`	Output format: `json` (for agents) or `text` (for humans)
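For readers who want to see how the flags in the table fit together, here is one way such a command line could be declared with argparse. It is a sketch under the assumption that the real script behaves as the table says; `build_parser` and `validate` are hypothetical names, and only the flag names, defaults, and the validation message come from this document.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags listed in the table, with the documented defaults."""
    parser = argparse.ArgumentParser(prog="fetch.py")
    parser.add_argument("doi", nargs="?", help="DOI to fetch (positional)")
    parser.add_argument("--batch", metavar="FILE",
                        help="file with one DOI per line")
    parser.add_argument("--out", metavar="DIR", default="pdfs",
                        help="output directory")
    parser.add_argument("--dry-run", action="store_true",
                        help="resolve sources without downloading")
    parser.add_argument("--format", choices=["json", "text"], default="json",
                        help="json for agents, text for humans")
    return parser


def validate(args: argparse.Namespace) -> Optional[dict]:
    """Return a top-level error envelope when no input was given, else None."""
    if not args.doi and not args.batch:
        return {
            "ok": False,
            "error": {
                "code": "validation_error",
                "message": "Provide a DOI or --batch file",
                "retryable": False,
            },
        }
    return None


args = build_parser().parse_args(["10.1038/s41586-020-2649-2", "--dry-run"])
print(args.out, args.dry_run, args.format)
```

Making the positional DOI optional (`nargs="?"`) is what lets `--batch FILE` be used on its own; the check for "neither given" then has to happen after parsing.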

Output contract

stdout emits a single JSON object (when --format json):

Success (all DOIs resolved):

{
  "ok": true,
  "data": {
    "results": [
      {
        "doi": "10.1038/s41586-020-2649-2",
        "success": true,
        "source": "unpaywall",
        "pdf_url": "https://...",
        "file": "pdfs/Author_2020_Title.pdf",
        "meta": {"title": "...", "year": 2020, "author": "Smith"}
      }
    ],
    "summary": {"total": 1, "succeeded": 1, "failed": 0}
  }
}

Partial failure (batch mode — some DOIs failed, exit code 1):

{
  "ok": true,
  "data": {
    "results": [
      {
        "doi": "10.1038/s41586-020-2649-2",
        "success": true,
        "source": "semantic_scholar",
        "pdf_url": "https://...",
        "file": "pdfs/Harris_2020_Array_programming_with_NumPy.pdf",
        "meta": {"title": "Array programming with NumPy", "year": 2020, "author": "Charles R. Harris"}
      },
      {
        "doi": "10.1234/nonexistent",
        "success": false,
        "source": null,
        "pdf_url": null,
        "file": null,
        "meta": {},
        "error": {"code": "not_found", "message": "No open-access PDF found", "retryable": false}
      }
    ],
    "summary": {"total": 2, "succeeded": 1, "failed": 1}
  }
}

Top-level failure (bad arguments, exit code 3):

{
  "ok": false,
  "error": {
    "code": "validation_error",
    "message": "Provide a DOI or --batch file",
    "retryable": false
  }
}

stderr carries human-readable progress diagnostics (source attempts, download status).
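A caller that consumes this contract only needs to parse stdout once and branch on `ok`. The sketch below shows one possible reader; `summarize` and the returned keys are hypothetical names chosen for illustration, and the sample payload is a trimmed copy of the partial-failure example above.

```python
import json


def summarize(stdout_text: str) -> dict:
    """Split a fetch.py JSON envelope into saved files and failed DOIs."""
    envelope = json.loads(stdout_text)
    if not envelope["ok"]:
        # Top-level failure: there is no "data" key, only "error".
        return {"fatal": envelope["error"]["code"], "files": [], "failed": []}
    results = envelope["data"]["results"]
    return {
        "fatal": None,
        "files": [r["file"] for r in results if r["success"]],
        "failed": [
            (r["doi"], r["error"]["code"]) for r in results if not r["success"]
        ],
    }


SAMPLE = """
{"ok": true, "data": {"results": [
  {"doi": "10.1038/s41586-020-2649-2", "success": true,
   "source": "semantic_scholar", "pdf_url": "https://...",
   "file": "pdfs/Harris_2020_Array_programming_with_NumPy.pdf", "meta": {}},
  {"doi": "10.1234/nonexistent", "success": false, "source": null,
   "pdf_url": null, "file": null, "meta": {},
   "error": {"code": "not_found", "message": "No open-access PDF found",
             "retryable": false}}
], "summary": {"total": 2, "succeeded": 1, "failed": 1}}}
"""

print(summarize(SAMPLE))
```

Note that `ok` is still true in the partial-failure case, so per-DOI status has to be read from each result's `success` field rather than from the envelope.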

Exit codes

Code	Meaning
`0`	All DOIs resolved successfully
`1`	Runtime error (some DOIs failed, network/download issues)
`3`	Validation error (bad arguments, missing input)

Error codes in JSON

Code	Meaning	Retryable
`validation_error`	Bad arguments or empty input	No
`not_found`	No open-access PDF found	No
`download_network_error`	Network failure during download	Yes
`download_not_a_pdf`	Response was not a PDF (HTML landing page)	No
`download_host_not_allowed`	PDF URL host not in allowlist	No
`download_size_exceeded`	Response exceeded 50 MB limit	No
`download_io_error`	Local filesystem write failed	No
`internal_error`	Unexpected error	No
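Since only `download_network_error` is marked retryable in the table, a wrapper can use the error payload to decide which DOIs are worth a second attempt. The following is a minimal sketch with hypothetical helper names; it prefers the `retryable` flag carried in each error object and falls back to the table only when that flag is missing.

```python
# Codes marked "Yes" in the Retryable column of the table above.
RETRYABLE_CODES = {"download_network_error"}


def dois_to_retry(results: list) -> list:
    """Collect DOIs whose failure is marked as retryable."""
    retry = []
    for result in results:
        error = result.get("error")
        if result.get("success") or not error:
            continue
        fallback = error.get("code") in RETRYABLE_CODES
        if error.get("retryable", fallback):
            retry.append(result["doi"])
    return retry


def batch_file_text(dois: list) -> str:
    """Render DOIs one per line, the layout --batch FILE expects."""
    return "".join(f"{doi}\n" for doi in dois)


results = [
    {"doi": "10.1/a", "success": True},
    {"doi": "10.1/b", "success": False,
     "error": {"code": "download_network_error", "retryable": True}},
    {"doi": "10.1/c", "success": False,
     "error": {"code": "not_found", "retryable": False}},
]
print(batch_file_text(dois_to_retry(results)), end="")
```

The rendered text can be written to a file and passed back through `--batch` for a later retry run.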

Examples

# Single DOI (JSON output for agents)
python scripts/fetch.py 10.1038/s41586-020-2649-2

# Dry-run preview (resolve without downloading)
python scripts/fetch.py 10.1038/s41586-020-2649-2 --dry-run

# Human-readable output
python scripts/fetch.py 10.1038/s41586-020-2649-2 --format text

# Batch download
python scripts/fetch.py --batch dois.txt --out ./papers

# Works without UNPAYWALL_EMAIL (skips Unpaywall, uses remaining 4 sources)
python scripts/fetch.py 10.1038/s41586-020-2649-2

Notes

- UNPAYWALL_EMAIL is optional but recommended. Set it once: export UNPAYWALL_EMAIL=you@example.com (e.g. in ~/.zshrc). Without it, Unpaywall is skipped and the remaining 4 sources are still tried.
- Downloads are restricted to a host allowlist of known OA providers, with a 50 MB size limit per PDF.
- Never attempts to bypass paywalls. If no OA copy exists, the skill reports failure; do not suggest Sci-Hub or similar.
- Default output directory: ./pdfs/. Filenames: {first_author}_{year}_{short_title}.pdf.
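The download guards and the filename pattern in these notes could look roughly like the sketch below. Everything specific here is an assumption made for illustration: the actual allowlist is not published in this document, so `ALLOWED_HOSTS` is a hypothetical sample, "50 MB" is read as 50 × 1024 × 1024 bytes, and the rules for shortening the title and picking the author's surname are guesses that happen to reproduce the example filename shown in the output contract.

```python
import re
from urllib.parse import urlparse

MAX_PDF_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" means 50 MiB

# Hypothetical sample only; the real allowlist is not listed in this document.
ALLOWED_HOSTS = {"arxiv.org", "www.ncbi.nlm.nih.gov", "www.biorxiv.org"}


def host_allowed(url: str) -> bool:
    """Accept a PDF URL only if its host is on the allowlist."""
    host = (urlparse(url).hostname or "").lower()
    return host in ALLOWED_HOSTS


def within_size_limit(num_bytes: int) -> bool:
    """Check a byte count against the per-PDF size limit."""
    return num_bytes <= MAX_PDF_BYTES


def build_filename(meta: dict) -> str:
    """Build {first_author}_{year}_{short_title}.pdf from result metadata."""
    name_parts = (meta.get("author") or "").split()
    author = name_parts[-1] if name_parts else "Unknown"  # last token as surname
    year = meta.get("year") or "nd"
    # Collapse anything that is not a letter or digit into single underscores.
    title = re.sub(r"[^A-Za-z0-9]+", "_", meta.get("title") or "untitled")
    return f"{author}_{year}_{title.strip('_')[:60]}.pdf"


print(build_filename({
    "title": "Array programming with NumPy",
    "year": 2020,
    "author": "Charles R. Harris",
}))
```

Checking the host before any bytes are fetched, and the size while streaming, keeps both guards independent of which source produced the URL.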

Auto-update

When installed via git clone, the skill keeps itself in sync with upstream automatically. On each invocation, fetch.py spawns a detached background git pull --ff-only in the skill directory:

- Non-blocking: the current invocation is not delayed; the pull runs in a new session and is fully detached
- Silent: all output goes to /dev/null, so the JSON contract on stdout is never polluted
- Throttled: at most once every 24 hours (stamped via .git/.paper-fetch-last-update)
- Safe: --ff-only refuses to merge if you have local edits; conflicts never happen
- Convergence: updates apply on the next invocation, not the current one (because the pull is backgrounded)

Environment variables

Variable	Default	Purpose
`PAPER_FETCH_NO_AUTO_UPDATE`	unset	Set to any value to completely disable auto-update
`PAPER_FETCH_UPDATE_INTERVAL`	`86400`	Cooldown in seconds between update attempts

Auto-update is a no-op when the skill is not a git checkout (e.g. tarball install), when git is unavailable, or when the cooldown stamp is fresh. Force an immediate check with rm <skill_dir>/.git/.paper-fetch-last-update.
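The throttling and detachment described in this section can be illustrated with a short sketch. It is not the code from fetch.py: the function names are hypothetical, the stamp's modification time is assumed to record the last attempt, and `start_new_session=True` is the POSIX way to detach a child process (Windows would need a different mechanism).

```python
import os
import shutil
import subprocess
import time
from pathlib import Path

DEFAULT_INTERVAL = 86400  # seconds, the documented default cooldown


def update_due(skill_dir: Path, env=os.environ, now=None) -> bool:
    """Decide whether a background pull should be attempted right now."""
    if "PAPER_FETCH_NO_AUTO_UPDATE" in env:  # any value disables updates
        return False
    git_dir = skill_dir / ".git"
    if not git_dir.is_dir():  # not a git checkout, e.g. a tarball install
        return False
    interval = int(env.get("PAPER_FETCH_UPDATE_INTERVAL", DEFAULT_INTERVAL))
    stamp = git_dir / ".paper-fetch-last-update"
    if not stamp.exists():
        return True
    current = time.time() if now is None else now
    return current - stamp.stat().st_mtime >= interval


def spawn_update(skill_dir: Path) -> None:
    """Start a detached, silent `git pull --ff-only` and refresh the stamp."""
    if shutil.which("git") is None:  # git unavailable: do nothing
        return
    (skill_dir / ".git" / ".paper-fetch-last-update").touch()
    subprocess.Popen(
        ["git", "pull", "--ff-only"],
        cwd=skill_dir,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,  # keeps the stdout JSON contract clean
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: detach from the caller's session
    )
```

Touching the stamp before spawning means a failed pull still counts as an attempt, which matches the table's wording of a cooldown "between update attempts"; deleting the stamp makes `update_due` return true again on the next run.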