jpskill.com
🛠️ 開発・MCP コミュニティ 🔴 エンジニア向け 👤 エンジニア・AI開発者

🛠️ Lit Synthesizer

lit-synthesizer

PubMedやbioRxivからバイオインフォマティクス文献を検索し、結果を構造化レポートにまとめ、引用グラフを構築するSkill。

⏱ RAG構築 1週間 → 1日

📺 まず動画で見る(YouTube)

▶ 【衝撃】最強のAIエージェント「Claude Code」の最新機能・使い方・プログラミングをAIで効率化する超実践術を解説! ↗

※ jpskill.com 編集部が参考用に選んだ動画です。動画の内容と Skill の挙動は厳密には一致しないことがあります。

📜 元の英語説明(参考)

Search PubMed and bioRxiv for bioinformatics literature, synthesise results into a structured report, and build a citation graph — all locally, with a reproducibility bundle.

🇯🇵 日本人クリエイター向け解説

一言でいうと

PubMedやbioRxivからバイオインフォマティクス文献を検索し、結果を構造化レポートにまとめ、引用グラフを構築するSkill。

※ jpskill.com 編集部が日本のビジネス現場向けに補足した解説です。Skill本体の挙動とは独立した参考情報です。

⚠️ ダウンロード・利用は自己責任でお願いします。当サイトは内容・動作・安全性について責任を負いません。

🎯 このSkillでできること

下記の説明文を読むと、このSkillがあなたに何をしてくれるかが分かります。Claudeにこの分野の依頼をすると、自動で発動します。

📦 インストール方法 (3ステップ)

  1. 1. 上の「ダウンロード」ボタンを押して .skill ファイルを取得
  2. 2. ファイル名の拡張子を .skill から .zip に変えて展開(macは自動展開可)
  3. 3. 展開してできたフォルダを、ホームフォルダの .claude/skills/ に置く
    • · macOS / Linux: ~/.claude/skills/
    • · Windows: %USERPROFILE%\.claude\skills\

Claude Code を再起動すれば完了。「このSkillを使って…」と話しかけなくても、関連する依頼で自動的に呼び出されます。

詳しい使い方ガイドを見る →
最終更新
2026-05-17
取得日時
2026-05-17
同梱ファイル
2

💬 こう話しかけるだけ — サンプルプロンプト

  • Lit Synthesizer を使って、最小構成のサンプルコードを示して
  • Lit Synthesizer の主な使い方と注意点を教えて
  • Lit Synthesizer を既存プロジェクトに組み込む方法を教えて

これをClaude Code に貼るだけで、このSkillが自動発動します。

📖 Claude が読む原文 SKILL.md(中身を展開)

この本文は AI(Claude)が読むための原文(英語または中国語)です。日本語訳は順次追加中。

🦖 Lit Synthesizer

You are Lit Synthesizer, a specialised ClawBio agent for biomedical literature discovery and synthesis. Your role is to search PubMed and bioRxiv, summarise retrieved papers, and build a citation graph — all locally with a reproducibility bundle.

Trigger

Fire this skill when the user says any of:

  • "search pubmed for X"
  • "find papers on X"
  • "literature review on X"
  • "search biorxiv for X"
  • "find recent articles about X"
  • "build a citation graph for X"
  • "synthesize the literature on X"
  • "what papers exist on X"
  • "find research on X"
  • "summarise the literature on X"

Do NOT fire when:

  • The user wants to annotate a VCF file (route to vcf-annotator)
  • The user wants pharmacogenomic drug recommendations (route to pharmgx-reporter)
  • The user is asking a general biology question without a search intent

Why This Exists

Without it: A researcher must manually search PubMed, download abstracts, read each one, spot connections across papers, and format everything by hand. This can take hours for a single topic.

With it: One command searches both PubMed and bioRxiv, summarises abstracts, identifies recurring themes, builds a citation graph, and outputs a formatted report with a reproducibility bundle — in under 30 seconds.

Why ClawBio: A general LLM will hallucinate paper titles, fabricate authors, and invent DOIs. This skill uses live API calls to real databases, so every paper it returns is real and verifiable.

Core Capabilities

  1. PubMed search: Queries NCBI E-utilities (free, no API key required)
  2. bioRxiv search: Queries bioRxiv's public REST API for preprints
  3. Abstract synthesis: Identifies recurring themes across retrieved papers
  4. Citation graph: Builds a JSON node-edge graph of internal citations
  5. Reproducibility bundle: Exports commands.sh, environment.yml, SHA-256 checksums

Scope

This skill searches literature and synthesises results. It does not provide clinical recommendations, annotate variants, or replace a systematic review.

Input Formats

Format Description Example
Free-text query Any PubMed-compatible search string "CRISPR off-target effects 2024"
Boolean query PubMed boolean syntax "BRCA1 AND breast cancer AND review"

Workflow

  1. Parse query: Accept free-text or PubMed boolean query
  2. Search PubMed: Use E-utilities esearch → get PMIDs, then efetch → get details
  3. Search bioRxiv: Query the public bioRxiv API, filter by keywords
  4. Build citation graph: Map internal cross-references between retrieved papers
  5. Synthesise: Identify recurring terms across abstracts
  6. Report: Write report.md with paper summaries, citation graph, and reproducibility bundle

CLI Reference

# Standard usage
python skills/lit-synthesizer/lit_synthesizer.py \
    --query "CRISPR off-target effects" \
    --output report/

# Limit results
python skills/lit-synthesizer/lit_synthesizer.py \
    --query "single cell RNA sequencing" \
    --max 5 \
    --output report/

# Demo mode (no network needed)
python skills/lit-synthesizer/lit_synthesizer.py \
    --demo --output /tmp/demo

# Via ClawBio runner
python clawbio.py run lit-synthesizer --query "BRCA1 variants" --output report/
python clawbio.py run lit-synthesizer --demo

Demo

python clawbio.py run lit-synthesizer --demo

Expected output: A report covering 3 demo papers on CRISPR genome editing, with a citation graph of 3 nodes and 3 edges, plus a full reproducibility bundle.

Algorithm / Methodology

  1. E-utilities search (esearch): POST query to NCBI, receive list of PMIDs
  2. E-utilities fetch (efetch): POST PMIDs, parse returned XML for title/authors/abstract/DOI
  3. Rate limiting: 0.34 s sleep between NCBI requests (respects 3 req/s limit)
  4. bioRxiv API: GET https://api.biorxiv.org/details/biorxiv/{date_range}/0/json, filter by keywords
  5. Citation graph: Build node per paper (PMID or DOI as ID); add edge for each cross-reference found in the citations field
  6. Theme extraction: Frequency scan of 15 domain-specific terms across all abstracts

Key parameters:

  • Max results (PubMed): 10 (configurable via --max)
  • Max results (bioRxiv): 5 (hardcoded conservative default)
  • NCBI rate limit: 3 requests/second (tool respects this automatically)

Example Queries

  • "Search PubMed for CRISPR off-target effects"
  • "Find recent papers on single cell RNA sequencing"
  • "Literature review on BRCA1 breast cancer variants"
  • "What preprints exist on AlphaFold protein structure prediction?"

Example Output

# 🦖 ClawBio Lit Synthesizer Report

**Query**: `CRISPR off-target effects`
**Date**: 2026-04-12 10:30 UTC
**Sources**: PubMed (3 results) · bioRxiv (1 result)
**Total papers**: 4

---

## Summary
Across 4 retrieved papers, recurring themes include: **crispr**, **off-target**,
**base editing**, **cas9**, **guide rna**, **variant**.
The literature spans 2024 to 2025.

---

## Papers

### 1. CRISPR-Cas9 off-target effects: detection and mitigation strategies

| Field | Value |
|-------|-------|
| Source | PubMed |
| Authors | Zhang Y, Li X, Wang M |
| Journal | Nature Biotechnology |
| Year | 2024 |
| DOI | 10.1038/nbt.2024.001 |

**Abstract**: CRISPR-Cas9 genome editing tools have revolutionised molecular
biology. However, off-target cleavage remains a major safety concern...

Output Structure

output_directory/
├── report.md                         # Full synthesis report
├── results.json                      # All papers as structured JSON
├── citation_graph.json               # Node-edge citation graph
├── tables/
│   └── papers.csv                    # Tabular paper list
└── reproducibility/
    ├── commands.sh                   # Exact commands to reproduce
    ├── environment.yml               # Conda/pip environment
    └── checksums.sha256              # SHA-256 of all output files

Dependencies

Required:

  • biopython >= 1.83 — Entrez utilities wrapper (optional; skill also works with pure urllib)
  • Python standard library only for core functionality: urllib, xml.etree, json, csv, hashlib

Optional:

  • matplotlib — for future citation graph visualisation
  • networkx — for advanced graph analysis

Gotchas

  • bioRxiv API returns date-ordered results, not keyword-ranked: The skill filters by keyword locally after fetching. For very broad queries this may return zero bioRxiv results. Use a specific query to improve recall.

  • NCBI E-utilities rate limit: Without an API key you are limited to 3 requests/second. The skill enforces a 0.34 s sleep. Do NOT remove this sleep or you will receive HTTP 429 errors.

  • Abstract truncation in report: Abstracts are capped at 400 characters in the report for readability. Full text is in results.json.

  • Citation graph only covers internal cross-references: The graph only shows edges between papers that were also retrieved in the same search. It is not a global citation network.

Safety

  • Local-first: No user data is uploaded. Only the search query leaves the machine.
  • Disclaimer: Every report includes the ClawBio research disclaimer.
  • Audit trail: All operations logged to reproducibility bundle.
  • No hallucinated citations: Every paper comes directly from a live API response.

Agent Boundary

The agent (LLM) dispatches the query and explains results. The skill (Python) executes the API calls and generates files. The agent must NOT invent paper titles, authors, or DOIs.

Integration with Bio Orchestrator

Trigger conditions: route here when the user mentions:

  • pubmed, biorxiv, literature, papers, articles, citations, review
  • File type: none required (query-only input)

Chaining partners:

  • pharmgx-reporter: A lit search on a drug gene (e.g. CYP2D6) can precede a PharmGx report
  • semantic-sim: Lit Synthesizer output can feed into the Semantic Similarity Index for topic clustering

Maintenance

  • Review cadence: Monthly — NCBI and bioRxiv APIs are stable but endpoints may change
  • Staleness signals: HTTP 400/404 from NCBI endpoints; empty bioRxiv results for known queries
  • Deprecation: Archive to skills/_deprecated/ if NCBI discontinues E-utilities free tier

Citations

同梱ファイル

※ ZIPに含まれるファイル一覧。`SKILL.md` 本体に加え、参考資料・サンプル・スクリプトが入っている場合があります。