biorxiv-database
Efficient database search tool for the bioRxiv preprint server. Use this skill when searching for life sciences preprints by keywords, authors, date ranges, or categories, retrieving paper metadata, downloading PDFs, or conducting literature reviews.
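Copy the matching command below and paste it into a terminal (Mac/Linux, first command) or PowerShell (Windows, second command). Download, extraction, and installation are fully automatic.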
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o biorxiv-database.zip https://jpskill.com/download/18347.zip && unzip -o biorxiv-database.zip && rm biorxiv-database.zip
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/18347.zip -OutFile "$d\biorxiv-database.zip"; Expand-Archive "$d\biorxiv-database.zip" -DestinationPath $d -Force; ri "$d\biorxiv-database.zip"
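After it finishes, restart Claude Code. The skill then triggers automatically when you make a related request in plain language, such as asking Claude to search bioRxiv for preprints.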
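💾 Manual download (if you prefer not to use the command line)
- 1. Press the blue button below to download biorxiv-database.zip
- 2. Double-click the ZIP file to extract it; this creates a biorxiv-database folder
- 3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
- 4. Restart Claude Code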
⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the skill.
🎯 What this Skill can do
The description below explains what this Skill does for you. It triggers automatically when you ask Claude for something in this area.
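📦 How to install (3 steps)
- 1. Press the "Download" button above to get the .skill file
- 2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
- 3. Put the extracted folder in .claude/skills/ under your home folder
  - · macOS / Linux: ~/.claude/skills/
  - · Windows: %USERPROFILE%\.claude\skills\
Restart Claude Code and you are done. You do not need to say "use this Skill..."; it is invoked automatically for related requests.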
- Last updated: 2026-05-18
- Retrieved: 2026-05-18
- Bundled files: 3
bioRxiv Database
Overview
This skill provides efficient Python-based tools for searching and retrieving preprints from the bioRxiv database. It enables comprehensive searches by keywords, authors, date ranges, and categories, returning structured JSON metadata that includes titles, abstracts, DOIs, and citation information. The skill also supports PDF downloads for full-text analysis.
When to Use This Skill
Use this skill when:
- Searching for recent preprints in specific research areas
- Tracking publications by particular authors
- Conducting systematic literature reviews
- Analyzing research trends over time periods
- Retrieving metadata for citation management
- Downloading preprint PDFs for analysis
- Filtering papers by bioRxiv subject categories
Core Search Capabilities
1. Keyword Search
Search for preprints containing specific keywords in titles, abstracts, or author lists.
Basic Usage:
python scripts/biorxiv_search.py \
--keywords "CRISPR" "gene editing" \
--start-date 2024-01-01 \
--end-date 2024-12-31 \
--output results.json
With Category Filter:
python scripts/biorxiv_search.py \
--keywords "neural networks" "deep learning" \
--days-back 180 \
--category neuroscience \
--output recent_neuroscience.json
Search Fields:
By default, keywords are searched in both title and abstract. Customize with --search-fields:
python scripts/biorxiv_search.py \
--keywords "AlphaFold" \
--search-fields title \
--days-back 365
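To make the effect of field selection concrete, here is a minimal client-side sketch that applies the same idea to records already loaded from a results file. The helper name `matches_keywords` and its case-insensitive, any-keyword substring rule are assumptions for illustration; the script's own matching logic may differ.

```python
def matches_keywords(paper, keywords, fields=("title", "abstract")):
    """Return True if any keyword occurs in any of the chosen fields.

    Hypothetical helper: case-insensitive substring matching is an
    assumption, not the script's documented behavior.
    """
    haystack = " ".join(str(paper.get(field, "")) for field in fields).lower()
    return any(keyword.lower() in haystack for keyword in keywords)


# Two made-up records in the shape of the script's result entries.
papers = [
    {"title": "AlphaFold predictions at scale", "abstract": "Protein structure."},
    {"title": "Protein folding survey", "abstract": "We compare AlphaFold and others."},
]

# Restricting the match to titles drops the second record.
title_only = [p for p in papers if matches_keywords(p, ["AlphaFold"], fields=("title",))]
both_fields = [p for p in papers if matches_keywords(p, ["alphafold"])]
print(len(title_only), len(both_fields))
```

Searching titles only trades recall for precision: a paper that mentions the keyword only in its abstract is skipped.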
2. Author Search
Find all papers by a specific author within a date range.
Basic Usage:
python scripts/biorxiv_search.py \
--author "Smith" \
--start-date 2023-01-01 \
--end-date 2024-12-31 \
--output smith_papers.json
Recent Publications:
# Last year by default if no dates specified
python scripts/biorxiv_search.py \
--author "Johnson" \
--output johnson_recent.json
3. Date Range Search
Retrieve all preprints posted within a specific date range.
Basic Usage:
python scripts/biorxiv_search.py \
--start-date 2024-01-01 \
--end-date 2024-01-31 \
--output january_2024.json
With Category Filter:
python scripts/biorxiv_search.py \
--start-date 2024-06-01 \
--end-date 2024-06-30 \
--category genomics \
--output genomics_june.json
Days Back Shortcut:
# Last 30 days
python scripts/biorxiv_search.py \
--days-back 30 \
--output last_month.json
4. Paper Details by DOI
Retrieve detailed metadata for a specific preprint.
Basic Usage:
python scripts/biorxiv_search.py \
--doi "10.1101/2024.01.15.123456" \
--output paper_details.json
Full DOI URLs Accepted:
python scripts/biorxiv_search.py \
--doi "https://doi.org/10.1101/2024.01.15.123456"
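If you need the bare DOI in your own code, for example to build file names, a resolver URL can be reduced with a few lines. This is a sketch only: `normalize_doi` is a hypothetical helper, and the source does not show how the script itself handles URLs.

```python
def normalize_doi(value):
    """Strip a leading DOI resolver prefix, if present, and return the bare DOI."""
    value = value.strip()
    # Prefixes handled here are an assumption; extend the tuple as needed.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if value.lower().startswith(prefix):
            return value[len(prefix):]
    return value


print(normalize_doi("https://doi.org/10.1101/2024.01.15.123456"))
print(normalize_doi("10.1101/2024.01.15.123456"))
```

Both calls print the same bare DOI, so downstream code can treat the two input forms identically.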
5. PDF Downloads
Download the full-text PDF of any preprint.
Basic Usage:
python scripts/biorxiv_search.py \
--doi "10.1101/2024.01.15.123456" \
--download-pdf paper.pdf
Batch Processing: For multiple PDFs, extract DOIs from a search result JSON and download each paper:
import json
import os
from biorxiv_search import BioRxivSearcher
# Load search results
with open('results.json') as f:
    data = json.load(f)
searcher = BioRxivSearcher(verbose=True)
# Create the output directory in case download_pdf does not create it
os.makedirs("papers", exist_ok=True)
# Download each paper
for i, paper in enumerate(data['results'][:10]):  # First 10 papers
    doi = paper['doi']
    searcher.download_pdf(doi, f"papers/paper_{i+1}.pdf")
Valid Categories
Filter searches by bioRxiv subject categories:
- animal-behavior-and-cognition
- biochemistry
- bioengineering
- bioinformatics
- biophysics
- cancer-biology
- cell-biology
- clinical-trials
- developmental-biology
- ecology
- epidemiology
- evolutionary-biology
- genetics
- genomics
- immunology
- microbiology
- molecular-biology
- neuroscience
- paleontology
- pathology
- pharmacology-and-toxicology
- physiology
- plant-biology
- scientific-communication-and-education
- synthetic-biology
- systems-biology
- zoology
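Because category names are hyphenated slugs, a typo is easy to make. The sketch below checks a name against the list above before you pass it to `--category`, using the standard library's `difflib` to suggest close matches. `check_category` is a hypothetical helper; whether the script validates categories itself is not stated in the source.

```python
import difflib

# Category slugs copied from the list above.
VALID_CATEGORIES = [
    "animal-behavior-and-cognition", "biochemistry", "bioengineering",
    "bioinformatics", "biophysics", "cancer-biology", "cell-biology",
    "clinical-trials", "developmental-biology", "ecology", "epidemiology",
    "evolutionary-biology", "genetics", "genomics", "immunology",
    "microbiology", "molecular-biology", "neuroscience", "paleontology",
    "pathology", "pharmacology-and-toxicology", "physiology", "plant-biology",
    "scientific-communication-and-education", "synthetic-biology",
    "systems-biology", "zoology",
]


def check_category(name):
    """Return the normalized slug, or raise ValueError with close suggestions."""
    normalized = name.strip().lower().replace(" ", "-")
    if normalized in VALID_CATEGORIES:
        return normalized
    suggestions = difflib.get_close_matches(normalized, VALID_CATEGORIES, n=3)
    raise ValueError(f"Unknown category {name!r}; close matches: {suggestions}")


print(check_category("Cancer Biology"))
```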
Output Format
All searches return structured JSON with the following format:
{
  "query": {
    "keywords": ["CRISPR"],
    "start_date": "2024-01-01",
    "end_date": "2024-12-31",
    "category": "genomics"
  },
  "result_count": 42,
  "results": [
    {
      "doi": "10.1101/2024.01.15.123456",
      "title": "Paper Title Here",
      "authors": "Smith J, Doe J, Johnson A",
      "author_corresponding": "Smith J",
      "author_corresponding_institution": "University Example",
      "date": "2024-01-15",
      "version": "1",
      "type": "new results",
      "license": "cc_by",
      "category": "genomics",
      "abstract": "Full abstract text...",
      "pdf_url": "https://www.biorxiv.org/content/10.1101/2024.01.15.123456v1.full.pdf",
      "html_url": "https://www.biorxiv.org/content/10.1101/2024.01.15.123456v1",
      "jatsxml": "https://www.biorxiv.org/content/...",
      "published": ""
    }
  ]
}
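Each record carries both `doi` and `version`, so if a results file happens to contain several versions of the same preprint, it can be reduced to one record per DOI. The sketch below keeps the highest version. The helper name `latest_versions` is hypothetical, and it assumes `version` is a string of digits, as in the sample above.

```python
def latest_versions(results):
    """Keep only the highest-numbered version of each DOI (hypothetical helper)."""
    latest = {}
    for paper in results:
        doi = paper["doi"]
        version = int(paper.get("version", "1"))
        if doi not in latest or version > int(latest[doi].get("version", "1")):
            latest[doi] = paper
    return list(latest.values())


# Made-up records using only the fields the helper needs.
sample = [
    {"doi": "10.1101/2024.01.15.123456", "version": "1"},
    {"doi": "10.1101/2024.01.15.123456", "version": "2"},
    {"doi": "10.1101/2024.02.20.789012", "version": "1"},
]
deduped = latest_versions(sample)
print([(p["doi"], p["version"]) for p in deduped])
```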
Common Usage Patterns
Literature Review Workflow
1. Broad keyword search:
python scripts/biorxiv_search.py \
--keywords "organoids" "tissue engineering" \
--start-date 2023-01-01 \
--end-date 2024-12-31 \
--category bioengineering \
--output organoid_papers.json
2. Extract and review results:
import json
with open('organoid_papers.json') as f:
    data = json.load(f)
print(f"Found {data['result_count']} papers")
for paper in data['results'][:5]:
    print(f"\nTitle: {paper['title']}")
    print(f"Authors: {paper['authors']}")
    print(f"Date: {paper['date']}")
    print(f"DOI: {paper['doi']}")
3. Download selected papers:
from biorxiv_search import BioRxivSearcher
searcher = BioRxivSearcher()
selected_dois = ["10.1101/2024.01.15.123456", "10.1101/2024.02.20.789012"]
for doi in selected_dois:
    filename = doi.replace("/", "_").replace(".", "_") + ".pdf"
    searcher.download_pdf(doi, f"papers/{filename}")
Trend Analysis
Track research trends by analyzing publication frequencies over time:
python scripts/biorxiv_search.py \
--keywords "machine learning" \
--start-date 2020-01-01 \
--end-date 2024-12-31 \
--category bioinformatics \
--output ml_trends.json
Then analyze the temporal distribution in the results.
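One simple way to look at the temporal distribution is to count papers per month from the `date` field. The sketch below assumes the output format documented above; `papers_per_month` is a hypothetical helper, and the inline sample stands in for the real `ml_trends.json`.

```python
import json
from collections import Counter


def papers_per_month(results):
    """Count papers by the YYYY-MM prefix of their 'date' field."""
    counts = Counter(paper["date"][:7] for paper in results if paper.get("date"))
    return dict(sorted(counts.items()))


# Inline sample in the documented shape. In practice, load the file instead:
#   with open("ml_trends.json") as f:
#       data = json.load(f)
data = json.loads(
    '{"results": [{"date": "2024-01-15"}, {"date": "2024-01-20"}, {"date": "2024-03-02"}]}'
)
monthly = papers_per_month(data["results"])
print(monthly)
```

Months with no papers are absent from the result rather than reported as zero, which matters if you plot the counts on a continuous time axis.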
Author Tracking
Monitor specific researchers' preprints:
# Track multiple authors (shell loop)
for author in Smith Johnson Williams; do
python scripts/biorxiv_search.py \
--author "$author" \
--days-back 365 \
--output "${author}_papers.json"
done
Python API Usage
For more complex workflows, import and use the BioRxivSearcher class directly:
from scripts.biorxiv_search import BioRxivSearcher
# Initialize
searcher = BioRxivSearcher(verbose=True)
# Multiple search operations
keywords_papers = searcher.search_by_keywords(
    keywords=["CRISPR", "gene editing"],
    start_date="2024-01-01",
    end_date="2024-12-31",
    category="genomics"
)
author_papers = searcher.search_by_author(
    author_name="Smith",
    start_date="2023-01-01",
    end_date="2024-12-31"
)
# Get specific paper details
paper = searcher.get_paper_details("10.1101/2024.01.15.123456")
# Download PDF
success = searcher.download_pdf(
    doi="10.1101/2024.01.15.123456",
    output_path="paper.pdf"
)
# Format results consistently
formatted = searcher.format_result(paper, include_abstract=True)
Best Practices
- Use appropriate date ranges: Smaller date ranges return faster. For keyword searches over long periods, consider splitting into multiple queries.
- Filter by category: When possible, use --category to reduce data transfer and improve search precision.
- Respect rate limits: The script includes automatic delays (0.5s between requests). For large-scale data collection, add additional delays.
- Cache results: Save search results to JSON files to avoid repeated API calls.
- Version tracking: Preprints can have multiple versions. The version field indicates which version is returned. PDF URLs include the version number.
- Handle errors gracefully: Check the result_count in the output JSON. Empty results may indicate date range issues or API connectivity problems.
- Verbose mode for debugging: Use the --verbose flag to see detailed logging of API requests and responses.
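The first practice above, splitting a long period into several queries, can be sketched as a small generator that yields consecutive date windows. The name `split_date_range` and the 30-day default are assumptions for illustration; pick a window size that suits your query volume.

```python
from datetime import date, timedelta


def split_date_range(start, end, max_days=30):
    """Yield (chunk_start, chunk_end) ISO date strings covering start..end inclusive."""
    current = date.fromisoformat(start)
    last = date.fromisoformat(end)
    while current <= last:
        # Each window spans at most max_days days, counting both endpoints.
        chunk_end = min(current + timedelta(days=max_days - 1), last)
        yield current.isoformat(), chunk_end.isoformat()
        current = chunk_end + timedelta(days=1)


chunks = list(split_date_range("2024-01-01", "2024-03-15", max_days=30))
for chunk_start, chunk_end in chunks:
    # Each pair can be passed to one CLI call as --start-date / --end-date.
    print(f"--start-date {chunk_start} --end-date {chunk_end}")
```

The windows do not overlap, so merging the per-window result files should not introduce duplicates from the splitting itself.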
Advanced Features
Custom Date Range Logic
import subprocess
from datetime import datetime, timedelta
# Last quarter
end_date = datetime.now()
start_date = end_date - timedelta(days=90)
# Pass the computed dates to the command-line script
subprocess.run([
    "python", "scripts/biorxiv_search.py",
    "--start-date", start_date.strftime('%Y-%m-%d'),
    "--end-date", end_date.strftime('%Y-%m-%d'),
], check=True)
Result Limiting
Limit the number of results returned:
python scripts/biorxiv_search.py \
--keywords "COVID-19" \
--days-back 30 \
--limit 50 \
--output covid_top50.json
Exclude Abstracts for Speed
When only metadata is needed:
# Note: Abstract inclusion is controlled in Python API
from scripts.biorxiv_search import BioRxivSearcher
searcher = BioRxivSearcher()
papers = searcher.search_by_keywords(keywords=["AI"], days_back=30)
formatted = [searcher.format_result(p, include_abstract=False) for p in papers]
Programmatic Integration
Integrate search results into downstream analysis pipelines:
import json
import pandas as pd
# Load results
with open('results.json') as f:
    data = json.load(f)
# Convert to DataFrame for analysis
df = pd.DataFrame(data['results'])
# Analyze
print(f"Total papers: {len(df)}")
print(f"Date range: {df['date'].min()} to {df['date'].max()}")
print(f"\nTop authors by paper count:")
print(df['authors'].str.split(',').explode().str.strip().value_counts().head(10))
# Filter and export
recent = df[df['date'] >= '2024-06-01']
recent.to_csv('recent_papers.csv', index=False)
Testing the Skill
To verify that the bioRxiv database skill is working correctly, run the comprehensive test suite.
Prerequisites:
uv pip install requests
Run tests:
python tests/test_biorxiv_search.py
The test suite validates:
- Initialization: BioRxivSearcher class instantiation
- Date Range Search: Retrieving papers within specific date ranges
- Category Filtering: Filtering papers by bioRxiv categories
- Keyword Search: Finding papers containing specific keywords
- DOI Lookup: Retrieving specific papers by DOI
- Result Formatting: Proper formatting of paper metadata
- Interval Search: Fetching recent papers by time intervals
Expected Output:
🧬 bioRxiv Database Search Skill Test Suite
======================================================================
🧪 Test 1: Initialization
✅ BioRxivSearcher initialized successfully
🧪 Test 2: Date Range Search
✅ Found 150 papers between 2024-01-01 and 2024-01-07
First paper: Novel CRISPR-based approach for genome editing...
[... additional tests ...]
======================================================================
📊 Test Summary
======================================================================
✅ PASS: Initialization
✅ PASS: Date Range Search
✅ PASS: Category Filtering
✅ PASS: Keyword Search
✅ PASS: DOI Lookup
✅ PASS: Result Formatting
✅ PASS: Interval Search
======================================================================
Results: 7/7 tests passed (100%)
======================================================================
🎉 All tests passed! The bioRxiv database skill is working correctly.
Note: Some tests may show warnings if no papers are found in specific date ranges or categories. This is normal and does not indicate a failure.
Reference Documentation
For detailed API specifications, endpoint documentation, and response schemas, refer to:
- references/api_reference.md: Complete bioRxiv API documentation
The reference file includes:
- Full API endpoint specifications
- Response format details
- Error handling patterns
- Rate limiting guidelines
- Advanced search patterns
Bundled files
Note: This lists the files included in the ZIP. In addition to `SKILL.md` itself, it may contain reference material, samples, and scripts.
- 📄 SKILL.md (12,525 bytes)
- 📎 references/api_reference.md (6,405 bytes)
- 📎 scripts/biorxiv_search.py (14,831 bytes)