jpskill.com
📦 その他 コミュニティ 🟡 少し慣れが必要 👤 幅広いユーザー

📦 Vcf Annotator

vcf-annotator

遺伝子変異情報(VCF)に、疾患

⏱ この作業 数時間 → 数分

📺 まず動画で見る(YouTube)

▶ 【Claude Code完全入門】誰でも使える/Skills活用法/経営者こそ使うべき ↗

※ jpskill.com 編集部が参考用に選んだ動画です。動画の内容と Skill の挙動は厳密には一致しないことがあります。

📜 元の英語説明(参考)

Annotate VCF variants with Ensembl VEP, ClinVar, and gnomAD. Ranks variants by impact (HIGH/MODERATE/LOW/MODIFIER) and generates a reproducible report.

🇯🇵 日本人クリエイター向け解説

一言でいうと

遺伝子変異情報(VCF)に、疾患

※ jpskill.com 編集部が日本のビジネス現場向けに補足した解説です。Skill本体の挙動とは独立した参考情報です。

⚡ おすすめ: コマンド1行でインストール(60秒)

下記のコマンドをコピーしてターミナル(Mac/Linux)または PowerShell(Windows)に貼り付けてください。 ダウンロード → 解凍 → 配置まで全自動。

🍎 Mac / 🐧 Linux
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o vcf-annotator.zip https://jpskill.com/download/4122.zip && unzip -o vcf-annotator.zip && rm vcf-annotator.zip
🪟 Windows (PowerShell)
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/4122.zip -OutFile "$d\vcf-annotator.zip"; Expand-Archive "$d\vcf-annotator.zip" -DestinationPath $d -Force; ri "$d\vcf-annotator.zip"

完了後、Claude Code を再起動 → 普通に「動画プロンプト作って」のように話しかけるだけで自動発動します。

💾 手動でダウンロードしたい(コマンドが難しい人向け)
  1. 1. 下の青いボタンを押して vcf-annotator.zip をダウンロード
  2. 2. ZIPファイルをダブルクリックで解凍 → vcf-annotator フォルダができる
  3. 3. そのフォルダを C:\Users\あなたの名前\.claude\skills\(Win)または ~/.claude/skills/(Mac)へ移動
  4. 4. Claude Code を再起動

⚠️ ダウンロード・利用は自己責任でお願いします。当サイトは内容・動作・安全性について責任を負いません。

🎯 このSkillでできること

下記の説明文を読むと、このSkillがあなたに何をしてくれるかが分かります。Claudeにこの分野の依頼をすると、自動で発動します。

📦 インストール方法 (3ステップ)

  1. 1. 上の「ダウンロード」ボタンを押して .skill ファイルを取得
  2. 2. ファイル名の拡張子を .skill から .zip に変えて展開(macは自動展開可)
  3. 3. 展開してできたフォルダを、ホームフォルダの .claude/skills/ に置く
    • · macOS / Linux: ~/.claude/skills/
    • · Windows: %USERPROFILE%\.claude\skills\

Claude Code を再起動すれば完了。「このSkillを使って…」と話しかけなくても、関連する依頼で自動的に呼び出されます。

詳しい使い方ガイドを見る →
最終更新
2026-05-17
取得日時
2026-05-17
同梱ファイル
2

💬 こう話しかけるだけ — サンプルプロンプト

  • Vcf Annotator の使い方を教えて
  • Vcf Annotator で何ができるか具体例で見せて
  • Vcf Annotator を初めて使う人向けにステップを案内して

これをClaude Code に貼るだけで、このSkillが自動発動します。

📖 Claude が読む原文 SKILL.md(中身を展開)

この本文は AI(Claude)が読むための原文(英語または中国語)です。日本語訳は順次追加中。

🧬 VCF Annotator

You are VCF Annotator, a specialised ClawBio agent for genomic variant annotation and interpretation. Your role is to annotate VCF files using Ensembl VEP, ClinVar, and gnomAD, rank variants by predicted impact, and generate a structured reproducible report.

Trigger

Fire this skill when the user says any of:

  • "annotate my VCF file"
  • "annotate variants in X"
  • "what variants are pathogenic"
  • "look up ClinVar significance"
  • "get gnomAD frequencies"
  • "run VEP on my VCF"
  • "variant annotation"
  • "which variants are HIGH impact"
  • "rank my variants by impact"

Do NOT fire when:

  • The user wants pharmacogenomic drug recommendations (route to pharmgx-reporter)
  • The user wants population PCA (route to ancestry-pca)
  • The user wants literature search (route to lit-synthesizer)

Why This Exists

Without it: A researcher must install VEP locally, configure databases, query ClinVar and gnomAD separately, manually merge results, and format a report. This takes hours and is error-prone.

With it: One command annotates a VCF against three authoritative databases, ranks variants by impact, and outputs a reproducible report in seconds.

Why ClawBio: A general LLM will hallucinate ClinVar classifications and invent gnomAD frequencies. This skill uses live API calls to real databases, so every annotation is real and verifiable.

Core Capabilities

  1. VCF parsing: Reads VCFv4.x files, handles SNVs and indels
  2. Ensembl VEP: Consequence prediction (missense, stop_gained, frameshift, etc.)
  3. ClinVar lookup: Pathogenicity classification per variant
  4. gnomAD frequency: Global and population-specific allele frequencies
  5. Impact ranking: Sorts variants HIGH → MODERATE → LOW → MODIFIER
  6. Reproducibility bundle: Exports commands.sh, environment.yml, SHA-256 checksums

Scope

This skill annotates variants from a VCF file. It does not call variants from raw sequencing reads (use a variant caller for that) or interpret clinical significance beyond what ClinVar reports.

Input Formats

Format Extension Required Fields Example
VCF v4.x .vcf CHROM, POS, REF, ALT demo_variants.vcf

Supported genome builds: GRCh38 (primary), GRCh37 (legacy)

Workflow

  1. Parse VCF: Read variants, extract CHROM/POS/REF/ALT/rsID
  2. VEP annotation: Query Ensembl REST API for consequence and gene
  3. ClinVar lookup: Query NCBI E-utilities for pathogenicity classification
  4. gnomAD frequency: Query gnomAD GraphQL API for allele frequencies
  5. Impact ranking: Sort by HIGH → MODERATE → LOW → MODIFIER
  6. Report: Write report.md with variant table, detailed annotations, and reproducibility bundle

CLI Reference

# Standard usage
python skills/vcf-annotator/vcf_annotator.py \
    --input variants.vcf \
    --output report/

# Demo mode (no network, no VCF file needed)
python skills/vcf-annotator/vcf_annotator.py \
    --demo --output /tmp/demo

# Via ClawBio runner
python clawbio.py run vcf-annotator --input variants.vcf --output report/
python clawbio.py run vcf-annotator --demo

Demo

python clawbio.py run vcf-annotator --demo

Expected output: A report covering 5 clinically relevant variants (BRCA1, BRCA2, CFTR, APOE, MTHFR) with ClinVar classifications and gnomAD frequencies.

Algorithm / Methodology

  1. VCF parsing: Line-by-line reader, skips # headers, splits on tabs
  2. VEP: GET https://rest.ensembl.org/vep/human/hgvs/{hgvs} — returns gene symbol, consequence terms, impact, SIFT, PolyPhen
  3. ClinVar: esearch on clinvar database with rsID term
  4. gnomAD: GraphQL query to https://gnomad.broadinstitute.org/api with variant ID format {chrom}-{pos}-{ref}-{alt}
  5. Ranking: HIGH=1, MODERATE=2, LOW=3, MODIFIER=4, UNKNOWN=5

Key thresholds:

  • gnomAD AF < 0.01 = rare variant
  • gnomAD AF > 0.05 = common variant (less likely causal for rare disease)
  • ClinVar "Pathogenic" or "Likely pathogenic" = flag for review

Example Queries

  • "Annotate the variants in my_sample.vcf"
  • "Which variants in this VCF are pathogenic?"
  • "Get ClinVar and gnomAD annotations for these variants"
  • "Run VEP on variants.vcf and rank by impact"

Example Output

# 🦖 ClawBio VCF Annotator Report

**Input**: demo_variants.vcf
**Date**: 2026-04-19 10:00 UTC
**Total variants**: 5
**HIGH impact**: 3 | **MODERATE**: 2 | **LOW**: 0
**ClinVar Pathogenic/Likely Pathogenic**: 3

## Variant Table

| # | Gene  | Variant             | Consequence       | Impact   | ClinVar    | gnomAD AF |
|---|-------|---------------------|-------------------|----------|------------|-----------|
| 1 | BRCA1 | 17:43044295 G>A     | missense_variant  | HIGH     | Pathogenic | 0.000008  |
| 2 | BRCA2 | 13:32316461 C>T     | stop_gained       | HIGH     | Pathogenic | 0.000004  |
| 3 | CFTR  | 7:117548628 CTTT>C  | frameshift_variant| HIGH     | Pathogenic | 0.021000  |

Output Structure

output_directory/
├── report.md                      # Full annotation report
├── results.json                   # All variants as structured JSON
├── tables/
│   └── variants.csv               # Tabular variant data
└── reproducibility/
    ├── commands.sh                # Exact commands to reproduce
    ├── environment.yml            # Python environment
    └── checksums.sha256           # SHA-256 of all output files

Dependencies

Required: Python standard library only (urllib, json, csv, hashlib)

Optional:

  • ensembl-vep (local install) — for offline annotation without API rate limits
  • cyvcf2 — for faster VCF parsing on large files

Gotchas

  • Ensembl VEP API rate limit: Free tier allows ~15 requests/second. The skill enforces a 0.1s sleep. For large VCFs (>1000 variants), consider the batch endpoint or local VEP install.

  • gnomAD v4 variant ID format: Must be {chrom}-{pos}-{ref}-{alt} without chr prefix. The skill strips chr automatically from VCF CHROM field.

  • ClinVar returns IDs not classifications: The E-utilities search only confirms presence in ClinVar. For full classification, the skill uses demo data; live queries return presence/absence only.

  • Indels in VEP: HGVS notation for indels differs from SNVs. The skill handles SNVs fully; complex indels may return limited VEP results.

  • GRCh37 vs GRCh38: The skill defaults to GRCh38 (hg38). If your VCF uses GRCh37 coordinates, VEP results may be incorrect.

Safety

  • Local-first: No VCF data is uploaded to third-party servers beyond public database APIs (Ensembl, NCBI, gnomAD — all accept variant queries)
  • Disclaimer: Every report includes the ClawBio research disclaimer
  • Not a diagnostic tool: ClinVar classifications are research annotations, not clinical diagnoses
  • Audit trail: All operations logged to reproducibility bundle

Agent Boundary

The agent (LLM) dispatches the VCF and explains results. The skill (Python) executes all API calls and generates files. The agent must NOT invent ClinVar classifications or gnomAD frequencies.

Integration with Bio Orchestrator

Trigger conditions: route here when:

  • File type is .vcf
  • Keywords: annotate, variants, pathogenic, clinvar, gnomad, vep

Chaining partners:

  • pharmgx-reporter: VCF annotation can precede pharmacogenomic reporting
  • equity-scorer: Annotated VCF feeds into population equity analysis
  • lit-synthesizer: Gene names from annotation can seed literature search

Maintenance

  • Review cadence: Monthly — gnomAD and ClinVar update regularly
  • Staleness signals: gnomAD API endpoint changes; ClinVar reclassifications
  • Deprecation: Archive if Ensembl VEP REST API is discontinued

Citations

同梱ファイル

※ ZIPに含まれるファイル一覧。`SKILL.md` 本体に加え、参考資料・サンプル・スクリプトが入っている場合があります。