🛠️ Development / MCP · Community

flyscrape

flyscrape is a standalone CLI scraper that uses jQuery-like selectors to extract data from websites, crawl pages, and download files. This Skill uses it to carry out web scraping efficiently.

📜 Original English description (for reference)

Write and run web scraping scripts with `flyscrape`, the standalone CLI scraper with jQuery-like selectors. Use this skill when the user asks to scrape a site, extract data from HTML, follow pagination, crawl multiple pages, download files, use browser mode, configure selectors, or mentions `flyscrape`, a scraping script, nested scraping, or depth-controlled crawling.

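🇯🇵 Notes for Japanese creators

In a nutshell

Note: This commentary was added by the jpskill.com editorial team for Japanese business settings. It is reference information, independent of how the Skill itself behaves.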

⚡ Recommended: install with a one-line command (about 60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). Download, extraction, and placement are fully automatic.

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o flyscrape.zip https://jpskill.com/download/8674.zip && unzip -o flyscrape.zip && rm flyscrape.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/8674.zip -OutFile "$d\flyscrape.zip"; Expand-Archive "$d\flyscrape.zip" -DestinationPath $d -Force; ri "$d\flyscrape.zip"

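When it finishes, restart Claude Code. The Skill then activates automatically when you ask in plain language, for example "Scrape the article titles from this site."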

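💾 Prefer a manual download (if the command line feels difficult)

1. Press the blue button below to download flyscrape.zip
2. Double-click the ZIP file to extract it; this creates a flyscrape folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code

⬇ Download as .zip (recommended) ⬇ .skill format (for advanced users) Original source ↗

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.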

🎯 What this Skill can do

The description below shows what this Skill does for you. When you ask Claude for something in this area, the Skill activates automatically.

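📦 How to install (3 steps)

1. Press the "Download" button above to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Put the extracted folder in .claude/skills/ under your home folder
   - macOS / Linux: ~/.claude/skills/
   - Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. The Skill is invoked automatically for related requests, even if you never say "use this Skill…".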


Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1

📜 Skill body: original SKILL.md (the text Claude reads)

Flyscrape

Flyscrape is a command-line web scraping tool driven by JavaScript scraping scripts. It is standalone (a single binary), supports jQuery-like selectors, and can render JavaScript-heavy pages via a headless browser.

Quick Reference

flyscrape new script.js    # Create new script from template
flyscrape dev script.js    # Dev mode: watch & re-run on changes (cached)
flyscrape run script.js    # Run the scraper
flyscrape run script.js --url "http://example.com" --depth 3  # Override config via CLI

Script Structure

Every script has two parts: config (controls behavior) and default function (extracts data).

export const config = {
  url: "https://example.com",
  // See references/config.md for all options
};

export default function({ doc, url, absoluteURL, scrape, follow }) {
  // doc - parsed HTML document with jQuery-like API
  // url - the current page URL
  // absoluteURL(path) - converts relative URLs to absolute
  // scrape(url, fn) - nested scraping of linked pages
  // follow(url) - manually follow a link (use with follow: [])

  return {
    title: doc.find("h1").text(),
    // Return object becomes JSON output
  };
}

Essential Config Options

Option	Default	Description
`url`	-	Starting URL
`urls`	`[]`	Multiple starting URLs
`depth`	`0`	How deep to follow links (0 = no following)
`follow`	`["a[href]"]`	CSS selectors for links to follow
`browser`	`false`	Enable headless Chromium for JS-heavy sites
`cache`	-	Set to `"file"` to cache requests
`rate`	-	Requests per minute limit
`concurrency`	-	Max concurrent requests

See references/config.md for complete configuration reference.
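
As a sketch of how several options from the table combine, the fragment below starts from two listing pages and follows only links matching a narrow selector, one level deep. The URLs and the `.article-list a` selector are placeholders, not taken from a real site.

```javascript
// Config fragment for a flyscrape script; URLs and selector are hypothetical.
export const config = {
  urls: [
    "https://example.com/news",
    "https://example.com/blog",
  ],
  depth: 1,                     // follow links one level from each start URL
  follow: [".article-list a"],  // only links inside the (assumed) article list
  cache: "file",                // reuse responses between runs while developing
};
```

Narrowing `follow` away from its default of `["a[href]"]` keeps the crawl from wandering into navigation and footer links.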

Query API (jQuery-like)

const el = doc.find(".selector");  // Find element(s)
el.text()                          // Get text content
el.html()                          // Get inner HTML
el.attr("href")                    // Get attribute
el.hasAttr("data-id")              // Check attribute exists
el.hasClass("active")              // Check class exists

// Collections
const items = doc.find("li");
items.length()                     // Count
items.first()                      // First element
items.last()                       // Last element
items.get(0)                       // Element by index
items.map(el => el.text())         // Map to array
items.filter(el => el.hasClass("x")) // Filter elements

// Traversal
el.parent()                        // Parent element
el.children()                      // Direct children
el.siblings()                      // Sibling elements
el.prev()                          // Previous sibling
el.next()                          // Next sibling
el.prevAll()                       // All preceding siblings
el.nextAll()                       // All following siblings
el.prevUntil("selector")           // Preceding siblings up to the selector match

See references/query-api.md for full API reference.
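
To show how these calls fit together, here is a minimal sketch that builds one record per list item. It uses only `find`, `map`, `text`, `attr`, and `hasAttr` from the list above. The helper name `extractItems` and the selectors `.item`, `.name`, and `a` are illustrative assumptions, not part of flyscrape.

```javascript
// Builds one record per list item using only Query API calls listed above.
// `extractItems` and the selectors are hypothetical; adapt them to the target page.
function extractItems(doc) {
  return doc.find(".item").map((item) => {
    const link = item.find("a");
    return {
      name: item.find(".name").text().trim(),
      // Record null instead of failing when an item has no link.
      href: link.hasAttr("href") ? link.attr("href") : null,
    };
  });
}

// Inside a flyscrape script this would be wired up as:
//   export default function ({ doc }) {
//     return { items: extractItems(doc) };
//   }
```

Keeping the extraction in a plain function makes it easy to exercise with a stub `doc` before pointing it at a live site.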

Common Patterns

Follow Pagination

export const config = {
  url: "https://example.com/posts",
  depth: 10,
  follow: [".pagination a.next"],
};
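
When a `follow` selector alone cannot pick out the next page, the `follow(url)` function described under Script Structure can be called by hand. The helper below is a sketch of that idea, assuming the config sets `follow: []` and a `depth` greater than 0. The name `followNextPage` is illustrative, and the selector is reused from the config above.

```javascript
// Follows the "next" link manually instead of through a `follow` selector.
// `followNextPage` is a hypothetical helper; `follow` and `absoluteURL` are the
// functions flyscrape passes to the script's default function.
function followNextPage(doc, follow, absoluteURL) {
  const next = doc.find(".pagination a.next");
  if (next.length() === 0 || !next.hasAttr("href")) {
    return false; // last page: nothing to follow
  }
  follow(absoluteURL(next.attr("href")));
  return true;
}

// Wiring inside a script (the config would set `follow: []`):
//   export default function ({ doc, follow, absoluteURL }) {
//     followNextPage(doc, follow, absoluteURL);
//     return { title: doc.find("h1").text() };
//   }
```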

Scrape with Browser Mode (JS-heavy sites)

export const config = {
  url: "https://spa-site.com",
  browser: true,
  headless: true,
};

Nested Scraping (detail pages)

export default function({ doc, scrape, absoluteURL }) {
  const links = doc.find(".product-link");

  return {
    products: links.map(link => {
      const detailUrl = absoluteURL(link.attr("href"));
      return scrape(detailUrl, ({ doc }) => ({
        name: doc.find("h1").text(),
        price: doc.find(".price").text(),
      }));
    }),
  };
}
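
A small variation on the pattern above, sketched under the same assumptions (the `.product-link`, `h1`, and `.price` selectors are placeholders): it keeps the detail-page URL next to the scraped fields so each record can be traced back to its source. The helper name `scrapeProducts` is illustrative.

```javascript
// Nested scraping that records which URL each detail record came from.
// `scrapeProducts` is a hypothetical helper; `scrape` and `absoluteURL` are the
// functions flyscrape passes to the script's default function.
function scrapeProducts(doc, scrape, absoluteURL) {
  return doc.find(".product-link").map((link) => {
    const detailUrl = absoluteURL(link.attr("href"));
    const detail = scrape(detailUrl, ({ doc }) => ({
      name: doc.find("h1").text(),
      price: doc.find(".price").text(),
    }));
    return { url: detailUrl, detail };
  });
}

// Wiring inside a script:
//   export default function ({ doc, scrape, absoluteURL }) {
//     return { products: scrapeProducts(doc, scrape, absoluteURL) };
//   }
```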

Download Files

import { download } from "flyscrape/http";

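export default function({ doc, absoluteURL }) {
  // map() is the iteration method listed in the Query API section; it is used here for its side effect.
  doc.find("img").map(img => {
    download(absoluteURL(img.attr("src")), "images/");
  });
  return { downloaded: true };
}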

Rate Limiting & Caching (be polite)

export const config = {
  url: "https://example.com",
  rate: 30,           // 30 requests/minute
  concurrency: 2,     // Max 2 concurrent
  cache: "file",      // Cache to scriptname.cache
};
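
One rough way to read the `rate` setting: at 30 requests per minute, requests are spaced about 2 seconds apart on average, so an uncached crawl of N pages cannot finish faster than N / rate minutes. The sketch below only does that arithmetic. Both helper names are illustrative and not part of flyscrape, and the estimate ignores response time and `concurrency`.

```javascript
// Lower bound on crawl duration implied by a requests-per-minute limit.
// Hypothetical helpers for planning only; flyscrape does not provide them.
function minCrawlMinutes(pageCount, ratePerMinute) {
  if (ratePerMinute <= 0) {
    throw new RangeError("ratePerMinute must be positive");
  }
  return pageCount / ratePerMinute;
}

// Average gap between requests, in seconds, at the given rate.
function averageSpacingSeconds(ratePerMinute) {
  if (ratePerMinute <= 0) {
    throw new RangeError("ratePerMinute must be positive");
  }
  return 60 / ratePerMinute;
}

console.log(minCrawlMinutes(600, 30));  // prints 20
console.log(averageSpacingSeconds(30)); // prints 2
```

With `cache: "file"` enabled, repeat runs during development should be served from the cache rather than the network, so this bound mainly matters for the first full run.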

Workflow

Create: flyscrape new myscript.js
Develop: flyscrape dev myscript.js - iterates with cached responses
Run: flyscrape run myscript.js - full execution
Output: flyscrape run myscript.js --output.file results.json

Troubleshooting Quick Tips

Problem	Solution
Getting blocked (403)	Add User-Agent header, reduce rate, use `browser: true`
Empty results	Check if site needs browser mode, verify selectors
Links not followed	Set `depth > 0`, check `follow` selectors
Slow performance	Increase `concurrency`, enable `cache: "file"`

See references/troubleshooting.md for detailed solutions.
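
For the "Getting blocked (403)" row, a config along the following lines is one possible starting point. The `headers` option name and the User-Agent string are assumptions that do not appear in the option table above, so confirm them in references/config.md before relying on them.

```javascript
// Config fragment for a more conservative crawl.
// ASSUMPTION: `headers` is the option for custom request headers; the
// User-Agent value is a made-up placeholder.
export const config = {
  url: "https://example.com",
  headers: {
    "User-Agent": "Mozilla/5.0 (compatible; my-scraper/1.0)",
  },
  rate: 10,        // lower than the 30 requests/minute used earlier
  concurrency: 1,  // one request at a time
  // browser: true, // try only if plain HTTP requests are still rejected
};
```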

Reference Files

references/config.md - Complete configuration options
references/query-api.md - Full Query API documentation
references/recipes.md - Common patterns and code snippets
references/troubleshooting.md - Problem solving guide
examples/ - Ready-to-use example scripts

External Resources

Documentation: https://flyscrape.com/docs/getting-started/
GitHub: https://github.com/philippta/flyscrape
Examples: https://github.com/philippta/flyscrape/tree/master/examples