
llms-txt-crawler

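A Skill that reads the page URLs listed in a website's llms.txt file and downloads the linked content in bulk, so you can collect documentation and other information efficiently.

📜 Original English description (for reference)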

Fetch and crawl llms.txt files from websites. Parses the llms.txt format to extract page URLs and downloads all listed content. Use when you need to gather documentation or content from a website that provides an llms.txt file.

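🇯🇵 Commentary for Japanese creators

In a nutshell

Note: This commentary was added by the jpskill.com editorial team for Japanese business settings. It is reference information and is independent of how the Skill itself behaves.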

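⚡ Recommended: install with a single command (about 60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads the archive, extracts it, and places the Skill in the right folder automatically.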

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o llms-txt-crawler.zip https://jpskill.com/download/10614.zip && unzip -o llms-txt-crawler.zip && rm llms-txt-crawler.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/10614.zip -OutFile "$d\llms-txt-crawler.zip"; Expand-Archive "$d\llms-txt-crawler.zip" -DestinationPath $d -Force; ri "$d\llms-txt-crawler.zip"

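When it finishes, restart Claude Code. The Skill then activates automatically when you make a related request in plain language, such as "Crawl the llms.txt at https://example.com".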

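💾 Manual download (if you prefer not to use the command line)

1. Click the blue download button on this page to download llms-txt-crawler.zip
2. Double-click the ZIP file to extract it; this creates a llms-txt-crawler folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code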


⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.

🎯 What this Skill can do

The description on this page explains what this Skill does for you. When you ask Claude for help in this area, the Skill is invoked automatically.

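📦 Installation (3 steps)

1. Click the "Download" button on this page to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. You do not need to say "use this Skill"; it is invoked automatically for related requests.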


Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1

llms.txt Crawler Skill

This skill enables you to fetch llms.txt files from websites and crawl all pages listed within them. llms.txt is a proposed convention that lets a website publish an LLM-friendly listing of its content.

Overview

The llms.txt file typically follows this format:

# Site Name

## Section Name

- [Page Title](https://example.com/page.md): Description of the page
- [Another Page](https://example.com/another.md): Another description

This skill parses these files and downloads all linked content.
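
The parsing step can be illustrated with a short sketch. This is not the code in crawl.js, which is not shown on this page; the function name `parseLlmsTxt`, the regular expressions, and the returned field names are all assumptions made for illustration.

```javascript
// Minimal sketch: extract section headings and links from llms.txt text.
// The regexes and the returned field names are illustrative assumptions.
function parseLlmsTxt(text, baseUrl) {
  const linkPattern = /^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?\s*$/;
  const pages = [];
  let section = null;

  for (const line of text.split(/\r?\n/)) {
    const heading = line.match(/^##\s+(.+?)\s*$/);
    if (heading) {
      section = heading[1];
      continue;
    }
    const link = line.match(linkPattern);
    if (!link) continue;

    let url;
    try {
      // Resolve relative links against the site's base URL.
      url = new URL(link[2], baseUrl).href;
    } catch {
      continue; // Skip entries whose URL cannot be parsed.
    }
    pages.push({ section, title: link[1], url, description: link[3] ?? '' });
  }
  return pages;
}

const sample = [
  '# Site Name',
  '',
  '## Section Name',
  '',
  '- [Page Title](https://example.com/page.md): Description of the page',
  '- [Another Page](https://example.com/another.md): Another description',
].join('\n');

console.log(parseLlmsTxt(sample, 'https://example.com'));
```

Entries whose URL cannot be parsed are skipped rather than aborting the run, which mirrors the "Invalid URLs" behavior described under Error Handling.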

Usage

Basic Usage

Run the crawl script with a target URL:

cd /path/to/skills/llms-txt-crawler/scripts
npm install  # First time only
node crawl.js --url https://example.com

Command Line Options

| Option | Short | Description | Default |
| --- | --- | --- | --- |
| `--url` | `-u` | Base URL of the site with llms.txt | Required |
| `--output` | `-o` | Output directory for crawled files | `./output` |
| `--format` | `-f` | Output format: `md`, `json`, or `txt` | `md` |
| `--delay` | `-d` | Delay between requests in milliseconds | `500` |
| `--concurrent` | `-c` | Maximum concurrent requests | `3` |
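
As a rough illustration of how these flags and defaults could be handled, the sketch below uses Node's built-in `util.parseArgs` (the `default` field needs Node 18.11 or later). How crawl.js actually parses its arguments is not shown here, so `optionSpec`, `parseCliOptions`, and the validation rules are assumptions.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical option table mirroring the documented flags and defaults.
const optionSpec = {
  url: { type: 'string', short: 'u' },
  output: { type: 'string', short: 'o', default: './output' },
  format: { type: 'string', short: 'f', default: 'md' },
  delay: { type: 'string', short: 'd', default: '500' },
  concurrent: { type: 'string', short: 'c', default: '3' },
};

function parseCliOptions(argv) {
  const { values } = parseArgs({ args: argv, options: optionSpec });
  if (!values.url) throw new Error('--url is required');
  if (!['md', 'json', 'txt'].includes(values.format)) {
    throw new Error(`Unsupported format: ${values.format}`);
  }
  // parseArgs returns strings, so numeric options are converted and checked here.
  const delay = Number(values.delay);
  const concurrent = Number(values.concurrent);
  if (!Number.isInteger(delay) || delay < 0) {
    throw new Error('--delay must be a non-negative integer');
  }
  if (!Number.isInteger(concurrent) || concurrent < 1) {
    throw new Error('--concurrent must be a positive integer');
  }
  return { url: values.url, output: values.output, format: values.format, delay, concurrent };
}

console.log(parseCliOptions(['-u', 'https://example.com', '--delay', '1000', '-c', '2']));
```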

Examples

Crawl agentskills.io documentation:

node crawl.js --url https://agentskills.io --output ./agentskills-docs

Crawl with custom rate limiting:

node crawl.js --url https://example.com --delay 1000 --concurrent 2

Output as JSON:

node crawl.js --url https://example.com --format json

Output Structure

The script creates the following output structure:

output/
├── llms.txt              # Original llms.txt file
├── index.json            # Metadata about all crawled pages
└── pages/
    ├── page-1.md
    ├── page-2.md
    └── ...

Error Handling

- Network errors: Retries up to 3 times with exponential backoff
- Rate limiting: Respects delay settings between requests
- Missing pages: Logs warnings but continues crawling other pages
- Invalid URLs: Skips and logs invalid URLs
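
The retry behavior can be sketched as follows. The base delay of 500 ms, the doubling factor, and the names `backoffDelay` and `withRetry` are assumptions; the page only states that there are up to 3 retries with exponential backoff.

```javascript
// Delay before retry number `attempt` (1-based): base, 2 * base, 4 * base, ...
function backoffDelay(attempt, baseMs = 500) {
  return baseMs * 2 ** (attempt - 1);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `task` once, then retries up to `maxRetries` more times on failure.
// `wait` is injectable so the delays can be observed without real timers.
async function withRetry(task, { maxRetries = 3, baseMs = 500, wait = sleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    if (attempt > 0) await wait(backoffDelay(attempt, baseMs));
    try {
      return await task();
    } catch (error) {
      lastError = error;
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

With these assumed values, a page that keeps failing is attempted four times in total, with waits of 500 ms, 1000 ms, and 2000 ms between attempts. A caller that wants the "Missing pages" behavior would catch the final error, log a warning, and move on to the next URL.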

Integration Tips

When using this skill in an agent workflow:

1. First run the crawler to download content
2. The index.json file contains metadata about all pages
3. Use the downloaded markdown files for context or analysis
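
The last step might look like the sketch below, which reads the Markdown files from the `pages/` directory shown under Output Structure and joins them into a single context string. It deliberately does not read index.json, because its schema is not documented on this page. The names `loadPages` and `buildContext`, the separator format, and the character budget are assumptions.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Reads every .md file under <outputDir>/pages and returns name/content pairs.
function loadPages(outputDir) {
  const pagesDir = path.join(outputDir, 'pages');
  return fs
    .readdirSync(pagesDir)
    .filter((name) => name.endsWith('.md'))
    .sort()
    .map((name) => ({
      name,
      content: fs.readFileSync(path.join(pagesDir, name), 'utf8'),
    }));
}

// Joins pages into one string, stopping before a character budget is exceeded.
function buildContext(pages, maxChars = 20000) {
  const parts = [];
  let used = 0;
  for (const page of pages) {
    const block = `--- ${page.name} ---\n${page.content}\n`;
    if (used + block.length > maxChars) break;
    parts.push(block);
    used += block.length;
  }
  return parts.join('');
}
```

For example, `buildContext(loadPages('./output'))` would return the crawled pages in file-name order, truncated at a page boundary once the budget is reached.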