jpskill.com

theharvester

Passive email, subdomain, and IP harvesting from public sources using theHarvester. Use when: gathering corporate email lists, enumerating subdomains passively, pre-engagement recon, finding exposed employee contacts without triggering alerts.

⚡ Recommended: one-command install (about 60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). Download → extract → placement are fully automated.

🍎 Mac / 🐧 Linux
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o theharvester.zip https://jpskill.com/download/15475.zip && unzip -o theharvester.zip && rm theharvester.zip
🪟 Windows (PowerShell)
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/15475.zip -OutFile "$d\theharvester.zip"; Expand-Archive "$d\theharvester.zip" -DestinationPath $d -Force; ri "$d\theharvester.zip"

When finished, restart Claude Code → just ask normally, e.g. "gather subdomains for example.com", and the skill activates automatically.

💾 Manual download (for those who find the commands difficult)
  1. Click the blue button below to download theharvester.zip
  2. Double-click the ZIP file to extract it → a theharvester folder is created
  3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
  4. Restart Claude Code

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the files.

🎯 What this Skill can do

The description below explains what this Skill will do for you. When you ask Claude for work in this area, it activates automatically.

📦 Installation (3 steps)

  1. Click the "Download" button above to get the .skill file
  2. Rename the extension from .skill to .zip and extract it (macOS can extract automatically)
  3. Place the extracted folder in .claude/skills/ under your home folder
    • macOS / Linux: ~/.claude/skills/
    • Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. You do not need to say "use this Skill…" — it is invoked automatically for related requests.

See the detailed usage guide →
Last updated: 2026-05-18
Retrieved: 2026-05-18
Included files: 1
📖 Original SKILL.md read by Claude (contents expanded below)

This body text is the original (English or Chinese) intended for the AI (Claude) to read. Japanese translations are being added over time.

theHarvester

Overview

theHarvester is a passive OSINT tool that aggregates information about a target domain from multiple public sources. It finds email addresses, subdomains, hostnames, and IP ranges without making any direct requests to the target — making it ideal for stealth recon during the pre-engagement phase of penetration tests or OSINT investigations.

Sources include: Google, Bing, DuckDuckGo, LinkedIn, Shodan, Hunter.io, CertSpotter, DNSDumpster, VirusTotal, and more.

Instructions

Step 1: Install theHarvester

# Option 1: pip (in a virtual environment recommended)
pip install theHarvester

# Option 2: Clone from GitHub (most up-to-date)
git clone https://github.com/laramies/theHarvester.git
cd theHarvester
pip install -r requirements/base.txt

# Option 3: Docker
docker pull ghcr.io/laramies/theharvester
docker run ghcr.io/laramies/theharvester -d example.com -b google

Step 2: Basic usage

# Syntax: theHarvester -d <domain> -b <source> [options]
# -d  target domain
# -b  data source(s)
# -l  limit results (default: 500)
# -f  output filename (supports XML and JSON)
# -n  DNS lookup on discovered hosts
# -v  verify host via DNS resolution

# Search a single source
theHarvester -d example.com -b google

# Search all available sources
theHarvester -d example.com -b all

# Limit results, enable DNS lookup, save output
theHarvester -d example.com -b google,bing,linkedin -l 200 -n -f results_example

# Run from cloned repo
python3 theHarvester.py -d example.com -b all -l 500 -f output

Step 3: Choose sources strategically

# Email harvesting — best sources
theHarvester -d example.com -b google,bing,hunter,linkedin

# Subdomain enumeration — best sources
theHarvester -d example.com -b certspotter,dnsdumpster,virustotal,shodan

# Comprehensive (slower, uses all sources)
theHarvester -d example.com -b all -l 1000 -f full_recon_example

# LinkedIn employee discovery (requires LinkedIn API key in api-keys.yaml)
theHarvester -d example.com -b linkedin -l 200

Step 4: Configure API keys

# api-keys.yaml (place in theHarvester directory or specify with -c flag)
apikeys:
  hunter:
    key: YOUR_HUNTER_IO_KEY
  shodan:
    key: YOUR_SHODAN_KEY
  virustotal:
    key: YOUR_VIRUSTOTAL_KEY
  binaryedge:
    key: YOUR_BINARYEDGE_KEY
  fullhunt:
    key: YOUR_FULLHUNT_KEY
  securityTrails:
    key: YOUR_SECURITYTRAILS_KEY
  github:
    key: YOUR_GITHUB_TOKEN

Step 5: Parse and process output with Python

import json
import subprocess
import re

def run_harvester(domain, sources="google,bing,certspotter,dnsdumpster", limit=500):
    """Run theHarvester and return parsed results."""
    output_file = f"harvester_{domain.replace('.', '_')}"
    cmd = [
        "theHarvester",
        "-d", domain,
        "-b", sources,
        "-l", str(limit),
        "-f", output_file,
    ]
    print(f"Running: {' '.join(cmd)}")
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
    print(result.stdout)

    # Parse JSON output
    json_file = f"{output_file}.json"
    try:
        with open(json_file) as f:
            data = json.load(f)
        return data
    except FileNotFoundError:
        # Fall back to parsing stdout
        return parse_stdout(result.stdout)

def parse_stdout(output):
    """Extract emails, hosts, and IPs from raw stdout."""
    emails = set(re.findall(r'[\w\.-]+@[\w\.-]+\.\w+', output))
    # Filter out false positives
    emails = {e for e in emails if not e.endswith(('.png', '.jpg', '.css', '.js'))}

    hosts = set(re.findall(r'[\w\.-]+\.\w{2,}', output))
    ips = set(re.findall(r'\b(?:\d{1,3}\.){3}\d{1,3}\b', output))

    return {"emails": list(emails), "hosts": list(hosts), "ips": list(ips)}

def deduplicate_and_report(data, domain):
    """Clean and summarize harvested data."""
    emails = sorted(set(data.get("emails", [])))
    hosts = sorted(set(data.get("hosts", [])))
    ips = sorted(set(data.get("ips", [])))

    # Filter to target domain
    domain_emails = [e for e in emails if domain in e]
    domain_hosts = [h for h in hosts if domain in h]

    print(f"\n=== Harvest Report: {domain} ===")
    print(f"Emails found:    {len(domain_emails)}")
    print(f"Subdomains:      {len(domain_hosts)}")
    print(f"IP addresses:    {len(ips)}")

    if domain_emails:
        print("\nEmails:")
        for e in domain_emails[:20]:
            print(f"  {e}")

    if domain_hosts:
        print("\nSubdomains:")
        for h in domain_hosts[:20]:
            print(f"  {h}")

    return {
        "emails": domain_emails,
        "subdomains": domain_hosts,
        "ips": ips,
    }

# Usage
results = run_harvester("target-company.com", sources="google,bing,certspotter,hunter")
clean = deduplicate_and_report(results, "target-company.com")

# Save cleaned results
with open("clean_results.json", "w") as f:
    json.dump(clean, f, indent=2)

Step 6: Combine with other tools

# Pass discovered subdomains to nmap (only with explicit authorization)
theHarvester -d example.com -b all -f hosts
cat hosts.json | python3 -c "
import json, sys
data = json.load(sys.stdin)
for host in data.get('hosts', []):
    print(host)
" > subdomains.txt

# Feed subdomains into amass for deeper passive DNS enumeration
amass enum -passive -df subdomains.txt

# Check emails against breach databases (mind the HIBP API rate limit)
while read -r email; do
    curl -s "https://haveibeenpwned.com/api/v3/breachedaccount/$email" \
         -H "hibp-api-key: YOUR_HIBP_KEY"
    sleep 2  # HIBP throttles rapid requests
done < emails.txt

Available Sources Reference

Source          Data Type                  API Key Required
google          Emails, subdomains         No
bing            Emails, subdomains         No
duckduckgo      Emails, subdomains         No
linkedin        Employees, emails          Optional
hunter          Emails                     Yes
certspotter     Subdomains (SSL certs)     No
dnsdumpster     Subdomains, IPs            No
virustotal      Subdomains                 Yes
shodan          IPs, open ports            Yes
securitytrails  Subdomains, DNS            Yes
github          Emails, code               Yes
binaryedge      IPs, services              Yes
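The table above can also be encoded as data, e.g. to build a `-b` argument from only the sources usable without API keys. This is a minimal sketch (the `SOURCES` dict and `usable_sources` helper are illustrative, not part of theHarvester; "Optional" is treated as usable without a key):

```python
# Which theHarvester sources require an API key (transcribed from the table above).
SOURCES = {
    "google": False, "bing": False, "duckduckgo": False, "linkedin": False,
    "hunter": True, "certspotter": False, "dnsdumpster": False,
    "virustotal": True, "shodan": True, "securitytrails": True,
    "github": True, "binaryedge": True,
}

def usable_sources(have_keys=()):
    """Return a comma-separated -b argument of sources usable with the given keys."""
    names = [s for s, needs_key in SOURCES.items()
             if not needs_key or s in have_keys]
    return ",".join(sorted(names))

print(usable_sources())                      # key-free sources only
print(usable_sources(have_keys={"shodan"}))  # key-free sources plus shodan
```

The resulting string can be passed straight to `theHarvester -b`.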

Guidelines

  • Always get authorization before running theHarvester against a target — passive does not mean invisible. Data queries may be logged by third-party services.
  • Rate limits: Without API keys, theHarvester relies on scraping search engines which may throttle or block requests. Add API keys for reliable results.
  • Combine sources: No single source is complete. Use multiple sources and deduplicate.
  • Email format detection: Once you have a few emails (e.g., jsmith@corp.com, john.smith@corp.com), infer the naming convention and use it to generate a target list.
  • DNS verification: Always use -n or -v to verify discovered hosts are live before reporting.
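The email-format-detection guideline above can be sketched in Python: given a few harvested addresses and a list of employee names, infer the naming convention and generate candidate addresses. The pattern labels and helper functions below are illustrative assumptions, not part of theHarvester:

```python
# Common corporate email local-part patterns, keyed by a short label.
PATTERNS = {
    "first.last": lambda f, l: f"{f}.{l}",   # john.smith
    "flast":      lambda f, l: f"{f[0]}{l}", # jsmith
    "firstl":     lambda f, l: f"{f}{l[0]}", # johns
    "first":      lambda f, l: f,            # john
}

def detect_pattern(known_emails, known_names):
    """Guess the org's convention by matching harvested emails to known names."""
    for first, last in known_names:
        for label, fmt in PATTERNS.items():
            candidate = fmt(first.lower(), last.lower())
            if any(e.split("@")[0] == candidate for e in known_emails):
                return label
    return None

def generate(names, domain, label):
    """Generate candidate addresses for (first, last) names using a pattern label."""
    fmt = PATTERNS[label]
    return [f"{fmt(f.lower(), l.lower())}@{domain}" for f, l in names]

# Example: one harvested address reveals the first.last convention
pattern = detect_pattern(["john.smith@corp.com"], [("John", "Smith")])
print(pattern)  # first.last
print(generate([("Jane", "Doe")], "corp.com", pattern))  # ['jane.doe@corp.com']
```

Candidates generated this way are guesses and should be verified (e.g. via Hunter.io or SMTP checks, with authorization) before use.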