browser-act
Browser automation CLI (browser-act) for AI agents. MUST trigger when: (1) user mentions 'browser-act' in any form, or user needs to: (2) open/visit/browse/check a URL or webpage, (3) scrape/extract/crawl/monitor web content, (4) fill forms, click buttons, type text, scroll, or interact with page elements, (5) take a screenshot of a webpage, (6) handle or solve a captcha, (7) use a stealth/anti-detection browser or proxy, (8) connect to or control Chrome, (9) inspect network requests or record HAR, (10) automate any browser or web interaction task. Covers: navigation, page state inspection, element interaction, data extraction, JavaScript evaluation, tab management, network inspection, dialog handling, captcha solving, parallel browser sessions, stealth browsing, and any browser automation tasks.
Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). Download, extraction, and placement are fully automatic.
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o browser-act.zip https://jpskill.com/download/17623.zip && unzip -o browser-act.zip && rm browser-act.zip
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/17623.zip -OutFile "$d\browser-act.zip"; Expand-Archive "$d\browser-act.zip" -DestinationPath $d -Force; ri "$d\browser-act.zip"
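When it finishes, restart Claude Code. The skill then triggers automatically when you ask in plain language, for example "take a screenshot of this web page".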
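💾 Manual download (for those who find the command difficult)
- 1. Press the blue button below to download browser-act.zip
- 2. Double-click the ZIP file to extract it → a browser-act folder is created
- 3. Move that folder to C:\Users\<your name>\.claude\skills\ (Win) or ~/.claude/skills/ (Mac)
- 4. Restart Claude Code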
⚠️ Download and use at your own risk. This site accepts no responsibility for the content, operation, or safety of the skill.
🎯 What this Skill can do
The description text explains what this Skill does for you. When you ask Claude for something in this area, the Skill triggers automatically.
📦 Installation (3 steps)
- 1. Press the "Download" button above to get the .skill file
- 2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
- 3. Place the extracted folder in .claude/skills/ under your home folder
  - · macOS / Linux: ~/.claude/skills/
  - · Windows: %USERPROFILE%\.claude\skills\
Restart Claude Code and you are done. You do not need to say "use this Skill"; it is invoked automatically for related requests.
- Last updated: 2026-05-18
- Retrieved: 2026-05-18
- Bundled files: 6
📜 Original SKILL.md (the English/Chinese text that Claude reads)
Browser Automation with browser-act CLI
browser-act is a CLI for browser automation with stealth and captcha solving capabilities. It supports two browser types (Stealth and Real Chrome) and provides commands for navigation, page interaction, data extraction, tab/session management, and more.
Installation
Source: browser-act-cli on PyPI · Homepage
# Upgrade if installed, otherwise install fresh
uv tool upgrade browser-act-cli || uv tool install browser-act-cli --python 3.12
The CLI is an open-source package published to PyPI by BrowserAct. Run the install command at the start of every session to ensure the latest version.
Global options available on every command:
| Option | Default | Description |
|---|---|---|
| `--session <name>` | `default` | Session name (isolates browser state) |
| `--format <text\|json>` | `text` | Output format |
| `--no-auto-dialog` | off | Disable automatic JavaScript dialog handling (alerts, confirms, prompts) |
| `--version` | | Show version |
| `-h, --help` | | Show help |
Quick Extraction
If the task is just "get content from a URL", use stealth-extract directly — no browser session needed. Each call launches its own headless stealth browser, extracts the page content, and closes automatically.
browser-act stealth-extract <url> # Extract rendered content as markdown (default)
browser-act stealth-extract <url> --content-type html # Extract HTML instead of markdown
browser-act stealth-extract <url> --proxy http://host:port # Use a proxy
browser-act stealth-extract <url> --timeout 60 --output # Save to outputs/ instead of printing
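When several URLs need the same one-shot treatment, the command above can be wrapped in a small loop. The following is a minimal sketch, assuming that `stealth-extract` prints the extracted content to standard output (as the "instead of printing" note suggests) and exits non-zero on failure. The function name `extract_all`, the `BA` variable, and the `page-N.md` file naming are hypothetical helpers, not part of browser-act.

```shell
# Command to run; override BA to point at a wrapper or a stub for dry runs.
BA="${BA:-browser-act}"

# extract_all OUT_DIR URL...
# Runs one stealth-extract call per URL and saves each result as markdown.
extract_all() {
  out_dir=$1
  shift
  mkdir -p "$out_dir"
  n=0
  for url in "$@"; do
    n=$((n + 1))
    # Assumption: content goes to stdout and failures give a non-zero exit code.
    "$BA" stealth-extract "$url" > "$out_dir/page-$n.md" ||
      echo "extract failed: $url" >&2
  done
}

# Usage:
#   extract_all ./pages https://example.com https://example.com/about
```

Because each call launches and closes its own browser, no session cleanup is needed after the loop.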
Browser Selection
browser-act supports two browser types. Choose based on the task:
| Scenario | Use | Why |
|---|---|---|
| Target site has bot detection / anti-scraping | Stealth | Anti-detection fingerprinting bypasses bot checks |
| Need proxy or privacy mode | Stealth | Real Chrome does not support --dynamic-proxy / --custom-proxy / --mode |
| Need multiple browsers in parallel | Stealth | Each Stealth browser is independent; create multiple and run in parallel sessions |
| Need user's existing login sessions from their daily browser | Real Chrome | Connects directly to user's Chrome, reusing existing login sessions |
| No bot detection, no login needed | Either | Stealth is safer default; Real Chrome is simpler |
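The table can be read as a small decision rule. The sketch below is one possible encoding, not logic shipped with browser-act: the function name `choose_browser` and its yes/no arguments are hypothetical, and the precedence when rows conflict (for example, bot detection plus a need for existing logins) is an assumption that favours Stealth.

```shell
# choose_browser BOT_DETECTION NEEDS_PROXY_OR_PRIVATE PARALLEL NEEDS_EXISTING_LOGIN
# Each argument is "yes" or "no". Prints "stealth" or "real".
choose_browser() {
  bot=$1
  proxy=$2
  parallel=$3
  login=$4
  if [ "$bot" = yes ] || [ "$proxy" = yes ] || [ "$parallel" = yes ]; then
    # Rows 1 to 3 of the table: only Stealth covers these needs.
    echo stealth
  elif [ "$login" = yes ]; then
    # Row 4: reuse the sessions of the user's everyday Chrome.
    echo real
  else
    # Row 5: either works; Stealth is described as the safer default.
    echo stealth
  fi
}

# Usage:
#   choose_browser no no no yes
```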
Stealth Browser
Local browsers with anti-detection fingerprinting. Ideal for sites with bot detection.
# Create
browser-act browser create "my-browser"
browser-act browser create "my-browser" --dynamic-proxy US # With proxy — see references/proxy.md
browser-act browser create "my-browser" --cookie '{"name":"sid","value":"abc123","domain":".example.com"}'
browser-act browser create "my-browser" --cookie ./cookies.json
# Update
browser-act browser update <browser_id> --name "new-name"
browser-act browser update <browser_id> --mode private
# List / Delete / Clear profile
browser-act browser list # List all stealth browsers
browser-act browser list --page 2 --page-size 10 # Paginated listing
browser-act browser delete <browser_id> # ⚠ Destructive: always confirm with user before deleting
browser-act browser clear-profile <browser_id>
| Option | Description |
|---|---|
| `--desc` | Browser description |
| `--dynamic-proxy`, `--custom-proxy`, `--no-proxy` | Proxy configuration. Read references/proxy.md for types, formats, and region codes |
| `--mode <normal\|private>` | `normal` (default): persists cache, cookies, login across launches. `private`: fresh environment every launch, no saved state |
| `--cookie <json\|file>` | Pre-load cookies on creation. Accepts inline JSON object/array, or a path to a JSON file. Each cookie must include name, value, and domain. See references/commands.md Cookies Management for format details |
Stealth browsers in normal mode (default) persist cookies, cache, and login sessions across launches — you can log in once and reuse the session, similar to a regular browser profile. Use --mode private when the task should not persist any state.
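For the file form of `--cookie`, a cookie file can be generated before the browser is created. This is a minimal sketch that writes only the three required fields (name, value, domain) as a JSON array. The helper name `write_cookie_file` is hypothetical, and values containing double quotes or backslashes are not escaped here. Any further cookie fields are described in references/commands.md and are deliberately left out.

```shell
# write_cookie_file FILE NAME VALUE DOMAIN
# Writes a JSON array holding one cookie with the three required fields.
# Assumption: NAME, VALUE and DOMAIN contain no characters that need JSON escaping.
write_cookie_file() {
  cat > "$1" <<EOF
[
  {"name": "$2", "value": "$3", "domain": "$4"}
]
EOF
}

# Usage:
#   write_cookie_file ./cookies.json sid abc123 .example.com
#   browser-act browser create "my-browser" --cookie ./cookies.json
```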
Real Chrome
Two modes: auto-connect to your running Chrome (default), or use a BrowserAct-managed kernel.
browser-act browser real open https://example.com # Auto-connect to running Chrome
browser-act browser real open https://example.com --ba-kernel # Use BrowserAct-provided browser kernel
Stealth browsers and --ba-kernel mode run headless by default. Use --headed to show the browser UI for debugging:
browser-act browser open <browser_id> https://example.com --headed
browser-act browser real open https://example.com --ba-kernel --headed
Core Workflow
Every browser automation follows this loop: Open → Inspect → Interact → Verify
- Open: `browser-act browser open <browser_id> <url>` (Stealth) or `browser-act browser real open <url>` (Real Chrome)
- Inspect: `browser-act state` returns interactive elements with index numbers
- Interact: use indices from `state` (`browser-act click 5`, `browser-act input 3 "text"`)
- Verify: `browser-act state` or `browser-act screenshot` to confirm the result
browser-act browser open <browser_id> https://example.com
browser-act state
# Output: [3] input "Search", [5] button "Go"
browser-act input 3 "browser automation"
browser-act click 5
browser-act wait stable
browser-act state # Always re-inspect after page changes
# If user has NOT provided credentials, do not fill the form — request human assist instead.
Important: After any action that changes the page (click, navigation, form submit), run wait stable then state to get fresh element indices. Old indices become invalid after page changes.
Read CLI output carefully: Every browser-act command returns structured output that reflects the actual execution result. Always read and parse the CLI response before deciding the next step.
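Since indices change after every page update, it can help to look an element up by its label each time instead of hard-coding a number. The sketch below assumes the text output of `state` resembles the sample shown earlier (`[3] input "Search", [5] button "Go"`), with elements separated by commas or newlines. The real output may be structured differently, and labels that themselves contain commas would not be handled. `find_index` is a hypothetical helper, not a browser-act command.

```shell
# find_index LABEL
# Reads `browser-act state` text on stdin and prints the index of the first
# element whose quoted label equals LABEL (prints nothing if there is no match).
# Assumption: elements look like [3] input "Search", split by commas or newlines.
find_index() {
  tr ',' '\n' |
    grep -F "\"$1\"" |
    sed -n 's/^[^[]*\[\([0-9][0-9]*\)].*/\1/p' |
    head -n 1
}

# Usage:
#   idx=$(browser-act state | find_index "Go")
#   [ -n "$idx" ] && browser-act click "$idx"
```

For anything beyond a quick lookup, `--format json` is likely the more robust input to parse, though its schema is not shown in this document.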
Policies
Policies are trigger-action rules that govern your behavior during browser automation. Read references/policies.md at the start of every task, and evaluate triggers continuously throughout execution.
How to evaluate: After every browser action, check all enabled policies. If a trigger condition matches the current state, execute its action immediately — do not continue the automation flow until the action is resolved.
Policy discovery: When human assist occurs during a task and it was not triggered by an existing policy in references/policies.md, suggest saving it as a new policy after the user finishes:
- Human assist happens (for any reason — user's intent requires confirmation, you judge that a step needs human involvement, etc.)
- Check whether this scenario is already covered by an existing enabled policy
- If already covered — it was the policy that triggered the assist, no need to ask
- If not covered — after the user completes the assist, ask: "Want me to save this as a policy? Next time I'll automatically pause at this point."
- If the user agrees, write the policy to `references/policies.md` following the standard format
- If the user declines, continue the task; do not ask again for the same scenario
Ownership: The file ships with preset rules. Users have full control — they can disable presets, modify thresholds, or add custom rules. When a user asks to change policies, update the file directly. Do not create, modify, or delete policies on your own — only change the file when the user explicitly requests it (or agrees to save one via policy discovery above).
Adding a custom rule example: See references/policies.md for the format, then append a new ## rule-name section.
Human Assist
When a policy triggers with action Request human assist, call human-assist-url to get a remote access link and present it to the user.
browser-act human-assist-url --objective "Please log in with your credentials"
# → returns assist_url
Do not send any browser commands while assist is active. Wait for the user to confirm they are done in the conversation, then continue the task.
When to use human-assist-url vs conversational confirmation: During browser automation, if the user needs to review or confirm something that is on the page (a filled form, a checkout summary, a settings change), use human-assist-url — the user needs to see and potentially interact with the actual browser page. Do not extract page content and show it in conversation as a substitute, because that bypasses the human assist flow and prevents policy discovery from working. Conversational confirmation (showing text in chat) is only appropriate when the content has not yet been entered into the browser (e.g., drafting text before any browser interaction).
Command Chaining
Commands can be chained with && in a single shell invocation. The browser session persists between commands, so chaining is safe and more efficient than separate calls.
# Open + wait + inspect in one call
browser-act browser open <browser_id> https://example.com && browser-act wait stable && browser-act state
# Chain multiple interactions
browser-act input 3 "browser automation" && browser-act click 5
# Navigate and capture
browser-act navigate https://example.com/dashboard && browser-act wait stable && browser-act screenshot
When to chain: Use && when you don't need to read intermediate output before proceeding (e.g., fill multiple fields, then click). Run commands separately when you need to parse the output first (e.g., state to discover indices, then interact using those indices).
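The short-circuit behaviour of `&&` is what makes chaining safe: a later step runs only if the earlier one succeeded. The following sketch wraps the first chain above in a function. It assumes browser-act returns a non-zero exit code when a step fails; the `BA` variable and the name `open_and_inspect` are hypothetical conveniences.

```shell
# Command to run; override BA to point at a wrapper or a stub for dry runs.
BA="${BA:-browser-act}"

# open_and_inspect BROWSER_ID URL
# Opens the page, waits for it to settle, then prints the element list.
# Each step runs only if the previous one succeeded.
open_and_inspect() {
  "$BA" browser open "$1" "$2" &&
    "$BA" wait stable &&
    "$BA" state
}

# Usage:
#   open_and_inspect "$BROWSER_ID" https://example.com
```

The function's exit status is that of the first failing step, so callers can test it with `if open_and_inspect ...; then`.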
Essential Commands
For full syntax, options, and examples, read references/commands.md.
# Navigation
browser-act navigate <url> # Navigate to URL in current tab
browser-act navigate <url> --new-tab # Open URL in a new tab
browser-act back # Go back
browser-act forward # Go forward
browser-act reload # Reload page
# Page State & Interaction
browser-act state # Interactive elements with index numbers
browser-act screenshot # Screenshot (--full for full page)
browser-act screenshot ./page.png # Screenshot to specific path
browser-act click <index> # Click element
browser-act hover <index> # Hover over element
browser-act input <index> "text" # Click element, then type text
browser-act select <index> "option" # Select dropdown option by visible text
browser-act keys "Enter" # Send keyboard keys
browser-act scroll down # Scroll down (default 500px)
browser-act scroll up --amount 1000 # Scroll with custom distance
browser-act scrollintoview --selector "h1" # Scroll element into viewport by CSS selector
browser-act upload <index> <file_path> # Upload file to file input
# Data Extraction
browser-act get title # Page title
browser-act get html # Full page HTML
browser-act get markdown # Page as markdown
browser-act get text <index> # Text content of element
browser-act get value <index> # Value of input/textarea
# JavaScript
browser-act eval "document.title" # Execute JavaScript in page context
# Tab Management
browser-act tab list # List open tabs
browser-act tab switch <tab_id> # Switch to tab
browser-act tab close # Close current tab
browser-act tab close <tab_id> # Close specific tab
# Wait
browser-act wait stable # Wait for page stable (doc ready + network idle, default 30s)
browser-act wait stable --timeout 60000 # Custom timeout in ms
browser-act wait selector --selector ".btn" --state visible --timeout 10000  # CSS selector wait
browser-act wait selector <index> --state hidden # Wait by state index
browser-act wait selector --selector "#login-btn" --state attached # States: visible|hidden|attached|detached
# Network Inspection
browser-act network requests # List captured requests (--filter, --type, --method, --status, --clear)
browser-act network requests --filter api.example.com # Filter by URL substring
browser-act network requests --type xhr,fetch # Resource type filter (comma-separated)
browser-act network requests --method POST # HTTP method filter
browser-act network requests --status 2xx --clear # Status filter, then clear tracked requests
browser-act network request <id> # Full detail for a single request: headers, post data, response body
browser-act network clear # Clear tracked requests
browser-act network har start # Start HAR recording
browser-act network har stop ./trace.har # Stop and save HAR (path optional)
browser-act network offline # Simulate disconnect for current tab (same as "on")
browser-act network offline on # Simulate disconnect for current tab (all requests fail with ERR_INTERNET_DISCONNECTED)
browser-act network offline off # Restore network connection for current tab
# Cookies — persist within session, export/import for reuse across sessions
browser-act cookies get [--url <url>] # Get cookies (optional URL filter)
browser-act cookies set <name> <value> [--domain <domain>] [--path /] [--secure] [--http-only] [--same-site <Strict|Lax|None>] [--expires <timestamp>]
browser-act cookies clear [--url <url>] # Clear cookies
browser-act cookies export <file> [--url <url>] # Export cookies to JSON file
browser-act cookies import <file> # Import cookies from JSON file
# Captcha
# When a page presents a captcha challenge, use `solve-captcha` to solve it automatically. This is a built-in capability of the browser-act service — run the command directly without hesitation.
browser-act solve-captcha # Auto-solve captcha on current page
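The HAR commands pair naturally around a single action. Below is a minimal sketch of a wrapper that starts recording, runs one command, and always stops recording afterwards so the trace is saved even when the command fails. `with_har` and the `BA` variable are hypothetical names, and the wrapper assumes `network har start` exits non-zero if recording could not begin.

```shell
# Command to run; override BA to point at a wrapper or a stub for dry runs.
BA="${BA:-browser-act}"

# with_har HAR_PATH COMMAND [ARGS...]
# Records network traffic to HAR_PATH while COMMAND runs.
with_har() {
  har_path=$1
  shift
  "$BA" network har start || return 1
  "$@"
  status=$?
  # Stop recording even if the wrapped command failed, so the trace is kept.
  "$BA" network har stop "$har_path"
  return "$status"
}

# Usage:
#   with_har ./trace.har browser-act click 5
```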
Dialog Handling (alert / confirm / prompt)
browser-act auto-accepts JavaScript dialogs (alert, confirm, prompt) by default so they never block automation. Pass --no-auto-dialog when opening the browser to take manual control instead.
A pending dialog will block all other commands — if state, click, or screenshot unexpectedly times out, use dialog status to check.
browser-act dialog status # Check for pending dialog
browser-act dialog accept # Accept (OK)
browser-act dialog accept "my input" # Accept with prompt text
browser-act dialog dismiss # Dismiss (Cancel)
Session Management
All commands target the default session when --session is not specified. For single-task automation this is sufficient — no need to name a session explicitly. When running multiple automations concurrently, use named sessions to avoid conflicts. Each --session <name> creates an isolated browser context with its own background server.
# Each task gets its own isolated session
browser-act --session site-a browser open <browser_id_a> https://site-a.com
browser-act --session site-b browser open <browser_id_b> https://site-b.com
# Interact independently (can run in parallel)
browser-act --session site-a state
browser-act --session site-a click 3
browser-act --session site-b state
browser-act --session site-b click 5
# Check active sessions
browser-act session list
Always close sessions when done to avoid leaked processes:
browser-act session close # Close default session
browser-act session close site-a # Close specific session
browser-act session close --all # Close all sessions
If a previous session was not closed properly, the background server may still be running. The server auto-shuts down after a period of inactivity.
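One way to make sure a named session is always closed is to route the whole task through a wrapper. The sketch below is illustrative only: `ba`, `with_session`, `BA`, and `BA_SESSION` are hypothetical names, while the `--session <name>` option and `session close <name>` come from the commands above.

```shell
# Command to run; override BA to point at a wrapper or a stub for dry runs.
BA="${BA:-browser-act}"

# ba ARGS...
# Runs a browser-act command in the session named by BA_SESSION.
ba() {
  "$BA" --session "$BA_SESSION" "$@"
}

# with_session NAME COMMAND [ARGS...]
# Runs COMMAND with BA_SESSION set to NAME, then closes that session
# whether COMMAND succeeded or not.
with_session() {
  BA_SESSION=$1
  shift
  "$@"
  status=$?
  "$BA" session close "$BA_SESSION"
  return "$status"
}

# Usage (check_site_a is a hypothetical task function):
#   check_site_a() { ba browser open "$BROWSER_ID_A" https://site-a.com && ba state; }
#   with_session site-a check_site_a
```

Because `BA_SESSION` is a plain shell variable, parallel tasks should each run in their own subshell, for example `( with_session site-a check_site_a ) &`.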
Site Notes
Operational experience accumulated during browser automation is stored per domain in references/site-notes/.
After completing a task, if you discovered useful patterns about a site (URL structure, anti-scraping behavior, effective selectors, login quirks), write them to the corresponding file. Only write verified facts, not guesses.
File format:
---
domain: example.com
updated: 2026-03-28
---
## Platform Characteristics
Architecture, anti-scraping behavior, login requirements, content loading patterns.
## Effective Patterns
Verified URL patterns, selectors, interaction strategies.
## Known Pitfalls
What fails and why.
Before operating on a target site, check if a note file exists and read it for prior knowledge. Notes are dated — treat them as hints that may have changed, not guarantees.
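A note file in the format above can be scaffolded so that only the findings need to be filled in. This is a sketch under the assumption that notes live at `references/site-notes/<domain>.md`, as the references table suggests. The helper `new_site_note` is hypothetical, and it deliberately leaves an existing note untouched.

```shell
# new_site_note NOTES_DIR DOMAIN
# Creates NOTES_DIR/DOMAIN.md with the front matter and the three section
# headings. Does nothing if the file already exists.
new_site_note() {
  note_file="$1/$2.md"
  if [ -e "$note_file" ]; then
    return 0
  fi
  mkdir -p "$1"
  cat > "$note_file" <<EOF
---
domain: $2
updated: $(date +%F)
---
## Platform Characteristics

## Effective Patterns

## Known Pitfalls
EOF
}

# Usage:
#   new_site_note references/site-notes example.com
```

The `updated` field is set to the current date in YYYY-MM-DD form, matching the sample front matter.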
System Commands
browser-act report-log # Upload logs to help diagnose issues
browser-act feedback "message" # Send feedback to help improve this skill
If you encounter issues or have suggestions for improving browser-act, use feedback to let us know. This directly helps us improve the tool and this skill.
Troubleshooting
- `browser-act: command not found`: run `uv tool install browser-act-cli --python 3.12`
References
| Path | Description |
|---|---|
| `references/commands.md` | Full command reference with detailed syntax, options, and examples. Read when you need exact flags or advanced options. |
| `references/proxy.md` | Proxy configuration guide: types (dynamic/custom), URL formats, region codes, and usage examples. Read when task involves proxy. |
| `references/SECURITY.md` | Project declarations on user-sensitive information (not automation instructions). |
| `references/site-notes/{domain}.md` | Per-site operational experience. Read before operating on a known site. |
| `references/policies.md` | Automation policies (preset + custom). Read at every task start. |
Bundled files
※ List of the files included in the ZIP. In addition to `SKILL.md` itself, it may contain reference material, samples, and scripts.
- 📄 SKILL.md (20,502 bytes)
- 📎 references/commands.md (7,687 bytes)
- 📎 references/policies.md (1,599 bytes)
- 📎 references/proxy.md (2,255 bytes)
- 📎 references/SECURITY.md (1,833 bytes)
- 📎 references/site-notes/example.com.md (539 bytes)