🔌 Claude API 開発・最適化アシスタント
Anthropic Claude API を使ったアプリ開発・最適化・モデル移行を支援するSkill(キャッシュ・思考モード・バッチ等)。
📺 まず動画で見る(YouTube)
▶ 【最新版】Claude(クロード)完全解説!20以上の便利機能をこの動画1本で全て解説 ↗
※ jpskill.com 編集部が参考用に選んだ動画です。動画の内容と Skill の挙動は厳密には一致しないことがあります。
📜 元の英語説明(参考)
Build, debug, and optimize Claude API / Anthropic SDK apps. Apps built with this skill should include prompt caching. Also handles migrating existing Claude API code between Claude model versions (4.5 → 4.6, 4.6 → 4.7, retired-model replacements). TRIGGER when: code imports `anthropic`/`@anthropic-ai/sdk`; user asks for the Claude API, Anthropic SDK, or Managed Agents; user adds/modifies/tunes a Claude feature (caching, thinking, compaction, tool use, batch, files, citations, memory) or model (Opus/Sonnet/Haiku) in a file; questions about prompt caching / cache hit rate in an Anthropic SDK project. SKIP: file imports `openai`/other-provider SDK, filename like `*-openai.py`/`*-generic.py`, provider-neutral code, general programming/ML.
🇯🇵 日本人クリエイター向け解説
Anthropic Claude API を使ったアプリ開発・最適化・モデル移行を支援するSkill(キャッシュ・思考モード・バッチ等)。
※ jpskill.com 編集部が日本のビジネス現場向けに補足した解説です。Skill本体の挙動とは独立した参考情報です。
⚠️ ダウンロード・利用は自己責任でお願いします。当サイトは内容・動作・安全性について責任を負いません。
🎯 このSkillでできること
下記の説明文を読むと、このSkillがあなたに何をしてくれるかが分かります。Claudeにこの分野の依頼をすると、自動で発動します。
📦 インストール方法 (3ステップ)
- 1. 上の「ダウンロード」ボタンを押して .skill ファイルを取得
- 2. ファイル名の拡張子を .skill から .zip に変えて展開(macは自動展開可)
- 3. 展開してできたフォルダを、ホームフォルダの
.claude/skills/に置く- · macOS / Linux:
~/.claude/skills/ - · Windows:
%USERPROFILE%\.claude\skills\
- · macOS / Linux:
Claude Code を再起動すれば完了。「このSkillを使って…」と話しかけなくても、関連する依頼で自動的に呼び出されます。
詳しい使い方ガイドを見る →- 最終更新
- 2026-05-17
- 取得日時
- 2026-05-17
- 同梱ファイル
- 2
💬 こう話しかけるだけ — サンプルプロンプト
- › Claude APIで社内チャットボットを作る最小コードを Python で
- › プロンプトキャッシュ機能を有効にしてコストを下げる実装方法
- › Sonnet 4.6 から Opus 4.7 に切り替えるとき、注意すべきAPI挙動の違いは?
- › ツール使用(Tool Use)で、社内DBを叩かせる実装例
- › バッチAPIで月次の大量処理をコスト50%減で回したい
これをClaude Code に貼るだけで、このSkillが自動発動します。
🔗 関連するSkill
📖 Skill本文(日本語訳)
※ 原文(英語/中国語)を Gemini で日本語化したものです。Claude 自身は原文を読みます。誤訳がある場合は原文をご確認ください。
Claude を使用した LLM 搭載アプリケーションの構築
このスキルは、Claude を使用して LLM 搭載アプリケーションを構築するのに役立ちます。ニーズに基づいて適切なインターフェースを選択し、プロジェクトの言語を検出し、関連する言語固有のドキュメントを読んでください。
開始する前に
ターゲットファイル(または、ターゲットファイルがない場合はプロンプトとプロジェクト)をスキャンし、Anthropic 以外のプロバイダーマーカー(import openai、from openai、langchain_openai、OpenAI(、gpt-4、gpt-5、agent-openai.py や *-generic.py のようなファイル名、またはコードをプロバイダーに依存しないようにする明示的な指示)がないか確認してください。これらが見つかった場合は、停止して、このスキルが Claude/Anthropic SDK コードを生成することをユーザーに伝え、ファイルを Claude に切り替えるか、Claude 以外の実装を希望するかを尋ねてください。Anthropic 以外のファイルを Anthropic SDK 呼び出しで編集しないでください。
出力要件
ユーザーが Claude 機能の追加、変更、または実装を要求した場合、コードは以下のいずれかを介して Claude を呼び出す必要があります。
- プロジェクトの言語用の公式 Anthropic SDK (
anthropic、@anthropic-ai/sdk、com.anthropic.*など)。これは、サポートされている SDK がプロジェクトに存在する場合のデフォルトです。 - 生の HTTP (
curl、requests、fetch、httpxなど) — ユーザーが明示的に cURL/REST/生の HTTP を要求した場合、プロジェクトがシェル/cURL プロジェクトである場合、または言語に公式 SDK がない場合にのみ使用します。
この2つを混在させないでください。Python や TypeScript プロジェクトで、より軽量に感じるからといって requests/fetch を使用しないでください。OpenAI 互換のシムに頼らないでください。
SDK の使用法を推測しないでください。 関数名、クラス名、名前空間、メソッドシグネチャ、およびインポートパスは、明示的なドキュメント(このスキル内の {lang}/ ファイル、または shared/live-sources.md にリストされている公式 SDK リポジトリまたはドキュメントリンク)から取得する必要があります。必要なバインディングがスキルファイルに明示的に文書化されていない場合は、コードを記述する前に shared/live-sources.md から関連する SDK リポジトリを WebFetch してください。cURL の形式や他の言語の SDK から Ruby/Java/Go/PHP/C# の API を推測しないでください。
デフォルト
ユーザーが特に指定しない限り:
Claude モデルのバージョンには、claude-opus-4-7 という正確なモデル文字列でアクセスできる Claude Opus 4.7 を使用してください。少しでも複雑なことには、適応的思考 (thinking: {type: "adaptive"}) をデフォルトで使用してください。最後に、長い入力、長い出力、または高い max_tokens を伴う可能性のあるすべてのリクエストには、ストリーミングをデフォルトで使用してください。これにより、リクエストのタイムアウトを防ぐことができます。個々のストリームイベントを処理する必要がない場合は、SDK の .get_final_message() / .finalMessage() ヘルパーを使用して完全な応答を取得してください。
サブコマンド
このプロンプトの最後にあるユーザーリクエストが、単なるサブコマンド文字列(散文なし)である場合、このドキュメントのすべてのサブコマンドテーブル(以下に追記されたセクションのテーブルを含む)を検索し、一致するアクション列に直接従ってください。これにより、ユーザーは /claude-api <subcommand> を介して特定のフローを呼び出すことができます。ドキュメント内のどのテーブルも一致しない場合は、リクエストを通常の散文として扱ってください。
言語検出
コード例を読む前に、ユーザーがどの言語で作業しているかを判断してください。
-
プロジェクトファイルを見て言語を推測します。
*.py、requirements.txt、pyproject.toml、setup.py、Pipfile→ Python —python/から読み取ります*.ts、*.tsx、package.json、tsconfig.json→ TypeScript —typescript/から読み取ります*.js、*.jsx(.tsファイルがない場合) → TypeScript — JS は同じ SDK を使用します。typescript/から読み取ります*.java、pom.xml、build.gradle→ Java —java/から読み取ります*.kt、*.kts、build.gradle.kts→ Java — Kotlin は Java SDK を使用します。java/から読み取ります*.scala、build.sbt→ Java — Scala は Java SDK を使用します。java/から読み取ります*.go、go.mod→ Go —go/から読み取ります*.rb、Gemfile→ Ruby —ruby/から読み取ります*.cs、*.csproj→ C# —csharp/から読み取ります*.php、composer.json→ PHP —php/から読み取ります
-
複数の言語が検出された場合(例:Python と TypeScript の両方のファイルがある場合):
- ユーザーの現在のファイルまたは質問がどの言語に関連しているかを確認します。
- まだ曖昧な場合は、「Python と TypeScript の両方のファイルを検出しました。Claude API 統合にはどちらの言語を使用していますか?」と尋ねます。
-
言語を推測できない場合(空のプロジェクト、ソースファイルなし、またはサポートされていない言語):
- オプション付きで AskUserQuestion を使用します:Python、TypeScript、Java、Go、Ruby、cURL/raw HTTP、C#、PHP
- AskUserQuestion が利用できない場合は、Python の例をデフォルトとし、「Python の例を表示しています。別の言語が必要な場合はお知らせください。」と注記します。
-
サポートされていない言語が検出された場合(Rust、Swift、C++、Elixir など):
curl/から cURL/raw HTTP の例を提案し、コミュニティ SDK が存在する可能性があることを注記します。- 参照実装として Python または TypeScript の例を表示することを提案します。
-
ユーザーが cURL/raw HTTP の例を必要とする場合は、
curl/から読み取ります。
言語固有の機能サポート
| 言語 | ツールランナー | マネージドエージェント | 注記 |
|---|---|---|---|
| Python | はい (ベータ) | はい (ベータ) | 完全サポート — @beta_tool デコレータ |
| TypeScript | はい (ベータ) | はい (ベータ) | 完全サポート — betaZodTool + Zod |
| Java | はい (ベータ) | はい (ベータ) | アノテーション付きクラスでのベータツール使用 |
| Go | はい (ベータ) | はい (ベータ) | toolrunner パッケージの BetaToolRunner |
| Ruby | はい (ベータ) | はい (ベータ) | ベータ版の BaseTool + tool_runner |
| C# | いいえ | いいえ | 公式 SDK |
| PHP | はい (ベータ) | はい (ベータ) | BetaRunnableTool + toolRunner() |
| cURL | N/A | はい (ベータ) | 生の HTTP、SDK 機能なし |
マネージドエージェントのコード例: Python、TypeScript、Go、Ruby、PHP、Java、および cURL 用に専用の言語固有の README が提供されています (
{lang}/managed-agents/README.md、curl/managed-agent
(原文がここで切り詰められています)
📜 原文 SKILL.md(Claudeが読む英語/中国語)を展開
Building LLM-Powered Applications with Claude
This skill helps you build LLM-powered applications with Claude. Choose the right surface based on your needs, detect the project language, then read the relevant language-specific documentation.
Before You Start
Scan the target file (or, if no target file, the prompt and project) for non-Anthropic provider markers — import openai, from openai, langchain_openai, OpenAI(, gpt-4, gpt-5, file names like agent-openai.py or *-generic.py, or any explicit instruction to keep the code provider-neutral. If you find any, stop and tell the user that this skill produces Claude/Anthropic SDK code; ask whether they want to switch the file to Claude or want a non-Claude implementation. Do not edit a non-Anthropic file with Anthropic SDK calls.
Output Requirement
When the user asks you to add, modify, or implement a Claude feature, your code must call Claude through one of:
- The official Anthropic SDK for the project's language (
anthropic,@anthropic-ai/sdk,com.anthropic.*, etc.). This is the default whenever a supported SDK exists for the project. - Raw HTTP (
curl,requests,fetch,httpx, etc.) — only when the user explicitly asks for cURL/REST/raw HTTP, the project is a shell/cURL project, or the language has no official SDK.
Never mix the two — don't reach for requests/fetch in a Python or TypeScript project just because it feels lighter. Never fall back to OpenAI-compatible shims.
Never guess SDK usage. Function names, class names, namespaces, method signatures, and import paths must come from explicit documentation — either the {lang}/ files in this skill or the official SDK repositories or documentation links listed in shared/live-sources.md. If the binding you need is not explicitly documented in the skill files, WebFetch the relevant SDK repo from shared/live-sources.md before writing code. Do not infer Ruby/Java/Go/PHP/C# APIs from cURL shapes or from another language's SDK.
Defaults
Unless the user requests otherwise:
For the Claude model version, please use Claude Opus 4.7, which you can access via the exact model string claude-opus-4-7. Please default to using adaptive thinking (thinking: {type: "adaptive"}) for anything remotely complicated. And finally, please default to streaming for any request that may involve long input, long output, or high max_tokens — it prevents hitting request timeouts. Use the SDK's .get_final_message() / .finalMessage() helper to get the complete response if you don't need to handle individual stream events
Subcommands
If the User Request at the bottom of this prompt is a bare subcommand string (no prose), search every Subcommands table in this document — including any in sections appended below — and follow the matching Action column directly. This lets users invoke specific flows via /claude-api <subcommand>. If no table in the document matches, treat the request as normal prose.
Language Detection
Before reading code examples, determine which language the user is working in:
-
Look at project files to infer the language:
*.py,requirements.txt,pyproject.toml,setup.py,Pipfile→ Python — read frompython/*.ts,*.tsx,package.json,tsconfig.json→ TypeScript — read fromtypescript/*.js,*.jsx(no.tsfiles present) → TypeScript — JS uses the same SDK, read fromtypescript/*.java,pom.xml,build.gradle→ Java — read fromjava/*.kt,*.kts,build.gradle.kts→ Java — Kotlin uses the Java SDK, read fromjava/*.scala,build.sbt→ Java — Scala uses the Java SDK, read fromjava/*.go,go.mod→ Go — read fromgo/*.rb,Gemfile→ Ruby — read fromruby/*.cs,*.csproj→ C# — read fromcsharp/*.php,composer.json→ PHP — read fromphp/
-
If multiple languages detected (e.g., both Python and TypeScript files):
- Check which language the user's current file or question relates to
- If still ambiguous, ask: "I detected both Python and TypeScript files. Which language are you using for the Claude API integration?"
-
If language can't be inferred (empty project, no source files, or unsupported language):
- Use AskUserQuestion with options: Python, TypeScript, Java, Go, Ruby, cURL/raw HTTP, C#, PHP
- If AskUserQuestion is unavailable, default to Python examples and note: "Showing Python examples. Let me know if you need a different language."
-
If unsupported language detected (Rust, Swift, C++, Elixir, etc.):
- Suggest cURL/raw HTTP examples from
curl/and note that community SDKs may exist - Offer to show Python or TypeScript examples as reference implementations
- Suggest cURL/raw HTTP examples from
-
If user needs cURL/raw HTTP examples, read from
curl/.
Language-Specific Feature Support
| Language | Tool Runner | Managed Agents | Notes |
|---|---|---|---|
| Python | Yes (beta) | Yes (beta) | Full support — @beta_tool decorator |
| TypeScript | Yes (beta) | Yes (beta) | Full support — betaZodTool + Zod |
| Java | Yes (beta) | Yes (beta) | Beta tool use with annotated classes |
| Go | Yes (beta) | Yes (beta) | BetaToolRunner in toolrunner pkg |
| Ruby | Yes (beta) | Yes (beta) | BaseTool + tool_runner in beta |
| C# | No | No | Official SDK |
| PHP | Yes (beta) | Yes (beta) | BetaRunnableTool + toolRunner() |
| cURL | N/A | Yes (beta) | Raw HTTP, no SDK features |
Managed Agents code examples: dedicated language-specific READMEs are provided for Python, TypeScript, Go, Ruby, PHP, Java, and cURL (
{lang}/managed-agents/README.md,curl/managed-agents.md). Read your language's README plus the language-agnosticshared/managed-agents-*.mdconcept files. Agents are persistent — create once, reference by ID. Store the agent ID returned byagents.createand pass it to every subsequentsessions.create; do not callagents.createin the request path. The Anthropic CLI is one convenient way to create agents and environments from version-controlled YAML — its URL is inshared/live-sources.md. If a binding you need isn't shown in the README, WebFetch the relevant entry fromshared/live-sources.mdrather than guess. C# does not currently have Managed Agents support; use cURL-style raw HTTP requests against the API.
Which Surface Should I Use?
Start simple. Default to the simplest tier that meets your needs. Single API calls and workflows handle most use cases — only reach for agents when the task genuinely requires open-ended, model-driven exploration.
| Use Case | Tier | Recommended Surface | Why |
|---|---|---|---|
| Classification, summarization, extraction, Q&A | Single LLM call | Claude API | One request, one response |
| Batch processing or embeddings | Single LLM call | Claude API | Specialized endpoints |
| Multi-step pipelines with code-controlled logic | Workflow | Claude API + tool use | You orchestrate the loop |
| Custom agent with your own tools | Agent | Claude API + tool use | Maximum flexibility |
| Server-managed stateful agent with workspace | Agent | Managed Agents | Anthropic runs the loop and hosts the tool-execution sandbox |
| Persisted, versioned agent configs | Agent | Managed Agents | Agents are stored objects; sessions pin to a version |
| Long-running multi-turn agent with file mounts | Agent | Managed Agents | Per-session containers, SSE event stream, Skills + MCP |
Note: Managed Agents is the right choice when you want Anthropic to run the agent loop and host the container where tools execute — file ops, bash, code execution all run in the per-session workspace. If you want to host the compute yourself or run your own custom tool runtime, Claude API + tool use is the right choice — use the tool runner for automatic loop handling, or the manual loop for fine-grained control (approval gates, custom logging, conditional execution).
Third-party providers (Amazon Bedrock, Google Vertex AI, Microsoft Foundry): Managed Agents is not available on Bedrock, Vertex, or Foundry. If you are deploying through any third-party provider, use Claude API + tool use for all use cases — including ones where Managed Agents would otherwise be the recommended surface.
Decision Tree
What does your application need?
0. Are you deploying through Amazon Bedrock, Google Vertex AI, or Microsoft Foundry?
└── Yes → Claude API (+ tool use for agents) — Managed Agents is 1P only.
No → continue.
1. Single LLM call (classification, summarization, extraction, Q&A)
└── Claude API — one request, one response
2. Do you want Anthropic to run the agent loop and host a per-session
container where Claude executes tools (bash, file ops, code)?
└── Yes → Managed Agents — server-managed sessions, persisted agent configs,
SSE event stream, Skills + MCP, file mounts.
Examples: "stateful coding agent with a workspace per task",
"long-running research agent that streams events to a UI",
"agent with persisted, versioned config used across many sessions"
3. Workflow (multi-step, code-orchestrated, with your own tools)
└── Claude API with tool use — you control the loop
4. Open-ended agent (model decides its own trajectory, your own tools, you host the compute)
└── Claude API agentic loop (maximum flexibility)
Should I Build an Agent?
Before choosing the agent tier, check all four criteria:
- Complexity — Is the task multi-step and hard to fully specify in advance? (e.g., "turn this design doc into a PR" vs. "extract the title from this PDF")
- Value — Does the outcome justify higher cost and latency?
- Viability — Is Claude capable at this task type?
- Cost of error — Can errors be caught and recovered from? (tests, review, rollback)
If the answer is "no" to any of these, stay at a simpler tier (single call or workflow).
Architecture
Everything goes through POST /v1/messages. Tools and output constraints are features of this single endpoint — not separate APIs.
User-defined tools — You define tools (via decorators, Zod schemas, or raw JSON), and the SDK's tool runner handles calling the API, executing your functions, and looping until Claude is done. For full control, you can write the loop manually.
Server-side tools — Anthropic-hosted tools that run on Anthropic's infrastructure. Code execution is fully server-side (declare it in tools, Claude runs code automatically). Computer use can be server-hosted or self-hosted.
Structured outputs — Constrains the Messages API response format (output_config.format) and/or tool parameter validation (strict: true). The recommended approach is client.messages.parse() which validates responses against your schema automatically. Note: the old output_format parameter is deprecated; use output_config: {format: {...}} on messages.create().
Supporting endpoints — Batches (POST /v1/messages/batches), Files (POST /v1/files), Token Counting, and Models (GET /v1/models, GET /v1/models/{id} — live capability/context-window discovery) feed into or support Messages API requests.
Current Models (cached: 2026-04-15)
| Model | Model ID | Context | Input $/1M | Output $/1M |
|---|---|---|---|---|
| Claude Opus 4.7 | claude-opus-4-7 |
1M | $5.00 | $25.00 |
| Claude Opus 4.6 | claude-opus-4-6 |
1M | $5.00 | $25.00 |
| Claude Sonnet 4.6 | claude-sonnet-4-6 |
1M | $3.00 | $15.00 |
| Claude Haiku 4.5 | claude-haiku-4-5 |
200K | $1.00 | $5.00 |
ALWAYS use claude-opus-4-7 unless the user explicitly names a different model. This is non-negotiable. Do not use claude-sonnet-4-6, claude-sonnet-4-5, or any other model unless the user literally says "use sonnet" or "use haiku". Never downgrade for cost — that's the user's decision, not yours.
CRITICAL: Use only the exact model ID strings from the table above — they are complete as-is. Do not append date suffixes. For example, use claude-sonnet-4-5, never claude-sonnet-4-5-20250514 or any other date-suffixed variant you might recall from training data. If the user requests an older model not in the table (e.g., "opus 4.5", "sonnet 3.7"), read shared/models.md for the exact ID — do not construct one yourself.
A note: if any of the model strings above look unfamiliar to you, that's to be expected — that just means they were released after your training data cutoff. Rest assured they are real models; we wouldn't mess with you like that.
Live capability lookup: The table above is cached. When the user asks "what's the context window for X", "does X support vision/thinking/effort", or "which models support Y", query the Models API (client.models.retrieve(id) / client.models.list()) — see shared/models.md for the field reference and capability-filter examples.
Thinking & Effort (Quick Reference)
Opus 4.7 — Adaptive thinking only: Use thinking: {type: "adaptive"}. thinking: {type: "enabled", budget_tokens: N} returns a 400 on Opus 4.7 — adaptive is the only on-mode. {type: "disabled"} and omitting thinking both work. Sampling parameters (temperature, top_p, top_k) are also removed and will 400. See shared/model-migration.md → Migrating to Opus 4.7 for the full breaking-change list.
Opus 4.6 — Adaptive thinking (recommended): Use thinking: {type: "adaptive"}. Claude dynamically decides when and how much to think. No budget_tokens needed — budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and should not be used for new code. Adaptive thinking also automatically enables interleaved thinking (no beta header needed). When the user asks for "extended thinking", a "thinking budget", or budget_tokens: always use Opus 4.7 or 4.6 with thinking: {type: "adaptive"}. The concept of a fixed token budget for thinking is deprecated — adaptive thinking replaces it. Do NOT use budget_tokens for new 4.6/4.7 code and do NOT switch to an older model. Gradual-migration carve-out: budget_tokens is still functional on Opus 4.6 and Sonnet 4.6 as a transitional escape hatch — if you're migrating existing code and need a hard token ceiling before you've tuned effort, see shared/model-migration.md → Transitional escape hatch. Note: this carve-out does not apply to Opus 4.7 — budget_tokens is fully removed there.
Effort parameter (GA, no beta header): Controls thinking depth and overall token spend via output_config: {effort: "low"|"medium"|"high"|"max"} (inside output_config, not top-level). Default is high (equivalent to omitting it). max is Opus-tier only (Opus 4.6 and later — not Sonnet or Haiku). Opus 4.7 adds "xhigh" (between high and max) — the best setting for most coding and agentic use cases on 4.7, and the default in Claude Code; use a minimum of high for most intelligence-sensitive work. Works on Opus 4.5, Opus 4.6, Opus 4.7, and Sonnet 4.6. Will error on Sonnet 4.5 / Haiku 4.5. On Opus 4.7, effort matters more than on any prior Opus — re-tune it when migrating. Combine with adaptive thinking for the best cost-quality tradeoffs. Lower effort means fewer and more-consolidated tool calls, less preamble, and terser confirmations — high is often the sweet spot balancing quality and token efficiency; use max when correctness matters more than cost; use low for subagents or simple tasks.
Opus 4.7 — thinking content omitted by default: thinking blocks still stream but their text is empty unless you opt in with thinking: {type: "adaptive", display: "summarized"} (default is "omitted"). Silent change — no error. If you stream reasoning to users, the default looks like a long pause before output; set "summarized" to restore visible progress.
Task Budgets (beta, Opus 4.7): output_config: {task_budget: {type: "tokens", total: N}} tells the model how many tokens it has for a full agentic loop — it sees a running countdown and self-moderates (minimum 20,000; beta header task-budgets-2026-03-13). Distinct from max_tokens, which is an enforced per-response ceiling the model is not aware of. See shared/model-migration.md → Task Budgets.
Sonnet 4.6: Supports adaptive thinking (thinking: {type: "adaptive"}). budget_tokens is deprecated on Sonnet 4.6 — use adaptive thinking instead.
Older models (only if explicitly requested): If the user specifically asks for Sonnet 4.5 or another older model, use thinking: {type: "enabled", budget_tokens: N}. budget_tokens must be less than max_tokens (minimum 1024). Never choose an older model just because the user mentions budget_tokens — use Opus 4.7 with adaptive thinking instead.
Compaction (Quick Reference)
Beta, Opus 4.7, Opus 4.6, and Sonnet 4.6. For long-running conversations that may exceed the 1M context window, enable server-side compaction. The API automatically summarizes earlier context when it approaches the trigger threshold (default: 150K tokens). Requires beta header compact-2026-01-12.
Critical: Append response.content (not just the text) back to your messages on every turn. Compaction blocks in the response must be preserved — the API uses them to replace the compacted history on the next request. Extracting only the text string and appending that will silently lose the compaction state.
See {lang}/claude-api/README.md (Compaction section) for code examples. Full docs via WebFetch in shared/live-sources.md.
Prompt Caching (Quick Reference)
Prefix match. Any byte change anywhere in the prefix invalidates everything after it. Render order is tools → system → messages. Keep stable content first (frozen system prompt, deterministic tool list), put volatile content (timestamps, per-request IDs, varying questions) after the last cache_control breakpoint.
Top-level auto-caching (cache_control: {type: "ephemeral"} on messages.create()) is the simplest option when you don't need fine-grained placement. Max 4 breakpoints per request. Minimum cacheable prefix is ~1024 tokens — shorter prefixes silently won't cache.
Verify with usage.cache_read_input_tokens — if it's zero across repeated requests, a silent invalidator is at work (datetime.now() in system prompt, unsorted JSON, varying tool set).
For placement patterns, architectural guidance, and the silent-invalidator audit checklist: read shared/prompt-caching.md. Language-specific syntax: {lang}/claude-api/README.md (Prompt Caching section).
Managed Agents (Beta)
Managed Agents is a third surface: server-managed stateful agents with Anthropic-hosted tool execution. You create a persisted, versioned Agent config (POST /v1/agents), then start Sessions that reference it. Each session provisions a container as the agent's workspace — bash, file ops, and code execution run there; the agent loop itself runs on Anthropic's orchestration layer and acts on the container via tools. The session streams events; you send messages and tool results back.
Managed Agents is first-party only. It is not available on Amazon Bedrock, Google Vertex AI, or Microsoft Foundry. For agents on third-party providers, use Claude API + tool use.
Mandatory flow: Agent (once) → Session (every run). model/system/tools live on the agent, never the session. See shared/managed-agents-overview.md for the full reading guide, beta headers, and pitfalls.
Beta headers: managed-agents-2026-04-01 — the SDK sets this automatically for all client.beta.{agents,environments,sessions,vaults,memory_stores}.* calls. Skills API uses skills-2025-10-02 and Files API uses files-api-2025-04-14, but you don't need to explicitly pass those in for endpoints other than /v1/skills and /v1/files.
Subcommands — invoke directly with /claude-api <subcommand>:
| Subcommand | Action |
|---|---|
managed-agents-onboard |
Walk the user through setting up a Managed Agent from scratch. Read shared/managed-agents-onboarding.md immediately and follow its interview script: mental model → know-or-explore branch → template config → session setup → emit code. Do not summarize — run the interview. |
Reading guide: Start with shared/managed-agents-overview.md, then the topical shared/managed-agents-*.md files (core, environments, tools, events, outcomes, multiagent, webhooks, memory, client-patterns, onboarding, api-reference). For Python, TypeScript, Go, Ruby, PHP, and Java, read {lang}/managed-agents/README.md for code examples. For cURL, read curl/managed-agents.md. Agents are persistent — create once, reference by ID. Store the agent ID returned by agents.create and pass it to every subsequent sessions.create; do not call agents.create in the request path. The Anthropic CLI is one convenient way to create agents and environments from version-controlled YAML (URL in shared/live-sources.md). If a binding you need isn't shown in the language README, WebFetch the relevant entry from shared/live-sources.md rather than guess. C# does not currently have Managed Agents support; use raw HTTP from curl/managed-agents.md as a reference.
When the user wants to set up a Managed Agent from scratch (e.g. "how do I get started", "walk me through creating one", "set up a new agent"): read shared/managed-agents-onboarding.md and run its interview — same flow as the managed-agents-onboard subcommand.
When the user asks "how do I write the client code for X": reach for shared/managed-agents-client-patterns.md — covers lossless stream reconnect, processed_at queued/processed gate, interrupt, tool_confirmation round-trip, the correct idle/terminated break gate, post-idle status race, stream-first ordering, file-mount gotchas, keeping credentials host-side via custom tools, etc.
Reading Guide
After detecting the language, read the relevant files based on what the user needs:
Quick Task Reference
Single text classification/summarization/extraction/Q&A:
→ Read only {lang}/claude-api/README.md
Chat UI or real-time response display:
→ Read {lang}/claude-api/README.md + {lang}/claude-api/streaming.md
Long-running conversations (may exceed context window):
→ Read {lang}/claude-api/README.md — see Compaction section
Migrating to a newer model (Opus 4.7 / Opus 4.6 / Sonnet 4.6) or replacing a retired model:
→ Read shared/model-migration.md
Prompt caching / optimize caching / "why is my cache hit rate low":
→ Read shared/prompt-caching.md + {lang}/claude-api/README.md (Prompt Caching section)
Function calling / tool use / agents:
→ Read {lang}/claude-api/README.md + shared/tool-use-concepts.md + {lang}/claude-api/tool-use.md
Agent design (tool surface, context management, caching strategy):
→ Read shared/agent-design.md
Batch processing (non-latency-sensitive):
→ Read {lang}/claude-api/README.md + {lang}/claude-api/batches.md
File uploads across multiple requests:
→ Read {lang}/claude-api/README.md + {lang}/claude-api/files-api.md
Managed Agents (server-managed stateful agents with workspace):
→ Read shared/managed-agents-overview.md + the rest of the shared/managed-agents-*.md files. For Python, TypeScript, Go, Ruby, PHP, and Java, read {lang}/managed-agents/README.md for code examples. For cURL, read curl/managed-agents.md. Agents are persistent — create once, reference by ID. Store the agent ID returned by agents.create and pass it to every subsequent sessions.create; do not call agents.create in the request path. The Anthropic CLI is one convenient way to create agents and environments from version-controlled YAML (URL in shared/live-sources.md). If a binding you need isn't shown in the language README, WebFetch the relevant entry from shared/live-sources.md rather than guess. C# does not currently support Managed Agents — use raw HTTP from curl/managed-agents.md as a reference.
Claude API (Full File Reference)
Read the language-specific Claude API folder ({language}/claude-api/):
{language}/claude-api/README.md— Read this first. Installation, quick start, common patterns, error handling.shared/tool-use-concepts.md— Read when the user needs function calling, code execution, memory, or structured outputs. Covers conceptual foundations.shared/agent-design.md— Read when designing an agent: bash vs. dedicated tools, programmatic tool calling, tool search/skills, context editing vs. compaction vs. memory, caching principles.{language}/claude-api/tool-use.md— Read for language-specific tool use code examples (tool runner, manual loop, code execution, memory, structured outputs).{language}/claude-api/streaming.md— Read when building chat UIs or interfaces that display responses incrementally.{language}/claude-api/batches.md— Read when processing many requests offline (not latency-sensitive). Runs asynchronously at 50% cost.{language}/claude-api/files-api.md— Read when sending the same file across multiple requests without re-uploading.shared/prompt-caching.md— Read when adding or optimizing prompt caching. Covers prefix-stability design, breakpoint placement, and anti-patterns that silently invalidate cache.shared/error-codes.md— Read when debugging HTTP errors or implementing error handling.shared/model-migration.md— Read when upgrading to newer models, replacing retired models, or translatingbudget_tokens/ prefill patterns to the current API.shared/live-sources.md— WebFetch URLs for fetching the latest official documentation.
Note: For Java, Go, Ruby, C#, PHP, and cURL — these have a single file each covering all basics. Read that file plus
shared/tool-use-concepts.mdandshared/error-codes.mdas needed.
Note: For the Managed Agents file reference, see the
## Managed Agents (Beta)section above — it lists everyshared/managed-agents-*.mdfile and the language-specific READMEs.
When to Use WebFetch
Use WebFetch to get the latest documentation when:
- User asks for "latest" or "current" information
- Cached data seems incorrect
- User asks about features not covered here
Live documentation URLs are in shared/live-sources.md.
Common Pitfalls
- Don't truncate inputs when passing files or content to the API. If the content is too long to fit in the context window, notify the user and discuss options (chunking, summarization, etc.) rather than silently truncating.
- Opus 4.7 thinking: Adaptive only.
thinking: {type: "enabled", budget_tokens: N}returns 400 on Opus 4.7 —budget_tokensis fully removed there (along withtemperature,top_p,top_k). Usethinking: {type: "adaptive"}. - Opus 4.6 / Sonnet 4.6 thinking: Use
thinking: {type: "adaptive"}— do NOT usebudget_tokensfor new 4.6 code (deprecated on both Opus 4.6 and Sonnet 4.6; for gradual migration of existing code, see the transitional escape hatch inshared/model-migration.md— note this carve-out does not apply to Opus 4.7). For older models,budget_tokensmust be less thanmax_tokens(minimum 1024). This will throw an error if you get it wrong. - 4.6/4.7 family prefill removed: Assistant message prefills (last-assistant-turn prefills) return a 400 error on Opus 4.6, Opus 4.7, and Sonnet 4.6. Use structured outputs (
output_config.format) or system prompt instructions to control response format instead. - Confirm migration scope before editing: When a user asks to migrate code to a newer Claude model without naming a specific file, directory, or file list, ask which scope to apply first — the entire working directory, a specific subdirectory, or a specific set of files. Do not start editing until the user confirms. Imperative phrasings like "migrate my codebase", "move my project to X", "upgrade to Sonnet 4.6", or bare "migrate to Opus 4.7" are still ambiguous — they tell you what to do but not where, so ask. Proceed without asking only when the prompt names an exact file, a specific directory, or an explicit file list ("migrate
app.py", "migrate everything underservices/", "updatea.pyandb.py"). Seeshared/model-migration.mdStep 0. max_tokensdefaults: Don't lowballmax_tokens— hitting the cap truncates output mid-thought and requires a retry. For non-streaming requests, default to~16000(keeps responses under SDK HTTP timeouts). For streaming requests, default to~64000(timeouts aren't a concern, so give the model room). Only go lower when you have a hard reason: classification (~256), cost caps, or deliberately short outputs.- 128K output tokens: Opus 4.6 and Opus 4.7 support up to 128K
max_tokens, but the SDKs require streaming for values that large to avoid HTTP timeouts. Use.stream()with.get_final_message()/.finalMessage(). - Tool call JSON parsing (4.6/4.7 family): Opus 4.6, Opus 4.7, and Sonnet 4.6 may produce different JSON string escaping in tool call
inputfields (e.g., Unicode or forward-slash escaping). Always parse tool inputs withjson.loads()/JSON.parse()— never do raw string matching on the serialized input. - Structured outputs (all models): Use
output_config: {format: {...}}instead of the deprecatedoutput_formatparameter onmessages.create(). This is a general API change, not 4.6-specific. - Don't reimplement SDK functionality: The SDK provides high-level helpers — use them instead of building from scratch. Specifically: use
stream.finalMessage()instead of wrapping.on()events innew Promise(); use typed exception classes (Anthropic.RateLimitError, etc.) instead of string-matching error messages; use SDK types (Anthropic.MessageParam,Anthropic.Tool,Anthropic.Message, etc.) instead of redefining equivalent interfaces. - Don't define custom types for SDK data structures: The SDK exports types for all API objects. Use
Anthropic.MessageParamfor messages,Anthropic.Toolfor tool definitions,Anthropic.ToolUseBlock/Anthropic.ToolResultBlockParamfor tool results,Anthropic.Messagefor responses. Defining your owninterface ChatMessage { role: string; content: unknown }duplicates what the SDK already provides and loses type safety. - Report and document output: For tasks that produce reports, documents, or visualizations, the code execution sandbox has
python-docx,python-pptx,matplotlib,pillow, andpypdfpre-installed. Claude can generate formatted files (DOCX, PDF, charts) and return them via the Files API — consider this for "report" or "document" type requests instead of plain stdout text.
同梱ファイル
※ ZIPに含まれるファイル一覧。`SKILL.md` 本体に加え、参考資料・サンプル・スクリプトが入っている場合があります。
- 📄 SKILL.md (33,041 bytes)
- 📎 LICENSE.txt (11,345 bytes)