
structured-output

A Skill for getting typed, validated JSON out of an LLM (large language model) instead of free text. It covers structured data extraction with AI and retry strategies for malformed responses.

📜 Original English description (for reference)

Force LLMs to return typed, validated JSON — not free-text. Use when someone asks to "get structured data from LLM", "parse LLM response as JSON", "make AI return typed output", "validate LLM output", "extract structured data with AI", "use instructor with OpenAI", or "get reliable JSON from Claude/GPT". Covers OpenAI structured outputs, Anthropic tool_use for structured data, Instructor library, Zod schemas, Pydantic models, and retry strategies for malformed responses.

⚡ Recommended: install with a single command (60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads, unzips, and places the Skill automatically.

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o structured-output.zip https://jpskill.com/download/15429.zip && unzip -o structured-output.zip && rm structured-output.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/15429.zip -OutFile "$d\structured-output.zip"; Expand-Archive "$d\structured-output.zip" -DestinationPath $d -Force; ri "$d\structured-output.zip"

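When it finishes, restart Claude Code. The Skill then activates automatically when you ask for something in its area, for example "get structured data from LLM".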

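💾 Manual download (if you prefer not to use the command line)

1. Press the download button on this page to download structured-output.zip
2. Double-click the ZIP file to extract it; this creates a structured-output folder
3. Move that folder to C:\Users\YourName\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code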

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.

🎯 What this Skill does

The description above explains what this Skill does for you. When you ask Claude for something in this area, it is invoked automatically.

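📦 How to install (3 steps)

1. Press the "Download" button on this page to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code to finish. You do not need to say "use this Skill"; it is invoked automatically for related requests.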

Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1

Structured Output

Overview

LLMs love to ramble. When you need JSON, you get JSON wrapped in markdown. When you need a list, you get a paragraph. Structured output forces the model to return exactly the schema you define — validated, typed, and ready to use in code without parsing gymnastics.

When to Use

Extracting structured data from unstructured text (invoices, emails, articles)
Building API responses powered by LLMs that must match a schema
Creating reliable data pipelines where LLM output feeds into databases
Generating configuration files, test cases, or code from natural language
Any time you need JSON from an LLM, not prose

Instructions

Strategy 1: OpenAI Structured Outputs (Native)

OpenAI's response_format with a JSON Schema constrains decoding so that the returned JSON matches your schema. The model can still refuse the request or be cut off by the token limit, so check for those cases before using the result.

// openai-structured.ts — Type-safe LLM outputs with OpenAI
/**
 * Uses OpenAI's native structured outputs to guarantee
 * responses match a JSON Schema. No parsing, no retries,
 * no "please return valid JSON" prompting hacks.
 */
import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const openai = new OpenAI();

// Define your output schema with Zod
const ProductAnalysis = z.object({
  name: z.string().describe("Product name"),
  category: z.enum(["electronics", "clothing", "food", "software", "other"]),
  sentiment: z.enum(["positive", "negative", "neutral", "mixed"]),
  score: z.number().min(0).max(10).describe("Overall quality score"),
  pros: z.array(z.string()).describe("List of positive aspects"),
  cons: z.array(z.string()).describe("List of negative aspects"),
  summary: z.string().max(200).describe("One-sentence summary"),
});

type ProductAnalysis = z.infer<typeof ProductAnalysis>;

async function analyzeReview(review: string): Promise<ProductAnalysis> {
  const response = await openai.beta.chat.completions.parse({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: "Analyze the product review and extract structured data.",
      },
      { role: "user", content: review },
    ],
    response_format: zodResponseFormat(ProductAnalysis, "product_analysis"),
  });

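  // The JSON is schema-constrained, but `parsed` is null when the model refuses
  const message = response.choices[0].message;
  if (message.refusal || !message.parsed) {
    throw new Error(`No structured output: ${message.refusal ?? "empty response"}`);
  }
  return message.parsed;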
}
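
Even with schema-constrained output, a response can carry a refusal instead of JSON, or stop early at the token limit. The sketch below shows those two checks in Python, assuming the response has already been converted to a plain dict that uses the Chat Completions field names (choices, finish_reason, message.refusal, message.content). The helper unwrap_structured and the StructuredOutputError class are hypothetical names for illustration, not part of any SDK.

```python
import json


class StructuredOutputError(Exception):
    """Raised when a response carries no usable structured payload."""


def unwrap_structured(response: dict) -> dict:
    """Return the parsed JSON payload, or raise with the reason it is missing."""
    choice = response["choices"][0]
    message = choice["message"]

    # A refusal replaces the JSON payload entirely.
    if message.get("refusal"):
        raise StructuredOutputError(f"model refused: {message['refusal']}")

    # Hitting the token limit can leave the JSON cut off mid-object.
    if choice.get("finish_reason") == "length":
        raise StructuredOutputError("output truncated by token limit")

    return json.loads(message["content"])


# Hand-written sample response (assumed shape, not captured from a real call).
ok = {
    "choices": [
        {"finish_reason": "stop",
         "message": {"content": '{"score": 8}', "refusal": None}}
    ]
}
print(unwrap_structured(ok))  # prints {'score': 8}
```

Raising a dedicated exception keeps "the model declined" separate from "the JSON was malformed", which usually call for different handling upstream.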

Strategy 2: Anthropic Tool Use for Structured Data

Rather than relying on a native JSON mode, this strategy uses tool_use: forcing a single tool with tool_choice makes Claude answer with a tool call whose input follows that tool's JSON Schema.

// anthropic-structured.ts — Structured output from Claude via tool_use
/**
 * Uses Anthropic's tool_use feature to extract structured data.
 * Define a "tool" with your output schema — Claude fills it in
 * as a structured tool call instead of free text.
 */
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

interface ExtractedEvent {
  title: string;
  date: string;        // ISO 8601
  location: string;
  attendees: number;
  description: string;
}

async function extractEvent(text: string): Promise<ExtractedEvent> {
  const response = await anthropic.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 1024,
    tools: [
      {
        name: "extract_event",
        description: "Extract event details from text",
        input_schema: {
          type: "object" as const,
          properties: {
            title: { type: "string", description: "Event title" },
            date: { type: "string", description: "Event date in ISO 8601 format" },
            location: { type: "string", description: "Event location" },
            attendees: { type: "number", description: "Expected number of attendees" },
            description: { type: "string", description: "Brief event description" },
          },
          required: ["title", "date", "location", "attendees", "description"],
        },
      },
    ],
    tool_choice: { type: "tool", name: "extract_event" },  // Force this tool
    messages: [{ role: "user", content: `Extract event details:\n\n${text}` }],
  });

  // Find the tool_use block
  const toolBlock = response.content.find((b) => b.type === "tool_use");
  if (!toolBlock || toolBlock.type !== "tool_use") {
    throw new Error("No tool_use block in response");
  }

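  // `input` is typed as unknown, so this cast trusts the forced tool schema.
  // Validate it (for example with Zod) before storing or acting on it.
  return toolBlock.input as ExtractedEvent;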
}
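
A forced tool call returns an input object shaped by the schema, but it is still worth checking field by field before use. The sketch below does that with the Python standard library, assuming the response content is a list of plain dicts with type, name, and input keys (the tool_use block shape). The EVENT_FIELDS table and validate_event helper are hypothetical names that mirror the extract_event schema above.

```python
from datetime import datetime

# Field name -> accepted Python type(s), mirroring the extract_event input_schema.
EVENT_FIELDS = {
    "title": str,
    "date": str,
    "location": str,
    "attendees": (int, float),
    "description": str,
}


def validate_event(content: list) -> dict:
    """Find the extract_event tool call and check its input field by field."""
    block = next(
        (b for b in content
         if b.get("type") == "tool_use" and b.get("name") == "extract_event"),
        None,
    )
    if block is None:
        raise ValueError("no extract_event tool_use block in response")

    data = block["input"]
    for name, expected in EVENT_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(data[name], bool) or not isinstance(data[name], expected):
            raise ValueError(f"wrong type for field: {name}")

    # Accepts an ISO 8601 date or date-time; raises ValueError otherwise.
    datetime.fromisoformat(data["date"])
    return data


# Hand-written sample content (assumed shape, not captured from a real call).
sample = [
    {"type": "tool_use", "id": "toolu_example", "name": "extract_event",
     "input": {"title": "Launch party", "date": "2025-06-01",
               "location": "Tokyo", "attendees": 120,
               "description": "Product launch"}},
]
print(validate_event(sample)["attendees"])  # prints 120
```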

Strategy 3: Instructor Library (Python)

Instructor patches the OpenAI or Anthropic client so that calls return Pydantic models directly, which makes it one of the most ergonomic ways to get structured output in Python.

# instructor_extract.py — Structured LLM output with Pydantic models
"""
Uses the Instructor library to patch OpenAI/Anthropic clients
and return validated Pydantic models. Handles retries automatically
when the model returns invalid data.
"""
import instructor
from pydantic import BaseModel, Field
from openai import OpenAI
from typing import Literal, Optional

# Patch the OpenAI client
client = instructor.from_openai(OpenAI())

class ContactInfo(BaseModel):
    """Extracted contact information from unstructured text."""
    name: str = Field(description="Full name")
    email: Optional[str] = Field(default=None, description="Email address")
    phone: Optional[str] = Field(default=None, description="Phone number")
    company: Optional[str] = Field(default=None, description="Company name")
    role: Optional[str] = Field(default=None, description="Job title or role")

class EmailAnalysis(BaseModel):
    """Structured analysis of an email."""
    sender: ContactInfo
    # Literal types reject values outside the allowed set at validation time
    intent: Literal["inquiry", "complaint", "request", "info", "spam"] = Field(description="Primary intent of the email")
    urgency: Literal["low", "medium", "high", "critical"] = Field(description="How quickly a response is needed")
    action_items: list[str] = Field(description="List of required actions")
    sentiment: float = Field(ge=-1.0, le=1.0, description="Sentiment score -1 to 1")
    summary: str = Field(max_length=200, description="One-sentence summary")

def analyze_email(email_text: str) -> EmailAnalysis:
    """Analyze an email and return structured data.

    Args:
        email_text: Raw email body text

    Returns:
        Validated EmailAnalysis with all fields populated
    """
    return client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=EmailAnalysis,  # Instructor magic
        max_retries=3,                 # Auto-retry on validation failure
        messages=[
            {"role": "system", "content": "Analyze the email and extract structured data."},
            {"role": "user", "content": email_text},
        ],
    )

# Usage
result = analyze_email("Hi, I'm John from Acme Corp. We need the API docs by Friday...")
print(result.model_dump_json(indent=2))

Strategy 4: Retry with Validation (Generic)

When the model doesn't support native structured output, validate and retry.

// retry-structured.ts — Validate and retry LLM output
/**
 * Generic structured output with retry logic.
 * Works with any LLM provider. Sends the validation error
 * back to the model so it can self-correct.
 */
import { z, ZodSchema } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

async function structuredLLM<T>(
  llmCall: (messages: Array<{ role: string; content: string }>) => Promise<string>,
  schema: ZodSchema<T>,
  prompt: string,
  maxRetries: number = 3
): Promise<T> {
  const messages: Array<{ role: string; content: string }> = [
    {
      role: "system",
      content: `Respond with valid JSON matching this schema:\n${JSON.stringify(zodToJsonSchema(schema), null, 2)}`,
    },
    { role: "user", content: prompt },
  ];

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const raw = await llmCall(messages);

    // Strip Markdown code fences (``` or ```json) that models often wrap around JSON
    const jsonStr = raw.replace(/```(?:json)?/g, "").trim();

    try {
      const parsed = JSON.parse(jsonStr);
      return schema.parse(parsed);  // Zod validation
    } catch (error: any) {
      if (attempt === maxRetries) {
        throw new Error(`Failed after ${maxRetries} attempts: ${error.message}`);
      }

      // Send error back to model for self-correction
      messages.push(
        { role: "assistant", content: raw },
        { role: "user", content: `Invalid output. Error: ${error.message}\nPlease fix and return valid JSON.` }
      );
    }
  }

  throw new Error("Unreachable");
}
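
For Python pipelines, the same validate-and-retry loop can be sketched with the standard library alone. This is an illustration under stated assumptions rather than a drop-in implementation: llm_call stands for whatever function sends messages to your provider and returns text, validate is any function that raises ValueError on bad data, and fake_llm is a stub that imitates a model correcting itself after seeing the error.

```python
import json
from typing import Callable

Message = dict[str, str]


def strip_fences(raw: str) -> str:
    """Remove Markdown code fences that models often wrap around JSON."""
    return raw.replace("```json", "").replace("```", "").strip()


def structured_llm(
    llm_call: Callable[[list[Message]], str],
    validate: Callable[[object], dict],
    prompt: str,
    max_attempts: int = 3,
) -> dict:
    """Call the model, validate its JSON, and feed errors back until it passes."""
    messages: list[Message] = [{"role": "user", "content": prompt}]
    for attempt in range(1, max_attempts + 1):
        raw = llm_call(messages)
        try:
            return validate(json.loads(strip_fences(raw)))
        except ValueError as error:  # json.JSONDecodeError is a ValueError
            if attempt == max_attempts:
                raise RuntimeError(
                    f"failed after {max_attempts} attempts: {error}"
                ) from error
            # Show the model its own output and the error so it can self-correct.
            messages.append({"role": "assistant", "content": raw})
            messages.append({
                "role": "user",
                "content": f"Invalid output. Error: {error}\nReturn valid JSON only.",
            })
    raise AssertionError("unreachable")


def fake_llm(messages: list[Message]) -> str:
    """Stub provider: answers in prose first, then in fenced JSON after an error."""
    if len(messages) == 1:
        return "Sure! The urgency is high."
    return '```json\n{"urgency": "high"}\n```'


def validate_urgency(data: object) -> dict:
    if not isinstance(data, dict) or data.get("urgency") not in {"low", "medium", "high"}:
        raise ValueError("urgency must be one of low, medium, high")
    return data


print(structured_llm(fake_llm, validate_urgency, "How urgent is this email?"))
# prints {'urgency': 'high'}
```

Passing the validator in as a function keeps the loop independent of any schema library, so a Pydantic or hand-written check can be swapped in without touching the retry logic.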

Examples

Example 1: Extract data from invoices

User prompt: "Parse PDF invoices and extract line items, totals, vendor info as structured JSON."

The agent will use OpenAI structured outputs with a Zod schema for InvoiceData (vendor, line items array, subtotal, tax, total, date, invoice number) and process each invoice through the model.
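
A schema can confirm that total is a number, but only arithmetic can confirm it is the right number. The sketch below shows one possible shape for InvoiceData in Python, using the fields listed above, plus a consistency check on the extracted amounts. The LineItem fields and the parse_invoice helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: Decimal


@dataclass
class InvoiceData:
    vendor: str
    invoice_number: str
    date: str
    line_items: list
    subtotal: Decimal
    tax: Decimal
    total: Decimal


def parse_invoice(payload: dict) -> InvoiceData:
    """Build InvoiceData from model output and check that the numbers add up."""
    # Decimal(str(x)) avoids binary floating-point error in money arithmetic.
    items = [
        LineItem(i["description"], int(i["quantity"]), Decimal(str(i["unit_price"])))
        for i in payload["line_items"]
    ]
    invoice = InvoiceData(
        vendor=payload["vendor"],
        invoice_number=payload["invoice_number"],
        date=payload["date"],
        line_items=items,
        subtotal=Decimal(str(payload["subtotal"])),
        tax=Decimal(str(payload["tax"])),
        total=Decimal(str(payload["total"])),
    )
    if sum(i.quantity * i.unit_price for i in items) != invoice.subtotal:
        raise ValueError("line items do not sum to subtotal")
    if invoice.subtotal + invoice.tax != invoice.total:
        raise ValueError("subtotal plus tax does not equal total")
    return invoice
```

A failed check is a good candidate for the retry-with-feedback approach: the error message tells the model which figures disagree.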

Example 2: Build a structured API from natural language

User prompt: "I want users to describe what they need in English, and the API returns a structured search query object."

The agent will define a SearchQuery schema (filters, sort, pagination, date range), use Instructor to convert natural language to the schema, and validate the output before passing it to the database.
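
The "validate before passing it to the database" step could look like the following sketch. The field names, the allowed sort columns, and the page-size cap are all hypothetical; the point is that model output is treated as untrusted input and is whitelisted and clamped before it reaches a query builder.

```python
from dataclasses import dataclass, field, fields
from typing import Optional

ALLOWED_SORT_FIELDS = {"relevance", "created_at", "price"}  # hypothetical column names
MAX_PAGE_SIZE = 100  # hypothetical server-side limit


@dataclass
class SearchQuery:
    filters: dict = field(default_factory=dict)
    sort_by: str = "relevance"
    descending: bool = True
    page: int = 1
    page_size: int = 20
    date_from: Optional[str] = None  # ISO 8601 date such as "2024-01-31"
    date_to: Optional[str] = None


def sanitize_query(raw: dict) -> SearchQuery:
    """Turn model output into a query that is safe to hand to the database layer."""
    # Drop any keys the schema does not define instead of passing them through.
    known = {f.name for f in fields(SearchQuery)}
    query = SearchQuery(**{k: v for k, v in raw.items() if k in known})

    # Whitelist the sort column so it can never carry arbitrary text.
    if query.sort_by not in ALLOWED_SORT_FIELDS:
        raise ValueError(f"unsupported sort field: {query.sort_by}")

    # Same-format ISO dates (YYYY-MM-DD) compare correctly as strings.
    if query.date_from and query.date_to and query.date_from > query.date_to:
        raise ValueError("date_from is after date_to")

    # Clamp pagination rather than reject it.
    query.page = max(1, int(query.page))
    query.page_size = min(max(1, int(query.page_size)), MAX_PAGE_SIZE)
    return query
```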

Guidelines

Use native structured output when available: OpenAI's response_format with a strict schema constrains the JSON to your schema, though refusals and truncated responses still need handling
Anthropic tool_use works: force a single tool with tool_choice to get structured data from Claude
Instructor is the best Python option: automatic retries, Pydantic validation, and support for multiple providers
Always validate: even native structured output should be checked with Zod or Pydantic, since a schema cannot enforce business rules
Keep schemas simple: deeply nested schemas increase error rates
Descriptions matter: add .describe() to every field, because the model uses them for guidance
Retry with error feedback: sending the validation error back lets the model self-correct
Use enums over free strings: z.enum(["low", "medium", "high"]) beats z.string()
Temperature 0 for extraction: lower randomness makes output more repeatable and reduces schema violations
Cost: extraction is cheap; gpt-4o-mini handles most extraction tasks at $0.15 per 1M input tokens
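
To make the cost guideline concrete, here is a rough estimate for a batch job. The $0.15 figure comes from the guideline above and is treated here as the input-token price. The output-token price of $0.60 per 1M and the per-document token counts are assumptions for illustration, so check current pricing before relying on the result.

```python
def extraction_cost_usd(
    documents: int,
    input_tokens_per_doc: int,
    output_tokens_per_doc: int,
    input_price_per_million: float = 0.15,   # figure quoted in the guideline above
    output_price_per_million: float = 0.60,  # assumed; check current pricing
) -> float:
    """Estimate the total cost in USD of running an extraction job."""
    input_cost = documents * input_tokens_per_doc / 1_000_000 * input_price_per_million
    output_cost = documents * output_tokens_per_doc / 1_000_000 * output_price_per_million
    return round(input_cost + output_cost, 2)


# 10,000 invoices at roughly 1,500 prompt tokens and 300 output tokens each.
print(extraction_cost_usd(10_000, 1_500, 300))  # prints 4.05
```

Retries multiply this figure, so a high validation-failure rate is worth fixing in the schema or prompt rather than absorbing through max_retries.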