ibis

Expert guidance for Ibis, the Python dataframe library that provides a pandas-like API but generates SQL for execution on many backends, including DuckDB, PostgreSQL, BigQuery, Snowflake, and Spark. Helps developers write analytics code once and run it on different databases without rewriting SQL for each one.

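🇯🇵 Notes for Japanese creators

In a nutshell

Note: this is supplementary commentary added by the jpskill.com editorial team for Japanese business settings. It is reference information, independent of how the Skill itself behaves.

⚡ Recommended: install with a single command (60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). Download, extraction, and placement are fully automatic.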

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o ibis.zip https://jpskill.com/download/14997.zip && unzip -o ibis.zip && rm ibis.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/14997.zip -OutFile "$d\ibis.zip"; Expand-Archive "$d\ibis.zip" -DestinationPath $d -Force; ri "$d\ibis.zip"

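When it finishes, restart Claude Code. The Skill then activates automatically when you make a related request in plain language, for example "Rewrite this pandas script with Ibis".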

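💾 Manual download (for those who find the command difficult)

1. Press the blue button below to download ibis.zip
2. Double-click the ZIP file to extract it; this creates an ibis folder
3. Move that folder to C:\Users\YourName\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code

⬇ Download as .zip (recommended) ⬇ .skill format (for advanced users) Original source ↗

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.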

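🎯 What this Skill can do

Reading the description below tells you what this Skill does for you. It activates automatically when you ask Claude for something in this area.

📦 How to install (3 steps)

1. Press the "Download" button above to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Put the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. You do not need to say "use this Skill"; it is invoked automatically for related requests.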


Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1

📜 Original SKILL.md (the English text that Claude reads)

Ibis — Portable Python Analytics

Overview

Ibis is a Python dataframe library that provides a pandas-like API but generates SQL for execution on many backends, including DuckDB, PostgreSQL, BigQuery, Snowflake, and Spark. It lets developers write analytics code once and run it on different databases without rewriting SQL for each one.

Instructions

Basic Usage

# src/analytics.py — Portable analytics with Ibis
import ibis
from ibis import _                         # Shorthand for column references

# Connect to a backend (DuckDB for local development)
con = ibis.duckdb.connect("analytics.duckdb")

# Or connect to production databases — same code, different backend
# con = ibis.connect("postgres://...")     # URL form; ibis.postgres.connect() takes host=, user=, password=, database=
# con = ibis.bigquery.connect(project_id="my-project")
# con = ibis.snowflake.connect(...)

# Load data
orders = con.table("orders")

# Build a query — this is lazy (no execution until you call .execute())
monthly_revenue = (
    orders
    .filter(_.status == "completed")
    .filter(_.created_at >= "2026-01-01")
    .group_by(month=_.created_at.truncate("M"))
    .agg(
        revenue=_.amount.sum(),
        order_count=_.count(),
        unique_customers=_.customer_id.nunique(),
        avg_order_value=_.amount.mean(),
    )
    .order_by(_.month)
)

# Execute and get a pandas DataFrame
df = monthly_revenue.execute()
print(df)

# Or see the generated SQL
print(ibis.to_sql(monthly_revenue))

Complex Transformations

# Window functions, joins, and case expressions
import ibis
from ibis import _

con = ibis.duckdb.connect("analytics.duckdb")
orders = con.table("orders")
customers = con.table("customers")

# Window functions: rankings and percentiles over per-customer totals
ranked = (
    orders
    .filter(_.status == "completed")
    .group_by(_.customer_id)
    .agg(
        total_spent=_.amount.sum(),
        order_count=_.count(),
        first_order=_.created_at.min(),
        last_order=_.created_at.max(),
    )
    .mutate(
        # Rank customers by revenue
        revenue_rank=ibis.rank().over(
            order_by=ibis.desc(_.total_spent)
        ),
        # Percentile
        revenue_percentile=ibis.percent_rank().over(
            order_by=_.total_spent
        ),
        # Customer segment based on spending
        segment=ibis.case()
            .when(_.total_spent >= 1000, "whale")
            .when(_.total_spent >= 100, "regular")
            .else_("casual")
            .end(),
    )
)

# Joins
customer_analytics = (
    ranked
    .join(customers, _.customer_id == customers.id)
    .select(
        _.customer_id,
        customers.name,
        customers.email,
        customers.plan,
        _.total_spent,
        _.order_count,
        _.segment,
        _.revenue_rank,
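        # Whole days since the last order, computed from epoch seconds so the
        # arithmetic stays on integers (casting an interval to int is not portable)
        days_inactive=(ibis.now().epoch_seconds() - _.last_order.epoch_seconds()) // 86400,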
    )
)

# Cohort analysis
cohorts = (
    orders
    .filter(_.status == "completed")
    .group_by(_.customer_id)
    .mutate(
        cohort_month=_.created_at.min().truncate("M"),
    )
    .mutate(
        # Whole calendar months between the order month and the cohort month
        months_since=(
            (_.created_at.year() - _.cohort_month.year()) * 12
            + (_.created_at.month() - _.cohort_month.month())
        ),
    )
    .group_by(_.cohort_month, _.months_since)
    .agg(
        active_users=_.customer_id.nunique(),
        revenue=_.amount.sum(),
    )
)
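
The months_since column above is meant to count how many months separate an order from the customer's cohort month. As a plain-Python reference for that quantity (the helper name months_between is hypothetical and not part of Ibis), whole calendar months can be counted from the year and month fields:

```python
from datetime import date


def months_between(cohort_month: date, order_month: date) -> int:
    """Whole calendar months from cohort_month to order_month."""
    return (order_month.year - cohort_month.year) * 12 + (
        order_month.month - cohort_month.month
    )


# A customer whose first order was in November 2025 and who orders again
# in February 2026 is three months into the cohort.
print(months_between(date(2025, 11, 1), date(2026, 2, 1)))
```

Counting by calendar fields avoids the drift that a fixed 30-day month introduces over long periods.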

Backend Portability

# The same analytics code runs on any backend
import ibis
from ibis import _

def build_revenue_report(con: ibis.BaseBackend):
    """Build a revenue report — works on any Ibis backend.

    Args:
        con: Any Ibis connection (DuckDB, Postgres, BigQuery, etc.)
    """
    orders = con.table("orders")

    return (
        orders
        .filter(_.status == "completed")
        .group_by(
            month=_.created_at.truncate("M"),
            category=_.category,
        )
        .agg(
            revenue=_.amount.sum(),
            orders=_.count(),
        )
        .order_by(_.month.desc())
    )

# Development: DuckDB on local Parquet files
dev_con = ibis.duckdb.connect()
dev_con.read_parquet("data/orders.parquet", table_name="orders")
report = build_revenue_report(dev_con).execute()

# Production: BigQuery
prod_con = ibis.bigquery.connect(project_id="prod-project", dataset_id="analytics")
report = build_revenue_report(prod_con).execute()

# Testing: in-memory DuckDB (test_data_df is a pandas DataFrame of fixture rows, defined elsewhere)
test_con = ibis.duckdb.connect()
test_con.create_table("orders", test_data_df)
report = build_revenue_report(test_con).execute()
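
test_data_df is not defined in the snippet above. One way to make such a test meaningful is to keep a tiny fixture as plain rows and compute the expected report by hand in Python, then compare it with what the Ibis expression returns. The sketch below covers only the fixture and the reference computation; the row values and the helper name expected_revenue are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical fixture rows with the columns the report reads.
ROWS = [
    {"status": "completed", "created_at": datetime(2026, 1, 5), "category": "books", "amount": 20.0},
    {"status": "completed", "created_at": datetime(2026, 1, 20), "category": "books", "amount": 30.0},
    {"status": "refunded", "created_at": datetime(2026, 1, 21), "category": "books", "amount": 99.0},
    {"status": "completed", "created_at": datetime(2026, 2, 2), "category": "games", "amount": 15.0},
]


def expected_revenue(rows):
    """Reference result: revenue and order count per (month, category)."""
    totals = defaultdict(lambda: {"revenue": 0.0, "orders": 0})
    for row in rows:
        if row["status"] != "completed":
            continue  # mirrors the status filter in the report
        month = row["created_at"].replace(day=1, hour=0, minute=0, second=0, microsecond=0)
        bucket = totals[(month, row["category"])]
        bucket["revenue"] += row["amount"]
        bucket["orders"] += 1
    # Newest month first, matching order_by(_.month.desc()).
    return sorted(
        ({"month": m, "category": c, **v} for (m, c), v in totals.items()),
        key=lambda r: r["month"],
        reverse=True,
    )


for line in expected_revenue(ROWS):
    print(line)
```

With pandas installed, pandas.DataFrame(ROWS) could serve as test_data_df, and the executed report can then be compared row by row against expected_revenue(ROWS).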

UDFs and Custom Functions

# Custom scalar function (Python UDF; not every backend supports these)
import ibis
from ibis import _, udf

@udf.scalar.python
def normalize_email(email: str) -> str:
    """Normalize email addresses for deduplication."""
    local, domain = email.lower().split("@")
    # Remove dots and plus aliases from Gmail
    if domain in ("gmail.com", "googlemail.com"):
        local = local.split("+")[0].replace(".", "")
    return f"{local}@{domain}"

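# Use in queries (con is an Ibis connection, as in the earlier snippets)
customers = con.table("customers")
deduped = (
    customers
    .mutate(clean_email=normalize_email(_.email))
    .group_by(_.clean_email)
    .agg(
        # Not named "count": _.count would resolve to the Table.count method
        occurrences=_.count(),
        first_seen=_.created_at.min(),
    )
    .filter(_.occurrences > 1)
)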
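
Because the body of a Python UDF is ordinary Python, its logic can be checked without a database. The sketch below is a standalone variant of the normalization logic, named normalize_email_py here to mark it as hypothetical. Unlike the version above, it passes through values that are missing or contain no "@", which is an assumption about how dirty rows should be treated.

```python
from typing import Optional

GMAIL_DOMAINS = ("gmail.com", "googlemail.com")


def normalize_email_py(email: Optional[str]) -> Optional[str]:
    """Lower-case an address and strip Gmail dots and plus aliases."""
    if email is None or "@" not in email:
        # Assumption: leave unusable values untouched instead of raising.
        return email
    # rpartition splits on the last "@", so a stray "@" in the local part is kept.
    local, _sep, domain = email.strip().lower().rpartition("@")
    if domain in GMAIL_DOMAINS:
        local = local.split("+")[0].replace(".", "")
    return f"{local}@{domain}"


print(normalize_email_py("John.Doe+news@Gmail.com"))
print(normalize_email_py("not-an-address"))
```

Keeping the logic in a plain function like this makes it easy to unit-test before wrapping it with the UDF decorator.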

Installation

# Core library
pip install ibis-framework

# With specific backends
pip install "ibis-framework[duckdb]"
pip install "ibis-framework[postgres]"
pip install "ibis-framework[bigquery]"
pip install "ibis-framework[snowflake]"
pip install "ibis-framework[pyspark]"

# Interactive mode (for notebooks); the next line is Python, not a shell command
ibis.options.interactive = True   # Auto-execute and display results

Examples

Example 1: Migrating a pandas pipeline to run on BigQuery

User request:

I have a pandas script that calculates cohort retention from our events table. Rewrite it using Ibis so it runs on BigQuery instead of loading everything into memory.

The agent rewrites the pandas code using Ibis expressions (ibis.bigquery.connect(), t.group_by(), t.mutate(), and window functions built with ibis.cumulative_window()), keeping the same logic but generating SQL that executes on BigQuery. The script goes from loading 50M rows into memory to pushing all computation to the warehouse.

Example 2: Building a portable analytics module with DuckDB for dev

User request:

Write an analytics module that computes daily active users and revenue per plan from Parquet files locally, but can switch to Snowflake in production.

The agent creates a module using ibis.duckdb.connect() for local development with Parquet files, writes composable Ibis expressions for DAU (t.select('user_id', 'event_date').distinct().group_by('event_date').count()) and revenue by plan, and adds a get_connection() function that switches to ibis.snowflake.connect() based on an environment variable — same analytics code, different backend.
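
The get_connection() function mentioned above is not shown in the source. A minimal sketch of what it might look like follows. The environment variable names (ANALYTICS_BACKEND, SNOWFLAKE_*) and the helper resolve_backend are assumptions for illustration, and Ibis is imported inside the function so that the selection logic can be exercised without any backend installed.

```python
import os
from typing import Mapping

SUPPORTED_BACKENDS = ("duckdb", "snowflake")


def resolve_backend(env: Mapping[str, str]) -> str:
    """Pick the backend name from the environment, defaulting to DuckDB."""
    name = env.get("ANALYTICS_BACKEND", "duckdb").strip().lower()
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported backend: {name!r}")
    return name


def get_connection(env: Mapping[str, str] = os.environ):
    """Return an Ibis connection for the configured backend."""
    import ibis  # imported lazily; requires ibis-framework with the chosen extra

    if resolve_backend(env) == "snowflake":
        # Keyword names follow the Snowflake connector; check the backend
        # documentation for the Ibis version in use.
        return ibis.snowflake.connect(
            user=env["SNOWFLAKE_USER"],
            password=env["SNOWFLAKE_PASSWORD"],
            account=env["SNOWFLAKE_ACCOUNT"],
            database=env["SNOWFLAKE_DATABASE"],
        )
    # Local development: in-memory DuckDB; Parquet files can be registered on it.
    return ibis.duckdb.connect()


print(resolve_backend({}))
print(resolve_backend({"ANALYTICS_BACKEND": "Snowflake"}))
```

Separating the name lookup from the connection call keeps the part that depends on credentials small and leaves the rest easy to test.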

Guidelines

- Write once, run anywhere: define analytics logic with Ibis; swap backends by changing the connection, not the code
- Lazy by default: Ibis expressions are lazy; they only execute when you call .execute() or .to_pandas()
- DuckDB for development: use DuckDB locally with Parquet files; switch to BigQuery/Snowflake for production
- Use _ for readability: from ibis import _ gives you clean column references: _.amount.sum() vs orders.amount.sum()
- Generate SQL for debugging: use ibis.to_sql(expr) to see the SQL being generated; helps debug unexpected results
- Functions for reuse: wrap analytics logic in functions that take a connection; test with DuckDB, deploy on any backend
- Interactive mode in notebooks: set ibis.options.interactive = True for immediate result display during exploration
- Type your schemas: use ibis.schema() to define expected table schemas; catch type mismatches early
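
One way to catch type mismatches early is to compare the columns a pipeline expects with the columns a table actually has before running any query. The sketch below does this comparison on plain dictionaries. EXPECTED_ORDERS and schema_mismatches are hypothetical names, and the type strings are illustrative rather than taken from Ibis.

```python
from typing import Dict, List

# Hypothetical expected schema for an orders table (column name -> type name).
EXPECTED_ORDERS = {
    "customer_id": "int64",
    "amount": "float64",
    "status": "string",
    "created_at": "timestamp",
}


def schema_mismatches(expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Describe every expected column that is missing or has another type."""
    problems = []
    for column, wanted in expected.items():
        found = actual.get(column)
        if found is None:
            problems.append(f"{column}: missing")
        elif found != wanted:
            problems.append(f"{column}: expected {wanted}, found {found}")
    return problems


# Simulate a table where amount arrived as text and status is absent.
actual = dict(EXPECTED_ORDERS, amount="string")
del actual["status"]
print(schema_mismatches(EXPECTED_ORDERS, actual))
```

With Ibis, the actual mapping would come from the table's schema. How type names are rendered can vary by version, so treat the strings here as placeholders.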