jpskill.com
🛠️ 開発・MCP コミュニティ

querying-data-lake

Execute and manage Athena SQL queries across default and federated catalogs (Glue, S3 Tables, Redshift). Triggers on phrases like: query data, run SQL, athena query, analyze table, SQL query, workgroup status, profile table, query Redshift catalog, query S3 Tables. Do NOT use for finding specific data assets (use finding-data-lake-assets), full catalog audits (use exploring-data-catalog), importing data (use ingesting-into-data-lake).

⚡ おすすめ: コマンド1行でインストール(60秒)

下記のコマンドをコピーしてターミナル(Mac/Linux)または PowerShell(Windows)に貼り付けてください。 ダウンロード → 解凍 → 配置まで全自動。

🍎 Mac / 🐧 Linux
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o querying-data-lake.zip https://jpskill.com/download/23348.zip && unzip -o querying-data-lake.zip && rm querying-data-lake.zip
🪟 Windows (PowerShell)
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/23348.zip -OutFile "$d\querying-data-lake.zip"; Expand-Archive "$d\querying-data-lake.zip" -DestinationPath $d -Force; ri "$d\querying-data-lake.zip"

完了後、Claude Code を再起動 → 普通に「動画プロンプト作って」のように話しかけるだけで自動発動します。

💾 手動でダウンロードしたい(コマンドが難しい人向け)
  1. 1. 下の青いボタンを押して querying-data-lake.zip をダウンロード
  2. 2. ZIPファイルをダブルクリックで解凍 → querying-data-lake フォルダができる
  3. 3. そのフォルダを C:\Users\あなたの名前\.claude\skills\(Win)または ~/.claude/skills/(Mac)へ移動
  4. 4. Claude Code を再起動

⚠️ ダウンロード・利用は自己責任でお願いします。当サイトは内容・動作・安全性について責任を負いません。

🎯 このSkillでできること

下記の説明文を読むと、このSkillがあなたに何をしてくれるかが分かります。Claudeにこの分野の依頼をすると、自動で発動します。

📦 インストール方法 (3ステップ)

  1. 1. 上の「ダウンロード」ボタンを押して .skill ファイルを取得
  2. 2. ファイル名の拡張子を .skill から .zip に変えて展開(macは自動展開可)
  3. 3. 展開してできたフォルダを、ホームフォルダの .claude/skills/ に置く
    • · macOS / Linux: ~/.claude/skills/
    • · Windows: %USERPROFILE%\.claude\skills\

Claude Code を再起動すれば完了。「このSkillを使って…」と話しかけなくても、関連する依頼で自動的に呼び出されます。

詳しい使い方ガイドを見る →
最終更新
2026-05-18
取得日時
2026-05-18
同梱ファイル
3
📖 Claude が読む原文 SKILL.md(中身を展開)

この本文は AI(Claude)が読むための原文(英語または中国語)です。日本語訳は順次追加中。

Query Data Lake

Execute SQL queries on Amazon Athena across default and federated catalogs (Glue, S3 Tables, Redshift) with workgroup selection, statement classification, and error recovery.

Overview

Executes and manages Athena SQL queries across default and federated catalogs. Selects a workgroup, resolves target assets (delegating fuzzy references to finding-data-lake-assets), classifies statements for safety, and reports cost and data scanned. Use the AWS MCP server for sandboxed execution and audit logging; the same AWS CLI commands work directly when the MCP server is not available.

Constraints for parameter acquisition:

  • You MUST accept a single optional argument: SQL text, a named-query name, a workgroup name, a catalog name, or profile TABLE_NAME
  • You MUST accept the argument as direct text or a pointer to a file containing SQL
  • You MUST ask the user for the target AWS region if not already set
  • You MUST confirm the output S3 location before executing any non-trivial query
  • You MUST respect the user's decision to abort at any step

Common Tasks

1. Verify Dependencies

Check for required tools and AWS access before running queries.

Constraints:

  • You MUST verify AWS MCP server tools are available (aws___call_aws) and run queries through them when present; fall back to AWS CLI only if the MCP server is unavailable
  • You MUST NOT fall back to shell or Bash for query execution — results must be captured via the MCP tool or aws athena CLI so output location and cost are tracked
  • You MUST confirm credentials with aws sts get-caller-identity and inform the user about any missing tools

2. Resolve Workgroup

Check caller identity, list workgroups, auto-select the best one (see workgroup-selection.md).

Constraints:

  • You MUST select a workgroup before submitting any query (prevents output-location errors)
  • You MUST present the selected workgroup and its output location to the user
  • You MUST NOT auto-escalate to a different workgroup on failure without user confirmation

3. Resolve the Target Asset

If the user refers to a table by name, by business concept ("our quarterly report", "the sales data"), by S3 path, or by catalog without specifying the table, delegate to finding-data-lake-assets to return the concrete database.table (and catalog if non-default).

Constraints:

  • You MUST NOT attempt to resolve fuzzy asset references with athena list-data-catalogs or by iterating get-tables — those miss federated catalogs and waste tokens
  • You SHOULD skip this step only when the user provides a fully-qualified reference (exact database.table) or raw SQL they want executed as-is
  • You MUST state the resolved asset explicitly before building the query: "Found [table] in [catalog]. Using this for the query."
  • You SHOULD default to the default Glue catalog unless the user mentions "federated", "Redshift", "S3 Tables", or finding-data-lake-assets returns a different catalog

4. Discover Schema

For analytical queries, You SHOULD profile the target table before building the final query. You MUST show sample rows (SELECT ... LIMIT 5) as part of profiling.

5. Build Query

Table addressing depends on catalog type:

  • Default Glue catalog: database.table (omit the catalog prefix for single-catalog queries). In cross-catalog queries, qualify default-catalog tables with "awsdatacatalog".database.table.
  • Registered data source: datasource.database.table
  • Unregistered Glue catalog: "catalog/subcatalog".database.table

6. Classify and Execute

Classify the SQL statement before executing:

Statement Behavior
SELECT, SHOW, DESCRIBE, EXPLAIN Safe — execute
INSERT, UPDATE, DELETE, DROP, ALTER, CREATE, TRUNCATE, MERGE Destructive — warn the user and require explicit confirmation
Unsure Treat as destructive; confirm

Example tool call (via AWS MCP server):

aws___call_aws(command="aws athena start-query-execution --work-group <WORKGROUP_NAME> --query-string '<sql>' --query-execution-context Database=<db>")

For federated or S3 Tables catalogs, also set Catalog=<CATALOG_PATH> in the execution context (e.g. Catalog=s3tablescatalog/<BUCKET_NAME>).

Constraints:

  • You MUST warn the user before executing when the target is Redshift-federated ("No partition pruning — every query scans the full table")
  • You MUST warn the user before executing a cross-catalog join ("Cross-catalog joins incur network overhead and may be slow")
  • You MUST confirm the output S3 location before executing
  • You MUST explain which tool is being called before executing
  • You MUST respect the user's decision to abort

7. Present and Recover

Present results with cost, data scanned, duration, and actionable insights. On failure, list available workgroups and let the user choose which to retry with.

Argument Routing

Resolve in this order; stop at the first match:

  1. Contains SQL keywords (SELECT, SHOW, DESCRIBE, INSERT, etc.) — SQL text, execute directly
  2. profile TABLE_NAME — run comprehensive table profiling (see query-patterns.md)
  3. Matches a known named query — look up and execute
  4. Matches a known workgroup — show workgroup status and recent queries
  5. Matches a known catalog — delegate to exploring-data-catalog to enumerate databases and tables
  6. No args — show recent query activity and available tables

Principles

  • Always select workgroup before executing (prevents output-location errors)
  • Profile unfamiliar tables before running analytical queries
  • Present cost alongside results so users build cost awareness
  • Suggest LIMIT for exploratory queries on large tables
  • Never ask domain questions with obvious answers, but always confirm security-relevant actions (workgroup switches, output location changes, non-SELECT statements)

Troubleshooting

Error Cause Fix
Redshift identifier error with mixed case Redshift-federated names are lowercase only Lowercase the identifier
CatalogId validation failure ARN passed instead of catalog name Pass the catalog name, not the ARN
Cross-catalog information_schema returns nothing Missing catalog qualifier Use catalog-qualified path: "catalog".information_schema.tables
Query fails with output-location error Workgroup has no output location configured Select a different workgroup with an output location, or configure one
Destructive statement executed without confirmation Statement classification skipped Always classify INSERT/UPDATE/DELETE/DROP/ALTER/CREATE/TRUNCATE/MERGE and confirm with the user

Additional Resources

同梱ファイル

※ ZIPに含まれるファイル一覧。`SKILL.md` 本体に加え、参考資料・サンプル・スクリプトが入っている場合があります。