vector

Expert guidance for Vector, the high-performance observability data pipeline built in Rust by Datadog. Helps developers collect, transform, and route logs, metrics, and traces from any source to any destination with minimal resource usage. Vector replaces Logstash, Fluentd, and Filebeat with a single, faster tool.

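⚡ Recommended: install with a single command (60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads the archive, extracts it, and places the files automatically.

🍎 Mac / 🐧 Linux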

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o vector.zip https://jpskill.com/download/15533.zip && unzip -o vector.zip && rm vector.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/15533.zip -OutFile "$d\vector.zip"; Expand-Archive "$d\vector.zip" -DestinationPath $d -Force; ri "$d\vector.zip"

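When it finishes, restart Claude Code. The Skill then activates automatically whenever you make a related request, for example asking for a Vector pipeline configuration.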

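💾 Manual download (if you prefer not to use the command line)

1. Download vector.zip using the download button on this page
2. Double-click the ZIP file to extract it; this creates a vector folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code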

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.

🎯 What this Skill can do

The description below explains what this Skill does for you. It activates automatically when you ask Claude for help in this area.

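📦 Installation (3 steps)

1. Get the .skill file using the Download button on this page
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code to finish. You do not need to say "use this Skill"; it is invoked automatically for related requests.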

Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1

Vector — High-Performance Observability Data Pipeline

Overview

Vector is the high-performance observability data pipeline built in Rust by Datadog. It helps developers collect, transform, and route logs, metrics, and traces from any source to any destination with minimal resource usage. Vector replaces Logstash, Fluentd, and Filebeat with a single, faster tool.

Instructions

Configuration

# vector.toml — Collect, transform, and route observability data

# --- Sources: Where data comes from ---

# Collect from files (like Filebeat)
[sources.app_logs]
type = "file"
include = ["/var/log/app/*.log"]
read_from = "beginning"

# Receive via Syslog
[sources.syslog]
type = "syslog"
address = "0.0.0.0:514"
mode = "tcp"

# Receive via HTTP (for apps that POST logs)
[sources.http_logs]
type = "http_server"
address = "0.0.0.0:8686"
encoding = "json"

# Collect host metrics (CPU, memory, disk, network)
[sources.host_metrics]
type = "host_metrics"
collectors = ["cpu", "memory", "disk", "network"]
scrape_interval_secs = 15

# Receive OpenTelemetry data
[sources.otel]
type = "opentelemetry"
grpc.address = "0.0.0.0:4317"
http.address = "0.0.0.0:4318"

# --- Transforms: Process and enrich data ---

# Parse JSON logs
[transforms.parse_json]
type = "remap"
inputs = ["app_logs"]
source = '''
  # Parse JSON from log line
  . = parse_json!(.message)

  # Add environment tag
  .environment = get_env_var("ENVIRONMENT") ?? "production"

  # Redact sensitive fields (filters takes regexes or named filters)
  if exists(.email) {
    .email = redact(string!(.email), filters: [r'\S+@\S+'])
  }

  # Parse timestamp
  .timestamp = parse_timestamp!(.timestamp, format: "%Y-%m-%dT%H:%M:%S%.fZ")
'''

# Filter out health check noise
[transforms.filter_noise]
type = "filter"
inputs = ["parse_json"]
condition = '''
  !includes(["GET /health", "GET /ready", "GET /metrics"], .message) &&
  .level != "debug"
'''

# Sample high-volume logs (keep 1 in 10 events; errors and warnings bypass sampling)
[transforms.sample]
type = "sample"
inputs = ["filter_noise"]
rate = 10
exclude = '.level == "error" || .level == "warn"'

# Add derived fields
[transforms.enrich]
type = "remap"
inputs = ["sample"]
source = '''
  # Default the service name when the field is missing or not a string
  .service = string(.service) ?? "unknown"

  # Calculate log size for billing tracking
  .log_size_bytes = length(encode_json(.))

  # Normalize severity levels
  .severity = if .level == "fatal" || .level == "critical" {
    "error"
  } else if .level == "warning" {
    "warn"
  } else {
    .level
  }
'''

# Aggregate metrics (reduce cardinality)
[transforms.aggregate_metrics]
type = "aggregate"
inputs = ["host_metrics"]
interval_ms = 60000

# --- Sinks: Where data goes ---

# Send logs to Elasticsearch/OpenSearch
[sinks.elasticsearch]
type = "elasticsearch"
inputs = ["enrich"]
endpoints = ["https://es.example.com:9200"]
bulk.index = "logs-%Y-%m-%d"
auth.strategy = "basic"
auth.user = "${ES_USER}"
auth.password = "${ES_PASSWORD}"
compression = "gzip"
batch.max_bytes = 10485760
batch.timeout_secs = 5

# Send to S3 for long-term archive (cheap storage)
[sinks.s3_archive]
type = "aws_s3"
inputs = ["enrich"]
bucket = "logs-archive"
key_prefix = "logs/{{ service }}/year=%Y/month=%m/day=%d/"
compression = "gzip"
encoding.codec = "json"
batch.max_bytes = 104857600        # 100MB files for efficient S3 storage
batch.timeout_secs = 300

# Send metrics to Prometheus
[sinks.prometheus]
type = "prometheus_exporter"
inputs = ["aggregate_metrics"]
address = "0.0.0.0:9598"

# Route errors to Slack for immediate visibility
# Sinks have no condition option, so filter in a transform first
[transforms.errors_only]
type = "filter"
inputs = ["enrich"]
condition = '.level == "error" || .level == "fatal"'

[sinks.slack_errors]
type = "http"
inputs = ["errors_only"]
uri = "${SLACK_WEBHOOK_URL}"
method = "post"
encoding.codec = "json"
batch.max_events = 1
request.rate_limit_num = 5         # Max 5 Slack messages per second

VRL (Vector Remap Language)

# VRL is Vector's data transformation language — fast, safe, type-checked

# Parse and restructure a complex log line
. = parse_json!(.message)

# Coerce types first so the comparisons below are type-safe
.duration_ms = to_float(.duration_ms) ?? 0.0
.status_code = to_int(.status_code) ?? 0

# Geoip enrichment
.geo = get_enrichment_table_record("geoip", {"ip": .client_ip}) ?? {}
.country = string(.geo.country_code) ?? "unknown"
del(.geo)

# Route based on content
if starts_with(string(.message) ?? "", "AUDIT:") {
  .metadata.sink = "audit-logs"
} else if .status_code >= 500 {
  .metadata.sink = "error-logs"
} else {
  .metadata.sink = "general-logs"
}

# Flatten nested objects for better indexing
.user_email = del(.user.email)
.user_id = del(.user.id)
del(.user)
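
The `get_enrichment_table_record("geoip", ...)` call and the `.metadata.sink` field above both rely on configuration that the snippet does not show. A minimal sketch of what that could look like, assuming a MaxMind database at a hypothetical path and a hypothetical remap transform named `tag_sink` that runs the VRL above:

```toml
# Assumption: a MaxMind City database exists at this (hypothetical) path.
[enrichment_tables.geoip]
type = "geoip"
path = "/etc/vector/GeoLite2-City.mmdb"

# Assumption: "tag_sink" is a remap transform (not shown) that runs the VRL above.
[transforms.route_by_sink]
type = "route"
inputs = ["tag_sink"]
route.audit = '.metadata.sink == "audit-logs"'
route.errors = '.metadata.sink == "error-logs"'

# Each route becomes an output named <transform>.<route>.
# Events matching no route are available as route_by_sink._unmatched.
[sinks.audit_console]
type = "console"
inputs = ["route_by_sink.audit"]
encoding.codec = "json"
```

Routing on a field set in VRL keeps the classification logic in one place; the `route` transform only has to compare a string.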

Installation

# macOS
brew install vector

# Linux (script)
curl --proto '=https' --tlsv1.2 -sSfL https://sh.vector.dev | bash

# Docker
docker run -v "$(pwd)/vector.toml:/etc/vector/vector.toml:ro" timberio/vector:latest-alpine --config /etc/vector/vector.toml

# Helm (Kubernetes)
helm repo add vector https://helm.vector.dev
helm install vector vector/vector

# Validate config
vector validate vector.toml

# Run
vector --config vector.toml
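
`vector validate` checks that the configuration loads, but not that a transform produces the output you expect. One way to cover that is a unit test kept in the same file and run with `vector test vector.toml`. The sketch below assumes the `enrich` transform from the configuration above; the test name and input values are illustrative.

```toml
[[tests]]
name = "enrich normalizes warning to warn"

# Feed one synthetic log event directly into the transform under test.
[[tests.inputs]]
insert_at = "enrich"
type = "log"

[tests.inputs.log_fields]
level = "warning"
service = "api"

# Inspect what the transform emits.
[[tests.outputs]]
extract_from = "enrich"

[[tests.outputs.conditions]]
type = "vrl"
source = '''
  assert_eq!(.severity, "warn")
  assert_eq!(.service, "api")
'''
```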

Examples

Example 1: Setting up Vector for a microservices project

User request:

I have a Node.js API and a React frontend running in Docker. Set up Vector for monitoring/deployment.

The agent creates the necessary configuration files based on patterns like the vector.toml configuration above, sets up the integration with the existing Docker setup, configures appropriate defaults for a Node.js + React stack, and provides verification commands to confirm everything is working.
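
For a Docker-based stack like this one, a starting point might be the `docker_logs` source plus a `console` sink to confirm that events arrive. This is a sketch only: the container names `api` and `frontend` are hypothetical, and Vector needs access to the Docker socket (for example by mounting `/var/run/docker.sock` into the Vector container).

```toml
# Collect stdout/stderr from the application containers.
# Assumption: the containers are named "api" and "frontend".
[sources.docker]
type = "docker_logs"
include_containers = ["api", "frontend"]

# Merge JSON log lines into the event; leave non-JSON lines untouched.
[transforms.parse_api]
type = "remap"
inputs = ["docker"]
source = '''
  parsed, err = parse_json(.message)
  if err == null && is_object(parsed) {
    . = merge(., object!(parsed))
  }
'''

# Print events to Vector's stdout to verify the pipeline end to end.
[sinks.debug_console]
type = "console"
inputs = ["parse_api"]
encoding.codec = "json"
```

Once events show up on the console, the `console` sink can be swapped for the Elasticsearch or S3 sinks shown earlier.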

Example 2: Troubleshooting VRL issues

User request:

Vector is showing errors in our VRL. Here are the logs: [error output]

The agent analyzes the error output, identifies the root cause by cross-referencing with common Vector issues, applies the fix (updating configuration, adjusting resource limits, or correcting syntax), and verifies the resolution with appropriate health checks.
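
When the errors come from a `remap` transform at runtime (for example `parse_json!` hitting a line that is not JSON), it can help to reroute the failing events and look at them rather than guess. A sketch using the remap options `drop_on_error` and `reroute_dropped`; the transform and sink names are hypothetical, and `app_logs` is the file source from the configuration above.

```toml
[transforms.parse_strict]
type = "remap"
inputs = ["app_logs"]
drop_on_error = true      # do not forward events whose VRL program failed
reroute_dropped = true    # send them to the "<id>.dropped" output instead
source = '''
  . = parse_json!(.message)
'''

# Print the failing events so the offending input is visible.
[sinks.inspect_dropped]
type = "console"
inputs = ["parse_strict.dropped"]
encoding.codec = "json"
```

Compile-time VRL errors are a different case: they are reported by `vector validate` before the pipeline starts.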

Guidelines

Replace Logstash/Fluentd — Vector uses 10x less memory than Logstash; deploy as a drop-in replacement
Filter before sending — Remove debug logs, health checks, and noise in Vector; don't pay to store data you'll never query
Sample high-volume logs — Keep 100% of errors, sample info logs at 10-20%; reduce storage costs without losing signal
S3 for archives — Route all logs to S3 (compressed) for cheap long-term storage; route only recent/important logs to Elasticsearch
VRL over regex — VRL is compiled and type-checked; it's 5-10x faster than Logstash's Ruby filters
One Vector per host — Run Vector as an agent on each host (DaemonSet in K8s); it handles collection, transformation, and shipping
Disk buffers for reliability — Enable disk-based buffers to prevent data loss during destination outages
Test transforms — Use vector vrl REPL and vector test to validate transforms before deploying
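
The disk-buffer guideline can be sketched on the Elasticsearch sink from the configuration above. The size here is an assumption to tune for your disk; Vector enforces a minimum disk buffer size (roughly 256 MiB in the versions I am aware of), so check the buffer documentation for your release.

```toml
# Persist events to disk while the destination is unavailable.
[sinks.elasticsearch.buffer]
type = "disk"
max_size = 1073741824   # 1 GiB, illustrative value
when_full = "block"     # apply backpressure instead of dropping events
```

`when_full = "block"` favors durability over throughput; `"drop_newest"` is the alternative when upstream sources must never stall.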