machine-learning

Machine learning development patterns, model training, evaluation, and deployment. Use when building ML pipelines, training models, feature engineering, model evaluation, or deploying ML systems to production.


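What this Skill does

The description on this page explains what this Skill does for you. Claude invokes it automatically when you ask for help in this area.

Installation (3 steps)

1. Download the .skill file.
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically).
3. Place the extracted folder in .claude/skills/ under your home folder:
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code to finish. You do not need to say "use this Skill..."; it is invoked automatically for relevant requests.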

Last updated: 2026-05-17
Retrieved: 2026-05-17
Bundled files: 1

Machine Learning

Comprehensive machine learning skill covering the full ML lifecycle from experimentation to production deployment.

When to Use This Skill

Building machine learning pipelines
Feature engineering and data preprocessing
Model training, evaluation, and selection
Hyperparameter tuning and optimization
Model deployment and serving
ML experiment tracking and versioning
Production ML monitoring and maintenance

ML Development Lifecycle

1. Problem Definition

Problem Types:

Binary classification (spam/not spam)
Multi-class classification (image categories)
Multi-label classification (document tags)
Regression (price prediction)
Clustering (customer segmentation)
Ranking (search results)
Anomaly detection (fraud detection)

Success Metrics by Problem Type:

Problem Type	Primary Metrics	Secondary Metrics
Binary Classification	AUC-ROC, F1	Precision, Recall, PR-AUC
Multi-class	Macro F1, Accuracy	Per-class metrics
Regression	RMSE, MAE	R², MAPE
Ranking	NDCG, MAP	MRR
Clustering	Silhouette, Calinski-Harabasz	Davies-Bouldin
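As a rough illustration of the binary classification row in the table, the sketch below computes precision, recall, and F1 from hard 0/1 predictions in plain Python. The function name `precision_recall_f1` and the sample labels are invented for this example; a real pipeline would normally rely on a library implementation.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative labels only (not real data): 3 TP, 1 FP, 1 FN.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

F1 is the harmonic mean of precision and recall, so it stays low whenever either one is low.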

2. Data Preparation

Data Quality Checks:

Missing value analysis and imputation strategies
Outlier detection and handling
Data type validation
Distribution analysis
Target leakage detection

Feature Engineering Patterns:

Numerical: scaling, binning, log transforms, polynomial features
Categorical: one-hot, target encoding, frequency encoding, embeddings
Temporal: lag features, rolling statistics, cyclical encoding
Text: TF-IDF, word embeddings, transformer embeddings
Geospatial: distance features, clustering, grid encoding
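Two of the patterns above, cyclical encoding and log transforms, can be sketched in a few lines. This is a minimal example under the assumption that the feature is an hour of day (period 24) and a non-negative numeric value; the helper names are hypothetical.

```python
import math


def cyclical_encode(value, period):
    """Map a periodic value (hour, weekday, month) to a point on the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def log_transform(x):
    """Compress a right-skewed, non-negative value; log1p keeps 0 mapped to 0."""
    return math.log1p(x)


hour_0 = cyclical_encode(0, 24)
hour_12 = cyclical_encode(12, 24)
hour_23 = cyclical_encode(23, 24)

# 23:00 ends up close to 00:00, which a raw integer hour would not capture.
print(math.dist(hour_23, hour_0), math.dist(hour_0, hour_12))
print(log_transform(0), log_transform(999))
```

The sine/cosine pair is used because a single sine value would map two different hours to the same number.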

Train/Test Split Strategies:

Random split (standard)
Stratified split (imbalanced classes)
Time-based split (temporal data)
Group split (prevent data leakage)
K-fold cross-validation
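The group and time-based strategies above can be sketched with index lists alone. The example assumes each row carries a group key (for instance a user ID) or a sortable timestamp; `group_split` and `time_split` are hypothetical helper names, not library functions.

```python
import random


def group_split(groups, test_fraction=0.2, seed=0):
    """Split row indices so that every group lands entirely in train or in test."""
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    n_test = max(1, round(len(unique) * test_fraction))
    test_groups = set(unique[:n_test])
    train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(groups) if g in test_groups]
    return train_idx, test_idx


def time_split(timestamps, cutoff):
    """Train on rows strictly before the cutoff, test on rows at or after it."""
    train_idx = [i for i, ts in enumerate(timestamps) if ts < cutoff]
    test_idx = [i for i, ts in enumerate(timestamps) if ts >= cutoff]
    return train_idx, test_idx


# Illustrative group keys: several rows belong to the same user.
groups = ["u1", "u1", "u2", "u3", "u3", "u3", "u4", "u5"]
train_idx, test_idx = group_split(groups)
print(train_idx, test_idx)
print(time_split([1, 2, 3, 4, 5], cutoff=4))
```

Splitting by group rather than by row prevents the model from being tested on entities it already saw during training.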

3. Model Selection

Algorithm Selection Guide:

Data Size / Type	Problem	Recommended Models
Small (<10K)	Classification	Logistic Regression, SVM, Random Forest
Small (<10K)	Regression	Linear Regression, Ridge, SVR
Medium (10K-1M)	Classification	XGBoost, LightGBM, Neural Networks
Medium (10K-1M)	Regression	XGBoost, LightGBM, Neural Networks
Large (>1M)	Any	Deep Learning, Distributed training
Tabular	Any	Gradient Boosting (XGBoost, LightGBM, CatBoost)
Images	Classification	CNN, ResNet, EfficientNet, Vision Transformers
Text	NLP	Transformers (BERT, RoBERTa, GPT)
Sequential	Time Series	LSTM, Transformer, Prophet

4. Model Training

Hyperparameter Tuning:

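Grid Search: exhaustive, good for small search spaces
Random Search: efficient, good for large search spaces
Bayesian Optimization: uses results from earlier trials to choose the next candidates (Optuna, Hyperopt)
Early stopping: stop training when the validation metric stops improving, to limit overfitting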
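To make the random search idea concrete, here is a minimal standard-library sketch. The `toy_objective` function is a hypothetical stand-in for "train a model and return its validation score", and the search space bounds are arbitrary. Parameters are sampled uniformly for simplicity, although learning rates are often sampled on a log scale in practice.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample parameters uniformly from `space` and keep the best-scoring set."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    # Hypothetical validation score that peaks at learning_rate=0.1, subsample=0.8.
    return -((params["learning_rate"] - 0.1) ** 2) - (params["subsample"] - 0.8) ** 2


space = {"learning_rate": (0.01, 0.3), "subsample": (0.5, 1.0)}
best_params, best_score = random_search(toy_objective, space)
print(best_params, best_score)
```

A grid search would instead enumerate a fixed set of values per parameter, which grows multiplicatively with each added parameter.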

Common Hyperparameters:

Model	Key Parameters
XGBoost	learning_rate, max_depth, n_estimators, subsample
LightGBM	num_leaves, learning_rate, n_estimators, feature_fraction
Random Forest	n_estimators, max_depth, min_samples_split
Neural Networks	learning_rate, batch_size, layers, dropout

5. Model Evaluation

Evaluation Best Practices:

Always use a held-out test set for final evaluation
Use cross-validation during development
Check for overfitting (train vs validation gap)
Evaluate on multiple metrics
Analyze errors qualitatively

Handling Imbalanced Data:

Resampling: SMOTE, undersampling
Class weights: weighted loss functions
Threshold tuning: optimize decision threshold
Evaluation: prefer PR-AUC over ROC-AUC when the positive class is rare
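Threshold tuning can be sketched as a search over candidate cut-offs for the one that maximizes F1. The scores and labels below are invented for illustration, and the helper names are hypothetical; the threshold should be chosen on validation data, not on the final test set.

```python
def f1_at_threshold(y_true, scores, threshold):
    """F1 of the positive class when predicting 1 for scores >= threshold."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def best_threshold(y_true, scores):
    """Try every distinct score as a threshold and return the one with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda th: f1_at_threshold(y_true, scores, th))


# Illustrative imbalanced validation set: 3 positives out of 10 rows.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.1, 0.15, 0.2, 0.3, 0.35, 0.4, 0.45, 0.6, 0.9]
threshold = best_threshold(y_true, scores)
print(threshold, f1_at_threshold(y_true, scores, threshold))
```

In this toy data the default cut-off of 0.5 misses one positive, while a lower threshold recovers it at the cost of one false positive.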

6. Production Deployment

Model Serving Patterns:

REST API (Flask, FastAPI, TF Serving)
Batch inference (scheduled jobs)
Streaming (real-time predictions)
Edge deployment (mobile, IoT)

Production Considerations:

Latency requirements (p50, p95, p99)
Throughput (requests per second)
Model size and memory footprint
Fallback strategies
A/B testing framework
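Latency requirements are usually stated as percentiles rather than averages. The sketch below uses the nearest-rank definition, which is only one of several percentile conventions; the latency samples are made up for illustration.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds, with one slow outlier.
latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 13, 14]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(p50, p95, p99)
```

The median hides the slow request entirely, which is why tail percentiles such as p95 and p99 are tracked alongside p50.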

7. Monitoring & Maintenance

What to Monitor:

Prediction latency
Input feature distributions (data drift)
Prediction distributions (prediction drift, a possible sign of concept drift)
Model performance metrics
Error rates and types
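One common way to quantify a shift in an input feature distribution is the Population Stability Index (PSI). The source does not prescribe a drift metric, so this is only one possible choice; the binning scheme, the `eps` floor, and the sample data are assumptions of this sketch.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a baseline sample and a recent sample."""
    low, high = min(expected), max(expected)
    width = (high - low) / n_bins or 1.0  # guard against a constant baseline

    def fractions(values):
        counts = [0] * n_bins
        for v in values:
            index = int((v - low) / width)
            counts[min(max(index, 0), n_bins - 1)] += 1  # clamp out-of-range values
        # Floor each fraction at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Illustrative data: a uniform baseline and the same values shifted upward.
baseline = [i / 100 for i in range(100)]
shifted = [v + 0.3 for v in baseline]
print(psi(baseline, baseline), psi(baseline, shifted))
```

Identical samples give a PSI of zero, and the value grows as the two histograms diverge. What counts as significant drift depends on the application.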

Retraining Triggers:

Performance drops below a defined threshold
Significant data drift detected
Scheduled retraining (daily, weekly)
New training data available
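The four triggers above can be combined into a single check. Everything specific in this sketch is an assumption made for illustration: the `ModelStatus` fields, the default thresholds, and the weekly schedule are hypothetical and would need to be set per project.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ModelStatus:
    current_metric: float    # latest monitored performance metric (higher is better)
    drift_score: float       # output of whatever drift metric is in use
    last_trained: datetime
    new_rows: int            # labeled rows collected since the last training run


def retraining_reasons(status, now, min_metric=0.80, max_drift=0.25,
                       max_age=timedelta(days=7), min_new_rows=10_000):
    """Return the list of triggers that currently fire (empty list means no retraining)."""
    reasons = []
    if status.current_metric < min_metric:
        reasons.append("performance")
    if status.drift_score > max_drift:
        reasons.append("drift")
    if now - status.last_trained >= max_age:
        reasons.append("schedule")
    if status.new_rows >= min_new_rows:
        reasons.append("new_data")
    return reasons


now = datetime(2026, 1, 15)
degraded = ModelStatus(0.78, 0.10, datetime(2026, 1, 10), 500)
stale = ModelStatus(0.90, 0.30, datetime(2026, 1, 1), 20_000)
print(retraining_reasons(degraded, now), retraining_reasons(stale, now))
```

Returning the reasons rather than a bare boolean makes it easier to log why a retraining run was started.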

MLOps Best Practices

Experiment Tracking

Track for every experiment:

Code version (git commit)
Data version (hash or version ID)
Hyperparameters
Metrics (train, validation, test)
Model artifacts
Environment (packages, versions)
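A minimal sketch of such a record, using only the standard library, might look like the following. The field names and helper functions are assumptions of this example rather than the schema of any tracking tool. The git lookup returns None when the code is not running inside a repository, and model artifacts are left out for brevity.

```python
import hashlib
import json
import platform
import subprocess


def data_fingerprint(raw_bytes):
    """Content hash that changes whenever the training data changes."""
    return hashlib.sha256(raw_bytes).hexdigest()


def current_git_commit():
    """Return the current commit hash, or None if git or a repository is unavailable."""
    try:
        result = subprocess.run(["git", "rev-parse", "HEAD"],
                                capture_output=True, text=True, check=True)
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None


def build_experiment_record(data_bytes, hyperparameters, metrics):
    return {
        "code_version": current_git_commit(),
        "data_version": data_fingerprint(data_bytes),
        "hyperparameters": hyperparameters,
        "metrics": metrics,
        "environment": {"python": platform.python_version()},
    }


# Illustrative values only.
record = build_experiment_record(
    b"feature_a,label\n1,0\n2,1\n",
    {"learning_rate": 0.1, "max_depth": 6},
    {"train": {"f1": 0.91}, "validation": {"f1": 0.84}},
)
print(json.dumps(record, indent=2))
```

A dedicated tracking tool would store the same kind of information; the point here is only which fields make a run reproducible.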

Model Versioning

models/
├── model_v1.0.0/
│   ├── model.pkl
│   ├── metadata.json
│   ├── requirements.txt
│   └── metrics.json
├── model_v1.1.0/
└── model_v2.0.0/
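The layout above could be produced by a small helper like the one below. This is a sketch under the assumption that the model has already been serialized to bytes; `save_model_version` is a hypothetical name, and the metadata and package values are placeholders.

```python
import json
import tempfile
from pathlib import Path


def save_model_version(root, version, model_bytes, metadata, metrics, requirements):
    """Write one immutable model_v<version>/ directory; refuse to overwrite an existing one."""
    target = Path(root) / f"model_v{version}"
    target.mkdir(parents=True, exist_ok=False)
    (target / "model.pkl").write_bytes(model_bytes)
    (target / "metadata.json").write_text(json.dumps(metadata, indent=2))
    (target / "metrics.json").write_text(json.dumps(metrics, indent=2))
    (target / "requirements.txt").write_text("\n".join(requirements) + "\n")
    return target


# Usage with a throwaway directory and placeholder contents.
with tempfile.TemporaryDirectory() as root:
    path = save_model_version(
        root, "1.0.0", b"serialized-model-placeholder",
        {"algorithm": "example"}, {"f1": 0.75}, ["example-package==1.0"],
    )
    saved_files = sorted(p.name for p in path.iterdir())
    saved_metrics = json.loads((path / "metrics.json").read_text())
    print(path.name, saved_files)
```

Failing on an existing directory keeps each version immutable, so a published version number always refers to the same artifacts.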

CI/CD for ML

Continuous Integration:
- Data validation tests
- Model training tests
- Performance regression tests
Continuous Deployment:
- Staging environment validation
- Shadow mode testing
- Gradual rollout (canary)
- Automatic rollback
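A performance regression test in CI can be as simple as comparing a candidate model's metrics against the current baseline. The sketch below assumes every metric is "higher is better" and uses an arbitrary tolerance of 0.01; the function name and the metric values are hypothetical.

```python
def regression_failures(baseline, candidate, tolerance=0.01):
    """Return the names of metrics that are missing or worse than baseline by more than tolerance."""
    failures = []
    for name, base_value in baseline.items():
        new_value = candidate.get(name)
        if new_value is None or new_value < base_value - tolerance:
            failures.append(name)
    return failures


# Illustrative metrics: F1 regresses beyond the tolerance, AUC does not.
baseline_metrics = {"f1": 0.84, "auc": 0.91}
candidate_metrics = {"f1": 0.80, "auc": 0.905}
failed = regression_failures(baseline_metrics, candidate_metrics)
print(failed)
```

A CI job would fail the build when the returned list is not empty, which keeps a weaker model from reaching the deployment stages.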

Reference Files

For detailed patterns and code examples, load reference files as needed:

references/preprocessing.md - Data preprocessing patterns and feature engineering techniques
references/model_patterns.md - Model architecture patterns and implementation examples
references/evaluation.md - Comprehensive evaluation strategies and metrics

Integration with Other Skills

performance - For optimizing inference latency
testing - For ML-specific testing patterns
database-optimization - For feature store queries
debugging - For model debugging and error analysis