data-masking
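A Skill that masks, redacts, and anonymizes personal and confidential data in databases, logs, and APIs, depending on the goal: protecting data in development environments, removing it from logs, anonymizing it for analytics, or processing it for GDPR compliance.
📜 Original English description (reference)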
Mask, redact, and anonymize sensitive data (PII, PCI, PHI) in databases, logs, and APIs. Use when protecting PII in dev/staging environments, redacting sensitive data from logs, anonymizing data for analytics, or applying k-anonymity and differential privacy for GDPR-compliant data sharing.
Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads, extracts, and installs the Skill in one step.
Mac/Linux:
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o data-masking.zip https://jpskill.com/download/14816.zip && unzip -o data-masking.zip && rm data-masking.zip
Windows (PowerShell):
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/14816.zip -OutFile "$d\data-masking.zip"; Expand-Archive "$d\data-masking.zip" -DestinationPath $d -Force; ri "$d\data-masking.zip"
When it finishes, restart Claude Code. The Skill then triggers automatically whenever you make a related request in plain language.
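💾 Manual download (if the command is difficult)
- 1. Download data-masking.zip from https://jpskill.com/download/14816.zip
- 2. Double-click the ZIP file to extract it; this creates a data-masking folder
- 3. Move that folder to C:\Users\YourName\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
- 4. Restart Claude Code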
📦 Installation (3 steps)
- 1. Press the "Download" button to get the .skill file
- 2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
- 3. Place the extracted folder in .claude/skills/ under your home folder
  - · macOS / Linux: ~/.claude/skills/
  - · Windows: %USERPROFILE%\.claude\skills\
Restart Claude Code to finish. You do not need to say "use this Skill"; it is invoked automatically for related requests.
- Last updated: 2026-05-18
- Retrieved: 2026-05-18
- Bundled files: 1
Data Masking
Overview
Data masking replaces real sensitive data with realistic but fake data, preserving format and structure. Essential for:
- Dev/staging environments: Use masked production data without exposing real PII
- Log sanitization: Prevent PII from appearing in log aggregation systems
- Analytics: Analyze behavioral patterns without raw PII
- Testing: Realistic test data that won't trigger real consequences
Masking Techniques
| Technique | How | When to Use |
|---|---|---|
| Static masking | Replace data at rest permanently | Dev DB copy |
| Dynamic masking | Mask on-read, original preserved | Role-based views |
| Tokenization | Replace with token that maps to real value | Payment cards |
| Format-preserving | Keep format, change values (e.g., real-looking SSN) | Testing |
| Redaction | Replace with placeholder ([REDACTED]) | Logs |
| Generalization | Replace specific value with range (age 34 → 30-40) | Analytics |
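The techniques in the table differ mainly in whether the original value can be recovered. A minimal Python sketch of three of them follows. It assumes an in-memory dictionary stands in for the token vault (a real deployment would use a hardened, access-controlled store); `redact`, `generalize_age`, and `TokenVault` are hypothetical names introduced for illustration.

```python
import secrets


def redact(_value: str) -> str:
    """Redaction: the value is discarded and cannot be recovered."""
    return "[REDACTED]"


def generalize_age(age: int, width: int = 10) -> str:
    """Generalization: replace an exact age with a fixed-width range."""
    low = (age // width) * width
    return f"{low}-{low + width}"


class TokenVault:
    """Tokenization: swap a value for a random token and keep the mapping.

    Assumption: an in-memory dict stands in for a secured vault service.
    """

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so one value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Only callers with vault access can reverse a token."""
        return self._token_to_value[token]
```

Redaction and generalization lose information by design, while tokenization stays reversible for whoever controls the vault, which is why the table pairs it with payment cards.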
PII Pattern Library
import re

# These regexes match on shape only, so expect false positives
# (for example, any 5-digit number matches zip_code).
PII_PATTERNS = {
    "email": r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b',
    "phone_us": r'\b(?:\+1[-.]?)?\(?[0-9]{3}\)?[-.\s]?[0-9]{3}[-.\s]?[0-9]{4}\b',
    "ssn": r'\b(?!000|666|9\d{2})\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b',
    "credit_card": r'\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9]{2})[0-9]{12})\b',
    "ip_address": r'\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b',
    # MM/DD/YYYY or MM-DD-YYYY
    "date_of_birth": r'\b(?:0[1-9]|1[0-2])[\/\-](?:0[1-9]|[12]\d|3[01])[\/\-](?:19|20)\d{2}\b',
    "passport": r'\b[A-Z]{1,2}[0-9]{6,9}\b',
    "zip_code": r'\b\d{5}(?:-\d{4})?\b',
}
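Patterns like these match on shape alone, so a card-number regex also matches a 16-digit order ID. One common way to reduce false positives is to confirm card candidates with the Luhn checksum. The sketch below assumes a reduced copy of the patterns above so that it runs on its own; `luhn_valid` and `find_pii` are hypothetical helper names.

```python
import re

# Reduced copy of the pattern library so the example is self-contained.
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "ssn": re.compile(r"\b(?!000|666|9\d{2})\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b"),
    "credit_card": re.compile(
        r"\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13})\b"
    ),
}


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (kind, match) pairs found in text, grouped in pattern order."""
    found = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.findall(text):
            if kind == "credit_card" and not luhn_valid(match):
                continue  # right shape, wrong checksum: likely not a card
            found.append((kind, match))
    return found
```

The checksum only filters card numbers; the other patterns still need context (field names, surrounding words) to separate real PII from look-alikes.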
Email and Credit Card Maskers
import re

from faker import Faker

fake = Faker()

def mask_email(email: str) -> str:
    """Mask the local part, keeping its first and last character and the domain."""
    # rsplit tolerates an '@' inside the local part
    local, domain = email.rsplit('@', 1)
    if len(local) > 2:
        masked_local = local[0] + '*' * (len(local) - 2) + local[-1]
    else:
        masked_local = '***'
    return f"{masked_local}@{domain}"

def mask_email_fake(email: str) -> str:
    """Replace email with a realistic fake (a new random value on every call)."""
    return fake.email()

def mask_credit_card(card_number: str) -> str:
    """Mask credit card: show only the last 4 digits."""
    cleaned = re.sub(r'[\s-]', '', card_number)
    return '*' * (len(cleaned) - 4) + cleaned[-4:]

def mask_ssn(ssn: str) -> str:
    """Mask SSN: show only the last 4 digits."""
    cleaned = ssn.replace('-', '').replace(' ', '')
    return f"***-**-{cleaned[-4:]}"

def mask_phone(phone: str) -> str:
    """Mask phone: show only the last 4 digits."""
    digits = re.sub(r'\D', '', phone)
    return f"***-***-{digits[-4:]}"

def generate_fake_pii() -> dict:
    """Generate a complete set of realistic fake PII for testing."""
    return {
        "name": fake.name(),
        "email": fake.email(),
        "phone": fake.phone_number(),
        "address": fake.address(),
        "ssn": fake.ssn(),
        "dob": fake.date_of_birth(minimum_age=18, maximum_age=90).isoformat(),
        "credit_card": fake.credit_card_number(card_type='visa'),
        "company": fake.company(),
    }
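`mask_email_fake` returns a different address on every call, so the same customer masked in two tables no longer joins. Where referential integrity matters, one option is keyed hashing, sketched below using only the standard library. The key value and the name `pseudonymize_email` are hypothetical; `example.com` is used because it is reserved for documentation.

```python
import hashlib
import hmac

# Assumption: in practice the key comes from a secrets manager, not source code.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymize_email(email: str, key: bytes = SECRET_KEY) -> str:
    """Map an email to a stable fake address using HMAC-SHA256."""
    normalized = email.strip().lower().encode("utf-8")
    digest = hmac.new(key, normalized, hashlib.sha256).hexdigest()
    return f"user_{digest[:12]}@example.com"
```

The same input and key always yield the same output, so joins survive masking. Anyone holding the key can recompute the mapping, so this is pseudonymization rather than anonymization.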
Log Sanitizer Middleware
// Log scrubbing for an Express.js service, applied through a Winston format
const winston = require('winston');

const PII_PATTERNS = {
  email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g,
  creditCard: /\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13})\b/g,
  ssn: /\b(?!000|666|9\d{2})\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b/g,
  phone: /\b(?:\+1[-.]?)?\(?[0-9]{3}\)?[-.\s]?[0-9]{3}[-.\s]?[0-9]{4}\b/g,
  password: /"password"\s*:\s*"[^"]*"/gi,
  token: /"(?:token|api_key|secret|authorization)"\s*:\s*"[^"]*"/gi,
};

function sanitizeLog(data) {
  // JSON.stringify(undefined) returns undefined, so fall back to an empty string
  let sanitized = typeof data === 'string' ? data : JSON.stringify(data) ?? '';
  sanitized = sanitized.replace(PII_PATTERNS.email, '[EMAIL]');
  sanitized = sanitized.replace(PII_PATTERNS.creditCard, '[CREDIT_CARD]');
  sanitized = sanitized.replace(PII_PATTERNS.ssn, '[SSN]');
  sanitized = sanitized.replace(PII_PATTERNS.phone, '[PHONE]');
  sanitized = sanitized.replace(PII_PATTERNS.password, '"password":"[REDACTED]"');
  sanitized = sanitized.replace(PII_PATTERNS.token, (match) => {
    const key = match.split(':')[0]; // keep the quoted key, drop the value
    return `${key}:"[REDACTED]"`;
  });
  return sanitized;
}

// Wrap Winston logger to auto-sanitize
const logger = winston.createLogger({
  transports: [new winston.transports.Console()],
  format: winston.format.printf(({ level, message, ...meta }) => {
    const metaText = sanitizeLog(JSON.stringify(meta));
    let safeMeta;
    try {
      safeMeta = JSON.parse(metaText);
    } catch (err) {
      // A placeholder that replaced an unquoted number is not valid JSON;
      // log the sanitized text as a string instead of throwing.
      safeMeta = { meta: metaText };
    }
    return JSON.stringify({ level, message: sanitizeLog(message), ...safeMeta });
  }),
});
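The same scrubbing idea carries over to Python services. A minimal sketch using the standard `logging` module follows, assuming only two simplified patterns for brevity. `scrub` and `RedactingFilter` are hypothetical names, and this is an analogue of the Winston setup above rather than a part of it.

```python
import logging
import re

EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scrub(text: str) -> str:
    """Replace emails and SSN-shaped strings with placeholders."""
    return SSN.sub("[SSN]", EMAIL.sub("[EMAIL]", text))


class RedactingFilter(logging.Filter):
    """Scrub the fully formatted message before any handler emits it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Merge args into the message first so PII passed as arguments is caught.
        record.msg = scrub(record.getMessage())
        record.args = None  # args are already merged into msg
        return True


logger = logging.getLogger("app")
logger.addFilter(RedactingFilter())
```

A filter attached to a logger only sees records created on that logger; attaching it to the handlers instead also covers records propagated from child loggers.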
Database Masking (PostgreSQL)
-- Create masked view for dev access
CREATE OR REPLACE VIEW users_masked AS
SELECT
  id,
  -- Mask name: keep first letter + ***
  LEFT(first_name, 1) || '***' AS first_name,
  LEFT(last_name, 1) || '***' AS last_name,
  -- Mask email: preserve domain
  REGEXP_REPLACE(email, '^([^@])([^@]*)(@.+)$', '\1***\3') AS email,
  -- Mask phone: show only last 4
  '***-***-' || RIGHT(phone, 4) AS phone,
  -- Mask SSN: show only last 4
  '***-**-' || RIGHT(ssn, 4) AS ssn,
  -- Keep non-sensitive fields as-is
  created_at,
  status,
  country
FROM users;

-- Grant dev team access to masked view only (not base table)
GRANT SELECT ON users_masked TO dev_team;
REVOKE SELECT ON users FROM dev_team;

-- Column-level masking function: keep the first 6 and last 4 digits of a card number.
-- This is plain string masking; it does not use pgcrypto or format-preserving encryption.
CREATE OR REPLACE FUNCTION mask_pan(pan TEXT) RETURNS TEXT AS $$
BEGIN
  RETURN RPAD(LEFT(pan, 6), LENGTH(pan) - 4, '*') || RIGHT(pan, 4);
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- Dynamic masking based on the calling user's role.
-- Inside a SECURITY DEFINER function, current_user is the function owner,
-- so the caller is identified with session_user instead.
CREATE OR REPLACE FUNCTION get_user_data(p_user_id UUID)
RETURNS TABLE (name TEXT, email TEXT, phone TEXT) AS $$
BEGIN
  IF pg_has_role(session_user, 'admin_role', 'MEMBER') THEN
    RETURN QUERY SELECT u.name, u.email, u.phone FROM users u WHERE u.id = p_user_id;
  ELSE
    RETURN QUERY SELECT
      LEFT(u.name, 1) || '***',
      REGEXP_REPLACE(u.email, '^([^@])([^@]*)(@.+)$', '\1***\3'),
      '***-***-' || RIGHT(u.phone, 4)
    FROM users u WHERE u.id = p_user_id;
  END IF;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

-- Example usage:
SELECT * FROM get_user_data('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11');
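The role check in `get_user_data` can also be applied at the API layer, before a response leaves the service. The sketch below assumes a user record shaped like that function's result; `mask_user` and the `caller_is_admin` flag are hypothetical, and how the caller's role is established is left to the surrounding application.

```python
import re


def mask_user(user: dict, caller_is_admin: bool) -> dict:
    """Return the record unchanged for admins, masked for everyone else."""
    if caller_is_admin:
        return dict(user)
    return {
        # Keep the first letter of the name
        "name": user["name"][:1] + "***",
        # Keep the first character of the local part and the whole domain
        "email": re.sub(r"^([^@])[^@]*(@.+)$", r"\1***\2", user["email"]),
        # Show only the last 4 characters of the phone number
        "phone": "***-***-" + user["phone"][-4:],
    }
```

Masking in one function like this keeps the rules in a single place, so the database view and the API do not drift apart unnoticed.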
Microsoft Presidio — Auto-Detection
# Presidio automatically detects and masks PII using NLP
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import OperatorConfig

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

def mask_text_presidio(text: str, masking_style: str = "replace") -> str:
    """Auto-detect and mask PII using Presidio NLP."""
    results = analyzer.analyze(text=text, language="en")
    if masking_style == "replace":
        # Replace with a type label such as [EMAIL]; unlisted entities use DEFAULT
        operators = {
            "DEFAULT": OperatorConfig("replace", {"new_value": "[REDACTED]"}),
            "EMAIL_ADDRESS": OperatorConfig("replace", {"new_value": "[EMAIL]"}),
            "PHONE_NUMBER": OperatorConfig("replace", {"new_value": "[PHONE]"}),
            "PERSON": OperatorConfig("replace", {"new_value": "[NAME]"}),
            "US_SSN": OperatorConfig("replace", {"new_value": "[SSN]"}),
        }
    elif masking_style == "hash":
        # Hash for consistent pseudonymization (same input → same output)
        operators = {"DEFAULT": OperatorConfig("hash", {"hash_type": "sha256"})}
    else:
        raise ValueError(f"Unsupported masking_style: {masking_style!r}")
    anonymized = anonymizer.anonymize(
        text=text,
        analyzer_results=results,
        operators=operators
    )
    return anonymized.text

# Example
text = "Contact John Smith at john.smith@email.com or 555-123-4567"
print(mask_text_presidio(text))
# → e.g. "Contact [NAME] at [EMAIL] or [PHONE]" (detected entities depend on the NLP model)
Production DB → Dev DB Pipeline
#!/bin/bash
# mask-db-for-dev.sh: safe production → dev data pipeline
set -euo pipefail

PROD_DB="postgresql://prod-server/app"
DEV_DB="postgresql://dev-server/app_dev"

echo "Dumping production schema..."
pg_dump --schema-only "$PROD_DB" > schema.sql

echo "Applying schema to dev..."
psql "$DEV_DB" < schema.sql

echo "Copying and masking data..."
# SQL COPY is used rather than psql's \copy, which must fit on a single line.
# Data is loaded into the users base table (users_masked is a view) with an
# explicit column list that matches the SELECT.
psql "$PROD_DB" -c "COPY (
  SELECT
    id,
    LEFT(first_name, 1) || 'XXXX' AS first_name,
    'User' AS last_name,
    'user_' || id || '@example.com' AS email,
    -- LPAD truncates past 9999 rows, so phone values repeat on larger tables
    '555-000-' || LPAD((ROW_NUMBER() OVER())::TEXT, 4, '0') AS phone,
    created_at,
    status
  FROM users
) TO STDOUT WITH CSV" | psql "$DEV_DB" -c "COPY users (id, first_name, last_name, email, phone, created_at, status) FROM STDIN WITH CSV"

echo "Done. Dev database ready with masked data."
Statistical Anonymization (GDPR)
Anonymization vs Pseudonymization (GDPR Article 4(5) and Recital 26):
- Anonymization: Irreversible -- data can never be linked to an individual. Falls outside GDPR scope.
- Pseudonymization: Reversible -- data can be re-linked with additional info. Still personal data under GDPR.
Key techniques for true anonymization:
- k-Anonymity: Each record is indistinguishable from at least k-1 others on quasi-identifiers (age, ZIP, gender). Generalize values into ranges and suppress groups smaller than k.
- l-Diversity: Each equivalence class has at least l distinct sensitive attribute values, preventing attribute disclosure.
- Differential Privacy: Mathematical privacy guarantee controlled by epsilon -- add calibrated noise to query results. Use diffprivlib (Python) or Google DP libraries.
k-anonymity alone is often insufficient for GDPR -- combine with l-diversity and/or differential privacy.
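The three techniques can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not a production implementation: the record fields (`age`, `zip`, `diagnosis`), the 10-year age bands, the 3-digit ZIP prefix, and all function names are hypothetical choices, and the Laplace noise is sampled with plain floating-point arithmetic.

```python
import random
from collections import defaultdict


def generalize(record: dict) -> tuple:
    """Quasi-identifier key: 10-year age band plus 3-digit ZIP prefix."""
    low = (record["age"] // 10) * 10
    return (f"{low}-{low + 10}", record["zip"][:3] + "**")


def k_anonymize(records: list[dict], k: int) -> list[dict]:
    """Generalize quasi-identifiers and suppress groups smaller than k."""
    groups = defaultdict(list)
    for record in records:
        groups[generalize(record)].append(record)
    released = []
    for (age_band, zip_prefix), members in groups.items():
        if len(members) < k:
            continue  # suppression: the group is too small to hide in
        for member in members:
            released.append(
                {"age": age_band, "zip": zip_prefix, "diagnosis": member["diagnosis"]}
            )
    return released


def is_l_diverse(released: list[dict], l_value: int) -> bool:
    """Check that every equivalence class has at least l distinct diagnoses."""
    classes = defaultdict(set)
    for row in released:
        classes[(row["age"], row["zip"])].add(row["diagnosis"])
    return all(len(values) >= l_value for values in classes.values())


def laplace_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Noisy count: sensitivity 1, so the Laplace scale is 1 / epsilon."""
    # The difference of two Exp(epsilon) draws is Laplace with scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise
```

A smaller epsilon means more noise and stronger privacy. For real releases, a vetted library such as those named above is the safer choice, since naive floating-point noise sampling has known weaknesses.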
Compliance Checklist
- [ ] PII inventory completed (what data, where it lives)
- [ ] Log scrubbing middleware deployed in all services
- [ ] Dev/staging environments use masked data only
- [ ] Database views/roles restrict raw PII access
- [ ] API responses mask PII for non-privileged callers
- [ ] CI pipeline scans for hardcoded PII/secrets
- [ ] Masked data pipeline documented and tested
- [ ] Masking solution reviewed annually