🛠️ Video Editing
AI-assisted editing of real footage: cutting, structuring, and augmentation.
📜 Original English Description (for reference)
AI-assisted video editing workflows for cutting, structuring, and augmenting real footage. Covers the full pipeline from raw capture through FFmpeg, Remotion, ElevenLabs, fal.ai, and final polish in Descript or CapCut. Use when the user wants to edit video, cut footage, create vlogs, or build video content.
🎯 What This Skill Can Do
The description tells you what this Skill will do for you. When you give Claude a request in this area, it activates automatically.
📦 Installation (3 Steps)
- 1. Click the "Download" button above to get the .skill file
- 2. Rename the extension from .skill to .zip and extract it (macOS can auto-extract)
- 3. Place the extracted folder in .claude/skills/ under your home folder:
  - macOS / Linux: ~/.claude/skills/
  - Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you're done. You don't need to say "use this Skill…" — it is invoked automatically for related requests.
- Last updated: 2026-05-17
- Retrieved: 2026-05-17
- Bundled files: 1
💬 Just Say This — Sample Prompts
- › Use Video Editing to create a 15-second video prompt promoting a new product
- › Use Video Editing to create a vertical video prompt for Instagram Reels
- › I have the URL of a video I want to use as a reference for Video Editing. Generate a prompt with a similar feel
Paste one of these into Claude Code and this Skill activates automatically.
📖 The Original SKILL.md Claude Reads (expanded)
This body text is the original (English or Chinese) that the AI (Claude) reads. Japanese translations are being added progressively.
Video Editing
AI-assisted editing for real footage. Not generation from prompts. Editing existing video fast.
When to Activate
- User wants to edit, cut, or structure video footage
- Turning long recordings into short-form content
- Building vlogs, tutorials, or demo videos from raw capture
- Adding overlays, subtitles, music, or voiceover to existing video
- Reframing video for different platforms (YouTube, TikTok, Instagram)
- User says "edit video", "cut this footage", "make a vlog", or "video workflow"
Core Thesis
AI video editing is useful when you stop asking it to create the whole video and start using it to compress, structure, and augment real footage. The value is not generation. The value is compression.
The Pipeline
Screen Studio / raw footage
→ Claude / Codex
→ FFmpeg
→ Remotion
→ ElevenLabs / fal.ai
→ Descript or CapCut
Each layer has a specific job. Do not skip layers. Do not try to make one tool do everything.
Layer 1: Capture (Screen Studio / Raw Footage)
Collect the source material:
- Screen Studio: polished screen recordings for app demos, coding sessions, browser workflows
- Raw camera footage: vlog footage, interviews, event recordings
- Desktop capture via VideoDB: session recording with real-time context (see the videodb skill)
Output: raw files ready for organization.
Layer 2: Organization (Claude / Codex)
Use Claude Code or Codex to:
- Transcribe and label: generate transcript, identify topics and themes
- Plan structure: decide what stays, what gets cut, what order works
- Identify dead sections: find pauses, tangents, repeated takes
- Generate edit decision list: timestamps for cuts, segments to keep
- Scaffold FFmpeg and Remotion code: generate the commands and compositions
Example prompt:
"Here's the transcript of a 4-hour recording. Identify the 8 strongest segments
for a 24-minute vlog. Give me FFmpeg cut commands for each segment."
This layer is about structure, not final creative taste.
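The edit decision list can be kept as plain CSV and expanded into FFmpeg commands mechanically. A minimal Python sketch, assuming a hypothetical start,end,label CSV format and timestamps already in a form ffmpeg accepts:

```python
import csv
import io
import shlex

def edl_to_commands(edl_csv: str, source: str = "raw.mp4") -> list[str]:
    """Turn a start,end,label EDL (CSV text) into ffmpeg cut commands.

    Assumes timestamps are already ffmpeg-friendly (HH:MM:SS or seconds).
    """
    commands = []
    for start, end, label in csv.reader(io.StringIO(edl_csv)):
        commands.append(
            f"ffmpeg -nostdin -i {shlex.quote(source)} "
            f"-ss {start} -to {end} -c copy segments/{label}.mp4"
        )
    return commands

edl = "00:12:30,00:15:45,intro\n01:02:10,01:05:00,demo"
for cmd in edl_to_commands(edl):
    print(cmd)
```

Keeping the EDL as data rather than prose means the same file can drive the deterministic cuts in the next layer.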
Layer 3: Deterministic Cuts (FFmpeg)
FFmpeg handles the boring but critical work: splitting, trimming, concatenating, and preprocessing.
Extract segment by timestamp
ffmpeg -i raw.mp4 -ss 00:12:30 -to 00:15:45 -c copy segment_01.mp4
Batch cut from edit decision list
#!/bin/bash
# cuts.txt: start,end,label
mkdir -p segments
while IFS=, read -r start end label; do
  # -nostdin keeps ffmpeg from swallowing the rest of cuts.txt
  ffmpeg -nostdin -i raw.mp4 -ss "$start" -to "$end" -c copy "segments/${label}.mp4"
done < cuts.txt
Concatenate segments
# Create file list
for f in segments/*.mp4; do echo "file '$f'"; done > concat.txt
ffmpeg -f concat -safe 0 -i concat.txt -c copy assembled.mp4
Create proxy for faster editing
ffmpeg -i raw.mp4 -vf "scale=960:-2" -c:v libx264 -preset ultrafast -crf 28 proxy.mp4
Extract audio for transcription
ffmpeg -i raw.mp4 -vn -acodec pcm_s16le -ar 16000 audio.wav
Normalize audio levels
ffmpeg -i segment.mp4 -af loudnorm=I=-16:TP=-1.5:LRA=11 -c:v copy normalized.mp4
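Before cutting, it is worth sanity-checking that the chosen segments actually add up to the target runtime (e.g. the 24-minute vlog in the Layer 2 example prompt). A small sketch; timestamp format assumptions are noted in the docstring:

```python
def to_seconds(ts: str) -> float:
    """Parse HH:MM:SS(.ms), MM:SS, or plain seconds into seconds."""
    seconds = 0.0
    for part in ts.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

def total_runtime(segments: list[tuple[str, str]]) -> float:
    """Sum (start, end) segment durations to check the assembled length."""
    return sum(to_seconds(end) - to_seconds(start) for start, end in segments)

# 3m15s + 2m00s of footage
print(total_runtime([("00:12:30", "00:15:45"), ("00:20:00", "00:22:00")]))  # 315.0
```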
Layer 4: Programmable Composition (Remotion)
Remotion turns editing problems into composable code. Use it for things that traditional editors make painful:
When to use Remotion
- Overlays: text, images, branding, lower thirds
- Data visualizations: charts, stats, animated numbers
- Motion graphics: transitions, explainer animations
- Composable scenes: reusable templates across videos
- Product demos: annotated screenshots, UI highlights
Basic Remotion composition
import React from "react";
import { AbsoluteFill, Sequence, Video } from "remotion";

export const VlogComposition: React.FC = () => {
return (
<AbsoluteFill>
{/* Main footage */}
<Sequence from={0} durationInFrames={300}>
<Video src="/segments/intro.mp4" />
</Sequence>
{/* Title overlay */}
<Sequence from={30} durationInFrames={90}>
<AbsoluteFill style={{
justifyContent: "center",
alignItems: "center",
}}>
<h1 style={{
fontSize: 72,
color: "white",
textShadow: "2px 2px 8px rgba(0,0,0,0.8)",
}}>
The AI Editing Stack
</h1>
</AbsoluteFill>
</Sequence>
{/* Next segment */}
<Sequence from={300} durationInFrames={450}>
<Video src="/segments/demo.mp4" />
</Sequence>
</AbsoluteFill>
);
};
Render output
npx remotion render src/index.ts VlogComposition output.mp4
See the Remotion docs for detailed patterns and API reference.
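Sequences are specified in frames, so timestamps from an edit decision list have to be converted at the composition's frame rate. A sketch assuming 30 fps (adjust fps to match your composition settings):

```python
def to_frames(seconds: float, fps: int = 30) -> int:
    """Convert a timestamp in seconds to a frame index."""
    return round(seconds * fps)

def sequence_props(start_s: float, end_s: float, fps: int = 30) -> dict:
    """Compute the from/durationInFrames pair for a Remotion <Sequence>."""
    start = to_frames(start_s, fps)
    return {"from": start, "durationInFrames": to_frames(end_s, fps) - start}

# A 10s-25s segment at 30 fps
print(sequence_props(10.0, 25.0))  # {'from': 300, 'durationInFrames': 450}
```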
Layer 5: Generated Assets (ElevenLabs / fal.ai)
Generate only what you need. Do not generate the whole video.
Voiceover with ElevenLabs
import os
import requests

voice_id = "YOUR_VOICE_ID"  # pick one from your ElevenLabs voice library

resp = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
    headers={
        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],
        "Content-Type": "application/json",
    },
    json={
        "text": "Your narration text here",
        "model_id": "eleven_turbo_v2_5",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    },
)
resp.raise_for_status()
with open("voiceover.mp3", "wb") as f:
    f.write(resp.content)
Music and SFX with fal.ai
Use the fal-ai-media skill for:
- Background music generation
- Sound effects (ThinkSound model for video-to-audio)
- Transition sounds
Generated visuals with fal.ai
Use for insert shots, thumbnails, or b-roll that doesn't exist:
generate(model_name: "fal-ai/nano-banana-pro", input: {
"prompt": "professional thumbnail for tech vlog, dark background, code on screen",
"image_size": "landscape_16_9"
})
VideoDB generative audio
If VideoDB is configured:
voiceover = coll.generate_voice(text="Narration here", voice="alloy")
music = coll.generate_music(prompt="lo-fi background for coding vlog", duration=120)
sfx = coll.generate_sound_effect(prompt="subtle whoosh transition")
Layer 6: Final Polish (Descript / CapCut)
The last layer is human. Use a traditional editor for:
- Pacing: adjust cuts that feel too fast or slow
- Captions: auto-generated, then manually cleaned
- Color grading: basic correction and mood
- Final audio mix: balance voice, music, and SFX levels
- Export: platform-specific formats and quality settings
This is where taste lives. AI clears the repetitive work. You make the final calls.
Social Media Reframing
Different platforms need different aspect ratios:
| Platform | Aspect Ratio | Resolution |
|---|---|---|
| YouTube | 16:9 | 1920x1080 |
| TikTok / Reels | 9:16 | 1080x1920 |
| Instagram Feed | 1:1 | 1080x1080 |
| X / Twitter | 16:9 or 1:1 | 1280x720 or 720x720 |
Reframe with FFmpeg
# 16:9 to 9:16 (center crop)
ffmpeg -i input.mp4 -vf "crop=ih*9/16:ih,scale=1080:1920" vertical.mp4
# 16:9 to 1:1 (center crop)
ffmpeg -i input.mp4 -vf "crop=ih:ih,scale=1080:1080" square.mp4
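The hard-coded filters above can be generalized. The sketch below computes a centered crop for any source size and target aspect ratio, rounding down to even dimensions since most encoders require them (the even-rounding policy is an assumption, not something the commands above enforce):

```python
def center_crop(src_w: int, src_h: int, aspect_w: int, aspect_h: int) -> str:
    """Build an ffmpeg crop filter for a centered crop to a target aspect."""
    target = aspect_w / aspect_h
    if src_w / src_h > target:
        # Source is wider than target: crop the width
        w, h = int(src_h * target), src_h
    else:
        # Source is taller than target: crop the height
        w, h = src_w, int(src_w / target)
    # Round down to even dimensions for encoder compatibility
    w -= w % 2
    h -= h % 2
    return f"crop={w}:{h}"

print(center_crop(1920, 1080, 9, 16))  # crop=606:1080
print(center_crop(1920, 1080, 1, 1))   # crop=1080:1080
```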
Reframe with VideoDB
# Smart reframe (AI-guided subject tracking)
reframed = video.reframe(start=0, end=60, target="vertical", mode=ReframeMode.smart)
Scene Detection and Auto-Cut
FFmpeg scene detection
# Detect scene changes (threshold 0.3 = moderate sensitivity)
ffmpeg -i input.mp4 -vf "select='gt(scene,0.3)',showinfo" -vsync vfr -f null - 2>&1 | grep showinfo
Silence detection for auto-cut
# Find silent segments (useful for cutting dead air)
ffmpeg -i input.mp4 -af silencedetect=noise=-30dB:d=2 -f null - 2>&1 | grep silence
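silencedetect reports to stderr, so its markers have to be parsed back into timestamps before they can drive cuts. A sketch; the example log lines are illustrative, and the exact line format should be verified against your FFmpeg build:

```python
import re

def parse_silences(ffmpeg_stderr: str) -> list[tuple[float, float]]:
    """Pair up silence_start/silence_end markers into (start, end) tuples."""
    starts = [float(m) for m in re.findall(r"silence_start: ([\d.]+)", ffmpeg_stderr)]
    ends = [float(m) for m in re.findall(r"silence_end: ([\d.]+)", ffmpeg_stderr)]
    return list(zip(starts, ends))

# Illustrative log lines (format as produced by recent FFmpeg builds)
log = """\
[silencedetect @ 0x55] silence_start: 12.3
[silencedetect @ 0x55] silence_end: 15.6 | silence_duration: 3.3
"""
print(parse_silences(log))  # [(12.3, 15.6)]
```

The resulting (start, end) pairs invert naturally into keep-segments for the batch-cut script in Layer 3.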
Highlight extraction
Use Claude to analyze transcript + scene timestamps:
"Given this transcript with timestamps and these scene change points,
identify the 5 most engaging 30-second clips for social media."
What Each Tool Does Best
| Tool | Strength | Weakness |
|---|---|---|
| Claude / Codex | Organization, planning, code generation | Not the creative taste layer |
| FFmpeg | Deterministic cuts, batch processing, format conversion | No visual editing UI |
| Remotion | Programmable overlays, composable scenes, reusable templates | Learning curve for non-devs |
| Screen Studio | Polished screen recordings immediately | Only screen capture |
| ElevenLabs | Voice, narration, music, SFX | Not the center of the workflow |
| Descript / CapCut | Final pacing, captions, polish | Manual, not automatable |
Key Principles
- Edit, don't generate. This workflow is for cutting real footage, not creating from prompts.
- Structure before style. Get the story right in Layer 2 before touching anything visual.
- FFmpeg is the backbone. Boring but critical. Where long footage becomes manageable.
- Remotion for repeatability. If you'll do it more than once, make it a Remotion component.
- Generate selectively. Only use AI generation for assets that don't exist, not for everything.
- Taste is the last layer. AI clears repetitive work. You make the final creative calls.
Related Skills
- fal-ai-media — AI image, video, and audio generation
- videodb — Server-side video processing, indexing, and streaming
- content-engine — Platform-native content distribution