explainer-video-guide

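A Skill that guides the whole explainer-video workflow efficiently, from planning to production, covering scripting, narration, and visual structure for product demos, tutorial videos, and similar content.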

📜 Original English description (for reference)

Explainer video production guide: scripting, voiceover, visuals, and assembly. Covers script formulas, pacing rules, scene planning, and multi-tool pipelines. Use for: product demos, how-it-works videos, onboarding videos, social explainers. Triggers: explainer video, how to make explainer, product video, demo video, video production, video script, animated explainer, product demo video, tutorial video, onboarding video, walkthrough video, video pipeline

🇯🇵 Notes for Japanese creators

Note: this commentary was added by the jpskill.com editorial team for Japanese business settings. It is reference information, independent of how the Skill itself behaves.

🎯 What this Skill does

The description below shows what this Skill does for you. When you ask Claude for something in this area, the Skill is triggered automatically.

📦 Installation (3 steps)

1. Download the .skill file.
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically).
3. Place the extracted folder in .claude/skills/ under your home folder:
- macOS / Linux: ~/.claude/skills/
- Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code to finish. You do not need to say "use this Skill"; it is invoked automatically for related requests.
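
The installation steps can be sketched in shell. This is a minimal sketch under stated assumptions: the helper name `install_skill` and the example file name are hypothetical, `unzip` is assumed to be installed, and the archive is assumed to contain a single top-level folder for the Skill. `unzip` identifies the archive by its content, so renaming to .zip is not needed on the command line.

```shell
# Hypothetical helper: extract a downloaded .skill archive into a skills directory.
# Usage: install_skill path/to/file.skill [destination]
install_skill() {
  skill_file=$1
  dest=${2:-"$HOME/.claude/skills"}   # default location from step 3

  [ -f "$skill_file" ] || { echo "not found: $skill_file" >&2; return 1; }
  mkdir -p "$dest" || return 1
  # -q: quiet; -d: extract into the destination directory
  unzip -q "$skill_file" -d "$dest"
}

# Example (file name is illustrative):
# install_skill ~/Downloads/explainer-video-guide.skill
```

Restart Claude Code afterwards so the new folder is picked up.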

Last updated: 2026-05-17
Retrieved: 2026-05-17
Bundled files: 1

📜 Original SKILL.md (the text Claude reads)

Explainer Video Guide

Create explainer videos from script to final cut via inference.sh CLI.

Quick Start

curl -fsSL https://cli.inference.sh | sh && infsh login

# Generate a scene for an explainer
infsh app run google/veo-3-1-fast --input '{
  "prompt": "Clean motion graphics style animation, abstract data flowing between connected nodes, blue and white color scheme, professional corporate aesthetic, smooth transitions"
}'

Script Formulas

Problem-Agitate-Solve (PAS) — 60 seconds

Section	Duration	Content	Word Count
Problem	10s	State the pain point the viewer has	~25 words
Agitate	10s	Show why it's worse than they think	~25 words
Solution	15s	Introduce your product/idea	~35 words
How It Works	20s	Show 3 key steps or features	~50 words
CTA	5s	One clear next action	~12 words
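
The word counts in this table are consistent with duration multiplied by speaking rate. A minimal sketch, assuming a conversational pace of 150 words per minute; `word_budget` is a hypothetical helper name, and shell integer division rounds down, so 15 seconds yields 37 where the table rounds to about 35.

```shell
# word_budget SECONDS [WPM] -> approximate number of words that fit
# WPM defaults to 150 (an assumed conversational pace).
word_budget() {
  seconds=$1
  wpm=${2:-150}
  echo $(( seconds * wpm / 60 ))
}

# Examples:
# word_budget 10        # Problem / Agitate -> 25
# word_budget 20        # How It Works      -> 50
# word_budget 20 120    # same slot at a slower technical pace -> 40
```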

Before-After-Bridge (BAB) — 90 seconds

Section	Duration	Content
Before	15s	Show the current frustrating state
After	15s	Show the ideal outcome
Bridge	40s	Explain how your product gets them there
Social Proof	10s	Quick stat or testimonial
CTA	10s	Clear next step

Feature Spotlight — 30 seconds (social)

Section	Duration	Content
Hook	3s	Surprising fact or question
Feature	15s	Show one feature solving one problem
Result	7s	The outcome/benefit
CTA	5s	Try it / Learn more
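
Each formula's section durations add up to its headline length. When adapting a formula, a quick sum helps confirm the plan still fits the target. A minimal sketch; `total_seconds` is a hypothetical helper name.

```shell
# total_seconds D1 D2 ... -> sum of section durations in seconds
total_seconds() {
  total=0
  for d in "$@"; do
    total=$(( total + d ))
  done
  echo "$total"
}

# Examples:
# total_seconds 10 10 15 20 5    # PAS               -> 60
# total_seconds 15 15 40 10 10   # BAB               -> 90
# total_seconds 3 15 7 5         # Feature Spotlight -> 30
```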

Pacing Rules

Content Type	Words Per Minute	Notes
Standard narration	150 wpm	Conversational pace
Complex/technical	120 wpm	Allow processing time
Energetic/social	170 wpm	Faster for short-form
Children's content	100 wpm	Clear and slow

Key rule: 1 scene per key message. Don't pack multiple ideas into one visual.
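
The pacing table can also be used in reverse, to estimate how long a draft script will take to read. A minimal sketch, assuming the script is plain text on standard input and that whitespace-separated tokens are a fair proxy for spoken words; `narration_seconds` is a hypothetical helper name.

```shell
# narration_seconds WPM -> estimated read time in whole seconds for text on stdin
narration_seconds() {
  wpm=$1
  # wc -w may pad its output with whitespace, so keep only the digits
  words=$(wc -w | tr -dc '0-9')
  echo $(( words * 60 / wpm ))
}

# Example: a ten-word line at the complex/technical pace
# printf '%s ' a b c d e f g h i j | narration_seconds 120   # -> 5
```

If the estimate exceeds the section's slot, trim the script rather than raising the speaking rate.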

Scene Duration Guidelines

Establishing shot: 3-5 seconds
Feature demonstration: 5-8 seconds
Text/stat on screen: 3-4 seconds (must be readable)
Transition: 0.5-1 second
CTA screen: 3-5 seconds

Visual Production

Scene Types

# Product in context
infsh app run google/veo-3-1-fast --input '{
  "prompt": "Clean product demonstration video, hands typing on a laptop showing a dashboard interface, bright modern office, soft natural lighting, professional"
}'

# Abstract concept visualization
infsh app run bytedance/seedance-1-5-pro --input '{
  "prompt": "Abstract motion graphics, colorful data streams connecting floating geometric shapes, smooth fluid animation, dark background with glowing elements, tech aesthetic"
}'

# Lifestyle/outcome shot
infsh app run google/veo-3-1-fast --input '{
  "prompt": "Happy person relaxing on couch with laptop, smiling at screen, bright airy living room, warm afternoon light, satisfied customer feeling, lifestyle commercial style"
}'

# Before/after comparison
infsh app run falai/flux-dev-lora --input '{
  "prompt": "Split screen comparison, left side cluttered messy desk with papers and stress, right side clean organized minimalist workspace, dramatic difference, clean design"
}'

Image-to-Video for Scenes

# Generate a still frame first
infsh app run falai/flux-dev-lora --input '{
  "prompt": "Professional workspace with glowing holographic interface, futuristic but clean, blue accent lighting"
}'

# Animate it
infsh app run falai/wan-2-5-i2v --input '{
  "prompt": "Gentle camera push in, holographic elements subtly floating and rotating, soft ambient light shifts",
  "image": "path/to/workspace-still.png"
}'

Voiceover Production

Script Writing Tips

Short sentences. Max 15 words per sentence.
Active voice. "You can track your data" not "Your data can be tracked."
Conversational tone. Read it aloud — if it sounds stiff, rewrite.
One idea per sentence. One sentence per visual beat.
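
The 15-word limit can be checked mechanically before recording. A minimal sketch, assuming sentences end at `.`, `!`, or `?` (so abbreviations such as "e.g." will be split incorrectly); `long_sentences` is a hypothetical helper name.

```shell
# long_sentences MAX -> print each sentence on stdin that has more than MAX words
long_sentences() {
  max=$1
  # Join lines, then put each sentence on its own line by
  # turning sentence terminators into newlines.
  tr '\n' ' ' | tr '.!?' '\n\n\n' |
    awk -v max="$max" 'NF > max { print NF " words:" $0 }'
}

# Example:
# long_sentences 15 < script.txt
```

Sentences within the limit produce no output, so an empty result means the script passes.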

Generating Voiceover

# Professional narration with Dia TTS
infsh app run falai/dia-tts --input '{
  "prompt": "[S1] Tired of spending hours on reports that nobody reads? There is a better way. Meet DataFlow. It turns your raw data into visual stories... in seconds. Just connect your source, pick a template, and share. Try DataFlow free today."
}'

Pacing Control in TTS

Technique	Effect	Example
Period `.`	Medium pause	"This changes everything. Here's how."
Ellipsis `...`	Long pause (dramatic)	"And the result... was incredible."
Comma `,`	Short pause	"Fast, simple, powerful."
Exclamation `!`	Emphasis/energy	"Start building today!"
Question `?`	Rising intonation	"What if there was a better way?"

Music & Audio

Background Music Guidelines

Volume: 20-30% under narration (duck 6-12dB when voice plays)
Style: match the brand tone (corporate = ambient electronic, startup = upbeat indie)
Structure: intro swell (first 3s) -> subtle loop under narration -> swell at CTA
No vocals: instrumental only under narration
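
For editors that take a linear volume multiplier rather than decibels, the ducking range converts with gain = 10^(-dB/20). A minimal sketch; `db_to_gain` is a hypothetical helper name. The percentage and decibel figures in the guideline are different measures, and this only converts the decibel figures.

```shell
# db_to_gain DB -> linear amplitude multiplier for a reduction of DB decibels
db_to_gain() {
  awk -v db="$1" 'BEGIN { printf "%.2f\n", 10 ^ (-db / 20) }'
}

# Examples:
# db_to_gain 6    # -> 0.50
# db_to_gain 12   # -> 0.25
```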

# Generate background music
infsh app run <music-gen-app> --input '{
  "prompt": "upbeat corporate background music, modern electronic, 90 BPM, positive and professional, no vocals, suitable for product explainer video"
}'

Assembly Pipeline

Full Production Workflow

# 1. Generate voiceover
infsh app run falai/dia-tts --input '{
  "prompt": "[S1] Your script here..."
}'

# 2. Generate scene visuals (in parallel)
infsh app run google/veo-3-1-fast --input '{"prompt": "scene 1 description"}' --no-wait
infsh app run google/veo-3-1-fast --input '{"prompt": "scene 2 description"}' --no-wait
infsh app run google/veo-3-1-fast --input '{"prompt": "scene 3 description"}' --no-wait

# 3. Merge scenes into sequence
infsh app run infsh/media-merger --input '{
  "media": ["scene1.mp4", "scene2.mp4", "scene3.mp4"]
}'

# 4. Add voiceover to video
infsh app run infsh/video-audio-merger --input '{
  "video": "merged-scenes.mp4",
  "audio": "voiceover.mp3"
}'

# 5. Add captions
infsh app run infsh/caption-videos --input '{
  "video": "final-with-audio.mp4",
  "caption_file": "captions.srt"
}'
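
Step 5 expects a captions file. A minimal sketch of writing one in SubRip (.srt) format, where each cue is an index, a `start --> end` line with `HH:MM:SS,mmm` timestamps, the text, and a blank line. The helper names `srt_time` and `srt_cue` are hypothetical, timings are whole seconds only, and the example timings are illustrative rather than measured from real audio.

```shell
# srt_time SECONDS -> HH:MM:SS,000 (whole seconds only)
srt_time() {
  printf '%02d:%02d:%02d,000' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# srt_cue INDEX START END TEXT -> one SRT cue followed by a blank line
srt_cue() {
  printf '%s\n%s --> %s\n%s\n\n' "$1" "$(srt_time "$2")" "$(srt_time "$3")" "$4"
}

# Example (illustrative timings):
# {
#   srt_cue 1 0 4 "Tired of spending hours on reports that nobody reads?"
#   srt_cue 2 4 6 "There is a better way."
# } > captions.srt
```

One cue per sentence keeps the captions aligned with the one-sentence-per-visual-beat rule.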

Video Length by Format

Format	Length	Platform
Social teaser	15-30s	TikTok, Instagram Reels, YouTube Shorts
Product demo	60-90s	Website, landing page
Feature explainer	90-120s	YouTube, email
Tutorial/walkthrough	2-5min	YouTube, help center
Investor pitch video	2-3min	Pitch deck supplement

Transition Types

Transition	When to Use	Effect
Cut	Default between related scenes	Clean, professional
Dissolve/Crossfade	Time passing, mood shift	Soft, contemplative
Wipe	New topic or section	Clear separation
Zoom/Push	Drilling into detail	Focus attention
Match cut	Visual similarity between scenes	Clever, memorable

Common Mistakes

Mistake	Problem	Fix
Script too wordy	Voiceover rushed, viewer overwhelmed	Cut to 150 wpm max
No hook in first 3s	Viewers leave immediately	Start with the problem or surprising stat
Visuals lag narration	Confusing disconnect	Visuals should match or slightly precede words
Background music too loud	Can't hear narration	Duck music 6-12dB under voice
No captions	85% of social video watched silent	Always add captions
Too many ideas	Viewer retains nothing	One core message per video

Related Skills

npx skills add inferencesh/skills@ai-video-generation
npx skills add inferencesh/skills@video-prompting-guide
npx skills add inferencesh/skills@text-to-speech
npx skills add inferencesh/skills@prompt-engineering

Browse all apps: infsh app list