
happyhorse-1-0

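A Skill that uses HappyHorse 1.0 running on RunComfy to generate high-quality video from text. It supports multilingual prompts and character consistency, and switches to other models when they are a better fit for the video.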

📜 Original English description (for reference)

Generate text-to-video with HappyHorse 1.0 on RunComfy. Documents HappyHorse 1.0's strengths (#1 on Artificial Analysis Video Arena, native 1080p with in-pass synchronized audio, multi-shot character consistency, 6-language prompt support), the duration / aspect-ratio / resolution schema, and when to route to Wan 2.7 / Seedance 2 / LTX 2 instead. Calls `runcomfy run happyhorse/happyhorse-1-0/text-to-video` through the local RunComfy CLI. Triggers on "happyhorse", "happy horse", "happyhorse 1.0", "happyhorse video", or any explicit ask to generate video with this model.


⚡ Recommended: install with a single command (60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). Download, extraction, and placement are fully automatic.

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o happyhorse-1-0.zip https://jpskill.com/download/10365.zip && unzip -o happyhorse-1-0.zip && rm happyhorse-1-0.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/10365.zip -OutFile "$d\happyhorse-1-0.zip"; Expand-Archive "$d\happyhorse-1-0.zip" -DestinationPath $d -Force; ri "$d\happyhorse-1-0.zip"

When it finishes, restart Claude Code. The Skill then triggers automatically when you ask in plain language, for example "make me a video prompt".

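💾 Manual download (for those who find the command line difficult)

1. Click the blue download button to get happyhorse-1-0.zip
2. Double-click the ZIP file to extract it; this creates a happyhorse-1-0 folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code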


⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of the Skill.


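📦 Installation (3 steps)

1. Press the "Download" button to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
   - macOS / Linux: ~/.claude/skills/
   - Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you are done. You do not need to say "use this Skill"; it is invoked automatically for relevant requests.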


Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1


📜 Original SKILL.md (the original text that Claude reads)

HappyHorse 1.0 — Pro Pack on RunComfy

runcomfy.com · Text-to-video · GitHub

HappyHorse 1.0 — currently #1 on Artificial Analysis Video Arena (Elo 1333 t2v / 1392 i2v) — hosted on the RunComfy Model API. Native 1080p video with in-pass synchronized audio (dialogue, ambient, Foley) and multi-shot character consistency.

npx skills add agentspace-so/runcomfy-skills --skill happyhorse-1-0 -g

When to pick this model (vs siblings)

You want	Use
Multi-shot story with character / wardrobe consistency	HappyHorse 1.0
Native audio in the same generation pass	HappyHorse 1.0
Currently-#1 blind-vote video model	HappyHorse 1.0
Detailed lip-synced dialogue + reference video	Seedance 2.0 Pro
Fine motion control + multi-reference conditioning	Wan 2.7
Ultra-fast iteration (sub-second per frame)	LTX 2
Cinematic motion editing on existing footage	Kling Video O1

If the user said "HappyHorse" / "happy horse video" explicitly, route here regardless.

Prerequisites

RunComfy CLI — npm i -g @runcomfy/cli
RunComfy account — runcomfy login opens a browser device-code flow.
CI / containers — set RUNCOMFY_TOKEN=<token> instead of runcomfy login.
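The prerequisites above can be checked before a run. The following is a minimal sketch, not part of the RunComfy CLI: `hh_auth_state` is a hypothetical helper name. It only inspects the environment (whether `runcomfy` is on the PATH, whether `RUNCOMFY_TOKEN` is set, and whether the token file that `runcomfy login` is documented to write exists). It does not verify that a token is still valid.

```shell
# Hypothetical helper: report which prerequisite state the current shell is in.
hh_auth_state() {
  if ! command -v runcomfy >/dev/null 2>&1; then
    echo "missing-cli"        # install with: npm i -g @runcomfy/cli
  elif [ -n "${RUNCOMFY_TOKEN:-}" ]; then
    echo "env-token"          # CI / container style authentication
  elif [ -f "$HOME/.config/runcomfy/token.json" ]; then
    echo "token-file"         # written by: runcomfy login
  else
    echo "not-logged-in"      # run: runcomfy login
  fi
}
```

A calling script could branch on the printed state and stop early with a clear message instead of waiting for the CLI to fail with exit code 77.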

Endpoints + input schema

`happyhorse/happyhorse-1-0/text-to-video`

Field	Type	Required	Default	Notes
`prompt`	string	yes	—	Up to 2,500 chars. 6 languages (CN/EN/JP/KR/DE/FR).
`aspect_ratio`	enum	no	`16:9`	`16:9`, `9:16`, `1:1`, `4:3`, `3:4` only.
`resolution`	enum	no	`1080P`	`720P` or `1080P`.
`duration`	int	no	5	3–15 seconds.
`seed`	int	no	0	0..2^31-1. Reuse for variant comparisons.
`watermark`	bool	no	true	Provider watermark.
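A pre-flight check along these lines could catch values the schema above rejects before a request is submitted. This is a sketch under assumptions: `hh_validate` is a hypothetical helper, not a CLI feature, and returning 65 simply mirrors the CLI's documented exit code for schema mismatches.

```shell
# Hypothetical helper: hh_validate ASPECT_RATIO RESOLUTION DURATION
# Returns 0 when all three values fit the documented schema, 65 otherwise.
hh_validate() {
  aspect_ratio=$1
  resolution=$2
  duration=$3

  case "$aspect_ratio" in
    16:9|9:16|1:1|4:3|3:4) ;;
    *) echo "unsupported aspect_ratio: $aspect_ratio" >&2; return 65 ;;
  esac

  case "$resolution" in
    720P|1080P) ;;
    *) echo "unsupported resolution: $resolution" >&2; return 65 ;;
  esac

  # duration must be a whole number of seconds
  case "$duration" in
    ''|*[!0-9]*) echo "duration must be an integer: $duration" >&2; return 65 ;;
  esac
  # and must fall inside the documented 3 to 15 second range
  if [ "$duration" -lt 3 ] || [ "$duration" -gt 15 ]; then
    echo "duration out of range (3-15): $duration" >&2
    return 65
  fi
  return 0
}
```

For example, `hh_validate 9:16 1080P 8` succeeds, while `hh_validate 21:9 1080P 8` prints a message to stderr and returns 65.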

How to invoke

Default (16:9 1080p 5s):

runcomfy run happyhorse/happyhorse-1-0/text-to-video \
  --input '{"prompt": "<user prompt>"}' \
  --output-dir <absolute/path>

Vertical short (9:16, 8s, no watermark):

runcomfy run happyhorse/happyhorse-1-0/text-to-video \
  --input '{
    "prompt": "<user prompt>",
    "aspect_ratio": "9:16",
    "duration": 8,
    "watermark": false
  }' \
  --output-dir <absolute/path>

Cheaper test pass (720p):

runcomfy run happyhorse/happyhorse-1-0/text-to-video \
  --input '{"prompt": "<user prompt>", "resolution": "720P", "duration": 3}' \
  --output-dir <absolute/path>

The CLI submits the request, polls every 2 seconds until the request reaches a terminal status, then downloads any *.runcomfy.net / *.runcomfy.com URL from the result into --output-dir. Stdout carries the result JSON; stderr carries progress.
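Because the result JSON and the progress output arrive on separate streams, a wrapper can keep them apart. The sketch below assumes nothing beyond the flags shown above; `hh_run` and the file names `result.json` and `progress.log` are illustrative choices, not CLI conventions.

```shell
# Hypothetical wrapper: hh_run INPUT_JSON OUTPUT_DIR
# Saves stdout (result JSON) and stderr (progress) to separate files
# and returns the CLI's own exit status.
hh_run() {
  input_json=$1
  out_dir=$2
  mkdir -p "$out_dir" || return 1
  runcomfy run happyhorse/happyhorse-1-0/text-to-video \
    --input "$input_json" \
    --output-dir "$out_dir" \
    > "$out_dir/result.json" 2> "$out_dir/progress.log"
}

# Usage (the path is a placeholder):
#   hh_run '{"prompt": "Wide shot. A sailboat crosses a calm bay."}' /absolute/path/out
```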

Prompting — what actually works

Describe motion over time, not a still. "A woman turns from the window, walks two paces to the desk, picks up the cup, lifts it to her face, takes a sip" beats "a woman drinking coffee".

Camera + shot in plain English. Front-load the shot: "Wide shot. ..." / "Tracking shot. ..." / "Locked tripod, low angle. ..." works as a real directive. Specify lens feel: "35mm anamorphic", "shallow DOF", "crushed shadows".

One visual beat per clip when iterating. Don't pile up "she walks AND the dog runs AND a car passes". Pick the beat, get it sharp, then layer with multi-shot prompts.

Multi-shot consistency — when describing two beats, restate the anchor at each: "Shot 1: tall woman in red wool coat, blue scarf, in a rainy alley. Shot 2: same woman in red coat / blue scarf, now ducking under an awning." HappyHorse holds the look but needs the anchor.
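Restating the anchor by hand is easy to get wrong, so the repetition can be generated. This is only a sketch of the idea: `hh_multishot` is a hypothetical helper, and the "Shot N: anchor, action." layout is one possible phrasing rather than a format the model requires.

```shell
# Hypothetical helper: hh_multishot ANCHOR ACTION...
# Prints one line per action, repeating the same anchor description in each shot.
hh_multishot() {
  anchor=$1
  shift
  n=0
  for action in "$@"; do
    n=$((n + 1))
    printf 'Shot %d: %s, %s.\n' "$n" "$anchor" "$action"
  done
}

# Usage:
#   hh_multishot "tall woman in red wool coat, blue scarf" \
#     "walking through a rainy alley" "ducking under an awning"
```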

Audio direction — say what you want to hear: "distant temple bells, footsteps on wet pavement, no dialogue" or "warm friendly tone, English".

Anti-patterns:

Static-frame descriptions (no temporal verbs) → motion will be vague.
Conflicting style directions → they cancel each other out.
Prompts near the 2,500-character limit → quality degrades.
Aspect ratios outside the 5 supported values → HTTP 422.

Where it shines

Use case	Why HappyHorse 1.0
Multi-shot brand stories with one consistent character	Native cross-shot identity preservation
Talking-head explainers needing in-clip voiceover + ambient	Synchronized audio in the same pass
Multilingual short-form ads	6 prompt languages, no script-quality drop
Cinematic 1080p delivery	Native 1080p output, broadcast-ready
Blind-vote leader for general video quality	#1 on Artificial Analysis Video Arena

Sample prompts (verified to produce strong results)

From the model page (cinematic scope):

Wide shot. A lone astronaut in dusty orange suit with blue-gray harness
skis across lunar plain, leaving parallel tracks in gray regolith.
Mid-stride, poles planted, pushing in 1/6th gravity with subtle upward
drift. Fine dust haze along ski tracks. Crescent Earth above lunar
horizon, blue-white glow against black sky. Raw sunlight, crushed
shadows, no fill. 8K photorealistic.

Multi-shot consistency:

Shot 1: Medium close-up. A woman in a navy trench coat enters a
rain-slick neon-lit Tokyo alley, looks left, holds up an umbrella.
Shot 2: Same woman in same navy trench, now under the awning of a
ramen shop, shaking water off the umbrella. Warm interior glow, soft
chatter, gentle rain on metal roof in the audio.

Vertical platform-native:

9:16 vertical short. A barista in a black apron pulls a single
espresso shot, steam rising into the morning sun, rich crema slowly
forming. Close-up handheld, shallow DOF, warm cafe ambience and the
hiss of the steam wand.

Limitations

Duration cap 15s — for longer narratives, segment into multi-shot prompts and stitch.
Aspect ratios — only the 5 documented values; ultra-wide cinematic gets cropped or rejected.
Audio is in-pass only — you can't pass external audio to drive lip-sync. For audio-driven lip-sync, use Wan 2.7 (which accepts an audio_url) or Seedance 2.0 Pro.
No free image-to-video on this template — i2v is supported by HappyHorse via a separate pipeline; the t2v endpoint here is text-only.
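For narratives longer than the 15-second cap, the clip lengths can be planned up front. The sketch below splits a total running time into the fewest clips that each stay within the documented 3 to 15 second range; the even split is an assumption made for illustration, `hh_plan_segments` is a hypothetical helper, and stitching the resulting clips is left to a separate tool.

```shell
# Hypothetical helper: hh_plan_segments TOTAL_SECONDS
# Prints one clip duration per line; each value is between 3 and 15.
hh_plan_segments() {
  total=$1
  case "$total" in
    ''|*[!0-9]*) echo "total must be a whole number of seconds" >&2; return 1 ;;
  esac
  if [ "$total" -lt 3 ]; then
    echo "total is below the 3-second minimum" >&2
    return 1
  fi
  clips=$(( ($total + 14) / 15 ))   # fewest clips of at most 15 s each
  base=$(( $total / $clips ))       # even share per clip
  extra=$(( $total % $clips ))      # leftover seconds, one per leading clip
  i=1
  while [ "$i" -le "$clips" ]; do
    if [ "$i" -le "$extra" ]; then
      echo $(( $base + 1 ))
    else
      echo "$base"
    fi
    i=$(( $i + 1 ))
  done
}
```

Each printed value can then be used as the `duration` of one request.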

Exit codes

The runcomfy CLI uses sysexits-style codes:

code	meaning
0	success
64	bad CLI args
65	bad input JSON / schema mismatch (e.g. `duration: 30` would 422)
69	upstream 5xx
75	retryable: timeout / 429
77	not signed in or token rejected

Full reference: docs.runcomfy.com/cli/troubleshooting.
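Since only code 75 is listed as retryable, a wrapper can retry on that code and pass every other code straight through. This is a sketch, not CLI behavior: `hh_retry`, the `HH_RETRY_DELAY` variable, and the doubling backoff are all assumptions made for illustration.

```shell
# Hypothetical helper: hh_retry MAX_ATTEMPTS COMMAND [ARGS...]
# Re-runs COMMAND only when it exits with 75; returns its final exit status.
hh_retry() {
  max_attempts=$1
  shift
  attempt=1
  delay=${HH_RETRY_DELAY:-5}   # initial wait in seconds (illustrative default)
  while :; do
    if "$@"; then
      return 0
    else
      code=$?
    fi
    # Stop on any non-retryable code, or once the attempt budget is spent.
    if [ "$code" -ne 75 ] || [ "$attempt" -ge "$max_attempts" ]; then
      return "$code"
    fi
    echo "attempt $attempt exited with 75, retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$(( $attempt + 1 ))
    delay=$(( $delay * 2 ))
  done
}

# Usage (the path is a placeholder):
#   hh_retry 3 runcomfy run happyhorse/happyhorse-1-0/text-to-video \
#     --input '{"prompt": "Tracking shot. A cyclist rides along a canal."}' \
#     --output-dir /absolute/path/out
```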

How it works

The skill invokes runcomfy run happyhorse/happyhorse-1-0/text-to-video with a JSON body matching the schema.
The CLI POSTs to https://model-api.runcomfy.net/v1/models/happyhorse/happyhorse-1-0/text-to-video with the user's bearer token.
The Model API returns a request_id; the CLI polls GET .../requests/<id>/status every 2 seconds.
On terminal status, the CLI fetches GET .../requests/<id>/result and downloads any URL whose host ends with .runcomfy.net or .runcomfy.com into --output-dir. Other URLs are listed but not fetched.
Ctrl-C while polling sends POST .../requests/<id>/cancel so you don't get billed for GPU you stopped.
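The download rule in the steps above (fetch only URLs whose host ends with .runcomfy.net or .runcomfy.com) can be illustrated with a small host check. This is a sketch of the rule as described, not the CLI's actual implementation; `hh_is_downloadable` is a hypothetical name and the URLs in the usage comment are made up.

```shell
# Hypothetical helper: hh_is_downloadable URL
# Returns 0 when the URL's host ends with .runcomfy.net or .runcomfy.com.
hh_is_downloadable() {
  host=${1#*://}          # drop the scheme
  host=${host%%[/?#]*}    # drop path, query, and fragment
  host=${host##*@}        # drop any userinfo before the host
  host=${host%%:*}        # drop any port
  case "$host" in
    *.runcomfy.net|*.runcomfy.com) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   hh_is_downloadable "https://cdn.runcomfy.net/out/video.mp4" && echo "would fetch"
#   hh_is_downloadable "https://example.com/video.mp4" || echo "listed only"
```

Stripping userinfo before the port matters here: without it, a URL such as `https://a.runcomfy.net:pw@example.com/` would be misread as a RunComfy host.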

What this skill is not

This skill is not a self-hosted video runner, and it does not grant any capability on its own: it depends on a working RunComfy account.

Security & Privacy

Token storage: runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600 (owner-only read/write). Set RUNCOMFY_TOKEN env var to bypass the file entirely in CI / containers.
Input boundary: the user prompt is passed as a JSON string to the CLI via --input. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
Outbound endpoints: only model-api.runcomfy.net (request submission) and *.runcomfy.net / *.runcomfy.com (download whitelist for generated outputs). No telemetry, no callbacks.
Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
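The input-boundary point above concerns the CLI; the calling script still has to produce valid JSON when the prompt contains quotes or backslashes. The sketch below shows one way to do that with standard tools. `hh_build_input` is a hypothetical helper, and it makes simplifying assumptions: newlines, tabs, and carriage returns are flattened to spaces, and other control characters are not handled, so a dedicated JSON tool is the more robust choice where one is available.

```shell
# Hypothetical helper: hh_build_input PROMPT
# Prints a JSON object with the prompt escaped for use with --input.
hh_build_input() {
  # Flatten newline, tab, and carriage return to spaces, then escape \ and ".
  escaped=$(printf '%s' "$1" | tr '\n\t\r' '   ' | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"prompt": "%s"}\n' "$escaped"
}

# Usage (the path is a placeholder):
#   runcomfy run happyhorse/happyhorse-1-0/text-to-video \
#     --input "$(hh_build_input 'She whispers "hello" to the camera.')" \
#     --output-dir /absolute/path/out
```

Passing the result inside double quotes keeps the whole JSON body as a single argument, whatever the prompt contains.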