elevenlabs-music-generation

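A Skill that uses ElevenLabs Music through RunComfy to generate high-quality music and songs from a described style and lyrics. It makes it easy to create commercially usable tracks such as background music, jingles, and game music from text.

📜 Original English description (for reference)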

Generate full songs and instrumental tracks with ElevenLabs Music on RunComfy via the `runcomfy` CLI. ElevenLabs Music turns a style description plus structured lyrics into studio-quality 44.1 kHz stereo audio — 5 seconds to 5 minutes — with section-level control (Intro / Verse / Chorus / Bridge), multilingual vocals, and commercial-friendly output. Generate a backing track, a full vocal song, a jingle, a podcast intro, a game loop, or an instrumental bed. Calls `runcomfy run elevenlabs/elevenlabs/music-generation` through the local RunComfy CLI. Triggers on "generate music", "make a song", "AI music", "background music", "instrumental track", "ElevenLabs Music", "soundtrack", "jingle", "theme music", "royalty-free music", "compose", or any explicit ask to generate music or a song from a text description.

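🇯🇵 Notes for Japanese creators

In a nutshell

※ This commentary was added by the jpskill.com editorial team for Japanese business settings. It is reference information, independent of how the Skill itself behaves.

⚡ Recommended: install with one command (60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). It downloads, extracts, and places the files automatically.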

🍎 Mac / 🐧 Linux

mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o elevenlabs-music-generation.zip https://jpskill.com/download/10359.zip && unzip -o elevenlabs-music-generation.zip && rm elevenlabs-music-generation.zip

🪟 Windows (PowerShell)

$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/10359.zip -OutFile "$d\elevenlabs-music-generation.zip"; Expand-Archive "$d\elevenlabs-music-generation.zip" -DestinationPath $d -Force; ri "$d\elevenlabs-music-generation.zip"

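When it finishes, restart Claude Code. The Skill then triggers automatically when you ask in plain language, for example "make background music".

💾 Manual download (if you would rather not use the command line)

1. Press the blue button below to download elevenlabs-music-generation.zip
2. Double-click the ZIP file to extract it, which creates an elevenlabs-music-generation folder
3. Move that folder to C:\Users\<your name>\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
4. Restart Claude Code

⬇ Download .zip (recommended) ⬇ .skill format (advanced users) Original source ↗

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety.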

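🎯 What this Skill does

The description below tells you what this Skill does for you. It triggers automatically when you ask Claude for something in this area.

📦 Installation (3 steps)

1. Press the "Download" button above to get the .skill file
2. Change the file extension from .skill to .zip and extract it (macOS can extract it automatically)
3. Place the extracted folder in .claude/skills/ under your home folder
   - macOS / Linux: ~/.claude/skills/
   - Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code to finish. You do not need to say "use this Skill"; it is invoked automatically for related requests.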

Last updated: 2026-05-18
Retrieved: 2026-05-18
Bundled files: 1

📜 Original SKILL.md (the English/Chinese source that Claude reads)

ElevenLabs AI Music Generation — Pro Pack on RunComfy

Generate full songs and instrumental tracks from a text description — studio-quality 44.1 kHz stereo, 5 seconds to 5 minutes, with section-level structure control. ElevenLabs Music on the RunComfy Model API, called through the runcomfy CLI.

runcomfy.com · ElevenLabs Music model · CLI docs

Install this skill

npx skills add agentspace-so/runcomfy-agent-skills --skill elevenlabs-music-generation -g

Powered by the RunComfy CLI

# 1. Install (one of — see runcomfy-cli skill for details)
npm i -g @runcomfy/cli                              # global install
npx -y @runcomfy/cli --version                      # zero-install

# 2. Sign in
runcomfy login                                      # or in CI: export RUNCOMFY_TOKEN=<token>

# 3. Generate music
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{"prompt": "..."}' \
  --output-dir ./out

CLI deep dive: runcomfy-cli skill.
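
Before step 3 it can help to confirm that the CLI is reachable and that some credential is present. The following is a minimal sketch, not part of the runcomfy CLI: `have_cmd` and `auth_source` are hypothetical helper names, and the token file path is the one given in the Security & Privacy notes of this document. It only checks for presence and cannot tell whether a token is still valid.

```shell
# Hypothetical preflight helpers; not part of the runcomfy CLI.

# Succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Prints where a token would come from: "env", "file", or "none".
# The environment variable is checked first because it bypasses the file.
auth_source() {
  if [ -n "${RUNCOMFY_TOKEN:-}" ]; then
    echo "env"
  elif [ -f "$HOME/.config/runcomfy/token.json" ]; then
    echo "file"
  else
    echo "none"
  fi
}

have_cmd runcomfy || echo "runcomfy not found on PATH; install it first" >&2
echo "auth source: $(auth_source)"
```

If `auth_source` prints "none", run `runcomfy login` or export `RUNCOMFY_TOKEN` before generating.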

When to use ElevenLabs Music

ElevenLabs Music's strength is structured songs with real vocals — it takes a style brief plus lyrics with section markers and returns a coherent, mixed track. Pick it for:

- Full vocal songs: verse/chorus structure, multilingual lyrics, consistent meter
- Instrumental beds: set force_instrumental: true for background music, podcast intros, game loops
- Short brand assets: jingles, stingers, theme music (5–30 s)
- Long-form tracks: up to 5 minutes in a single call
- Commercial work: output is commercial-friendly

If the user just wants ambient sound or a one-off SFX (thunder, footsteps), that's a sound-effects task, not music — ElevenLabs Music is for songs and tracks.

Endpoint + input schema

Model: elevenlabs/elevenlabs/music-generation

Field	Type	Required	Default	Notes
`prompt`	string	yes	—	Style description and lyrics with section markers. See prompting tips
`music_length_ms`	int	no	`40000`	Output duration in ms. 5000–300000 (5 s – 5 min)
`force_instrumental`	bool	no	`false`	`true` = instrumental only, no vocals
`output_format`	string	no	`mp3_standard`	`mp3_standard` (default), or WAV — see the model page API tab for the full format list

Output: 44.1 kHz stereo audio. The result JSON contains the generated audio URL — the CLI downloads it into --output-dir.
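
Since an out-of-range duration would presumably be rejected as a schema mismatch, a local check before calling the API can save a round trip. This is a sketch under that assumption; `validate_length_ms` is a hypothetical helper, and the bounds are the ones listed in the schema table above.

```shell
# Hypothetical validator for music_length_ms; bounds taken from the schema table.
MIN_LENGTH_MS=5000
MAX_LENGTH_MS=300000

validate_length_ms() {
  # Reject empty input and anything that is not a plain run of digits.
  case "$1" in
    ''|*[!0-9]*)
      echo "music_length_ms must be a whole number of milliseconds" >&2
      return 1
      ;;
  esac
  if [ "$1" -lt "$MIN_LENGTH_MS" ] || [ "$1" -gt "$MAX_LENGTH_MS" ]; then
    echo "music_length_ms must be between $MIN_LENGTH_MS and $MAX_LENGTH_MS" >&2
    return 1
  fi
}

validate_length_ms 40000 && echo "40000 ms is within range"
```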

Pricing: ~$0.0083 per second of generated audio (30 s ≈ $0.25, 60 s ≈ $0.50, 5 min ≈ $2.49). Cost scales with music_length_ms, so draft short and finalize long.
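
The per-second rate above can be turned into a quick estimator. This is a sketch that assumes the approximate $0.0083 per second figure quoted here and plain linear scaling; `estimate_cost_usd` is a hypothetical helper name, and actual billing may differ.

```shell
# Hypothetical helper: estimate cost in USD for a given music_length_ms.
# Assumes the approximate rate of $0.0083 per generated second.
estimate_cost_usd() {
  awk -v ms="$1" 'BEGIN { printf "%.2f\n", (ms / 1000) * 0.0083 }'
}

estimate_cost_usd 35000    # a 35 s draft
estimate_cost_usd 300000   # a 5 min final render
```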

How to invoke

Full vocal song with structure:

runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "Upbeat indie-pop anthem, bright electric guitars, driving drums, 120 BPM, female lead vocal. [Intro 8 bars] instrumental build. [Verse] Chalk on the palms, laces double-knotted, morning on the ridge. [Chorus] We rise, we strike, we never fade out. [Bridge] soft breakdown, just piano and voice. [Outro] full band, fade.",
    "music_length_ms": 60000
  }' \
  --output-dir ./out

Instrumental background bed:

runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "Calm lo-fi hip-hop instrumental for a study playlist. Warm Rhodes piano, soft vinyl crackle, mellow boom-bap drums, 75 BPM. No vocals. Consistent loop-friendly groove throughout.",
    "music_length_ms": 90000,
    "force_instrumental": true
  }' \
  --output-dir ./out

Short brand jingle:

runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "5-second cheerful brand stinger, bright marimba and a single uplifting chord resolve, no vocals.",
    "music_length_ms": 5000,
    "force_instrumental": true
  }' \
  --output-dir ./out
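
The inline single-quoted JSON in the examples above breaks as soon as a lyric contains an apostrophe or a double quote. One way around that is to build the JSON from shell variables. The sketch below assumes single-line prompts and escapes only backslashes and double quotes; `json_escape` and `build_input` are hypothetical helpers, not CLI features.

```shell
# Hypothetical helpers for building the --input JSON from variables.

# Escape backslashes and double quotes for use inside a JSON string.
# Assumption: the prompt is a single line with no control characters.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Usage: build_input PROMPT LENGTH_MS INSTRUMENTAL(true|false)
build_input() {
  printf '{"prompt": "%s", "music_length_ms": %s, "force_instrumental": %s}\n' \
    "$(json_escape "$1")" "$2" "$3"
}

build_input "Calm lo-fi instrumental, 75 BPM. No vocals." 90000 true

# Possible use with the command shown above:
#   runcomfy run elevenlabs/elevenlabs/music-generation \
#     --input "$(build_input "$prompt" 60000 false)" \
#     --output-dir ./out
```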

Prompting tips

ElevenLabs Music reads one prompt field that carries both the style brief and the lyrics. Structure it well:

- Lead with the style brief: genre, mood, tempo (BPM), key instruments, vocal type. "Upbeat indie-pop anthem, bright electric guitars, 120 BPM, female lead vocal."
- Then the lyrics with section markers: [Intro], [Verse], [Chorus], [Bridge], [Outro]. Add approximate durations or bar counts, e.g. [Intro 8 bars], [Verse 16 bars].
- Keep lyrical meter consistent: even syllable counts per line, clear rhyme scheme. The model follows meter; sloppy meter produces awkward phrasing.
- Name lead instruments and mix priorities: "electric guitar carries the chorus, drums sit back in the verse."
- For instrumental, set force_instrumental: true AND say "no vocals" in the prompt (belt and suspenders).
- Multilingual: write the lyrics in the target language; annotate accent/language inline if needed ([Verse] (sung in Brazilian Portuguese) ...).
- Avoid contradictory style instructions: "aggressive metal" + "soft lullaby" in one prompt confuses the model. One coherent direction per call.
- Draft short, finalize long: validate the direction with a 30–45 s draft (music_length_ms: 35000) before paying for a 5-minute render.
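
The "style brief first, then marked sections" layout from the tips above can be assembled mechanically. This is only an illustrative sketch: `compose_prompt` is a hypothetical helper that joins its arguments with single spaces, and the lyric text is a placeholder.

```shell
# Hypothetical helper: the first argument is the style brief,
# every further argument is one section such as "[Verse] ...".
compose_prompt() {
  prompt=$1
  shift
  for section in "$@"; do
    prompt="$prompt $section"
  done
  printf '%s\n' "$prompt"
}

compose_prompt \
  "Upbeat indie-pop anthem, bright electric guitars, 120 BPM, female lead vocal." \
  "[Intro 8 bars] instrumental build." \
  "[Verse] Chalk on the palms, laces double-knotted, morning on the ridge." \
  "[Chorus] We rise, we strike, we never fade out."
```

Keeping each section as its own argument makes it easy to swap only the lyric lines while reusing the same style brief.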

Common patterns

- Theme song for a video: full brief + lyrics + [Intro]/[Verse]/[Chorus] structure, music_length_ms matched to the video length
- Podcast intro / outro: force_instrumental: true, 10–20 s, "loop-friendly, clean ending"
- Game background loop: force_instrumental: true, describe "seamless loop", 60–120 s, consistent groove
- Multilingual release (same song, multiple languages): one call per language, identical style brief, swap only the lyric lines
- Iterate then commit: draft at music_length_ms: 35000 to lock genre/tempo/structure, then final render at full length

Limitations

- One prompt field carries everything (style + lyrics). There is no separate "lyrics" parameter.
- 5 s – 5 min per call (music_length_ms 5000–300000). For longer pieces, generate sections and stitch externally.
- Cost scales with duration: a 5-minute render is ~10× a 30-second one.
- force_instrumental is the only vocal toggle: you can't request specific voice identities or clone a singer through this endpoint.
- This skill pins ElevenLabs Music specifically. Sound effects, text-to-speech, and voice cloning are separate ElevenLabs capabilities that this endpoint does not expose.
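
For pieces longer than the 5-minute cap, the number of calls and a per-call length can be worked out up front. The sketch below assumes a target of at least 5000 ms and splits it evenly; `segments_needed` and `segment_length_ms` are hypothetical helpers, and joining the rendered files is left to whatever audio tool you already use.

```shell
# Per-call ceiling from the limits above (5 minutes).
MAX_LENGTH_MS=300000

# Number of calls needed to cover total_ms (ceiling division).
segments_needed() {
  echo $(( ($1 + MAX_LENGTH_MS - 1) / MAX_LENGTH_MS ))
}

# Even split, rounded up, so the segments together cover at least total_ms.
# Assumes total_ms >= 5000, which keeps every segment above the 5 s minimum.
segment_length_ms() {
  n=$(segments_needed "$1")
  echo $(( ($1 + n - 1) / n ))
}

# Example: a 12-minute piece (720000 ms).
echo "calls: $(segments_needed 720000), each: $(segment_length_ms 720000) ms"
```

An even split avoids a very short final segment, which could otherwise fall under the 5-second minimum.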

Exit codes

code	meaning
0	success
64	bad CLI args
65	bad input JSON / schema mismatch
69	upstream 5xx
75	retryable: timeout / 429
77	not signed in or token rejected

Full reference: docs.runcomfy.com/cli/troubleshooting.
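
A wrapper script can branch on these codes. The following is a sketch, assuming that only exit code 75 is worth retrying; `describe_exit`, `run_with_retry`, and the `RETRY_DELAY_SECONDS` variable are hypothetical names, and the fixed delay is a placeholder rather than a recommended backoff policy.

```shell
# Hypothetical helpers built on the exit-code table above.

describe_exit() {
  case "$1" in
    0)  echo "success" ;;
    64) echo "bad CLI args" ;;
    65) echo "bad input JSON / schema mismatch" ;;
    69) echo "upstream 5xx" ;;
    75) echo "retryable: timeout / 429" ;;
    77) echo "not signed in or token rejected" ;;
    *)  echo "unknown exit code $1" ;;
  esac
}

# Usage: run_with_retry MAX_ATTEMPTS COMMAND [ARGS...]
# Re-runs the command only while it exits with 75; returns its last status.
run_with_retry() {
  max_attempts=$1
  shift
  attempt=1
  while :; do
    if "$@"; then status=0; else status=$?; fi
    if [ "$status" -ne 75 ] || [ "$attempt" -ge "$max_attempts" ]; then
      return "$status"
    fi
    sleep "${RETRY_DELAY_SECONDS:-5}"
    attempt=$((attempt + 1))
  done
}

# Possible use:
#   run_with_retry 3 runcomfy run elevenlabs/elevenlabs/music-generation \
#     --input '{"prompt": "..."}' --output-dir ./out
#   describe_exit "$?"
```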

How it works

The skill invokes runcomfy run elevenlabs/elevenlabs/music-generation with the JSON body. The CLI POSTs to the RunComfy Model API, polls request status, fetches the result, and downloads the generated audio file into --output-dir. Ctrl-C cancels the remote request before exit.

Security & Privacy

- Install via verified package manager only. Use npm i -g @runcomfy/cli or npx -y @runcomfy/cli. Agents must not pipe an arbitrary remote install script into a shell on the user's behalf; if the operator wants the curl-pipe path documented at docs.runcomfy.com/cli/install, they should review the script first.
- Token storage: runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600. Set RUNCOMFY_TOKEN env var to bypass the file in CI / containers. Never echo the token into a prompt, log it, or check it in.
- Input boundary (shell injection): the prompt is passed as a JSON string via --input. The CLI does not shell-expand prompt content; it transmits the JSON body directly to the Model API over HTTPS. No shell-injection surface from prompt content, even with backticks, quotes, or $(...) patterns.
- Lyrics provenance: if the user supplies lyrics, confirm they have the rights to them. Generating music around copyrighted lyrics is the operator's responsibility; the skill does not check.
- Outbound endpoints (allowlist): only model-api.runcomfy.net (request submission) and *.runcomfy.net / *.runcomfy.com (download whitelist for generated audio). No telemetry, no callbacks.
- Generated-file size cap: the CLI aborts any single download > 2 GiB.
- Scope of bash usage: the skill only invokes runcomfy <subcommand>; the npm / npx lines are one-time operator setup, not commands the skill executes per call.
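
The token-storage note above says the token file is written with mode 0600. A quick way to confirm that on a given machine is to look at the permission string that `ls -ld` reports. This is a sketch only; `is_owner_only` is a hypothetical helper, and it inspects file permissions without ever reading the token itself.

```shell
# Hypothetical check: succeeds only if the path is a regular file
# readable and writable by its owner and by nobody else (mode 0600).
is_owner_only() {
  perms=$(ls -ld "$1" 2>/dev/null | cut -c1-10)
  [ "$perms" = "-rw-------" ]
}

# Possible use:
#   is_owner_only "$HOME/.config/runcomfy/token.json" \
#     || echo "token file is missing or has loose permissions" >&2
```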