Audio Content Generator

audio-gen

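Generate audiobooks, podcasts, or educational audio content on demand. The user provides an idea or topic, Claude AI writes a script, and ElevenLabs converts it to high-quality audio. Supports multiple formats (audiobook, podcast, educational), custom lengths, and voice effects. Use when asked to create audio content.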

Source: ClawHub. View on ClawSkills.

2.6k downloads

1 favorite

1 view

About Audio Content Generator

---
name: audio-gen
description: Generate audiobooks, podcasts, or educational audio content on demand. User provides an idea or topic, Claude AI writes a script, and ElevenLabs converts it to high-quality audio. Supports multiple formats (audiobook, podcast, educational), custom lengths, and voice effects. Use when asked to create audio content, make a podcast, generate an audiobook, or produce educational audio. Returns MP3 audio file via MEDIA token.
homepage: https://github.com/clawdbot/clawdbot
metadata: {"clawdbot":{"emoji":"🎙️","requires":{"skills":["sag"],"env":["ANTHROPIC_API_KEY","ELEVENLABS_API_KEY"]},"primaryEnv":"ANTHROPIC_API_KEY"}}
---

🎙️ Audio Content Generator

Generate high-quality audiobooks, podcasts, or educational audio content on demand using AI-written scripts and ElevenLabs text-to-speech.

Quick Start

Create an audiobook chapter:

User: "Create a 5-minute audiobook chapter about a dragon discovering friendship"

Generate a podcast:

User: "Make a 10-minute podcast about the history of coffee"

Produce educational content:

User: "Generate a 15-minute educational audio explaining how neural networks work"

Content Formats

Audiobook

Style: Narrative storytelling with emotional depth

Clear beginning, middle, and end
Descriptive language and vivid imagery
Dramatic pacing with thoughtful pauses
Emotional tone that matches the story
Use voice effects like [whispers], [excited], [serious] for impact

Example Structure:

[Opening hook - set the scene]
[long pause]

[Story development with character emotions]
[short pause] between sentences
[long pause] between paragraphs

[Climax with dramatic tension]
[long pause]

[Resolution and emotional closure]

Podcast

Style: Conversational and engaging

Warm, welcoming intro (15-30 seconds)
Main content with natural flow
Transitions between topics
Memorable outro with key takeaways
Conversational tone throughout

Example Structure:

**Intro:** "Welcome to [topic]. I'm excited to share..."
[short pause]

**Main Content:** "Let's start with... [topic 1]"
[long pause] between segments

**Outro:** "Thanks for listening! Remember..."

Educational Content

Style: Clear explanations for learning

Simple introductions to complex topics
Step-by-step breakdowns
Real-world examples and analogies
Recap of key concepts at the end
Enthusiastic delivery with [excited] for important points

Example Structure:

**Introduction:** What is [topic] and why it matters?

**Main Content:**
- Concept 1: Explanation + Example
- Concept 2: Explanation + Example
- Concept 3: Explanation + Example

**Summary:** Key takeaways and next steps

Length Guidelines

Word Count to Duration Conversion:

5 minutes = ~375 words
10 minutes = ~750 words
15 minutes = ~1,125 words
20 minutes = ~1,500 words
30 minutes = ~2,250 words

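Pacing: All estimates here assume ~75 words per minute. Typical conversational speech is closer to 150 words per minute, so generated audio may run shorter than estimated.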

Practical Limits:

Minimum: 2 minutes (~150 words)
Maximum: 30 minutes (~2,250 words)
Sweet spot: 5-15 minutes for best engagement

Workflow Instructions

Step 1: Understand the Request

Parse the user's request for:

Content type (audiobook, podcast, educational, or inferred from topic)
Topic/theme (what should the content be about)
Target length (how many minutes)
Tone/style (dramatic, casual, educational, etc.)
Special requests (specific voice, emphasis on certain points)

Step 2: Calculate Word Count

target_words = target_minutes × 75

Example: 10 minutes = 10 × 75 = 750 words
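The conversion above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the function names are hypothetical, and the 2 to 30 minute bounds are taken from the Practical Limits listed earlier.

```python
WORDS_PER_MINUTE = 75  # pacing figure used throughout this guide
MIN_MINUTES = 2        # practical minimum from the Length Guidelines
MAX_MINUTES = 30       # practical maximum from the Length Guidelines


def target_word_count(target_minutes: float) -> int:
    """Return the script word budget for a requested duration."""
    if not MIN_MINUTES <= target_minutes <= MAX_MINUTES:
        raise ValueError(
            f"target_minutes must be between {MIN_MINUTES} and {MAX_MINUTES}"
        )
    return round(target_minutes * WORDS_PER_MINUTE)


def estimated_minutes(word_count: int) -> float:
    """Convert a word count back into an estimated duration in minutes."""
    return word_count / WORDS_PER_MINUTE
```

For example, `target_word_count(10)` returns `750`, matching the worked example above.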

Step 3: Generate the Script

Write the complete script following these rules:

Content Guidelines:

Start strong with an engaging hook
Maintain natural, conversational flow
Use active voice and simple sentence structure
Include relevant examples and stories
End with a satisfying conclusion

Formatting Rules:

Add [short pause] after sentences (use sparingly, not every sentence)
Add [long pause] between paragraphs or major sections
Use voice effects strategically: [whispers], [shouts], [excited], [serious], [sarcastic], [sings], [laughs]
Write numbers as words: "twenty-three" not "23"
Spell out acronyms first time: "AI, or artificial intelligence"
Avoid complex punctuation (em-dashes work, but semicolons don't read well)
Remove markdown formatting before TTS conversion

Step 4: Present the Script

Show the script to the user and ask:

Here's the [format] script I've created (approximately [length] minutes):

[Display the script]

Would you like me to:
1. Generate the audio now
2. Make changes to the script
3. Adjust the length or tone

Step 5: Handle User Feedback

If user requests changes:

Regenerate the script with adjustments
Maintain the target word count
Present the revised version

If user approves:

Proceed to audio generation

Step 6: Generate Audio

Format the script for TTS:

Remove any remaining markdown (headers, bold, italics)
Ensure voice effects are in proper [effect] format
Check that pauses are appropriately placed
Verify numbers and acronyms are spelled out
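As a rough sketch of this cleanup pass, the helpers below strip common markdown markers and flag numerals that still need to be spelled out. The function names are hypothetical and the regular expressions cover only headers, bullets, bold, and italics. They are an assumption about what "remaining markdown" means here, not the skill's actual implementation.

```python
import re


def strip_markdown(script: str) -> str:
    """Remove common markdown markers while leaving [effect] tags intact."""
    # Headers such as "# Title" or "### Section"
    text = re.sub(r"^[ \t]{0,3}#{1,6}[ \t]+", "", script, flags=re.MULTILINE)
    # Bullet markers at the start of a line ("- item" or "* item")
    text = re.sub(r"^[ \t]*[-*][ \t]+", "", text, flags=re.MULTILINE)
    # Bold (**text** or __text__), then italics (*text* or _text_)
    text = re.sub(r"(\*\*|__)(.+?)\1", r"\2", text)
    text = re.sub(r"(\*|_)(.+?)\1", r"\2", text)
    return text


def find_numerals(script: str) -> list[str]:
    """Return numerals that should be rewritten as words before TTS."""
    return re.findall(r"\d+(?:[.,]\d+)*", script)
```

Running `find_numerals` after `strip_markdown` gives a quick checklist of spots to rewrite by hand, for example "23" to "twenty-three".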

Invoke the TTS script:

IMPORTANT: The ELEVENLABS_API_KEY environment variable is already configured in the system. Simply invoke the TTS script directly.

uv run /home/clawdbot/clawdbot/skills/sag/scripts/tts.py \
  -o /tmp/audio-gen-[timestamp]-[topic-slug].mp3 \
  -m eleven_multilingual_v2 \
  "[formatted_script]"

For long scripts, use heredoc:

uv run /home/clawdbot/clawdbot/skills/sag/scripts/tts.py \
  -o /tmp/audio-gen-[timestamp]-[topic-slug].mp3 \
  -m eleven_multilingual_v2 \
  "$(cat <<'EOF'
[formatted_script]
EOF
)"
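If the command is assembled from Python rather than typed into a shell, a sketch like the following may help. It builds the output path and the argument list used above. The helper names, the Unix-epoch timestamp, and the slug rules are assumptions for illustration. Only the script path, the `-o` and `-m` flags, and the model name come from the commands shown here.

```python
import re
import time
from typing import List, Optional, Tuple

# Path and model name are copied from the commands above.
TTS_SCRIPT = "/home/clawdbot/clawdbot/skills/sag/scripts/tts.py"
MODEL = "eleven_multilingual_v2"


def slugify(topic: str, max_len: int = 40) -> str:
    """Turn a topic into a lowercase, hyphen-separated file name fragment."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "audio"


def build_tts_command(
    topic: str, script: str, timestamp: Optional[int] = None
) -> Tuple[List[str], str]:
    """Return (argument list, output path) for one TTS invocation."""
    # Assumption: [timestamp] is filled with Unix epoch seconds.
    ts = int(time.time()) if timestamp is None else timestamp
    output_path = f"/tmp/audio-gen-{ts}-{slugify(topic)}.mp3"
    command = ["uv", "run", TTS_SCRIPT, "-o", output_path, "-m", MODEL, script]
    return command, output_path
```

Passing the returned list to `subprocess.run(command, check=True)` hands the script text over as a single argument without shell quoting, so the heredoc workaround is only needed when invoking from a shell.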

Return the result:

MEDIA:/tmp/audio-gen-[timestamp]-[topic-slug].mp3

Your [format] is ready! [Brief description of content]. Duration: approximately [X] minutes.

Voice Effects (Bracketed Tags)

Available voice modulation effects (use sparingly for impact):

[whispers] - Soft, intimate delivery
[shouts] - Loud, emphatic delivery
[excited] - Enthusiastic, energetic tone
[serious] - Grave, solemn tone
[sarcastic] - Ironic, mocking tone
[sings] - Musical, melodic delivery
[laughs] - Amused, jovial tone
[short pause] - Brief silence (~0.5s)
[long pause] - Extended silence (~1-2s)

Best Practices:

Use effects for emotional moments, not every sentence
Pauses are your most powerful tool for pacing
Voice effects work best in audiobooks and dramatic content
Keep podcasts and educational content mostly natural
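A small check like the one below can catch misspelled or unsupported tags before audio generation. It is a sketch under the assumption that the nine tags listed above are the only valid ones. The function name is hypothetical.

```python
import re

# The effects and pauses listed in this section.
ALLOWED_TAGS = {
    "whispers", "shouts", "excited", "serious", "sarcastic",
    "sings", "laughs", "short pause", "long pause",
}


def unknown_tags(script: str) -> list[str]:
    """Return bracketed tags in the script that are not in ALLOWED_TAGS."""
    found = re.findall(r"\[([^\[\]]+)\]", script)
    return sorted({tag for tag in found if tag.strip().lower() not in ALLOWED_TAGS})
```

An empty result means every bracketed tag in the script matches the list above.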

Error Handling

Script Too Long

If the generated script exceeds target by >20%:

The script I generated is [X] words ([Y] minutes), which is longer than your target of [Z] minutes. Would you like me to:
1. Condense it to fit the target length
2. Split it into multiple parts
3. Keep it as is

Script Too Short

If the generated script is under target by >20%:

The script is [X] words ([Y] minutes), shorter than your target. Would you like me to:
1. Expand it with more detail
2. Add additional examples or stories
3. Generate as is
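The two length checks above share one rule: compare the actual word count against the target and flag anything more than 20% off. A minimal sketch follows, with a hypothetical function name. Excluding bracketed tags from the word count is an assumption, since the guide does not say whether they count.

```python
import re

WORDS_PER_MINUTE = 75  # pacing figure used throughout this guide
TOLERANCE = 0.20       # the >20% threshold from the two cases above


def check_length(script: str, target_minutes: float) -> str:
    """Return 'too_long', 'too_short', or 'ok' for a script."""
    # Assumption: [effect] and [pause] tags are not spoken, so drop them first.
    spoken = re.sub(r"\[[^\[\]]+\]", " ", script)
    actual_words = len(spoken.split())
    target_words = target_minutes * WORDS_PER_MINUTE
    if actual_words > target_words * (1 + TOLERANCE):
        return "too_long"
    if actual_words < target_words * (1 - TOLERANCE):
        return "too_short"
    return "ok"
```

The return value selects which of the two messages above to show, or lets the workflow continue when the result is `"ok"`.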

TTS Generation Fails

If the TTS script fails:

I've created the script, but I'm unable to generate the audio right now. Here's your script:

[Display script]

Error: [specific error message]

You can:
1. Check that ELEVENLABS_API_KEY is configured
2. Use the script with your own text-to-speech tool
3. Try again in a moment
4. Ask me to troubleshoot the audio generation

Common TTS Issues:

API key not set: Verify ELEVENLABS_API_KEY in config
Rate limit: Wait a moment and try again
Text too long: Break into smaller chunks (max ~5000 characters)
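For the "text too long" case, one possible approach is to split the script at sentence boundaries so that no chunk exceeds the limit. The sketch below assumes the ~5000 character figure above and uses a hypothetical function name. It normalizes whitespace between sentences, and it does not cover generating or joining the resulting audio files.

```python
import re

MAX_CHARS = 5000  # approximate per-request limit noted above


def chunk_script(script: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split a script into chunks of at most max_chars, breaking after sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent through the TTS script separately.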

...

Prompt Examples

After installing Audio Content Generator, you can say things like these to the AI to trigger it:

User: "Help me get started with Audio Content Generator"

A: Explains what Audio Content Generator does, walks through the setup, and runs a quick demo based on your current project

User: "Use Audio Content Generator to generate audiobooks, podcasts, or educational audio content on demand"

A: Invokes Audio Content Generator with the right parameters and returns the result directly in the conversation

User: "What can I do with Audio Content Generator in my design & creative workflow?"

A: Lists the top use cases for Audio Content Generator, with example commands for each scenario

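FAQ

How do I install Audio Content Generator?

Place the skill folder in ~/.claude/skills/audio-gen/ (personal level, available to all projects) or in .claude/skills/audio-gen/ (project level). After restarting the AI client, invoke it explicitly with /audio-gen, or let the AI discover and use it automatically based on context.

Which AI platforms does Audio Content Generator support?

Audio Content Generator supports Claude, Cursor, and OpenClaw, and integrates with these platforms to extend their capabilities.

Is Audio Content Generator free?

Audio Content Generator is free to install and use. Check the repository for license information.

What does Audio Content Generator do?

It generates audiobooks, podcasts, or educational audio content on demand. The user provides an idea or topic, Claude AI writes a script, and ElevenLabs converts it to high-quality audio. It supports multiple formats (audiobook, podcast, educational), custom lengths, and voice effects.

Which category does Audio Content Generator belong to?

Audio Content Generator belongs to the "Design & Creative" category, whose skills help AI agents carry out specialized tasks in this domain.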

Use Cases

Getting Started with Audio Content Generator
Automate Design & Creative Workflows with Audio Content Generator
Team Collaboration with Audio Content Generator
