Free local speech-to-text transcription using OpenAI Whisper. Transcribe audio files (mp3, wav, m4a, ogg, etc.) to text without API costs. Use when: (1) User...
Source: ClawHub. View on ClawSkills.
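Method 1: Install from the command line (recommended)
Recommended (no need to install clawhub beforehand):
npx clawhub@latest --dir ~/.claude/skills install whisper-stt
Or use the clawhub CLI (must be installed beforehand):
clawhub --dir ~/.claude/skills install whisper-stt
⚠️ Requires Node.js 18+. If you don't have Node, use Method 2 below to download the ZIP directly.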
Method 2: Manual download and install (no Node required)
Download the ZIP, unzip it, place the folder at the path below, and restart the agent for it to take effect:
Install path:
~/.claude/skills/whisper-stt/
---
name: whisper-stt
description: |
  Free local speech-to-text transcription using OpenAI Whisper. Transcribe audio files (mp3, wav, m4a, ogg, etc.) to text without API costs.
  Use when: (1) User needs audio/video transcription, (2) Converting voice memos to text, (3) Generating subtitles (SRT/VTT), (4) Free local STT without cloud API costs.
---
Free, local speech-to-text using OpenAI Whisper.
Install dependencies (one-time setup):
pip install openai-whisper torch
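Optional: Install ffmpeg for broader format support:
macOS (Homebrew):
brew install ffmpeg
Debian/Ubuntu:
sudo apt install ffmpeg
Basic usage:
python ~/.openclaw/skills/whisper-stt/scripts/transcribe.py <audio_file>
If the skill is installed under a different directory (for example ~/.claude/skills/whisper-stt/), adjust the script path accordingly.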
| Option | Description |
|--------|-------------|
| --model | Model size: tiny, base, small, medium, large, large-v3-turbo (default: base) |
| --language, -l | Language code: zh, en, ja, etc. (auto-detect if not specified) |
| --output, -o | Output format: json, txt, srt, vtt (default: json) |
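The script's source is not shown here, so the following is only a sketch of how a command-line interface with the options in the table could be declared with `argparse`. The function name `build_parser` and the help strings are assumptions made for illustration; only the flag names and defaults come from the table.

```python
import argparse

# Output formats listed in the options table.
FORMATS = ["json", "txt", "srt", "vtt"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real script)."""
    parser = argparse.ArgumentParser(
        description="Transcribe an audio file with Whisper (illustrative sketch)"
    )
    parser.add_argument("audio_file", help="Path to the audio or video file")
    # Model names are deliberately not restricted with `choices`, because the
    # examples in this document also pass names outside the table's list.
    parser.add_argument("--model", default="base", help="Whisper model size")
    # None means "let Whisper auto-detect the language".
    parser.add_argument("--language", "-l", default=None, help="Language code, e.g. zh")
    parser.add_argument("--output", "-o", choices=FORMATS, default="json",
                        help="Output format")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["recording.m4a", "--language", "zh", "--output", "txt"])
print(args.audio_file, args.model, args.language, args.output)
```

Omitting `--language` leaves it as `None`, which corresponds to the auto-detect behaviour described in the table.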
Chinese audio to text:
python ~/.openclaw/skills/whisper-stt/scripts/transcribe.py recording.m4a --language zh --output txt
Generate subtitles (SRT):
python ~/.openclaw/skills/whisper-stt/scripts/transcribe.py video.mp4 --output srt > subtitles.srt
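For readers curious what the SRT output involves, here is a minimal sketch of turning timed segments into SubRip text. It assumes each segment is a dict with `start`, `end` (seconds), and `text` keys, which is the shape Whisper's Python API returns in `result["segments"]`. The helper names `srt_timestamp` and `segments_to_srt` are hypothetical, and the skill's own script may format its output differently.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)


example = [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 5.0, "text": " This is a test."},
]
print(segments_to_srt(example))
```

SRT uses a comma before the milliseconds, while WebVTT uses a period, so a VTT variant would need a slightly different timestamp function and a `WEBVTT` header line.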
Use faster model:
python ~/.openclaw/skills/whisper-stt/scripts/transcribe.py audio.mp3 --model tiny --output txt
High accuracy (slower):
python ~/.openclaw/skills/whisper-stt/scripts/transcribe.py audio.mp3 --model large-v3 --output txt
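The commands above wrap the `openai-whisper` Python package. As a rough sketch of what such a wrapper might do internally, the function below loads a model and returns the transcribed text using `whisper.load_model` and `model.transcribe`. The function `transcribe_file` and its `load_model` parameter are assumptions for illustration and are not taken from the skill's script; the parameter exists only so the function can be exercised without downloading a model.

```python
from typing import Callable, Optional


def transcribe_file(
    path: str,
    model_name: str = "base",
    language: Optional[str] = None,
    load_model: Optional[Callable] = None,
) -> str:
    """Transcribe one file and return plain text (illustrative sketch)."""
    if load_model is None:
        # Imported lazily so this module can be loaded without the package.
        import whisper  # provided by `pip install openai-whisper`
        load_model = whisper.load_model
    model = load_model(model_name)
    # language=None asks Whisper to detect the language automatically.
    result = model.transcribe(path, language=language)
    return result["text"].strip()


# Example call (requires openai-whisper and a real audio file):
# print(transcribe_file("recording.m4a", model_name="base", language="zh"))
```

The first call with a given model name downloads the model weights, so it can take noticeably longer than later runs.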
| Model | Speed | Accuracy | VRAM/RAM | Best For |
|-------|-------|----------|----------|----------|
| tiny | ~32x | Basic | ~1GB | Quick tests, low resource |
| base | ~16x | Good | ~1GB | Balanced speed/accuracy |
| small | ~6x | Better | ~2GB | Better accuracy |
| medium | ~2x | Very Good | ~5GB | High accuracy |
| large | 1x | Excellent | ~10GB | Best quality |
| large-v3-turbo | ~8x | Excellent | ~6GB | Fast + accurate (recommended) |
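The memory column can be used to pick a model programmatically. The sketch below encodes the approximate figures from the table and returns the first model that fits a given memory budget. The preference order (large-v3-turbo first, then progressively smaller models) is an assumption based on the table's "recommended" note, and `pick_model` is a hypothetical helper, not part of the skill.

```python
# (model name, approximate memory in GB) in assumed order of preference,
# using the rough figures from the model table.
CANDIDATES = [
    ("large-v3-turbo", 6.0),
    ("medium", 5.0),
    ("small", 2.0),
    ("base", 1.0),
]


def pick_model(available_gb: float) -> str:
    """Return the most preferred model whose memory estimate fits the budget."""
    for name, needed_gb in CANDIDATES:
        if available_gb >= needed_gb:
            return name
    # Nothing fits comfortably: fall back to the smallest model.
    return "tiny"


for budget in (0.5, 2.5, 5.0, 8.0):
    print(budget, "GB ->", pick_model(budget))
```

The plain `large` model is left out of the candidate list because the table recommends large-v3-turbo as the faster option with comparable accuracy; add it at the front if best quality matters more than speed.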
"ModuleNotFoundError: No module named 'whisper'" → Run: pip install openai-whisper torch
"ffmpeg not found" → Install ffmpeg or convert audio to WAV format first
Slow transcription → Use a smaller model (tiny/base) or make sure a GPU is available (Apple Silicon MPS, NVIDIA CUDA)
Poor accuracy on Chinese → Pass --language zh explicitly and consider a larger model (medium/large)
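Most of the troubleshooting items above come down to three questions: is the `whisper` module importable, is `ffmpeg` on the PATH, and is a GPU backend available? The following sketch checks all three with the standard library plus an optional `torch` import. The function name `check_environment` and the report keys are assumptions for illustration.

```python
import importlib.util
import shutil


def check_environment() -> dict:
    """Collect basic facts relevant to the troubleshooting items (sketch)."""
    report = {
        # True if a module named `whisper` can be found (openai-whisper installs one).
        "whisper_installed": importlib.util.find_spec("whisper") is not None,
        # True if an `ffmpeg` executable is on the PATH.
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        # Best available compute device; stays "cpu" if torch is missing.
        "device": "cpu",
    }
    try:
        import torch
    except ImportError:
        return report
    if torch.cuda.is_available():
        report["device"] = "cuda"
    else:
        # Older torch builds may not have the MPS backend attribute at all.
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            report["device"] = "mps"
    return report


print(check_environment())
```

A report showing `"device": "cpu"` with a large model selected is a likely explanation for slow transcription.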
Powered by OpenAI Whisper - open source speech recognition.
After installing Whisper STT, you can trigger it by saying things like the following to the AI:
Help me get started with Whisper STT
Explains what Whisper STT does, walks through the setup, and runs a quick demo based on your current project
Use Whisper STT to transcribe an audio file locally with OpenAI Whisper
Invokes Whisper STT with the right parameters and returns the result directly in the conversation
What can I do with Whisper STT in my design & creative workflow?
Lists the top use cases for Whisper STT, with example commands for each scenario
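Place the skill folder in ~/.claude/skills/whisper-stt/ (personal level, available to all projects) or in .claude/skills/whisper-stt/ (project level). After restarting the AI client, invoke it explicitly with /whisper-stt, or let the AI discover and use it automatically based on context.
Whisper STT supports Claude, Cursor, and OpenClaw, and integrates with these AI platforms to extend their capabilities.
Whisper STT is free to install and use. See the repository for license information.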
Whisper STT belongs to the "Design & Creative" category. Skills in this category help AI agents carry out specialized tasks in that domain.
Automate my design & creative tasks using Whisper STT
Identifies repetitive steps in your workflow and sets up Whisper STT to handle them automatically