Media Skills
Music, voice, and image generation for OpenClaw
Media skills add music control, text-to-speech, and image generation to OpenClaw. Install from ClawHub with openclaw skills install <skill-name>.
Image generation skills in this category call external image APIs (DALL·E, Stable Diffusion hosts, etc.). The agent can generate or edit images when you ask in chat, which is useful for thumbnails, mockups, and social posts. Pair with Content creation for workflow ideas.
Run openclaw skills install <skill-name>, then confirm with openclaw skills list. Voice-specific setup (wake word, TTS) is covered on the Voice page and in the Voice setup tutorial.
Productivity - Calendar, Gmail, Todoist.
Dev and Infrastructure - GitHub, Docker, n8n.
Research - Web search, deep research.
Media - Spotify, Sonos, voice.
Channels - Slack, Discord, WhatsApp.
All Skills - Full list.
Media skills need OAuth credentials or LAN tokens. Store them via Secrets rather than in plain configuration. Test on a private channel before enabling playback in public rooms.
Sonos and LAN TTS skills fail when the Gateway runs on a remote VPS but the speakers are at home, because the Gateway cannot reach devices on your home network. Run those skills on a Gateway that can reach your LAN, or use cloud TTS instead.
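One way to check whether a given host can reach a LAN speaker is to attempt a plain TCP connection from the machine where the Gateway runs. The sketch below is not part of OpenClaw; the function name can_reach, the address 192.168.1.50, and the use of port 1400 (commonly used by Sonos devices for their local HTTP service) are assumptions for illustration only.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False


if __name__ == "__main__":
    # Hypothetical speaker address; replace with your own device's LAN IP.
    speaker_host = "192.168.1.50"
    # Assumption: the speaker exposes a local service on port 1400.
    speaker_port = 1400
    if can_reach(speaker_host, speaker_port):
        print("Speaker reachable from this host.")
    else:
        print("Speaker not reachable; consider a LAN-side Gateway or cloud TTS.")
```

Run this on the same machine as the Gateway. If it reports the speaker as unreachable from a VPS but reachable from a machine at home, the network path is the likely cause rather than the skill itself.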