OpenClaw supports rich media across all channels. You can send and receive images, audio files, videos, and documents. The media pipeline handles processing, transcription, and temporary storage.

Image Support

Moltbot can handle images in multiple ways:

  • Send Images - Share photos and screenshots
  • Receive Images - Process and analyze images
  • Image Analysis - Extract text, identify objects, describe scenes
  • Camera Access - Take photos via iOS/Android nodes
  • Screenshots - Capture and share screenshots

Image Processing

Images are processed to:

  • Extract text (OCR)
  • Describe content
  • Identify objects and scenes
  • Analyze for context

Audio Support

Audio handling includes:

  • Voice Notes - Send and receive voice messages
  • Audio Files - Process audio files
  • Transcription - Automatic transcription of audio
  • Voice Interaction - Voice wake and talk mode

Transcription Hooks

Configure transcription hooks to automatically transcribe voice notes:

Transcription Config
{
  "hooks": {
    "transcription": {
      "enabled": true,
      "provider": "whisper"
    }
  }
}

Transcribed text is processed like a regular message, so you can interact by voice.
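
The flow from voice note to regular message can be sketched as follows. This is a minimal illustration, not the project's actual hook API: the `VoiceNote` and `TextMessage` types, the `PROVIDERS` registry, and the `fake_whisper` stand-in are all hypothetical names introduced here. Only the `hooks.transcription.enabled` and `provider` keys come from the config above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Same shape as the Transcription Config example.
CONFIG = {"hooks": {"transcription": {"enabled": True, "provider": "whisper"}}}


@dataclass
class VoiceNote:  # hypothetical inbound voice message
    sender: str
    audio: bytes


@dataclass
class TextMessage:  # hypothetical regular text message
    sender: str
    text: str


Transcriber = Callable[[bytes], str]


def fake_whisper(audio: bytes) -> str:
    """Stand-in for a real speech-to-text call; returns placeholder text."""
    return f"[transcript of {len(audio)} bytes]"


# Hypothetical registry mapping a provider name to a transcriber function.
PROVIDERS: dict[str, Transcriber] = {"whisper": fake_whisper}


def apply_transcription_hook(
    note: VoiceNote, config: dict = CONFIG
) -> Optional[TextMessage]:
    """Return the voice note as a text message, or None if the hook is off."""
    hook = config.get("hooks", {}).get("transcription", {})
    if not hook.get("enabled", False):
        return None
    transcribe = PROVIDERS[hook["provider"]]
    return TextMessage(sender=note.sender, text=transcribe(note.audio))
```

When the hook is disabled the function returns `None`, so the caller can decide how to handle the untranscribed voice note.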

Video Support

Video handling capabilities:

  • Send Videos - Share video files
  • Receive Videos - Process video content
  • Video Analysis - Extract frames, analyze content
  • Size Limits - Configurable size caps

Media Pipeline

The media pipeline handles:

  • Upload - Receiving media from channels
  • Storage - Temporary file storage
  • Processing - Transcription, analysis, extraction
  • Cleanup - Automatic temp file lifecycle management
  • Size Management - Enforce size limits
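
As a rough sketch of how these stages could fit together, the example below checks the size, writes the upload to a temporary directory, runs a processing step, and lets the directory clean itself up. The `run_pipeline` function, the `process` callback, and the 10 MB cap are assumptions made for illustration and do not describe the actual implementation.

```python
import tempfile
from pathlib import Path
from typing import Callable

MAX_BYTES = 10 * 1024 * 1024  # assumption: a 10 MB cap, counted in binary units


def run_pipeline(data: bytes, suffix: str, process: Callable[[Path], str]) -> str:
    """Upload -> size check -> temp storage -> processing -> cleanup."""
    # Size management: reject oversized media before writing anything to disk.
    if len(data) > MAX_BYTES:
        raise ValueError(f"media is {len(data)} bytes, limit is {MAX_BYTES}")

    # Storage and cleanup: the temporary directory is deleted when the
    # `with` block exits, even if the processing step raises.
    with tempfile.TemporaryDirectory(prefix="media-") as tmp:
        path = Path(tmp) / f"upload{suffix}"
        path.write_bytes(data)
        # Processing: hand the stored file to a caller-supplied step
        # (transcription, analysis, or extraction).
        return process(path)
```

Tying cleanup to a context manager means a failed processing step cannot leave temp files behind.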

Size Limits

Configure media size limits:

Size Limits
{
  "media": {
    "maxSize": "10MB",
    "imageMaxSize": "5MB",
    "audioMaxSize": "10MB",
    "videoMaxSize": "50MB"
  }
}
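
A check against these limits might look like the sketch below. It rests on two assumptions that the config does not confirm: size strings use binary units (1 MB = 1024 * 1024 bytes), and a per-type key such as `videoMaxSize` takes precedence over `maxSize`, which serves as the fallback for types without their own key. The helper names `parse_size`, `limit_for`, and `is_allowed` are hypothetical.

```python
import re

# Same values as the Size Limits example.
MEDIA_CONFIG = {
    "media": {
        "maxSize": "10MB",
        "imageMaxSize": "5MB",
        "audioMaxSize": "10MB",
        "videoMaxSize": "50MB",
    }
}

# Assumption: binary units.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_size(text: str) -> int:
    """Convert a size string such as "10MB" into a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMG]?B)\s*", text.upper())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


def limit_for(kind: str, config: dict = MEDIA_CONFIG) -> int:
    """Return the cap in bytes for a media kind, falling back to maxSize."""
    media = config["media"]
    return parse_size(media.get(f"{kind}MaxSize", media["maxSize"]))


def is_allowed(kind: str, size_bytes: int, config: dict = MEDIA_CONFIG) -> bool:
    """True if a file of this kind and size fits within its configured cap."""
    return size_bytes <= limit_for(kind, config)
```

Under these assumptions a 40 MB video passes, a 6 MB image is rejected, and a document falls back to the general 10 MB cap.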

Channel-Specific Media

WhatsApp

  • Images, audio, video, documents
  • Voice notes with transcription
  • Location sharing

Telegram

  • Rich media support
  • Voice notes
  • Photo albums
  • File sharing

Discord

  • Images, audio, video
  • File attachments
  • Rich embeds

Media Tools

Moltbot provides tools for media handling:

  • Camera - Take photos via nodes
  • Images - Process and analyze images
  • Audio - Handle audio files and transcription
  • Location - Send and receive location data

Best Practices

  • Size Limits - Configure appropriate size limits
  • Storage Management - Monitor temp file storage
  • Transcription - Enable transcription for voice notes
  • Privacy - Be aware that media is processed by the LLM provider
  • Cleanup - Ensure temp files are cleaned up
