Status: active, v1.0.4

OpenClaw-Discord-Voice

Talk to a local OpenClaw agent through a Discord voice channel.

Join a voice channel and speak one turn. The bridge transcribes it locally with Whisper, hands the text to your local OpenClaw session, and plays the reply back. Transcription and the agent session stay on your machine, and every step of the pipeline is in view.

[Figure: pipeline diagram for OpenClaw Discord Voice]

Problem

Voice-controlling a local agent over Discord usually means a hosted bot you can't see into and don't fully control.

What I built

A self-hosted Discord.js bridge: Opus decode, ffmpeg to WAV, local whisper-cli transcription, one session per guild, and switchable Piper, macOS say, or ElevenLabs voices.
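
The steps above can be sketched in TypeScript roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: the function names (ffmpegArgs, whisperArgs, sessionFor) and the GuildSession shape are hypothetical, and it assumes the decoded Opus audio is available as raw 48 kHz stereo 16-bit PCM. The ffmpeg and whisper-cli flags are standard ones, but the flags the bridge actually passes may differ.

```typescript
// Hypothetical sketch: none of these names are taken from the project source.

type TtsBackend = "piper" | "say" | "elevenlabs";

interface GuildSession {
  guildId: string;
  voice: TtsBackend; // which reply voice this guild currently uses
  turns: string[];   // transcripts of completed turns
}

/** ffmpeg arguments that convert raw decoded PCM into a WAV file for Whisper. */
function ffmpegArgs(pcmPath: string, wavPath: string): string[] {
  return [
    "-y",                                      // overwrite the output file if it exists
    "-f", "s16le", "-ar", "48000", "-ac", "2", // assumed input: 48 kHz stereo 16-bit PCM
    "-i", pcmPath,
    "-ar", "16000", "-ac", "1",                // whisper.cpp expects 16 kHz mono audio
    wavPath,
  ];
}

/** whisper-cli arguments: model file, input WAV, no timestamps in the output. */
function whisperArgs(modelPath: string, wavPath: string): string[] {
  return ["-m", modelPath, "-f", wavPath, "-nt"];
}

// One session per Discord guild, created lazily on first use.
const sessions = new Map<string, GuildSession>();

function sessionFor(guildId: string, voice: TtsBackend = "piper"): GuildSession {
  let session = sessions.get(guildId);
  if (!session) {
    session = { guildId, voice, turns: [] };
    sessions.set(guildId, session);
  }
  return session;
}
```

In a real bridge these argument arrays would be handed to a child process (for example Node's child_process.spawn), and the transcript would be appended to the guild's session before being forwarded to OpenClaw. Keeping the argument builders pure makes each stage easy to inspect and test on its own.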

Result

Speech stays on your machine, the session is yours, and every step of the pipeline is inspectable.

Audience

For self-hosted agent setups that want voice in Discord without handing the runtime to a hosted bot.

Developer setup

Source without GitHub CLI:
  git clone https://github.com/jx-grxf/OpenClaw-Discord-Voice.git && cd OpenClaw-Discord-Voice

Source with GitHub CLI:
  gh repo clone jx-grxf/OpenClaw-Discord-Voice && cd OpenClaw-Discord-Voice

Latest release:
  open https://github.com/jx-grxf/OpenClaw-Discord-Voice/releases/tag/v1.0.4

Highlights

  • Captures one spoken turn and transcribes it locally with whisper-cli — no cloud STT.
  • Bridges straight into your local OpenClaw session, one per Discord guild.
  • Switchable replies: Piper, macOS say, or ElevenLabs.
  • Built-in doctor and /info checks for env, binaries, model path, and Discord auth.
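
To illustrate what a doctor-style check over env, binaries, and model path might look like, here is a small TypeScript sketch. Everything in it is an assumption for illustration: the environment variable names (DISCORD_TOKEN, WHISPER_MODEL_PATH), the runDoctor function, and the result shape are hypothetical and not taken from the project. The Discord auth check is left out because it needs a live network call.

```typescript
// Hypothetical sketch of a doctor check; names and env variables are assumptions.

interface DoctorCheck {
  name: string;
  ok: boolean;
  hint: string; // empty when the check passes
}

interface DoctorInput {
  env: Record<string, string | undefined>;
  onPath: (binary: string) => boolean;   // injected so the sketch stays testable
  fileExists: (path: string) => boolean; // injected for the same reason
}

function runDoctor(input: DoctorInput): DoctorCheck[] {
  const checks: DoctorCheck[] = [];

  // Env: a bot token must be present (assumed variable name).
  const hasToken = Boolean(input.env["DISCORD_TOKEN"]);
  checks.push({
    name: "env: DISCORD_TOKEN",
    ok: hasToken,
    hint: hasToken ? "" : "set DISCORD_TOKEN in your environment",
  });

  // Binaries: the pipeline shells out to ffmpeg and whisper-cli.
  for (const binary of ["ffmpeg", "whisper-cli"]) {
    const found = input.onPath(binary);
    checks.push({
      name: `binary: ${binary}`,
      ok: found,
      hint: found ? "" : `install ${binary} or add it to PATH`,
    });
  }

  // Model path: the Whisper model file must exist (assumed variable name).
  const model = input.env["WHISPER_MODEL_PATH"];
  const modelOk = model !== undefined && model !== "" && input.fileExists(model);
  checks.push({
    name: "model path",
    ok: modelOk,
    hint: modelOk ? "" : "point WHISPER_MODEL_PATH at an existing model file",
  });

  return checks;
}
```

A caller would pass process.env plus real lookups (for instance fs.existsSync for fileExists) and print each failing check with its hint.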