Voice-controlling a local agent over Discord usually means a hosted bot you can't see into and don't fully control.
active v1.0.4
OpenClaw-Discord-Voice
Talk to a local OpenClaw agent through a Discord voice channel.
Join a voice channel, speak one turn, and the bridge transcribes it locally with Whisper, hands it to your local OpenClaw session, and plays the reply back. The whole pipeline stays on your machine and in view.
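The turn flow described above can be sketched roughly as follows. This is an illustrative outline, not the project's actual code: the `Stages` type, `sessionFor`, `handleTurn`, and the `guild-<id>` naming scheme are all hypothetical names introduced here, and the three stages are injected so the flow can be read (and exercised) without Discord, Whisper, or OpenClaw installed.

```typescript
// Hypothetical stage signatures; the real bridge's internals are not shown in the source.
type Stages = {
  transcribe: (wavPath: string) => Promise<string>; // e.g. a wrapper around whisper-cli
  ask: (sessionId: string, text: string) => Promise<string>; // e.g. a call into the local OpenClaw session
  speak: (text: string) => Promise<void>; // e.g. Piper, macOS say, or ElevenLabs
};

// One session id per Discord guild, created on first use.
const sessions = new Map<string, string>();

function sessionFor(guildId: string): string {
  let id = sessions.get(guildId);
  if (id === undefined) {
    id = `guild-${guildId}`; // assumed naming scheme, for illustration only
    sessions.set(guildId, id);
  }
  return id;
}

// Handle one spoken turn: transcribe, forward to the guild's session, play the reply.
async function handleTurn(
  guildId: string,
  wavPath: string,
  stages: Stages,
): Promise<string> {
  const text = (await stages.transcribe(wavPath)).trim();
  if (text === "") return ""; // nothing intelligible was said; skip the agent call
  const reply = await stages.ask(sessionFor(guildId), text);
  await stages.speak(reply);
  return reply;
}
```

Keeping the stages behind plain function types is one way to make each step swappable and inspectable on its own; the real bridge may structure this differently.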
A self-hosted Discord.js bridge: Opus decode, ffmpeg to WAV, local whisper-cli transcription, one session per guild, and switchable Piper, macOS say, or ElevenLabs voices.
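As a rough illustration of the "ffmpeg to WAV, whisper-cli" steps, the sketch below builds the argument lists such a bridge might pass to each binary. It assumes the decoded Opus audio has been written out as raw 48 kHz stereo signed 16-bit PCM (the format Discord voice audio decodes to) and that Whisper is given 16 kHz mono 16-bit WAV, which is what whisper.cpp expects. The function names and file paths are hypothetical, and the project's real invocation may use different flags.

```typescript
// Build ffmpeg arguments that resample raw Discord PCM into Whisper-ready WAV.
// Input: raw s16le, 48 kHz, 2 channels. Output: 16 kHz, mono, 16-bit PCM WAV.
function ffmpegArgs(pcmPath: string, wavPath: string): string[] {
  return [
    "-y", // overwrite the output file if it already exists
    "-f", "s16le", "-ar", "48000", "-ac", "2", "-i", pcmPath,
    "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", wavPath,
  ];
}

// Build whisper-cli arguments: model file, input WAV, and no timestamps in the output.
function whisperArgs(modelPath: string, wavPath: string): string[] {
  return ["-m", modelPath, "-f", wavPath, "-nt"];
}
```

In Node.js these arrays could be handed to `child_process.execFile("ffmpeg", ffmpegArgs(...))` and `execFile("whisper-cli", whisperArgs(...))`; passing arguments as an array avoids shell quoting problems with file paths.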
Speech stays on your machine, the session is yours, and every step of the pipeline is inspectable.
For self-hosted agent setups that want voice in Discord without handing the runtime to a hosted bot.
Developer setup
source without GitHub CLI
git clone https://github.com/jx-grxf/OpenClaw-Discord-Voice.git && cd OpenClaw-Discord-Voice
source with GitHub CLI
gh repo clone jx-grxf/OpenClaw-Discord-Voice && cd OpenClaw-Discord-Voice
latest release
open https://github.com/jx-grxf/OpenClaw-Discord-Voice/releases/tag/v1.0.4
Highlights
- Captures one spoken turn and transcribes it locally with whisper-cli, with no cloud STT.
- Bridges straight into your local OpenClaw session, one per Discord guild.
- Switchable reply voices: Piper, macOS say, or ElevenLabs.
- Built-in doctor and /info checks for env, binaries, model path, and Discord auth.
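To give a feel for what a doctor-style check over env, binaries, and model path could look like, here is a minimal sketch. It is not the project's implementation: the environment variable names (`DISCORD_TOKEN`, `WHISPER_MODEL_PATH`), the `runDoctor` function, and the result shape are assumptions made for illustration, and the Discord auth check is left out because it needs a live network call. The filesystem and PATH lookups are injected so the logic stays testable.

```typescript
type CheckResult = { name: string; ok: boolean; detail: string };

// Hypothetical names; the project's real configuration keys may differ.
const REQUIRED_ENV = ["DISCORD_TOKEN", "WHISPER_MODEL_PATH"];
const REQUIRED_BINARIES = ["ffmpeg", "whisper-cli"];

function runDoctor(
  env: Record<string, string | undefined>,
  fileExists: (path: string) => boolean, // injected filesystem probe
  onPath: (binary: string) => boolean, // injected PATH lookup
): CheckResult[] {
  const results: CheckResult[] = [];

  // 1. Required environment variables must be set and non-empty.
  for (const key of REQUIRED_ENV) {
    const set = (env[key] ?? "").trim() !== "";
    results.push({ name: `env:${key}`, ok: set, detail: set ? "set" : "missing" });
  }

  // 2. External binaries must be resolvable.
  for (const bin of REQUIRED_BINARIES) {
    const found = onPath(bin);
    results.push({ name: `bin:${bin}`, ok: found, detail: found ? "found" : "not on PATH" });
  }

  // 3. The Whisper model file must exist at the configured path.
  const model = env["WHISPER_MODEL_PATH"];
  const modelOk = model !== undefined && model !== "" && fileExists(model);
  results.push({ name: "model", ok: modelOk, detail: modelOk ? "found" : "file not found" });

  return results;
}
```

Returning a list of named results, rather than failing on the first problem, lets a command such as /info report every missing piece in one pass.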