Talking to a local agent over Discord voice usually means trusting a hosted bot and giving up control of the speech pipeline.
in development
Hermes-Discord-Voice
Self-hosted voice for the Hermes agent, straight in a Discord call.
The bot joins your Discord voice channel, captures a spoken turn, transcribes it locally with Whisper, sends the text to Hermes, and speaks the reply back through the TTS voice you pick. Personal servers only — no hosted service in the middle.
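The turn described above can be sketched as a small pipeline with injected stages. This is a minimal illustration, not the project's actual code: `TurnStages`, `runTurn`, and the stage signatures are hypothetical names chosen for the example, and the real stages would wrap whisper-cli, the Hermes transport, and the selected TTS backend.

```typescript
// Hypothetical stage signatures; the real project may structure these differently.
interface TurnStages {
  // Turns captured audio into text (assumed to wrap a local whisper-cli call).
  transcribe(audio: Uint8Array): Promise<string>;
  // Sends the transcript to Hermes and returns its reply text.
  askHermes(guildId: string, text: string): Promise<string>;
  // Plays the reply in the guild's voice channel through the chosen TTS.
  speak(guildId: string, reply: string): Promise<void>;
}

type TurnResult =
  | { status: "skipped"; reason: string }
  | { status: "spoken"; transcript: string; reply: string };

// Runs one spoken turn: transcribe, ask Hermes, speak the reply.
// Empty transcripts or replies end the turn early instead of speaking silence.
async function runTurn(
  guildId: string,
  audio: Uint8Array,
  stages: TurnStages,
): Promise<TurnResult> {
  const transcript = (await stages.transcribe(audio)).trim();
  if (transcript.length === 0) {
    return { status: "skipped", reason: "empty transcript" };
  }

  const reply = (await stages.askHermes(guildId, transcript)).trim();
  if (reply.length === 0) {
    return { status: "skipped", reason: "empty reply" };
  }

  await stages.speak(guildId, reply);
  return { status: "spoken", transcript, reply };
}
```

Keeping each stage behind its own function is one way to make every step swappable and testable with fakes, without touching Discord or a real speech model.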
A self-hosted Discord.js bridge with local whisper-cli transcription, one Hermes session per guild, a per-guild speaker allowlist, and pluggable TTS — Piper, macOS say, ElevenLabs, or a custom command.
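To illustrate what pluggable TTS could look like, the sketch below validates a backend choice before any audio is produced. The `TtsSettings` shape, its field names, and `chooseTts` are assumptions made for this example and are not taken from the project's configuration; the only external fact relied on is that `say` ships with macOS, where Node reports the platform as `darwin`.

```typescript
type TtsProvider = "piper" | "say" | "elevenlabs" | "custom";

// Hypothetical settings shape; the project's real config keys may differ.
interface TtsSettings {
  provider: string;          // requested backend name
  platform: string;          // e.g. the value of process.platform
  elevenLabsApiKey?: string; // needed only for the ElevenLabs backend
  customCommand?: string;    // needed only for the custom backend
}

type TtsChoice =
  | { ok: true; provider: TtsProvider }
  | { ok: false; error: string };

// Checks that the requested backend is usable with the given settings.
function chooseTts(settings: TtsSettings): TtsChoice {
  switch (settings.provider) {
    case "piper":
      return { ok: true, provider: "piper" };
    case "say":
      // The say command exists only on macOS.
      return settings.platform === "darwin"
        ? { ok: true, provider: "say" }
        : { ok: false, error: "say is only available on macOS" };
    case "elevenlabs":
      return settings.elevenLabsApiKey
        ? { ok: true, provider: "elevenlabs" }
        : { ok: false, error: "ElevenLabs needs an API key" };
    case "custom":
      return settings.customCommand && settings.customCommand.trim().length > 0
        ? { ok: true, provider: "custom" }
        : { ok: false, error: "custom TTS needs a command" };
    default:
      return { ok: false, error: "unknown TTS provider: " + settings.provider };
  }
}
```

Failing at selection time, rather than when the first reply is spoken, keeps a misconfigured backend from surfacing in the middle of a call.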
Voice input stays private and on your machine, while the Hermes transport and every step of the pipeline stay explicit.
For personal or small trusted Discord servers that want to talk to a local Hermes agent without trusting a hosted bot.
Developer setup
source without GitHub CLI:
git clone https://github.com/jx-grxf/Hermes-Discord-Voice.git && cd Hermes-Discord-Voice
source with GitHub CLI:
gh repo clone jx-grxf/Hermes-Discord-Voice && cd Hermes-Discord-Voice
latest release:
open https://github.com/jx-grxf/Hermes-Discord-Voice/releases
Highlights
- Joins Discord voice, records a turn, and transcribes it locally with whisper-cli — no cloud STT.
- Routes transcripts to Hermes over the CLI by default, or over its API/Gateway.
- Replies through Piper, macOS say, ElevenLabs, or your own TTS command.
- Private by default: one session per guild with a speaker allowlist you control.
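As a rough sketch of the last point, per-guild state can be held in a map keyed by guild ID, with an allowlist that starts empty. The `GuildRegistry` class, its method names, and the session ID format are hypothetical; in particular, treating an empty allowlist as "nobody may trigger the bot" is an assumption about what private by default means here.

```typescript
interface GuildState {
  sessionId: string;            // one Hermes session per guild
  allowedSpeakers: Set<string>; // Discord user IDs allowed to trigger a turn
}

class GuildRegistry {
  private readonly guilds = new Map<string, GuildState>();

  // Returns the guild's state, creating it on first use.
  private stateFor(guildId: string): GuildState {
    let state = this.guilds.get(guildId);
    if (state === undefined) {
      // Hypothetical session ID scheme, used only for this example.
      state = { sessionId: "hermes-" + guildId, allowedSpeakers: new Set<string>() };
      this.guilds.set(guildId, state);
    }
    return state;
  }

  // The same guild always maps to the same session.
  sessionFor(guildId: string): string {
    return this.stateFor(guildId).sessionId;
  }

  allow(guildId: string, userId: string): void {
    this.stateFor(guildId).allowedSpeakers.add(userId);
  }

  revoke(guildId: string, userId: string): void {
    this.stateFor(guildId).allowedSpeakers.delete(userId);
  }

  // Deny by default: unknown guilds and unlisted speakers are ignored.
  mayTrigger(guildId: string, userId: string): boolean {
    const state = this.guilds.get(guildId);
    return state !== undefined && state.allowedSpeakers.has(userId);
  }
}
```

A capture handler would call `mayTrigger` before recording a speaker, so audio from anyone off the list is never transcribed.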