in development

Hermes-Discord-Voice

Self-hosted voice for the Hermes agent, straight in a Discord call.

The bot joins your Discord voice channel, captures a spoken turn, transcribes it locally with Whisper, sends the text to Hermes, and speaks the reply back through the TTS voice you pick. Personal servers only — no hosted service in the middle.
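The turn described above (audio in, transcript, agent reply, speech out) can be sketched as a small function with injected stages. This is a minimal illustration, not the project's actual code: the names `Stages`, `runTurn`, `transcribe`, `askHermes`, and `speak` are hypothetical, and the real bridge may structure the pipeline differently.

```typescript
// Hypothetical stage signatures; the project's real interfaces may differ.
interface Stages {
  // Speech-to-text, e.g. a wrapper around a local whisper-cli process.
  transcribe: (pcm: Uint8Array) => Promise<string>;
  // Sends the transcript to Hermes and resolves with its text reply.
  askHermes: (guildId: string, text: string) => Promise<string>;
  // Synthesizes the reply and plays it into the voice channel.
  speak: (guildId: string, reply: string) => Promise<void>;
}

interface TurnResult {
  transcript: string;
  reply: string | null; // null when nothing intelligible was heard
}

// Run one spoken turn: audio -> text -> agent reply -> speech.
async function runTurn(
  guildId: string,
  pcm: Uint8Array,
  stages: Stages,
): Promise<TurnResult> {
  const transcript = (await stages.transcribe(pcm)).trim();
  if (transcript.length === 0) {
    // Silence or noise: skip the agent call and stay quiet.
    return { transcript, reply: null };
  }
  const reply = await stages.askHermes(guildId, transcript);
  await stages.speak(guildId, reply);
  return { transcript, reply };
}
```

Injecting the stages keeps each step explicit and lets you swap the transcriber, the Hermes transport, or the TTS backend without touching the turn logic.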

Problem

Talking to a local agent over Discord voice usually means trusting a hosted bot and giving up control of the speech pipeline.

What I built

A self-hosted Discord.js bridge with local whisper-cli transcription, one Hermes session per guild, a per-guild speaker allowlist, and pluggable TTS — Piper, macOS say, ElevenLabs, or a custom command.
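One way to picture "one Hermes session per guild" together with a per-guild speaker allowlist is a small in-memory registry. The sketch below is an assumption about how such bookkeeping could look; `SessionRegistry`, `GuildSession`, and the `sessionId` format are hypothetical and not taken from the project.

```typescript
// Hypothetical shape of a per-guild session; field names are illustrative.
interface GuildSession {
  guildId: string;
  sessionId: string;
  allowedSpeakers: Set<string>; // Discord user IDs allowed to talk to the agent
}

class SessionRegistry {
  private sessions = new Map<string, GuildSession>();
  private counter = 0;

  // Return the guild's existing session, or create it on first use,
  // so each guild maps to exactly one session.
  getOrCreate(guildId: string): GuildSession {
    let session = this.sessions.get(guildId);
    if (!session) {
      this.counter += 1;
      session = {
        guildId,
        sessionId: `hermes-${guildId}-${this.counter}`, // made-up ID format
        allowedSpeakers: new Set<string>(),
      };
      this.sessions.set(guildId, session);
    }
    return session;
  }

  // Add a user to one guild's allowlist only.
  allow(guildId: string, userId: string): void {
    this.getOrCreate(guildId).allowedSpeakers.add(userId);
  }

  // Deny by default: unknown guilds and unlisted users are ignored.
  isAllowed(guildId: string, userId: string): boolean {
    return this.sessions.get(guildId)?.allowedSpeakers.has(userId) ?? false;
  }
}
```

A bridge built this way would call `isAllowed` before recording or transcribing a speaker, so audio from unlisted users never enters the pipeline.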

Result

Voice input stays private and on your machine, while the Hermes transport and every step of the pipeline stay explicit.

Audience

For personal or small trusted Discord servers that want to talk to a local Hermes agent without trusting a hosted bot.

Developer setup

Source without GitHub CLI:
    git clone https://github.com/jx-grxf/Hermes-Discord-Voice.git && cd Hermes-Discord-Voice

Source with GitHub CLI:
    gh repo clone jx-grxf/Hermes-Discord-Voice && cd Hermes-Discord-Voice

Latest release:
    open https://github.com/jx-grxf/Hermes-Discord-Voice/releases

Highlights

  • Joins Discord voice, records a turn, and transcribes it locally with whisper-cli — no cloud STT.
  • Routes transcripts to Hermes over its CLI by default, or over its API/Gateway.
  • Replies through Piper, macOS say, ElevenLabs, or your own TTS command.
  • Private by default: one session per guild with a speaker allowlist you control.
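
The pluggable TTS choice listed above can be modelled as a config union that is turned into a command to run. This is a hedged sketch, not the project's implementation: `TtsConfig`, `buildTtsCommand`, and the `{text}` / `{out}` placeholders are hypothetical. The `say` flags (`-v`, `-o`) and Piper flags (`--model`, `--output_file`, text on stdin) reflect my understanding of those CLIs and should be checked against the installed versions. ElevenLabs is an HTTP API rather than a local command, so it is left out here.

```typescript
// Hypothetical config union for command-based TTS backends.
type TtsConfig =
  | { kind: "piper"; modelPath: string }
  | { kind: "say"; voice: string }
  | { kind: "custom"; command: string; args: string[] };

interface TtsCommand {
  command: string;
  args: string[];
  stdin: string | null; // text to pipe in when the tool reads from stdin
}

// Replace every occurrence of a literal token without regex or "$" surprises.
function fill(template: string, token: string, value: string): string {
  return template.split(token).join(value);
}

// Build an argv-style command; nothing is executed here.
function buildTtsCommand(cfg: TtsConfig, text: string, outFile: string): TtsCommand {
  switch (cfg.kind) {
    case "piper":
      // Assumed Piper usage: text arrives on stdin, audio goes to --output_file.
      return {
        command: "piper",
        args: ["--model", cfg.modelPath, "--output_file", outFile],
        stdin: text,
      };
    case "say":
      // Assumed macOS say usage: -v selects the voice, -o writes an audio file.
      return { command: "say", args: ["-v", cfg.voice, "-o", outFile, text], stdin: null };
    case "custom":
      // Made-up placeholder convention: {text} and {out} are substituted per argument.
      return {
        command: cfg.command,
        args: cfg.args.map((arg) => fill(fill(arg, "{text}", text), "{out}", outFile)),
        stdin: null,
      };
  }
}
```

Returning an argument array instead of a shell string means the reply text is never interpreted by a shell, which matters because the agent's output is untrusted input to the TTS step.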