feat(hermes): Piper TTS (local US male, no cloud) #17

Merged
gortium merged 25 commits from feat/voice-support-v2 into master 2026-05-09 19:39:12 +00:00
Collaborator

Use official Hermes Agent image as base (nousresearch/hermes-agent:latest). Add Piper TTS, remove Edge TTS. See commit messages for details.

Hermes added 1 commit 2026-05-09 00:21:19 +00:00
Hermes added 1 commit 2026-05-09 02:38:41 +00:00
Hermes changed title from feat(hermes): add ROCm GPU env vars for faster-whisper STT to feat(hermes): simplified Dockerfile + ROCm GPU env vars for STT 2026-05-09 02:38:56 +00:00
Hermes added 1 commit 2026-05-09 04:09:59 +00:00
Hermes changed title from feat(hermes): simplified Dockerfile + ROCm GPU env vars for STT to feat(hermes): ROCm GPU + Coqui TTS + faster-whisper STT 2026-05-09 04:13:33 +00:00
Hermes added 1 commit 2026-05-09 13:24:11 +00:00
Hermes changed title from feat(hermes): ROCm GPU + Coqui TTS + faster-whisper STT to feat(hermes): Piper TTS (local, US male) - replaces Coqui/ROCm/Edge 2026-05-09 13:24:28 +00:00
Hermes changed title from feat(hermes): Piper TTS (local, US male) - replaces Coqui/ROCm/Edge to feat(hermes): Piper TTS (local US male, no cloud) 2026-05-09 13:34:56 +00:00
Hermes added 1 commit 2026-05-09 13:41:40 +00:00
Hermes added 1 commit 2026-05-09 13:41:55 +00:00
gortium reviewed 2026-05-09 13:45:41 +00:00
ai/compose.yml Outdated
@@ -38,7 +38,13 @@ services:
- API_SERVER_HOST=0.0.0.0
- API_SERVER_KEY=hermes_local_key
- GATEWAY_ALLOW_ALL_USERS=true
- OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
Owner

This is the right line

gortium marked this conversation as resolved
gortium reviewed 2026-05-09 13:46:01 +00:00
ai/compose.yml Outdated
@@ -39,3 +39,3 @@
- API_SERVER_KEY=hermes_local_key
- GATEWAY_ALLOW_ALL_USERS=true
- OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
- OPENROUTER_API_KEY=${OPEN...KEY}
Owner

Delete this one

Hermes added 1 commit 2026-05-09 13:55:40 +00:00
Hermes added 1 commit 2026-05-09 13:59:12 +00:00
Hermes added 1 commit 2026-05-09 14:09:53 +00:00
Hermes added 1 commit 2026-05-09 14:12:09 +00:00
Hermes added 1 commit 2026-05-09 14:14:56 +00:00
Hermes added 1 commit 2026-05-09 14:20:48 +00:00
Hermes added 1 commit 2026-05-09 14:27:11 +00:00
Hermes added 1 commit 2026-05-09 14:28:37 +00:00
Hermes added 1 commit 2026-05-09 15:21:52 +00:00
Hermes added 1 commit 2026-05-09 15:50:38 +00:00
Hermes added 1 commit 2026-05-09 16:04:34 +00:00
Hermes added 1 commit 2026-05-09 17:13:10 +00:00
Commit 8e9a75f removed the COPY+RUN of patch_tts_tool.py
because the build context was thought to be insufficient.
In fact the build context is ai/, which contains both the
Dockerfile and patch_tts_tool.py, so COPY works fine.

Without this step, tts_tool.py silently falls through
to Edge TTS, its default provider, even when
config.yaml says provider: piper, because 'piper' is not
a recognized provider in the unpatched code. This produced
the female Edge TTS voice (AriaNeural) instead of the
configured Ryan High male voice.
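
The fall-through described above can be illustrated with a minimal sketch. It is not the actual tts_tool.py code: `pick_provider`, `pick_provider_patched`, and `KNOWN_PROVIDERS` are hypothetical names, and the only assumption is that the unpatched tool dispatches on a fixed set of provider names and treats anything else as Edge.

```python
# Hypothetical illustration of the silent fall-through; not the real tts_tool.py.
KNOWN_PROVIDERS = {"edge"}  # assumed subset of the unpatched providers; 'piper' is absent


def pick_provider(configured, known=KNOWN_PROVIDERS, default="edge"):
    """Return the provider that would actually be used for a configured name."""
    name = (configured or "").strip().lower()
    # Unknown names are not rejected; they silently become the default.
    return name if name in known else default


def pick_provider_patched(configured):
    """Same dispatch once 'piper' is recognized and is also the fallback."""
    return pick_provider(configured, known=KNOWN_PROVIDERS | {"piper"}, default="piper")
```

With these assumptions, `pick_provider("piper")` yields `"edge"`, which matches the symptom of hearing the Edge voice despite `provider: piper` in config.yaml.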
Hermes added 2 commits 2026-05-09 17:37:38 +00:00
The build-time COPY+RUN of patch_tts_tool.py failed because
the Dockerfile starts from debian:stable-slim and only copies
the ai/ build context, so there is no tools/tts_tool.py in the
image at build time (Hermes is on the mounted data volume).

Move patching to fix-permissions.sh which runs at container
startup when the data volume is mounted, so tts_tool.py is
available via the venv site-packages.

Also make patch_tts_tool.py robust: it searches multiple paths
for tts_tool.py, accepts a path as an argument, and exits 0 instead
of 1 when the file or pattern is not found (the build must not fail).
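
A minimal sketch of that lookup behavior follows. It is an illustration under stated assumptions, not the contents of patch_tts_tool.py: the function names are hypothetical, and the candidate list holds only the /opt/hermes/tools/ location named elsewhere in this PR, since the script's real search list is not shown here.

```python
import sys
from pathlib import Path

# Assumed search locations; the real script's list is not shown in this PR.
CANDIDATES = [
    Path("/opt/hermes/tools/tts_tool.py"),
]


def find_target(argv, candidates=CANDIDATES):
    """Prefer an explicit path argument, else the first candidate that exists."""
    paths = [Path(argv[1])] if len(argv) > 1 else list(candidates)
    for path in paths:
        if path.is_file():
            return path
    return None


def main(argv):
    """Return a process exit code; 0 even when there is nothing to patch."""
    target = find_target(argv)
    if target is None:
        print("tts_tool.py not found; skipping patch", file=sys.stderr)
        return 0  # a missing file must not fail the image build
    print(f"would patch {target}")
    return 0
```

A script built this way would end with `sys.exit(main(sys.argv))`. Returning 0 on a miss keeps `docker build` green, at the cost of hiding a patch that never applied, which is why the message goes to stderr.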

The Dockerfile starts from debian:stable-slim, not from the official
Hermes image. Without installing hermes-agent from pip, there is no
tools/tts_tool.py in the image at build time, so the patch script
crashes with FileNotFoundError.

Adding hermes-agent to uv pip install gives us tts_tool.py in the
venv site-packages, so the COPY+RUN patch step works cleanly.

Also keep the runtime fallback in fix-permissions.sh for cases where
the volume's site-packages differ from the image.
Hermes added 1 commit 2026-05-09 17:39:31 +00:00
Starting from debian:stable-slim required re-installing everything
(Hermes source, Node.js, Playwright, etc.) which was redundant
and fragile. The official nousresearch/hermes-agent image already
has all that.

Now the Dockerfile:
- FROM nousresearch/hermes-agent:latest (has tts_tool.py, Playwright, etc.)
- Install Piper + voice model on top
- Patch tts_tool.py at build time (Edge fallback -> Piper)
- Runtime fallback in fix-permissions.sh for volume resilience

Cleaner, smaller Dockerfile, and the build-time patch can find
tts_tool.py because it's in the base image's venv.
Hermes added 1 commit 2026-05-09 17:45:02 +00:00
The nousresearch/hermes-agent:latest base image already has a
venv with hermes-agent installed at /opt/hermes/.venv/.
Running 'uv venv' on top of it either fails or wipes the
existing install.

Fix: activate the existing venv first, then pip install into it.
Hermes added 1 commit 2026-05-09 17:47:45 +00:00
The nousresearch/hermes-agent:latest image creates its venv
as root. Running 'uv pip install' as USER hermes fails with
Permission denied on the site-packages directory.

Fix: keep USER root while modifying the venv, then switch
back to USER hermes for runtime.
Hermes added 1 commit 2026-05-09 19:03:13 +00:00
- Patch now matches the current tts_tool.py (the newer version shipped in
  nousresearch/hermes-agent:latest has different Edge fallback text)
- Adds a dedicated elif provider == 'piper' block before else:
- Changes the else: fallback to use Piper instead of Edge
- Patches ALL copies (venv site-packages + /opt/hermes/tools/)
- Removes Edge TTS entirely as default/provider
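
The branch insertion in the list above can be sketched as a plain text transformation. This is a hypothetical illustration: the marker string, the inserted branch, and `synth_piper` are placeholders, not the actual text in tts_tool.py, and the sketch covers only the "add elif before else:" step.

```python
# Hypothetical sketch; marker and branch text are placeholders, not the real patch.
PIPER_BRANCH = (
    "    elif provider == 'piper':\n"
    "        return synth_piper(text)\n"
)


def add_piper_branch(source, else_marker="    else:\n"):
    """Insert a piper branch before the last matching `else:` line."""
    if "provider == 'piper'" in source:
        return source  # already patched; running twice changes nothing
    head, sep, tail = source.rpartition(else_marker)
    if not sep:
        return source  # pattern not found; leave the file content untouched
    return head + PIPER_BRANCH + sep + tail
```

Applying the same function to every located copy (the venv site-packages and /opt/hermes/tools/) keeps them consistent, and the idempotency check makes it safe to run both at build time and at container startup. A real patch would need a more specific marker than a bare `else:` line, since this one matches on indentation alone.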
Hermes added 1 commit 2026-05-09 19:18:20 +00:00
The migration from debian:stable-slim to nousresearch/hermes-agent:latest
dropped several packages that were previously installed. This restores:

- poppler-utils, imagemagick (PDF/image processing)
- texlive-latex-base, texlive-latex-extra, texlive-fonts-recommended,
  texlive-xetex, texlive-science (LaTeX)
- qemu-user-static, binfmt-support (cross-compilation)
- emacs-nox (text editing)

These were added in PRs 3/5, 4/5, 5/5 and earlier commits of the
compose repo. The official image already has git, curl, ffmpeg,
python3, gcc, openssh, ripgrep, tini, docker-cli, etc.
gortium merged commit 6e540635bf into master 2026-05-09 19:39:12 +00:00
gortium deleted branch feat/voice-support-v2 2026-05-09 19:39:21 +00:00
Reference: gortium/compose#17