Commit Graph

1 Commit

Author SHA1 Message Date
3c92d93366 feat: add llama-cpp-hermes service with ROCm 6.1 + gfx906 support
- Add custom llama.cpp Dockerfile with ROCm 6.1 + gfx906 (MI50) build
- Add llama-cpp-hermes service serving Hermes 4.3 on dual MI50 GPUs
- Strip GPU devices/ROCm env from ollama service (CPU-only for embeddings)

Hermes 4.3 runs at ~19 tokens/s on dual MI50s with a 160K-token context.
2026-06-11 11:41:42 -04:00