3c92d93366bcf301878f83bcdec6b6de7246d652
- Add custom llama.cpp Dockerfile with ROCm 6.1 + gfx906 (MI50) build
- Add llama-cpp-hermes service serving Hermes 4.3 on dual MI50 GPUs
- Strip GPU devices/ROCm env from ollama service (CPU-only for embeddings)

Hermes 4.3 runs at ~19 t/s on dual MI50s with 160K context.
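As a rough illustration of what the llama-cpp-hermes service might launch, here is a minimal Python sketch that assembles a `llama-server` command line for two GPUs. It is not taken from this repository: the helper name `build_llama_server_cmd`, the model path, the port, the even tensor split, and the reading of "160K" as 160 * 1024 tokens are all assumptions. Only the `llama-server` flags themselves (`--model`, `--ctx-size`, `--n-gpu-layers`, `--tensor-split`, `--host`, `--port`) are standard llama.cpp options.

```python
import shlex


def build_llama_server_cmd(model_path, ctx_tokens=160 * 1024, n_gpus=2,
                           host="0.0.0.0", port=8080):
    """Build an argv list for llama-server (hypothetical helper).

    Assumptions, not taken from the commit:
    - "160K context" is interpreted as 160 * 1024 = 163840 tokens
    - layers are split evenly across the GPUs
    - 999 is used as a "more than the model has" layer count so that
      every layer is offloaded to the GPUs
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    # Even split, e.g. "1,1" for two MI50s.
    tensor_split = ",".join(["1"] * n_gpus)
    return [
        "llama-server",
        "--model", model_path,
        "--ctx-size", str(ctx_tokens),
        "--n-gpu-layers", "999",
        "--tensor-split", tensor_split,
        "--host", host,
        "--port", str(port),
    ]


if __name__ == "__main__":
    # The model path is a placeholder; substitute the real GGUF file.
    cmd = build_llama_server_cmd("/models/hermes-4.3.gguf")
    print(shlex.join(cmd))
```

Returning an argv list rather than a single string keeps quoting out of the picture; the list can be passed directly to `subprocess.run` or written into a compose file's `command:` entry.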
Languages
Dockerfile 46.1%
Python 42%
Makefile 6.1%
Shell 5.8%