refactor: split ai/ into hermes/ and ollama/ directories with gfx906 build #19

Hermes · 2026-05-10T01:45:13Z

Summary

Organize the ai/ stack into subdirectories and add the custom ollama build for MI50 (gfx906).

Changes

Directory structure

ai/hermes/ — Hermes Dockerfile + support files (moved from ai/)
ai/ollama/ — Custom ollama ROCm 6.1 build with gfx906 support

ai/hermes/Dockerfile

Uses nousresearch/hermes-agent:latest as base, adds extras on top
Extended: poppler-utils, imagemagick, texlive, QEMU, emacs-nox
Piper TTS via uv pip (piper-tts + sounddevice + numpy)
Piper Ryan high-quality voice download
Permission fix scripts
Removed: Node.js, npm, Playwright/Chromium, web UI build, font packages

ai/ollama/Dockerfile (NEW)

Custom build with ROCm 6.1 (rocm/dev-ubuntu-22.04:6.1.2-complete)
Three-stage build: CPU backends (GCC), HIP backend (ROCm 6 preset), Go binary
Targets gfx906:xnack- for AMD MI50 GPUs
Key fixes found while iterating on the build:
- --preset 'ROCm 6' for proper GPU kernel code generation (not CXX=hipcc)
- -DCMAKE_HIP_COMPILER="" for the CPU step (keeps CMake's HIP detection out of the GCC build)
- -DCMAKE_PREFIX_PATH="/opt/rocm" so find_package(hip) finds hip-config.cmake
- -DCMAKE_INSTALL_PREFIX=/build/dist at configure time
- COPY /build/dist/lib/ollama/ rather than /build/dist/lib/ (avoids an extra level of directory nesting)
Runtime: ubuntu:24.04 with ROCm 6.1 libs, proper LD_LIBRARY_PATH
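
The flags listed above can be read as two separate configure steps, one for the CPU backends and one for the HIP backend. The following is a minimal dry-run sketch under stated assumptions, not the Dockerfile from this PR: the build directory names (build-cpu, build-hip), the run helper, and the DRY_RUN variable are hypothetical, and only the flags themselves are taken from the PR text. Commands are printed, and executed only when DRY_RUN=0.

```bash
#!/usr/bin/env bash
# Dry-run sketch of the CPU and HIP build steps described in this PR.
# Hypothetical names: build-cpu, build-hip, run(), DRY_RUN.

DRY_RUN="${DRY_RUN:-1}"

# Print a command (log only, quoting is not preserved); execute it when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# CPU backends: GCC only. An empty CMAKE_HIP_COMPILER keeps HIP detection out,
# and the install prefix is fixed at configure time.
build_cpu() {
  run cmake -S . -B build-cpu -DCMAKE_HIP_COMPILER= -DCMAKE_INSTALL_PREFIX=/build/dist
  run cmake --build build-cpu
  run cmake --install build-cpu
}

# HIP backend: the 'ROCm 6' preset, with -B overriding the preset's binaryDir.
# The install prefix is left to the preset here (assumption).
build_hip() {
  run cmake --preset 'ROCm 6' -B build-hip \
    -DCMAKE_PREFIX_PATH=/opt/rocm -DAMDGPU_TARGETS=gfx906:xnack-
  run cmake --build build-hip
  run cmake --install build-hip --component HIP
}
```

Calling build_cpu followed by build_hip prints six cmake invocations in order. Keeping the two steps in separate build directories is what lets GCC and the ROCm toolchain be configured independently.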

ai/compose.yml

hermes: build: ./hermes (was ./)
ollama: build: ./ollama, image tag rocm-gfx906
Removed privileged: true from ollama (the devices and group_add entries are sufficient for GPU access)
OLLAMA_FLASH_ATTENTION=1 (was 0 - critical for MI50 throughput)
ROCm GPU env vars on hermes service for faster-whisper STT
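
Since dropping privileged: true is one of the behavioural changes here, a crude guard against it creeping back in could look like the sketch below. It is a plain text match, not a YAML parse, and the function name compose_is_unprivileged is hypothetical.

```bash
# Return 0 when the given compose file has no uncommented "privileged: true" line.
# Hypothetical helper: a text match only, so unusual YAML layouts are not handled.
compose_is_unprivileged() {
  ! grep -Eq '^[[:space:]]*privileged:[[:space:]]*true' "$1"
}
```

A possible use is `compose_is_unprivileged ai/compose.yml || echo "privileged mode is enabled"`.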

Cleanup

Removed: Node.js/npm, Chromium, Playwright, fonts, gosu
Removed: npm install, Playwright install, web dashboard build steps
patch_tts_tool.py kept for now (will be removed later when fork handles it)

Verification

Build test

cd ai
docker compose build ollama --no-cache
docker compose build hermes

ollama GPU check

docker compose run --rm ollama ollama run llama3.2:3b-instruct-q4_K_M "hello"
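
A successful reply alone does not prove that the model ran on the GPU. As a hedged follow-up, the sketch below pattern-matches the PROCESSOR column that `ollama ps` prints. It assumes the ollama service is already running and the model is still loaded; the helper name model_on_gpu is hypothetical.

```bash
# Succeeds if `ollama ps`-style text on stdin reports a model fully on the GPU.
# Hypothetical helper: it only looks for the "100% GPU" text in the listing.
model_on_gpu() {
  grep -q '100% GPU'
}
```

One way to use it: `docker compose exec ollama ollama ps | model_on_gpu && echo "GPU in use"`.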

Related

Replaces the loose assets/ollama/Dockerfile that was in the infra repo
Next: add Gitea Actions CI for both Dockerfiles

Hermes added 2 commits 2026-05-10 01:45:14 +00:00

feat: add custom ollama image with ROCm 6.1 + gfx906 support ef58155897

- Add ollama/Dockerfile that builds ollama from source with AMDGPU_TARGETS=gfx906
- Uses ROCm 6.1 (rocm/dev-ubuntu-22.04:6.1.2-complete) for MI50 support
- Builds llama.cpp runner with HIPBLAS for gfx906 architecture
- Updates compose.yml to build from this Dockerfile instead of pulling ollama/ollama:latest

refactor: split ai/ into hermes/ and ollama/ directories 781d98046b

- Rename ai/ to hermes/ (Hermes agent service)
- Move ai/ollama/ to ollama/ (top-level, Ollama ROCm/gfx906 service)
- Add ollama/compose.yml for standalone ollama deployment
- Update hermes/compose.yml: remove ollama service (now in ollama/)
- Update Makefile SERVICES list: ai -> hermes, add ollama
- Bump ollama version from v0.13.5 to v0.23.2

Hermes force-pushed feat/ollama-gfx906 from 781d98046b to d34a4d3647

2026-05-10 01:50:07 +00:00

Hermes added 1 commit 2026-05-10 01:56:16 +00:00

fix: update ollama Dockerfile to v0.23.2 with proper ROCm 6.1 + gfx906 build f023dc1ee4

- Update OLLAMA_VERSION from v0.13.5 to v0.23.2
- Fix package: golang -> golang-go
- Add ENV HCC_AMDGPU_TARGET=gfx906 and HSA_ENABLE_SDMA=0
- Set proper ENTRYPOINT + CMD

Hermes added 1 commit 2026-05-10 02:13:50 +00:00

fix: use ollama v0.23.2 native CMake build system for ROCm 6 + gfx906 c6d2f5918f

The old Dockerfile used the deprecated llama.cpp/ subdirectory approach
which doesn't exist in ollama v0.23.2. Now using the official CMake
presets (ROCm 6 preset) with AMDGPU_TARGETS including gfx906:xnack-.

Hermes added 1 commit 2026-05-10 02:14:57 +00:00

fix: add unzip dependency for ninja installation 956d76f14d

Hermes added 1 commit 2026-05-10 02:19:54 +00:00

fix: set CMAKE_HIP_COMPILER explicitly for ROCm 6.1 HIP detection a3d0fa0072

Hermes added 1 commit 2026-05-10 02:20:47 +00:00

fix: use CXX=hipcc legacy mode for HIP CMake build d8b77c97c3

CMake 3.31 refuses CMAKE_HIP_COMPILER=hipcc with 'not supported'.
Using CXX=hipcc triggers the legacy HIP detection path which works.

Hermes added 1 commit 2026-05-10 02:29:13 +00:00

fix: use ROCm amdclang++ as HIP compiler, keep GCC for CPU code 5b210fe624

Setting CXX=hipcc caused compilation failures on CPU backends (AVX
intrinsics). Now using GCC for CPU, ROCm's amdclang++ for HIP only.

Hermes added 1 commit 2026-05-10 02:30:24 +00:00

fix: remove unsupported AMDGPU_TARGETS (gfx1200) for ROCm 6.1 0c612d9731

ROCm 6.1's AMD clang 17 doesn't support gfx1200 (RDNA4).
Use only targets supported by ROCm 6.1: gfx906, gfx908, gfx90a, gfx1030, gfx1100.

Hermes added 1 commit 2026-05-10 02:40:43 +00:00

fix: correct AMDGPU_TARGETS to include gfx940/gfx1010/gfx1200 aa6bbe87bf

Targets were corrupted during previous patch iterations, contained
gfx908/gfx90a from the CMake preset instead of gfx940/gfx1010/gfx1200.

Hermes added 1 commit 2026-05-10 02:41:20 +00:00

fix: remove nonexistent CC=clang for Go build step f6bc2b07a7

ROCm 6.1 image doesn't have clang/clang++ in PATH (only amdclang++).
GCC is the default and works fine for CGo linking.

Hermes added 1 commit 2026-05-10 02:51:16 +00:00

fix: build CPU and HIP backends separately 0d87fb2556

CPU backends compiled with GCC (fixes AVX intrinsic errors from hipcc).
HIP backend compiled with hipcc (legacy mode skips enable_language(HIP)).
Go binary built with GCC for CGo linking.
This avoids both CMAKE_HIP_COMPILER rejection and CXX=hipcc CPU failures.

Hermes added 1 commit 2026-05-10 02:53:13 +00:00

fix: remove gfx1200 target (not supported by ROCm 6.1 clang 17) d52f18b0fa

ROCm 6.1's AMD clang 17 doesn't recognize gfx1200 architecture
(introduced in ROCm 6.2+). Caused compilation failure on all .cu files.
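
The back-and-forth over target lists in these commits amounts to filtering a semicolon-separated AMDGPU_TARGETS value against what the installed compiler accepts. A minimal sketch of that filtering follows; the helper name filter_targets is hypothetical, and the list of unsupported targets is supplied by the caller rather than detected from the toolchain.

```bash
# Remove every entry of $2 from the semicolon-separated target list in $1.
# Hypothetical helper; both lists are plain strings such as "gfx906;gfx1100".
filter_targets() {
  local targets="$1" unsupported="$2" out="" t
  local IFS=';'
  for t in $targets; do
    case ";$unsupported;" in
      *";$t;"*) ;;                      # listed as unsupported: skip it
      *) out="${out:+$out;}$t" ;;       # keep it, joining with ';'
    esac
  done
  printf '%s\n' "$out"
}
```

For example, `filter_targets 'gfx906;gfx1100;gfx1200' 'gfx1200'` prints `gfx906;gfx1100`, which could then be passed as -DAMDGPU_TARGETS.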

Hermes added 1 commit 2026-05-10 03:07:42 +00:00

fix: target only gfx906 for HIP compilation fc777e2de2

gfx940/gfx1010/gfx1030/gfx1100 cause C++ narrowing errors in ollama's
mma.cuh with hipcc. Since we only have MI50 (gfx906) cards, compile
for gfx906 only. Reduces build time and avoids upstream code bugs.

Hermes added 1 commit 2026-05-10 03:15:29 +00:00

fix: add /usr/local/go/bin to ROCm PATH (was overridden) 0f7b22c19b

ENV PATH for ROCm overwrote the previous PATH that included Go.
Without Go in PATH, 'go build' fails with 'go: not found'.
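
The failure mode is easy to reproduce outside Docker. The sketch below imitates two successive PATH assignments in plain shell; the directory values are illustrative (/opt/rocm/bin is assumed to be the ROCm binary directory) and has_dir is a hypothetical helper.

```bash
# Imitate two Dockerfile-style PATH assignments in plain shell.
base_path="/usr/bin:/bin"

# First layer adds Go.
path_with_go="/usr/local/go/bin:$base_path"

# Buggy second layer: rebuilt from the base value, so the Go entry is lost.
path_buggy="/opt/rocm/bin:$base_path"

# Fixed second layer: names the Go directory explicitly alongside ROCm.
path_fixed="/opt/rocm/bin:/usr/local/go/bin:$base_path"

# Succeeds if directory $2 is an entry of the colon-separated list $1.
has_dir() {
  case ":$1:" in
    *":$2:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Here `has_dir "$path_buggy" /usr/local/go/bin` fails while the same check on `$path_fixed` succeeds, which mirrors the 'go: not found' symptom and its fix.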

Hermes added 1 commit 2026-05-10 03:49:15 +00:00

fix: use ROCm 6 preset with HIP language detection for proper GPU kernel compilation 32df546550

- Use --preset 'ROCm 6' for HIP build step (enables enable_language(HIP))
- Remove /opt/rocm from PATH for CPU build to prevent check_language(HIP)
- Add CMAKE_PREFIX_PATH=/opt/rocm so find_package(hip) finds hip-config.cmake
- cmake --install --component HIP now works correctly with OLLAMA_RUNNER_DIR=rocm

Hermes added 1 commit 2026-05-10 03:50:28 +00:00

fix: pre-set CMAKE_HIP_COMPILER="" for CPU build to prevent HIP detection 208bfd4612

Hermes added 1 commit 2026-05-10 03:52:49 +00:00

fix: use cmake -B to override preset binaryDir, cmake --build/--install use explicit path bf2f17c5e2

Hermes added 1 commit 2026-05-10 04:10:41 +00:00

fix: copy /build/dist/lib/ollama/ (not /build/dist/lib/) to avoid extra nesting f31ae59717
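
The extra nesting comes from copying a directory's contents one level too high. The sketch below reproduces the effect with cp in a temporary directory as an analogue of the Dockerfile COPY step. The file name libggml-hip.so is a placeholder, and the demo assumes the runtime stage expects the libraries directly under the destination directory.

```bash
# Show how copying lib/ instead of lib/ollama/ adds a directory level.
# Hypothetical demo: uses a temp dir and a placeholder library file.
demo_nesting() {
  local root
  root="$(mktemp -d)"
  mkdir -p "$root/dist/lib/ollama" "$root/wrong" "$root/right"
  touch "$root/dist/lib/ollama/libggml-hip.so"

  cp -R "$root/dist/lib/." "$root/wrong/"          # contents of lib/: brings ollama/ along
  cp -R "$root/dist/lib/ollama/." "$root/right/"   # contents of lib/ollama/: flat

  (cd "$root" && find wrong right -type f | sort)
  rm -rf "$root"
}
```

Running demo_nesting lists right/libggml-hip.so and wrong/ollama/libggml-hip.so, so the second copy lands the library where a loader looking directly in the destination would find it.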

Hermes added 1 commit 2026-05-10 04:44:59 +00:00

fix: set CMAKE_INSTALL_PREFIX=/build/dist at configure time for CPU, match preset for HIP 9cc7edfb39

Hermes added 1 commit 2026-05-10 14:07:28 +00:00

fix: add ldflags for version, remove privileged, enable flash attention 6b82a26c25

Hermes referenced this pull request from gortium/infra

2026-05-11 00:35:05 +00:00

feat: update compose submodule with ai/hermes/ + ai/ollama/ structure #36

Hermes changed title from "refactor: split ai/ into hermes/ and ollama/ directories" to "refactor: split ai/ into hermes/ and ollama/ directories with gfx906 build"

2026-05-11 00:35:06 +00:00

Hermes referenced this pull request from gortium/infra

2026-05-11 00:50:00 +00:00

feat: update compose submodule with ai/hermes/ + ai/ollama/ structure #37

gortium merged commit f8c2f864de into master

2026-05-11 01:26:12 +00:00

gortium deleted branch feat/ollama-gfx906

2026-05-11 01:26:12 +00:00

gortium referenced this issue from a commit

2026-05-11 01:26:12 +00:00

Merge pull request 'refactor: split ai/ into hermes/ and ollama/ directories with gfx906 build' (#19) from feat/ollama-gfx906 into master
