refactor: split ai/ into hermes/ and ollama/ directories with gfx906 build #19

Merged
gortium merged 21 commits from feat/ollama-gfx906 into master 2026-05-11 01:26:12 +00:00

Summary

Organize the ai/ stack into subdirectories and add the custom ollama build for MI50 (gfx906).

Changes

Directory structure

  • ai/hermes/ — Hermes Dockerfile + support files (moved from ai/)
  • ai/ollama/ — Custom ollama ROCm 6.1 build with gfx906 support

ai/hermes/Dockerfile

  • Uses nousresearch/hermes-agent:latest as base, adds extras on top
  • Extended: poppler-utils, imagemagick, texlive, QEMU, emacs-nox
  • Piper TTS via uv pip (piper-tts + sounddevice + numpy)
  • Piper Ryan high-quality voice download
  • Permission fix scripts
  • Removed: Node.js, npm, Playwright/Chromium, web UI build, font packages

ai/ollama/Dockerfile (NEW)

  • Custom build with ROCm 6.1 (rocm/dev-ubuntu-22.04:6.1.2-complete)
  • Three-stage build: CPU backends (GCC), HIP backend (ROCm 6 preset), Go binary
  • Targets gfx906:xnack- for AMD MI50 GPUs
  • Key fixes from the build adventure:
    • --preset 'ROCm 6' for proper GPU kernel code generation (not CC=hipcc)
    • -DCMAKE_HIP_COMPILER="" for CPU step (prevents HIP poisoning GCC)
    • -DCMAKE_PREFIX_PATH="/opt/rocm" for find_package(hip)
    • -DCMAKE_INSTALL_PREFIX=/build/dist at configure time
    • COPY /build/dist/lib/ollama/ (correct nesting, not the double-bug)
  • Runtime: ubuntu:24.04 with ROCm 6.1 libs, proper LD_LIBRARY_PATH
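
The fixes above can be read as one build sequence. The sketch below is a hedged reconstruction, not the Dockerfile from this PR: the `run` wrapper, the function names, the `CPU` preset and component names, the `build` directory passed to `cmake --install`, and the Go output path are assumptions for illustration. Only the flags quoted in the list above come from the PR.

```shell
# Dry-run wrapper: with DRY_RUN=1 the command is printed (unquoted) instead of run.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: CPU backends with GCC. The empty CMAKE_HIP_COMPILER is the PR's
# guard against HIP leaking into the CPU configure. Preset name is assumed.
build_cpu() {
  run cmake --preset CPU \
    -DCMAKE_HIP_COMPILER="" \
    -DCMAKE_INSTALL_PREFIX=/build/dist &&
  run cmake --build --preset CPU --parallel &&
  run cmake --install build --component CPU
}

# Step 2: HIP backend through the 'ROCm 6' preset, restricted to the MI50 target.
build_hip() {
  run cmake --preset 'ROCm 6' \
    -DAMDGPU_TARGETS=gfx906:xnack- \
    -DCMAKE_PREFIX_PATH=/opt/rocm \
    -DCMAKE_INSTALL_PREFIX=/build/dist &&
  run cmake --build --preset 'ROCm 6' --parallel &&
  run cmake --install build --component HIP
}

# Step 3: Go binary, linked through CGo with the default GCC.
# The output path is an assumption.
build_go() {
  run go build -o /build/dist/bin/ollama .
}
```

Running `DRY_RUN=1 build_hip` prints the three HIP commands without touching the tree, which makes it easy to compare them against the real Dockerfile before committing to a long rebuild.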

ai/compose.yml

  • hermes: build: ./hermes (was ./)
  • ollama: build: ./ollama, image tag rocm-gfx906
  • Removed privileged: true from ollama (devices + group_add is sufficient)
  • OLLAMA_FLASH_ATTENTION=1 (was 0; enabling it is critical for MI50 throughput)
  • ROCm GPU env vars on hermes service for faster-whisper STT
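
Dropping `privileged: true` means the container relies on the GPU device nodes plus group membership. A small host-side check, as a sketch: it assumes the usual ROCm layout of `/dev/kfd` and `/dev/dri` with access granted through the `video` and `render` groups, which this PR does not spell out, and the function names are hypothetical.

```shell
# Extract the numeric GID (third field) from one "name:passwd:gid:members" line.
gid_from_group_line() {
  printf '%s\n' "$1" | cut -d: -f3
}

# Look up a group's GID on this host; fails if the group does not exist.
host_gid() {
  line=$(getent group "$1") || return 1
  gid_from_group_line "$line"
}

# Report whether the device nodes the container needs are present on the host.
check_rocm_devices() {
  status=0
  for dev in /dev/kfd /dev/dri; do
    if [ -e "$dev" ]; then
      echo "ok $dev"
    else
      echo "missing $dev"
      status=1
    fi
  done
  return "$status"
}
```

`host_gid render` prints the numeric GID to list under `group_add` when the group name does not exist inside the image, and `check_rocm_devices` fails early if the host is missing a device node.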

Cleanup

  • Removed: Node.js/npm, Chromium, Playwright, fonts, gosu
  • Removed: npm install, Playwright install, web dashboard build steps
  • patch_tts_tool.py kept for now (will be removed later when fork handles it)

Verification

Build test

cd ai
docker compose build ollama --no-cache
docker compose build hermes

ollama GPU check

docker compose run --rm ollama ollama run llama3.2:3b-instruct-q4_K_M "hello"
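
A successful "hello" does not by itself show that the model ran on the GPU, since ollama can fall back to CPU. One way to check, sketched under assumptions: the ollama service is already up, the model is still loaded, and `ollama ps` reports placement in its PROCESSOR column. The helper name is hypothetical.

```shell
# Succeeds if `ollama ps` output on stdin lists a model with GPU placement.
model_on_gpu() {
  # Skip the header row, then look for "GPU" in the remaining rows.
  tail -n +2 | grep -q 'GPU'
}
```

Usage, assuming the service is named ollama as in compose.yml: `docker compose exec ollama ollama ps | model_on_gpu && echo "running on GPU"`. A row showing a CPU/GPU split also matches, so this confirms GPU use, not full offload.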

Related

  • Replaces the loose assets/ollama/Dockerfile that was in the infra repo
  • Next: add Gitea Actions CI for both Dockerfiles
Hermes added 2 commits 2026-05-10 01:45:14 +00:00
- Add ollama/Dockerfile that builds ollama from source with AMDGPU_TARGETS=gfx906
- Uses ROCm 6.1 (rocm/dev-ubuntu-22.04:6.1.2-complete) for MI50 support
- Builds llama.cpp runner with HIPBLAS for gfx906 architecture
- Updates compose.yml to build from this Dockerfile instead of pulling ollama/ollama:latest
- Rename ai/ to hermes/ (Hermes agent service)
- Move ai/ollama/ to ollama/ (top-level, Ollama ROCm/gfx906 service)
- Add ollama/compose.yml for standalone ollama deployment
- Update hermes/compose.yml: remove ollama service (now in ollama/)
- Update Makefile SERVICES list: ai -> hermes, add ollama
- Bump ollama version from v0.13.5 to v0.23.2
Hermes force-pushed feat/ollama-gfx906 from 781d98046b to d34a4d3647 2026-05-10 01:50:07 +00:00 Compare
Hermes added 1 commit 2026-05-10 01:56:16 +00:00
- Update OLLAMA_VERSION from v0.13.5 to v0.23.2
- Fix package: golang -> golang-go
- Add ENV HCC_AMDGPU_TARGET=gfx906 and HSA_ENABLE_SDMA=0
- Set proper ENTRYPOINT + CMD
Hermes added 1 commit 2026-05-10 02:13:50 +00:00
The old Dockerfile used the deprecated llama.cpp/ subdirectory approach
which doesn't exist in ollama v0.23.2. Now using the official CMake
presets (ROCm 6 preset) with AMDGPU_TARGETS including gfx906:xnack-.
Hermes added 1 commit 2026-05-10 02:14:57 +00:00
Hermes added 1 commit 2026-05-10 02:19:54 +00:00
Hermes added 1 commit 2026-05-10 02:20:47 +00:00
CMake 3.31 refuses CMAKE_HIP_COMPILER=hipcc with 'not supported'.
Using CXX=hipcc triggers the legacy HIP detection path which works.
Hermes added 1 commit 2026-05-10 02:29:13 +00:00
Setting CXX=hipcc caused compilation failures on CPU backends (AVX
intrinsics). Now using GCC for CPU, ROCm's amdclang++ for HIP only.
Hermes added 1 commit 2026-05-10 02:30:24 +00:00
ROCm 6.1's AMD clang 17 doesn't support gfx1200 (RDNA4).
Use only targets supported by ROCm 6.1: gfx906, gfx908, gfx90a, gfx1030, gfx1100.
Hermes added 1 commit 2026-05-10 02:40:43 +00:00
Targets were corrupted during previous patch iterations, contained
gfx908/gfx90a from the CMake preset instead of gfx940/gfx1010/gfx1200.
Hermes added 1 commit 2026-05-10 02:41:20 +00:00
ROCm 6.1 image doesn't have clang/clang++ in PATH (only amdclang++).
GCC is the default and works fine for CGo linking.
Hermes added 1 commit 2026-05-10 02:51:16 +00:00
CPU backends compiled with GCC (fixes AVX intrinsic errors from hipcc).
HIP backend compiled with hipcc (legacy mode skips enable_language(HIP)).
Go binary built with GCC for CGo linking.
This avoids both CMAKE_HIP_COMPILER rejection and CXX=hipcc CPU failures.
Hermes added 1 commit 2026-05-10 02:53:13 +00:00
ROCm 6.1's AMD clang 17 doesn't recognize gfx1200 architecture
(introduced in ROCm 6.2+). Caused compilation failure on all .cu files.
Hermes added 1 commit 2026-05-10 03:07:42 +00:00
gfx940/gfx1010/gfx1030/gfx1100 cause C++ narrowing errors in ollama's
mma.cuh with hipcc. Since we only have MI50 (gfx906) cards, compile
for gfx906 only. Reduces build time and avoids upstream code bugs.
Hermes added 1 commit 2026-05-10 03:15:29 +00:00
ENV PATH for ROCm overwrote the previous PATH that included Go.
Without Go in PATH, 'go build' fails with 'go: not found'.
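
The failure mode here is generic: an `ENV PATH=...` that lists only the ROCm directories replaces the inherited value, while `ENV PATH=/opt/rocm/bin:$PATH` extends it. The shell sketch below illustrates the same prepend-without-clobber idea; `prepend_path` and the example directories are hypothetical and not taken from the Dockerfile.

```shell
# Return a PATH-style string with dir in front, leaving existing entries intact.
prepend_path() {
  dir=$1
  current=$2
  if [ -z "$current" ]; then
    printf '%s\n' "$dir"
    return
  fi
  case ":$current:" in
    *":$dir:"*) printf '%s\n' "$current" ;;   # already present, keep order
    *) printf '%s\n' "$dir:$current" ;;       # prepend, keep the rest
  esac
}
```

For example, `PATH=$(prepend_path /opt/rocm/bin "$PATH")` adds the ROCm tools while keeping whatever directory provides `go` on the search path.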
Hermes added 1 commit 2026-05-10 03:49:15 +00:00
- Use --preset 'ROCm 6' for HIP build step (enables enable_language(HIP))
- Remove /opt/rocm from PATH for CPU build to prevent check_language(HIP)
- Add CMAKE_PREFIX_PATH=/opt/rocm so find_package(hip) finds hip-config.cmake
- cmake --install --component HIP now works correctly with OLLAMA_RUNNER_DIR=rocm
Hermes added 1 commit 2026-05-10 03:50:28 +00:00
Hermes added 1 commit 2026-05-10 03:52:49 +00:00
Hermes added 1 commit 2026-05-10 04:10:41 +00:00
Hermes added 1 commit 2026-05-10 04:44:59 +00:00
Hermes added 1 commit 2026-05-10 14:07:28 +00:00
Hermes changed title from refactor: split ai/ into hermes/ and ollama/ directories to refactor: split ai/ into hermes/ and ollama/ directories with gfx906 build 2026-05-11 00:35:06 +00:00
gortium merged commit f8c2f864de into master 2026-05-11 01:26:12 +00:00
gortium deleted branch feat/ollama-gfx906 2026-05-11 01:26:12 +00:00
Reference: gortium/compose#19