refactor: split ai/ into hermes/ and ollama/ directories with gfx906 build #19

gortium merged 21 commits from feat/ollama-gfx906 into master 2026-05-11 01:26:12 +00:00

Author SHA1 Message Date
6b82a26c25 fix: add ldflags for version, remove privileged, enable flash attention 2026-05-10 10:07:25 -04:00
9cc7edfb39 fix: set CMAKE_INSTALL_PREFIX=/build/dist at configure time for CPU, match preset for HIP 2026-05-10 00:44:56 -04:00
f31ae59717 fix: copy /build/dist/lib/ollama/ (not /build/dist/lib/) to avoid extra nesting 2026-05-10 00:10:39 -04:00
bf2f17c5e2 fix: use cmake -B to override preset binaryDir, cmake --build/--install use explicit path 2026-05-09 23:52:46 -04:00
208bfd4612 fix: pre-set CMAKE_HIP_COMPILER="" for CPU build to prevent HIP detection 2026-05-09 23:50:26 -04:00
32df546550 fix: use ROCm 6 preset with HIP language detection for proper GPU kernel compilation
- Use --preset 'ROCm 6' for HIP build step (enables enable_language(HIP))
- Remove /opt/rocm from PATH for CPU build to prevent check_language(HIP)
- Add CMAKE_PREFIX_PATH=/opt/rocm so find_package(hip) finds hip-config.cmake
- cmake --install --component HIP now works correctly with OLLAMA_RUNNER_DIR=rocm
2026-05-09 23:49:08 -04:00
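Read together with the later fixes listed above (explicit `-B` build directories, `CMAKE_INSTALL_PREFIX=/build/dist`, an empty `CMAKE_HIP_COMPILER` for the CPU pass), the two-pass build could be sketched roughly as follows. This is a dry-run illustration, not the Dockerfile's actual contents: the directory names `build/cpu` and `build/hip` and the `run` helper are hypothetical, and the exact flag combination is an assumption pieced together from the commit messages.

```shell
#!/bin/sh
# Dry-run sketch: prints the cmake commands instead of running them.
# Set DRY_RUN=0 inside the ollama source tree to execute them for real.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical build directory names (not taken from the PR).
cpu_dir="build/cpu"
hip_dir="build/hip"

# CPU pass: default compiler, HIP detection suppressed via an empty
# CMAKE_HIP_COMPILER, install prefix fixed at configure time.
build_cpu() {
    run cmake -B "$cpu_dir" -DCMAKE_HIP_COMPILER= -DCMAKE_INSTALL_PREFIX=/build/dist
    run cmake --build "$cpu_dir"
    run cmake --install "$cpu_dir"
}

# HIP pass: the 'ROCm 6' preset with its binaryDir overridden by -B,
# a single GPU target, and only the HIP component installed.
build_hip() {
    run cmake --preset "ROCm 6" -B "$hip_dir" -DAMDGPU_TARGETS=gfx906 -DCMAKE_PREFIX_PATH=/opt/rocm
    run cmake --build "$hip_dir"
    run cmake --install "$hip_dir" --component HIP
}

build_cpu
build_hip
```

Keeping the two passes in separate build directories means the CPU configure step never sees the HIP toolchain, which is the separation the commits in this PR converge on.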
0f7b22c19b fix: add /usr/local/go/bin to ROCm PATH (was overridden)
ENV PATH for ROCm overwrote the previous PATH that included Go.
Without Go in PATH, 'go build' fails with 'go: not found'.
2026-05-09 23:15:26 -04:00
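The failure mode in this commit is easy to reproduce outside Docker. The sketch below contrasts overwriting a PATH with extending it; the concrete values (`/opt/rocm/bin`, the base PATH) and the `has_dir` helper are assumptions for illustration, since the Dockerfile's real `ENV PATH` lines are not shown here.

```shell
#!/bin/sh
# Hypothetical PATH that an earlier build step set up, including Go.
base_path="/usr/local/go/bin:/usr/bin:/bin"

# Overwriting: the new value replaces everything, so the Go directory is lost.
overwritten_path="/opt/rocm/bin"

# Extending: ROCm is prepended and the earlier entries survive.
extended_path="/opt/rocm/bin:${base_path}"

# has_dir PATH_VALUE DIR: succeeds if DIR is one of the colon-separated entries.
has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_dir "$overwritten_path" /usr/local/go/bin; then
    overwritten_has_go=yes
else
    overwritten_has_go=no
fi

if has_dir "$extended_path" /usr/local/go/bin; then
    extended_has_go=yes
else
    extended_has_go=no
fi

echo "overwritten: $overwritten_has_go, extended: $extended_has_go"
```

With the overwritten value, a shell looking up `go` has no Go directory to search, which matches the 'go: not found' error quoted in the commit message.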
fc777e2de2 fix: target only gfx906 for HIP compilation
gfx940/gfx1010/gfx1030/gfx1100 cause C++ narrowing errors in ollama's
mma.cuh with hipcc. Since we only have MI50 (gfx906) cards, compile
for gfx906 only. Reduces build time and avoids upstream code bugs.
2026-05-09 23:07:39 -04:00
d52f18b0fa fix: remove gfx1200 target (not supported by ROCm 6.1 clang 17)
ROCm 6.1's AMD clang 17 doesn't recognize gfx1200 architecture
(introduced in ROCm 6.2+). Caused compilation failure on all .cu files.
2026-05-09 22:53:11 -04:00
0d87fb2556 fix: build CPU and HIP backends separately
CPU backends compiled with GCC (fixes AVX intrinsic errors from hipcc).
HIP backend compiled with hipcc (legacy mode skips enable_language(HIP)).
Go binary built with GCC for CGo linking.
This avoids both CMAKE_HIP_COMPILER rejection and CXX=hipcc CPU failures.
2026-05-09 22:51:13 -04:00
f6bc2b07a7 fix: remove CC=clang from Go build step (clang is not in the image's PATH)
ROCm 6.1 image doesn't have clang/clang++ in PATH (only amdclang++).
GCC is the default and works fine for CGo linking.
2026-05-09 22:41:18 -04:00
aa6bbe87bf fix: correct AMDGPU_TARGETS to include gfx940/gfx1010/gfx1200
Targets were corrupted during previous patch iterations and contained
gfx908/gfx90a from the CMake preset instead of gfx940/gfx1010/gfx1200.
2026-05-09 22:40:40 -04:00
0c612d9731 fix: remove unsupported AMDGPU_TARGETS (gfx1200) for ROCm 6.1
ROCm 6.1's AMD clang 17 doesn't support gfx1200 (RDNA4).
Use only targets supported by ROCm 6.1: gfx906, gfx908, gfx90a, gfx1030, gfx1100.
2026-05-09 22:30:21 -04:00
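Dropping one unsupported architecture from a semicolon-separated target list can be scripted rather than edited by hand. A minimal sketch, assuming the list is held in a shell variable; the variable names and the starting list are hypothetical and not taken from the Dockerfile.

```shell
#!/bin/sh
# Hypothetical starting list that still contains the unsupported target.
targets="gfx906;gfx908;gfx90a;gfx1030;gfx1100;gfx1200"
unsupported="gfx1200"

# Split on ';', drop the exact-match entry, and join with ';' again.
filtered=$(printf '%s' "$targets" | tr ';' '\n' | grep -v -x "$unsupported" | paste -sd ';' -)

echo "$filtered"
```

The exact-match flag (`-x`) matters here: without it, filtering out a name such as `gfx90` would also remove `gfx906`, `gfx908`, and `gfx90a`.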
5b210fe624 fix: use ROCm amdclang++ as HIP compiler, keep GCC for CPU code
Setting CXX=hipcc caused compilation failures on CPU backends (AVX
intrinsics). Now using GCC for CPU, ROCm's amdclang++ for HIP only.
2026-05-09 22:29:10 -04:00
d8b77c97c3 fix: use CXX=hipcc legacy mode for HIP CMake build
CMake 3.31 refuses CMAKE_HIP_COMPILER=hipcc with 'not supported'.
Using CXX=hipcc triggers the legacy HIP detection path which works.
2026-05-09 22:20:44 -04:00
a3d0fa0072 fix: set CMAKE_HIP_COMPILER explicitly for ROCm 6.1 HIP detection 2026-05-09 22:19:50 -04:00
956d76f14d fix: add unzip dependency for ninja installation 2026-05-09 22:14:53 -04:00
c6d2f5918f fix: use ollama v0.23.2 native CMake build system for ROCm 6 + gfx906
The old Dockerfile used the deprecated llama.cpp/ subdirectory approach
which doesn't exist in ollama v0.23.2. Now using the official CMake
presets (ROCm 6 preset) with AMDGPU_TARGETS including gfx906:xnack-.
2026-05-09 22:13:47 -04:00
f023dc1ee4 fix: update ollama Dockerfile to v0.23.2 with proper ROCm 6.1 + gfx906 build
- Update OLLAMA_VERSION from v0.13.5 to v0.23.2
- Fix package: golang -> golang-go
- Add ENV HCC_AMDGPU_TARGET=gfx906 and HSA_ENABLE_SDMA=0
- Set proper ENTRYPOINT + CMD
2026-05-09 21:56:14 -04:00
d34a4d3647 refactor: move hermes files into ai/hermes/ subdirectory
- ai/Dockerfile -> ai/hermes/Dockerfile
- ai/fix-permissions.sh -> ai/hermes/fix-permissions.sh
- ai/patch_tts_tool.py -> ai/hermes/patch_tts_tool.py
- ai/compose.yml: update hermes build context to ./hermes
- ollama stays at ai/ollama/Dockerfile
2026-05-09 21:50:04 -04:00
ef58155897 feat: add custom ollama image with ROCm 6.1 + gfx906 support
- Add ollama/Dockerfile that builds ollama from source with AMDGPU_TARGETS=gfx906
- Uses ROCm 6.1 (rocm/dev-ubuntu-22.04:6.1.2-complete) for MI50 support
- Builds llama.cpp runner with HIPBLAS for gfx906 architecture
- Updates compose.yml to build from this Dockerfile instead of pulling ollama/ollama:latest
2026-05-09 21:18:37 -04:00