infra/docs/nix-container-install.md
Hermes 2c5dc3d58d feat: comprehensive NixOS deployment infrastructure
- docs/nix-container-install.md: 474-line guide covering Determinate Systems
  installer, vanilla Nix, NixOS base image, architecture notes (x86_64 vs aarch64),
  cross-compilation, container considerations, troubleshooting
- scripts/deploy.sh: 286-line deployment script with pre-flight checks, git sync,
  build validation (nix build --no-link), 5 actions (switch/boot/test/build/
  dry-activate), color-coded logging, env-based configurability
- scripts/deploy-ssh-config: SSH config for all 3 hosts with dual users for
  lazyworkhorse, reverse tunnel for cyt-pi, uConsole placeholder, Gitea entry

Full replacements of stub files from previous commit.
2026-05-20 14:29:38 -04:00


# Nix Installation for Hermes Agent Container
This guide covers several approaches for installing Nix in the Hermes Agent Docker
container to enable remote NixOS deployment via `nixos-rebuild`. It covers both
x86_64 (lazyworkhorse) and aarch64 (cyt-pi, uConsole) architectures.
## Table of Contents
1. [Why Nix in a Container?](#why-nix-in-a-container)
2. [Prerequisites](#prerequisites)
3. [Installation Methods](#installation-methods)
   - [Method A: Determinate Systems Installer](#method-a-determinate-systems-installer-recommended)
   - [Method B: Vanilla Nix Installer](#method-b-vanilla-nix-installer)
   - [Method C: NixOS-Based Container Image](#method-c-nixos-based-container-image)
4. [Architecture-Specific Notes](#architecture-specific-notes)
   - [x86_64 (lazyworkhorse)](#x86_64-lazyworkhorse)
   - [aarch64 (cyt-pi, uConsole)](#aarch64-cyt-pi-uconsole)
   - [Cross-Compilation](#cross-compilation)
5. [Post-Install Configuration](#post-install-configuration)
6. [Verification](#verification)
7. [Container-Specific Considerations](#container-specific-considerations)
   - [Persistence](#persistence)
   - [Disk Space](#disk-space)
   - [Security](#security)
   - [Resource Constraints](#resource-constraints)
8. [Integration with deploy.sh](#integration-with-deploysh)
9. [Troubleshooting](#troubleshooting)
10. [References](#references)
---
## Why Nix in a Container?
The Hermes Agent container runs on an Ubuntu/Debian base. To deploy NixOS
configurations to remote hosts, we need:
- `nix` — the Nix package manager (for building configurations)
- `nixos-rebuild` — the NixOS deployment tool
- Access to the infra repo with flake configuration
Installing Nix inside the container avoids:
- Host-level Nix installation on the Docker host
- Cross-container volume mounts of /nix/store
- Dependencies on the host's Nix daemon (which may be a different version)
## Prerequisites
- Docker host running Linux (x86_64 and/or aarch64)
- Container base: Debian/Ubuntu (apt-based)
- At least 1-2 GB of additional disk space for the Nix store (a full NixOS system closure can need several GB more)
- Network access to cache.nixos.org (or a local binary cache)
- Git access to the infra repository
## Installation Methods
### Method A: Determinate Systems Installer (Recommended)
The Determinate Systems installer is the recommended approach. It is non-interactive,
sets up flakes by default, and handles multi-user installation cleanly.
**Dockerfile additions:**
```dockerfile
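# Install prerequisites (ca-certificates is needed for HTTPS downloads)
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    curl \
    xz-utils \
    && rm -rf /var/lib/apt/lists/*
# Download and run the Determinate Systems installer (non-interactive).
# `--init none` skips systemd setup, which is unavailable during `docker build`;
# `sandbox = false` is set because build containers usually cannot create the
# namespaces the Nix sandbox needs.
RUN curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix \
    -o /tmp/nix-install.sh \
    && sh /tmp/nix-install.sh install linux \
    --extra-conf "sandbox = false" \
    --init none \
    --no-confirm \
    && rm /tmp/nix-install.sh
# Flakes are already enabled by this installer; the explicit setting is kept for clarity
RUN mkdir -p /root/.config/nix \
    && echo 'experimental-features = nix-command flakes' > /root/.config/nix/nix.conf
# Add Nix to PATH for all users
ENV PATH="/nix/var/nix/profiles/default/bin:$PATH"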
```
**Pros:**
- Fully non-interactive (--no-confirm)
- Enables flakes automatically
- Sets up the multi-user daemon when an init system is available
- Auto-selects correct architecture
- Handles upgrades gracefully
**Cons:**
- Downloads ~100 MB installer
- Daemon setup expects systemd; in a container without an init system, pass `--init none`, after which only root can run Nix
- Daemon mode may conflict with container exit semantics
**Container runtime additions:**
To keep the Nix store across container recreation, or to keep build output out of the image layers, you may need:
```dockerfile
# Ensure /nix is a volume for persistence
VOLUME /nix
# Or mount tmpfs for ephemeral builds:
# docker run --tmpfs /nix:exec,size=4G ...
```
### Method B: Vanilla Nix Installer
The official Nix installer in single-user mode (`--no-daemon`) is lighter but requires manual flake setup.
**Dockerfile additions:**
```dockerfile
# Install prerequisites (ca-certificates is needed for HTTPS downloads)
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    curl \
    sudo \
    xz-utils \
    && rm -rf /var/lib/apt/lists/*
# Root runs Nix here without a nixbld group, so clear build-users-group first;
# otherwise builds may fail while looking for that group
RUN mkdir -p /etc/nix \
    && echo 'build-users-group =' > /etc/nix/nix.conf
# Install Nix as root (single-user); -f makes curl fail on HTTP errors
RUN curl -fsSL https://nixos.org/nix/install -o /tmp/nix-install.sh \
    && sh /tmp/nix-install.sh --no-daemon \
    && rm /tmp/nix-install.sh
# Enable flakes
RUN mkdir -p /root/.config/nix \
    && echo 'experimental-features = nix-command flakes' > /root/.config/nix/nix.conf
# Source Nix in shell
RUN echo '. /root/.nix-profile/etc/profile.d/nix.sh' >> /root/.bashrc
ENV PATH="/root/.nix-profile/bin:$PATH"
```
**Pros:**
- Smaller installer
- No daemon needed (single-user mode)
- Works in containers without systemd
- Simpler container lifecycle
**Cons:**
- Manual flake configuration required
- Single-user only (no multi-user isolation)
- PATH must be set manually
- No automatic garbage collection
### Method C: NixOS-Based Container Image
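For maximum isolation, use the official `nixos/nix` image (Nix on a minimal base, not a full NixOS system) for the build stage.
**Multi-stage Dockerfile:**
```dockerfile
# Build stage: official Nix image (pin a version tag instead of `latest` for reproducibility)
FROM nixos/nix:latest AS builder
COPY infra /infra
WORKDIR /infra
# Build the configuration once. Flakes may not be enabled by default in this
# image; passing the flag is harmless if they already are.
RUN nix --extra-experimental-features 'nix-command flakes' \
    build '.#nixosConfigurations.lazyworkhorse.config.system.build.toplevel'
# Final stage: Hermes container
FROM ubuntu:22.04
# Copy the whole Nix store (Nix itself plus the built closure)
COPY --from=builder /nix /nix
# ... rest of Hermes setup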
```
**Pros:**
- Purely declarative build environment
- No installation at runtime
- Easy to pin Nix version
- Good for CI/CD pipelines
**Cons:**
- Requires multi-stage Docker build
- Larger initial image build
- Harder to update Nix version at runtime
- Overkill if Nix is only needed for `nixos-rebuild`
---
## Architecture-Specific Notes
### x86_64 (lazyworkhorse)
The Hermes container likely runs on x86_64 hardware for the primary server.
Nix will download x86_64 binaries from cache.nixos.org by default.
**No special configuration needed** — the standard installer works out of the box.
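If derivations in the configuration require a specific x86_64 microarchitecture
level, the builder has to advertise it as a system feature. `extra-platforms` is
not the place for this: it only accepts system names such as `aarch64-linux`.
```bash
# Advertise microarchitecture build support (optional; list only levels the CPU really supports)
echo 'extra-system-features = gccarch-x86-64-v2 gccarch-x86-64-v3' >> /root/.config/nix/nix.conf
```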
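Nix names platforms with system strings such as `x86_64-linux` and `aarch64-linux`, while `uname -m` reports only the machine type. A minimal sketch of the mapping for the two architectures this guide covers; the function name `nix_system_for` is hypothetical and not part of Nix or `deploy.sh`:

```bash
#!/bin/sh
# Map a `uname -m` machine type to the Nix system string.
# Only the two architectures used in this guide are handled.
nix_system_for() {
    case "$1" in
        x86_64|amd64)  echo "x86_64-linux" ;;
        aarch64|arm64) echo "aarch64-linux" ;;
        *) echo "unsupported machine type: $1" >&2; return 1 ;;
    esac
}

# Report the system string for the machine this runs on.
nix_system_for "$(uname -m)" || echo "no mapping for this machine"
```

Comparing this value with the target's system tells you whether a build is native or needs a remote builder or emulation.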
### aarch64 (cyt-pi, uConsole)
When building for aarch64 targets from an x86_64 container, you need either:
1. Remote builder (aarch64 machine does the build), or
2. QEMU-based emulation (slower but self-contained), or
3. Build directly on the aarch64 target using `--build-host`
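Option 3 relies on the `--target-host` and `--build-host` flags of `nixos-rebuild`. The sketch below only prints the command line so it can be inspected before running; the function name `rebuild_cmd` and the SSH addresses are placeholders, and `deploy.sh` may assemble its command differently:

```bash
#!/bin/sh
# Print a nixos-rebuild command line for a remote deployment.
# Usage: rebuild_cmd <action> <flake-host> <target-ssh> [build-ssh]
rebuild_cmd() {
    action=$1
    host=$2
    target=$3
    build=${4:-}
    cmd="nixos-rebuild $action --flake .#$host --target-host $target"
    # With a build host, the closure is built there instead of locally.
    if [ -n "$build" ]; then
        cmd="$cmd --build-host $build"
    fi
    printf '%s\n' "$cmd"
}

# Build on the Pi itself and activate there (placeholder address).
rebuild_cmd switch cyt-pi root@cyt-pi.example root@cyt-pi.example
```

Using the same machine as build host and target host avoids both emulation and copying a large closure from the container.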
**For QEMU emulation in the container:**
```dockerfile
# Enable binfmt for aarch64 emulation
RUN apt-get update && apt-get install -y --no-install-recommends \
qemu-user-static \
binfmt-support \
&& rm -rf /var/lib/apt/lists/*
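# binfmt registration lives in the host kernel, not in the image, so it cannot
# be done with RUN at build time. Register at runtime from a privileged container:
#   update-binfmts --enable qemu-aarch64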
```
**Container runtime (for QEMU):**
```bash
docker run --privileged --rm ... hermes-agent
# Or with specific capability:
docker run --cap-add=SYS_ADMIN --security-opt seccomp=unconfined ... hermes-agent
```
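Before starting an emulated build, it helps to confirm that the kernel actually has an aarch64 handler registered. A small sketch, assuming the handler is named `qemu-aarch64` (the name used by `qemu-user-static`); the function name and the optional directory argument are hypothetical and exist mainly to make the check testable:

```bash
#!/bin/sh
# Succeed if a binfmt_misc handler exists and is enabled.
# Usage: binfmt_registered <entry> [binfmt_misc-dir]
binfmt_registered() {
    entry=$1
    dir=${2:-/proc/sys/fs/binfmt_misc}
    # A registered handler is a file whose first line reads "enabled".
    [ -r "$dir/$entry" ] && [ "$(head -n 1 "$dir/$entry")" = "enabled" ]
}

if binfmt_registered qemu-aarch64; then
    echo "aarch64 emulation is registered"
else
    echo "aarch64 emulation is not registered"
fi
```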
### Cross-Compilation
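The settings below do not cross-compile in the strict sense: they let Nix on an
x86_64 machine accept `aarch64-linux` builds and run them through binfmt
emulation. On a NixOS build host, set this in the system configuration:
```nix
# NixOS module (configuration.nix or a module in the flake), not nix.conf
{
  # Registers QEMU for aarch64 and adds aarch64-linux to extra-platforms
  boot.binfmt.emulatedSystems = [ "aarch64-linux" ];
}
```
The Debian/Ubuntu container has no NixOS module system, so set the platform in
`nix.conf` and register binfmt on the Docker host as described in the aarch64
section:
```
extra-platforms = aarch64-linux
```
True cross-compilation (building aarch64 outputs with an x86_64 toolchain, for
example through `pkgsCross`) avoids emulation, but its outputs differ from the
native aarch64 packages on cache.nixos.org, so most packages are rebuilt from source.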
---
## Post-Install Configuration
### nix.conf for Container Usage
Recommended `/root/.config/nix/nix.conf`:
```ini
experimental-features = nix-command flakes
substituters = https://cache.nixos.org/
trusted-users = root
max-jobs = auto
cores = 0
sandbox = false
```
Note: `sandbox = false` is needed inside containers that lack full sandbox
support. It removes build isolation, which is acceptable in a single-tenant
container that only builds trusted configurations (see [Security](#security)).
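To confirm which value a setting ended up with after several Dockerfile layers have written to the file, a small reader can help. This is a sketch with a hypothetical function name; it reads a single file only and does not follow `include` directives or merge `extra-` prefixed settings the way Nix does:

```bash
#!/bin/sh
# Print the last value assigned to <key> in a nix.conf-style file.
# Usage: nix_conf_get <key> <file>; exits non-zero if the key is absent.
nix_conf_get() {
    awk -v key="$1" -F '=' '
        /^[[:space:]]*#/ { next }            # skip comment lines
        /=/ {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim the key
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                val = v
                found = 1
            }
        }
        END { if (found) print val; else exit 1 }
    ' "$2"
}

conf=/root/.config/nix/nix.conf
if [ -r "$conf" ]; then
    echo "sandbox = $(nix_conf_get sandbox "$conf" || echo '(not set)')"
fi
```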
### PATH Setup
Add to your Dockerfile:
```dockerfile
ENV PATH="/nix/var/nix/profiles/default/bin:/root/.nix-profile/bin:${PATH}"
```
### Shell Integration
```dockerfile
RUN echo 'source /root/.nix-profile/etc/profile.d/nix.sh' >> /root/.bashrc
```
---
## Verification
After installation, verify with:
```bash
# Check Nix is available
nix --version
# Check nixos-rebuild is on PATH (the Nix installers do not provide it on
# Debian/Ubuntu; one option is `nix profile install nixpkgs#nixos-rebuild`)
command -v nixos-rebuild
# Verify flakes are enabled
nix flake --help
# Test a build (must be in infra repo)
cd /opt/data/infra
nix build --no-link '.#nixosConfigurations.lazyworkhorse.config.system.build.toplevel'
# Print the system type Nix builds for by default
nix eval --impure --expr 'builtins.currentSystem'
```
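The same checks can be scripted as a pre-flight step. A minimal sketch that reports which required commands are missing from `PATH`; the function name is hypothetical and the tool list is an assumption based on what this guide uses:

```bash
#!/bin/sh
# Print the name of every given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools nix nixos-rebuild git ssh)
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names are joined on one line.
    echo "missing tools:" $missing
else
    echo "all required tools found"
fi
```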
---
## Container-Specific Considerations
### Persistence
The `/nix` directory should be a Docker volume to avoid re-downloading
packages every time the container is recreated:
```yaml
# docker-compose.yml (the service name is an example)
services:
  hermes-agent:
    volumes:
      - nix-store:/nix
volumes:
  nix-store:
```
Without persistence, every new container starts from the store baked into the
image and has to download everything else again (~500 MB - 2 GB depending on
packages used).
### Disk Space
The Nix store grows over time as old generations accumulate. Set up garbage
collection:
```bash
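# Manual GC
nix store gc
# Remove generations older than 30 days, then collect garbage
nix-collect-garbage --delete-older-than 30d
# Cap how much one run frees (value in bytes, here 10 GiB), for example from a cron job:
# nix store gc --max 10737418240
```
Nix can also collect garbage automatically during builds: when free space drops
below `min-free`, it deletes unused store paths until `max-free` bytes are
available. In Docker:
```dockerfile
# Trigger automatic GC when less than 5 GiB is free; append so existing settings survive
RUN mkdir -p /etc/nix && \
    echo 'min-free = 5368709120' >> /etc/nix/nix.conf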
```
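The `min-free` value is given in bytes, which is easy to mistype. A short sketch of the arithmetic, plus a check of the space currently available; the function names are hypothetical:

```bash
#!/bin/sh
# Convert GiB to bytes for settings such as min-free.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Free bytes on the filesystem holding <path> (df -Pk reports 1024-byte blocks).
free_bytes() {
    kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    echo $(( kb * 1024 ))
}

gib_to_bytes 5    # prints 5368709120, the value used for min-free above

# Fall back to / when /nix does not exist yet.
store=/nix
[ -d "$store" ] || store=/
echo "free on $store: $(free_bytes "$store") bytes"
```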
### Security
Running Nix in a container introduces some security considerations:
1. **Sandboxing:** `sandbox = false` disables build isolation. In a multi-tenant
   container, this means Nix builds can affect the container filesystem.
   **Mitigation:** Only build configs you trust (your own infra repo).
2. **Network access:** The container needs outbound access to cache.nixos.org.
   If using a restricted network, set up a local binary cache:
   ```nix
   substituters = https://cache.nixos.org/ https://nix-cache.internal/
   ```
3. **Privileged mode:** QEMU emulation for aarch64 builds may need `--privileged`
   or `--security-opt seccomp=unconfined`. This reduces container isolation.
   **Mitigation:** Use remote builders or build natively on the target.
4. **Supply chain:** Nix derivations pin exact inputs via hashes. Verify
   flake.lock is committed and reviewed.
### Resource Constraints
Nix builds can be memory and CPU intensive:
```nix
# Limit build parallelism in nix.conf
max-jobs = 2
cores = 4
# Or set per-build:
# nix build --max-jobs 2 --cores 4
```
For containers with limited memory (< 2 GB), consider:
- Building on the target host instead (`--build-host`)
- Using the deploy script's `build` action separately
---
## Integration with deploy.sh
The deployment script at `scripts/deploy.sh` expects:
1. **Nix installed** with flakes enabled
2. **SSH key** at `/opt/data/home/.ssh/id_hermes_gitea` (or via SSH_KEY env)
3. **Infra repo** cloned at the script's parent directory
4. **Network access** to:
   - `code.lazyworkhorse.net:2222` (Gitea for git operations)
   - Target hosts via SSH (see deploy-ssh-config)
   - `cache.nixos.org` or a local substitute
Typical usage from Hermes:
```bash
# Full deployment
./scripts/deploy.sh lazyworkhorse master switch
# Build-only check (no remote deployment)
./scripts/deploy.sh cyt-pi master build
# Dry run
./scripts/deploy.sh uConsole feat/test dry-activate
# Override SSH key
SSH_KEY=/opt/data/home/.ssh/my-custom-key ./scripts/deploy.sh lazyworkhorse
```
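The third argument is one of the five `nixos-rebuild` actions the script supports. A sketch of how such an argument could be validated before anything touches a remote host; this is illustrative only and not taken from `deploy.sh`:

```bash
#!/bin/sh
# Accept only the five actions listed for deploy.sh.
valid_action() {
    case "$1" in
        switch|boot|test|build|dry-activate) return 0 ;;
        *) return 1 ;;
    esac
}

for a in switch dry-activate rollback; do
    if valid_action "$a"; then
        echo "$a: ok"
    else
        echo "$a: rejected"
    fi
done
```

Rejecting unknown actions early keeps a typo from reaching `nixos-rebuild` on a production host.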
---
## Troubleshooting
### "nix: command not found"
- Ensure Nix is installed and PATH is set:
```bash
export PATH="/nix/var/nix/profiles/default/bin:/root/.nix-profile/bin:$PATH"
```
- Check installation: `ls -la /nix/` should exist
- Re-source profile: `. /root/.nix-profile/etc/profile.d/nix.sh`
### "error: unable to download ... cache.nixos.org"
- Check network connectivity: `curl -I https://cache.nixos.org/nix-cache-info` (`ping` is often missing from containers, and ICMP may be blocked)
- Check DNS resolution from inside the container
- If behind a proxy, set `http_proxy` / `https_proxy` environment variables
### "sandbox: cannot run build in sandbox"
- Add `sandbox = false` to nix.conf
- Or run container with `--privileged` or `--security-opt seccomp=unconfined`
### "aarch64-linux builds fail on x86_64"
- QEMU binfmt not registered. Check: `ls /proc/sys/fs/binfmt_misc/`
- Rebuild QEMU registration: `docker run --privileged --rm tonistiigi/binfmt --install all`
- Or use `--build-host` to build on the target directly
### "nixos-rebuild fails with SSH errors"
- Verify SSH key exists and has correct permissions:
```bash
ls -la /opt/data/home/.ssh/id_hermes_gitea
chmod 600 /opt/data/home/.ssh/id_hermes_gitea
```
- Test SSH manually: `ssh -p 2424 -i /opt/data/home/.ssh/id_hermes_gitea ai-worker@lazyworkhorse.net`
- Check target host is reachable: `nc -zv lazyworkhorse.net 2424`
### "git fetch fails from Gitea"
- Verify GIT_SSH_COMMAND is set: `echo $GIT_SSH_COMMAND`
- Test git SSH: `ssh -T git@code.lazyworkhorse.net -p 2222`
- Check the infra repo remote: `git remote -v`
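If `GIT_SSH_COMMAND` turns out to be empty, it can be set by hand for a test fetch. The sketch below builds a value from the key path and Gitea port mentioned in this guide; the function name is hypothetical, and the exact options `deploy.sh` uses are an assumption:

```bash
#!/bin/sh
# Build an ssh command line for git to use.
# Usage: git_ssh_command <key-path> [port]
git_ssh_command() {
    key=$1
    port=${2:-2222}
    # IdentitiesOnly stops ssh from offering other keys from an agent first.
    printf 'ssh -i %s -p %s -o IdentitiesOnly=yes\n' "$key" "$port"
}

GIT_SSH_COMMAND="$(git_ssh_command /opt/data/home/.ssh/id_hermes_gitea)"
export GIT_SSH_COMMAND
echo "$GIT_SSH_COMMAND"
```

With the variable exported, `git fetch` in the infra repo uses that key and port without changes to `~/.ssh/config`.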
---
## References
- [Determinate Systems Nix Installer](https://github.com/DeterminateSystems/nix-installer)
- [Nix Reference Manual: Installation](https://nixos.org/manual/nix/stable/installation/)
- [NixOS Wiki: Flakes](https://nixos.wiki/wiki/Flakes)
- [NixOS Wiki: nixos-rebuild](https://nixos.wiki/wiki/Nixos-rebuild)
- [NixOS Wiki: Cross Compilation](https://nixos.wiki/wiki/Cross_Compilation)
- [Multi-arch Docker with QEMU](https://github.com/multiarch/qemu-user-static)