Files
infra/docs/nix-container-install.md
Hermes 2c5dc3d58d feat: comprehensive NixOS deployment infrastructure
- docs/nix-container-install.md: 474-line guide covering Determinate Systems
  installer, vanilla Nix, NixOS base image, architecture notes (x86_64 vs aarch64),
  cross-compilation, container considerations, troubleshooting
- scripts/deploy.sh: 286-line deployment script with pre-flight checks, git sync,
  build validation (nix build --no-link), 5 actions (switch/boot/test/build/
  dry-activate), color-coded logging, env-based configurability
- scripts/deploy-ssh-config: SSH config for all 3 hosts with dual users for
  lazyworkhorse, reverse tunnel for cyt-pi, uConsole placeholder, Gitea entry

Full replacements of stub files from previous commit.
2026-05-20 14:29:38 -04:00

13 KiB

Nix Installation for Hermes Agent Container

This guide covers several approaches for installing Nix in the Hermes Agent Docker container to enable remote NixOS deployment via nixos-rebuild. It covers both x86_64 (lazyworkhorse) and aarch64 (cyt-pi, uConsole) architectures.

Table of Contents

  1. Why Nix in a Container?
  2. Prerequisites
  3. Installation Methods
  4. Architecture-Specific Notes
  5. Post-Install Configuration
  6. Verification
  7. Container-Specific Considerations
  8. Integration with deploy.sh
  9. Troubleshooting
  10. References

Why Nix in a Container?

The Hermes Agent container runs on an Ubuntu/Debian base. To deploy NixOS configurations to remote hosts, we need:

  • nix — the Nix package manager (for building configurations)
  • nixos-rebuild — the NixOS deployment tool
  • Access to the infra repo with flake configuration

Installing Nix inside the container avoids:

  • Host-level Nix installation on the Docker host
  • Cross-container volume mounts of /nix/store
  • Dependencies on the host's Nix daemon (which may be a different version)

Prerequisites

  • Docker host running Linux (x86_64 and/or aarch64)
  • Container base: Debian/Ubuntu (apt-based)
  • 1-2 GB additional disk space for Nix store
  • Network access to cache.nixos.org (or a local binary cache)
  • Git access to the infra repository

Installation Methods

The Determinate Systems installer is the recommended approach. It is non-interactive, sets up flakes by default, and handles multi-user installation cleanly.

Dockerfile additions:

# Install Nix (Determinate Systems installer)
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    xz-utils \
    && rm -rf /var/lib/apt/lists/*

# Download and run Nix installer (non-interactive)
RUN curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix \
    -o /tmp/nix-install.sh \
    && chmod +x /tmp/nix-install.sh \
    && sh /tmp/nix-install.sh install --no-confirm \
    && rm /tmp/nix-install.sh

# Configure Nix for flakes
RUN mkdir -p /root/.config/nix \
    && echo 'experimental-features = nix-command flakes' > /root/.config/nix/nix.conf

# Add Nix to PATH for all users
ENV PATH="/nix/var/nix/profiles/default/bin:$PATH"

Pros:

  • Fully non-interactive (--no-confirm)
  • Enables flakes automatically
  • Sets up multi-user daemon
  • Auto-selects correct architecture
  • Handles upgrades gracefully

Cons:

  • Downloads ~100 MB installer
  • Requires systemd in container (works with --privileged or cgroupv2)
  • Daemon mode may conflict with container exit semantics

Container runtime additions:

For the Nix daemon to work properly inside a container, you may need:

# Ensure /nix is a volume for persistence
VOLUME /nix

# Or mount tmpfs for ephemeral builds:
# docker run --tmpfs /nix:exec,size=4G ...

Method B: Vanilla Nix Installer

The official single-user Nix installer is lighter but requires manual flake setup.

Dockerfile additions:

# Install Nix (single-user, official installer)
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    sudo \
    xz-utils \
    && rm -rf /var/lib/apt/lists/*

# Install Nix as root (single-user)
RUN curl -L https://nixos.org/nix/install -o /tmp/nix-install.sh \
    && chmod +x /tmp/nix-install.sh \
    && sh /tmp/nix-install.sh --no-daemon \
    && rm /tmp/nix-install.sh

# Enable flakes
RUN mkdir -p /root/.config/nix \
    && echo 'experimental-features = nix-command flakes' > /root/.config/nix/nix.conf

# Source Nix in shell
RUN echo '. /root/.nix-profile/etc/profile.d/nix.sh' >> /root/.bashrc
ENV PATH="/root/.nix-profile/bin:$PATH"

Pros:

  • Smaller installer
  • No daemon needed (single-user mode)
  • Works in containers without systemd
  • Simpler container lifecycle

Cons:

  • Manual flake configuration required
  • Single-user only (no multi-user isolation)
  • PATH must be set manually
  • No automatic garbage collection

Method C: NixOS-Based Container Image

For maximum isolation, use an official NixOS base image for the build stage.

Multi-stage Dockerfile:

# Build stage: NixOS builder
FROM nixos/nix:latest AS builder

COPY infra /infra
WORKDIR /infra

# Build the configuration once
RUN nix build '.#nixosConfigurations.lazyworkhorse.config.system.build.toplevel'

# Final stage: Hermes container
FROM ubuntu:22.04

# Copy the Nix closure and binary cache
COPY --from=builder /nix /nix

# ... rest of Hermes setup

Pros:

  • Purely declarative build environment
  • No installation at runtime
  • Easy to pin Nix version
  • Good for CI/CD pipelines

Cons:

  • Requires multi-stage Docker build
  • Larger initial image build
  • Harder to update Nix version at runtime
  • Overkill if Nix is only needed for nixos-rebuild

Architecture-Specific Notes

x86_64 (lazyworkhorse)

The Hermes container likely runs on x86_64 hardware for the primary server. Nix will download x86_64 binaries from cache.nixos.org by default.

No special configuration needed — the standard installer works out of the box.

If the container is running on an AMD Ryzen/EPYC or Intel Xeon, consider:

# Enable CPU-specific optimizations (optional)
echo 'extra-platforms = x86_64-v1 x86_64-v2 x86_64-v3' >> /root/.config/nix/nix.conf

aarch64 (cyt-pi, uConsole)

When building for aarch64 targets from an x86_64 container, you need either:

  1. Remote builder (aarch64 machine does the build), or
  2. QEMU-based emulation (slower but self-contained), or
  3. Build directly on the aarch64 target using --build-host

For QEMU emulation in the container:

# Enable binfmt for aarch64 emulation
RUN apt-get update && apt-get install -y --no-install-recommends \
    qemu-user-static \
    binfmt-support \
    && rm -rf /var/lib/apt/lists/*

# Register aarch64 binfmt
RUN update-binfmts --enable qemu-aarch64

Container runtime (for QEMU):

docker run --privileged --rm ... hermes-agent
# Or with specific capability:
docker run --cap-add=SYS_ADMIN --security-opt seccomp=unconfined ... hermes-agent

Cross-Compilation

For native cross-compilation (without emulation), add to your Nix configuration:

# In your flake.nix or nix.conf
{
  nix.settings.extra-platforms = [ "aarch64-linux" "x86_64-linux" ];
  nix.settings.extra-sandbox-paths = [ ];
  boot.binfmt.emulatedSystems = [ "aarch64-linux" ];
}

Or in nix.conf:

extra-platforms = x86_64-linux aarch64-linux
extra-sandbox-paths =

Post-Install Configuration

nix.conf for Container Usage

Recommended /root/.config/nix/nix.conf:

experimental-features = nix-command flakes
substituters = https://cache.nixos.org/
trusted-users = root
max-jobs = auto
cores = 0
sandbox = false

Note: sandbox = false is needed inside containers that lack full sandbox support. This is safe in a single-tenant container environment.

PATH Setup

Add to your Dockerfile:

ENV PATH="/nix/var/nix/profiles/default/bin:/root/.nix-profile/bin:${PATH}"

Shell Integration

RUN echo 'source /root/.nix-profile/etc/profile.d/nix.sh' >> /root/.bashrc

Verification

After installation, verify with:

# Check Nix is available
nix --version

# Check nixos-rebuild
nixos-rebuild --help | head -3

# Verify flakes are enabled
nix flake --help

# Test a build (must be in infra repo)
cd /opt/data/infra
nix build --no-link '.#nixosConfigurations.lazyworkhorse.config.system.build.toplevel'

# Check available systems
nix eval --impure --expr 'builtins.currentSystem'

Container-Specific Considerations

Persistence

The /nix directory should be a Docker volume to avoid re-downloading packages on every container restart:

# docker-compose.yml
volumes:
  - nix-store:/nix

volumes:
  nix-store:

Without persistence, every container restart requires re-downloading the entire Nix store (~500 MB - 2 GB depending on packages used).

Disk Space

The Nix store grows over time as old generations accumulate. Set up garbage collection:

# Manual GC
nix store gc

# Remove old generations
nix-collect-garbage --delete-older-than 30d

# Automatic GC (in nix.conf)
# Currently not supported in nix.conf, but you can run a cron job:
# nix store gc --max 10G

In Docker, limit store growth with:

# Configure max store size
RUN mkdir -p /etc/nix && \
    echo 'min-free = 5368709120' > /etc/nix/nix.conf  # Keep 5GB free

Security

Running Nix in a container introduces some security considerations:

  1. Sandboxing: sandbox = false disables build isolation. In a multi-tenant container, this means Nix builds can affect the container filesystem. Mitigation: Only build configs you trust (your own infra repo).

  2. Network access: The container needs outbound access to cache.nixos.org. If using a restricted network, set up a local binary cache:

    substituters = https://cache.nixos.org/ https://nix-cache.internal/
    
  3. Privileged mode: QEMU emulation for aarch64 builds may need --privileged or --security-opt seccomp=unconfined. This reduces container isolation. Mitigation: Use remote builders or build natively on the target.

  4. Supply chain: Nix derivations pin exact inputs via hashes. Verify flake.lock is committed and reviewed.

Resource Constraints

Nix builds can be memory and CPU intensive:

# Limit build parallelism in nix.conf
max-jobs = 2
cores = 4

# Or set per-build:
# nix build --max-jobs 2 --cores 4

For containers with limited memory (< 2 GB), consider:

  • Building on the target host instead (--build-host)
  • Using the deploy script's build action separately

Integration with deploy.sh

The deployment script at scripts/deploy.sh expects:

  1. Nix installed with flakes enabled
  2. SSH key at /opt/data/home/.ssh/id_hermes_gitea (or via SSH_KEY env)
  3. Infra repo cloned at the script's parent directory
  4. Network access to:
    • code.lazyworkhorse.net:2222 (Gitea for git operations)
    • Target hosts via SSH (see deploy-ssh-config)
    • cache.nixos.org or a local substitute

Typical usage from Hermes:

# Full deployment
./scripts/deploy.sh lazyworkhorse master switch

# Build-only check (no remote deployment)
./scripts/deploy.sh cyt-pi master build

# Dry run
./scripts/deploy.sh uConsole feat/test dry-activate

# Override SSH key
SSH_KEY=/opt/data/home/.ssh/my-custom-key ./deploy.sh lazyworkhorse

Troubleshooting

"nix: command not found"

  • Ensure Nix is installed and PATH is set:
    export PATH="/nix/var/nix/profiles/default/bin:/root/.nix-profile/bin:$PATH"
    
  • Check installation: ls -la /nix/ should exist
  • Re-source profile: . /root/.nix-profile/etc/profile.d/nix.sh

"error: unable to download ... cache.nixos.org"

  • Check network connectivity: ping cache.nixos.org
  • Check DNS resolution from inside the container
  • If behind a proxy, set http_proxy / https_proxy environment variables

"sandbox: cannot run build in sandbox"

  • Add sandbox = false to nix.conf
  • Or run container with --privileged or --security-opt seccomp=unconfined

"aarch64-linux builds fail on x86_64"

  • QEMU binfmt not registered. Check: ls /proc/sys/fs/binfmt_misc/
  • Rebuild QEMU registration: docker run --privileged --rm tonistiigi/binfmt --install all
  • Or use --build-host to build on the target directly

"nixos-rebuild fails with SSH errors"

  • Verify SSH key exists and has correct permissions:
    ls -la /opt/data/home/.ssh/id_hermes_gitea
    chmod 600 /opt/data/home/.ssh/id_hermes_gitea
    
  • Test SSH manually: ssh -p 2424 -i /opt/data/home/.ssh/id_hermes_gitea ai-worker@lazyworkhorse.net
  • Check target host is reachable: nc -zv lazyworkhorse.net 2424

"git fetch fails from Gitea"

  • Verify GIT_SSH_COMMAND is set: echo $GIT_SSH_COMMAND
  • Test git SSH: ssh -T git@code.lazyworkhorse.net -p 2222
  • Check the infra repo remote: git remote -v

References