feat: add Hyperspace Pods NixOS module and enable on lazyworkhorse

Hyperspace Pods let multiple machines pool their GPUs into one private P2P mesh AI cluster. Models are split across all connected GPUs, e.g. two machines with 16GB VRAM each can run Qwen 3.5 32B together.

Changes:
- Add modules/nixos/services/hyperspace.nix, a NixOS module that:
  * Fetches the Hyperspace CLI binary (v5.45.30) via fetchurl
  * Sets up a systemd service for the agent
  * Opens firewall ports (libp2p 4001, chain 30301, API 8080)
  * Configures GPU passthrough for AMD MI50 (ROCm)
- Register module in flake.nix for lazyworkhorse
- Enable hyperspace service on lazyworkhorse (ai-worker user, port 8080)

Usage after deployment:
  hyperspace pod create "tdnde-lab"                 # create pod
  hyperspace pod invite                             # share invite with cyt-pi
  curl http://localhost:8080/v1/chat/completions    # OpenAI API

See skill: nixos-hyperspace-pods
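The module marks its package as `unfree`, so a host that enables it may need an explicit allowance before the build evaluates. A minimal sketch of a host fragment, assuming the flake does not already set `nixpkgs.config.allowUnfree` and that `pkgs` is instantiated by the NixOS module system; the option names under `services.hyperspace` are the ones defined in this change, while the choice of the `inference` profile is only illustrative:

```nix
{ lib, ... }:
{
  # Assumption: unfree packages are not already allowed globally.
  # Allow only the package whose pname is "hyperspace".
  nixpkgs.config.allowUnfreePredicate = pkg:
    builtins.elem (lib.getName pkg) [ "hyperspace" ];

  services.hyperspace = {
    enable = true;
    user = "ai-worker";
    apiPort = 8080;
    profile = "inference";  # illustrative; the commit itself uses "auto"
    openFirewall = true;    # opens 4001, 30301 and the API port
  };
}
```

Scoping the predicate to one package name keeps the rest of the system on the default free-only policy.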
chore: update compose submodule to traefik logging branch
2026-05-02 15:36:15 +00:00 · 2026-05-02 15:30:28 +00:00 · 2026-05-02 15:30:28 +00:00
17 changed files with 346 additions and 1432 deletions
--- a/.gitea/workflows/build-nixos.yml
+++ b/.gitea/workflows/build-nixos.yml
@@ -1,52 +0,0 @@
-name: Build and test NixOS config
-on:
-  pull_request:
-    branches: [ master ]
-    paths:
-      - '**.nix'
-      - 'flake.lock'
-      - 'secrets/**'
-      - 'hosts/**'
-      - 'modules/**'
-  push:
-    branches: [ master ]
-    paths:
-      - '**.nix'
-      - 'flake.lock'
-      - 'secrets/**'
-      - 'hosts/**'
-      - 'modules/**'
-
-jobs:
-  build:
-    runs-on: nixos-builder
-    steps:
-      - name: Checkout
-        run: |
-          git clone -b "${{ github.head_ref || github.ref_name }}" \
-            https://gitea:${{ secrets.GITHUB_TOKEN }}@code.lazyworkhorse.net/gortium/infra.git .
-          git log --oneline -3
-
-      - name: Build NixOS config
-        run: |
-          nix --version
-          nh os build .#lazyworkhorse 2>&1
-
-      - name: Run integration tests (staging VM)
-        run: |
-          echo "==> Running integration tests on staging VM..."
-          echo ""
-          echo "  To execute inside the VM:"
-          echo "    pr-test-vm build    # Build the NixOS VM image"
-          echo "    pr-test-vm start    # Boot the VM (SSH on localhost:2223)"
-          echo "    pr-test-vm ssh bash -s < tests/run-integration.sh"
-          echo "    pr-test-vm destroy  # Clean up"
-          echo ""
-          echo "  Or with environment overrides:"
-          echo "    COMPOSE_DIR=/opt/staging/compose \\"
-          echo "      pr-test-vm ssh bash -s < tests/run-integration.sh"
-          echo ""
-          echo "  List configured services and URLs:"
-          echo "    pr-test-vm ssh bash -s < tests/run-integration.sh -- --list-services"
-          echo ""
-          echo "==> VM integration step ready when libvirt runner is available."
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -13,7 +13,9 @@ None
 - ✅ **Phase 1: Foundation Setup** - Establish core NixOS configuration with flakes
 - ✅ **Phase 2: Docker Service Integration** - Integrate Docker Compose services
 - ✅ **Phase 3: AI Assistant Integration** - Enable AI-assisted infrastructure management
-- [ ] **Phase 4: Internet Access & MCP** - MCP server for web access
+- ✅ **Phase 4: Internet Access & MCP** - MCP server for web access
+- 🚨 **Security Hardening** - CRITICAL: Firewall, fail2ban, SSH hardening (PR #28)
+- [ ] **Phase 5: TAK Server** - Research, implementation, and validation


 ## Phase Details
@@ -133,8 +135,25 @@ Plans:

 ## Progress

+**Merge Priority Order** (CRITICAL - merge in this order):
+
+| Priority | PR | Description | Status | Notes |
+|----------|-----|-------------|--------|-------|
+| 🚨 1 | #28 | **Security hardening** (firewall, fail2ban, SSH) | Open | **MERGE FIRST** - protects all other services |
+| 2 | #22 | Matrix bridge dependency fix | Open | Blocks Hermes functionality |
+| 3 | #21 | Backup network creation fix | Open | Infrastructure fix |
+| 4 | #25 | Hermes voice GPU support | Open | Feature enhancement |
+| 5 | #24 | uConsole CM5 host | Open | New hardware support |
+| 6 | #23 | NixOS deployment infrastructure | Open | Deployment tooling |
+| 7 | #1 | AI worker restricted access | Open | Legacy PR (superseded by hardening) |
+
 **Execution Order:**
-Phases execute in numeric order: 1 → 2 → 3 → 4 → 5 → 6 → 7
+Phases execute in numeric order: 1 → 2 → 3 → 4 → Security → 5 → 6 → 7
+
+**Merge vs Phase Execution:**
+- PRs can merge independently (no strict phase ordering for merges)
+- **EXCEPTION:** Security hardening (#28) must merge before any new services are exposed
+- After security merge, deploy with: `nh os switch --flake .#lazyworkhorse`

 | Phase | Milestone | Plans Complete | Status | Completed |
 |-------|-----------|----------------|--------|-----------|
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -5,7 +5,6 @@ This document outlines the development conventions for this NixOS-based infrastr
 ## Build & Deployment

 - **Build/Deploy:** Use `nixos-rebuild switch --flake .#<hostname>` to build and deploy the configuration for a specific host.
-- **CRITICAL — Validate before pushing:** Always `nix build --no-link '.#nixosConfigurations.<hostname>.config.system.build.toplevel'` (or `nh os build`) and confirm it succeeds before pushing any changes. Never push untested NixOS configs.
 - **Development Shell:** Activate the development environment with `nix develop`.

 ## Linting & Formatting
--- a/assets/compose
+++ b/assets/compose
--- a/assets/ollama/Dockerfile
+++ b/assets/ollama/Dockerfile
@@ -1,106 +0,0 @@
-# ollama-gfx906/Dockerfile
-#
-# Custom ollama image with ROCm 6.1 + gfx906 (MI50) support.
-# The official ollama/rocm image ships ROCm 7.2 which dropped gfx906.
-# This uses v0.23.2's native CMake build system with AMDGPU_TARGETS including gfx906.
-#
-# Build: docker build -t ollama/ollama:rocm-gfx906 ai/ollama
-
-FROM rocm/dev-ubuntu-22.04:6.1.2-complete AS builder
-
-# Build dependencies (CMake, Ninja, Go)
-ARG CMAKEVERSION=3.31.2
-ARG NINJAVERSION=1.12.1
-ARG GOLANG_VERSION=1.22.0
-
-RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
-    curl git ccache build-essential pkg-config unzip \
-    && rm -rf /var/lib/apt/lists/*
-
-# Install CMake from official binaries
-RUN curl -fsSL https://github.com/Kitware/CMake/releases/download/v${CMAKEVERSION}/cmake-${CMAKEVERSION}-linux-x86_64.tar.gz \
-    | tar xz -C /usr/local --strip-components 1
-
-# Install Ninja
-RUN curl -fsSL -o /tmp/ninja.zip \
-    https://github.com/ninja-build/ninja/releases/download/v${NINJAVERSION}/ninja-linux.zip \
-    && unzip /tmp/ninja.zip -d /usr/local/bin && rm /tmp/ninja.zip
-
-# Install Go
-RUN curl -fsSL https://go.dev/dl/go${GOLANG_VERSION}.linux-amd64.tar.gz \
-    | tar xz -C /usr/local
-ENV PATH=/usr/local/go/bin:$PATH
-
-ARG OLLAMA_VERSION=v0.23.2
-RUN git clone --depth 1 --branch ${OLLAMA_VERSION} https://github.com/ollama/ollama.git /build
-WORKDIR /build
-
-# ROCm paths
-ENV HIP_PATH=/opt/rocm
-ENV ROCM_PATH=/opt/rocm
-ENV CMAKE_GENERATOR=Ninja
-ENV LDFLAGS=-s
-
-# Step 1: Build CPU backends with GCC (no ROCm preset)
-# Pre-set CMAKE_HIP_COMPILER="" to prevent check_language(HIP) from
-# finding a HIP compiler (it searches /opt/rocm even without PATH).
-# Remove /opt/rocm from PATH to prevent find_program from finding hipcc.
-RUN mkdir -p build-cpu && \
-    PATH=/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
-    cmake -B build-cpu -DCMAKE_BUILD_TYPE=Release \
-      -DCMAKE_HIP_COMPILER="" \
-      -DCMAKE_INSTALL_PREFIX=/build/dist && \
-    cmake --build build-cpu --target ggml-cpu -- -l $(nproc) && \
-    cmake --install build-cpu --component CPU --strip && \
-    echo "=== CPU install ===" && \
-    (find /build/dist/lib/ollama -type f -o -type l 2>&1 | head -20 || echo "empty")
-
-# Step 2: Build HIP backend with ROCm preset + gfx906 target only
-# The ROCm 6 preset enables HIP language detection (enable_language(HIP))
-# which ensures GPU kernels are properly compiled for gfx906.
-# OLLAMA_RUNNER_DIR=rocm from the preset, so HIP goes to lib/ollama/rocm/
-# Need CMAKE_PREFIX_PATH so find_package(hip) finds hip-config.cmake
-# at /opt/rocm/lib/cmake/hip/hip-config.cmake.
-RUN mkdir -p build-hip && \
-    cmake -B build-hip \
-      --preset 'ROCm 6' \
-      -DAMDGPU_TARGETS="gfx906:xnack-" \
-      -DCMAKE_BUILD_TYPE=Release \
-      -DCMAKE_PREFIX_PATH="/opt/rocm" && \
-    cmake --build build-hip --target ggml-hip -- -l $(nproc) && \
-    cmake --install build-hip --component HIP --strip && \
-    echo "=== HIP install ===" && \
-    find /build/dist/lib/ollama -type f -o -type l | head -20
-
-# Step 3: Build Go binary (GCC for CGo linking)
-ENV CGO_ENABLED=1
-RUN go build -trimpath -ldflags="-X=github.com/ollama/ollama/version.Version=${OLLAMA_VERSION}" -o /build/dist/ollama .
-
-# ---------- Runtime image ----------
-FROM ubuntu:24.04
-
-RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
-    ca-certificates curl libstdc++6 libgomp1 libvulkan1 libopenblas0 \
-    && rm -rf /var/lib/apt/lists/*
-
-# Copy ROCm 6.1 runtime libraries
-# These are needed at runtime by ggml-hip via LD_LIBRARY_PATH
-COPY --from=builder /opt/rocm/lib/ /opt/rocm/lib/
-COPY --from=builder /opt/rocm/share/ /opt/rocm/share/
-
-# Copy ollama binary + all backends (CPU + HIP)
-# CPU install:  /build/dist/lib/ollama/libggml-*.so
-# HIP install:  /build/dist/lib/ollama/rocm/libggml-hip.so
-COPY --from=builder /build/dist/ollama /usr/bin/ollama
-COPY --from=builder /build/dist/lib/ollama/ /usr/lib/ollama/
-
-RUN ldconfig
-
-ENV LD_LIBRARY_PATH=/opt/rocm/lib:/usr/lib/ollama/rocm:/usr/lib/ollama
-ENV HSA_OVERRIDE_GFX_VERSION=9.0.6
-ENV HCC_AMDGPU_TARGET=gfx906
-ENV HSA_ENABLE_SDMA=0
-
-EXPOSE 11434
-ENTRYPOINT ["/bin/ollama"]
-CMD ["serve"]
--- a/flake.nix
+++ b/flake.nix
@@ -61,8 +61,7 @@
              ./modules/nixos/services/open_code_server.nix
              ./modules/nixos/services/ollama_init_custom_models.nix
              ./modules/nixos/services/openclaw_node.nix
-              ./modules/nixos/services/staging-vm.nix
-              ./modules/nixos/security/ai-worker-restricted.nix
+              ./modules/nixos/services/hyperspace.nix
              ./users/gortium.nix
              ./users/ai-worker.nix
            ];
--- a/hosts/lazyworkhorse/configuration.nix
+++ b/hosts/lazyworkhorse/configuration.nix
@@ -36,7 +36,7 @@
    "transparent_hugepage=always" # because mucho ram
  ];
  # 2. Load the specific drivers found by sensors-detect
-  boot.kernelModules = [ "nct6775" "lm96163" "iptable_nat" "iptable_filter" "kvm-intel" "kvm" ];
+  boot.kernelModules = [ "nct6775" "lm96163" ];
  # 3. Force the nct6775 driver to recognize the chip if it's stubborn
  boot.extraModprobeConfig = ''
    options nct6775 force_id=0xd280
@@ -49,26 +49,6 @@
  networking.networkmanager.enable = true;  # Easiest to use and most distros use this by default.
  networking.hostId = "deadbeef";

-  # WireGuard VPN client -- always up, connects to wg-easy server
-  # Create age-encrypted secrets before deploying (run on the host):
-  #   echo -n "<private_key>" | agenix -e secrets/wireguard_private_key.age
-  #   echo -n "<preshared_key>" | agenix -e secrets/wireguard_preshared_key.age
-  networking.wireguard.interfaces = {
-    wg0 = {
-      ips = [ "10.8.0.3/24" ];
-      privateKeyFile = config.age.secrets.wireguard_private_key.path;
-      peers = [
-        {
-          publicKey = "rY9zII3AOm8rog2rv02PyA3Bq7zdvTOGkZapfCV1DkE=";
-          presharedKeyFile = config.age.secrets.wireguard_preshared_key.path;
-          allowedIPs = [ "10.8.0.0/24" ];
-          endpoint = "vpn.lazyworkhorse.net:51820";
-          persistentKeepalive = 25;
-        }
-      ];
-    };
-  };
-
  # Set your time zone.
  time.timeZone = "America/Montreal";

@@ -178,7 +158,7 @@
    settings = {
      PasswordAuthentication = false;
      KbdInteractiveAuthentication = false;
-      # Additional hardening settings below in SERVER HARDENING section
+      PermitRootLogin = "prohibit-password"; 
    };
    hostKeys = [
      {
@@ -241,11 +221,6 @@
      path = self + "/assets/compose/homepage";
    };

-    vpn = {
-      path = self + "/assets/compose/vpn";
-      envFile = config.age.secrets.containers_env.path;
-    };
-
    # tak = {
    #   path = self + "/assets/compose/tak";
    # };
@@ -289,20 +264,6 @@
        mode = "0440";
        path = "/run/secrets/openclaw_gateway_token";
      };
-      wireguard_private_key = {
-        file = ../../secrets/wireguard_private_key.age;
-        owner = "root";
-        group = "root";
-        mode = "0400";
-        path = "/run/secrets/wireguard_private_key";
-      };
-      wireguard_preshared_key = {
-        file = ../../secrets/wireguard_preshared_key.age;
-        owner = "root";
-        group = "root";
-        mode = "0400";
-        path = "/run/secrets/wireguard_preshared_key";
-      };
    };
  };

@@ -316,6 +277,16 @@
    displayName = "lazyworkhorse-host";
  };

+  # Hyperspace Pods — P2P mesh AI cluster (combine GPUs across machines)
+  services.hyperspace = {
+    enable = true;
+    user = "ai-worker";
+    apiPort = 8080;
+    profile = "auto";
+    openFirewall = true;
+    extraArgs = [ "--verbose" ];
+  };
+
  # Public host ssh key (kept in sync with the private one)
  environment.etc."ssh/ssh_host_ed25519_key.pub".text =
    "${keys.hosts.lazyworkhorse.main}";
@@ -328,223 +299,25 @@
  # Mi50 config
  hardware.graphics = {
    enable = true;
-    enable32Bit = true;
+    enable32Bit = true; # Useful for some compatibility layers
    extraPackages = with pkgs; [
-      rocmPackages.clr.icd
+      rocmPackages.clr.icd # OpenCL/HIP runtime
    ];
  };
  nixpkgs.config.rocmTargets = [ "gfx906" ];
  environment.variables = {
+    # This "tricks" ROCm into supporting the MI50 if using newer versions
    HSA_OVERRIDE_GFX_VERSION = "9.0.6";
+    # Ensures the system sees both GPUs
    HIP_VISIBLE_DEVICES = "0,1";
  };

-  # KVM/libvirt for staging VM
-  services.stagingVm.enable = true;
-
-  # Open ports in the firewall.
+ # Open ports in the firewall.
  # networking.firewall.allowedTCPPorts = [ ... ];
  # networking.firewall.allowedUDPPorts = [ ... ];
  # Or disable the firewall altogether.
  # networking.firewall.enable = false;

-  # =============================================================================
-  # SERVER HARDENING - Firewall, Fail2ban, SSH, Kernel
-  # =============================================================================
-  
-  # Firewall - default deny, explicit allow
-  networking.firewall = {
-    # Enable firewall with default deny policy (NixOS firewall denies all by default)
-    enable = true;
-    allowPing = true;
-    
-    # Only essential ports exposed to internet
-    allowedTCPPorts = [
-      2424  # SSH (non-standard port)
-      2222  # Gitea (version control)
-      80    # HTTP (Traefik redirect)
-      443   # HTTPS (Traefik)
-      # 8000  # Portainer - REVIEW: internal only?
-      # 4242  # Coms - REVIEW: internal only?
-      # 5000  # TAK API - REVIEW: internal only?
-      # 8087  # TAK Connect - REVIEW: internal only?
-      # 8089  # TAK Management - REVIEW: internal only?
-    ];
-    
-    allowedUDPPorts = [
-      51820 # WireGuard VPN
-    ];
-    
-    # Rate limiting and attack prevention
-    extraCommands = ''
-      # 1. Wipe the INPUT chain clean at the start of every activation
-      iptables -F INPUT
-      
-      # Rate limit SSH connections (max 20 new connections per 60 seconds)
-      iptables -A INPUT -p tcp --dport 2424 -m state --state NEW -m recent --set
-      iptables -A INPUT -p tcp --dport 2424 -m state --state NEW -m recent --update --seconds 60 --hitcount 20 -j DROP
-      
-      # Rate limit HTTP/HTTPS (protects Traefik)
-      iptables -A INPUT -p tcp --dport 80 -m state --state NEW -m limit --limit 25/minute --limit-burst 100 -j ACCEPT
-      iptables -A INPUT -p tcp --dport 443 -m state --state NEW -m limit --limit 25/minute --limit-burst 100 -j ACCEPT
-      
-      # Drop invalid packets
-      iptables -A INPUT -m state --state INVALID -j DROP
-      
-      # Log dropped packets (rate limited)
-      iptables -A INPUT -m limit --limit 5/min -j LOG --log-prefix "IPTables-Dropped: " --log-level 4
-
-      # 3. CRITICAL: Re-link the NixOS default firewall chain
-      # Without this line, the 'allowedTCPPorts' in your Nix config will be ignored!
-      iptables -A INPUT -j nixos-fw
-    '';
-  };
-
-  # Fail2ban - automatic IP banning
-  services.fail2ban = {
-    enable = true;
-    maxretry = 3;
-    bantime = "1h";
-    banaction = "iptables-multiport";
-    
-    jails = {
-      # SSH brute force protection (uses systemd journal backend)
-      sshd = {
-        enabled = true;
-        settings = {
-          filter = "sshd";
-          port = "2424";
-          maxretry = 3;
-          bantime = "1h";
-        };
-      };
-      
-      # Recidive - ban repeat offenders for 1 week
-      recidive = {
-        enabled = true;
-        settings = {
-          filter = "recidive";
-          logpath = "/var/log/fail2ban.log";
-          bantime = "1w";
-          findtime = "1d";
-          maxretry = 3;
-        };
-      };
-      
-      # HTTP authentication failures (Traefik)
-      http-auth = {
-        enabled = true;
-        settings = {
-          filter = "traefik-auth";
-          port = "80,443";
-          logpath = "/var/log/traefik/access.log";
-          maxretry = 5;
-          bantime = "1h";
-        };
-      };
-      
-      # HTTP scanning/attacks (Traefik)
-      http-botsearch = {
-        enabled = true;
-        settings = {
-          filter = "traefik-botsearch";
-          port = "80,443";
-          logpath = "/var/log/traefik/access.log";
-          maxretry = 2;
-          bantime = "2h";
-        };
-      };
-    };
-  };
-  
-  # Custom fail2ban filters for Traefik
-  environment.etc."fail2ban/filter.d/traefik-auth.conf".text = ''
-    [Definition]
-    failregex = ^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE).*" (401|403) \d+.*$
-    ignoreregex =
-  '';
-  
-  environment.etc."fail2ban/filter.d/traefik-botsearch.conf".text = ''
-    [Definition]
-    failregex = ^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE).*" 404 \d+.*$
-                ^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE).*/(\.|wp-|php|admin|login|xmlrpc|\.env|\.git|\.aws|\.azure).*" \d+.*$
-    ignoreregex =
-  '';
-
-  # SSH hardening
-  services.openssh.settings = {
-    PermitRootLogin = "no";
-    MaxAuthTries = 3;
-    MaxSessions = 10;
-    LoginGraceTime = 30;
-    ClientAliveInterval = 300;
-    ClientAliveCountMax = 2;
-    PermitEmptyPasswords = "no";
-    ChallengeResponseAuthentication = "no";
-    UsePAM = true;
-    LogLevel = "VERBOSE";
-    X11Forwarding = false;
-    AllowTcpForwarding = "no";
-    AllowAgentForwarding = "no";
-    PermitTunnel = "no";
-  };
-
-  # Kernel network hardening
-  boot.kernel.sysctl = {
-    # IP Spoofing protection
-    "net.ipv4.conf.all.rp_filter" = 1;
-    "net.ipv4.conf.default.rp_filter" = 1;
-    
-    # Ignore ICMP broadcasts
-    "net.ipv4.icmp_echo_ignore_broadcasts" = 1;
-    
-    # Disable source routing
-    "net.ipv4.conf.all.accept_source_route" = 0;
-    "net.ipv4.conf.default.accept_source_route" = 0;
-    "net.ipv6.conf.all.accept_source_route" = 0;
-    "net.ipv6.conf.default.accept_source_route" = 0;
-    
-    # Disable redirects
-    "net.ipv4.conf.all.send_redirects" = 0;
-    "net.ipv4.conf.default.send_redirects" = 0;
-    
-    # SYN flood protection
-    "net.ipv4.tcp_syncookies" = 1;
-    "net.ipv4.tcp_max_syn_backlog" = 2048;
-    "net.ipv4.tcp_synack_retries" = 2;
-    "net.ipv4.tcp_syn_retries" = 5;
-    
-    # Log martian packets
-    "net.ipv4.conf.all.log_martians" = 1;
-    "net.ipv4.conf.default.log_martians" = 1;
-    
-    # Ignore redirects
-    "net.ipv4.conf.all.accept_redirects" = 0;
-    "net.ipv4.conf.default.accept_redirects" = 0;
-    "net.ipv4.conf.all.secure_redirects" = 0;
-    "net.ipv4.conf.default.secure_redirects" = 0;
-    "net.ipv6.conf.all.accept_redirects" = 0;
-    "net.ipv6.conf.default.accept_redirects" = 0;
-    
-    # Connection tuning
-    "net.core.somaxconn" = 4096;
-    "net.core.netdev_max_backlog" = 65536;
-    "net.ipv4.tcp_max_orphans" = 65536;
-    "net.ipv4.tcp_fin_timeout" = 15;
-    "net.ipv4.tcp_keepalive_time" = 300;
-    "net.ipv4.tcp_keepalive_probes" = 5;
-    "net.ipv4.tcp_keepalive_intvl" = 15;
-  };
-
-  # Audit logging
-  security.auditd.enable = true;
-  
-  # Fail2ban log directory
-  systemd.tmpfiles.rules = [
-    "d /var/log/fail2ban 0755 root root -"
-    "d /var/log/traefik 0755 root root -"
-  ];
-
  # Copy the NixOS configuration file and link it from the resulting system
  # (/run/current-system/configuration.nix). This is useful in case you
  # accidentally delete configuration.nix.
--- a/modules/nixos/security/README-ai-worker.md
+++ b/modules/nixos/security/README-ai-worker.md
@@ -1,105 +0,0 @@
-# AI Worker Restricted Access
-
-This module provides SSH access for the AI worker (hermes-agent) to run ollama benchmarks on the host.
-
-## Security Model
-
-The `ai-worker` user has:
-
-### Filesystem Access
-- **Home directory**: `/home/ai-worker` (standard user home)
-- **No bind mounts**: Cannot access `/home/gortium/infra` or other host files
-- **Cannot access**: Any files outside standard system paths
-
-### Sudo Access
-- **NONE**: ai-worker has no sudo privileges
-- Cannot run `nh`, `nixos-rebuild`, `nixpkgs-fmt`, or `nix` with elevated permissions
-
-### Docker Access
-- Member of `docker` group - can run `docker` and `docker exec` commands
-- Primary use: `docker exec ollama ollama ...` for benchmarking
-- Can run `docker exec --privileged ollama rocm-smi ...` for VRAM monitoring
-
-## Workflow: SSH + Docker Benchmarking
-
-The AI worker connects from the Hermes container to the host via SSH, runs ollama benchmarks, then returns to save results.
-
-### Example Workflow
-
-```bash
-# From Hermes container, SSH to host
-ssh -i /path/to/ssh/key ai-worker@host.docker.internal
-
-# On host, run ollama benchmarks via docker
-docker exec ollama ollama pull devstral-small-2:24b
-
-# Create test modelfile
-docker exec ollama bash -c 'cat <<EOF > /root/.ollama/test.modelfile
-FROM devstral-small-2:24b
-PARAMETER num_ctx 65536
-PARAMETER num_gpu 99
-PARAMETER flash_attn true
-EOF'
-
-# Create and test model
-docker exec ollama ollama create test-model -f /root/.ollama/test.modelfile
-docker exec ollama ollama run test-model "Write a Python async function"
-
-# Check VRAM usage
-docker exec --privileged ollama rocm-smi --showmeminfo vram
-
-# Cleanup
-docker exec ollama ollama rm test-model
-
-# Exit SSH, return to Hermes container
-exit
-
-# Save results in Hermes container
-# /opt/data/ai-optimizer/state.json
-# /opt/data/ai-optimizer/results.csv
-```
-
-## SSH Access
-
-Connect as:
-```bash
-ssh ai-worker@lazyworkhorse
-```
-
-The working directory will be `/home/ai-worker`. No infra repo access.
-
-## Verification
-
-Check ai-worker permissions:
-```bash
-# On the host, as root or gortium:
-sudo -u ai-worker sudo -l
-# Should show: no sudo access
-
-# Check docker group membership
-groups ai-worker
-# Should show: ai-worker docker
-```
-
-## Troubleshooting
-
-If ai-worker cannot run docker commands:
-```bash
-# Check docker group membership
-groups ai-worker
-
-# Verify ollama container is running
-docker ps | grep ollama
-
-# Test docker access
-sudo -u ai-worker docker exec ollama ollama list
-```
-
-If SSH connection fails:
-```bash
-# Check SSH key is authorized
-cat /home/ai-worker/.ssh/authorized_keys
-
-# Check SSH service
-systemctl status sshd
-```
--- a/modules/nixos/security/ai-worker-restricted.nix
+++ b/modules/nixos/security/ai-worker-restricted.nix
@@ -1,17 +0,0 @@
-{ config, pkgs, lib, ... }:
-
-with lib;
-
-{
-  options.services.aiWorkerAccess = mkOption {
-    type = types.bool;
-    default = false;
-    description = "Enable AI worker SSH access with docker group membership for ollama benchmarking";
-  };
-
-  config = mkIf config.services.aiWorkerAccess {
-    # ai-worker is member of docker group - can run docker commands via SSH
-    # No bind mounts, no sudo access - docker-only for ollama benchmarking
-    users.groups.docker.members = [ "ai-worker" ];
-  };
-}
--- a/modules/nixos/services/hyperspace.nix
+++ b/modules/nixos/services/hyperspace.nix
@@ -0,0 +1,235 @@
+{ config, lib, pkgs, ... }:
+
+with lib;
+
+let
+  cfg = config.services.hyperspace;
+
+  # Hyperspace CLI release from github.com/hyperspaceai/aios-cli
+  # The binary bundles Node.js runtime + llama.cpp + sidecars (~914MB)
+  # It auto-updates via `hyperspace update` post-install
+  hyperspacePkg = pkgs.stdenv.mkDerivation rec {
+    pname = "hyperspace";
+    version = cfg.release;
+
+    src = pkgs.fetchurl {
+      url = "https://github.com/hyperspaceai/aios-cli/releases/download/v${version}/aios-cli-x86_64-unknown-linux-gnu.tar.gz";
+      hash = "sha256-f6fJ8t3exqtYwUD5j+WvD+Hm0oN/Eef0X+R9Rj23dE0=";
+    };
+
+    sourceRoot = ".";
+
+    installPhase = ''
+      mkdir -p $out/bin $out/lib/hyperspace
+
+      # Main CLI binary
+      cp aios-cli $out/bin/hyperspace
+      chmod +x $out/bin/hyperspace
+
+      # Sidecar binaries
+      for f in _aios-cli pod-raft hyperspace-*; do
+        [ -f "$f" ] && install -m755 "$f" $out/lib/hyperspace/ || true
+      done
+
+      # WASM, native modules, Python shards
+      cp -r *.wasm $out/lib/hyperspace/ 2>/dev/null || true
+      cp -r *.node $out/lib/hyperspace/ 2>/dev/null || true
+      mkdir -p $out/lib/hyperspace/python
+      cp -r python/* $out/lib/hyperspace/python/ 2>/dev/null || true
+
+      # Skills directory
+      mkdir -p $out/share/hyperspace
+      cp -r skills $out/share/hyperspace/ 2>/dev/null || true
+
+      # Set HYPERSPACE_PATH so the binary finds sidecars
+      wrapProgram $out/bin/hyperspace \
+        --set HYPERSPACE_PATH "$out/lib/hyperspace" \
+        --set HYPERSPACE_SKILLS_DIR "$out/share/hyperspace/skills"
+    '';
+
+    nativeBuildInputs = with pkgs; [ makeWrapper ];
+
+    meta = {
+      description = "Hyperspace CLI — P2P mesh AI inference network (Pods)";
+      longDescription = ''
+        Hyperspace Pods let multiple machines pool their GPUs into one private
+        AI cluster. Install the CLI, create a pod, share an invite link — your
+        machines form a P2P mesh and can run models split across all connected
+        GPUs. Exposes an OpenAI-compatible API for use with Cursor, Claude Code,
+        Aider, etc.
+      '';
+      homepage = "https://hyperspace.sh";
+      sourceProvenance = with lib; [ sourceTypes.binaryNativeCode ];
+      license = lib.licenses.unfree;
+      platforms = [ "x86_64-linux" ];
+      maintainers = [ ];
+    };
+  };
+
+in {
+  options.services.hyperspace = {
+    enable = mkEnableOption "Hyperspace P2P AI agent (Pods)";
+
+    release = mkOption {
+      type = types.str;
+      default = "5.45.30";
+      description = "Hyperspace CLI release version (from GitHub releases).";
+    };
+
+    user = mkOption {
+      type = types.str;
+      default = "ai-worker";
+      description = "System user to run the Hyperspace agent.";
+    };
+
+    apiPort = mkOption {
+      type = types.port;
+      default = 8080;
+      description = "Port for the OpenAI-compatible API server.";
+    };
+
+    autoStart = mkOption {
+      type = types.bool;
+      default = true;
+      description = "Auto-start the Hyperspace agent on boot.";
+    };
+
+    openFirewall = mkOption {
+      type = types.bool;
+      default = true;
+      description = "Open firewall ports for P2P traffic (libp2p 4001, chain 30301, API).";
+    };
+
+    profile = mkOption {
+      type = types.enum [ "auto" "full" "inference" "embedding" "relay" "storage" ];
+      default = "auto";
+      description = ''
+        Agent profile:
+        - auto: auto-detect hardware
+        - full: all 9 capabilities
+        - inference: GPU inference only
+        - embedding: CPU embedding only
+        - relay: lightweight relay
+        - storage: storage + memory
+      '';
+    };
+
+    extraArgs = mkOption {
+      type = types.listOf types.str;
+      default = [ ];
+      description = "Extra arguments passed to `hyperspace start`.";
+    };
+
+    dataDir = mkOption {
+      type = types.str;
+      default = "/var/lib/hyperspace";
+      description = "Data directory for agent state (models, config, logs).";
+    };
+  };
+
+  config = mkIf cfg.enable {
+    # Ensure the service user exists
+    users.users.${cfg.user} = {
+      isSystemUser = true;
+      group = cfg.user;
+      home = "/home/${cfg.user}";
+      createHome = true;
+      shell = pkgs.bash;
+    };
+    users.groups.${cfg.user} = { };
+
+    # Install the hyperspace binary
+    environment.systemPackages = [ hyperspacePkg ];
+
+    # Data directories
+    systemd.tmpfiles.rules = [
+      "d ${cfg.dataDir} 0755 ${cfg.user} ${cfg.user} -"
+      "d ${cfg.dataDir}/models 0755 ${cfg.user} ${cfg.user} -"
+      "d ${cfg.dataDir}/data 0755 ${cfg.user} ${cfg.user} -"
+    ];
+
+    # Systemd service: runs the Hyperspace agent as a system daemon
+    systemd.services.hyperspace = {
+      description = "Hyperspace P2P AI Agent — Pods mesh cluster";
+      documentation = [ "https://hyperspace.sh" "https://github.com/hyperspaceai/aios-cli" ];
+      after = [ "network-online.target" ];
+      wants = [ "network-online.target" ];
+      wantedBy = mkIf cfg.autoStart [ "multi-user.target" ];
+
+      environment = {
+        HYPERSPACE_HOME = cfg.dataDir;
+        HYPERSPACE_API_PORT = toString cfg.apiPort;
+        HYPERSPACE_PATH = "${hyperspacePkg}/lib/hyperspace";
+      };
+
+      path = with pkgs; [ bash curl nodejs iputils ];  # iputils provides the ping used by the wait loop
+
+      script = ''
+        # Wait for network connectivity before starting
+        ${pkgs.bash}/bin/bash -c '
+          for i in $(seq 1 30); do
+            ping -c 1 -W 1 8.8.8.8 >/dev/null 2>&1 && break
+            sleep 2
+          done
+        ' || true
+
+        exec ${hyperspacePkg}/bin/hyperspace start \
+          --profile ${cfg.profile} \
+          --api-port ${toString cfg.apiPort} \
+          ${lib.escapeShellArgs cfg.extraArgs}
+      '';
+
+      serviceConfig = {
+        Type = "exec";
+        User = cfg.user;
+        Group = cfg.user;
+        WorkingDirectory = cfg.dataDir;
+        Restart = "always";
+        RestartSec = 10;
+        TimeoutStartSec = 180;
+        TimeoutStopSec = 30;
+        KillMode = "mixed";
+
+        # File limits for network-heavy P2P agent
+        LimitNOFILE = 65536;
+        LimitNPROC = 4096;
+
+        # GPU access — AMD MI50 (ROCm) through /dev/kfd and /dev/dri
+        DeviceAllow = [
+          "/dev/kfd rw"   # one "<device> <access>" string per entry
+          "char-drm rw"   # DRM nodes under /dev/dri, matched by device group
+        ];
+        SupplementaryGroups = [ "video" "render" ];
+
+        # Security hardening
+        NoNewPrivileges = true;
+        ProtectSystem = "strict";
+        ProtectHome = true;
+        PrivateTmp = true;
+        PrivateDevices = false;  # needs GPU access
+        ReadWritePaths = [
+          cfg.dataDir
+          "/tmp"
+        ];
+        BindPaths = [
+          # GPU devices for AMD MI50
+          "/dev/kfd"
+          "/dev/dri"
+        ];
+      };
+    };
+
+    # Firewall: open P2P ports for the mesh network
+    networking.firewall = mkIf cfg.openFirewall {
+      allowedTCPPorts = [
+        4001    # libp2p P2P (agent gossip, DHT, circuits)
+        30301   # Chain P2P (blockchain consensus)
+        cfg.apiPort  # OpenAI-compatible API
+      ];
+      allowedUDPPorts = [
+        4001    # libp2p QUIC transport
+        30301   # Chain UDP discovery
+      ];
+    };
+  };
+}
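A note on the module above: `release` is configurable, but the `fetchurl` hash is fixed and the tarball file name does not contain the version. Changing `release` alone may therefore fail with a hash mismatch, or reuse an already fetched tarball. One possible way to keep the two in step is sketched below; the `hash` option is hypothetical and not part of this change:

```nix
# Sketch only: `hash` is a hypothetical option, not defined in this diff.
{ config, lib, pkgs, ... }:
let
  cfg = config.services.hyperspace;
in {
  options.services.hyperspace.hash = lib.mkOption {
    type = lib.types.str;
    # Hash of the v5.45.30 tarball, copied from the module in this diff.
    default = "sha256-f6fJ8t3exqtYwUD5j+WvD+Hm0oN/Eef0X+R9Rj23dE0=";
    description = "SRI hash of the release tarball; update together with `release`.";
  };

  # Inside the package definition the fetch would then read:
  #   src = pkgs.fetchurl {
  #     url = "https://github.com/hyperspaceai/aios-cli/releases/download/v${cfg.release}/aios-cli-x86_64-unknown-linux-gnu.tar.gz";
  #     hash = cfg.hash;
  #   };
}
```

With this shape, a host that overrides `release` has to supply the matching `hash` in the same place, which makes a stale pin harder to miss.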
--- a/modules/nixos/services/ollama_init_custom_models.nix
+++ b/modules/nixos/services/ollama_init_custom_models.nix
@@ -1,87 +1,67 @@
 { pkgs, ... }: {
  systemd.services.init-ollama-model = {
    description = "Initialize LLM models with extra context in Ollama Docker";
-    
-    # Make sure Docker is running before launching this script
-    after = [ "docker.service" ];
+    after = [ "docker-ollama.service" ];
    wantedBy = [ "multi-user.target" ];
-    
    script = ''
-      # Asynchronous creation function so that startup is not blocked
-      (
-        echo "Starting asynchronous Ollama initialization..."
-        
-        # Wait for Ollama (at most 60 seconds, to avoid an infinite loop)
-        TIMEOUT=60
-        COUNT=0
-        while ! ${pkgs.curl}/bin/curl -s -f http://127.0.0.1:11434/api/tags > /dev/null; do
-          if [ $COUNT -ge $TIMEOUT ]; then
-            echo "Ollama did not become ready in time. Exiting."
-            exit 1
-          fi
-          echo "Waiting for Ollama API to be reachable..."
-          sleep 5
-          COUNT=$((COUNT + 5))
-        done
+      # Wait for Ollama
+      while ! ${pkgs.curl}/bin/curl -s http://localhost:11434/api/tags > /dev/null; do
+        sleep 2
+      done

-        create_model_if_missing() {
-          local model_name=$1
-          local base_model=$2
+      create_model_if_missing() {
+        local model_name=$1
+        local base_model=$2
+        if ! ${pkgs.docker}/bin/docker exec ollama ollama list | grep -q "$model_name"; then
+          echo "$model_name not found, creating from $base_model..."
          
-          # Robust check via Ollama's HTTP API rather than docker exec (avoids tty conflicts)
-          if ! ${pkgs.curl}/bin/curl -s http://127.0.0.1:11434/api/tags | ${pkgs.jq}/bin/jq -e ".models[] | select(.name == \"$model_name\")" > /dev/null; then
-            echo "$model_name not found, creating from $base_model..."
-            
-            # Use a temporary file on the host so it can be injected cleanly into Docker
-            TMP_FILE=$(mktemp)
-            cat <<EOF > "$TMP_FILE"
+          # We use a custom TEMPLATE block to strip the 'currentDate' function 
+          # which is unsupported in Ollama 0.5.7 but present in Devstral's default manifest.
+          ${pkgs.docker}/bin/docker exec ollama sh -c "cat <<EOF > /root/.ollama/$model_name.modelfile
 FROM $base_model
-TEMPLATE """{{- if .System }}
+TEMPLATE \"\"\"{{- if .System }}
 [SYSTEM_PROMPT]
 {{ .System }}
 [/SYSTEM_PROMPT]
 {{- end }}
 {{- range .Messages }}
-{{- if eq .Role "user" }}
+{{- if eq .Role \"user\" }}
 [INST]
 {{ .Content }}
 [/INST]
-{{- else if eq .Role "assistant" }}
+{{- else if eq .Role \"assistant\" }}
 {{ .Content }}
 {{- end }}
-{{- end }}"""
+{{- end }}\"\"\"
 PARAMETER num_ctx 131072
 PARAMETER num_predict 4096
 PARAMETER num_keep 1024
 PARAMETER repeat_penalty 1.1
 PARAMETER top_k 40
-PARAMETER stop "[INST]"
-PARAMETER stop "[/INST]"
-PARAMETER stop "</s>"
-EOF
+PARAMETER stop \"[INST]\"
+PARAMETER stop \"[/INST]\"
+PARAMETER stop \"</s>\"
+EOF"
+          ${pkgs.docker}/bin/docker exec ollama ollama create "$model_name" -f "/root/.ollama/$model_name.modelfile"
+          ${pkgs.docker}/bin/docker exec ollama rm "/root/.ollama/$model_name.modelfile"
+        else
+          echo "$model_name already exists, skipping."
+        fi
+      }

-            # Copy and create inside the container
-            ${pkgs.docker}/bin/docker cp "$TMP_FILE" ollama:/tmp/model.modelfile
-            ${pkgs.docker}/bin/docker exec ollama ollama create "$model_name" -f /tmp/model.modelfile
-            ${pkgs.docker}/bin/docker exec ollama rm /tmp/model.modelfile
-            rm -f "$TMP_FILE"
-          else
-            echo "$model_name already exists, skipping."
-          fi
-        }
-
-        # Create Nemotron
-        create_model_if_missing "nemotron-3-nano:30b-128k" "nemotron-3-nano:30b"
-        
-        # Create Devstral
-        create_model_if_missing "devstral-small-2:24b-128k" "devstral-small-2:24b" 
-        
-      ) &
+      # Create Nemotron
+      create_model_if_missing "nemotron-3-nano:30b-128k" "nemotron-3-nano:30b"
+      
+      # Create Devstral
+      create_model_if_missing "devstral-small-2:24b-128k" "devstral-small-2:24b" 
+      
+      # create_model_if_missing "qwen2.5-coder:32b-128k" "qwen2.5-coder:32b"
+      
+      # create_model_if_missing "mistral-large-planner:123b" "mistral-large:123b-instruct-v2407-q4_K_S"
    '';
-
    serviceConfig = {
-      Type = "forking"; # Lets systemd know that the script goes into the background via '&'
-      User = "root";
+      Type = "oneshot";
+      RemainAfterExit = true;
    };
  };
 }
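Review note: the new `script` waits for the Ollama API in a `while` loop with no upper bound, whereas the removed version gave up after a timeout. A bounded variant could look like the sketch below. The helper name `wait_until` and its argument order are assumptions for illustration and do not appear in this commit.

```shell
#!/bin/sh
# wait_until MAX_TRIES DELAY CMD...: run CMD until it succeeds, at most
# MAX_TRIES times, sleeping DELAY seconds between attempts.
# Returns 0 on the first success and 1 once all attempts have failed.
wait_until() {
  max_tries=$1
  delay=$2
  shift 2
  try=1
  while [ "$try" -le "$max_tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    try=$((try + 1))
    # Sleep only if another attempt is still coming.
    if [ "$try" -le "$max_tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

Inside the unit's script this might be used as `wait_until 60 2 ${pkgs.curl}/bin/curl -sf http://localhost:11434/api/tags || exit 1`, which stops after roughly two minutes instead of blocking the oneshot unit indefinitely. The values 60 and 2 are illustrative.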
--- a/modules/nixos/services/staging-vm.nix
+++ b/modules/nixos/services/staging-vm.nix
@@ -1,363 +0,0 @@
-{ config, pkgs, lib, ... }:
-
-with lib;
-
-let
-  cfg = config.services.stagingVm;
-
-  # ── pr-test-vm helper script ──────────────────────────────────────────
-  pr-test-vm = pkgs.writeShellScriptBin "pr-test-vm" ''
-    set -euo pipefail
-
-    LIBVIRT_URI="qemu:///system"
-    VM_DIR="${cfg.dataPath}"
-    NETWORK="default"
-    SCRIPT_NAME="$(basename "$0")"
-
-    usage() {
-      cat <<EOF
-    Usage: $SCRIPT_NAME <command> [options]
-
-    Commands:
-      build   <nixos-config> [--name <name>]   Build VM image from a NixOS config
-      start   <vm-name>                         Start a VM
-      stop    <vm-name>                         Gracefully shut down a VM
-      destroy <vm-name>                         Force-power-off and undefine a VM
-      ssh     [user@]<vm-name>                  SSH into a running VM
-      console <vm-name>                         Connect to VM serial console
-      list                                      List all staging VMs
-      status  <vm-name>                         Show VM status
-
-    Examples:
-      $SCRIPT_NAME build ./vm-config.nix --name my-test
-      $SCRIPT_NAME start my-test
-      $SCRIPT_NAME ssh root@my-test
-    EOF
-      exit 1
-    }
-
-    # Find the VM's IP address from the DHCP lease
-    vm_ip() {
-      local name="$1"
-      local mac
-      mac=$(${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" domiflist "$name" 2>/dev/null \
-        | ${pkgs.gawk}/bin/awk 'NR>2 && $1 ~ /^vnet/ {print $NF; exit}')
-      [ -z "$mac" ] && { echo "error: cannot find MAC for VM '$name'"; exit 1; }
-
-      local ip
-      ip=$(${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" net-dhcp-leases "$NETWORK" 2>/dev/null \
-        | ${pkgs.gawk}/bin/awk -v mac="$mac" '$0 ~ mac {gsub(/-.*/, "", $3); print $3; exit}')
-      [ -z "$ip" ] && { echo "error: no DHCP lease found for VM '$name' (MAC: $mac)"; exit 1; }
-      echo "$ip"
-    }
-
-    case "''${1:-help}" in
-      build)
-        shift
-        CONFIG="''${1:?Missing NixOS config path}"
-        VM_NAME="''${2:-}"
-        [ -f "$CONFIG" ] || { echo "error: config file not found: $CONFIG"; exit 1; }
-
-        # Extract name from --name flag or config basename
-        if [ "''${2:-}" = "--name" ] && [ -n "''${3:-}" ]; then
-          VM_NAME="$3"
-        elif [ -z "$VM_NAME" ] || [ "''${VM_NAME#--}" != "$VM_NAME" ]; then
-          VM_NAME="$(basename "$CONFIG" .nix)"
-        fi
-
-        BUILD_DIR="$VM_DIR/$VM_NAME"
-        echo "==> Building VM '$VM_NAME' from config: $CONFIG"
-        mkdir -p "$BUILD_DIR"
-
-        # Build the NixOS VM derivation
-        nix build --no-link -f "$CONFIG" vm 2>&1 || {
-          echo "Trying flake build..."
-          nix build "''${CONFIG%/.nix}#nixosConfigurations.$VM_NAME.config.system.build.vm" --no-link 2>&1 || {
-            echo "error: failed to build VM (tried both import and flake)"
-            exit 1
-          }
-        }
-
-        echo "==> Build complete. Run 'pr-test-vm start $VM_NAME' to launch."
-        ;;
-
-      start)
-        VM_NAME="''${1:?Missing VM name}"
-        IMAGE="$VM_DIR/$VM_NAME/disk-image.qcow2"
-        [ -f "$IMAGE" ] || { echo "error: no disk image found at $IMAGE. Build first."; exit 1; }
-
-        # Check if already running
-        STATE=$(${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" domstate "$VM_NAME" 2>/dev/null || echo "undefined")
-        if [ "$STATE" = "running" ]; then
-          echo "VM '$VM_NAME' is already running."
-          exit 0
-        fi
-
-        echo "==> Starting VM '$VM_NAME'..."
-
-        # Undefine if defined but not running
-        if [ "$STATE" != "undefined" ]; then
-          ${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" undefine "$VM_NAME" 2>/dev/null || true
-        fi
-
-        # Define and start with virt-install
-        ${pkgs.virt-manager}/bin/virt-install \
-          --connect "$LIBVIRT_URI" \
-          --name "$VM_NAME" \
-          --memory "${toString cfg.memory}" \
-          --vcpus "${toString cfg.vcpus}" \
-          --disk "$IMAGE",bus=virtio \
-          --import \
-          --network network="$NETWORK",model=virtio \
-          --graphics none \
-          --console pty,target_type=virtio \
-          --serial pty \
-          --memballoon virtio \
-          --rng /dev/urandom \
-          --noautoconsole \
-          --os-variant detect=on,name=generic
-
-        echo "==> VM '$VM_NAME' started. Get IP with: pr-test-vm status $VM_NAME"
-        ;;
-
-      stop)
-        VM_NAME="''${1:?Missing VM name}"
-        echo "==> Stopping VM '$VM_NAME'..."
-        ${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" shutdown "$VM_NAME" 2>/dev/null && {
-          echo "Waiting for VM to shut down..."
-          for i in $(seq 1 30); do
-            STATE=$(${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" domstate "$VM_NAME" 2>/dev/null || echo "undefined")
-            [ "$STATE" != "running" ] && { echo "VM stopped."; exit 0; }
-            sleep 2
-          done
-          echo "warning: VM did not shut down gracefully, use 'destroy' for force"
-        } || {
-          echo "VM '$VM_NAME' not running or does not exist."
-        }
-        ;;
-
-      destroy)
-        VM_NAME="''${1:?Missing VM name}"
-        echo "==> Destroying VM '$VM_NAME'..."
-        ${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" destroy "$VM_NAME" 2>/dev/null || true
-        ${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" undefine "$VM_NAME" 2>/dev/null || true
-        echo "==> VM '$VM_NAME' destroyed and undefined."
-        ;;
-
-      ssh)
-        TARGET="''${1:?Usage: $SCRIPT_NAME ssh [user@]<vm-name>}"
-        # Split user@hostname if present
-        if echo "$TARGET" | ${pkgs.gnugrep}/bin/grep -q '@'; then
-          USER="''${TARGET%@*}"
-          VM_NAME="''${TARGET#*@}"
-        else
-          VM_NAME="$TARGET"
-          USER=""
-        fi
-
-        IP=$(vm_ip "$VM_NAME") || exit 1
-        if [ -n "$USER" ]; then
-          exec ${pkgs.openssh}/bin/ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null "''${USER}@''${IP}"
-        else
-          exec ${pkgs.openssh}/bin/ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null "$IP"
-        fi
-        ;;
-
-      console)
-        VM_NAME="''${1:?Missing VM name}"
-        exec ${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" console "$VM_NAME"
-        ;;
-
-      list)
-        echo "Staging VMs:"
-        ${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" list --all
-        echo ""
-        echo "Active networks:"
-        ${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" net-list
-        echo ""
-        echo "Storage pools:"
-        ${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" pool-list
-        ;;
-
-      status)
-        VM_NAME="''${1:?Missing VM name}"
-        echo "VM: $VM_NAME"
-        STATE=$(${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" domstate "$VM_NAME" 2>/dev/null || echo "not found")
-        echo "State: $STATE"
-        if [ "$STATE" = "running" ]; then
-          IP=$(vm_ip "$VM_NAME" 2>/dev/null || echo "N/A")
-          echo "IP: $IP"
-          ${pkgs.libvirt}/bin/virsh -c "$LIBVIRT_URI" dommemstat "$VM_NAME" 2>/dev/null | head -3 || true
-        fi
-        ;;
-
-      help|--help|-h)
-        usage
-        ;;
-
-      *)
-        usage
-        ;;
-    esac
-  '';
-in
-{
-  options.services.stagingVm = {
-    enable = mkOption {
-      type = types.bool;
-      default = false;
-      description = "Enable KVM/libvirt staging VM for compose PR testing";
-    };
-
-    vmName = mkOption {
-      type = types.str;
-      default = "compose-test-vm";
-      description = "Name of the staging VM";
-    };
-
-    memory = mkOption {
-      type = types.str;
-      default = "4096";
-      description = "RAM allocated to the staging VM (MB)";
-    };
-
-    vcpus = mkOption {
-      type = types.int;
-      default = 2;
-      description = "Number of vCPUs for the staging VM";
-    };
-
-    storagePath = mkOption {
-      type = types.str;
-      default = "/var/lib/libvirt/images";
-      description = "Path for libvirt storage pool";
-    };
-
-    dataPath = mkOption {
-      type = types.str;
-      default = "/var/lib/staging-vm";
-      description = "Path for compose test data (PR checkouts, test results)";
-    };
-  };
-
-  config = mkIf cfg.enable {
-    # ── libvirtd with QEMU/KVM ──────────────────────────────────────────
-    virtualisation.libvirtd = {
-      enable = true;
-      qemu = {
-        package = pkgs.qemu_kvm;
-        runAsRoot = true;
-        swtpm.enable = true;
-        ovmf = {
-          enable = true;
-          packages = [ pkgs.OVMF ];
-        };
-      };
-    };
-
-    # ── System packages ─────────────────────────────────────────────────
-    environment.systemPackages = with pkgs; [
-      libvirt                # virsh, virt-admin
-      qemu_kvm               # QEMU/KVM
-      swtpm                  # Software TPM
-      OVMF                   # UEFI firmware for VMs
-      virt-manager           # GUI + virt-install
-      virt-viewer            # SPICE/VNC viewer
-      libguestfs             # virt-customize, guestfish
-      cdrtools               # genisoimage for cloud-init ISOs
-      jq                     # JSON parsing
-      gawk                   # awk for DHCP lease parsing
-      gnugrep                # grep
-    ];
-
-    # ── User permissions ────────────────────────────────────────────────
-    users.users.gortium.extraGroups = [ "libvirtd" ];
-
-    # ── Directories ─────────────────────────────────────────────────────
-    systemd.tmpfiles.rules = [
-      "d ${cfg.storagePath} 0755 root root -"
-      "d ${cfg.dataPath} 0755 root root -"
-    ];
-
-    # ── Default NAT network (192.168.122.0/24) ──────────────────────────
-    # Define the default libvirt NAT network using virsh postStart hook
-    systemd.services.libvirtd = {
-      postStart = ''
-        set -e
-        # Define the NAT network if it doesn't exist
-        ${pkgs.libvirt}/bin/virsh -c qemu:///system net-info default 2>/dev/null && {
-          echo "Network 'default' already exists"
-        } || {
-          echo "Defining default NAT network (192.168.122.0/24)..."
-          ${pkgs.libvirt}/bin/virsh -c qemu:///system net-define /etc/libvirt/qemu/networks/default.xml
-        }
-        ${pkgs.libvirt}/bin/virsh -c qemu:///system net-autostart default 2>/dev/null || true
-        # Start the network if not active
-        STATE=$(${pkgs.libvirt}/bin/virsh -c qemu:///system net-state default 2>/dev/null || echo "inactive")
-        if [ "$STATE" != "active" ]; then
-          ${pkgs.libvirt}/bin/virsh -c qemu:///system net-start default 2>/dev/null || true
-        fi
-        echo "Default network ready."
-      '';
-    };
-
-    # Define the default network as an XML config file
-    environment.etc."libvirt/qemu/networks/default.xml" = {
-      text = ''
-        <network>
-          <name>default</name>
-          <forward mode='nat'/>
-          <bridge name='virbr0' stp='on' delay='0'/>
-          <ip address='192.168.122.1' netmask='255.255.255.0'>
-            <dhcp>
-              <range start='192.168.122.2' end='192.168.122.254'/>
-            </dhcp>
-          </ip>
-        </network>
-      '';
-      mode = "0644";
-    };
-
-    # ── Storage pool ────────────────────────────────────────────────────
-    systemd.services.libvirtd.postStart = mkAfter ''
-      set -e
-      ${pkgs.libvirt}/bin/virsh -c qemu:///system pool-info default 2>/dev/null && {
-        echo "Storage pool 'default' already exists"
-      } || {
-        echo "Defining storage pool at ${cfg.storagePath}..."
-        ${pkgs.libvirt}/bin/virsh -c qemu:///system pool-define-as \
-          --name default --type dir --target "${cfg.storagePath}"
-      }
-      ${pkgs.libvirt}/bin/virsh -c qemu:///system pool-autostart default 2>/dev/null || true
-      STATE=$(${pkgs.libvirt}/bin/virsh -c qemu:///system pool-state default 2>/dev/null || echo "inactive")
-      if [ "$STATE" != "running" ]; then
-        ${pkgs.libvirt}/bin/virsh -c qemu:///system pool-build default 2>/dev/null || true
-        ${pkgs.libvirt}/bin/virsh -c qemu:///system pool-start default 2>/dev/null || true
-      fi
-      echo "Storage pool ready."
-    '';
-
-    # ── Firewall rules for libvirt guests ───────────────────────────────
-    networking.firewall = {
-      trustedInterfaces = [ "virbr0" ];
-
-      extraCommands = mkAfter ''
-        # Allow DHCP (port 67/68) and DNS (port 53) to libvirt guests
-        iptables -I INPUT -i virbr0 -p udp --dport 67:68 -j ACCEPT 2>/dev/null || true
-        iptables -I INPUT -i virbr0 -p tcp --dport 53 -j ACCEPT 2>/dev/null || true
-        iptables -I INPUT -i virbr0 -p udp --dport 53 -j ACCEPT 2>/dev/null || true
-
-        # Allow forwarding between the bridge and the outside world
-        iptables -I FORWARD -i virbr0 -o virbr0 -j ACCEPT 2>/dev/null || true
-        iptables -I FORWARD -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT 2>/dev/null || true
-        iptables -I FORWARD -i virbr0 -j ACCEPT 2>/dev/null || true
-
-        # NAT for guest outbound traffic
-        iptables -t nat -I POSTROUTING -s 192.168.122.0/24 -j MASQUERADE 2>/dev/null || true
-      '';
-    };
-
-    # ── pr-test-vm helper script ────────────────────────────────────────
-    environment.systemPackages = [ pr-test-vm ];
-  };
-}
--- a/secrets/containers.env.age
+++ b/secrets/containers.env.age
@@ -1,36 +1,34 @@
 -----BEGIN AGE ENCRYPTED FILE-----
-YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IHNzaC1lZDI1NTE5IEdoTUQ4QSBWNEpt
-cGFNeVBBaDRqb3pLSEZGQW0wb3VmVnBoZCswUFkzbnBLUnJ0QTNrClRqVkk4RUVO
-d29KYjd5YUcwankvaTFmVHUxQVpDT2ZYWHRaY3JXTUtQMU0KLT4gKXBtQ3UsXi1n
-cmVhc2UgNnwxYCBXVyA/KCQmIHt9NAo3OTZVUHR2UXkvaEFwY0ZBdEJsaFpsbHJ0
-cklKcDVHcEdWMEdPSkpnN0FiRU43RW5hUWFMdjR3WFRRSFBLSGlmClM3cTNJWlNM
-TExkdHdXUHJISkNIaE1TTUxUc1NUWkV1a09HeFU3bVZwQXMKLS0tIGhOcXFTUElS
-azJJNnEreUhMWTJBaVZGSTJPRUFqQkVYS01KRENUVVpZSDgK8+8onFejroBo7MeO
-dW+so4lOsq4zJKn3f0cxmCFg1f0X8zt6h4Uc3A5Cvr1uU+6yw1FWmJ7xa3jJz3lO
-EEaKQJXYC+xIIKGcA7qILa0SFp4a/4OuYjcg27HrlPhg7u5wDhQrd0LdVEe1Xngp
-ZivX7P7HwIna3X8C+TL+K2v/AG2N/z86cdKfRvxyMKNbHhYw+CfHEnWgh8tJ++4h
-G9evNniuNqte6cQaRe7jODfPNW4FuY/Sb7barlJ/M9iAQdYAdyLAzU1LABeHeUfD
-wtHjxy9DUZ55Vg8bB8M2JJU9MkoRT4ewiVd9LeC1GWeVmKsm93wsmrov714i7U2j
-wHtDkjqEF2MmzuQc18sjNaAHiwz8j6o5xU2L/Q4+Q707yISWG7RGZYh389Cr1rnw
-siUq/Vunqw2wk13+J/4vu9nqt5mMktBaCtp+QiWIurjwB5LUAyChrSm+dg5lb0Mt
-UhSc0lq1+E3vxAXM2Hmk+vP86VD+6WJvAU82VFApF1s6zG2FU1/AcOVVf54nan/q
-f+rgSFfASHQCYSblUJHyEtwLNsWEmTGmOEn1buUKD/H0zatPQnc0rYpjlx2V0Sjd
-6yB5+wPrZ0AkN1pjcsPKOv8Kaog2DzqIjib+SaSTaRxWHQEb9uzvaReAcYI5HOpE
-gkC040HN33BItATbo4+hz70Im8Ni/VXD+g6yzM6Hj1hJL+PinTKeg5keQRFIZjMx
-grzievB2wVBBgLgN3qMdTFmpplaL7iL702JjXZUTTK9Izp+9wiCsV1fTa53FWDht
-ylFL5SWElqXjK+QBXxAe+Jk6VQov5HI21YDXL67S554ABeRok23wxrQ31TCI4xq9
-PQV7VtNRjyVud7S29m3OwpWOsgTZhn+JclHj2v4bNJzJkJnZRTmcvGPktzRI5+R4
-e5vxVhGnJDzI71txaHl8+xS1lu9VzCQUrxX6TXyTRV4KjIOz0g06JOBgmBRBvJca
-7MZbC65xpisl/gyLRbgkVga3t94dPV+dpZsn8eq6427IyRbKslJefatggR9//c6I
-5N5fl0fR3gJQMB+HRbipBH2YsdbdWJyb4Nn6STZxIfrqoG/xC6C1raF0xK7hUx6i
-4DUDSPohM8fOIswQPfE+FH3eygfzu/Ln5+ghsgHTEhgFvmgMvyxaAt6kHIzIUhMX
-M3dASr4VPDpIXuXsRWwYLEifhzxsuvwVxfwtsnCaR6XKijsYECWGDdYOWHdleeqx
-wDPhxEesfFVhKxhrKY9Ir8k9/FFBKQU/3GjW4+SMAg5Al1YEzxshP9vKuVcsei7W
-JDwAwotNXaCm6NBckiyZJE53ou6+gckPY7V9cOfnuH74Z9ywkFzB3HW3ZlonaGyM
-oGmLGcccavFtyhg5s/As4i6X8ARIpDiwe59Pn3GNXMctySqIrrr2ogUoXgrfFCie
-6GOTdeMW7GeOSdJUxCofghlspS/nq01Og77VI/beWYrIwLubSka6Zaltww9zgObk
-/FGEMgFkEpq7iyCvYSPA8F46pJKvnMP3S84AWCPmcTcHeg4lwGPvs6btexXBGdoz
-nkCyq7wdH5Nngm7jUbl88LtaLZPAQkuqXphBVTnrF9Ofbnb4iRZ2Op4xpx9rGyvx
-mO6UEhL6V1i2YZFNkNMg/W8aoMiUgBdqbkxaxblT9L0aNdlFU9+LbWYolURVEadd
-Qjv0Z1gMA+tsuBbVszwsMfneZ5+B9Q==
+YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IHNzaC1lZDI1NTE5IEdoTUQ4QSBOL29w
+eGk1N2xxTHJtaUEvWWZmbkh1bk11Tjk3anNnMDB1cCtPYUMzdTNJCkdhQ08vblNG
+UlV1K2xVTGZVTzFWYXAzcjZaMWs0RTFWdStKSmlSTURvK1EKLT4gLC1zKU8zVkgt
+Z3JlYXNlIFUiXFcpS302IHByVn5jOy0gRDMKQjV3SHpDWUIybGFyQUg3ZlR0R2hV
+eWM3SFlCVW5mdlpBVUF3a0xpNlZCeGNUd1oxTTlkc1RkTXdZS0lFTmN3Ci0tLSA3
+VlBqM1VLWllZc0JnOTMvUFRjMU13OTdzMmhsdGJubkk5eGpERVVLYUk4Cnzh5UbU
+FlgqpM8jkJ6XlsaIDCw/G3D6uJ/GRJW4gIekuhAUxpZJrc8eOA8ZuHfGrBbH3acV
+tVafX5F0Kr2oOblqZ6gduZOUS52KmWH8stiBJM+e5ZZ7zRQVE4PJUKUPCzi+WdcH
+zr295T//FOdicrYHdsjfziKEHzBtUCFiATW05+O2zMjYjO6cPzePcCzPWinwiID6
+V+f6ngfkkQaj3wBGkzaieQJzRcdSwky21aVhGCCX/bvqx61iW2d5QAKxGbtQ2RcG
+X1okr+xunAM94nzDMv46vyN97KxY7cZd4pAaOxoICc2Tfhtw6F+iS6QkQh1odJzO
+7ZH+sSQCvndG+8z9shXGiHalASF5tdguM+JlEvAGljcaiAUtsQWxr9CoWiEkC6c6
+NCaECSYO8Il+SXBQnSZSGJSNDhuPYCYrsjXGSAONFixuyeslAkq9x2WUaUS4H063
+1QvRF7XO2tBPtgCLsSjdiGp0h+ImUaGdu6fDR7zrDsGsaAFCSFeH/rGNNXRQ2vP2
+CSfPfDDCqpUSCn0WuA30BtaPLxGmZT6OjFevKzYMNDmdeq9ia/q8K0hmjLUBdN3k
+tdYWbwoaf4gYbUWxSleD768b0Jgxss9Vod+sFQ+NYRksdGIeyND+aQIc312XehfA
+qHFBS8nlj7eUF5bdvCYQ64z741mH4cNlGxyjPBH1x8FHnEOocJXYt1l2AZSRJmJA
+c3z0QGXyuCbsrLBXWK1EKa/Juo4PGGsEVoLRhwJAQy9+i1JN0yrfRvSPyzvD4px6
+wRPzlZ80MQdb2lv84WS/zcOEZmZzlLntszTRRdIfAsuaavP2Rquh4rEXABYeTZwp
+5dem79s8bdW2nFsGMNz1OQKQwocyjYu1jJMHu6Gp7Ngdl1xyW7xfg0dezE1c0cIh
+xt1aLER9YJp4n5to5cOH16l3mjDHnAvABx38xE9loNL3399J/evw7LxpTYQ4v2Xv
+x8xnDHcqJ+deFSwyuUnMS5DkUeYuHmUl0Q2WYcfY+ibCmcgCb2ObTtuN1/ZxNYrL
+OKrnmfuSvBgyuIOj5e6uWW0+Zs8dHKXu2TgV8WignxOhl5zQgCpCBlqVfO0t+NCu
+Gi26hU/fhGWQ/1oQa3VkpGsypZbJpgQvfWxfcGHP/MMhnl01zzlP8/aexSY3pAxf
+fz9v0IVh6xxtu3zbiiVzUsXbfG7t+xY98jMphf4AS2mWva3GWVmhhu0lS3J3P+go
+YEEP4rOFHeU0Y1/6kLydTXvz4jMH0H92XQIzshd7vzQnEJPUPAzqRmw3LKYGgCI+
+wZEnxJ6ckqTkGBFnxTpy9LLllwmnz2Ky87nY3XAmqxlhb2Ap1XFAlfgszmGjc+Il
+KkIgoWQHTUm6QM9ta++oUTIDneOvxGd0zZsqoEhiC/7E01BNNZ6E58TeJU3fDlA3
+mX6n05XjwPRpgXZfayPoAgBlZc2H4KeiynxwNZ/dWu7qz7L6Ppk6Nvtly8giTbFx
+CA+tto7vq+D+CAEJ4bgyq4BCH4GL4APrhPcWp98Mko1WCiRTIKgkZxQCYvlg/LZq
+LNhMacP9T1qTvNC+yR1NEMiegE3APzk6CkDpVaO9+5f/sqifNPINCMothenI9ePw
+zjQLI3Mo1m73bkomytUZ7i1VstP5sEZ5LF72Sq7BpR3oQ3Gp0CAN9w==
 -----END AGE ENCRYPTED FILE-----
--- a/secrets/wireguard_preshared_key.age
+++ b/secrets/wireguard_preshared_key.age
@@ -1,9 +0,0 @@
-----BEGIN AGE ENCRYPTED FILE-----
-YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IHNzaC1lZDI1NTE5IEdoTUQ4QSA3VG9Z
-MVFPVFc2VVJ3d0h0dmtBUnI3WHl2SzUxTkRZbjFCaGloWmV3dnd3ClcxdnVPeGd6
-SU4zR0Q0K1dtVjRRVHd0VW5XSFI0dVFpTjZnYk1DNjRxTVEKLT4gQzlgRy1ncmVh
-c2UKeUozOWgyUytSTVF0NjY2STBEb2VadwotLS0gblI3bmJCUWxxU3QrYTEyVFBI
-Snc4NC9rTkh0NnZYbUtxUE9hRWRkelpmMAq58fmH6cK13GeD7wGLxKmx10hmJeW4
-b7KqnCD1ZP7uG85s32xzVRwRG8RrG4xZo5nR9Mrtg1CoTSFfUGeFnf5xveN+Ej0X
-wDVB1LwC+Q==
-----END AGE ENCRYPTED FILE-----
--- a/secrets/wireguard_private_key.age
+++ b/secrets/wireguard_private_key.age
@@ -1,11 +0,0 @@
-----BEGIN AGE ENCRYPTED FILE-----
-YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IHNzaC1lZDI1NTE5IEdoTUQ4QSA5dzVG
-WUNvT3NlRmcrWS81bzJqSWlTekVYaDFFTE10SkI2dEgzaGpxcUI4Cmk5Y0FGYTRZ
-K0NGYzY3VUp4aS9ZZGRmWTgybDJFUURva2pZNmVOS3QxdEUKLT4gPnVRTCtldGMt
-Z3JlYXNlCk04OTJZeFRNeDI5aGpMVTk1ZTE0Y2FMMnFEMjlJalJpMHRlaTE4ZWIx
-d2lCRGQ5RHVjcktOMGJCb1VERlNWcTYKaSt0L1Z6dVJ0QWIyZkhsYzFEVjZSQWUr
-ZWpwVlo1TmhoUFJZdkEvR0gxNlVhcXF2ZTRnCi0tLSBLcmM2MThNVkdWclpHUXRr
-VTF6QVk2WUZlTXpZMVNLMlpBOFc3M1o5WjZzCs9xbPlIX+u5vRSQ/z9utu+I9S2c
-02DOsIb1kzxzb1OK91b8Kh4JucQSq3qkyEvRucsNn5QW8hIHDnRuND6EbPyN7p4S
-YB/F0dxSqgnq
-----END AGE ENCRYPTED FILE-----
--- a/tests/run-integration.sh
+++ b/tests/run-integration.sh
@@ -1,347 +0,0 @@
-#!/usr/bin/env bash
-# =============================================================================
-# run-integration.sh — Staging VM Integration Test Suite
-#
-# Verifies Docker daemon, compose stack, and service endpoint health.
-# Designed to run inside the staging VM as part of CI/CD pipeline.
-#
-# Usage:
-#   ./tests/run-integration.sh                  # all defaults
-#   ./tests/run-integration.sh --verbose         # detailed output
-#   ./tests/run-integration.sh --list-services   # print detected services and exit
-#
-# Environment variables (all optional):
-#   COMPOSE_DIR       Path to compose service directories  (default: /opt/infra/compose)
-#   COMPOSE_PROJECT   Docker Compose project name          (default: staging)
-#   STAGING_DOMAIN    Base domain for health checks        (default: staging.lazyworkhorse.net)
-#   SERVICE_LIST      Space-separated service dirs to check (default: auto-detect)
-#   HEALTH_URLS       Space-separated URLs for health checks (default: auto-detect from SERVICE_LIST)
-#   HEALTH_TIMEOUT    Curl timeout per check (seconds)      (default: 5)
-#   HEALTH_RETRIES    Number of retries per endpoint         (default: 1)
-#   HEALTH_INTERVAL   Seconds between retries                (default: 2)
-# =============================================================================
-
-set -euo pipefail
-
-# ---- Colors for readable output ----
-RED='\033[0;31m'
-GREEN='\033[0;32m'
-YELLOW='\033[1;33m'
-CYAN='\033[0;36m'
-BOLD='\033[1m'
-NC='\033[0m' # No Color
-
-# ---- Configuration (all env-overridable) ----
-COMPOSE_DIR="${COMPOSE_DIR:-/opt/infra/compose}"
-COMPOSE_PROJECT="${COMPOSE_PROJECT:-staging}"
-STAGING_DOMAIN="${STAGING_DOMAIN:-staging.lazyworkhorse.net}"
-HEALTH_TIMEOUT="${HEALTH_TIMEOUT:-5}"
-HEALTH_RETRIES="${HEALTH_RETRIES:-1}"
-HEALTH_INTERVAL="${HEALTH_INTERVAL:-2}"
-
-# Known compose service directories in order — override via SERVICE_LIST env var
-DEFAULT_SERVICES=(
-  network
-  authentification
-  homepage
-  ai
-  cloudstorage
-  versioncontrol
-  backup
-  coms
-  finance
-  homeautomation
-  passwordmanager
-)
-
-# Map service directory -> default health check URL (relative to STAGING_DOMAIN)
-# Override entirely via HEALTH_URLS env var.
-declare -A DEFAULT_HEALTH_URLS
-DEFAULT_HEALTH_URLS[network]="https://traefik.${STAGING_DOMAIN}/ping"
-DEFAULT_HEALTH_URLS[authentification]="https://auth.${STAGING_DOMAIN}/api/verify"
-DEFAULT_HEALTH_URLS[homepage]="https://${STAGING_DOMAIN}/"
-DEFAULT_HEALTH_URLS[ai]="https://hermes.${STAGING_DOMAIN}/health"
-DEFAULT_HEALTH_URLS[cloudstorage]="https://cloud.${STAGING_DOMAIN}/status.php"
-DEFAULT_HEALTH_URLS[versioncontrol]="https://code.${STAGING_DOMAIN}/api/healthz"
-
-# ---- Trackers ----
-PASS_COUNT=0
-FAIL_COUNT=0
-WARN_COUNT=0
-FAILURES=()
-
-# ---- Helpers ----
-
-log_info()  { echo -e "${CYAN}[INFO]${NC}  $*"; }
-log_pass()  { echo -e "${GREEN}[PASS]${NC}  $*"; ((PASS_COUNT++)); }
-log_fail()  { echo -e "${RED}[FAIL]${NC}  $*"; ((FAIL_COUNT++)); FAILURES+=("$*"); }
-log_warn()  { echo -e "${YELLOW}[WARN]${NC}  $*"; ((WARN_COUNT++)); }
-log_step()  { echo -e "\n${BOLD}── $* ──${NC}"; }
-log_raw()   { echo -e "         $*"; }
-
-# Check if a command exists
-require_cmd() {
-  if ! command -v "$1" &>/dev/null; then
-    log_fail "Required command not found: $1"
-    return 1
-  fi
-}
-
-# Retry a command with a fixed delay between attempts
-retry() {
-  local cmd="$*"
-  local attempt=0
-  local max_attempts=$((HEALTH_RETRIES + 1))
-  local result
-
-  while [[ $attempt -lt $max_attempts ]]; do
-    if eval "$cmd" 2>/dev/null; then
-      return 0
-    fi
-    attempt=$((attempt + 1))
-    if [[ $attempt -lt $max_attempts ]]; then
-      sleep "$HEALTH_INTERVAL"
-    fi
-  done
-  return 1
-}
-
-# ---- Parse arguments ----
-VERBOSE=false
-LIST_SERVICES=false
-POSITIONAL=()
-while [[ $# -gt 0 ]]; do
-  case "$1" in
-    --verbose|-v)  VERBOSE=true; shift ;;
-    --list-services) LIST_SERVICES=true; shift ;;
-    --) shift; POSITIONAL+=("$@"); break ;;
-    *) POSITIONAL+=("$1"); shift ;;
-  esac
-done
-set -- "${POSITIONAL[@]}"
-
-# Resolve service list
-if [[ -n "${SERVICE_LIST:-}" ]]; then
-  IFS=' ' read -ra SERVICES <<< "$SERVICE_LIST"
-else
-  SERVICES=("${DEFAULT_SERVICES[@]}")
-fi
-
-# Resolve health URLs — default map with overrides from env
-declare -A HEALTH_URLS
-if [[ -n "${HEALTH_URLS:-}" ]]; then
-  # User-supplied mapping: "network=https://... authentification=https://..."
-  for pair in $HEALTH_URLS; do
-    key="${pair%%=*}"
-    val="${pair#*=}"
-    HEALTH_URLS["$key"]="$val"
-  done
-else
-  for svc in "${SERVICES[@]}"; do
-    if [[ -n "${DEFAULT_HEALTH_URLS[$svc]:-}" ]]; then
-      HEALTH_URLS["$svc"]="${DEFAULT_HEALTH_URLS[$svc]}"
-    fi
-  done
-fi
-
-# --list-services mode (for CI integration)
-if $LIST_SERVICES; then
-  echo "Configured services:"
-  for svc in "${SERVICES[@]}"; do
-    url="${HEALTH_URLS[$svc]:-no-health-check}"
-    echo "  $svc -> $url"
-  done
-  exit 0
-fi
-
-# ---- Pre-flight ----
-echo -e "${BOLD}============================================${NC}"
-echo -e "${BOLD}  Staging VM Integration Test Suite${NC}"
-echo -e "${BOLD}  $(date -u '+%Y-%m-%dT%H:%M:%SZ')${NC}"
-echo -e "${BOLD}============================================${NC}"
-
-# ---- Phase 1: Prerequisites ----
-log_step "Phase 1: Prerequisites"
-
-PREREQ_OK=true
-for cmd in docker curl jq; do
-  if ! require_cmd "$cmd"; then
-    PREREQ_OK=false
-  fi
-done
-$PREREQ_OK && log_pass "All required commands available" || log_fail "Missing prerequisites"
-
-# ---- Phase 2: Docker daemon ----
-log_step "Phase 2: Docker Daemon"
-
-if docker info --format '{{.ServerVersion}}' &>/dev/null; then
-  DOCKER_VERSION=$(docker info --format '{{.ServerVersion}}' 2>/dev/null)
-  log_pass "Docker daemon is running (version: $DOCKER_VERSION)"
-
-  if docker info --format '{{.Driver}}' 2>/dev/null | grep -qi "overlay"; then
-    log_pass "Storage driver: overlay"
-  else
-    log_warn "Non-overlay storage driver detected"
-  fi
-else
-  log_fail "Docker daemon is NOT running or not accessible"
-fi
-
-# ---- Phase 3: Docker Compose stack ----
-log_step "Phase 3: Compose Stack Status"
-
-# Check if any compose files exist
-COMPOSE_FILES=()
-for svc in "${SERVICES[@]}"; do
-  cf="${COMPOSE_DIR}/${svc}/compose.yml"
-  if [[ -f "$cf" ]]; then
-    COMPOSE_FILES+=("$cf")
-  else
-    cf2="${COMPOSE_DIR}/${svc}/docker-compose.yml"
-    if [[ -f "$cf2" ]]; then
-      COMPOSE_FILES+=("$cf2")
-    else
-      log_warn "No compose file found for service '$svc' (expected: ${cf})"
-    fi
-  fi
-done
-
-if [[ ${#COMPOSE_FILES[@]} -eq 0 ]]; then
-  log_fail "No compose files found under COMPOSE_DIR=${COMPOSE_DIR}"
-  log_info "Skipping stack checks"
-else
-  log_info "Found ${#COMPOSE_FILES[@]} compose file(s) in ${COMPOSE_DIR}"
-
-  # Build the compose file args
-  COMPOSE_CMD="docker compose -p ${COMPOSE_PROJECT}"
-  for cf in "${COMPOSE_FILES[@]}"; do
-    COMPOSE_CMD+=" -f ${cf}"
-  done
-
-  log_info "Project name: ${COMPOSE_PROJECT}"
-
-  # Check stack ps
-  if $VERBOSE; then
-    log_raw "--- docker compose ps output ---"
-    eval "$COMPOSE_CMD ps" 2>&1 | while IFS= read -r line; do log_raw "$line"; done
-    log_raw "--- end ---"
-  fi
-
-  # Get all services and their status
-  if STACK_STATUS=$(eval "$COMPOSE_CMD ps --format '{{.Name}}\t{{.Status}}'" 2>/dev/null); then
-    if [[ -z "$STACK_STATUS" ]]; then
-      log_warn "Stack exists but no running services — VM may be freshly provisioned"
-    else
-      ALL_RUNNING=true
-      RUNNING_COUNT=0
-      TOTAL_COUNT=0
-      while IFS=$'\t' read -r name status; do
-        TOTAL_COUNT=$((TOTAL_COUNT + 1))
-        status_lower=$(echo "$status" | tr '[:upper:]' '[:lower:]')
-        if echo "$status_lower" | grep -qE '^(up|running|healthy)'; then
-          RUNNING_COUNT=$((RUNNING_COUNT + 1))
-          $VERBOSE && log_pass "  $name — $status"
-        else
-          ALL_RUNNING=false
-          log_warn "  $name — $status (not healthy)"
-        fi
-      done <<< "$STACK_STATUS"
-
-      if [[ "$TOTAL_COUNT" -eq 0 ]]; then
-        log_fail "No services found in compose project"
-      elif $ALL_RUNNING && [[ "$TOTAL_COUNT" -eq "$RUNNING_COUNT" ]]; then
-        log_pass "All ${TOTAL_COUNT} service(s) running (${RUNNING_COUNT}/${TOTAL_COUNT})"
-      else
-        log_fail "${RUNNING_COUNT}/${TOTAL_COUNT} service(s) running — some services are down"
-      fi
-    fi
-  else
-    log_fail "Failed to query compose stack status"
-  fi
-fi
-
-# ---- Phase 4: Service health checks ----
-log_step "Phase 4: Service Endpoint Health Checks"
-
-ENDPOINT_CHECKS=0
-ENDPOINT_PASS=0
-
-for svc in "${SERVICES[@]}"; do
-  url="${HEALTH_URLS[$svc]:-}"
-  if [[ -z "$url" ]]; then
-    $VERBOSE && log_info "No health check URL for service '$svc' — skipping"
-    continue
-  fi
-
-  ENDPOINT_CHECKS=$((ENDPOINT_CHECKS + 1))
-  echo -ne "  Checking ${svc} ... "
-
-  # Perform the HTTP health check with retries
-  if retry "curl -sf -o /dev/null -w '%{http_code}' --max-time ${HEALTH_TIMEOUT} '${url}' 2>/dev/null"; then
-    HTTP_CODE=$(curl -sf -o /dev/null -w '%{http_code}' --max-time "${HEALTH_TIMEOUT}" "${url}" 2>/dev/null || true)
-    ENDPOINT_PASS=$((ENDPOINT_PASS + 1))
-    echo -e "${GREEN}OK${NC} (HTTP ${HTTP_CODE})"
-  else
-    LAST_CODE=$(curl -s -o /dev/null -w '%{http_code}' --max-time "${HEALTH_TIMEOUT}" "${url}" 2>/dev/null || echo "000")
-    echo -e "${RED}FAIL${NC} (HTTP ${LAST_CODE})"
-    log_fail "Health check failed for ${svc} @ ${url}"
-  fi
-done
-
-if [[ $ENDPOINT_CHECKS -eq 0 ]]; then
-  log_warn "No health check URLs configured — skipping endpoint phase"
-elif [[ $ENDPOINT_PASS -eq $ENDPOINT_CHECKS ]]; then
-  log_pass "All ${ENDPOINT_CHECKS} endpoint(s) healthy"
-else
-  log_fail "${ENDPOINT_PASS}/${ENDPOINT_CHECKS} endpoint(s) healthy"
-fi
-
-# ---- Phase 5: Docker system sanity ----
-log_step "Phase 5: Docker System Sanity"
-
-# Check disk space for Docker
-DOCKER_ROOT=$(docker info --format '{{.DockerRootDir}}' 2>/dev/null || echo "/var/lib/docker")
-log_info "Docker root: ${DOCKER_ROOT}"
-
-if command -v df &>/dev/null && [[ -d "$DOCKER_ROOT" ]]; then
-  USED_PCT=$(df -P "$DOCKER_ROOT" | awk 'NR==2 {print $5}' | tr -d '%')
-  if [[ -n "$USED_PCT" ]]; then
-    if [[ "$USED_PCT" -ge 90 ]]; then
-      log_warn "Docker storage is ${USED_PCT}% full — consider cleanup"
-    else
-      log_pass "Docker storage at ${USED_PCT}% — within limits"
-    fi
-  fi
-fi
-
-# Check for dangling images
-DANGLING=$(docker images -f "dangling=true" -q 2>/dev/null | wc -l)
-if [[ "$DANGLING" -gt 10 ]]; then
-  log_warn "${DANGLING} dangling images found — consider docker image prune"
-fi
-
-# ---- Summary ----
-echo ""
-echo -e "${BOLD}============================================${NC}"
-echo -e "${BOLD}  Test Summary${NC}"
-echo -e "${BOLD}  $(date -u '+%Y-%m-%dT%H:%M:%SZ')${NC}"
-echo -e "${BOLD}============================================${NC}"
-echo -e "  ${GREEN}Passed:${NC}  ${PASS_COUNT}"
-echo -e "  ${RED}Failed:${NC}  ${FAIL_COUNT}"
-echo -e "  ${YELLOW}Warnings:${NC} ${WARN_COUNT}"
-
-if [[ ${#FAILURES[@]} -gt 0 ]]; then
-  echo -e "\n${BOLD}Failed checks:${NC}"
-  for f in "${FAILURES[@]}"; do
-    echo -e "  ${RED}•${NC} $f"
-  done
-fi
-
-echo ""
-if [[ $FAIL_COUNT -eq 0 ]]; then
-  echo -e "${GREEN}${BOLD}✓ All integration checks passed${NC}"
-  exit 0
-else
-  echo -e "${RED}${BOLD}✗ ${FAIL_COUNT} integration check(s) failed${NC}"
-  exit 1
-fi
--- a/users/ai-worker.nix
+++ b/users/ai-worker.nix
@@ -4,90 +4,11 @@
    group = "ai-worker";
    home = "/home/ai-worker";
    createHome = true;
-    extraGroups = [ "docker" "libvirtd" ];
+    extraGroups = [ "docker" ];
    shell = pkgs.bashInteractive;
    openssh.authorizedKeys.keys = [
      keys.users.ai-worker.main
    ];
-    # No password login - SSH key only
-    hashedPassword = "!";
  };
  users.groups.ai-worker = {};
-
-  # Enable restricted AI worker SSH access for ollama benchmarking
-  # SECURITY: ai-worker can only:
-  #   - SSH into host from Hermes container
-  #   - Run docker commands (docker exec ollama ...) via docker group
-  #   - Run specific security audit commands
-  #   - NO access to infra repo (no bind mount)
-  #   - NO sudo access to build tooling (nh, nixos-rebuild, nixpkgs-fmt, nix)
-  # WORKFLOW: SSH from Hermes container, run docker benchmarks, return and save results to /opt/data/ai-optimizer/
-  services.aiWorkerAccess = true;
-  
-  # Restricted sudo for ai-worker - security checks only
-  security.sudo.extraRules = [
-    {
-      users = [ "ai-worker" ];
-      commands = [
-        # Firewall checks
-        {
-          command = "/run/wrappers/bin/sudo iptables -L -n -v";
-          options = [ "NOPASSWD" ];
-        }
-        {
-          command = "/run/wrappers/bin/sudo iptables -S";
-          options = [ "NOPASSWD" ];
-        }
-        # Fail2ban status
-        {
-          command = "/run/current-system/sw/bin/fail2ban-client status";
-          options = [ "NOPASSWD" ];
-        }
-        {
-          command = "/run/current-system/sw/bin/fail2ban-client status *";
-          options = [ "NOPASSWD" ];
-        }
-        {
-          command = "/run/current-system/sw/bin/fail2ban-client get * banned";
-          options = [ "NOPASSWD" ];
-        }
-        # Log inspection
-        {
-          command = "/run/current-system/sw/bin/journalctl -t kernel -n 100";
-          options = [ "NOPASSWD" ];
-        }
-        {
-          command = "/run/current-system/sw/bin/journalctl -u fail2ban -n 50";
-          options = [ "NOPASSWD" ];
-        }
-        {
-          command = "/run/current-system/sw/bin/journalctl -u firewall -n 50";
-          options = [ "NOPASSWD" ];
-        }
-        # SSH config verification
-        {
-          command = "/run/current-system/sw/bin/sshd -T";
-          options = [ "NOPASSWD" ];
-        }
-        # Docker service checks
-        {
-          command = "/run/current-system/sw/bin/docker ps";
-          options = [ "NOPASSWD" ];
-        }
-        {
-          command = "/run/current-system/sw/bin/docker inspect *";
-          options = [ "NOPASSWD" ];
-        }
-        # Network diagnostics
-        {
-          command = "/run/current-system/sw/bin/ss -tlnp";
-          options = [ "NOPASSWD" ];
-        }
-        {
-          command = "/run/current-system/sw/bin/cat /proc/net/tcp";
-          options = [ "NOPASSWD" ];
-        }
-      ];
-    }
-  ];
 }