Compare commits

..

13 Commits

Author SHA1 Message Date
4cceab05d0 Merge pull request 'security: harden lazyworkhorse with firewall, fail2ban, SSH hardening' (#28) from feature/server-hardening-clean into master
Reviewed-on: #28
2026-05-03 09:11:56 +00:00
bcebf18676 fix: move filter into jail settings (NixOS submodule doesn't pass string filters) 2026-05-01 11:59:33 +00:00
0370d784a0 fix: http-botsearch logpath must be string, not list 2026-05-01 04:02:06 +00:00
260b2d2756 fix: restructure fail2ban jails per NixOS module - recidive in jails, settings attr, str bantime 2026-05-01 03:59:32 +00:00
2477acdfc7 fix: services.fail2ban top-level options - no findtime, maxretry lowercase 2026-05-01 03:57:21 +00:00
81c25d3f20 fix: use security.auditd instead of services.auditd 2026-05-01 03:55:09 +00:00
9b1f467db9 fix: remove invalid networking.firewall.defaultAllow option 2026-05-01 03:52:57 +00:00
65fa778b2b fix: add custom traefik fail2ban filters for http-auth and http-botsearch jails 2026-05-01 03:40:59 +00:00
5d3bbe99f3 chore: update compose submodule for traefik access logs 2026-05-01 03:33:34 +00:00
Robert
bcf5cadaa0 olllama template fix to remove currenttime 2026-04-30 21:54:47 -04:00
3e04ccc1e8 security: remove deployment commands from ai-worker sudo rules
ai-worker only needs security audit commands, not deployment access.

Removed:
- nh os switch
- nixos-rebuild switch

Kept:
- Firewall checks (iptables)
- Fail2ban status
- Log inspection (journalctl)
- SSH config (sshd -T)
- Docker service checks
- Network diagnostics
2026-04-30 17:46:39 +00:00
21bd4bb283 security: add restricted sudo for ai-worker with security audit commands
- Deployment: nh os switch, nixos-rebuild switch (flake path locked)
- Firewall checks: iptables -L, iptables -S
- Fail2ban: status, banned IPs
- Logs: journalctl for kernel and fail2ban
- SSH config: sshd -T for verification
- Docker: ps, inspect (service health)
- Network: ss -tlnp, /proc/net/tcp

All commands are whitelisted with NOPASSWD.
No shell access, no ALL command - principle of least privilege.
2026-04-30 17:46:39 +00:00
7994aad8d8 security: harden lazyworkhorse with firewall, fail2ban, SSH hardening
- Firewall (default deny):
  - Allow only essential ports: SSH(2424), Gitea(2222), HTTP(80), HTTPS(443)
  - Rate limit SSH (max 4 new connections/60s)
  - Rate limit HTTP/HTTPS (25/minute)
  - Drop invalid packets, log dropped packets

- Fail2ban (auto-ban attackers):
  - SSH jail: 3 strikes = 1 hour ban
  - HTTP auth failures: 5 strikes = 1 hour ban
  - HTTP scanning: 2 strikes = 2 hour ban
  - Recidive jail: repeat offenders = 1 week ban

- SSH hardening:
  - No root login
  - Max 3 auth tries, 5 sessions
  - 30s login grace time
  - No X11/TCP/agent forwarding
  - Verbose logging

- Kernel network hardening:
  - SYN flood protection (syncookies)
  - IP spoofing protection (rp_filter)
  - Disable source routing, redirects
  - Log martian packets
  - Connection tuning for high load

- Audit logging enabled

Ports commented for review (likely internal-only):
- 8000 (Portainer), 4242 (Coms), 5000/8087/8089 (TAK)
2026-04-30 17:46:39 +00:00
7 changed files with 355 additions and 23 deletions

View File

@@ -13,9 +13,7 @@ None
-**Phase 1: Foundation Setup** - Establish core NixOS configuration with flakes
-**Phase 2: Docker Service Integration** - Integrate Docker Compose services
-**Phase 3: AI Assistant Integration** - Enable AI-assisted infrastructure management
- **Phase 4: Internet Access & MCP** - MCP server for web access
- 🚨 **Security Hardening** - CRITICAL: Firewall, fail2ban, SSH hardening (PR #28)
- [ ] **Phase 5: TAK Server** - Research, implementation, and validation
- [ ] **Phase 4: Internet Access & MCP** - MCP server for web access
## Phase Details
@@ -135,25 +133,8 @@ Plans:
## Progress
**Merge Priority Order** (CRITICAL - merge in this order):
| Priority | PR | Description | Status | Notes |
|----------|-----|-------------|--------|-------|
| 🚨 1 | #28 | **Security hardening** (firewall, fail2ban, SSH) | Open | **MERGE FIRST** - protects all other services |
| 2 | #22 | Matrix bridge dependency fix | Open | Blocks Hermes functionality |
| 3 | #21 | Backup network creation fix | Open | Infrastructure fix |
| 4 | #25 | Hermes voice GPU support | Open | Feature enhancement |
| 5 | #24 | uConsole CM5 host | Open | New hardware support |
| 6 | #23 | NixOS deployment infrastructure | Open | Deployment tooling |
| 7 | #1 | AI worker restricted access | Open | Legacy PR (superseded by hardening) |
**Execution Order:**
Phases execute in numeric order: 1 → 2 → 3 → 4 → Security → 5 → 6 → 7
**Merge vs Phase Execution:**
- PRs can merge independently (no strict phase ordering for merges)
- **EXCEPTION:** Security hardening (#28) must merge before any new services are exposed
- After security merge, deploy with: `nh os switch --flake .#lazyworkhorse`
Phases execute in numeric order: 1 → 2 → 3 → 4 → 5 → 6 → 7
| Phase | Milestone | Plans Complete | Status | Completed |
|-------|-----------|----------------|--------|-----------|

View File

@@ -5,6 +5,7 @@ This document outlines the development conventions for this NixOS-based infrastr
## Build & Deployment
- **Build/Deploy:** Use `nixos-rebuild switch --flake .#<hostname>` to build and deploy the configuration for a specific host.
- **CRITICAL — Validate before pushing:** Always `nix build --no-link '.#nixosConfigurations.<hostname>.config.system.build.toplevel'` (or `nh os build`) and confirm it succeeds before pushing any changes. Never push untested NixOS configs.
- **Development Shell:** Activate the development environment with `nix develop`.
## Linting & Formatting

View File

@@ -0,0 +1,71 @@
FROM ghcr.io/astral-sh/uv:0.11.6-python3.13-trixie@sha256:b3c543b6c4f23a5f2df22866bd7857e5d304b67a564f4feab6ac22044dde719b AS uv_source
FROM tianon/gosu:1.19-trixie@sha256:3b176695959c71e123eb390d427efc665eeb561b1540e82679c15e992006b8b9 AS gosu_source
FROM debian:13.4
# Disable Python stdout buffering to ensure logs are printed immediately
ENV PYTHONUNBUFFERED=1
# Store Playwright browsers outside the volume mount so the build-time
# install survives the /opt/data volume overlay at runtime.
ENV PLAYWRIGHT_BROWSERS_PATH=/opt/hermes/.playwright
# Install system dependencies in one layer, clear APT cache
# tini reaps orphaned zombie processes (MCP stdio subprocesses, git, bun, etc.)
# that would otherwise accumulate when hermes runs as PID 1. See #15012.
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential nodejs npm python3 ripgrep ffmpeg gcc python3-dev libffi-dev procps git openssh-client docker-cli tini \
curl poppler-utils imagemagick \
chromium xvfb fonts-noto-color-emoji fonts-unifont fonts-liberation fonts-ipafont-gothic fonts-wqy-zenhei fonts-tlwg-loma-otf fonts-freefont-ttf \
libasound2t64 libatk-bridge2.0-0t64 libatk1.0-0t64 libatspi2.0-0t64 libcairo2 libcups2t64 libdbus-1-3 libdrm2 libgbm1 libglib2.0-0t64 libnspr4 libnss3 libpango-1.0-0 libx11-6 libxcb1 libxcomposite1 libxdamage1 libxext6 libxfixes3 libxkbcommon0 libxrandr2 \
texlive-latex-base texlive-latex-extra texlive-fonts-recommended texlive-xetex texlive-science \
qemu-user-static binfmt-support qemu-user-binfmt \
emacs-nox \
libportaudio2 && \
rm -rf /var/lib/apt/lists/*
# Non-root user for runtime; UID can be overridden via HERMES_UID at runtime
RUN useradd -u 10000 -m -d /opt/data hermes
COPY --chmod=0755 --from=gosu_source /gosu /usr/local/bin/
COPY --chmod=0755 --from=uv_source /usr/local/bin/uv /usr/local/bin/uvx /usr/local/bin/
WORKDIR /opt/hermes
# ---------- Layer-cached dependency install ----------
# Copy only package manifests first so npm install + Playwright are cached
# unless the lockfiles themselves change.
COPY package.json package-lock.json ./
COPY web/package.json web/package-lock.json web/
RUN npm install --prefer-offline --no-audit && \
npx playwright install --with-deps chromium --only-shell && \
(cd web && npm install --prefer-offline --no-audit) && \
npm cache clean --force
# ---------- Source code ----------
# .dockerignore excludes node_modules, so the installs above survive.
COPY --chown=hermes:hermes . .
# Build web dashboard (Vite outputs to hermes_cli/web_dist/)
RUN cd web && npm run build
# ---------- Permissions ----------
# Make install dir world-readable so any HERMES_UID can read it at runtime.
# The venv needs to be traversable too.
USER root
RUN chmod -R a+rX /opt/hermes
# Start as root so the entrypoint can usermod/groupmod + gosu.
# If HERMES_UID is unset, the entrypoint drops to the default hermes user (10000).
# ---------- Python virtualenv ----------
RUN uv venv && \
uv pip install --no-cache-dir -e ".[all]" && \
uv pip install --no-cache-dir sounddevice numpy faster-whisper
# ---------- Runtime ----------
ENV HERMES_WEB_DIST=/opt/hermes/hermes_cli/web_dist
ENV HERMES_HOME=/opt/data
ENV PATH="/opt/data/.local/bin:${PATH}"
VOLUME [ "/opt/data" ]
ENTRYPOINT [ "/usr/bin/tini", "-g", "--", "/opt/hermes/docker/entrypoint.sh" ]

View File

@@ -158,7 +158,7 @@
settings = {
PasswordAuthentication = false;
KbdInteractiveAuthentication = false;
PermitRootLogin = "prohibit-password";
# Additional hardening settings below in SERVER HARDENING section
};
hostKeys = [
{
@@ -308,6 +308,196 @@
# Or disable the firewall altogether.
# networking.firewall.enable = false;
# =============================================================================
# SERVER HARDENING - Firewall, Fail2ban, SSH, Kernel
# =============================================================================
# Firewall - default deny, explicit allow
networking.firewall = {
# Enable firewall with default deny policy (NixOS firewall denies all by default)
enable = true;
allowPing = true;
# Only essential ports exposed to internet
allowedTCPPorts = [
2424 # SSH (non-standard port)
2222 # Gitea (version control)
80 # HTTP (Traefik redirect)
443 # HTTPS (Traefik)
# 8000 # Portainer - REVIEW: internal only?
# 4242 # Coms - REVIEW: internal only?
# 5000 # TAK API - REVIEW: internal only?
# 8087 # TAK Connect - REVIEW: internal only?
# 8089 # TAK Management - REVIEW: internal only?
];
allowedUDPPorts = [
# Add UDP ports if required
];
# Rate limiting and attack prevention
extraCommands = ''
# Rate limit SSH connections (max 4 new connections per 60 seconds)
iptables -A INPUT -p tcp --dport 2424 -m state --state NEW -m recent --set
iptables -A INPUT -p tcp --dport 2424 -m state --state NEW -m recent --update --seconds 60 --hitcount 4 -j DROP
# Rate limit HTTP/HTTPS (protects Traefik)
iptables -A INPUT -p tcp --dport 80 -m state --state NEW -m limit --limit 25/minute --limit-burst 100 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -m state --state NEW -m limit --limit 25/minute --limit-burst 100 -j ACCEPT
# Drop invalid packets
iptables -A INPUT -m state --state INVALID -j DROP
# Log dropped packets (rate limited)
iptables -A INPUT -m limit --limit 5/min -j LOG --log-prefix "IPTables-Dropped: " --log-level 4
'';
};
# Fail2ban - automatic IP banning
services.fail2ban = {
enable = true;
maxretry = 3;
bantime = "1h";
banaction = "iptables-multiport";
jails = {
# SSH brute force protection (uses systemd journal backend)
sshd = {
enabled = true;
settings = {
filter = "sshd";
port = "2424";
maxretry = 3;
bantime = "1h";
};
};
# Recidive - ban repeat offenders for 1 week
recidive = {
enabled = true;
settings = {
filter = "recidive";
logpath = "/var/log/fail2ban.log";
bantime = "1w";
findtime = "1d";
maxretry = 3;
};
};
# HTTP authentication failures (Traefik)
http-auth = {
enabled = true;
settings = {
filter = "traefik-auth";
port = "80,443";
logpath = "/var/log/traefik/access.log";
maxretry = 5;
bantime = "1h";
};
};
# HTTP scanning/attacks (Traefik)
http-botsearch = {
enabled = true;
settings = {
filter = "traefik-botsearch";
port = "80,443";
logpath = "/var/log/traefik/access.log";
maxretry = 2;
bantime = "2h";
};
};
};
};
# Custom fail2ban filters for Traefik
environment.etc."fail2ban/filter.d/traefik-auth.conf".text = ''
[Definition]
failregex = ^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE).*" (401|403) \d+.*$
ignoreregex =
'';
environment.etc."fail2ban/filter.d/traefik-botsearch.conf".text = ''
[Definition]
failregex = ^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE).*" 404 \d+.*$
^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE).*/(\.|wp-|php|admin|login|xmlrpc|\.env|\.git|\.aws|\.azure).*" \d+.*$
ignoreregex =
'';
# SSH hardening
services.openssh.settings = {
PermitRootLogin = "no";
MaxAuthTries = 3;
MaxSessions = 5;
LoginGraceTime = 30;
ClientAliveInterval = 300;
ClientAliveCountMax = 2;
PermitEmptyPasswords = "no";
ChallengeResponseAuthentication = "no";
UsePAM = true;
LogLevel = "VERBOSE";
X11Forwarding = false;
AllowTcpForwarding = "no";
AllowAgentForwarding = "no";
PermitTunnel = "no";
};
# Kernel network hardening
boot.kernel.sysctl = {
# IP Spoofing protection
"net.ipv4.conf.all.rp_filter" = 1;
"net.ipv4.conf.default.rp_filter" = 1;
# Ignore ICMP broadcasts
"net.ipv4.icmp_echo_ignore_broadcasts" = 1;
# Disable source routing
"net.ipv4.conf.all.accept_source_route" = 0;
"net.ipv4.conf.default.accept_source_route" = 0;
"net.ipv6.conf.all.accept_source_route" = 0;
"net.ipv6.conf.default.accept_source_route" = 0;
# Disable redirects
"net.ipv4.conf.all.send_redirects" = 0;
"net.ipv4.conf.default.send_redirects" = 0;
# SYN flood protection
"net.ipv4.tcp_syncookies" = 1;
"net.ipv4.tcp_max_syn_backlog" = 2048;
"net.ipv4.tcp_synack_retries" = 2;
"net.ipv4.tcp_syn_retries" = 5;
# Log martian packets
"net.ipv4.conf.all.log_martians" = 1;
"net.ipv4.conf.default.log_martians" = 1;
# Ignore redirects
"net.ipv4.conf.all.accept_redirects" = 0;
"net.ipv4.conf.default.accept_redirects" = 0;
"net.ipv4.conf.all.secure_redirects" = 0;
"net.ipv4.conf.default.secure_redirects" = 0;
"net.ipv6.conf.all.accept_redirects" = 0;
"net.ipv6.conf.default.accept_redirects" = 0;
# Connection tuning
"net.core.somaxconn" = 4096;
"net.core.netdev_max_backlog" = 65536;
"net.ipv4.tcp_max_orphans" = 65536;
"net.ipv4.tcp_fin_timeout" = 15;
"net.ipv4.tcp_keepalive_time" = 300;
"net.ipv4.tcp_keepalive_probes" = 5;
"net.ipv4.tcp_keepalive_intvl" = 15;
};
# Audit logging
security.auditd.enable = true;
# Fail2ban log directory
systemd.tmpfiles.rules = [
"d /var/log/fail2ban 0755 root root -"
"d /var/log/traefik 0755 root root -"
];
# Copy the NixOS configuration file and link it from the resulting system
# (/run/current-system/configuration.nix). This is useful in case you
# accidentally delete configuration.nix.

View File

@@ -14,8 +14,25 @@
local base_model=$2
if ! ${pkgs.docker}/bin/docker exec ollama ollama list | grep -q "$model_name"; then
echo "$model_name not found, creating from $base_model..."
# We use a custom TEMPLATE block to strip the 'currentDate' function
# which is unsupported in Ollama 0.5.7 but present in Devstral's default manifest.
${pkgs.docker}/bin/docker exec ollama sh -c "cat <<EOF > /root/.ollama/$model_name.modelfile
FROM $base_model
TEMPLATE \"\"\"{{- if .System }}
[SYSTEM_PROMPT]
{{ .System }}
[/SYSTEM_PROMPT]
{{- end }}
{{- range .Messages }}
{{- if eq .Role \"user\" }}
[INST]
{{ .Content }}
[/INST]
{{- else if eq .Role \"assistant\" }}
{{ .Content }}
{{- end }}
{{- end }}\"\"\"
PARAMETER num_ctx 131072
PARAMETER num_predict 4096
PARAMETER num_keep 1024
@@ -26,6 +43,7 @@ PARAMETER stop \"[/INST]\"
PARAMETER stop \"</s>\"
EOF"
${pkgs.docker}/bin/docker exec ollama ollama create "$model_name" -f "/root/.ollama/$model_name.modelfile"
${pkgs.docker}/bin/docker exec ollama rm "/root/.ollama/$model_name.modelfile"
else
echo "$model_name already exists, skipping."
fi
@@ -36,6 +54,10 @@ EOF"
# Create Devstral
create_model_if_missing "devstral-small-2:24b-128k" "devstral-small-2:24b"
# create_model_if_missing "qwen2.5-coder:32b-128k" "qwen2.5-coder:32b"
# create_model_if_missing "mistral-large-planner:123b" "mistral-large:123b-instruct-v2407-q4_K_S"
'';
serviceConfig = {
Type = "oneshot";

View File

@@ -11,4 +11,71 @@
];
};
users.groups.ai-worker = {};
# Restricted sudo for ai-worker - security checks only
security.sudo.extraRules = [
{
users = [ "ai-worker" ];
commands = [
# Firewall checks
{
command = "/run/wrappers/bin/sudo iptables -L -n -v";
options = [ "NOPASSWD" ];
}
{
command = "/run/wrappers/bin/sudo iptables -S";
options = [ "NOPASSWD" ];
}
# Fail2ban status
{
command = "/run/current-system/sw/bin/fail2ban-client status";
options = [ "NOPASSWD" ];
}
{
command = "/run/current-system/sw/bin/fail2ban-client status *";
options = [ "NOPASSWD" ];
}
{
command = "/run/current-system/sw/bin/fail2ban-client get * banned";
options = [ "NOPASSWD" ];
}
# Log inspection
{
command = "/run/current-system/sw/bin/journalctl -t kernel -n 100";
options = [ "NOPASSWD" ];
}
{
command = "/run/current-system/sw/bin/journalctl -u fail2ban -n 50";
options = [ "NOPASSWD" ];
}
{
command = "/run/current-system/sw/bin/journalctl -u firewall -n 50";
options = [ "NOPASSWD" ];
}
# SSH config verification
{
command = "/run/current-system/sw/bin/sshd -T";
options = [ "NOPASSWD" ];
}
# Docker service checks
{
command = "/run/current-system/sw/bin/docker ps";
options = [ "NOPASSWD" ];
}
{
command = "/run/current-system/sw/bin/docker inspect *";
options = [ "NOPASSWD" ];
}
# Network diagnostics
{
command = "/run/current-system/sw/bin/ss -tlnp";
options = [ "NOPASSWD" ];
}
{
command = "/run/current-system/sw/bin/cat /proc/net/tcp";
options = [ "NOPASSWD" ];
}
];
}
];
}