Aimed at the sysadmin running the boxes, not at the engineer designing the agent. These guides apply the hardening voice from the rest of the site to the AI infrastructure layer: bind safety, systemd unit hardening, reverse-proxy posture, GPU access controls, model storage, and update hygiene. Read Ollama hardening first if you are starting from a fresh box; the other pieces (vector databases, AI gateways, MCP servers, production LLM-serving stacks) layer on top.
Ollama Hardening
Tested on: Ubuntu 24.04 LTS, Ollama 0.30.9 (June 2026 stable line), NVIDIA driver 555-series with CUDA 12.x, Caddy 2.8 as the reverse proxy. The Ollama binary moves fast; verify the systemd unit and environment-variable names against the release you are installing.
This guide assumes you have already worked through ubuntu-baseline or rhel-baseline and the ssh-hardening guide. Ollama is layered on top of a defensible host, not a fresh provider image. If the underlying box is not hardened, hardening the model server on top of it is theatre. ...
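One way to act on the advice above about verifying the systemd unit and environment-variable names is a small offline check that reads the unit text and reports where the server would bind. The following is a minimal sketch, not a definitive tool. It assumes the bind address is controlled by the `OLLAMA_HOST` variable and that an unset variable means a loopback default, both of which should be confirmed against the release you are installing. The function names and the sample unit are hypothetical, and `shlex` only approximates systemd's quoting rules.

```python
import ipaddress
import shlex


def unit_environment(unit_text: str) -> dict[str, str]:
    """Collect KEY=VALUE pairs from Environment= lines in systemd unit text."""
    env: dict[str, str] = {}
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line.startswith("Environment="):
            continue
        rest = line[len("Environment="):]
        if not rest.strip():
            # An empty assignment resets the list, as systemd does.
            env.clear()
            continue
        # shlex is an approximation of systemd quoting; fine for a sanity check.
        for item in shlex.split(rest):
            key, sep, value = item.partition("=")
            if sep:
                env[key] = value  # later lines (drop-ins) override earlier ones
    return env


def is_loopback_bind(host_value: str) -> bool:
    """Return True only if the host part is clearly a loopback address."""
    host = host_value.strip()
    if "://" in host:                      # tolerate a scheme prefix
        host = host.split("://", 1)[1]
    host = host.split("/", 1)[0]
    if host.startswith("["):               # bracketed IPv6, e.g. [::1]:11434
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:             # host:port
        host = host.rsplit(":", 1)[0]
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Unparseable or empty values are treated as unsafe, to fail closed.
        return False


def describe_bind(unit_text: str) -> str:
    """Summarise the bind posture implied by the unit's environment."""
    value = unit_environment(unit_text).get("OLLAMA_HOST")
    if value is None:
        return "OLLAMA_HOST not set: relying on the release default (verify it)"
    if is_loopback_bind(value):
        return f"OLLAMA_HOST={value}: loopback only"
    return f"OLLAMA_HOST={value}: NOT loopback, reachable beyond the host"


# Hypothetical unit text, standing in for real `systemctl cat` output.
SAMPLE_UNIT = """\
[Service]
ExecStart=/usr/local/bin/ollama serve
Environment="OLLAMA_HOST=0.0.0.0:11434"
"""

print(describe_bind(SAMPLE_UNIT))
```

To run it against a real host, pass in the text printed by `systemctl cat` for the Ollama unit (commonly `ollama.service`, though that name is also worth verifying). Because `systemctl cat` prints drop-in files after the main unit, the last-assignment-wins behaviour in `unit_environment` roughly mirrors how overrides are applied.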