Design 2 — Bare Metal + Slurm + RoCE
The traditional HPC blueprint. No containers and no Kubernetes: Slurm schedules jobs and launches them directly on bare-metal nodes, with RoCE (RDMA over Converged Ethernet) as the interconnect. This gives the highest performance ceiling and the simplest mental model, at the cost of multi-tenancy and dynamic scheduling.
Best for: Performance-sensitive single-tenant workloads, such as research labs, weather, physics, and training runs that pin the whole cluster.
Trade-offs: No isolation between users. No container portability. Manual environment management.
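To make the "Slurm directly on bare metal" model concrete, here is a minimal sketch of what a whole-cluster job submission might look like. It is a small Python helper that renders a Slurm batch script. The helper name `render_sbatch`, the GPU count, the HCA name `mlx5_0`, the GID index, and the `python train.py` command are all illustrative assumptions, not values from this design. The `#SBATCH` options and `srun` are standard Slurm, and `NCCL_IB_HCA` / `NCCL_IB_GID_INDEX` are NCCL environment variables that only matter if the job uses NCCL over RoCE.

```python
def render_sbatch(job_name, nodes, gpus_per_node, command,
                  roce_hca="mlx5_0", gid_index=3):
    """Return the text of a Slurm batch script for an exclusive multi-node job.

    All default values are placeholders (assumptions); check the real
    device name and GID index on the actual nodes before using them.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        # One task per GPU is a common layout for distributed training.
        f"#SBATCH --ntasks-per-node={gpus_per_node}",
        f"#SBATCH --gpus-per-node={gpus_per_node}",
        # Whole nodes, no sharing: matches the single-tenant trade-off.
        "#SBATCH --exclusive",
        "",
        "# Assumed NCCL settings for a RoCE fabric; values are site-specific.",
        f"export NCCL_IB_HCA={roce_hca}",
        f"export NCCL_IB_GID_INDEX={gid_index}",
        "",
        # srun starts the command on every allocated task, with no
        # container runtime in between.
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical 16-node, 8-GPU-per-node training run.
    print(render_sbatch("train", 16, 8, "python train.py"))
```

The rendered text can be saved to a file and submitted with `sbatch`. Note what is absent: no image, no pod spec, no namespace. The job inherits whatever software environment is installed on the nodes, which is where the manual environment management trade-off comes from.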