Design 1 — Kubernetes + SR-IOV + RoCE
The most modern and flexible of the HPC designs covered here. Kubernetes handles orchestration; SR-IOV splits each physical NIC into up to 16 virtual functions (VFs), so each pod gets dedicated hardware; Multus attaches those VFs to pods as additional network interfaces; and RoCE carries RDMA traffic between GPUs over a lossless Ethernet fabric.
Best for: Large production AI training clusters, multi-tenant environments. Trade-offs: Higher setup complexity. Requires SR-IOV-capable NICs and RoCE-tuned switches.
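To make the division of labor concrete, here is a minimal Python sketch that builds the kind of pod manifest this design implies: a Multus annotation listing the secondary networks, plus resource limits for GPUs and SR-IOV VFs. The annotation key `k8s.v1.cni.cncf.io/networks` and the `nvidia.com/gpu` resource name are the standard ones. The network name `roce-vf-net`, the VF resource name `example.com/sriov_roce_vf`, the image, and the pod name are hypothetical placeholders, since the real values depend on how the SR-IOV device plugin and the NetworkAttachmentDefinition are configured in a given cluster.

```python
import json

# Standard Multus annotation key for attaching secondary networks to a pod.
MULTUS_NETWORKS_ANNOTATION = "k8s.v1.cni.cncf.io/networks"


def build_training_pod(name, image, net_attach_def, vf_resource,
                       num_vfs=1, num_gpus=1):
    """Return a pod manifest (as a dict) that requests GPUs and SR-IOV VFs.

    net_attach_def and vf_resource are cluster-specific names (assumptions
    in this sketch); they must match the NetworkAttachmentDefinition and the
    resource advertised by the SR-IOV device plugin.
    """
    if num_vfs < 1 or num_gpus < 1:
        raise ValueError("num_vfs and num_gpus must be at least 1")

    # Multus adds one extra pod interface per entry in this comma-separated list.
    networks = ",".join([net_attach_def] * num_vfs)

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "annotations": {MULTUS_NETWORKS_ANNOTATION: networks},
        },
        "spec": {
            "containers": [
                {
                    "name": "trainer",
                    "image": image,
                    "resources": {
                        "limits": {
                            "nvidia.com/gpu": num_gpus,
                            # One VF is allocated per unit requested here.
                            vf_resource: num_vfs,
                        }
                    },
                    # Assumption: RDMA workloads commonly need IPC_LOCK
                    # so they can pin (lock) memory for the NIC.
                    "securityContext": {
                        "capabilities": {"add": ["IPC_LOCK"]}
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    pod = build_training_pod(
        name="nccl-worker-0",                      # hypothetical
        image="example.com/nccl-train:latest",     # hypothetical
        net_attach_def="roce-vf-net",              # hypothetical
        vf_resource="example.com/sriov_roce_vf",   # hypothetical
        num_vfs=2,
        num_gpus=2,
    )
    print(json.dumps(pod, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`. Keeping the VF count in the resource limits equal to the number of entries in the annotation is the design choice worth noting: the device plugin allocates the VFs, and Multus then wires each one into the pod as an extra interface.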
Architecture
Build steps — the 15 layers
From rack power up to NCCL all-reduce, each layer is a place where things can break.
What's next
- Design 2 — Bare Metal + Slurm + RoCE — the traditional HPC alternative.
- Design 3 — Kubernetes + Physical NIC + RoCE — simpler K8s, no SR-IOV.
- Design 4 — Bare Metal + MPI + RoCE — minimal lab setup.
- Design 5 — Hybrid: K8s + Slurm + RoCE — real-world large operator.