Deployment Models
The five-ish ways to deploy an AI training cluster: bare metal, VM with SR-IOV passthrough, containers on bare metal, Kubernetes, and cloud-managed. Pros, cons, and where you'll see each.
Host Networking
How RDMA reaches the application inside a Kubernetes pod. PF vs VF, SR-IOV, Multus, GPU Operator, and the order in which they have to be configured.
What This Curriculum Picks
Two-layer approach: bare metal as the teaching baseline (simpler mental model), with Kubernetes + Multus + SR-IOV + GPU Operator as the production variant.