Congestion Control Options
The transport decides what gets carried. Congestion control decides when the senders slow down. They're co-designed — pick a transport and you've largely picked the CC algorithm that pairs with it.
This page is the CC design space. If transports are the wire, these are the brakes.
Four decades in three eras
Era 1 — TCP's reactive CC (1988–2010)
CC was born in 1988 with TCP slow start and congestion avoidance — a pure software algorithm reacting after loss. CUBIC (2006) became Linux's default by aggressively filling bandwidth between losses. ECN was standardized in 2001 (RFC 3168) — the network could finally mark instead of drop — but for a decade almost nobody used it. The whole era was reactive: the sender knew there was congestion only after packets had already been thrown away.
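To make "reactive" concrete, here is a minimal sketch of that loop in illustrative Python: grow the window until loss, then cut. The variable names, MSS-unit window, and Tahoe-style reset are assumptions for clarity, not any specific kernel's code.

```python
# Minimal sketch of 1988-era TCP congestion control (slow start + AIMD).
# Illustrative only: window counted in MSS units, Tahoe-style reset on loss.

def on_ack(state):
    """Grow the congestion window on each ACK."""
    if state["cwnd"] < state["ssthresh"]:
        state["cwnd"] += 1                   # slow start: exponential growth per RTT
    else:
        state["cwnd"] += 1 / state["cwnd"]   # congestion avoidance: ~+1 MSS per RTT

def on_loss(state):
    """React only AFTER loss -- the defining trait of the reactive era."""
    state["ssthresh"] = max(state["cwnd"] / 2, 2)  # multiplicative decrease
    state["cwnd"] = 1.0                            # restart from slow start

state = {"cwnd": 1.0, "ssthresh": 64.0}
```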
Era 2 — Datacenter needs precision (2010–2015)
DCTCP (Microsoft / Stanford, 2010) proved that fine-grained ECN-based CC works in tight datacenter networks — small queues, proportional reaction, no waiting for loss. DCQCN (Microsoft Research, 2015) ported the idea to RoCE v2 and became the canonical RDMA CC. TIMELY (Google, 2015) showed that delay (RTT gradient) is a viable signal too, no ECN needed. CC moved from reactive to proactive, and from software-only to NIC-offloaded.
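A hedged sketch of DCTCP's core trick, following the per-window bookkeeping described in the paper: the window cut is proportional to the fraction of ECN-marked packets, so mild congestion gets a mild response. Names and the state layout are illustrative.

```python
# Sketch of DCTCP's proportional reaction to ECN marks (SIGCOMM 2010).
# Real stacks keep this state per connection; names here are illustrative.

G = 1.0 / 16.0  # EWMA gain suggested in the DCTCP paper

def end_of_window(state, acked, marked):
    """Once per window of data: estimate marked fraction, cut proportionally."""
    frac = marked / max(acked, 1)                         # F: fraction ECN-marked
    state["alpha"] = (1 - G) * state["alpha"] + G * frac  # smoothed congestion estimate
    if marked:
        state["cwnd"] *= 1 - state["alpha"] / 2           # gentle cut when alpha is small,
                                                          # Reno-style halving as alpha -> 1

state = {"cwnd": 10.0, "alpha": 0.0}
```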
Era 3 — AI-fabric CC: hardware-co-designed, fabric-wide (2015 → today)
Modern AI-fabric CC isn't an algorithm in the kernel; it's a control loop the switch and the NIC implement together. Swift (Google, 2020) became the basis for Falcon CC (2023, with CSIG + Carousel). HPCC (Alibaba, 2019) uses In-band Network Telemetry (INT) for precise per-link signals. MRC CC (2024) brings programmable CC and microsecond rerouting. UET CC (2025) runs control at both sender and receiver to suit a packet-sprayed environment. Spectrum-X CC (NVIDIA, 2023) is co-designed across switch and NIC.
Congestion control moved from a reactive software algorithm (TCP) to a proactive, hardware-co-designed, fabric-wide control loop (DCQCN, HPCC, UET).
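The other proactive signal from this story, delay, deserves a sketch too: TIMELY (and Swift after it) steer the sending rate off the RTT gradient, no ECN required. A minimal illustrative version, with constants standing in for the paper's tuned values and without its threshold and hyper-increase modes:

```python
# Sketch of a TIMELY-style delay-gradient update (SIGCOMM 2015), simplified.

DELTA = 10e6   # additive step, bits/s (illustrative; paper used ~10 Mbps)
BETA = 0.8     # multiplicative decrease factor (from the paper)

def timely_update(rate, rtt_gradient):
    """rtt_gradient: smoothed dRTT/dt, normalized to the minimum RTT."""
    if rtt_gradient <= 0:
        return rate + DELTA                   # queues draining: probe for more
    return rate * (1 - BETA * rtt_gradient)   # queues building: back off in
                                              # proportion to how fast RTT grows
```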
CC algorithm families
Pick the family you care about; each numbered subsection below is a self-contained reference:
- 1. TCP family
- 2. Datacenter / RDMA
- 3. Hyperscaler custom
- 4. Research / exotic
1. TCP family: what runs the internet.
| Algorithm | Family | Signal | Key trait | Where used |
|---|---|---|---|---|
| Reno / NewReno | TCP | Loss | Classic AIMD; baseline | Legacy TCP everywhere |
| CUBIC | TCP | Loss | Cubic growth — default in Linux/Windows | Most internet TCP today |
| Vegas / Westwood | TCP | Delay / bw-est | Delay-based; low queueing | Niche / research |
| Compound TCP | TCP | Loss + delay | Hybrid | Older Windows |
| BBR v1 / v2 / v3 | TCP / QUIC | Bandwidth + RTT | Model-based; fills bottleneck without filling buffers | Google services, YouTube, QUIC, Linux kernel |
Standout: BBR. It doesn't wait for loss: it models the bottleneck bandwidth and minimum RTT, then sets a pacing rate that fills the pipe without filling buffers. It bridges the TCP era and the modern model-based approach.
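A minimal sketch of that model under simplified assumptions (no gain cycling, no probing states, illustrative field names): estimate bottleneck bandwidth and propagation RTT from delivery samples, then derive a pacing rate and an inflight cap.

```python
# Sketch of BBR's core model: pace at (gain x estimated bottleneck bandwidth)
# instead of waiting for loss. The real algorithm cycles pacing_gain to probe.

def bbr_update(samples, pacing_gain=1.0, cwnd_gain=2.0):
    btl_bw = max(s["delivery_rate"] for s in samples)  # max filter: bottleneck bw
    rt_prop = min(s["rtt"] for s in samples)           # min filter: propagation delay
    bdp = btl_bw * rt_prop                             # bandwidth-delay product
    return {
        "pacing_rate": pacing_gain * btl_bw,  # how fast to put bytes on the wire
        "cwnd": cwnd_gain * bdp,              # cap on data in flight
    }

# Worked numbers: a 10 Gb/s bottleneck (1.25e9 B/s) at 100 us RTT gives a BDP of
# 125 kB -- BBR aims to keep roughly that much in flight, not a full buffer.
```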
2. Datacenter / RDMA: built for tight datacenter networks where ECN/delay signals beat loss.
| Algorithm | Family | Signal | Key trait | Where used |
|---|---|---|---|---|
| DCTCP | DC TCP | ECN marking | Proportional reaction to ECN; small queues | Microsoft, Linux DC TCP stacks |
| DCQCN | RoCE v2 | ECN + PFC | Default RoCE v2 CC; rate-based | Azure, Meta, Tencent — most RoCE clusters |
| TIMELY | RDMA | Delay (RTT gradient) | Delay-based, CPU-light | Google early RDMA |
| Swift | RDMA / Falcon | Delay (NIC RTT) | Decomposes host vs fabric latency; basis for Falcon | Google Falcon |
| HPCC | RDMA | In-band telemetry (INT) | Precise rate using switch INT data | Alibaba |
| PowerTCP | DC TCP / RDMA | Power = bw × queue | Combines bandwidth and queue depth signals | Research / select DCs |
Key takeaway: This is where CC stopped being a pure software problem. DCQCN is what you'll see in most production RoCE clusters today; HPCC and Swift are where it's heading.
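To show why INT makes the signal precise, here is a simplified HPCC-style window update (SIGCOMM 2019): every ACK carries per-hop telemetry, and the sender sizes its window off the most loaded link. ETA, BASE_RTT, and the record fields are assumptions standing in for the paper's parameters, and the paper's additive-increase term is omitted.

```python
# Sketch of an HPCC-style update driven by in-band network telemetry (INT).

ETA = 0.95        # target utilization (below 1.0 keeps queues near empty)
BASE_RTT = 10e-6  # seconds; assumed fabric base RTT

def hpcc_window(window, int_records):
    """int_records: one dict per hop with queue bytes, tx rate, link bandwidth."""
    def utilization(rec):
        # queue term + throughput term: > 1.0 means the link is overloaded
        return (rec["qlen"] / (rec["bandwidth"] * BASE_RTT)
                + rec["tx_rate"] / rec["bandwidth"])

    worst = max(utilization(r) for r in int_records)  # the bottleneck hop decides
    return window * ETA / worst                       # multiplicative move to target
```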
3. Hyperscaler custom: the new generation. Every major hyperscaler runs its own CC, paired tightly with its custom transport.
| Algorithm | Family | Signal | Key trait | Where used |
|---|---|---|---|---|
| MRC CC | MRC | Multipath telemetry | Programmable CC + microsecond rerouting | OpenAI / Microsoft Fairwater / Oracle Abilene |
| Falcon CC (Swift + CSIG + Carousel) | Falcon | Delay + congestion sig. | HW per-flow shaping, multipath PLB | Google + Intel E2100 |
| SRD CC | SRD | Path-level feedback | Avoids overloaded paths; <10 ms RTO | AWS EFA |
| UET CC | UET | Sender + receiver based | Two-sided CC for packet-sprayed environment | Ultra Ethernet 1.0 |
| Spectrum-X CC | Spectrum-X | Switch + NIC telemetry | Switch + NIC co-designed | NVIDIA Spectrum-X |
All five converge on the same four-ingredient recipe; the Mental model section below spells it out.
4. Research / exotic: influential ideas that haven't fully shipped, but ones you'll see cited in papers and vendor decks.
| Algorithm | Family | Signal | Key trait | Where used |
|---|---|---|---|---|
| Homa | Receiver-driven | Priorities + grants | Eliminates HoL via priorities; message-oriented | Stanford research, influential |
| NDP / Trim | Switch-assisted | Header trimming | Switch trims payload on congestion; no whole-packet drop | Cambridge research |
| ExpressPass | Credit-based | Receiver credits | Receiver paces with credit packets | Research |
| EQDS | Edge-queued | Edge-based shaping | Pushes queues to edges, not core | Cambridge / UCL research |
Worth knowing: Homa's receiver-driven priorities model influenced Falcon and several hyperscaler designs. NDP's trim-instead-of-drop idea has been quietly adopted in newer switches.
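To make the receiver-driven idea concrete, here is an illustrative grant scheduler in the spirit of Homa and ExpressPass: the receiver, which can see all incoming messages, paces senders with explicit grants so its downlink never overcommits. The tick budget and shortest-first ordering are assumptions for the sketch, not either system's actual policy.

```python
# Sketch of receiver-driven congestion control via explicit grants/credits.

LINK_BYTES_PER_TICK = 12_500  # assumed 100 Gb/s link, 1 us scheduling tick

def issue_grants(pending):
    """pending: {sender: bytes_remaining}. Grant shortest message first,
    never exceeding one link-tick of credit, so queues stay near empty."""
    grants = {}
    budget = LINK_BYTES_PER_TICK
    for sender, remaining in sorted(pending.items(), key=lambda kv: kv[1]):
        if budget <= 0:
            break
        grants[sender] = min(remaining, budget)  # receiver, not sender, sets the pace
        budget -= grants[sender]
    return grants
```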
Mental model
Most modern AI fabrics combine four ideas: packet spraying + delay-based CC + ECN/INT signals + microsecond failover.
Pick any production AI-fabric CC algorithm — MRC, Falcon, SRD, UET CC, Spectrum-X CC — and you'll find some combination of these four. The PFC-only era is ending.
Who built what — full reference table
| Tech | Owner / Standards body | When |
|---|---|---|
| TCP Reno / NewReno | Berkeley / IETF | 1988 |
| ECN (RFC 3168) | IETF | 2001 |
| CUBIC | NCSU → Linux | 2006 |
| DCTCP | Microsoft / Stanford | SIGCOMM 2010 |
| DCQCN | Microsoft Research | SIGCOMM 2015 |
| TIMELY | Google | SIGCOMM 2015 |
| BBR | Google | 2016 |
| NDP / Trim | Cambridge | 2017 |
| AWS SRD CC | AWS | 2018 |
| Homa | Stanford | 2018 |
| HPCC | Alibaba | SIGCOMM 2019 |
| Swift | Google | SIGCOMM 2020 |
| PowerTCP | Research consortium | NSDI 2022 |
| Spectrum-X CC | NVIDIA | 2023 |
| Falcon CC | Google + Intel | 2023 |
| MRC CC | OpenAI + Microsoft + NVIDIA + AMD + Broadcom + Intel | 2024 |
| UET CC | Ultra Ethernet Consortium | 2024–2025 |
📄 Transports & Congestion Control — One-Pager: same content in a denser, single-sheet print format.
Next: What This Curriculum Picks → the one transport + CC combo this course teaches, and why.