The Evolution of Google Datacenter Networking Architecture and Technologies

(T) I have been trying over the years to follow the evolution of Google’s datacenter and networking technologies, and in particular its switching fabric, Jupiter. See the following blog posts: “Google’s Data Center Switching Fabric” and “How Google Manages its SDN Network”.

Over the summer, Google presented a paper at SIGCOMM 2022 about the recent evolution of Jupiter. Unfortunately, the paper does not give many technical details regarding Google’s network architecture.

The core network of Jupiter is based on optical circuit switching (OCS) and wavelength division multiplexing (WDM). As it has always done in the past, Google still uses cheap commodity Ethernet switches for its racks. The network's control plane is based on software-defined networking (SDN).

The data center hosts tens of thousands of servers connected at 100s of Gb/s of bandwidth with sub-100 µs latency, along with hundreds of individual racks housing the switches and tens of thousands of fiber pairs.

Today, the native network interconnect runs at 400 Gb/s, and the aggregate bandwidth per data center is over 6 Pb/s.

Ideally, the design of the network aims “to support heterogeneous network elements in a ‘pay as you grow’ model, adding network elements only when needed and with the latest generation of technology incrementally.”

It also “should allocate bandwidth and pathing for services based on real-time communication patterns and application-aware optimization of the network.”

Traffic is split “among multiple shortest and non-shortest paths while observing link capacity, real-time communication patterns, and individual application priority.”
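To make the idea of splitting traffic across shortest and non-shortest paths more concrete, here is a minimal sketch in Python. It is not Google's algorithm, which the paper does not spell out at this level. The `Path` class, the `split_demand` function, the path names, and the `max_utilization` headroom parameter are all hypothetical, and the sketch ignores application priority and real-time traffic measurements. It simply fills shorter paths first, up to a capacity cap, and spills the remainder onto longer paths.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str             # hypothetical label for the path
    hops: int             # path length; fewer hops means a shorter path
    capacity_gbps: float  # assumed bottleneck capacity along the path


def split_demand(demand_gbps, paths, max_utilization=0.75):
    """Greedy split: fill shorter paths first, up to a utilization cap,
    then spill the remaining demand onto longer (non-shortest) paths."""
    allocation = {p.name: 0.0 for p in paths}
    remaining = demand_gbps
    # sorted() is stable, so paths with equal hop counts keep their input order.
    for p in sorted(paths, key=lambda p: p.hops):
        if remaining <= 0:
            break
        usable = p.capacity_gbps * max_utilization
        share = min(remaining, usable)
        allocation[p.name] = share
        remaining -= share
    # remaining > 0 means the demand could not be fully placed.
    return allocation, remaining


# Illustrative numbers only: one direct path and two longer transit paths.
paths = [
    Path("direct", hops=1, capacity_gbps=400.0),
    Path("via_block_C", hops=2, capacity_gbps=400.0),
    Path("via_block_D", hops=2, capacity_gbps=200.0),
]
alloc, unplaced = split_demand(500.0, paths)
print(alloc, unplaced)
```

With these made-up numbers, the direct path is capped at 300 Gb/s and the remaining 200 Gb/s spills onto the first transit path. A real traffic engineering system would presumably solve this as an optimization over all demands at once rather than greedily per demand.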

Key metrics that need to be continuously improved are:

  • Reducing the flow completion time
  • Increasing the network throughput
  • Reducing the network downtime
  • Reducing the power consumption

All of that, of course, while incurring lower costs.

Here is a presentation of the paper:

