Google has published a new scientific paper detailing the performance of its Cloud TPU v4 supercomputing platform, claiming it delivers exascale performance for machine learning with improved efficiency. According to the paper, the TPU v4 is 1.2x-1.7x faster and uses 1.3x-1.9x less power than the Nvidia A100 in similarly sized systems. A TPU v4 supercomputer contains 4,096 chips interconnected via proprietary optical circuit switches (OCS), which Google says are faster, cheaper, and consume less power than InfiniBand. Google engineers and paper authors Norm Jouppi and David Patterson explained that the TPU v4 enabled a nearly 10x leap in scaling ML system performance over the TPU v3, while boosting energy efficiency by roughly 2-3x compared to contemporary ML domain-specific architectures (DSAs).
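To put the cited figures in perspective, the speedup and power-reduction ranges can be combined into an implied performance-per-watt advantage. The sketch below is purely illustrative arithmetic based on the numbers quoted above; the pairing of the endpoints is an assumption, not a calculation made in the paper itself.

```python
# Illustrative only: combines the article's quoted 1.2x-1.7x speedup and
# 1.3x-1.9x power-reduction figures into an implied perf/W range.

def perf_per_watt_gain(speedup: float, power_reduction: float) -> float:
    """Relative perf/W = (TPU v4 perf / A100 perf) * (A100 power / TPU v4 power)."""
    return speedup * power_reduction

# Most conservative and most favorable pairings of the quoted ranges
low = perf_per_watt_gain(1.2, 1.3)
high = perf_per_watt_gain(1.7, 1.9)
print(f"Implied perf/W advantage: {low:.2f}x to {high:.2f}x")
# prints "Implied perf/W advantage: 1.56x to 3.23x"
```

Note that the true perf/W gap depends on which workloads produced which endpoint of each range, so the extremes here bracket rather than pinpoint the real figure.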