Amazon Web Services (AWS) has launched Amazon EC2 P5 instances powered by NVIDIA H100 Tensor Core GPUs, letting users scale generative AI, high-performance computing (HPC) and other applications with a click from a browser. The NVIDIA H100 GPU delivers supercomputing-class performance through architectural innovations, including fourth-generation Tensor Cores, a new Transformer Engine for accelerating large language models (LLMs) and the latest NVLink technology. P5 instances are suited to training and running inference for increasingly complex LLMs and computer vision models, and can be deployed in hyperscale clusters called EC2 UltraClusters. P5 instances also offer petabit-scale non-blocking networking, powered by AWS Elastic Fabric Adapter (EFA), and machine learning applications can use the NVIDIA Collective Communications Library (NCCL) to scale across as many as 20,000 H100 GPUs.
