NVIDIA’s latest release of the cuBLAS library, version 12.5, adds new functionality and performance improvements for deep learning and high-performance computing workloads. Key updates include new Grouped GEMM APIs, faster matrix multiplication on NVIDIA GPUs, and expanded performance tuning options.
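
To make the term concrete, here is a minimal CPU sketch of what a grouped GEMM computes. It assumes the usual meaning of the term: a batch of independent `C = alpha * A * B + beta * C` problems organized into groups, where every problem in a group shares the same dimensions and scalars but different groups may differ. It is not the cuBLAS API. The names `Group`, `gemm`, and `grouped_gemm` are hypothetical and chosen only for illustration.

```python
# CPU reference for the *semantics* of a grouped GEMM. This is NOT the cuBLAS
# API: Group, gemm and grouped_gemm are hypothetical names for illustration.
from dataclasses import dataclass
from typing import List

Matrix = List[List[float]]  # row-major list of rows


@dataclass
class Group:
    """A set of GEMM problems that share dimensions and scalars."""
    m: int            # rows of A and C
    n: int            # columns of B and C
    k: int            # columns of A, rows of B
    alpha: float
    beta: float
    a: List[Matrix]   # one m x k matrix per problem
    b: List[Matrix]   # one k x n matrix per problem
    c: List[Matrix]   # one m x n matrix per problem, updated in place


def gemm(m: int, n: int, k: int, alpha: float, beta: float,
         a: Matrix, b: Matrix, c: Matrix) -> None:
    """Compute C = alpha * A @ B + beta * C in place."""
    for i in range(m):
        for j in range(n):
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            c[i][j] = alpha * acc + beta * c[i][j]


def grouped_gemm(groups: List[Group]) -> None:
    """Run every problem in every group; groups may differ in shape."""
    for g in groups:
        # Each group must supply the same number of A, B and C matrices.
        assert len(g.a) == len(g.b) == len(g.c)
        for a, b, c in zip(g.a, g.b, g.c):
            gemm(g.m, g.n, g.k, g.alpha, g.beta, a, b, c)
```

A GPU library would presumably dispatch all of these problems from a single call rather than looping on the host; the sketch shows only the arithmetic each problem performs.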