The new NVIDIA A100 GPU based on the NVIDIA Ampere GPU architecture delivers the greatest generational leap in accelerated computing. The A100 GPU has revolutionary hardware capabilities and we're excited to announce CUDA 11 in conjunction with A100.

CUDA 11 enables you to leverage the new hardware capabilities to accelerate HPC, genomics, 5G, rendering, deep learning, data analytics, data science, robotics, and many more diverse workloads.

CUDA 11 is packed full of features, from platform system software to everything that you need to get started and develop GPU-accelerated applications. This post offers an overview of the major software features in this release:

- Support for the NVIDIA Ampere GPU architecture, including the new NVIDIA A100 GPU for accelerated scale-up and scale-out of AI and HPC data centers; multi-GPU systems with the NVSwitch fabric such as the DGX A100 and HGX A100.
- Multi-Instance GPU (MIG) partitioning capability that is particularly beneficial to cloud service providers (CSPs) for improved GPU utilization.
- New third-generation Tensor Cores to accelerate mixed-precision matrix operations on different data types, including TF32 and Bfloat16.
- Programming and APIs for task graphs, asynchronous data movement, fine-grained synchronization, and L2 cache residency control.
- Performance optimizations in CUDA libraries for linear algebra, FFTs, and matrix multiplication.
- Updates to the Nsight product family of tools for tracing, profiling, and debugging of CUDA applications.
- Full support on all major CPU architectures, across x86_64, Arm64 server, and POWER architectures.

A single post cannot do justice to every feature available in CUDA 11. At the end of this post, there are links to GTC Digital sessions that offer deeper dives into the new CUDA features.
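To make the TF32 and Bfloat16 data types mentioned above a little more concrete, here is a minimal host-side sketch of how much mantissa precision each format keeps relative to FP32. Both formats keep the 8-bit FP32 exponent; TF32 keeps 10 mantissa bits and Bfloat16 keeps 7. The helper names (`truncate_mantissa`, `to_tf32_like`, `to_bf16_like`) are hypothetical and not part of any CUDA API, and the sketch truncates the extra bits for simplicity, which is an assumption: real conversions may round instead. It does not use Tensor Cores; it only illustrates the formats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a CUDA API): keep the sign bit, the 8 exponent
// bits, and the top `mantissa_bits` of the 23 FP32 mantissa bits, zeroing
// the rest. This truncates; hardware conversions may round instead.
static float truncate_mantissa(float x, int mantissa_bits) {
    assert(mantissa_bits >= 0 && mantissa_bits <= 23);
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // reinterpret the float as raw bits
    const std::uint32_t mask = ~((1u << (23 - mantissa_bits)) - 1u);
    bits &= mask;                          // clear the low mantissa bits
    float out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}

// TF32 keeps 10 mantissa bits; Bfloat16 keeps 7. Both keep the FP32 exponent.
static float to_tf32_like(float x) { return truncate_mantissa(x, 10); }
static float to_bf16_like(float x) { return truncate_mantissa(x, 7); }
```

For example, the value 1 + 2^-10 survives `to_tf32_like` unchanged but collapses to 1.0 under `to_bf16_like`, because the bit that encodes 2^-10 lies below Bfloat16's 7 retained mantissa bits. Both formats cover the same dynamic range as FP32, which is why they are attractive for mixed-precision training.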