ROCm 6.0.0 released

Published one year ago by Philipp Esselbach

AMD has released a new major release of the ROCm GPU computation platform, which includes new performance optimizations, expanded framework and library support, and an improved developer experience.

ROCm 6.0.0 Release

Release notes for AMD ROCm 6.0

ROCm 6.0 is a major release with new performance optimizations, expanded framework and library
support, and improved developer experience. This includes initial enablement of the AMD Instinct
MI300 series. Future releases will further enable and optimize this new platform. Key features include:

Improved performance in areas like lower precision math and attention layers.

New hipSPARSELt library accelerates AI workloads via AMD's sparse matrix core technique.

Upstream support is now available for popular AI frameworks like TensorFlow, JAX, and PyTorch.

New support for libraries, such as DeepSpeed, ONNX-RT, and CuPy.

Prepackaged HPC and AI containers on AMD Infinity Hub, with improved documentation and
tutorials on the AMD ROCm Docs site.

Consolidated developer resources and training on the new
AMD ROCm Developer Hub.

The following sections provide a release overview for ROCm 6.0. For additional details, refer to
the Changelog. Known issues are listed on GitHub.

OS and GPU support changes

ROCm 6.0 enables the use of MI300A and MI300X Accelerators with limited operating system
support. Future releases will add more operating systems to match our general offering.

Operating Systems   MI300A      MI300X
Ubuntu 22.04.5      Supported   Supported
RHEL 8.9            Supported
SLES 15 SP5         Supported

For older generations of supported Instinct products we've added the following operating systems:

RHEL 9.3

RHEL 8.9

Note: For ROCm 6.2 and beyond, we've planned for end-of-support (EoS) for the following operating
systems:

Ubuntu 20.04.5

SLES 15 SP4

RHEL/CentOS 7.9

New ROCm meta package

We've added a new ROCm meta package for easy installation of all ROCm core packages, tools, and
libraries. For example, the following command will install the full ROCm package: apt-get install rocm
(Ubuntu), or yum install rocm (RHEL).

Filesystem Hierarchy Standard

ROCm 6.0 fully adopts the Filesystem Hierarchy Standard (FHS) reorganization goals. We've removed
the backward compatibility support for old file locations.

Compiler location change

The installation path of LLVM has been changed from /opt/rocm-<rel>/llvm to
/opt/rocm-<rel>/lib/llvm. For backward compatibility, a symbolic link is provided to the old
location and will be removed in a future release.

The installation path of the device library bitcode has changed from /opt/rocm-<rel>/amdgcn to
/opt/rocm-<rel>/lib/llvm/lib/clang/<ver>/lib/amdgcn. For backward compatibility, a symbolic link
is provided and will be removed in a future release.
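
A build script that must work across ROCm releases can probe the new location first and fall back to the old one. The following is a minimal sketch, not part of ROCm: the helper name find_llvm_dir and the temporary demo directories are hypothetical, and only the two relative paths (lib/llvm and llvm) come from the notes above.

```python
import os
import tempfile


def find_llvm_dir(rocm_root):
    """Return the LLVM directory under rocm_root, or None if absent.

    Hypothetical helper: checks the ROCm 6.0 location (lib/llvm) first,
    then the pre-6.0 location (llvm), which 6.0 still provides as a symlink.
    """
    for relative in (os.path.join("lib", "llvm"), "llvm"):
        candidate = os.path.join(rocm_root, relative)
        if os.path.isdir(candidate):
            # realpath collapses the backward-compatibility symlink, if any.
            return os.path.realpath(candidate)
    return None


# Demo against throwaway directory trees (assumed layouts, not a real install).
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "lib", "llvm"))
new_layout_result = find_llvm_dir(demo_root)

# A tree with only the old layout still resolves.
old_root = tempfile.mkdtemp()
os.makedirs(os.path.join(old_root, "llvm"))
old_layout_result = find_llvm_dir(old_root)

# An empty tree resolves to nothing.
missing_result = find_llvm_dir(tempfile.mkdtemp())
```

Checking the new path first means the script keeps working once the symbolic link is removed in a later release.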

Documentation

CMake support has been added for documentation in the
ROCm repository.

AMD Instinct MI50 end-of-support notice

AMD Instinct MI50, Radeon Pro VII, and Radeon VII products (collectively gfx906 GPUs) enter
maintenance mode in ROCm 6.0.

As outlined in the 5.6.0 release notes, ROCm 5.7 was the
final release for gfx906 GPUs in a fully supported state.

Henceforth, no new features or performance optimizations will be added for the gfx906 GPUs.

Bug fixes and critical security patches will continue to be supported for the gfx906 GPUs until Q2
2024 (end of maintenance [EOM] will be aligned with the closest ROCm release).

Bug fixes will be made up to the next ROCm point release.

Bug fixes will not be backported to older ROCm releases for gfx906.

Distribution and operating system updates will continue per the ROCm release cadence for gfx906
GPUs until EOM.

ROCm projects

The following sections contain project-specific release notes for ROCm 6.0. For additional details,
refer to the Changelog.

AMD SMI

Integrated the E-SMI (EPYC-SMI) library.
You can now query CPU-related information directly through AMD SMI. Metrics include power,
energy, performance, and other system details.

Added support for gfx942 metrics.
You can now query MI300 device metrics to get real-time information. Metrics include power,
temperature, energy, and performance.

HIP

New features to improve resource interoperability.

For external resource interoperability, we've added new structs and enums.

We've added new members to HIP struct hipDeviceProp_t for surfaces, textures, and device
identifiers.

Changes impacting backward compatibility.
There are several changes impacting backward compatibility: we changed some struct members and
some enum values, and removed some deprecated flags. For additional information, please refer to
the Changelog.

hipCUB

Additional CUB API support.
The hipCUB backend is updated to CUB and Thrust 2.1.

HIPIFY

Enhanced CUDA2HIP document generation.
API versions are now listed in the CUDA2HIP documentation. To see if the application binary
interface (ABI) has changed, refer to the C column in our API documentation.

Hipified rocSPARSE.
We've implemented support for the direct hipification of additional cuSPARSE APIs into rocSPARSE
APIs under the --roc option. This covers a major milestone in the roadmap towards complete
cuSPARSE-to-rocSPARSE hipification.

hipRAND

Official release.
hipRAND is now a standalone project and is no longer available as a submodule for rocRAND.

hipTensor

Added architecture support.
We've added contraction support for gfx942 architectures, and f32 and f64 data
types.

Upgraded testing infrastructure.
hipTensor now supports dynamic parameter configuration through an input YAML config.

MIGraphX

Added TorchMIGraphX.
We introduced a Dynamo backend for Torch, which allows PyTorch to use MIGraphX directly
without first requiring a model to be converted to the ONNX model format. With a single line of
code, PyTorch users can utilize the performance and quantization benefits provided by MIGraphX.

Boosted overall performance with rocMLIR.
We've integrated the rocMLIR library for ROCm-supported RDNA and CDNA GPUs. This
technology provides MLIR-based convolution and GEMM kernel generation.

Added INT8 support across the MIGraphX portfolio.
We now support the INT8 data type. MIGraphX can perform the quantization or ingest
prequantized models. INT8 support extends to the MIGraphX execution provider for ONNX Runtime.
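
For readers unfamiliar with INT8 quantization, the following is a minimal sketch of one common scheme, symmetric per-tensor quantization. It is a generic illustration only: the release notes do not say which scheme MIGraphX applies, the function names are hypothetical, and nothing here calls the MIGraphX API.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the range [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Map int8 codes back to approximate float values."""
    return [q * scale for q in quantized]


# Made-up example weights, chosen only for illustration.
weights = [0.5, -1.27, 0.0, 1.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize_int8(q_weights, scale)

# The round-trip error of each value is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

A prequantized model already carries the integer codes and scales, so only the dequantization side of this arithmetic applies when it is ingested.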

ROCgdb

Added support for additional GPU architectures.
Navi 3 series: gfx1100, gfx1101, and gfx1102.

MI300 series: gfx942.

rocm-smi-lib

Improved accessibility to GPU partition nodes.
You can now view, set, and reset the compute and memory partitions. You'll also get notifications of
a GPU busy state, which helps you avoid partition set or reset failure.

Upgraded GPU metrics to version 1.4.
The upgraded GPU metrics binary has an improved metric version format with a content version
appended to it. You can read each metric within the binary without the full rsmi_gpu_metric_t data
structure.

Updated GPU index sorting.
We made GPU index sorting consistent with other ROCm software tools by optimizing it to use
Bus:Device.Function (BDF) instead of the card number.
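
To illustrate why the two orderings can differ, here is a minimal sketch that sorts a made-up GPU inventory by card number and by BDF. It is not the rocm-smi-lib implementation: the helper parse_bdf, the inventory, and the leading PCI domain field in the address strings are assumptions for the example.

```python
def parse_bdf(address):
    """Parse 'DDDD:BB:DD.F' (hexadecimal fields) into a sortable integer tuple."""
    domain, bus, device_function = address.split(":")
    device, function = device_function.split(".")
    return (int(domain, 16), int(bus, 16), int(device, 16), int(function, 16))


# Hypothetical inventory: card numbers follow driver probe order,
# which need not match the PCI topology.
gpus = [
    {"card": 0, "bdf": "0000:c1:00.0"},
    {"card": 1, "bdf": "0000:03:00.0"},
    {"card": 2, "bdf": "0000:83:00.0"},
]

# Ordering by card number versus ordering by PCI address.
by_card = [gpu["bdf"] for gpu in sorted(gpus, key=lambda gpu: gpu["card"])]
by_bdf = [gpu["bdf"] for gpu in sorted(gpus, key=lambda gpu: parse_bdf(gpu["bdf"]))]
```

Parsing the fields as hexadecimal integers avoids the pitfalls of comparing the address strings lexically, and the resulting order depends only on where each device sits on the bus.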

ROCm Compiler

Added kernel argument optimization on gfx942.
With the new feature, you can preload kernel arguments into Scalar General-Purpose Registers
(SGPRs) rather than pass them in memory. This feature is enabled with a compiler option, which also
controls the number of arguments to pass in SGPRs. For more information, see:
https://llvm.org/docs/AMDGPUUsage.html#preloaded-kernel-arguments

Improved register allocation at -O0.
We've improved the register allocator used at -O0 to avoid compiler crashes with the signature
'ran out of registers during register allocation'.

Improved generation of debug information.
We've improved compile time when generating debug information for certain corner cases. We've
also eliminated compiler crashes that occurred when generating debug information.

ROCmValidationSuite

Added GPU and operating system support.
We added support for MI300X GPU in GPU Stress Test (GST).

Roc Profiler

Added option to specify desired Roc Profiler version.
You can now choose between rocProfV1 and rocProfV2: the legacy rocProf (rocprofv1) now provides
an option to use the latest version (rocprofv2).

Automated the ISA dumping process by Advance Thread Tracer.
Advance Thread Tracer (ATT) no longer depends on a user-supplied Instruction Set Architecture (ISA)
and compilation process (using hipcc --save-temps) to dump the ISA from running kernels.

Added ATT support for parallel kernels.
The automatic ISA dumping process also helps ATT successfully parse multiple kernels running in
parallel, and provide cycle-accurate occupancy information for multiple kernels at the same time.

ROCr

Support for SDMA link aggregation.
If multiple XGMI links are available when making SDMA copies between GPUs, the copy is
distributed over multiple links to increase peak bandwidth.
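
The bandwidth benefit can be pictured with a toy model. This is a sketch under strong assumptions (equal link bandwidth, perfect overlap, an even split), not a description of how ROCr schedules SDMA copies. The function names and the 50 GB/s per-link figure are placeholders chosen for the example.

```python
def split_copy(total_bytes, link_count):
    """Split a copy into near-equal contiguous (offset, length) chunks, one per link."""
    base, remainder = divmod(total_bytes, link_count)
    chunks = []
    offset = 0
    for link in range(link_count):
        # The first `remainder` links each carry one extra byte.
        length = base + (1 if link < remainder else 0)
        chunks.append((offset, length))
        offset += length
    return chunks


def ideal_copy_seconds(total_bytes, link_count, link_bytes_per_second):
    """Time of the slowest chunk, assuming all links transfer fully in parallel."""
    longest = max(length for _, length in split_copy(total_bytes, link_count))
    return longest / link_bytes_per_second


# Placeholder numbers: a 1 GB copy over links of 50 GB/s each.
chunks = split_copy(10, 3)
single_link = ideal_copy_seconds(1_000_000_000, 1, 50_000_000_000)
four_links = ideal_copy_seconds(1_000_000_000, 4, 50_000_000_000)
```

In this idealized model, four links finish the copy in a quarter of the single-link time. Real transfers will see less than that because of setup cost and contention.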

rocThrust

Added Thrust 2.1 API support.
rocThrust backend is updated to Thrust and CUB 2.1.

rocWMMA

Added new architecture support.
We added support for gfx942 architectures.

Added data type support.
We added support for f8, bf8, and xf32 data types on supporting architectures, and for bf16 in the
HIP RTC environment.

Added support for the PyTorch kernel plugin.
We added awareness of __HIP_NO_HALF_CONVERSIONS__ to support PyTorch users.

TransferBench (beta)

Improved ordering control.
You can now set the thread block size (BLOCK_SIZE) and the order (BLOCK_ORDER) in which
thread blocks from different transfers run when using a single stream.

Added comprehensive reports.
We modified individual transfers to report X Compute Clusters (XCC) ID when SHOW_ITERATIONS
is set to 1.

Improved accuracy in result validation.
You can now validate results for each iteration instead of just once for all iterations.

Release ROCm 6.0.0 Release · ROCm/ROCm
