
AMD has released ROCm 6.2.1, a new version of its compute platform with several new capabilities and improvements. The most significant version change is the rocAL version number, which increases from 1.3 to 2.0, so applications linked to version 1.3 must be recompiled. ROCm now supports Facebook General Matrix Multiplication (FBGEMM) and its associated FBGEMM_GPU library. Improvements in ROCm Offline Installer Creator 6.2.1 include logging support, stricter checks for Linux versions and distributions, updated prerequisite repositories, and fixes for CTest issues.



ROCm 6.2.1 Release

The release notes provide a summary of notable changes since the previous ROCm release.

The Compatibility matrix provides the full list of supported hardware, operating systems, ecosystems, third-party components, and ROCm components for each ROCm release.

Release notes for previous ROCm releases are available in earlier versions of the documentation.
See the ROCm documentation release history.

Release highlights

The following are notable new features and improvements in ROCm 6.2.1. For changes to individual components, see Detailed component changes.

rocAL major version change

The new version of rocAL introduces many new features, but does not modify any of the existing public API functions. However, the version number was incremented from 1.3 to 2.0.
Applications linked to version 1.3 must be recompiled to link against version 2.0.

See the rocAL detailed changes for more information.
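
The relink requirement follows from the major version bump rather than from any API change. A minimal sketch of that rule, assuming the library's major version is what a linked application binds to (the usual shared-library convention, not something these notes spell out); the helper names `major_version` and `needs_relink` are hypothetical:

```python
def major_version(version: str) -> int:
    """Return the leading (major) component of a dotted version string."""
    return int(version.split(".")[0])


def needs_relink(linked_against: str, installed: str) -> bool:
    """True when the major version an application was linked against
    differs from the major version of the installed library."""
    return major_version(linked_against) != major_version(installed)
```

Under this assumption, `needs_relink("1.3", "2.0")` is true, while a move between two 2.x versions would not require relinking.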

New support for FBGEMM (Facebook General Matrix Multiplication)

As of ROCm 6.2.1, ROCm supports Facebook General Matrix Multiplication (FBGEMM) and the related FBGEMM_GPU library.

FBGEMM is a low-precision, high-performance CPU kernel library for convolution and matrix multiplication. It is used for server-side inference and as a back end for PyTorch quantized operators. FBGEMM_GPU includes a collection of PyTorch GPU operator libraries for training and inference. For more information, see the ROCm Model acceleration libraries guide and PyTorch's FBGEMM GitHub repository.

ROCm Offline Installer Creator changes

The ROCm Offline Installer Creator 6.2.1 introduces several new features and improvements including:

  • Logging support for create and install logs
  • More stringent checks for Linux versions and distributions
  • Updated prerequisite repositories
  • Fixed CTest issues

ROCm documentation changes

There have been no changes to supported hardware or operating systems from ROCm 6.2.0 to ROCm 6.2.1.

The ROCm documentation, like all ROCm projects, is open source and available on GitHub. To contribute to ROCm documentation, see the [ROCm documentation contribution guidelines](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).

Operating system and hardware support changes

There are no changes to supported hardware or operating systems from ROCm 6.2.0 to ROCm 6.2.1.

See the Compatibility matrix for the full list of supported operating systems and hardware architectures.

ROCm components

The following table lists the versions of ROCm components for ROCm 6.2.1, including any version changes from 6.2.0 to 6.2.1.

Click the component's updated version to go to a detailed list of its changes. Click the GitHub icon to go to the component's source code on GitHub.

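| Category | Group | Name | Version |
|---|---|---|---|
| Libraries | Machine learning and computer vision | Composable Kernel | 1.1.0 |
| | | MIGraphX | 2.10 |
| | | MIOpen | 3.2.0 |
| | | MIVisionX | 3.0.0 |
| | | rocAL | 1.0.0 ⇒ 2.0.0 |
| | | rocDecode | 0.6.0 |
| | | rocPyDecode | 0.1.0 |
| | | RPP | 1.8.0 |
| | Communication | RCCL | 2.20.5 ⇒ 2.20.5 |
| | Math | hipBLAS | 2.2.0 |
| | | hipBLASLt | 0.8.0 |
| | | hipFFT | 1.0.15 |
| | | hipfort | 0.4.0 |
| | | hipRAND | 2.11.0 |
| | | hipSOLVER | 2.2.0 |
| | | hipSPARSE | 3.1.1 |
| | | hipSPARSELt | 0.2.1 |
| | | rocALUTION | 3.2.0 |
| | | rocBLAS | 4.1.2 ⇒ 4.2.1 |
| | | rocFFT | 1.0.28 ⇒ 1.0.29 |
| | | rocRAND | 3.1.0 |
| | | rocSOLVER | 3.26.0 |
| | | rocSPARSE | 3.2.0 |
| | | rocWMMA | 1.5.0 |
| | | Tensile | 4.41.0 |
| | Primitives | hipCUB | 3.2.0 |
| | | hipTensor | 1.3.0 |
| | | rocPRIM | 3.2.0 ⇒ 3.2.1 |
| | | rocThrust | 3.1.0 |
| Tools | System management | AMD SMI | 24.6.2 ⇒ 24.6.3 |
| | | rocminfo | 1.0.0 |
| | | ROCm Data Center Tool | 1.0.0 |
| | | ROCm SMI | 7.3.0 ⇒ 7.3.0 |
| | | ROCm Validation Suite | 1.0.0 |
| | Performance | Omniperf | 2.0.1 |
| | | Omnitrace | 1.11.2 ⇒ 1.11.2 |
| | | ROCm Bandwidth Test | 1.4.0 |
| | | ROCProfiler | 2.0.0 |
| | | ROCprofiler-SDK | 0.4.0 |
| | | ROCTracer | 4.1.0 |
| | Development | HIPIFY | 18.0.0 ⇒ 18.0.0 |
| | | ROCdbgapi | 0.76.0 |
| | | ROCm CMake | 0.13.0 |
| | | ROCm Debugger (ROCgdb) | 14.2 |
| | | ROCr Debug Agent | 2.0.3 |
| Compilers | | HIPCC | 1.1.1 |
| | | llvm-project | 18.0.0 |
| Runtimes | | HIP | 6.2 ⇒ 6.2.1 |
| | | ROCr Runtime | 1.14.0 |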

Detailed component changes

The following sections describe key changes to ROCm components.

AMD SMI (24.6.3)

Changes

  • Added amd-smi static --ras on Guest VMs. Guest VMs can view enabled/disabled RAS features on Host cards.

Removals

  • Removed amd-smi metric --ecc & amd-smi metric --ecc-blocks on Guest VMs. Guest VMs do not support getting current ECC counts from the Host cards.

Resolved issues

  • Fixed TypeError in amd-smi process -G.
  • Updated CLI error strings to handle empty and invalid GPU/CPU inputs.
  • Fixed Guest VM showing passthrough options.
  • Fixed firmware formatting where leading 0s were missing.

HIP (6.2.1)

Resolved issues

  • Soft hang when using AMD_SERIALIZE_KERNEL
  • Memory leak in hipIpcCloseMemHandle

HIPIFY (18.0.0)

Changes

  • Added CUDA 12.5.1 support
  • Added cuDNN 9.2.1 support
  • Added LLVM 18.1.8 support
  • Added hipBLAS 64-bit APIs support
  • Added support for math constants (math_constants.h)

Omnitrace (1.11.2)

Known issues

Perfetto can no longer open Omnitrace proto files. Loading Perfetto trace output .proto files in the latest version of ui.perfetto.dev can result in a dialog with the message, "Oops, something went wrong! Please file a bug." The information in the dialog will refer to an "Unknown field type." The workaround is to open the files with the previous version of the Perfetto UI found at https://ui.perfetto.dev/v46.0-35b3d9845/#!/.

See issue #3767 on GitHub.

RCCL (2.20.5)

Known issues

On systems running Linux kernel 6.8.0, such as Ubuntu 24.04, Direct Memory Access (DMA) transfers between the GPU and NIC are disabled, which impacts multi-node RCCL performance.
This issue was reproduced with RCCL 2.20.5 (ROCm 6.2.0 and 6.2.1) on systems with Broadcom Thor-2 NICs and affects other systems with RoCE networks using Linux 6.8.0 or newer.
Older RCCL versions are also impacted.

This issue will be addressed in a future ROCm release.

See issue #3772 on GitHub.
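
To check whether a node runs a kernel in the affected range, the kernel release string can be compared against 6.8.0. The following is a minimal sketch, not an AMD-provided tool: the function names and the `AFFECTED_FROM` constant are hypothetical, and it assumes the release string begins with dotted numeric fields (for example `6.8.0-45-generic`).

```python
import re

# Lower bound taken from the known issue above: Linux 6.8.0 or newer.
AFFECTED_FROM = (6, 8, 0)


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor, patch) from a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    # A missing patch field is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def kernel_may_be_affected(release: str) -> bool:
    """True when the kernel version is 6.8.0 or newer."""
    return parse_kernel_release(release) >= AFFECTED_FROM
```

On a Linux host, the string returned by `platform.release()` from the Python standard library can be passed to `kernel_may_be_affected`. A true result only means the kernel is in the reported range; whether performance is actually impacted also depends on the NIC and network setup described above.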

rocAL (2.0.0)

Changes

  • The new version of rocAL introduces many new features, but does not modify any of the existing public API functions. However, the version number was incremented from 1.3 to 2.0.
    Applications linked to version 1.3 must be recompiled to link against version 2.0.
  • Added development and test packages.
  • Added C++ rocAL audio unit test and Python script to run and compare the outputs.
  • Added Python support for audio decoders.
  • Added PyTorch iterator for audio.
  • Added Python audio unit test and support to verify outputs.
  • Added rocDecode for HW decode.
  • Added support for:
    • Audio loader and decoder, which uses libsndfile library to decode wav files
    • Audio augmentation - PreEmphasis filter, Spectrogram, ToDecibels, Resample, NonSilentRegionDetection, MelFilterBank
    • Generic augmentation - Slice, Normalize
    • Reading from file lists in file reader
    • Downmixing audio channels during decoding
    • TensorTensorAdd and TensorScalarMultiply operations
    • Uniform and Normal distribution nodes
  • Image to tensor updates
  • ROCm install - use case graphics removed

Known issues

  • Dependencies are not installed with the rocAL package installer. Dependencies must be installed with the prerequisite setup script provided. See the rocAL README on GitHub for details.

rocBLAS (4.2.1)

Removals

  • Removed Device_Memory_Allocation.pdf link in documentation.

Resolved issues

  • Fixed error/warning message during rocblas_set_stream() call.

rocFFT (1.0.29)

Optimizations

  • Implemented 1D kernels for factorizable sizes greater than 1024.

ROCm SMI (7.3.0)

Optimizations

  • Improved handling of UnicodeEncodeErrors with non UTF-8 locales. Non UTF-8 locales were causing crashes on UTF-8 special characters.
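
The failure mode behind this change is general to Python command-line tools: writing a character that the output stream's encoding cannot represent raises UnicodeEncodeError. The sketch below shows one common way to avoid the crash by substituting unencodable characters. It is an illustration only, not the actual ROCm SMI change, and the helper names `to_printable` and `safe_print` are hypothetical.

```python
import sys


def to_printable(text: str, encoding: str) -> str:
    """Round-trip text through the target encoding, replacing any
    character the encoding cannot represent with '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)


def safe_print(text: str, stream=None) -> None:
    """Write text to a stream without raising UnicodeEncodeError."""
    stream = stream or sys.stdout
    # Fall back to ASCII when the stream does not report an encoding.
    encoding = getattr(stream, "encoding", None) or "ascii"
    stream.write(to_printable(text, encoding) + "\n")
```

With an ASCII-only stream, a value such as a temperature string containing a degree sign is printed with `?` in place of the special character instead of aborting the program.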

Resolved issues

  • Fixed an issue where the Compute Partition tests segfaulted when AMDGPU was loaded with optional parameters.

Known issues

  • When setting CPX as a partition mode, there is a DRM node limit of 64. This is a known limitation when multiple drivers are using the DRM nodes. The ls /sys/class/drm command can be used to see the number of DRM nodes, and the following steps can be used to remove unnecessary drivers:

    1. Unload AMDGPU: sudo rmmod amdgpu.
    2. Remove any unnecessary drivers using rmmod. For example, to remove an AST driver, run sudo rmmod ast.
    3. Reload AMDGPU using modprobe: sudo modprobe amdgpu.
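
Before unloading drivers, it can help to count how many DRM nodes are already in use. The sketch below does what `ls /sys/class/drm` shows by eye. It assumes that only entries named `cardN` and `renderDN` count as nodes (connector entries such as `card0-DP-1` are skipped) and that the limit of 64 applies to each node type separately; both are assumptions, and the function names are hypothetical.

```python
import os
import re

DRM_NODE_LIMIT = 64  # limit stated in the known issue above


def count_drm_nodes(sysfs_dir: str = "/sys/class/drm") -> dict:
    """Count primary (cardN) and render (renderDN) entries in sysfs_dir."""
    counts = {"card": 0, "render": 0}
    for name in os.listdir(sysfs_dir):
        if re.fullmatch(r"card\d+", name):
            counts["card"] += 1
        elif re.fullmatch(r"renderD\d+", name):
            counts["render"] += 1
        # Anything else (connectors, "version", ...) is ignored.
    return counts


def limit_reached(counts: dict) -> bool:
    """True when any node type has used up all DRM_NODE_LIMIT slots."""
    return any(n >= DRM_NODE_LIMIT for n in counts.values())
```

Calling `count_drm_nodes()` with no argument reads the live sysfs directory; passing another path makes the function easy to try against a copy or a test fixture.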

rocPRIM (3.2.1)

Optimizations

  • Improved performance of block_reduce_warp_reduce when warp size equals block size.

ROCm known issues

ROCm known issues are tracked on GitHub. Known issues related to individual components are listed in the Detailed component changes section.

Instinct MI300X GPU recovery failure on uncorrectable errors

For the AMD Instinct MI300X accelerator, GPU recovery resets triggered by uncorrectable errors (UE) might not complete
successfully, which can result in the system being left in an undefined state. A system reboot is needed to recover from
this state. Additionally, error logging might fail in these situations, hindering diagnostics.

This issue is under investigation and will be resolved in a future ROCm release.

See issue #3766 on GitHub.

ROCm upcoming changes

The following changes to the ROCm software stack are anticipated for future releases.

rocm-llvm-alt

The rocm-llvm-alt package will be removed in an upcoming release. Users relying on the functionality provided by the closed-source compiler should transition to the open-source compiler. Once the rocm-llvm-alt package is removed, any compilation requesting functionality provided by the closed-source compiler will result in a Clang warning: "[AMD] proprietary optimization compiler has been removed".

rccl-rdma-sharp-plugins

The RCCL plugin package, rccl-rdma-sharp-plugins, will be removed in an upcoming ROCm release.
