ROCm 5.7.1 Release
ROCm 5.7.1 is a point release with the following changes:
rocBLAS
The ROCm 5.7.1 release adds a new tool, rocblas-gemm-tune, and a new environment variable, ROCBLAS_TENSILE_GEMM_OVERRIDE_PATH, to rocBLAS.
rocblas-gemm-tune finds the best-performing GEMM kernel for each GEMM problem in a problem set. Its command-line interface mimics the --yaml input used by rocblas-bench. To generate the expected --yaml input, enable profile logging by setting the environment variable ROCBLAS_LAYER=4.
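As a minimal sketch of the logging step, the Python helper below builds an environment with profile logging enabled for a child process. The helper name `profile_logging_env` and the file name in the usage comment are hypothetical, and ROCBLAS_LOG_PROFILE_PATH is assumed here to redirect the profile log to a file; confirm it against the rocBLAS logging documentation before relying on it.

```python
import os


def profile_logging_env(log_path=None, base_env=None):
    """Return a copy of the environment with rocBLAS profile logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # ROCBLAS_LAYER=4 requests profile logging, as described above.
    env["ROCBLAS_LAYER"] = "4"
    if log_path is not None:
        # Assumption: this variable sends the profile log to a file
        # instead of the default output stream.
        env["ROCBLAS_LOG_PROFILE_PATH"] = str(log_path)
    return env


# Usage (hypothetical application name):
#   subprocess.run(["./my_gemm_app"], env=profile_logging_env("gemm_profile.yaml"))
```

Passing the environment to a child process, rather than mutating `os.environ`, keeps logging limited to the run being profiled.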
For more information on rocBLAS logging, see Logging in rocBLAS.
rocblas-gemm-tune reads an input file of GEMM problems and reports a result for each problem (note that the selected GEMM idx may differ). In the output, the far-right values (solution_index) are the indices of the best-performing kernels for those GEMMs in the rocBLAS kernel library. These indices can be used directly in future GEMM calls. See rocBLAS/samples/example_user_driven_tuning.cpp for sample code that uses kernels directly via their indices.
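To illustrate reading the far-right solution_index values, here is a small Python sketch. It assumes the tool's output is comma-separated text with one GEMM per line; the function name `read_solution_indices`, the column names, and every value in `sample` are hypothetical stand-ins, not real rocblas-gemm-tune output.

```python
import csv
import io


def read_solution_indices(text):
    """Collect the far-right field of each row that ends in an integer."""
    indices = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        last = row[-1].strip()
        # Header rows end in a column name, not a number, so they are skipped.
        if last.lstrip("-").isdigit():
            indices.append(int(last))
    return indices


# Hypothetical sample; the real column set and indices will differ.
sample = """transA,transB,M,N,K,solution_index
N,N,320,588,4096,101
N,T,512,512,512,202
"""
print(read_solution_indices(sample))  # prints [101, 202]
```

Skipping rows by their last field, instead of assuming a fixed header position, keeps the sketch tolerant of extra lines in the output.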
If the output is stored in a file, the results can be used to override the default kernel selection with the kernels found by setting the environment variable ROCBLAS_TENSILE_GEMM_OVERRIDE_PATH=<path>, where <path> points to the stored file.
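A minimal Python sketch of applying stored results to a later run follows. The helper name `gemm_override_env` and the file and application names in the usage comment are hypothetical. Resolving the file to an absolute path is a choice made here so that the setting does not depend on the child process's working directory; it is not a documented requirement.

```python
import os
from pathlib import Path


def gemm_override_env(results_file, base_env=None):
    """Return an environment that points rocBLAS at stored tuning results."""
    path = Path(results_file)
    if not path.is_file():
        # Fail early instead of launching with a path that does not exist.
        raise FileNotFoundError(f"tuning results not found: {path}")
    env = dict(os.environ if base_env is None else base_env)
    env["ROCBLAS_TENSILE_GEMM_OVERRIDE_PATH"] = str(path.resolve())
    return env


# Usage (hypothetical names):
#   subprocess.run(["./my_gemm_app"], env=gemm_override_env("tuned_gemms.csv"))
```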
For more details, refer to the rocBLAS Programmer's Guide.
HIP 5.7.1 (for ROCm 5.7.1)
ROCm 5.7.1 is a point release with several bug fixes in the HIP runtime.
Fixed
- The hipPointerGetAttributes API returns the correct HIP memory type as hipMemoryTypeManaged for managed memory.
hipSOLVER 1.8.2
hipSOLVER 1.8.2 for ROCm 5.7.1
Fixed
- Fixed conflicts between the hipsolver-dev and -asan packages by excluding hipsolver_module.f90 from the latter.
AMD has updated the ROCm Radeon Open Compute Linux stack, a global platform for GPU-accelerated computing on Radeon graphics cards.