
Fix 16-bit floats in Tensile.

Lubosz Sarnecki requested to merge lubosz/rocblas:fix-f16 into main

Backport the rocBLAS patch that fixes the build flags for Tensile, so that half-precision floating-point operations no longer produce garbage data.

The backported patch removes the -mf16c build flag from several targets, including Tensile. This is apparently required for rocm-llvm versions newer than rocm-6.1.0.
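As a rough illustration of what dropping a single compiler flag amounts to, here is a minimal Python sketch. It is not the backported patch itself: the `strip_flag` helper and the sample flag string are hypothetical, and only the `-mf16c` flag name comes from the description above.

```python
import shlex


def strip_flag(flags: str, flag: str = "-mf16c") -> str:
    """Return `flags` with every exact occurrence of `flag` removed.

    Hypothetical helper for illustration; the real patch changes the
    rocBLAS build files rather than filtering a flag string at runtime.
    """
    # Split like a shell would, so quoted arguments stay intact.
    tokens = shlex.split(flags)
    # Compare whole tokens only, so a flag that merely starts with
    # "-mf16c" is left untouched.
    kept = [tok for tok in tokens if tok != flag]
    return shlex.join(kept)


if __name__ == "__main__":
    # Sample flag string, made up for this example.
    print(strip_flag("-O3 -mf16c -fPIC"))
```

Matching whole tokens rather than substrings keeps unrelated options intact, which matters when a flag list is assembled from several sources.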

This fixes several Llama implementations, such as ollama and llama.cpp.

See #2 (closed)

To get quicker results, I tested this PKGBUILD with only gfx1030 enabled in the build targets, and can confirm that it resolves the issue in both llama.cpp and ollama for me. The ollama package does not need to be rebuilt. Building the rocblas source at 6.2.2 with the backported patch also produced a working rocblas-bench run, which was my test during bisection.

Edited by Lubosz Sarnecki
