Cannot find CO in the bundle for ISA: amdgcn-amd-amdhsa--gfx1030

Error in pytorch-rocm with an RX 6800:

HIP_LAUNCH_BLOCKING=1 AMD_LOG_LEVEL=1 python
Python 3.11.8 (main, Feb 12 2024, 14:50:05) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.get_device_name(0)
:1:hip_fatbin.cpp           :256 : 0523359019 us: [pid:4008  tid:0x7810d95f5740] Cannot find CO in the bundle for ISA: amdgcn-amd-amdhsa--gfx1030 

:1:hip_fatbin.cpp           :109 : 0523359039 us: [pid:4008  tid:0x7810d95f5740] Missing CO for these ISAs - 
:1:hip_fatbin.cpp           :112 : 0523359042 us: [pid:4008  tid:0x7810d95f5740]      amdgcn-amd-amdhsa--gfx1030
:1:hip_fatbin.cpp           :302 : 0523359048 us: [pid:4008  tid:0x7810d95f5740] Releasing COMGR data failed with status 2 
'AMD Radeon RX 6800'
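
To make it easier to compare runs, the missing ISA can be pulled out of the log text automatically. The following is only a sketch: the helper names `missing_isas` and `gfx_target` are made up for illustration, and the regular expressions assume the message wording shown in the log above, which may differ in other ROCm versions.

```python
import re

# Assumed message format, copied from the hip_fatbin.cpp lines above.
MISSING_CO = re.compile(r"Cannot find CO in the bundle for ISA:\s*(\S+)")
# Assumed shape of the target name inside the ISA string (e.g. gfx1030, gfx90a).
GFX = re.compile(r"gfx[0-9a-f]+")


def missing_isas(log_text):
    """Return the distinct ISA names reported as missing, in first-seen order."""
    seen = []
    for match in MISSING_CO.finditer(log_text):
        isa = match.group(1)
        if isa not in seen:
            seen.append(isa)
    return seen


def gfx_target(isa):
    """Return the gfx target contained in a full ISA name, or None if absent."""
    match = GFX.search(isa)
    return match.group(0) if match else None


if __name__ == "__main__":
    sample = (
        ":1:hip_fatbin.cpp           :256 : 0523359019 us: "
        "[pid:4008  tid:0x7810d95f5740] Cannot find CO in the bundle "
        "for ISA: amdgcn-amd-amdhsa--gfx1030 \n"
    )
    for isa in missing_isas(sample):
        print(isa, "->", gfx_target(isa))
```

Saving the interpreter's stderr to a file and passing its contents to `missing_isas` should list every target the loaded binaries lack code objects for.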

Increasing the AMD log level:

> AMD_LOG_LEVEL=3 python
Python 3.11.8 (main, Feb 12 2024, 14:50:05) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.get_device_name(0)
:3:rocdevice.cpp            :445 : 2050633342 us: [pid:5386  tid:0x7634f000f740] Initializing HSA stack.
:3:rocdevice.cpp            :211 : 2050645526 us: [pid:5386  tid:0x7634f000f740] Numa selects cpu agent[0]=0x5e3210c21770(fine=0x5e3213aa5c10,coarse=0x5e3213aa54f0) for gpu agent=0x5e32127b0d80 CPU<->GPU XGMI=0
:3:rocdevice.cpp            :1715: 2050646068 us: [pid:5386  tid:0x7634f000f740] Gfx Major/Minor/Stepping: 10/3/0
:3:rocdevice.cpp            :1717: 2050646074 us: [pid:5386  tid:0x7634f000f740] HMM support: 1, XNACK: 0, Direct host access: 0
:3:rocdevice.cpp            :1719: 2050646077 us: [pid:5386  tid:0x7634f000f740] Max SDMA Read Mask: 0xf, Max SDMA Write Mask: 0xf
:3:hip_context.cpp          :48  : 2050646724 us: [pid:5386  tid:0x7634f000f740] Direct Dispatch: 1
:1:hip_fatbin.cpp           :256 : 2050782314 us: [pid:5386  tid:0x7634f000f740] Cannot find CO in the bundle for ISA: amdgcn-amd-amdhsa--gfx1030 

:1:hip_fatbin.cpp           :109 : 2050782322 us: [pid:5386  tid:0x7634f000f740] Missing CO for these ISAs - 
:1:hip_fatbin.cpp           :112 : 2050782324 us: [pid:5386  tid:0x7634f000f740]      amdgcn-amd-amdhsa--gfx1030
:1:hip_fatbin.cpp           :302 : 2050782326 us: [pid:5386  tid:0x7634f000f740] Releasing COMGR data failed with status 2 
:3:hip_platform.cpp         :674 : 2050782329 us: [pid:5386  tid:0x7634f000f740] init: Returned hipErrorInvalidValue : 
:3:hip_device_runtime.cpp   :637 : 2050782338 us: [pid:5386  tid:0x7634f000f740]  hipGetDeviceCount ( 0x7ffc0ad85510 ) 
:3:hip_device_runtime.cpp   :639 : 2050782341 us: [pid:5386  tid:0x7634f000f740] hipGetDeviceCount: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :637 : 2050782354 us: [pid:5386  tid:0x7634f000f740]  hipGetDeviceCount ( 0x76340f28e9f4 ) 
:3:hip_device_runtime.cpp   :639 : 2050782357 us: [pid:5386  tid:0x7634f000f740] hipGetDeviceCount: Returned hipSuccess : 
:3:hip_device.cpp           :463 : 2050782361 us: [pid:5386  tid:0x7634f000f740]  hipGetDevicePropertiesR0600 ( 0x7ffc0ad85038, 0 ) 
:3:hip_device.cpp           :465 : 2050782365 us: [pid:5386  tid:0x7634f000f740] hipGetDevicePropertiesR0600: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :637 : 2050782383 us: [pid:5386  tid:0x7634f000f740]  hipGetDeviceCount ( 0x7ffc0ad85548 ) 
:3:hip_device_runtime.cpp   :639 : 2050782386 us: [pid:5386  tid:0x7634f000f740] hipGetDeviceCount: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :622 : 2050782393 us: [pid:5386  tid:0x7634f000f740]  hipGetDevice ( 0x7ffc0ad85314 ) 
:3:hip_device_runtime.cpp   :630 : 2050782396 us: [pid:5386  tid:0x7634f000f740] hipGetDevice: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :637 : 2050782398 us: [pid:5386  tid:0x7634f000f740]  hipGetDeviceCount ( 0x7ffc0ad85074 ) 
:3:hip_device_runtime.cpp   :639 : 2050782401 us: [pid:5386  tid:0x7634f000f740] hipGetDeviceCount: Returned hipSuccess : 
:3:hip_context.cpp          :344 : 2050782652 us: [pid:5386  tid:0x7634f000f740]  hipDevicePrimaryCtxGetState ( 0, 0x7ffc0ad85128, 0x7ffc0ad8512c ) 
:3:hip_context.cpp          :358 : 2050782657 us: [pid:5386  tid:0x7634f000f740] hipDevicePrimaryCtxGetState: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :622 : 2050782660 us: [pid:5386  tid:0x7634f000f740]  hipGetDevice ( 0x7ffc0ad85334 ) 
:3:hip_device_runtime.cpp   :630 : 2050782662 us: [pid:5386  tid:0x7634f000f740] hipGetDevice: Returned hipSuccess : 
:3:hip_context.cpp          :344 : 2050782665 us: [pid:5386  tid:0x7634f000f740]  hipDevicePrimaryCtxGetState ( 0, 0x7ffc0ad85148, 0x7ffc0ad8514c ) 
:3:hip_context.cpp          :358 : 2050782667 us: [pid:5386  tid:0x7634f000f740] hipDevicePrimaryCtxGetState: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :622 : 2050782671 us: [pid:5386  tid:0x7634f000f740]  hipGetDevice ( 0x7ffc0ad852d4 ) 
:3:hip_device_runtime.cpp   :630 : 2050782674 us: [pid:5386  tid:0x7634f000f740] hipGetDevice: Returned hipSuccess : 
:3:hip_context.cpp          :344 : 2050782677 us: [pid:5386  tid:0x7634f000f740]  hipDevicePrimaryCtxGetState ( 0, 0x7ffc0ad850e8, 0x7ffc0ad850ec ) 
:3:hip_context.cpp          :358 : 2050782680 us: [pid:5386  tid:0x7634f000f740] hipDevicePrimaryCtxGetState: Returned hipSuccess : 
:3:hip_device.cpp           :463 : 2050782821 us: [pid:5386  tid:0x7634f000f740]  hipGetDevicePropertiesR0600 ( 0x7ffc0ad84d70, 0 ) 
:3:hip_device.cpp           :465 : 2050782826 us: [pid:5386  tid:0x7634f000f740] hipGetDevicePropertiesR0600: Returned hipSuccess : 
'AMD Radeon RX 6800'
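
The `Gfx Major/Minor/Stepping: 10/3/0` line in the log above appears to describe the same target that the error names (gfx1030). A small sketch of that mapping follows. It assumes the minor and stepping values are rendered as single lowercase hex digits, as in target names like gfx90a; the function name `gfx_name_from_log` is hypothetical.

```python
import re

# Assumed format of the rocdevice.cpp line shown in the AMD_LOG_LEVEL=3 output.
GFX_LINE = re.compile(r"Gfx Major/Minor/Stepping:\s*(\d+)/(\d+)/(\d+)")


def gfx_name_from_log(log_text):
    """Build a gfx target name from the first Major/Minor/Stepping log line.

    Returns None when no such line is present.
    """
    match = GFX_LINE.search(log_text)
    if match is None:
        return None
    major, minor, stepping = (int(group) for group in match.groups())
    # Assumption: minor and stepping are written as hex digits in the name.
    return f"gfx{major}{minor:x}{stepping:x}"


if __name__ == "__main__":
    sample = (
        ":3:rocdevice.cpp            :1715: 2050646068 us: "
        "[pid:5386  tid:0x7634f000f740] Gfx Major/Minor/Stepping: 10/3/0"
    )
    print(gfx_name_from_log(sample))  # gfx1030
```

If the name derived this way matches the ISA in the "Cannot find CO" message, the runtime has detected the card correctly and the missing piece is a code object for that target in one of the loaded binaries.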

This may be a ROCm issue, but it shows up clearly when calling torch (built with ROCm support) from Python. For a potentially related ROCm issue, see also:

blender#12 (moved)