Cannot find CO in the bundle for ISA: amdgcn-amd-amdhsa--gfx1030
Error in pytorch-rocm with an RX 6800:
HIP_LAUNCH_BLOCKING=1 AMD_LOG_LEVEL=1 python
Python 3.11.8 (main, Feb 12 2024, 14:50:05) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.get_device_name(0)
:1:hip_fatbin.cpp :256 : 0523359019 us: [pid:4008 tid:0x7810d95f5740] Cannot find CO in the bundle for ISA: amdgcn-amd-amdhsa--gfx1030
:1:hip_fatbin.cpp :109 : 0523359039 us: [pid:4008 tid:0x7810d95f5740] Missing CO for these ISAs -
:1:hip_fatbin.cpp :112 : 0523359042 us: [pid:4008 tid:0x7810d95f5740] amdgcn-amd-amdhsa--gfx1030
:1:hip_fatbin.cpp :302 : 0523359048 us: [pid:4008 tid:0x7810d95f5740] Releasing COMGR data failed with status 2
'AMD Radeon RX 6800'
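For triage, the missing ISA can be pulled out of a captured log instead of being read off by eye. The following is only a minimal sketch: the helper names `missing_isas` and `gfx_target` are hypothetical, and the regular expression assumes the message wording stays exactly as it appears in the output above.

```python
import re

# Matches the hip_fatbin.cpp message seen in the log above.
# Assumption: the wording of this message is stable across ROCm versions.
MISSING_CO = re.compile(r"Cannot find CO in the bundle for ISA: (\S+)")


def missing_isas(log_text):
    """Return the distinct ISA strings reported as missing, in first-seen order."""
    seen = []
    for match in MISSING_CO.finditer(log_text):
        isa = match.group(1)
        if isa not in seen:
            seen.append(isa)
    return seen


def gfx_target(isa):
    """Extract the trailing gfx target (e.g. 'gfx1030') from a full ISA string."""
    return isa.rsplit("-", 1)[-1]


# Two lines copied from the AMD_LOG_LEVEL=1 session above.
sample = (
    ":1:hip_fatbin.cpp :256 : 0523359019 us: [pid:4008 tid:0x7810d95f5740] "
    "Cannot find CO in the bundle for ISA: amdgcn-amd-amdhsa--gfx1030\n"
    ":1:hip_fatbin.cpp :109 : 0523359039 us: [pid:4008 tid:0x7810d95f5740] "
    "Missing CO for these ISAs -\n"
)

for isa in missing_isas(sample):
    print(isa, "->", gfx_target(isa))
```

Running the same helper over a full `AMD_LOG_LEVEL=3` capture should list the same ISA once, since duplicates are dropped.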
Increasing the AMD log level:
> AMD_LOG_LEVEL=3 python
Python 3.11.8 (main, Feb 12 2024, 14:50:05) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.get_device_name(0)
:3:rocdevice.cpp :445 : 2050633342 us: [pid:5386 tid:0x7634f000f740] Initializing HSA stack.
:3:rocdevice.cpp :211 : 2050645526 us: [pid:5386 tid:0x7634f000f740] Numa selects cpu agent[0]=0x5e3210c21770(fine=0x5e3213aa5c10,coarse=0x5e3213aa54f0) for gpu agent=0x5e32127b0d80 CPU<->GPU XGMI=0
:3:rocdevice.cpp :1715: 2050646068 us: [pid:5386 tid:0x7634f000f740] Gfx Major/Minor/Stepping: 10/3/0
:3:rocdevice.cpp :1717: 2050646074 us: [pid:5386 tid:0x7634f000f740] HMM support: 1, XNACK: 0, Direct host access: 0
:3:rocdevice.cpp :1719: 2050646077 us: [pid:5386 tid:0x7634f000f740] Max SDMA Read Mask: 0xf, Max SDMA Write Mask: 0xf
:3:hip_context.cpp :48 : 2050646724 us: [pid:5386 tid:0x7634f000f740] Direct Dispatch: 1
:1:hip_fatbin.cpp :256 : 2050782314 us: [pid:5386 tid:0x7634f000f740] Cannot find CO in the bundle for ISA: amdgcn-amd-amdhsa--gfx1030
:1:hip_fatbin.cpp :109 : 2050782322 us: [pid:5386 tid:0x7634f000f740] Missing CO for these ISAs -
:1:hip_fatbin.cpp :112 : 2050782324 us: [pid:5386 tid:0x7634f000f740] amdgcn-amd-amdhsa--gfx1030
:1:hip_fatbin.cpp :302 : 2050782326 us: [pid:5386 tid:0x7634f000f740] Releasing COMGR data failed with status 2
:3:hip_platform.cpp :674 : 2050782329 us: [pid:5386 tid:0x7634f000f740] init: Returned hipErrorInvalidValue :
:3:hip_device_runtime.cpp :637 : 2050782338 us: [pid:5386 tid:0x7634f000f740] hipGetDeviceCount ( 0x7ffc0ad85510 )
:3:hip_device_runtime.cpp :639 : 2050782341 us: [pid:5386 tid:0x7634f000f740] hipGetDeviceCount: Returned hipSuccess :
:3:hip_device_runtime.cpp :637 : 2050782354 us: [pid:5386 tid:0x7634f000f740] hipGetDeviceCount ( 0x76340f28e9f4 )
:3:hip_device_runtime.cpp :639 : 2050782357 us: [pid:5386 tid:0x7634f000f740] hipGetDeviceCount: Returned hipSuccess :
:3:hip_device.cpp :463 : 2050782361 us: [pid:5386 tid:0x7634f000f740] hipGetDevicePropertiesR0600 ( 0x7ffc0ad85038, 0 )
:3:hip_device.cpp :465 : 2050782365 us: [pid:5386 tid:0x7634f000f740] hipGetDevicePropertiesR0600: Returned hipSuccess :
:3:hip_device_runtime.cpp :637 : 2050782383 us: [pid:5386 tid:0x7634f000f740] hipGetDeviceCount ( 0x7ffc0ad85548 )
:3:hip_device_runtime.cpp :639 : 2050782386 us: [pid:5386 tid:0x7634f000f740] hipGetDeviceCount: Returned hipSuccess :
:3:hip_device_runtime.cpp :622 : 2050782393 us: [pid:5386 tid:0x7634f000f740] hipGetDevice ( 0x7ffc0ad85314 )
:3:hip_device_runtime.cpp :630 : 2050782396 us: [pid:5386 tid:0x7634f000f740] hipGetDevice: Returned hipSuccess :
:3:hip_device_runtime.cpp :637 : 2050782398 us: [pid:5386 tid:0x7634f000f740] hipGetDeviceCount ( 0x7ffc0ad85074 )
:3:hip_device_runtime.cpp :639 : 2050782401 us: [pid:5386 tid:0x7634f000f740] hipGetDeviceCount: Returned hipSuccess :
:3:hip_context.cpp :344 : 2050782652 us: [pid:5386 tid:0x7634f000f740] hipDevicePrimaryCtxGetState ( 0, 0x7ffc0ad85128, 0x7ffc0ad8512c )
:3:hip_context.cpp :358 : 2050782657 us: [pid:5386 tid:0x7634f000f740] hipDevicePrimaryCtxGetState: Returned hipSuccess :
:3:hip_device_runtime.cpp :622 : 2050782660 us: [pid:5386 tid:0x7634f000f740] hipGetDevice ( 0x7ffc0ad85334 )
:3:hip_device_runtime.cpp :630 : 2050782662 us: [pid:5386 tid:0x7634f000f740] hipGetDevice: Returned hipSuccess :
:3:hip_context.cpp :344 : 2050782665 us: [pid:5386 tid:0x7634f000f740] hipDevicePrimaryCtxGetState ( 0, 0x7ffc0ad85148, 0x7ffc0ad8514c )
:3:hip_context.cpp :358 : 2050782667 us: [pid:5386 tid:0x7634f000f740] hipDevicePrimaryCtxGetState: Returned hipSuccess :
:3:hip_device_runtime.cpp :622 : 2050782671 us: [pid:5386 tid:0x7634f000f740] hipGetDevice ( 0x7ffc0ad852d4 )
:3:hip_device_runtime.cpp :630 : 2050782674 us: [pid:5386 tid:0x7634f000f740] hipGetDevice: Returned hipSuccess :
:3:hip_context.cpp :344 : 2050782677 us: [pid:5386 tid:0x7634f000f740] hipDevicePrimaryCtxGetState ( 0, 0x7ffc0ad850e8, 0x7ffc0ad850ec )
:3:hip_context.cpp :358 : 2050782680 us: [pid:5386 tid:0x7634f000f740] hipDevicePrimaryCtxGetState: Returned hipSuccess :
:3:hip_device.cpp :463 : 2050782821 us: [pid:5386 tid:0x7634f000f740] hipGetDevicePropertiesR0600 ( 0x7ffc0ad84d70, 0 )
:3:hip_device.cpp :465 : 2050782826 us: [pid:5386 tid:0x7634f000f740] hipGetDevicePropertiesR0600: Returned hipSuccess :
'AMD Radeon RX 6800'
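One way to narrow this down is to compare the architecture of the GPU with the architectures the installed torch build reports it was compiled for. The sketch below is hedged on several points: `torch.cuda.get_arch_list()` is a real torch call, but reading `gcnArchName` from the device properties is an assumption about ROCm builds (hence the `getattr` fallback), and the helper names `arch_is_covered` and `report` are hypothetical. The missing code object could also belong to a different library loaded alongside torch, in which case this check would not detect it.

```python
def arch_is_covered(device_arch, compiled_archs):
    """Return True if the device's gfx target is among the compiled targets.

    Feature suffixes such as ':sramecc+:xnack-' are ignored for the comparison.
    """
    base = device_arch.split(":", 1)[0]
    return base in compiled_archs


def report():
    """Describe whether device 0's architecture appears in torch's arch list."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"

    if not torch.cuda.is_available():
        return "no GPU is visible to torch"

    compiled = torch.cuda.get_arch_list()
    props = torch.cuda.get_device_properties(0)
    # Assumption: ROCm builds expose the gfx target as `gcnArchName`.
    device_arch = getattr(props, "gcnArchName", None)
    if device_arch is None:
        return f"compiled for {compiled}; device arch not exposed by this build"

    status = "covered" if arch_is_covered(device_arch, compiled) else "NOT covered"
    return f"device {device_arch} is {status} by compiled archs {compiled}"


if __name__ == "__main__":
    print(report())
```

If the device is reported as covered while the `Cannot find CO` message still appears, that would point away from the torch kernels themselves and toward another bundled library.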
This is possibly a ROCm issue, but it shows up clearly when invoking torch (with ROCm support) from Python. For a potentially related ROCm issue, see also: