8-GPU rig: 2 GPUs not working

Hello,

I am using an ONDA 1800 D8P-D3 V1.00 rig (Onda Technology Corporation, 5.6.5 01/16/2018) with 8 Radeon RX 580 8192M (AMD/ATI) cards.
It runs HiveOS (5.0.21-hiveos).

Problem: GPU0 and GPU1 are not working. Here are the logs:

[ 18.408128] [drm] amdgpu: 8192M of VRAM memory ready
[ 18.408133] [drm] amdgpu: 5901M of GTT memory ready.
[ 18.408205] [drm] GART: num cpu pages 65536, num gpu pages 65536
[ 18.408427] amdgpu 0000:02:00.0: (-22) kernel bo map failed
[ 18.408605] [drm:amdgpu_device_init [amdgpu]] ERROR amdgpu_vram_scratch_init failed -22
[ 18.408615] amdgpu 0000:02:00.0: amdgpu_device_ip_init failed
[ 18.408622] amdgpu 0000:02:00.0: Fatal error during GPU init
[ 18.408629] [drm] amdgpu: finishing device.
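
For anyone comparing logs across cards, here is a minimal Python sketch that picks out which PCI addresses report amdgpu init failures. It assumes the log has been saved as text in the format shown above (for example from `dmesg`). The function name `failed_gpus` and the list of failure keywords are my own choices for illustration, not part of any HiveOS or amdgpu tool.

```python
import re
from collections import defaultdict

# Matches per-device amdgpu lines in either of the two forms seen in this thread:
#   [ 18.408622] amdgpu 0000:02:00.0: Fatal error during GPU init
#   [ 17.152020] amdgpu 0000:05:00.0: amdgpu: Fatal error during GPU init
DEVICE_LINE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+amdgpu\s+"
    r"(?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]):\s+"
    r"(?:amdgpu:\s+)?(?P<msg>.*)$"
)

# Assumption: these substrings are enough to flag the failures shown in the logs.
FAILURE_HINTS = ("failed", "Fatal error")


def failed_gpus(log_text):
    """Return {pci_address: [failure messages]} for amdgpu devices in log_text."""
    failures = defaultdict(list)
    for line in log_text.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match and any(hint in match.group("msg") for hint in FAILURE_HINTS):
            failures[match.group("pci")].append(match.group("msg"))
    return dict(failures)


sample = """\
[ 18.408128] [drm] amdgpu: 8192M of VRAM memory ready
[ 18.408427] amdgpu 0000:02:00.0: (-22) kernel bo map failed
[ 18.408615] amdgpu 0000:02:00.0: amdgpu_device_ip_init failed
[ 18.408622] amdgpu 0000:02:00.0: Fatal error during GPU init
"""

for pci, messages in failed_gpus(sample).items():
    print(pci)
    for message in messages:
        print("   ", message)
```

The PCI address (here 0000:02:00.0) can then be matched against the slot or riser the card is plugged into.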

When I boot back into Windows 10, all GPUs work.

Any help to solve this issue would be appreciated.

Thank you.

Hello,
Did you manage to get this card working?
J.

Any luck? I’m having a similar issue. All I did was add a 5th GPU to an existing 4-GPU rig. The reason I think it’s Hive/BIOS/software is that the 4th GPU, which was previously working, is now getting similar errors. There was no change to the riser, PCIe port, or anything else on that GPU, but only 4 out of 5 GPUs are functional. Errors below.

[ 17.016849] amdgpu: ATOM BIOS: 113-1E366CU-S52
[ 17.016868] [drm] UVD is enabled in VM mode
[ 17.016868] [drm] UVD ENC is enabled in VM mode
[ 17.016871] [drm] VCE enabled in VM mode
[ 17.016885] [drm] GPU posting now...
[ 17.148586] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[ 17.148603] amdgpu 0000:05:00.0: BAR 2: releasing [mem 0x40000000-0x401fffff 64bit pref]
[ 17.148604] amdgpu 0000:05:00.0: BAR 0: releasing [??? 0x00000000 flags 0x0]
[ 17.148692] [drm:amdgpu_device_resize_fb_bar [amdgpu]] ERROR Problem resizing BAR0 (-16).

[ 17.148696] amdgpu 0000:05:00.0: BAR 2: assigned [mem 0x40000000-0x401fffff 64bit pref]
[ 17.149424] amdgpu 0000:05:00.0: amdgpu: VRAM: 8192M 0x000000F400000000 - 0x000000F5FFFFFFFF (8192M used)
[ 17.149426] amdgpu 0000:05:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[ 17.149430] ------------[ cut here ]------------
[ 17.149431] reserve_memtype failed: [mem 0x00000000-0xffffffffffffffff], req write-combining
[ 17.149854] amdgpu 0000:05:00.0: amdgpu: (-22) kernel bo map failed
[ 17.150611] [drm:amdgpu_device_init [amdgpu]] ERROR amdgpu_vram_scratch_init failed -22
[ 17.151306] amdgpu 0000:05:00.0: amdgpu: amdgpu_device_ip_init failed
[ 17.152020] amdgpu 0000:05:00.0: amdgpu: Fatal error during GPU init
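
As a sanity check on the address ranges in this log, the sketch below computes the size of each inclusive range. The helper `range_size` is a made-up name for illustration. The VRAM, GART and BAR 2 ranges come out at sensible sizes, while the range in the `reserve_memtype failed` line covers the entire 64-bit address space. My reading, which is an assumption and not something confirmed in this thread, is that BAR 0 never received a real address (it is logged as `[??? 0x00000000 flags 0x0]`), so the later mapping is rejected with -22 (EINVAL). The -16 on the resize line is EBUSY.

```python
MIB = 1024 * 1024


def range_size(start_hex, end_hex):
    """Size in bytes of an inclusive [start, end] address range given as hex strings."""
    return int(end_hex, 16) - int(start_hex, 16) + 1


# Ranges copied from the log above.
vram = range_size("0x000000F400000000", "0x000000F5FFFFFFFF")
gart = range_size("0x000000FF00000000", "0x000000FF0FFFFFFF")
bar2 = range_size("0x40000000", "0x401fffff")
memtype = range_size("0x00000000", "0xffffffffffffffff")

print("VRAM  MiB:", vram // MIB)   # 8192, matches "VRAM: 8192M"
print("GART  MiB:", gart // MIB)   # 256, matches "GART: 256M"
print("BAR 2 MiB:", bar2 // MIB)   # 2
print("reserve_memtype range spans all 64 bits:", memtype == 2**64)  # True
```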

Unfortunately, I have not found a solution yet.
When I reboot into Windows 10, all GPUs work (but that is not what I want, since it is not stable over long periods).