1 of 6 cards stopped working

Hi everyone,

Amateur Miner from sunny South Africa here, and I need your help.

I recently switched from Windows to HiveOS. I have 3 workers, and all was running well for a few days. Then one day, after a reboot, one of my RX580s just stopped working.

On every reboot since, I get errors:
ERROR ring gfx test failed (-110)
ERROR hw_init of IP block <gfx_v8_0> failed -110
amdgpu_device_ip_init failed
Fatal error during GPU init
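
For anyone wanting to check their own logs for the same failure, here is a rough Python sketch that scans kernel log text for the two error lines above and decodes the number. On Linux, errno 110 is ETIMEDOUT, so -110 suggests the GPU did not respond in time during init. The function name `find_gpu_init_errors` and the patterns are my own assumptions based only on the lines quoted here, not anything provided by HiveOS or the amdgpu driver.

```python
import re

# Linux errno values relevant here; 110 is ETIMEDOUT on Linux.
# Hardcoded on purpose, since errno numbers differ between operating systems.
LINUX_ERRNO = {110: "ETIMEDOUT (operation timed out)"}

# Patterns written to match the error lines quoted in this post (assumption:
# other kernel versions may word these messages differently).
PATTERNS = [
    re.compile(r"ring gfx test failed \((-?\d+)\)"),
    re.compile(r"hw_init of IP block <(\w+)> failed (-?\d+)"),
]


def find_gpu_init_errors(log_text):
    """Return (line, meaning) pairs for GPU init failures found in log_text."""
    hits = []
    for line in log_text.splitlines():
        for pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                # The error code is always the last captured group.
                code = abs(int(match.groups()[-1]))
                meaning = LINUX_ERRNO.get(code, f"errno {code}")
                hits.append((line.strip(), meaning))
                break
    return hits


sample = """ERROR ring gfx test failed (-110)
ERROR hw_init of IP block <gfx_v8_0> failed -110
amdgpu_device_ip_init failed
Fatal error during GPU init"""

for line, meaning in find_gpu_init_errors(sample):
    print(f"{line} -> {meaning}")
```

On a live rig the text could come from saving the output of `dmesg` to a file and reading it in, instead of the hardcoded `sample`.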

In the Hive interface I can see the card, but no temp/fan/power values are displayed, and Claymore doesn’t show that card at all (only GPU 0-4, i.e. the first 5 cards).

Some background:
I have 3 rigs in total; this one has 4x RX580 and 2x RX570 (all mixed brands and memory types).
The failing card is an ASUS Dual OC 8GB with Hynix memory. It was running with pimped straps and a mild OC/UV (1150/900/2100) and was comfortable at ~28MH/s.

After the failure I tried the following:

  1. Removed OC (no change)
  2. Flashed the default BIOS, which I always save from the card before modding (also no change)
  3. Swapped riser with working card (no change)
  4. Swapped 8-pin power with working card (no change)
  5. Pulled the card and put it into a Windows 10 machine. It booted into Windows with the “Standard VGA Adapter” driver, BUT installing the AMD drivers caused an instant BSOD (tried multiple driver versions, removing each failed install with DDU) (obvious failure)
  6. Put the card back into the HiveOS rig, flashed a modded BIOS from a tech forum, and got stats (temp/fan and power) to display, but no mining. Flashed the original back on with no OC. Same result (stats but no mining). After a few reboots, it was back to the main issue described above.
  7. Flashed the same modded BIOS (from the forum) again to see if the stats would at least come back, but to no avail. The card is still completely useless, apart from its name showing in the HiveOS interface.

Has anyone seen/experienced this before? Please help.

Thanks in advance,
Barney

Hello!

Well, by the looks of things, the card itself is failing. There’s not a whole lot that can be done aside from what you’ve tried already.

You reverted to the default factory BIOS and removed the OC, and the issue shows up on multiple OSes, which rules out software/drivers. The other cards are working as intended, which should already be a good giveaway. You’ve also swapped risers and cables, which rules out the other potential hardware causes…

Yup, looks like the card is bad.

Oh man, I’m really hoping that isn’t the case. It’s weird that the card works in Windows, display and everything, right up until I try the drivers…

I’m having the same issue in my test rig, with a 580 and a 570 both doing it. They work fine when I move them to another rig, and changing the riser or the PSU they are plugged into didn’t change anything. I feel like it’s a driver issue with the OS. Going to mess with it more tomorrow.