GPU driver errors and GPUs lost, forcing reboots

I have also been having this issue. The weird thing is it happened around the same time 2 nights in a row.

I get the restart for low hashrate, then an overclock failed notice with a gpu being offline, then a full reboot which seems to fix the issue.

This is on a rig with 3x 3070s using GMiner.

1 Like

Had a similar issue today with a new 3080 I just bought. When I added it to my rig, it would run fine for a few minutes and then crash with “GPU driver error, no temps”.

I changed the thermal pads on it, so thermal throttling is definitely not the issue. It turned out that an unstable memory OC was causing the crash. I dropped it by 100 MHz and it's been running well for a few hours now.
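If you want to check whether a crash like this shows up as a driver-level fault, the NVIDIA kernel driver writes "Xid" events to the kernel log. Here is a minimal sketch, assuming you have shell access to the rig; `xid_events` is just an illustrative helper name and not a HiveOS command:

```shell
# Print only the NVIDIA Xid lines from kernel-log text supplied on stdin.
# xid_events is a hypothetical helper name, not part of HiveOS.
xid_events() {
    # "|| true" keeps the function from returning an error when nothing matches
    grep -E 'NVRM: Xid' || true
}

# Typical use on the rig (may need root):
#   dmesg | xid_events
```

If Xid lines appear around the time a GPU drops out, that at least points toward the driver, the card, or its power and riser, rather than the miner software.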

1 Like

Hey King,

Did you fix the problem? I am having the same issue with my Gigabyte 3080 OC cards. I have decreased the MEM to 2100 and I am still having the same issues.

Same issue here with 1070 and 1070 Ti :cold_sweat:

Any idea?

Same issue happening to me, but on GTX 1060s!

So it’s not an RTX 30 series issue.

OC is minimal, and usually stable.

Don't know what else to do.
Any suggestions?

no point in opening duplicate threads. there is no fix and hiveOS devs don’t even care.

I have the same problem on 1070, did you manage to solve it?

Has anyone figured it out?

Mine has been really weird. I had the same cards and the same board in another frame and never had an issue. I bought a new frame for more space and airflow, moved the cards over, and now I am getting these errors. Sometimes the rig will run for a day with no issues and then out of the blue I get the Nvidia OC failed error; sometimes it happens right away, or it tells me there is an issue with GPU5. I changed risers and the PSU, but the one thing we have in common is the 3070 and the Z170 series board, so I am almost certain it is either HiveOS or the board. I am running 7 cards. I will test with 6 on Thursday and post the results here. One thing that helps me is switching miners from Phoenix to GMiner; about 80% of the time it works and mines for a day or so.

Same here: RTX 3070, T-Rex.
I have two identical rigs and only one of them has the problem. The only difference between the rigs is the Nvidia driver version; the one with the problem is on 460.67.

I’m on my 8th rig and started having the same problem as soon as I began installing 3080s. My first 7 rigs were all 3070s and never had any issues. I’m currently testing this setup but still having the same issue. The problem was semi-solved when I reduced the number of GPUs. I think the problem is either mixing 3080s and 3090s, or my PSU not being able to supply enough power for the 3080s and 3090s. I got the same issue when I added a 3060 to my previous rig, which had only 3070s. Resetting my BIOS helped me with another rig.

I just bought a Corsair HX1000 Platinum 1000w to give it a try. I will also reset my BIOS. I will keep you guys posted!

1 Like

@russchau
I have the same problem, but I don't know where it comes from; my rig stops mining at random times.
I would also like to know whether I should change something in my overclocks.

For now I am trying to find the cause of the problem by swapping in different PSUs. I will let you know if I find a solution.

Try resetting your BIOS and let me know if that solves the problem. If it still doesn't work, try resetting again and start with your 3080.

Could it be caused by the internet connection? For me it started after I had fiber installed, so is it possible that it comes from that?

My rig keeps stopping and I don't know what else to do. It was working very well before, and all of a sudden it stops at random times.

I don't think so, because mining doesn't use much bandwidth. Is the light on your Ethernet port on?
Solution:

  1. Plug a monitor into your HDMI port and take a photo after HiveOS has loaded.
  2. Unplug your Ethernet cable and restart your rig. Wait until HiveOS has loaded completely, then plug the Ethernet cable back in. If the light on your Ethernet port comes on, enter the command: “miner”

You can run the commands listed here: https://hiveos.farm/troubleshooting-conn/

Delete all the OC settings on the 3080s (= 0), restart your rig, and then change only your PL to 230 W once it is stable.

OK, so I ended up here with the same issues you are having. I have been troubleshooting this for days. I checked everything, including power cables and risers, and changed them all out. I even changed my motherboard and CPU and still had the same problem. It turned out that a USB extension on one of my cables was causing the issue. When I cut out the extension, all of my problems went away. So remove extensions and/or change USB cables, or make sure they are fully seated. Hopefully this will fix your problem.

1 Like

I had the same problem.
I decided to access the disk drive through Windows and delete the amd-oc.conf (nvidia-oc.conf if it is Nvidia) and autofan.conf files.
In the rig.conf file, I deleted the lines referring to the WatchDog.
After that I started HiveOS again and the rig started working again. :fireworks:
Or simply format your disk, reinstall the system, and redo the settings.

3 Likes

Thanks for this, dude. This issue has been hounding me for a week. I had a few unexpected power interruptions, and EVERY time after that I got all these weird Nvidia driver issues and my OCs wouldn't apply.
The only way I could get it working was to reload the flash drive, so this saved me a lot of time compared with doing that every time. +1 from me.
PS: I used Shellinabox, removed only my nvidia-oc.conf and autofan.conf, and restarted.

cd /hive-config
rm nvidia-oc.conf autofan.conf

Please use the above commands at your own risk and make sure you understand what you are doing.
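A slightly safer variant is to move the files aside instead of deleting them, so the old OC values can be restored or inspected later. This is only a sketch: `backup_oc_configs` is a hypothetical helper name, and it assumes the config directory is /hive-config as in the commands above (amd-oc.conf is included only because an earlier post mentions it for AMD cards).

```shell
# Move the OC and autofan configs into a timestamped backup folder
# instead of deleting them. backup_oc_configs is a hypothetical helper,
# not a HiveOS command; the default path comes from the post above.
backup_oc_configs() {
    conf_dir="${1:-/hive-config}"
    backup_dir="$conf_dir/oc-backup-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$backup_dir" || return 1
    for f in nvidia-oc.conf amd-oc.conf autofan.conf; do
        # Only move files that actually exist on this rig
        if [ -f "$conf_dir/$f" ]; then
            mv "$conf_dir/$f" "$backup_dir/" || return 1
        fi
    done
    # Print where the files went so they can be restored later
    echo "$backup_dir"
}

# Typical use on the rig, followed by a restart:
#   backup_oc_configs /hive-config
```

The same risk warning applies: the rig comes back up without the moved OC and autofan settings, so they need to be set again afterwards.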

4 Likes

I had the same problem (ERGO, 2miners, T-Rex; rig: 3060 Ti, 3070 Ti, 3080).
I solved it by:

  1. Downgrading HiveOS.
  2. Downgrading the Nvidia drivers to a stable version.
  3. Many tries to find a good OC.

Now it's working fine.
1 Like

Same issue, four 3080s. So frustrating. If this works I will praise NEoKhajitt and evandrop to my grandchildren. Will post again. So far stable.