GPU driver errors and GPUs lost, forcing reboots

I’m having some issues getting my rig to run for a while without crashing due to GPU errors. The two common errors I’m getting are “GPU are lost, rebooting” which causes the system to reboot, and “GPU driver error, no temps” which drops the hash rate to 0.

I’ve noticed that the load average (LA) is sometimes very high around the time the errors occur, but that may be a coincidence.

I’ve also noticed that nvtool cannot read the clock speeds of GPU2, which may be related.

I’m on the newest Nvidia driver and HiveOS versions.

I have a similar problem. I just started mining.
I have one MSI Ventus RTX 3080.
When the card runs without an overclock, there is no problem and it runs steadily at 86 MH/s.
After setting a memory overclock of 2000, it runs at 96 MH/s for several minutes and then crashes. After a reboot, it keeps crashing after several minutes of work.

Same error: “GPU driver error, no temps”.

I tried Nvidia drivers 455 and 460.