Misbehaving 3060ti and 3070

I frequently get errors from two of my cards in my rig of 12. It’s either GPU0, which is a 3060ti, or GPU11, which is a 3070. T-Rex Miner gives me these lines frequently:

WARN: NVML: can't get fan speed for GPU #0, error code 999

[FAIL] 39/43 - Low difficulty or invalid share, 34ms ... GPU #0

When I try to modify overclock settings in the web GUI I get this type of error:

=== GPU 11, 10:00.0 GeForce RTX 3070 7982 MB, PL: 100 W, 270 W, 300 W === 16:49:31
SET POWER LIMIT: 125.0 W [Unknown Error] (exicode=123)
Max Perf mode: 4 (auto)
ERROR: Error assigning value 95 to attribute 'GPUTargetFanSpeed' (NB1:0[fan:16]) as specified in assignment '[fan:16]/GPUTargetFanSpeed=95' (Unknown Error).
ERROR: Error assigning value 95 to attribute 'GPUTargetFanSpeed' (NB1:0[fan:17]) as specified in assignment '[fan:17]/GPUTargetFanSpeed=95' (Unknown Error).
ERROR: Error assigning value 0 to attribute 'GPUGraphicsClockOffset' (NB1:0[gpu:11]) as specified in assignment '[gpu:11]/GPUGraphicsClockOffset[4]=0' (Unknown Error).
ERROR: Error assigning value 0 to attribute 'GPUMemoryTransferRateOffset' (NB1:0[gpu:11]) as specified in assignment '[gpu:11]/GPUMemoryTransferRateOffset[4]=0' (Unknown Error).
Attribute 'GPUFanControlState' (NB1:0[gpu:11]) assigned value 1.
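For anyone comparing output like this across cards, here is a minimal Python sketch that lists which assignments were rejected. The regular expression is modelled only on the ERROR lines quoted above, and the function name `failed_assignments` is made up for illustration; other driver versions may word these messages differently.

```python
import re

# Pattern modelled on the "Error assigning value ..." lines quoted above.
# Assumption: the target is always written as "[kind:index]", e.g. "[fan:16]".
ASSIGN_ERROR = re.compile(
    r"ERROR: Error assigning value (-?\d+) to attribute '(\w+)' "
    r"\(\S+\[(\w+):(\d+)\]\)"
)


def failed_assignments(text):
    """Return (target kind, target index, attribute, value) for each rejected setting."""
    return [
        (kind, int(index), attribute, int(value))
        for value, attribute, kind, index in ASSIGN_ERROR.findall(text)
    ]
```

It works on the pasted text whether the messages sit on one line or on several, since it scans the whole string for matches.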

I run the latest stable, but have also tried the latest beta. A reboot seems to fix it, but the problem comes back quickly on either GPU #0 or #11.

Any ideas why this is?

These kinds of problems are often caused by bad risers, wiring, or connections. Try switching cabling with a neighboring card and see whether the problem stays with the same card or moves to the neighboring card.
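To make the swap test less of a guessing game, one option is to count error lines per GPU in the miner log before and after the swap. Below is a minimal Python sketch. The two patterns are modelled only on the T-Rex lines quoted in the first post, the function names are hypothetical, and it assumes each card keeps its GPU index after the swap (only the riser and cabling are exchanged).

```python
import re
from collections import Counter

# Patterns modelled on the two T-Rex lines quoted in the first post.
# Assumption: other miner versions may format these lines differently.
NVML_WARN = re.compile(r"WARN: NVML: .* for GPU #(\d+), error code \d+")
BAD_SHARE = re.compile(r"\[FAIL\].*GPU #(\d+)")


def count_errors_per_gpu(lines):
    """Return a Counter mapping GPU index to the number of error lines seen."""
    counts = Counter()
    for line in lines:
        match = NVML_WARN.search(line) or BAD_SHARE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def interpret_swap(errors_after_swap, suspect_gpu, neighbor_gpu):
    """Interpret error counts collected after swapping riser/cabling.

    Assumes both cards keep their GPU index; only the riser and
    cabling were exchanged between them.
    """
    suspect = errors_after_swap[suspect_gpu]
    neighbor = errors_after_swap[neighbor_gpu]
    if suspect > 0 and neighbor == 0:
        return "stays with card"      # points at the card or its settings
    if neighbor > 0 and suspect == 0:
        return "moved with cabling"   # points at the riser, wiring, or connections
    return "inconclusive"
```

Feed `count_errors_per_gpu` the log lines collected after the swap and pass the result to `interpret_swap` together with the two GPU indices involved. A short observation window can easily produce "inconclusive", so it is worth letting the rig run for a while first.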

Thank you! It seems the riser does not recover once an overclock has pushed the GPU outside its working range. After a lot of experimenting, it looks like once I tweak the overclock out of the working range and start getting errors, the riser has to be power-cycled before the card works again. A simple reboot is not sufficient; the riser needs to be powered off. And, of course, before powering off I need to restore a working configuration in HiveOS, or the problem comes right back when the miner starts, even after a full power-off.

The remedy for me seems to be:

  • Restore a known working configuration for the card
  • Power off completely
  • Boot up

It will probably also work to replace the last two steps with “Shutdown & reboot in 30 s” from the Hive web GUI. I also tried (as a proof of concept, not recommended!) powering off the riser while the rig was running by pulling its plug and then doing a soft reboot, which also worked (but may be harmful to the hardware).

Hi turboraketti,
I’m trying to do a little research on this fan bug. Would you mind telling me what risers you are using?

I’m actually trying to figure out a way to exploit this bug to get my 3060s up to full hashrate, as reported by a user in this thread:
3060 with risers : What to mine? - Nvidia Cards - Hive OS Forum

Thanks

Sure. I have two different kinds. This is what’s printed on them:

  • PCE164P-N06 VER 008S
  • PCE164P-N08 VER 0095

Some bags have “MAKKI Mining Riser” printed on them, but I’m not sure if that holds for both kinds.
Hope that helps.

turbo, thank you very much for your reply! I’ve taken note of your riser information & will try to line up a couple test risers.
