ERROR: RTX 3060 Ti LHR x6 rig issues with overclocking the last GPU

Hi everyone, I have a rig with the following specs:
6 x RTX 3060 Ti LHR Asus Hinyx
Asus ROG Strix B450-F Gaming II
HiveOS on a Kingston DataTraveler 3.0 31.0GB
12 × AMD Ryzen 5 3600 6-Core Processor
8 GB RAM

After some problems with bad risers and a previous wrong CPU choice, we finally got all the GPUs working, but the rig crashes when I try to overclock GPU5. No matter how much overclock I apply, or whether the other GPUs are already overclocked or not, this GPU crashes, showing the famous
"GPU driver error, no temps, rebooting" and in T-Rex
"Can’t find nonce with device [ID=5, GPU #5], cuda exception: CUDA_ERROR_ILLEGAL_ADDRESS, try to reduce overclock to stabilize GPU state"

Without overclock the GPU works fine.
- We already changed the riser.
- We already tried a previous NVIDIA driver version (470), but not a newer one.

  • Could it be caused by the USB drive? (option: replace it with an SSD)
  • Could it be the virtual memory? How can I increase that in HiveOS?

Thank you in advance!!!



Switch to locked core clocks; 1500 MHz is typically a good starting point for LHR cards. Remove the power limit as well.

If you still have issues after that, reduce the memory clocks, but set the locked core clock first and see how it behaves.
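
If you want to try the locked clock by hand rather than through the overclock settings, here is a minimal sketch of the idea. It assumes `nvidia-smi` is available on the rig, that its `-lgc` (lock GPU clocks) and `-rgc` (reset GPU clocks) options are supported by the installed driver, and that the card shows up as index 5 in `nvidia-smi` (the index may not match the miner's numbering). The helper names are made up for illustration, and by default the script only prints the command instead of running it.

```python
import subprocess
from typing import List


def lock_core_clock_cmd(gpu_index: int, mhz: int) -> List[str]:
    """Build the nvidia-smi command that pins one GPU's core clock."""
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    if mhz <= 0:
        raise ValueError("mhz must be positive")
    # -i selects the GPU; -lgc takes a min,max pair, and using the same
    # value for both locks the core clock instead of applying an offset.
    return ["nvidia-smi", "-i", str(gpu_index), "-lgc", f"{mhz},{mhz}"]


def reset_core_clock_cmd(gpu_index: int) -> List[str]:
    """Build the nvidia-smi command that removes the core clock lock."""
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    return ["nvidia-smi", "-i", str(gpu_index), "-rgc"]


def apply(cmd: List[str], dry_run: bool = True) -> int:
    """Print the command (dry run) or execute it and return its exit code."""
    if dry_run:
        print(" ".join(cmd))
        return 0
    # Running for real usually needs root privileges.
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Assumed values: GPU index 5 and the 1500 MHz starting point from above.
    apply(lock_core_clock_cmd(5, 1500))
```

Call `apply(..., dry_run=False)` to actually run it, and use `reset_core_clock_cmd(5)` to undo the lock if the card becomes unstable.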

Hi! Thank you for answering.

We tried all kinds of overclocks, but that GPU won't accept any of them.

Update: We moved the GPU to a different position and changed the riser and cables, and the problem continues on the same GPU. I don’t know if it is a software problem or the GPU itself, but the GPU is brand new and it works perfectly with no overclock.

Next step: replace the USB drive with an SSD; hopefully it changes something.

You’re still using offset core values. Switch to locked clocks instead and the GPUs will be happier.
