AMD and NVIDIA (Nvidia crashes)

I have a rig running fine with 3 AMD RX 580 GPUs. As soon as I shut it down and plug in an Nvidia GTX 1060, one of three things happens: HiveOS fails to start, the miner (Claymore Dual Miner) won't start, or, on the few occasions the miner did start, it crashed within a minute.

Any help troubleshooting, or advice on how to diagnose this, would be appreciated.
Comments

  • Check 1060 riser and connectors. Screenshots, pls.
  • I'm running over SSH, so screenshots when HiveOS doesn't boot will be hard. I'll have to get a long monitor cable.

    Would a bad GPU prevent HiveOS from starting? I think one of the GTX 1060s might be bad. HiveOS will run with 3 RX 580s and 1 GTX 1060. When I plug in both GTX 1060s, or only the second 1060, HiveOS won't start.
  • edited May 18
    A few times it would start and detect 4 of the 4 GPUs plugged in, showing the 3 AMD cards without the memory info, but it would only mine on the 3 RX 580s, and at very low hashrates. See the screenshot in the comment below.

    I'm running from USB. Is there anything in the logs that would help diagnose this? If so, how can I enable logging on USB?
  • edited May 18
    [Screenshot: zaqf10y4ziuv.png]

    That's weird: the first card is a 580, not a 470/480.
  • edited May 18
    Try plugging the Nvidia card into the first PCIe slot and the AMD cards into the other slots. Try different combinations. Switch on 4G Decoding in the motherboard BIOS. Is the power supply sufficient? How many watts? What motherboard?
  • With the Nvidia cards in the two PCIe x16 slots it will detect all cards, but with the same memory error as above. The miner won't start; there are no strange errors, it just seems to hang after connecting to the pools.

    HiveOS might be stuck in a loop, as sudo poweroff isn't shutting down the system anymore. (I just upgraded to 0.5-52.)
  • It crashed last night and I haven't been able to get HiveOS to boot back up with all the GPUs detected correctly. Currently I have the two NVIDIA cards in slots 1 and 2, followed by the 3 AMD cards. With a monitor plugged in, HiveOS seems to hang on "modprobe NVIDIA drivers".
  • Looking through the boot log, the only thing that jumps out is:

    May 24 10:31:46 miner-001 hive[806]: Error connecting to Hive server http://api.hiveos.farm
    May 24 10:31:46 miner-001 hive[806]: CURLE_OPERATION_TIMEDOUT (28) Operation timeout. The specified time-out period was reached according to the conditions.


    However, net-test returns OK on everything, and there are no issues if I run hello.
  • edited May 24
    Looking through the syslog, there's a little bit more:

    May 24 10:32:01 miner-001 cron[567]: (root) RELOAD (crontabs/root)
    May 24 10:32:03 miner-001 hivex[1564]: #015
    May 24 10:32:05 miner-001 hivex[1564]: waiting for X server to begin accepting connections .
    May 24 10:32:07 miner-001 hivex[1564]: ..
    May 24 10:35:00 miner-001 hivex[1564]: message repeated 86 times: [ ..]
    May 24 10:35:01 miner-001 CRON[4632]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
    May 24 10:35:02 miner-001 hivex[1564]: ..
    May 24 10:36:04 miner-001 hivex[1564]: message repeated 31 times: [ ..]
    May 24 10:36:04 miner-001 hivex[1564]: xinit: giving up
    May 24 10:36:04 miner-001 hivex[1564]: xinit: unable to connect to X server: Bad file descriptor
    May 24 10:36:04 miner-001 hivex[1564]: #015
    May 24 10:36:15 miner-001 hivex[1564]: waiting for X server to shut down ..........
    May 24 10:36:15 miner-001 hivex[1564]: xinit: X server slow to shut down, sending KILL signal
    May 24 10:36:15 miner-001 hivex[1564]: #015
    May 24 10:36:19 miner-001 hivex[1564]: waiting for server to die ...
    May 24 10:36:19 miner-001 hivex[1564]: xinit: X server refuses to die
    May 24 10:36:19 miner-001 hivex[1564]: xinit exited (exitcode=1), starting hive-console
  • The resolution I came to is that it's easier to split the cards into separate rigs, one for AMD and one for Nvidia.
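
For anyone hitting the same symptoms, here is a rough sketch of how log excerpts like the ones quoted above could be filtered down to the lines that matter (the Hive API timeout and the xinit failures). The regular expression, the KEYWORDS list, and the function name find_failures are my own assumptions for illustration and are not part of HiveOS; the sample lines are copied from the logs in this thread.

```python
import re

# Syslog line shape as seen in the thread, e.g.
# "May 24 10:36:04 miner-001 hivex[1564]: xinit: giving up"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+"   # timestamp, e.g. "May 24 10:36:04"
    r"(?P<host>\S+)\s+"                      # hostname, e.g. "miner-001"
    r"(?P<proc>[^\[\s:]+)(?:\[\d+\])?:\s?"   # process name with optional [pid]
    r"(?P<msg>.*)$"                          # the rest is the message
)

# Substrings treated as "interesting" here. This is an illustrative
# guess based on the excerpts above, not an exhaustive list.
KEYWORDS = ("xinit", "Error connecting", "CURLE_")


def find_failures(lines):
    """Return (timestamp, process, message) for lines matching a keyword."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and any(k in m.group("msg") for k in KEYWORDS):
            hits.append((m.group("ts"), m.group("proc"), m.group("msg")))
    return hits


# Sample lines copied from the boot log and syslog quoted in the thread.
SAMPLE = [
    "May 24 10:31:46 miner-001 hive[806]: Error connecting to Hive server http://api.hiveos.farm",
    "May 24 10:32:07 miner-001 hivex[1564]: ..",
    "May 24 10:35:01 miner-001 CRON[4632]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)",
    "May 24 10:36:04 miner-001 hivex[1564]: xinit: giving up",
    "May 24 10:36:19 miner-001 hivex[1564]: xinit: X server refuses to die",
]

if __name__ == "__main__":
    for ts, proc, msg in find_failures(SAMPLE):
        print(f"{ts}  {proc}: {msg}")
```

To run it against a real log, read the file's lines and pass them in place of SAMPLE. The log path depends on the system, so it is left out here.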