System doesn't start properly with modded AMD BIOS

Hi,

I have RX 570 cards in my system with a modded BIOS, and HIVEOS doesn't start on these rigs. Any ideas?

This is what I see in the kernel log:

[ 35.296887] amdgpu 0000:0f:00.0: fence driver on ring 13 use gpu addr 0x00000000004006e0, cpu addr 0xffffac33c810a6e0
[ 35.296990] amdgpu 0000:0f:00.0: fence driver on ring 14 use gpu addr 0x0000000000400760, cpu addr 0xffffac33c810a760
[ 35.338690] BUG: unable to handle kernel paging request at ffffac3923717420
[ 35.339836] IP: smu7_populate_single_firmware_entry.isra.3+0x8d/0xf0 [amdgpu]
[ 35.340943] PGD 23e12a067
[ 35.340944] P4D 23e12a067
[ 35.342050] PUD 0

[ 35.345259] Oops: 0002 [#1] SMP
[ 35.346329] Modules linked in: amdgpu(OE+) amdkfd(OE) amd_iommu_v2 amdttm(OE) amdkcl(OE) intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd rtl8xxxu glue_helper cryptd arc4 intel_cstate eeepc_wmi rtl8192cu asus_wmi sparse_keymap rtl_usb mxm_wmi wmi_bmof rtl8192c_common intel_rapl_perf rtlwifi mac80211 cfg80211 hci_uart btbcm btqca intel_lpss_acpi btintel intel_lpss bluetooth ecdh_generic acpi_pad acpi_als mac_hid wmi mei_me shpchp mei kfifo_buf industrialio autofs4 uas usb_storage i915 e1000e(OE) ptp pps_core ahci i2c_algo_bit libahci drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm video i2c_hid hid
[ 35.351207] CPU: 1 PID: 936 Comm: modprobe Tainted: G OE 4.13.16-hiveos #1
[ 35.352483] Hardware name: System manufacturer System Product Name/B250 MINING EXPERT, BIOS 1001 12/13/2017
[ 35.353781] task: ffff92b037acc440 task.stack: ffffac33c15b4000
[ 35.355155] RIP: 0010:smu7_populate_single_firmware_entry.isra.3+0x8d/0xf0 [amdgpu]
[ 35.356470] RSP: 0018:ffffac33c15b78b8 EFLAGS: 00010246
[ 35.357809] RAX: 0000000000000089 RBX: 0000000000000003 RCX: 0000000000537000
[ 35.359136] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffff92b03b2b37e0
[ 35.360474] RBP: ffffac33c15b7908 R08: 0000000000000001 R09: 000000000003fa90
[ 35.361799] R10: ffff92b0397f4ae0 R11: 0000000000000001 R12: ffffac3923717420
[ 35.363134] R13: ffff92b039632ec8 R14: ffff92b039d04000 R15: 000000000000047e
[ 35.364465] FS: 00007f88db929700(0000) GS:ffff92b046d00000(0000) knlGS:0000000000000000
[ 35.365831] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 35.367183] CR2: ffffac3923717420 CR3: 00000002367d8000 CR4: 00000000003406e0
[ 35.368570] Call Trace:
[ 35.369968] smu7_request_smu_load_fw+0x97/0x330 [amdgpu]
[ 35.371406] polaris10_start_smu+0x121/0x200 [amdgpu]
[ 35.372849] pp_hw_init+0x4a/0xe0 [amdgpu]
[ 35.374266] amdgpu_pp_hw_init+0x38/0x80 [amdgpu]
[ 35.375697] amdgpu_device_init+0xae7/0x1460 [amdgpu]
[ 35.377083] ? kmalloc_order+0x18/0x40
[ 35.378453] ? kmalloc_order_trace+0x24/0xa0
[ 35.379873] amdgpu_driver_load_kms+0x87/0x2e0 [amdgpu]
[ 35.381247] drm_dev_register+0x146/0x1d0 [drm]
[ 35.382662] amdgpu_pci_probe+0x11a/0x140 [amdgpu]
[ 35.384062] local_pci_probe+0x45/0xa0
[ 35.385451] pci_device_probe+0x159/0x1a0
[ 35.386852] driver_probe_device+0x29e/0x450
[ 35.388258] __driver_attach+0xdf/0xf0
[ 35.389674] ? driver_probe_device+0x450/0x450
[ 35.391100] bus_for_each_dev+0x6c/0xc0
[ 35.392500] driver_attach+0x1e/0x20
[ 35.393934] bus_add_driver+0x1f4/0x270
[ 35.395332] ? 0xffffffffc0e76000
[ 35.396739] driver_register+0x60/0xe0
[ 35.398141] ? 0xffffffffc0e76000
[ 35.399515] __pci_register_driver+0x4c/0x50
[ 35.400939] amdgpu_init+0x8f/0xa0 [amdgpu]
[ 35.402290] do_one_initcall+0x53/0x190
[ 35.403586] ? __vunmap+0x81/0xb0
[ 35.404870] ? kmem_cache_alloc_trace+0x152/0x1c0
[ 35.406193] ? kfree+0x162/0x170
[ 35.407497] ? kfree+0x162/0x170
[ 35.408809] do_init_module+0x5f/0x209
[ 35.410110] load_module+0x27e0/0x2be0
[ 35.411378] ? ima_post_read_file+0x7d/0xa0
[ 35.412636] SYSC_finit_module+0xe5/0x120
[ 35.413920] ? SYSC_finit_module+0xe5/0x120
[ 35.415134] SyS_finit_module+0xe/0x10
[ 35.416331] entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 35.417542] RIP: 0033:0x7f88db44c4d9
[ 35.418736] RSP: 002b:00007ffe37e41968 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 35.419919] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f88db44c4d9
[ 35.421133] RDX: 0000000000000000 RSI: 000055891de03540 RDI: 0000000000000007
[ 35.422336] RBP: 00007ffe37e40970 R08: 0000000000000000 R09: 000000000000002d
[ 35.423547] R10: 0000000000000007 R11: 0000000000000246 R12: 000055891de02760
[ 35.424695] R13: 00007ffe37e40950 R14: 0000000000000005 R15: 0000000000040000
[ 35.425826] Code: e3 fb 0f 94 c0 66 41 89 44 24 18 31 c0 48 8b 4d e0 65 48 33 0c 25 28 00 00 00 75 6a 48 83 c4 38 5b 41 5c 41 5d 5d c3 0f b7 45 b2 <66> 41 89 1c 24 41 c7 44 24 0c 00 00 00 00 41 c7 44 24 10 00 00
[ 35.427106] RIP: smu7_populate_single_firmware_entry.isra.3+0x8d/0xf0 [amdgpu] RSP: ffffac33c15b78b8
[ 35.428309] CR2: ffffac3923717420
[ 35.429517] ---[ end trace e68aa1267d4d2bd2 ]---
[ 39.516645] pci 0000:10:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
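A long Oops like the one above can be boiled down to a few key facts: the faulting address, the function and module named on the `IP:` line, and any devices reporting an invalid PCI ROM header signature. The following is only a minimal sketch, assuming the message formats match the lines pasted above; `summarize_oops` and the regular expressions are illustrative names, not part of any Hive or kernel tooling, and other kernel versions may word these messages differently.

```python
import re

# Patterns modelled on the exact log lines in this post (an assumption:
# other kernel versions may format these messages differently).
FAULT_RE = re.compile(r"BUG: unable to handle kernel paging request at ([0-9a-f]+)")
IP_RE = re.compile(r"\bIP: (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?")
ROM_RE = re.compile(
    r"pci (\S+): Invalid PCI ROM header signature: "
    r"expecting (0x[0-9a-f]+), got (0x[0-9a-f]+)"
)


def summarize_oops(log_text):
    """Return the faulting address, function, module and bad-ROM devices."""
    fault = FAULT_RE.search(log_text)
    ip = IP_RE.search(log_text)
    return {
        "fault_address": fault.group(1) if fault else None,
        "function": ip.group(1) if ip else None,
        "module": ip.group(2) if ip else None,
        "bad_rom_devices": [m.group(1) for m in ROM_RE.finditer(log_text)],
    }


# Three lines copied from the log above.
sample = """\
[ 35.338690] BUG: unable to handle kernel paging request at ffffac3923717420
[ 35.339836] IP: smu7_populate_single_firmware_entry.isra.3+0x8d/0xf0 [amdgpu]
[ 39.516645] pci 0000:10:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
"""

summary = summarize_oops(sample)
print(summary)
```

For this log the summary points at `smu7_populate_single_firmware_entry.isra.3` in the `amdgpu` module, which matches the call trace: the crash happens while the driver loads SMU firmware during device init.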

Comments

  • edited May 31
    It's difficult to say anything without additional info.
    First of all, flash the stock BIOS, then try to boot Hive.

    You can also get help here: http://forum.hiveos.farm/discussion/390/proshivka-bios-amd
  • With the stock BIOS: rig01-6-RadeonRX570-4G-Samsung_K4G41325FE-113-D0003400_100.rom

    everything worked correctly. Only after I flashed this one: Powercolor.RX570.4096.170330.rom

    it doesn't come up anymore.
  • Just flashed back the stock ROM and it comes back up normally, so it's clearly caused by the other ROM.
  • The reason I started flashing the ROM is that I get a very bad hashrate, just ~20. Can somebody share a ROM that is known to work with this card?
  • I gave you a link where you can get help, see above.
  • Unfortunately I don't speak Russian, and while there are many links, I can't see one that is for this card:

    Radeon RX 570 4096M
    Samsung K4G41325FE, 113-D0003400_100

    I would appreciate a link to the file if you have one.

    thanks
  • Attach your stock BIOS.
  • ~30 MH/s
  • I got 29, much better than before, thanks.
    Could you share what exactly you changed in the ROM and which version of the software you used, so I can do this for my remaining cards (2 models of 580s)?

    thanks
  • If you can't share how you did that, can you please fix these 2 other ROMs too:

    rx580 type 1: https://drive.google.com/open?id=1jU8dN0M75BtNkZZBQYsSVrBmIFb7obEN
    rx580 type 2 : https://drive.google.com/open?id=10bz_cZ3YKGFNgzeAp0KaNvvoc9TTBdgw

    thanks a lot for your help
  • @oehmes:
    1) Increase max memory clock to 2350MHz
    2) Decrease memory voltage to 875 and set core voltage on P6 and P7 equal to P5's 65286
    3) Apply the timings found in PBE 1.6.7
  • That was very helpful, I was able to change all the settings. Are there any specific settings I still need to put into the AMD OC fields, or do I leave all of that empty?

    thx
  • @brnfex since flashing the 580s the rig reboots every 2-3 hours. Could it be that the settings are too aggressive? What would you change, and on which of the 2 models?
  • You need to tweak your memory overclock and core voltage. There is no universal solution to this.
  • @brnfex sorry for all the questions, I only started with all this recently. When you say to tweak the memory overclock and core voltage, is that something I would adjust in the BIOS, or does the BIOS just set the upper limits and I would then use the HIVE AMD OC Panel to adjust the maximum per GPU?

    thx. Sven
  • At the BIOS level you only need a timings mod, nothing else. Since Hive 0.5-54 you can adjust all clocks and voltages without having to hardcode them into your BIOS. A good starting point would be 1150MHz core, 2000MHz memory and 950mV voltage. Work your way up or down on memory depending on stability; after 2-3 days, when you are 100% stable, start lowering the voltage.
  • I didn't have any settings in the AMD OC page, I guess that's why it was unstable; the temperature was very high, as was the fan speed. I just reduced the voltage and now everything runs cooler with lower fan speed. I will monitor this and follow your advice.

    thanks a lot !
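The tuning advice in this thread (start at 1150MHz core, 2000MHz memory and 950mV, back off memory when unstable, lower the voltage only after a few stable days) can be written down as a small decision rule. This is a rough sketch, not a Hive feature: `OcSettings`, `next_settings`, the 25MHz and 10mV step sizes and the 3-day threshold are all assumptions chosen for illustration, and raising the memory clock while stable is left out to keep it short.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class OcSettings:
    # Starting point suggested in the thread.
    core_mhz: int = 1150
    mem_mhz: int = 2000
    voltage_mv: int = 950


# Hypothetical values; the thread gives no step sizes.
MEM_STEP_MHZ = 25
VOLT_STEP_MV = 10
STABLE_DAYS = 3  # upper end of the "2-3 days" mentioned above


def next_settings(current, crashed, days_stable):
    """Suggest the next settings to try, given how the rig behaved."""
    if crashed:
        # Unstable: back off the memory clock first.
        return replace(current, mem_mhz=current.mem_mhz - MEM_STEP_MHZ)
    if days_stable >= STABLE_DAYS:
        # Fully stable for long enough: try a slightly lower voltage.
        return replace(current, voltage_mv=current.voltage_mv - VOLT_STEP_MV)
    # Otherwise keep running with the current settings.
    return current


start = OcSettings()
after_crash = next_settings(start, crashed=True, days_stable=0)
after_stable = next_settings(start, crashed=False, days_stable=3)
print(after_crash)
print(after_stable)
```

Changing one value at a time makes it easier to tell which setting caused a reboot. Note that this simplified rule always lowers memory after a crash, even if the crash followed a voltage reduction, so in practice you may want to restore the previous voltage instead.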