CONFIG_NO_HZ=y possibly broken #127
Comments
I tried the Fedora Kconfig and it appears to use a dummy console. If I add |
For me, it booted but hangs after Are you seeing the same hang? I am seeing the same hang for both However, it hangs during boot for the spike/smp combination (for both configurations). I am running with a fairly recent linux
Should I update anything else to verify? |
The difference is probably just my working directory being a mess, but I can reproduce some failures with
so I'm going to go figure out what's going on. |
@atishp04 is going to take a look, thanks! |
Just a quick update. I am able to get repeated stall warnings in the qemu setup as well (with smp).
[ 21.020000] INFO: rcu_sched self-detected stall on CPU
The first one appears after 21 sec and the rest keep appearing every ~63 secs. Still investigating. |
Got some time to look at the issue. The arch_idle() function invokes wfi, which puts the CPU into "idle" mode until the next interrupt happens. Here are possible causes:
http://lists.infradead.org/pipermail/linux-riscv/2018-April/000510.html It is difficult to debug because I can't put a printk in the arch_idle function: it is called so many times that the serial console would hang first. trace_printk is of no help either, since the kernel is stuck during boot itself and I am not aware of any method to access the trace buffer before the system has booted completely. Looking for any ideas or suggestions to debug the problem further. |
Add an instruction or CSR to qemu which does printf on a register, then debug by doing csr writes in arch_idle? If you're feeling ambitious implement Liviu's semihosting proposal (already in openocd and should be findable on sw-dev, it's a small change to EBREAK translation) |
The timer interrupt pending bit is cleared in bbl while reprogramming the timer. This works fine unless we are in nohz mode. In nohz mode the timer is not reprogrammed, so the pending bit is not cleared, leading to continuous timer interrupt firing and CPU stalls. Clear the timer interrupt in the timer interrupt handler by calling into the SEE. This also introduces an SBI call to do this. This patch requires the following bbl fix: riscv-software-src/riscv-pk#108 Both the kernel and bbl patches are required to run the kernel in tickless mode on RISC-V. The details of the stalls can be found in riscvarchive#127 Signed-off-by: Atish Patra <[email protected]>
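The failure mode described in this commit message can be illustrated with a toy model. This is only a sketch of the reasoning, not kernel or bbl code: `ToyTimer`, `clear_pending` and `count_handler_entries` are hypothetical names invented for the illustration, and `clear_pending` merely stands in for the proposed "clear timer interrupt" SBI call.

```python
class ToyTimer:
    """Toy model of a timer whose pending bit stays set until something clears it."""

    def __init__(self):
        self.pending = False

    def expire(self):
        # The timer fires: the pending bit becomes set.
        self.pending = True

    def set_next_event(self):
        # Models bbl clearing the pending bit as a side effect of reprogramming.
        self.pending = False

    def clear_pending(self):
        # Hypothetical stand-in for the proposed "clear timer interrupt" SBI call.
        self.pending = False


def count_handler_entries(timer, reprogram, clear_in_handler, max_entries=1000):
    """Return how often the handler runs for one expiry (capped at max_entries)."""
    timer.expire()
    entries = 0
    while timer.pending and entries < max_entries:
        entries += 1
        if clear_in_handler:
            timer.clear_pending()
        if reprogram:
            timer.set_next_event()
    return entries


# Periodic tick: the handler reprograms the timer, so it runs once per expiry.
print(count_handler_entries(ToyTimer(), reprogram=True, clear_in_handler=False))
# nohz without the fix: nothing clears the pending bit, so the handler keeps running.
print(count_handler_entries(ToyTimer(), reprogram=False, clear_in_handler=False))
# nohz with an explicit clear in the handler: it runs once again.
print(count_handler_entries(ToyTimer(), reprogram=False, clear_in_handler=True))
```

The cap `max_entries` exists only so the model terminates; in the stalled kernel there is no such cap, which matches the rcu_sched stall warnings reported above.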
The timer interrupt pending bit is cleared in bbl while reprogramming the timer. This works fine unless we are in nohz mode. In nohz mode the timer is not reprogrammed, so the pending bit is not cleared, leading to continuous timer interrupt firing and CPU stalls. Disable the timer interrupt in the interrupt handler to ignore the pending bit until the next interrupt. The timer interrupt is re-enabled when the next timer event is set. The details of the stalls can be found in riscvarchive#127 Other possible ideas discussion: riscv-software-src/riscv-pk#108 Signed-off-by: Atish Patra <[email protected]>
Fixed in We can close this issue now. |
The timer interrupt pending bit is cleared in bbl while reprogramming the timer. This works fine unless we are in nohz mode. In nohz mode, the timer is not reprogrammed. Thus, the pending bits are not cleared leading to continuous timer interrupt firing and cpu stalls. Disable timer interrupt in interrupt handler to ignore the pending bit until next interrupt. Timer interrupt is enabled again before next timer event is set. The details of the stalls can be found in #127 Other possible ideas discussion: riscv-software-src/riscv-pk#108 Signed-off-by: Atish Patra <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
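The masking approach in this commit message can be sketched with a similar toy model. It is an illustration under stated assumptions rather than the actual patch: `ToyHart`, `handle_timer_interrupt` and `run` are hypothetical names, and the two flags are assumed to correspond to the supervisor timer pending and enable bits (STIP in `sip`, STIE in `sie`).

```python
class ToyHart:
    """Toy model of one hart's timer interrupt state (assumed STIP/STIE mapping)."""

    def __init__(self):
        self.timer_pending = False  # assumed to model sip.STIP
        self.timer_enabled = True   # assumed to model sie.STIE

    def interrupt_taken(self):
        # An interrupt is delivered only if it is both pending and enabled.
        return self.timer_pending and self.timer_enabled

    def timer_expires(self):
        self.timer_pending = True

    def set_next_event(self):
        # Re-enable the interrupt before programming the next event;
        # reprogramming (done by bbl) is what clears the pending bit.
        self.timer_enabled = True
        self.timer_pending = False


def handle_timer_interrupt(hart, mask_in_handler, reprogram):
    if mask_in_handler:
        # The fix: mask the interrupt so a stale pending bit is ignored.
        hart.timer_enabled = False
    if reprogram:
        hart.set_next_event()


def run(hart, mask_in_handler, reprogram, budget=1000):
    """Return how many interrupts are taken for one expiry (capped at budget)."""
    hart.timer_expires()
    taken = 0
    while hart.interrupt_taken() and taken < budget:
        taken += 1
        handle_timer_interrupt(hart, mask_in_handler, reprogram)
    return taken


hart = ToyHart()
print(run(hart, mask_in_handler=True, reprogram=False))       # one interrupt, then quiet
print(run(ToyHart(), mask_in_handler=False, reprogram=False))  # hits the budget cap
```

In this model the pending bit stays set after the masked handler returns, but it no longer causes an interrupt; it is cleared only when `set_next_event` runs, which is also where the interrupt is unmasked again.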
Adding that to the Fedora defconfig results in a kernel which fails to boot (hangs with high cpu usage immediately after "clocksource: Switched to clocksource riscv_clocksource").