CPU scheduling problems
Description:
After upgrading from linux-6.8.9.arch1-2-x86_64 to linux-6.9.1.arch1-1-x86_64, the kernel started having problems with scheduling.
Additional info:
The bug also occurs on the latest kernel version as of right now (linux-6.9.1.arch1-2-x86_64). I'm using a GA-78LMT-USB3 motherboard with an AMD FX 8300 CPU. I am still unsure what exactly is causing this bug. The only place I've found it mentioned is in this Reddit post. CJPeter1, the author of that Reddit post, is experiencing similar problems with the same CPU.
Here are some logs related to the scheduling problems:
May 23 23:36:49 archlinux kernel: smp: Bringing up secondary CPUs ...
May 23 23:36:49 archlinux kernel: smpboot: x86: Booting SMP configuration:
May 23 23:36:49 archlinux kernel: .... node #0, CPUs: #2 #4 #6
May 23 23:36:49 archlinux kernel: __common_interrupt: 2.55 No irq handler for vector
May 23 23:36:49 archlinux kernel: __common_interrupt: 4.55 No irq handler for vector
May 23 23:36:49 archlinux kernel: __common_interrupt: 6.55 No irq handler for vector
May 23 23:36:49 archlinux kernel: #1 #3 #5 #7
May 23 23:36:49 archlinux kernel: ------------[ cut here ]------------
May 23 23:36:49 archlinux kernel: WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:6482 sched_cpu_starting+0x183/0x250
May 23 23:36:49 archlinux kernel: Modules linked in:
May 23 23:36:49 archlinux kernel: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.9.1-arch1-2 #1 06928436e5a6b4805e171d14d8efa397d7db9ad0
May 23 23:36:49 archlinux kernel: Hardware name: Gigabyte Technology Co., Ltd. GA-78LMT-USB3 R2/GA-78LMT-USB3 R2, BIOS F1 11/08/2017
May 23 23:36:49 archlinux kernel: RIP: 0010:sched_cpu_starting+0x183/0x250
May 23 23:36:49 archlinux kernel: Code: 00 8b 0d b0 e0 10 02 39 c8 0f 83 71 ff ff ff 48 63 d0 48 8b 3c d5 20 53 71 a6 4c 01 e7 39 c3 75 c7 4c 89 bf 48 0d 00 00 eb c7 <0f> 0b eb c3 be 04 00>
May 23 23:36:49 archlinux kernel: RSP: 0000:ffffa6a3c00cfe38 EFLAGS: 00010087
May 23 23:36:49 archlinux kernel: RAX: 0000000000000002 RBX: 0000000000000001 RCX: 0000000000000008
May 23 23:36:49 archlinux kernel: RDX: 0000000000000002 RSI: fffffffffffffffc RDI: ffff9367aeb36540
May 23 23:36:49 archlinux kernel: RBP: ffff9367aea99b00 R08: ffff9367aea99b00 R09: 0000000000000003
May 23 23:36:49 archlinux kernel: R10: ffff9367aea99b00 R11: 0000000000000006 R12: 0000000000036540
May 23 23:36:49 archlinux kernel: R13: 0000000000036540 R14: 0000000000000001 R15: ffff9367aea36540
May 23 23:36:49 archlinux kernel: FS: 0000000000000000(0000) GS:ffff9367aea80000(0000) knlGS:0000000000000000
May 23 23:36:49 archlinux kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 23 23:36:49 archlinux kernel: CR2: 0000000000000000 CR3: 000000036dc20000 CR4: 00000000000406f0
May 23 23:36:49 archlinux kernel: Call Trace:
May 23 23:36:49 archlinux kernel: <TASK>
May 23 23:36:49 archlinux kernel: ? sched_cpu_starting+0x183/0x250
May 23 23:36:49 archlinux kernel: ? __warn.cold+0x8e/0xe8
May 23 23:36:49 archlinux kernel: ? sched_cpu_starting+0x183/0x250
May 23 23:36:49 archlinux kernel: ? report_bug+0xff/0x140
May 23 23:36:49 archlinux kernel: ? handle_bug+0x3c/0x80
May 23 23:36:49 archlinux kernel: ? exc_invalid_op+0x17/0x70
May 23 23:36:49 archlinux kernel: ? asm_exc_invalid_op+0x1a/0x20
May 23 23:36:49 archlinux kernel: ? sched_cpu_starting+0x183/0x250
May 23 23:36:49 archlinux kernel: ? sched_cpu_starting+0x15a/0x250
May 23 23:36:49 archlinux kernel: ? __pfx_sched_cpu_starting+0x10/0x10
May 23 23:36:49 archlinux kernel: cpuhp_invoke_callback+0x122/0x410
May 23 23:36:49 archlinux kernel: __cpuhp_invoke_callback_range+0x64/0xc0
May 23 23:36:49 archlinux kernel: start_secondary+0x9c/0x140
May 23 23:36:49 archlinux kernel: common_startup_64+0x13e/0x141
May 23 23:36:49 archlinux kernel: </TASK>
May 23 23:36:49 archlinux kernel: ---[ end trace 0000000000000000 ]---
May 23 23:36:49 archlinux kernel: __common_interrupt: 1.55 No irq handler for vector
May 23 23:36:49 archlinux kernel: __common_interrupt: 3.55 No irq handler for vector
May 23 23:36:49 archlinux kernel: __common_interrupt: 5.55 No irq handler for vector
May 23 23:36:49 archlinux kernel: __common_interrupt: 7.55 No irq handler for vector
May 23 23:36:49 archlinux kernel: smp: Brought up 1 node, 8 CPUs
May 23 23:36:49 archlinux kernel: smpboot: Total of 8 processors activated (53173.28 BogoMIPS)
May 23 23:36:49 archlinux kernel: ------------[ cut here ]------------
May 23 23:36:49 archlinux kernel: WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2408 build_sched_domains+0x76b/0x12b0
May 23 23:36:49 archlinux kernel: Modules linked in:
May 23 23:36:49 archlinux kernel: CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.9.1-arch1-2 #1 06928436e5a6b4805e171d14d8efa397d7db9ad0
May 23 23:36:49 archlinux kernel: Hardware name: Gigabyte Technology Co., Ltd. GA-78LMT-USB3 R2/GA-78LMT-USB3 R2, BIOS F1 11/08/2017
May 23 23:36:49 archlinux kernel: RIP: 0010:build_sched_domains+0x76b/0x12b0
May 23 23:36:49 archlinux kernel: Code: 63 4d 14 39 34 8a 0f 8e 73 fe ff ff 25 e9 ef ff ff 80 cc 04 41 89 46 3c e9 62 fe ff ff 41 c7 46 30 01 00 00 00 e9 55 fe ff ff <0f> 0b bb f4 ff ff ff>
May 23 23:36:49 archlinux kernel: RSP: 0018:ffffa6a3c001fe10 EFLAGS: 00010202
May 23 23:36:49 archlinux kernel: RAX: 00000000ffffff01 RBX: 0000000000000000 RCX: 00000000ffffff01
May 23 23:36:49 archlinux kernel: RDX: 00000000fffffff8 RSI: 0000000000000003 RDI: ffff9367aea19b00
May 23 23:36:49 archlinux kernel: RBP: ffff936480367a00 R08: ffff9367aea19b00 R09: 0000000000000000
May 23 23:36:49 archlinux kernel: R10: ffffa6a3c001fdd8 R11: 0000000000000000 R12: 0000000000000001
May 23 23:36:49 archlinux kernel: R13: ffff9367aea99b00 R14: 0000000000000001 R15: ffff936480942e80
May 23 23:36:49 archlinux kernel: FS: 0000000000000000(0000) GS:ffff9367aea00000(0000) knlGS:0000000000000000
May 23 23:36:49 archlinux kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 23 23:36:49 archlinux kernel: CR2: ffff9366eea01000 CR3: 000000036dc20000 CR4: 00000000000406f0
May 23 23:36:49 archlinux kernel: Call Trace:
May 23 23:36:49 archlinux kernel: <TASK>
May 23 23:36:49 archlinux kernel: ? build_sched_domains+0x76b/0x12b0
May 23 23:36:49 archlinux kernel: ? __warn.cold+0x8e/0xe8
May 23 23:36:49 archlinux kernel: ? build_sched_domains+0x76b/0x12b0
May 23 23:36:49 archlinux kernel: ? report_bug+0xff/0x140
May 23 23:36:49 archlinux kernel: ? handle_bug+0x3c/0x80
May 23 23:36:49 archlinux kernel: ? exc_invalid_op+0x17/0x70
May 23 23:36:49 archlinux kernel: ? asm_exc_invalid_op+0x1a/0x20
May 23 23:36:49 archlinux kernel: ? build_sched_domains+0x76b/0x12b0
May 23 23:36:49 archlinux kernel: ? kmalloc_trace+0x13a/0x320
May 23 23:36:49 archlinux kernel: sched_init_smp+0x3e/0xc0
May 23 23:36:49 archlinux kernel: ? stop_machine+0x30/0x40
May 23 23:36:49 archlinux kernel: kernel_init_freeable+0x109/0x250
May 23 23:36:49 archlinux kernel: ? __pfx_kernel_init+0x10/0x10
May 23 23:36:49 archlinux kernel: kernel_init+0x1a/0x140
May 23 23:36:49 archlinux kernel: ret_from_fork+0x34/0x50
May 23 23:36:49 archlinux kernel: ? __pfx_kernel_init+0x10/0x10
May 23 23:36:49 archlinux kernel: ret_from_fork_asm+0x1a/0x30
May 23 23:36:49 archlinux kernel: </TASK>
May 23 23:36:49 archlinux kernel: ---[ end trace 0000000000000000 ]---
After the upgrade, the kernel also reported a bunch of ATA bus errors:
May 23 23:36:59 archlinux kernel: ata2.00: exception Emask 0x10 SAct 0x1fffe000 SErr 0x40d0002 action 0xe frozen
May 23 23:36:59 archlinux kernel: ata2.00: irq_stat 0x00000040, connection status changed
May 23 23:36:59 archlinux kernel: ata2: SError: { RecovComm PHYRdyChg CommWake 10B8B DevExch }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/10:68:10:f8:07/00:00:00:00:00/40 tag 13 ncq dma 8192 out
res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:70:28:f8:07/00:00:00:00:00/40 tag 14 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/18:78:50:f8:07/00:00:00:00:00/40 tag 15 ncq dma 12288 out
res 40/00:68:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:80:e0:1a:09/00:00:00:00:00/40 tag 16 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:88:70:f8:47/00:00:01:00:00/40 tag 17 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:90:08:f8:c7/00:00:01:00:00/40 tag 18 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:98:80:f8:c7/00:00:01:00:00/40 tag 19 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/18:a0:00:f9:c7/00:00:01:00:00/40 tag 20 ncq dma 12288 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/10:a8:a0:f9:c7/00:00:01:00:00/40 tag 21 ncq dma 8192 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:b0:e8:f9:c7/00:00:01:00:00/40 tag 22 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:b8:80:fa:c7/00:00:01:00:00/40 tag 23 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:c0:a0:fa:c7/00:00:01:00:00/40 tag 24 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:c8:40:fd:c7/00:00:01:00:00/40 tag 25 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:d0:a8:fe:c7/00:00:01:00:00/40 tag 26 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:d8:00:f9:c8/00:00:01:00:00/40 tag 27 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
May 23 23:36:59 archlinux kernel: ata2.00: failed command: WRITE FPDMA QUEUED
May 23 23:36:59 archlinux kernel: ata2.00: cmd 61/08:e0:b8:f9:c8/00:00:01:00:00/40 tag 28 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
May 23 23:36:59 archlinux kernel: ata2.00: status: { DRDY }
Steps to reproduce:
- Run the AMD FX 8300 on a kernel newer than linux-6.8.9.arch1-2-x86_64
- See the warning/error messages in journalctl
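To make the second step easier to repeat, here is a minimal Python sketch that pulls the `WARNING:` lines out of the current boot's kernel log. It is only an illustration: the names `WARNING_RE`, `find_kernel_warnings` and `read_kernel_log` are made up for this example, and the regular expression assumes the warning lines look like the ones in the logs above.

```python
import re
import subprocess

# Matches lines such as:
#   WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:6482 sched_cpu_starting+0x183/0x250
# (format assumed from the logs in this report)
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<symbol>[^\s+]+)"
)


def find_kernel_warnings(lines):
    """Return (cpu, source file, line number, function) for each WARNING line."""
    hits = []
    for text in lines:
        m = WARNING_RE.search(text)
        if m:
            hits.append((int(m["cpu"]), m["file"], int(m["line"]), m["symbol"]))
    return hits


def read_kernel_log():
    """Read the kernel messages of the current boot via journalctl."""
    result = subprocess.run(
        ["journalctl", "-k", "-b", "--no-pager"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for cpu, path, line, symbol in find_kernel_warnings(read_kernel_log()):
        print(f"CPU {cpu}: {path}:{line} in {symbol}")
```

On an affected kernel this should list the two scheduler warnings shown above (in `sched_cpu_starting` and `build_sched_domains`); on linux-6.8.9.arch1-2-x86_64 I would expect it to print nothing for them.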
Steps to remedy:
Downgrade the kernel to version linux-6.8.9.arch1-2-x86_64.