This very much looks like a kernel regression, which should be bisected and reported upstream to the kernel developers and the regression mailing list.
Are you comfortable doing the bisection on your own, or do you need some help?
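For anyone who has not bisected before, the mechanics can be tried on a throwaway repository first. The following is only a toy sketch: the repository, the `state` file and the "commit N" history are made up for illustration and have nothing to do with the kernel tree. It uses `git bisect run`, which is only practical when the test can be automated.

```shell
#!/bin/sh
# Toy bisection on a throwaway repository (NOT the kernel tree).
# All file names and commits here are hypothetical, for illustration only.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Bisect Demo"

# Eight commits; the fifth one introduces the simulated regression.
i=1
while [ "$i" -le 8 ]; do
    if [ "$i" -ge 5 ]; then echo hang > state; else echo ok > state; fi
    echo "$i" > counter
    git add state counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done

good=$(git rev-list --max-parents=0 HEAD)   # oldest commit, known good
bad=$(git rev-parse HEAD)                   # newest commit, known bad

git bisect start "$bad" "$good" >/dev/null
# "git bisect run" marks a revision good when the command exits 0 and
# bad when it exits with 1-127 (125 means "skip this revision").
git bisect run grep -q ok state >/dev/null

first_bad=$(git rev-parse refs/bisect/bad)
first_bad_subject=$(git log -1 --format=%s "$first_bad")
echo "first bad commit: $first_bad_subject"

git bisect reset >/dev/null 2>&1
```

On the kernel tree the test is a build plus a reboot, so each step is normally marked by hand with `git bisect good` or `git bisect bad` instead of `git bisect run`.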
If you want, we could also provide you with prebuilt kernel images to test.
Note that this installs as pkgbase linux-mainline (/boot/vmlinuz-linux-mainline), so you might need to teach your bootloader how to boot that (e.g. by running grub-mkconfig or creating the needed loader entry for systemd-boot).
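For systemd-boot, a loader entry along the following lines should work. This is a minimal sketch with assumptions: the initramfs name `/initramfs-linux-mainline.img` follows the usual Arch naming for the pkgbase, the `root=` value is a placeholder you must replace, and the helper `write_mainline_entry` is a made-up name. The example run writes into a scratch directory, not into /boot.

```shell
#!/bin/sh
# Sketch: write a systemd-boot loader entry for the linux-mainline package.
# The initramfs file name and the root= value are assumptions/placeholders.
set -eu

# write_mainline_entry ENTRIES_DIR ROOT_SPEC  (hypothetical helper)
write_mainline_entry() {
    entries_dir=$1
    root_spec=$2
    mkdir -p "$entries_dir"
    cat > "$entries_dir/linux-mainline.conf" <<EOF
title   Arch Linux (linux-mainline)
linux   /vmlinuz-linux-mainline
initrd  /initramfs-linux-mainline.img
options root=$root_spec rw
EOF
}

# Example run against a scratch directory; on a real system the target
# would be the loader/entries directory on the boot partition.
demo_dir=$(mktemp -d)
write_mainline_entry "$demo_dir/loader/entries" "UUID=REPLACE-WITH-YOUR-ROOT-UUID"
cat "$demo_dir/loader/entries/linux-mainline.conf"
```

The simplest way to get the `options` line right is to copy it from your existing, working entry. With GRUB, regenerating the config as root (on Arch typically `grub-mkconfig -o /boot/grub/grub.cfg`) should pick up the new kernel.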
After each test it would be good to get a quick message with feedback on what worked and what was broken.
>Note that this installs as pkgbase linux-mainline (/boot/vmlinuz-linux-mainline) so you might need to teach your bootloader how to boot that (i.e. by running grub-mkconfig or creating the needed loader entry for systemd-boot)
I didn't realise this so I re-tested mainline-6.10rc6-1 - still hangs
f0551af021308a2a1163dc63d1f1bba3594208bd is the first bad commit
commit f0551af021308a2a1163dc63d1f1bba3594208bd
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Mar 6 12:17:02 2024 +0100

    x86/topology: Ignore non-present APIC IDs in a present package

    Borislav reported that one of his systems has a broken MADT table which
    advertises eight present APICs and 24 non-present APICs in the same
    package.

    The non-present ones are considered hot-pluggable by the topology
    evaluation code, which is obviously bogus as there is no way to hot-plug
    within the same package.

    As the topology evaluation code accounts for hot-pluggable CPUs in a
    package, the maximum number of cores per package is computed wrong, which
    in turn causes the uncore performance counter driver to access
    non-existing MSRs. It will probably confuse other entities which rely on
    the maximum number of cores and threads per package too.

    Cure this by ignoring hot-pluggable APIC IDs within a present package.

    In theory it would be reasonable to just do this unconditionally, but then
    there is this thing called reality^Wvirtualization which ruins
    everything. Virtualization is the only existing user of "physical" hotplug
    and the virtualization tools allow the above scenario. Whether that is
    actually in use or not is unknown.

    As it can be argued that the virtualization case is not affected by the
    issues which exposed the reported problem, allow the bogosity if the kernel
    determined that it is running in a VM for now.

    Fixes: 89b0f15f408f ("x86/cpu/topology: Get rid of cpuinfo::x86_max_cores")
    Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: https://lore.kernel.org/r/87a5nbvccx.ffs@tglx

 arch/x86/kernel/cpu/topology.c | 39 ++++++++++++++++++++++++++++++---------
 1 file changed, 30 insertions(+), 9 deletions(-)
Could you please check if a kernel built with the above commit reverted works:
We will now start to collect all the necessary information for the bug report
Could you install aur/cpuid and provide its output?
Could you provide the dmesg of a working boot?
Could you provide the output of cat /sys/kernel/debug/x86/topo/cpus/* from a working boot?
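To collect everything requested above in one go, something like the following could be run as root from a working boot. It is only a sketch: the output directory name and the `capture` helper are made up, and it assumes the `cpuid` tool is already installed and that debugfs is mounted at /sys/kernel/debug. A command that fails is noted in its output file instead of aborting the script.

```shell
#!/bin/sh
# Sketch: gather the requested diagnostics into one directory.
# Directory name and helper are hypothetical; run as root on a working boot.
set -u

out=${1:-./topology-report}
mkdir -p "$out"

# capture NAME COMMAND...: run COMMAND, store stdout and stderr in
# $out/NAME, and append a note instead of aborting if it fails.
capture() {
    name=$1
    shift
    if ! "$@" > "$out/$name" 2>&1; then
        echo "[capture] '$*' failed" >> "$out/$name"
    fi
}

capture uname.txt uname -a
capture cpuid.txt cpuid
capture dmesg.txt dmesg
capture topo.txt sh -c 'cat /sys/kernel/debug/x86/topo/cpus/*'

ls -l "$out"
```

The resulting files can then be attached to the thread or to the upstream report.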
Also, do you want to write the bug report to the upstream kernel devs yourself, or should I do it? If I should do it, it would be good if you could provide me with an email address I can CC and attribute with Reported-by: Rob Newcater <rob@example.org>.
Commit 48525fd ("x86/cpu: Provide debug interface") added the interface in 6.7, and the interface is still present. Perhaps the bug is causing the interface to not be created, or some hardening is hiding it?
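A quick way to narrow this down is to check, as root, which kernel is actually running and whether the directory exists at all. This is a rough sketch: `topo_status` and its three status strings are made up for illustration, and the check cannot say why the directory is missing. Note that debugfs is normally readable by root only, so an unprivileged run will report the directory as missing.

```shell
#!/bin/sh
# Sketch: report whether the topology debugfs directory is visible.
# Helper name and status strings are hypothetical; run as root.
set -u

# topo_status DEBUGFS_ROOT: prints one of
#   debugfs-missing, topo-missing, topo-present
topo_status() {
    root=$1
    if [ ! -d "$root" ]; then
        echo "debugfs-missing"
    elif [ ! -d "$root/x86/topo/cpus" ]; then
        echo "topo-missing"
    else
        echo "topo-present"
    fi
}

echo "kernel: $(uname -r)"
echo "status: $(topo_status /sys/kernel/debug)"
```

If the status is topo-missing on a kernel older than 6.7, that is expected, since the interface did not exist yet.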
Apologies, I went too far back. Just booted off 6.9.9 (requires "iommu=off") and was able to get the required data. I have also attached an updated dmesg from 6.9.9 in case it is required, as the previous dmesg was from LTS.