Regression in 6.12.7: WARNING on boot at trace_event_raw_init: event xe_bo_move has unsafe dereference of argument 4
Description:
Since updating to 6.12.7, a kernel warning with a stack trace appears on every boot when the xe module is loaded. The issue appears to have already been bisected and diagnosed upstream.
Additional info:
- package version(s): 6.12.7-arch1-1
- link to upstream bug report, if any: https://lkml.org/lkml/2024/12/27/451 also on lore: https://lore.kernel.org/all/2e9332ab19c44918dbaacecd8c039fb0bbe6e1db.camel@sapience.com/
- proposed patch by upstream: https://lore.kernel.org/all/20241230145002.3cc11717@gandalf.local.home/
Stack-trace:
[ 3.677060] WARNING: CPU: 3 PID: 393 at kernel/trace/trace_events.c:577 trace_event_raw_init+0x159/0x660
[ 3.677066] Modules linked in: snd_soc_intel_hda_dsp_common(+) xe(+) drm_gpuvm drm_exec snd_hda_codec_hdmi gpu_sched drm_suballoc_helper snd_hda_codec_realtek drm_ttm_helper snd_hda_codec_generic snd_hda_scodec_component snd_soc_dmic intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel soundwire_cadence vfat snd_sof_intel_hda_common fat snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp intel_powerclamp coretemp snd_sof snd_sof_utils kvm_intel snd_soc_acpi_intel_match soundwire_generic_allocation snd_soc_acpi soundwire_bus kvm snd_soc_avs crct10dif_pclmul snd_soc_hda_codec snd_hda_ext_core crc32_pclmul snd_ctl_led snd_soc_core polyval_clmulni snd_compress joydev polyval_generic ac97_bus iwlmvm i915 snd_pcm_dmaengine ghash_clmulni_intel mousedev snd_hda_intel sha512_ssse3 btusb snd_intel_dspcfg sha1_ssse3 btrtl snd_intel_sdw_acpi mac80211 aesni_intel btintel
[ 3.677097] gf128mul snd_hda_codec uvcvideo crypto_simd btbcm snd_hda_core drm_buddy videobuf2_vmalloc hid_multitouch btmtk cryptd processor_thermal_device_pci_legacy uvc libarc4 hid_generic videobuf2_memops i2c_algo_bit processor_thermal_device iTCO_wdt snd_hwdep mei_hdcp videobuf2_v4l2 mei_wdt bluetooth rapl mei_pxp iwlwifi processor_thermal_wt_hint intel_pmc_bxt snd_pcm ttm videobuf2_common processor_thermal_rfim intel_cstate iTCO_vendor_support thinkpad_acpi think_lmi(+) ucsi_acpi processor_thermal_rapl videodev mc crc16 intel_rapl_msr firmware_attributes_class wmi_bmof e1000e drm_display_helper snd_timer typec_ucsi cfg80211 intel_uncore platform_profile intel_lpss_pci spi_nor i2c_i801 intel_rapl_common mei_me snd pcspkr cec intel_lpss i2c_smbus processor_thermal_wt_req typec ptp psmouse mei mtd processor_thermal_power_floor soundcore i2c_mux idma64 thunderbolt pps_core intel_gtt roles rfkill processor_thermal_mbox video intel_soc_dts_iosf i2c_hid_acpi igen6_edac i2c_hid int3403_thermal int340x_thermal_zone
[ 3.677124] intel_pmc_core intel_vsec pmt_telemetry pmt_class wmi int3400_thermal intel_hid acpi_tad acpi_thermal_rel acpi_pad pinctrl_tigerlake sparse_keymap mac_hid pkcs8_key_parser sg crypto_user dm_mod loop nfnetlink zram 842_decompress 842_compress lz4hc_compress lz4_compress ip_tables x_tables serio_raw atkbd libps2 nvme vivaldi_fmap nvme_core spi_intel_pci sha256_ssse3 spi_intel i8042 nvme_auth serio btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq
[ 3.677140] CPU: 3 UID: 0 PID: 393 Comm: (udev-worker) Not tainted 6.12.7-arch1-1 #1 9e77c5d99557be92f482a3ac6317d887bb3ffaf9
[ 3.677143] Hardware name: LENOVO 20WNS5PY00/20WNS5PY00, BIOS N35ET59W (1.59 ) 07/16/2024
[ 3.677144] RIP: 0010:trace_event_raw_init+0x159/0x660
[ 3.677146] Code: 89 ea 0f 83 3b 04 00 00 e8 44 db ff ff 84 c0 74 10 8b 0c 24 48 c7 c0 fe ff ff ff 48 d3 c0 49 21 c6 4d 85 f6 0f 84 d6 fe ff ff <0f> 0b bb 01 00 00 00 41 f6 c6 01 0f 85 c7 4e c0 00 66 0f 1f 44 00
[ 3.677147] RSP: 0018:ffffa784c18ef9f0 EFLAGS: 00010206
[ 3.677148] RAX: ffffffffffffffdf RBX: ffffffffc2147731 RCX: 0000000000000005
[ 3.677149] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffffffffc2147727
[ 3.677150] RBP: ffffffffc2147640 R08: 0000000000000039 R09: 0000000000000000
[ 3.677151] R10: 0000000000000076 R11: 000000000000004e R12: 00000000000000f2
[ 3.677151] R13: ffffffffc2148760 R14: 0000000000000018 R15: 0000000000000000
[ 3.677152] FS: 000071c67c107880(0000) GS:ffff9a860f580000(0000) knlGS:0000000000000000
[ 3.677154] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.677154] CR2: 0000775004ff9168 CR3: 0000000103bec006 CR4: 0000000000f72ef0
[ 3.677156] PKRU: 55555554
[ 3.677156] Call Trace:
[ 3.677158] <TASK>
[ 3.677159] ? trace_event_raw_init+0x159/0x660
[ 3.677160] ? __warn.cold+0x93/0xf6
[ 3.677163] ? trace_event_raw_init+0x159/0x660
[ 3.677165] ? report_bug+0xff/0x140
[ 3.677168] ? handle_bug+0x58/0x90
[ 3.677170] ? exc_invalid_op+0x17/0x70
[ 3.677172] ? asm_exc_invalid_op+0x1a/0x20
[ 3.677176] ? trace_event_raw_init+0x159/0x660
[ 3.677177] event_init+0x28/0x70
[ 3.677179] trace_module_notify+0x1a4/0x260
[ 3.677181] notifier_call_chain+0x5a/0xd0
[ 3.677184] blocking_notifier_call_chain_robust+0x65/0xc0
[ 3.677186] load_module+0x1822/0x1cf0
[ 3.677189] ? init_module_from_file+0x89/0xe0
[ 3.677191] init_module_from_file+0x89/0xe0
[ 3.677192] idempotent_init_module+0x11e/0x310
[ 3.677194] __x64_sys_finit_module+0x5e/0xb0
[ 3.677195] do_syscall_64+0x82/0x190
[ 3.677197] ? syscall_exit_to_user_mode+0x37/0x1c0
[ 3.677199] ? do_syscall_64+0x8e/0x190
[ 3.677200] ? complete+0x1c/0x90
[ 3.677201] ? __rseq_handle_notify_resume+0xa2/0x4a0
[ 3.677204] ? switch_fpu_return+0x4e/0xd0
[ 3.677206] ? arch_exit_to_user_mode_prepare.isra.0+0x79/0x90
[ 3.677208] ? syscall_exit_to_user_mode+0x37/0x1c0
[ 3.677209] ? clear_bhb_loop+0x25/0x80
[ 3.677211] ? clear_bhb_loop+0x25/0x80
[ 3.677212] ? clear_bhb_loop+0x25/0x80
[ 3.677213] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 3.677216] RIP: 0033:0x71c67c2f81fd
[ 3.677245] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e3 fa 0c 00 f7 d8 64 89 01 48
[ 3.677246] RSP: 002b:00007ffcb161e0e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 3.677248] RAX: ffffffffffffffda RBX: 00005f1adbe47410 RCX: 000071c67c2f81fd
[ 3.677249] RDX: 0000000000000004 RSI: 000071c67bb0b05d RDI: 0000000000000039
[ 3.677249] RBP: 00007ffcb161e1a0 R08: 0000000000000001 R09: 00007ffcb161e130
[ 3.677250] R10: 0000000000000040 R11: 0000000000000246 R12: 000071c67bb0b05d
[ 3.677251] R13: 0000000000020000 R14: 00005f1adbe47dd0 R15: 00005f1adbe48a90
[ 3.677252] </TASK>
[ 3.677252] ---[ end trace 0000000000000000 ]---
[ 3.677253] event xe_bo_move has unsafe dereference of argument 4
[ 3.677254] print_fmt: "move_lacks_source:%s, migrate object %p [size %zu] from %s to %s device_id:%s", REC->move_lacks_source ? "yes" : "no", REC->bo, REC->size, xe_mem_type_to_name[REC->old_placement], xe_mem_type_to_name[REC->new_placement], __get_str(device_id)
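To make the last two log lines easier to read, here is a minimal sketch (plain Python, not kernel code) that pairs each conversion specifier in the print_fmt above with the argument it consumes. It assumes that the argument number in the warning is 1-based and counts the arguments that follow the format string. That is my reading of the message, not something the log itself states. Under that assumption, argument 4 is the first xe_mem_type_to_name[...] lookup, which is passed to %s. The helper names split_print_fmt and pair_specifiers are made up for illustration.

```python
import re

# print_fmt copied verbatim from the log line above.
PRINT_FMT = r'''"move_lacks_source:%s, migrate object %p [size %zu] from %s to %s device_id:%s", REC->move_lacks_source ? "yes" : "no", REC->bo, REC->size, xe_mem_type_to_name[REC->old_placement], xe_mem_type_to_name[REC->new_placement], __get_str(device_id)'''

# Simplified printf-style specifier pattern: flags, width, precision,
# length modifier, conversion letter. Kernel %p extensions are not handled.
SPEC_RE = re.compile(r'%[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l|z|j|t|L)?[a-zA-Z]')


def split_print_fmt(print_fmt):
    """Split print_fmt into (format string, [argument expressions]).

    Commas inside quotes, parentheses or brackets do not split.
    """
    parts, depth, in_quote, start = [], 0, False, 0
    for i, ch in enumerate(print_fmt):
        if ch == '"' and (i == 0 or print_fmt[i - 1] != '\\'):
            in_quote = not in_quote
        elif in_quote:
            continue
        elif ch in '([':
            depth += 1
        elif ch in ')]':
            depth -= 1
        elif ch == ',' and depth == 0:
            parts.append(print_fmt[start:i].strip())
            start = i + 1
    parts.append(print_fmt[start:].strip())
    return parts[0].strip('"'), parts[1:]


def pair_specifiers(print_fmt):
    """Return a list of (specifier, argument expression) pairs in order."""
    fmt, args = split_print_fmt(print_fmt)
    return list(zip(SPEC_RE.findall(fmt), args))


if __name__ == "__main__":
    # Assumption: the warning numbers arguments starting at 1.
    for number, (spec, arg) in enumerate(pair_specifiers(PRINT_FMT), start=1):
        print(f"argument {number}: {spec:4} <- {arg}")
```

Run as a script, this lists six pairs. The fourth is `%s <- xe_mem_type_to_name[REC->old_placement]`, a string pointer taken from a driver array rather than copied into the trace record via `__get_str()`, which would fit the "unsafe dereference" wording. Whether that is actually unsafe or a false positive of the check is what the upstream thread linked above discusses.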
Edited by Žilvinas Vaiciukevičius