This project is mirrored from https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git.
Pull mirroring updated.
- Sep 23, 2024
-
Daniel Wagner authored
Signed-off-by:
Daniel Wagner <wagi@monom.org>
-
Daniel Wagner authored
This reverts commit 56e89498. The tree already contains the migrate_disable/enable() helpers, so this stable backport conflicts with the existing definition (the compiler complains about a conflicting definition). We therefore don't need the backported functions and can avoid the conflict by simply dropping the backport. Signed-off-by:
Daniel Wagner <wagi@monom.org>
-
Brennan Lamoreaux (VMware) authored
Upstream commit d8bb65ab ("workqueue: Use rcuwait for wq_manager_wait") replaced the waitqueue with rcuwait in the workqueue code. This change involved removing the acquisition of pool->lock in put_unbound_pool(), as it also adds the function wq_manager_inactive() which acquires this same lock and is called one line later as a parameter to rcu_wait_event(). However, the backport of this commit in the PREEMPT_RT patchset 4.19.255-rt114 (patch 347) missed the removal of the acquisition of pool->lock in put_unbound_pool(). This leads to a deadlock due to recursive locking of pool->lock, as shown below in lockdep: [ 252.083713] WARNING: possible recursive locking detected [ 252.083718] 4.19.269-3.ph3-rt #1-photon Not tainted [ 252.083721] -------------------------------------------- [ 252.083733] kworker/2:0/33 is trying to acquire lock: [ 252.083747] 000000000b7b1ceb (&pool->lock/1){....}, at: put_unbound_pool+0x10d/0x260 [ 252.083857] but task is already holding lock: [ 252.083860] 000000000b7b1ceb (&pool->lock/1){....}, at: put_unbound_pool+0xbd/0x260 [ 252.083876] other info that might help us debug this: [ 252.083897] Possible unsafe locking scenario: [ 252.083900] CPU0 [ 252.083903] ---- [ 252.083904] lock(&pool->lock/1); [ 252.083911] lock(&pool->lock/1); [ 252.083919] *** DEADLOCK *** [ 252.083921] May be due to missing lock nesting notation Fix this deadlock by removing the pool->lock acquisition in put_unbound_pool(). Signed-off-by:
Brennan Lamoreaux (VMware) <brennanlamoreaux@gmail.com> Cc: Daniel Wagner <wagi@monom.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Reviewed-by:
Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Link: https://lore.kernel.org/r/20230228224938.88035-1-brennanlamoreaux@gmail.com Signed-off-by:
Daniel Wagner <wagi@monom.org>
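
For illustration only, a minimal userspace analogue of the reported self-deadlock (all names are hypothetical stand-ins for pool->lock and wq_manager_inactive()): a condition helper re-acquires a non-recursive mutex its caller already holds, so the second lock never returns.

  #include <pthread.h>
  #include <stdio.h>

  static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

  /* stand-in for wq_manager_inactive(): takes pool_lock on its own */
  static int manager_inactive(void)
  {
          pthread_mutex_lock(&pool_lock);
          int inactive = 1;
          pthread_mutex_unlock(&pool_lock);
          return inactive;
  }

  int main(void)
  {
          /* the leftover acquisition the backport failed to remove */
          pthread_mutex_lock(&pool_lock);

          /* second acquisition of the same non-recursive lock: blocks
           * forever, mirroring the recursive-locking splat above */
          if (manager_inactive())
                  puts("never reached");

          pthread_mutex_unlock(&pool_lock);
          return 0;
  }

Compile with -pthread; running it simply hangs, which is the recursive-locking scenario lockdep reports.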
-
Ben Hutchings authored
This reverts commit 0d796a9e. After merging stable release 4.19.266 into the -rt branch, an x86 build will fail with the following error:

 .../include/linux/percpu-defs.h:49:34: error: 'PER_CPU_BASE_SECTION' undeclared here (not in a function); did you mean 'PER_CPU_FIRST_SECTION'?

This is due to an #include loop:

 <asm/percpu.h> -> <linux/irqflags.h> -> <asm/irqflags.h> -> <asm/nospec-branch.h> -> <asm/percpu.h>

which appears after the merge because:
- The reverted commit added <asm/percpu.h> -> <linux/irqflags.h>
- 4.19.266 added <asm/nospec-branch.h> -> <asm/percpu.h>

Neither upstream nor any other maintained stable-rt branch has this include, and my build succeeded without it. Revert it here as well. Signed-off-by:
Ben Hutchings <ben@decadent.org.uk> Link: https://lore.kernel.org/r/Y5O/aVw/zHKqmpu7@decadent.org.uk Signed-off-by:
Daniel Wagner <wagi@monom.org>
-
Sebastian Andrzej Siewior authored
Upstream commit c725dafc PREEMPT_RT does not spin and wait until a running timer completes its callback but instead it blocks on a sleeping lock to prevent a livelock in the case that the task waiting for the callback completion preempted the callback. This cannot be done for timers flagged with TIMER_IRQSAFE. These timers can be canceled from an interrupt disabled context even on RT kernels. The expiry callback of such timers is invoked with interrupts disabled so there is no need to use the expiry lock mechanism because obviously the callback cannot be preempted even on RT kernels. Do not use the timer_base::expiry_lock mechanism when waiting for a running callback to complete if the timer is flagged with TIMER_IRQSAFE. Also add a lockdep assertion for RT kernels to validate that the expiry lock mechanism is always invoked in preemptible context. [ bigeasy: Dropping that lockdep_assert_preemption_enabled() check in backport ] Reported-by:
Mike Galbraith <efault@gmx.de> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201103190937.hga67rqhvknki3tp@linutronix.de Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Daniel Wagner <wagi@monom.org>
-
Thomas Gleixner authored
Upstream commit bb7262b2

syzbot reported KCSAN data races vs. timer_base::timer_running being set to NULL without holding base::lock in expire_timers(). This looks innocent and most reads are clearly not problematic, but Frederic identified an issue which is:

 int data = 0;

 void timer_func(struct timer_list *t)
 {
    data = 1;
 }

 CPU 0                                            CPU 1
 ------------------------------                   --------------------------
 base = lock_timer_base(timer, &flags);           raw_spin_unlock(&base->lock);
 if (base->running_timer != timer)                call_timer_fn(timer, fn, baseclk);
   ret = detach_if_pending(timer, base, true);    base->running_timer = NULL;
 raw_spin_unlock_irqrestore(&base->lock, flags);  raw_spin_lock(&base->lock);

 x = data;

If the timer has previously executed on CPU 1, then CPU 0 can observe base->running_timer == NULL and return, assuming the timer has completed, but seeing the callback's effect (x == 1 above) is not guaranteed on all architectures. The comment for del_timer_sync() makes that guarantee. Moving the assignment under base->lock prevents this.

For non-RT kernels it's performance-wise completely irrelevant whether the store happens before or after taking the lock. For an RT kernel moving the store under the lock requires an extra unlock/lock pair in the case that there is a waiter for the timer, but that's not the end of the world. Reported-by:
<syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com> Reported-by:
<syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com> Fixes: 030dcdd1 ("timers: Prepare support for PREEMPT_RT") Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Tested-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/87lfea7gw8.fsf@nanos.tec.linutronix.de Cc: stable@vger.kernel.org Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Daniel Wagner <wagi@monom.org>
-
Sebastian Andrzej Siewior authored
Upstream commit d8bb65ab

The workqueue code has its internal spinlock (pool::lock) and also implicit spinlock usage in the wq_manager waitqueue. These spinlocks are converted to 'sleeping' spinlocks on an RT kernel.

Workqueue functions can be invoked from contexts which are truly atomic even on a PREEMPT_RT enabled kernel. Taking sleeping locks from such contexts is forbidden. pool::lock can be converted to a raw spinlock as the lock held times are short. But the workqueue manager waitqueue is handled inside of pool::lock held regions which again violates the lock nesting rules of raw and regular spinlocks.

The manager waitqueue has no special requirements like custom wakeup callbacks or mass wakeups. While it does not use exclusive wait mode explicitly, there is no strict requirement to queue the waiters in a particular order as there is only one waiter at a time. This allows replacing the waitqueue with rcuwait, which solves the locking problem because rcuwait relies on existing locking. Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> [wagi: Updated context as v4.19-rt was using swait] Signed-off-by:
Daniel Wagner <wagi@monom.org>
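
A kernel-style sketch of the resulting pattern, reconstructed from memory of the upstream change (names such as manager_wait and wq_manager_inactive() are assumed, and the fragment is not buildable standalone):

  /* the manager wait switches from a waitqueue to rcuwait */
  static struct rcuwait manager_wait = __RCUWAIT_INITIALIZER(manager_wait);

  /* waiter (put_unbound_pool()) -- the condition helper takes pool->lock
   * itself, which is why the old explicit pool->lock acquisition around
   * the wait had to go (see the deadlock fix above): */
  rcuwait_wait_event(&manager_wait, wq_manager_inactive(pool),
                     TASK_UNINTERRUPTIBLE);

  /* waker (manage_workers()): */
  pool->flags &= ~POOL_MANAGER_ACTIVE;
  rcuwait_wake_up(&manager_wait);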
-
Daniel Wagner authored
This is an all in one commit backporting updates for rcuwait:
- 03f4b48e ("rcuwait: Annotate task_struct with __rcu")
- 191a43be ("rcuwait: Introduce rcuwait_active()")
- 5c21f7b3 ("rcuwait: Introduce prepare_to and finish_rcuwait")
- 80fbaf1c ("rcuwait: Add @state argument to rcuwait_wait_event()")
- 9d9a6ebf ("rcuwait: Let rcuwait_wake_up() return whether or not a task was awoken")
- 58d4292b ("rcu: Uninline multi-use function: finish_rcuwait()")
Signed-off-by:
Daniel Wagner <wagi@monom.org>
-
Sebastian Andrzej Siewior authored
Upstream commit c725dafc PREEMPT_RT does not spin and wait until a running timer completes its callback but instead it blocks on a sleeping lock to prevent a livelock in the case that the task waiting for the callback completion preempted the callback. This cannot be done for timers flagged with TIMER_IRQSAFE. These timers can be canceled from an interrupt disabled context even on RT kernels. The expiry callback of such timers is invoked with interrupts disabled so there is no need to use the expiry lock mechanism because obviously the callback cannot be preempted even on RT kernels. Do not use the timer_base::expiry_lock mechanism when waiting for a running callback to complete if the timer is flagged with TIMER_IRQSAFE. Also add a lockdep assertion for RT kernels to validate that the expiry lock mechanism is always invoked in preemptible context. Reported-by:
Mike Galbraith <efault@gmx.de> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201103190937.hga67rqhvknki3tp@linutronix.de [bigeasy: The logic in v4.19 is slightly different but the outcome is the same as we must not sleep while waiting for the irqsafe timer to complete. The IRQSAFE timer can not be preempted. The "lockdep annotation" is not available and has been replaced with might_sleep()] Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Daniel Wagner <wagi@monom.org>
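
A rough sketch of the resulting wait logic (kernel-style, illustrative only; the expiry-lock handling is elided and nothing here is the literal 4.19-rt hunk):

  if (timer->flags & TIMER_IRQSAFE) {
          /* IRQSAFE callbacks run with interrupts disabled and cannot be
           * preempted, so briefly spinning until the callback finishes is
           * fine even on RT. */
          cpu_relax();
  } else {
          /* The v4.19 backport uses might_sleep() where upstream later
           * added a lockdep assertion, then blocks on
           * timer_base::expiry_lock until the callback has completed. */
          might_sleep();
          /* ... take and drop timer_base::expiry_lock ... */
  }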
-
Sebastian Andrzej Siewior authored
This reverts the PREEMPT_RT related changes to workqueue. It reverts the usage of local_locks() and cpu_chill(). This is a preparation to pull in the PREEMPT_RT related changes which were merged upstream. Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> [wagi: 827b6f69 ("workqueue: rework") already reverted most of the changes, except the missing update in put_pwq_unlocked.] Signed-off-by:
Daniel Wagner <wagi@monom.org>
-
Sebastian Andrzej Siewior authored
The original code was using INIT_LOCAL_LOCK() and I tried to sneak around it and forgot that this code also needs to compile on !RT platforms. Provide INIT_LOCAL_LOCK() to initialize properly on RT and do nothing on !RT. Let random.c use it, as it is the only user so far and does not compile on !RT otherwise. Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/all/YzcEIU17EIZ7ZIF5@linutronix.de/ Signed-off-by:
Daniel Wagner <wagi@monom.org>
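
A sketch of the shape such a macro can take (the member name and the exact 4.19-rt definition are assumptions):

  /* initialize the underlying lock on RT, expand to an empty
   * initializer on !RT where no lock exists */
  #ifdef CONFIG_PREEMPT_RT_FULL
  # define INIT_LOCAL_LOCK(lvar)  { .lock = __SPIN_LOCK_UNLOCKED((lvar).lock) }
  #else
  # define INIT_LOCAL_LOCK(lvar)  { }
  #endif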
-
Sebastian Andrzej Siewior authored
As part of the backports the random code lost its local_lock_t type and the whole operation became a local_irq_{disable|enable}() simply because the older kernel did not provide those primitives. RT as of v4.9 has a slightly different variant of local_locks. Replace the local_irq_*() operations with matching local_lock_irq*() operations which were there as part of commit 77760fd7 ("random: remove batched entropy locking") Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/all/20220819092446.980320-2-bigeasy@linutronix.de/ Signed-off-by:
Daniel Wagner <dwagner@suse.de>
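
An illustrative before/after of that replacement (variable and lock names are placeholders, not the exact random.c hunk):

  static DEFINE_LOCAL_IRQ_LOCK(batched_entropy_u64_lock);

  /* before: plain IRQ disabling, which provides no lock on RT */
  local_irq_save(flags);
  batch = raw_cpu_ptr(&batched_entropy_u64);
  /* ... extract entropy from the per-CPU batch ... */
  local_irq_restore(flags);

  /* after: the v4.19-rt local_lock variant -- IRQs off on !RT, a
   * per-CPU sleeping spinlock on RT */
  local_lock_irqsave(batched_entropy_u64_lock, flags);
  batch = raw_cpu_ptr(&batched_entropy_u64);
  /* ... extract entropy from the per-CPU batch ... */
  local_unlock_irqrestore(batched_entropy_u64_lock, flags);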
-
Sebastian Andrzej Siewior authored
The irq_settings_no_softirq_call() related handling got lost in the process; here are the missing bits. Reported-by:
Martin Kaistra <martin.kaistra@linutronix.de> Fixes: b0cf5c23 ("Merge tag 'v4.19.183' into linux-4.19.y-rt") Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Clark Williams <williams@redhat.com>
-
Sebastian Andrzej Siewior authored
The patch "net: move xmit_recursion to per-task variable on -RT" lost a few hunks during its rebase. Add the `xmit_lock_owner' accessor/wrapper. Reported-by:
Salvatore Bonaccorso <carnil@debian.org> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de>
-
Clark Williams authored
While doing some 4.19-rt cleanup work, I stumbled across the fact that parts of two backported patches were dependent on CONFIG_PREEMPT_RT, rather than the CONFIG_PREEMPT_RT_FULL used in 4.19 and earlier RT series. The commits in the linux-stable-rt v4.19-rt branch are:

 dad4c6a3 mm: slub: Don't resize the location tracking cache on PREEMPT_RT
 e626b6f8 net: Treat __napi_schedule_irqoff() as __napi_schedule() on PREEMPT_RT

Discussing this at the Stable RT maintainers meeting, Steven Rostedt suggested that we automagically select CONFIG_PREEMPT_RT if CONFIG_PREEMPT_RT_FULL is on, giving us a safety net for any subsequently backported patches. Here's my first cut at that patch. I suspect we'll need a similar patch for stable RT kernels < 4.19. Suggested-by:
Steven Rostedt <rostedt@goodmis.org> Signed-off-by:
Clark Williams <williams@redhat.com>
-
Gregor Beck authored
The original patch, 60266060 ("fscache: initialize cookie hash table raw spinlocks"), subtracted 1 from the shift and so still left some spinlocks uninitialized. This fixes that. [zanussi: Added changelog text] Signed-off-by:
Gregor Beck <gregor.beck@gmail.com> Fixes: 60266060 ("fscache: initialize cookie hash table raw spinlocks") Signed-off-by:
Tom Zanussi <zanussi@kernel.org> (cherry picked from commit 2cdede91) Signed-off-by:
Clark Williams <williams@redhat.com>
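
The shape of the off-by-one and its fix, as a sketch with placeholder names (not the literal fscache hunk):

  /* buggy: the shift minus one covers only half of the 2^shift buckets */
  for (i = 0; i < (1 << (cookie_hash_shift - 1)); i++)
          raw_spin_lock_init(&cookie_hash_locks[i]);

  /* fixed: every bucket lock gets initialized */
  for (i = 0; i < (1 << cookie_hash_shift); i++)
          raw_spin_lock_init(&cookie_hash_locks[i]);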
-
Andrew Halaney authored
There's no chance of sleeping here, the reader is giving up the lock and possibly waking up the writer who is waiting on it. Reported-by:
Chunyu Hu <chuhu@redhat.com> Signed-off-by:
Andrew Halaney <ahalaney@redhat.com> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> (cherry picked from commit b2ed0a43) Signed-off-by:
Clark Williams <williams@redhat.com>
-
Sebastian Andrzej Siewior authored
The stable tree backported a patch which adds __down_read_interruptible() for the generic rwsem implementation. Add RT's version of __down_read_interruptible(). Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de>
-
Sebastian Andrzej Siewior authored
The location tracking cache has a size of a page and is resized if its current size is too small. This allocation happens with disabled interrupts and can't happen on PREEMPT_RT. Should one page be too small, then we have to allocate more at the beginning. The only downside is that fewer callers will be visible. Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> (cherry picked from commit 87bd0bf3) Signed-off-by:
Clark Williams <williams@redhat.com>
-
Oleg Nesterov authored
[ Upstream commit 0fdc9197 ] The patch "ptrace: fix ptrace vs tasklist_lock race" changed ptrace_freeze_traced() to take task->saved_state into account, but ptrace_unfreeze_traced() has the same problem and needs a similar fix: it should check/update both ->state and ->saved_state. Reported-by:
Luis Claudio R. Goncalves <lgoncalv@redhat.com> Fixes: "ptrace: fix ptrace vs tasklist_lock race" Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: stable-rt@vger.kernel.org Signed-off-by:
Tom Zanussi <zanussi@kernel.org>
-
Sebastian Andrzej Siewior authored
[ Upstream commit 74858f0d ] The callers expect disabled preemption/interrupts while invoking __mod_memcg_lruvec_state(). This works in mainline because a lock of some kind is acquired. Use preempt_disable_rt() where per-CPU variables are accessed and a stable pointer is expected. This is also done in __mod_zone_page_state() for the same reason. Cc: stable-rt@vger.kernel.org Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Tom Zanussi <zanussi@kernel.org> Conflicts: mm/memcontrol.c
-
Davidlohr Bueso authored
A crash was seen in xfrm when running ltp's 'tcp4_ipsec06' stresser on v4.x based RT kernels. ipcomp_compress() will serialize access to the ipcomp_scratches percpu buffer by disabling BH and preventing a softirq from coming in and running ipcomp_decompress(), which is never called from process context. This of course won't work on RT and the buffer can get corrupted; there have been similar issues in the past with such assumptions, e.g. ebf255ed ("net: add back the missing serialization in ip_send_unicast_reply()"). Similarly, this patch addresses the issue with local locks, allowing RT to have a per-CPU spinlock and do the correct serialization. Signed-off-by:
Davidlohr Bueso <dbueso@suse.de> Signed-off-by:
Tom Zanussi <zanussi@kernel.org>
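
A hedged sketch of that local-lock serialization (the lock name is illustrative and the old 4.19-rt locallock API is assumed):

  /* On !RT this adds nothing beyond the existing BH disabling; on RT it
   * becomes a per-CPU spinlock, so a preempting ipcomp_decompress() cannot
   * corrupt the scratch buffer that ipcomp_compress() is using. */
  static DEFINE_LOCAL_IRQ_LOCK(ipcomp_scratches_lock);

  local_lock(ipcomp_scratches_lock);
  scratch = *this_cpu_ptr(ipcomp_scratches);
  /* ... (de)compress via the per-CPU scratch buffer ... */
  local_unlock(ipcomp_scratches_lock);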
-
Ahmed S. Darwish authored
[ Upstream commit 6554eac9 ] Commit bf7afb29 ("phy: improve safety of fixed-phy MII register reading") protected the fixed PHY status with a sequence counter. Two years later, commit d2b97793 ("net: phy: fixed-phy: remove fixed_phy_update_state()") removed the sequence counter's write side critical section -- neutralizing its read side retry loop. Remove the unused seqcount. Signed-off-by:
Ahmed S. Darwish <a.darwish@linutronix.de> Reviewed-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by:
Andrew Lunn <andrew@lunn.ch> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from v5.8-rc1 commit 79cbb6bc) Signed-off-by:
Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Tom Zanussi <zanussi@kernel.org>
-
Sebastian Andrzej Siewior authored
[ Upstream commit e6da0edc ] There was a lockdep warning which led to commit fad003b6 ("Bluetooth: Fix inconsistent lock state with RFCOMM"). Lockdep noticed that `sk->sk_lock.slock' was acquired without disabling the softirq while the lock was also used in softirq context. Unfortunately the solution back then was to disable interrupts before acquiring the lock, which made lockdep happy but was more than needed: it would have been enough to simply disable the softirq. Disabling interrupts before acquiring a spinlock_t is not allowed on PREEMPT_RT because these locks are converted to 'sleeping' spinlocks. Use spin_lock_bh() in order to acquire the `sk_lock.slock'. Reported-by:
Luis Claudio R. Goncalves <lclaudio@uudg.org> Reported-by: kbuild test robot <lkp@intel.com> [missing unlock] Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Signed-off-by:
Tom Zanussi <zanussi@kernel.org>
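
An illustrative before/after of the locking change (a sketch following the commit description, not the exact hunk):

  /* before: interrupts disabled around a spinlock_t -- forbidden on RT */
  spin_lock_irqsave(&sk->sk_lock.slock, flags);
  /* ... */
  spin_unlock_irqrestore(&sk->sk_lock.slock, flags);

  /* after: disabling bottom halves is sufficient and RT-compatible */
  spin_lock_bh(&sk->sk_lock.slock);
  /* ... */
  spin_unlock_bh(&sk->sk_lock.slock);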
-
Matt Fleming authored
[ Upstream commit 9567db2e ] The way user struct reference counting works changed significantly with commit fda31c50 ("signal: avoid double atomic counter increments for user accounting"). Now user structs are only freed once the last pending signal is dequeued. Make sigqueue_free_current() follow this new convention to avoid freeing the user struct multiple times and triggering this warning:

 refcount_t: underflow; use-after-free.
 WARNING: CPU: 0 PID: 6794 at lib/refcount.c:288 refcount_dec_not_one+0x45/0x50
 Call Trace:
  refcount_dec_and_lock_irqsave+0x16/0x60
  free_uid+0x31/0xa0
  __dequeue_signal+0x17c/0x190
  dequeue_signal+0x5a/0x1b0
  do_sigtimedwait+0x208/0x250
  __x64_sys_rt_sigtimedwait+0x6f/0xd0
  do_syscall_64+0x72/0x200
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by:
Matt Fleming <matt@codeblueprint.co.uk> Reported-by:
Daniel Wagner <wagi@monom.org> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Tom Zanussi <zanussi@kernel.org>
-
Tom Zanussi authored
commit 62d0a2a3 (tasklet: Address a race resulting in double-enqueue) addresses a problem that can result in a tasklet being enqueued on two cpus at the same time by combining the RUN flag with a new CHAINED flag, and relies on the combination to be present in order to zero it out, which can never happen on (!SMP and !PREEMPT_RT_FULL) because the RUN flag is SMP/PREEMPT_RT_FULL-only. So make sure the above commit is only applied for the SMP || PREEMPT_RT_FULL case. Fixes: 62d0a2a3 ("tasklet: Address a race resulting in double-enqueue") Signed-off-by:
Tom Zanussi <zanussi@kernel.org> Reported-by:
Ramon Fried <rfried.dev@gmail.com> Tested-By:
Ramon Fried <rfried.dev@gmail.com>
-
Kevin Hao authored
[ Upstream commit 23a2c31b ] After commit f0b23110 ("mm/SLUB: delay giving back empty slubs to IRQ enabled regions"), when free_slab() is invoked with IRQs disabled, the empty slubs are moved to a per-CPU list and are freed later once IRQs are enabled. But in the current code there is a check to see if there really is a cpu slub on a specific cpu before flushing the delayed empty slubs, and this may cause a reference to an already released kmem_cache in a scenario like below:

    cpu 0                           cpu 1
  kmem_cache_destroy()
    flush_all()
      --->IPI                     flush_cpu_slab()
                                    flush_slab()
                                      deactivate_slab()
                                        discard_slab()
                                          free_slab()
                                      c->page = NULL;
      for_each_online_cpu(cpu)
        if (!has_cpu_slab(1, s))
          continue
        this skips flushing the delayed
        empty slub released by cpu1
    kmem_cache_free(kmem_cache, s)

                                  kmalloc()
                                    __slab_alloc()
                                      free_delayed()
                                        __free_slab()
                                          reference to released kmem_cache

Fixes: f0b23110 ("mm/SLUB: delay giving back empty slubs to IRQ enabled regions") Signed-off-by:
Kevin Hao <haokexin@gmail.com> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: stable-rt@vger.kernel.org Signed-off-by:
Tom Zanussi <zanussi@kernel.org>
-
Sebastian Andrzej Siewior authored
[ Upstream commit 279f90dd ] Include the swait.h header so it compiles even if not all patches are applied. Reported-by:
kbuild test robot <lkp@intel.com> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Tom Zanussi <zanussi@kernel.org> Conflicts: fs/proc/base.c
-
Rasmus Villemoes authored
Commit "hrtimer: Add a missing bracket and hide `migration_base' on !SMP", which is 47b6de0b in 5.2-rt and 40aae570 in 4.19-rt, inadvertently changed the logic from base != &migration_base to base == &migration_base. On !CONFIG_SMP, the effect was to effectively always elide this lock/unlock pair (since is_migration_base() is unconditionally false), which for me consistently causes lockups during reboot, and reportedly also often causes a hang during boot. Adding this logical negation (or, what is effectively the same thing on !CONFIG_SMP, reverting the above commit as well as "hrtimer: Prevent using hrtimer_grab_expiry_lock() on migration_base") fixes that lockup. Fixes: 40aae570 (hrtimer: Add a missing bracket and hide `migration_base' on !SMP) # 4.19-rt Fixes: 47b6de0b (hrtimer: Add a missing bracket and hide `migration_base' on !SMP) # 5.2-rt Signed-off-by:
Rasmus Villemoes <rasmus.villemoes@prevas.dk> Reviewed-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Tom Zanussi <zanussi@kernel.org>
-
Zhang Xiao authored
The kernel bugzilla has the following race condition reported:

 CPU0                      CPU1                      CPU2
 ------------------------------------------------
 test_set SCHED
                           test_set RUN
                           if SCHED
                           add_list
                           raise
                           clear RUN
 <softirq>
 test_set RUN
 test_clear SCHED
 ->func
                                                     test_set SCHED
 tasklet_try_unlock ->0
 test_clear SCHED
                                                     test_set SCHED
 ->func
 tasklet_try_unlock ->1
                           test_set RUN
                           if SCHED
                           add list
                           raise
                           clear RUN
                                                     test_set RUN
                                                     if SCHED
                                                     add list
                                                     raise
                                                     clear RUN

As a result the tasklet is enqueued on both CPUs and run on both CPUs. Due to the nature of the list used here, it is possible that further (different) tasklets, which are enqueued after this double-enqueued tasklet, are scheduled on CPU2 but invoked on CPU1. It is also possible that these tasklets won't be invoked at all, because during the second enqueue process the t->next pointer is set to NULL - dropping everything from the list.

This race will trigger one or two of the WARN_ON() in tasklet_action_common(). The problem is that the tasklet may be invoked multiple times and clear SCHED bit on each invocation. This makes it possible to enqueue the very same tasklet on different CPUs.

Current RT-devel is using the upstream implementation which does not re-run tasklets if they have SCHED set again and so it does not clear the SCHED bit multiple times on a single invocation.

Introduce the CHAINED flag. The tasklet will only be enqueued if the CHAINED flag has been set successfully. If it is possible to exchange the flags (CHAINED | RUN) -> 0 then the tasklet won't be re-run. Otherwise the possible SCHED flag is removed and the tasklet is re-run again.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=61451 Not-signed-off-by:
Zhang Xiao <xiao.zhang@windriver.com> [bigeasy: patch description] Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Tom Zanussi <zanussi@kernel.org>
-
Steven Rostedt (VMware) authored
When CONFIG_PREEMPT_RT_FULL is not set, some of the checks for using lazy_list are not properly made as IRQ_WORK_LAZY is not checked. There are two locations that need this update, so a use_lazy_list() helper function is added and used in both locations. Link: https://lore.kernel.org/r/20200321230028.GA22058@duo.ucw.cz Reported-by:
Pavel Machek <pavel@denx.de> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
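
A sketch of such a helper (the exact condition in the patch may differ slightly):

  /* decide once whether a work item goes on the per-CPU lazy list */
  static inline bool use_lazy_list(struct irq_work *work)
  {
          return (IS_ENABLED(CONFIG_PREEMPT_RT_FULL) &&
                  !(work->flags & IRQ_WORK_HARD_IRQ)) ||
                 (work->flags & IRQ_WORK_LAZY);
  }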
-
Tiejun Chen authored
Fails to build with CONFIG_UBSAN=y

 lib/ubsan.c: In function '__ubsan_handle_vla_bound_not_positive':
 lib/ubsan.c:348:2: error: too many arguments to function 'ubsan_prologue'
   ubsan_prologue(&data->location, &flags);
   ^~~~~~~~~~~~~~
 lib/ubsan.c:146:13: note: declared here
  static void ubsan_prologue(struct source_location *location)
              ^~~~~~~~~~~~~~
 lib/ubsan.c:353:2: error: too many arguments to function 'ubsan_epilogue'
   ubsan_epilogue(&flags);
   ^~~~~~~~~~~~~~
 lib/ubsan.c:155:13: note: declared here
  static void ubsan_epilogue(void)
              ^~~~~~~~~~~~~~

Signed-off-by:
Tiejun Chen <tiejunc@vmware.com> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Sebastian Andrzej Siewior authored
[ Upstream commit dd430bf5 ] The migrate_disable counter should not exceed 255 so it is enough to store it in an 8bit field. With this change we can move the `preempt_lazy_count' member into the gap so the whole struct shrinks by 4 bytes to 12 bytes in total. Remove the `padding' field, it is not needed. Update the tracing fields in trace_define_common_fields() (it was missing the preempt_lazy_count field). Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
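
A sketch of the resulting record layout (field order and types are illustrative; the sizes follow the commit message):

  struct trace_entry {
          unsigned short  type;
          unsigned char   flags;
          unsigned char   preempt_count;
          int             pid;
          unsigned char   migrate_disable;      /* counter never exceeds 255 */
          unsigned char   preempt_lazy_count;   /* moved into the gap, padding dropped */
  };                                            /* 12 bytes total instead of 16 */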
-
Sebastian Andrzej Siewior authored
[ Upstream commit b901491e ] vmw_fifo_ping_host() disables preemption around a test and a register write via vmw_write(). The write function acquires a spinlock_t typed lock which is not allowed in a preempt_disable()ed section on PREEMPT_RT. This has been reported in the bugzilla. It has been explained by Thomas Hellstrom that this preempt_disable()ed section is not required for correctness. Remove the preempt_disable() section. Link: https://bugzilla.kernel.org/show_bug.cgi?id=206591 Link: https://lkml.kernel.org/r/0b5e1c65d89951de993deab06d1d197b40fd67aa.camel@vmware.com Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Sebastian Andrzej Siewior authored
[ Upstream commit e693075a ] Include the header for `current' macro so that CONFIG_KERNEL_HEADER_TEST=y passes. Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Matt Fleming authored
[ Upstream commit 071a1d6a ] The comment about local_lock_irqsave() mentions just the counters, and css_put_many()'s callback just invokes a worker, so it is safe to move the unlock function after memcg_check_events() so that css_put_many() can be invoked without the lock acquired. Cc: Daniel Wagner <wagi@monom.org> Signed-off-by:
Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> [bigeasy: rewrote the patch description] Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de>
-
Scott Wood authored
[ Upstream commit b8162e61 ] We can rely on preempt_enable() to schedule. Besides simplifying the code, this potentially allows sequences such as the following to be permitted:

 migrate_disable();
 preempt_disable();
 migrate_enable();
 preempt_enable();

Suggested-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Scott Wood <swood@redhat.com> Reviewed-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Scott Wood authored
[ Upstream commit 2dcd94b4 ] Commit e6c287b1 ("sched: migrate_enable: Use stop_one_cpu_nowait()") adds a busy wait to deal with an edge case where the migrated thread can resume running on another CPU before the stopper has consumed cpu_stop_work. However, this is done with preemption disabled and can potentially lead to deadlock. While it is not guaranteed that the cpu_stop_work will be consumed before the migrating thread resumes and exits the stack frame, it is guaranteed that nothing other than the stopper can run on the old cpu between the migrating thread scheduling out and the cpu_stop_work being consumed. Thus, we can store cpu_stop_work in per-cpu data without it being reused too early. Fixes: e6c287b1 ("sched: migrate_enable: Use stop_one_cpu_nowait()") Suggested-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Scott Wood <swood@redhat.com> Reviewed-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Sebastian Andrzej Siewior authored
[ Upstream commit dc952a56 ] On RT write_seqcount_begin() disables preemption, which leads to a warning in add_wait_queue() while the spinlock_t is acquired. The waitqueue can't be converted to swait_queue because userfaultfd_wake_function() is used as a custom wake function. Use a seqlock instead of a seqcount to avoid the preempt_disable() section during add_wait_queue(). Cc: stable-rt@vger.kernel.org Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
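
An illustrative write-side comparison (a sketch; refile_seq follows fs/userfaultfd.c, but the hunk shown is not the literal patch):

  /* before: on RT write_seqcount_begin() disables preemption, which then
   * trips over the sleeping spinlock taken inside add_wait_queue() */
  write_seqcount_begin(&ctx->refile_seq);
  /* ... requeue the fault ... */
  write_seqcount_end(&ctx->refile_seq);

  /* after: a seqlock serializes writers with its own spinlock and keeps
   * preemption enabled on RT */
  write_seqlock(&ctx->refile_seq);
  /* ... requeue the fault ... */
  write_sequnlock(&ctx->refile_seq);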
-
Sebastian Andrzej Siewior authored
[ Upstream commit 140d7f54 ] If a user task changes the CPU affinity mask of a running task it will dispatch a migration request if the current CPU is no longer allowed. This might happen shortly before a task enters a migrate_disable() section. Upon leaving the migrate_disable() section, the task will notice that the current CPU is no longer allowed and will dispatch its own migration request to move it off the current CPU. While invoking __schedule() the first migration request will be processed and the task returns on the "new" CPU with "arg.done = 0". Its own migration request will be processed shortly after and will result in memory corruption if the stack memory used for the request was reused for something else in the meantime. Spin until the migration request has been processed if it was accepted. Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
-