Commits · v6.0-rc3-rt5-rebase · Arch Linux / Packaging / Upstream / linux-rt

This project is mirrored from https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git.

Aug 29, 2022

Add localversion for -RT release · 746d2d0a
Thomas Gleixner authored 13 years ago
```
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
```
v6.0-rc3-rt5-rebase

746d2d0a

sysfs: Add /sys/kernel/realtime entry · 518bacec

Clark Williams authored 13 years ago


Add a /sys/kernel entry to indicate that the kernel is a
realtime kernel.

Clark says that he needs this for udev rules: udev needs to evaluate
whether it's a PREEMPT_RT kernel a few thousand times, and parsing uname
output is too slow or so.

Are there better solutions? Should it exist and return 0 on !-rt?

Signed-off-by: Clark Williams <williams@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

518bacec
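
The check that udev is said to need can be sketched from user space as below. This is a minimal sketch, not the udev rule itself: the helper name `is_realtime_kernel` and the `sysfs_root` parameter are hypothetical, and it assumes the entry reads as `1` on a PREEMPT_RT kernel, which the commit message does not state. A missing entry is treated as "not realtime", which is one possible answer to the open question above.

```python
from pathlib import Path


def is_realtime_kernel(sysfs_root: str = "/sys") -> bool:
    """Return True if <sysfs_root>/kernel/realtime exists and reads as '1'.

    Assumption: the entry contains "1" on a PREEMPT_RT kernel.
    """
    entry = Path(sysfs_root) / "kernel" / "realtime"
    try:
        return entry.read_text().strip() == "1"
    except OSError:
        # Entry absent (non-RT kernel) or unreadable: report "not realtime".
        return False


if __name__ == "__main__":
    print("PREEMPT_RT kernel" if is_realtime_kernel() else "not a PREEMPT_RT kernel")
```

Reading one small sysfs file avoids spawning `uname` and parsing its output on every evaluation.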

POWERPC: Allow to enable RT · bc37f67e

Sebastian Andrzej Siewior authored 5 years ago


Allow to select RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

bc37f67e

powerpc/stackprotector: work around stack-guard init from atomic · 5642a355

Sebastian Andrzej Siewior authored 5 years ago


This is invoked from the secondary CPU in atomic context. On x86 we use
tsc instead. On Power we XOR it against mftb() so let's use the stack address
as the initial value.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

5642a355

powerpc/kvm: Disable in-kernel MPIC emulation for PREEMPT_RT · 8fed5560

Bogdan Purcareata authored 9 years ago

While converting the openpic emulation code to use a raw_spinlock_t enables
guests to run on RT, there's still a performance issue. For interrupts sent in
directed delivery mode with a multiple CPU mask, the emulated openpic will loop
through all of the VCPUs, and for each VCPU, it calls IRQ_check, which will loop
through all the pending interrupts for that VCPU. This is done while holding the
raw_lock, meaning that for all this time interrupts and preemption are
disabled on the host Linux. A malicious user app can max out both of these
numbers and cause a DoS.

This temporary fix is sent for two reasons. First is so that users who want to
use the in-kernel MPIC emulation are aware of the potential latencies, thus
making sure that the hardware MPIC and their usage scenario does not involve
interrupts sent in directed delivery mode, and the number of possible pending
interrupts is kept small. Secondly, this should incentivize the development of a
proper openpic emulation that would be better suited for RT.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

8fed5560

powerpc/pseries/iommu: Use a locallock instead of local_irq_save() · c44d9b81

Sebastian Andrzej Siewior authored 5 years ago


The locallock protects the per-CPU variable tce_page. The function
attempts to allocate memory while tce_page is protected (by disabling
interrupts).

Use local_irq_save() instead of local_irq_disable().

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

c44d9b81

powerpc: traps: Use PREEMPT_RT · 7ab753cc

Sebastian Andrzej Siewior authored 5 years ago


Add PREEMPT_RT to the backtrace if enabled.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

7ab753cc

ARM64: Allow to enable RT · 4c972dfd

Sebastian Andrzej Siewior authored 5 years ago


Allow to select RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

4c972dfd

ARM: Allow to enable RT · fb827876

Sebastian Andrzej Siewior authored 5 years ago


Allow to select RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

fb827876

tty/serial/pl011: Make the locking work on RT · 0f0d848b

Thomas Gleixner authored 12 years ago

The lock is a sleeping lock and local_irq_save() is not the optimisation
we are looking for. Redo it to make it work on -RT and non-RT.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

0f0d848b

tty/serial/omap: Make the locking RT aware · a07cc75b

Thomas Gleixner authored 13 years ago


The lock is a sleeping lock and local_irq_save() is not the
optimisation we are looking for. Redo it to make it work on -RT and
non-RT.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

a07cc75b

ARM: enable irq in translation/section permission fault handlers · 87fcc844

Yadi.hu authored 10 years ago


Probably happens on all ARM, with
CONFIG_PREEMPT_RT
CONFIG_DEBUG_ATOMIC_SLEEP

This simple program....

int main() {
   *((char*)0xc0001000) = 0;
};

[ 512.742724] BUG: sleeping function called from invalid context at kernel/rtmutex.c:658
[ 512.743000] in_atomic(): 0, irqs_disabled(): 128, pid: 994, name: a
[ 512.743217] INFO: lockdep is turned off.
[ 512.743360] irq event stamp: 0
[ 512.743482] hardirqs last enabled at (0): [< (null)>] (null)
[ 512.743714] hardirqs last disabled at (0): [<c0426370>] copy_process+0x3b0/0x11c0
[ 512.744013] softirqs last enabled at (0): [<c0426370>] copy_process+0x3b0/0x11c0
[ 512.744303] softirqs last disabled at (0): [< (null)>] (null)
[ 512.744631] [<c041872c>] (unwind_backtrace+0x0/0x104)
[ 512.745001] [<c09af0c4>] (dump_stack+0x20/0x24)
[ 512.745355] [<c0462490>] (__might_sleep+0x1dc/0x1e0)
[ 512.745717] [<c09b6770>] (rt_spin_lock+0x34/0x6c)
[ 512.746073] [<c0441bf0>] (do_force_sig_info+0x34/0xf0)
[ 512.746457] [<c0442668>] (force_sig_info+0x18/0x1c)
[ 512.746829] [<c041d880>] (__do_user_fault+0x9c/0xd8)
[ 512.747185] [<c041d938>] (do_bad_area+0x7c/0x94)
[ 512.747536] [<c041d990>] (do_sect_fault+0x40/0x48)
[ 512.747898] [<c040841c>] (do_DataAbort+0x40/0xa0)
[ 512.748181] Exception stack(0xecaa1fb0 to 0xecaa1ff8)

0xc0000000 belongs to the kernel address space, and a user task must not be
allowed to access it. For the above condition, the correct result is that
the test case receives a "segmentation fault" and exits without the stack dump.

The root cause is commit 02fe2845 ("avoid enabling interrupts in
prefetch/data abort handlers"). It deletes the irq enable block in the Data
abort assembly code and moves it into the page/breakpoint/alignment fault
handlers instead. But the author does not enable irq in the translation/section
permission fault handlers. ARM disables irq when it enters exception/
interrupt mode; if the kernel doesn't enable irq, it stays disabled
during a translation/section permission fault.

We see the above splat because do_force_sig_info is still called with
IRQs off, and that code eventually does a:

        spin_lock_irqsave(&t->sighand->siglock, flags);

As this is architecture independent code, and we've not seen any
need for other architectures to have the siglock converted to a raw lock, we
can conclude that we should enable irq for ARM translation/section
permission exceptions.


Signed-off-by: Yadi.hu <yadi.hu@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

87fcc844

arm: Disable jump-label on PREEMPT_RT. · 0f5fc742

Thomas Gleixner authored 9 years ago


jump-labels are used to efficiently switch between two possible code
paths. To achieve this, stop_machine() is used to keep the CPU in a
known state while the opcode is modified. The usage of stop_machine()
here leads to large latency spikes which can be observed on PREEMPT_RT.

Jump labels may change the target during runtime and are not restricted
to the debug or "configuration/setup" part of a PREEMPT_RT system where
high latencies could be defined as acceptable.

Disable jump-label support on a PREEMPT_RT system.

[bigeasy: Patch description.]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lkml.kernel.org/r/20220613182447.112191-2-bigeasy@linutronix.de

0f5fc742

arch/arm64: Add lazy preempt support · 084d4ab1

Anders Roxell authored 9 years ago

arm64 is missing support for PREEMPT_RT. The main feature which is
lacking is support for lazy preemption. The arch-specific entry code,
thread information structure definitions, and associated data tables
have to be extended to provide this support. Then the Kconfig file has
to be extended to indicate the support is available, and also to
indicate that support for full RT preemption is now available.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

084d4ab1

powerpc: Add support for lazy preemption · 05e4861a

Thomas Gleixner authored 12 years ago


Implement the powerpc pieces for lazy preempt.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

05e4861a

arm: Add support for lazy preemption · 47408959

Thomas Gleixner authored 12 years ago


Implement the arm pieces for lazy preempt.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

47408959

entry: Fix the preempt lazy fallout · fdc638c2

Thomas Gleixner authored 3 years ago


Common code needs common defines....

Fixes: f2f9e496 ("x86: Support for lazy preemption")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

fdc638c2

x86: Support for lazy preemption · 32f21ed2

Thomas Gleixner authored 12 years ago


Implement the x86 pieces for lazy preempt.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

32f21ed2

x86/entry: Use should_resched() in idtentry_exit_cond_resched() · b0ec1bab

Sebastian Andrzej Siewior authored 4 years ago


The TIF_NEED_RESCHED bit is inlined on x86 into the preemption counter.
By using should_resched(0) instead of need_resched() the same check can
be performed which uses the same variable as 'preempt_count()' which was
issued before.

Use should_resched(0) instead of need_resched().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

b0ec1bab
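
Why a single comparison can replace both checks is easier to see in a small model. The sketch below is a user-space illustration, not the kernel code: it assumes the x86 scheme in which the need-resched state is stored inverted in the top bit of the per-CPU preemption counter (bit set means no reschedule is needed). The constant name mirrors the kernel's `PREEMPT_NEED_RESCHED`; the Python helpers are hypothetical stand-ins.

```python
# Assumption: x86 keeps the need-resched state inverted in bit 31 of the
# per-CPU preemption counter. Bit set = no reschedule needed.
PREEMPT_NEED_RESCHED = 0x80000000
MASK32 = 0xFFFFFFFF


def preempt_count(raw: int) -> int:
    """Preemption depth only, with the folded need-resched bit masked out."""
    return raw & ~PREEMPT_NEED_RESCHED & MASK32


def set_need_resched(raw: int) -> int:
    """Request a reschedule: clear the (inverted) bit."""
    return raw & ~PREEMPT_NEED_RESCHED & MASK32


def clear_need_resched(raw: int) -> int:
    """Withdraw the request: set the (inverted) bit."""
    return (raw | PREEMPT_NEED_RESCHED) & MASK32


def should_resched(raw: int, offset: int) -> bool:
    """True only if the depth equals `offset` AND a reschedule is requested."""
    return raw == offset


if __name__ == "__main__":
    raw = clear_need_resched(0)          # depth 0, nothing requested
    print(should_resched(raw, 0))        # no request pending
    raw = set_need_resched(raw)          # depth 0, reschedule requested
    print(should_resched(raw, 0))        # both conditions hold
```

Because the bit is inverted, "depth is zero and a reschedule is requested" is exactly the raw value zero, so one comparison against the counter covers what would otherwise need a second read of the thread flags.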

sched: Add support for lazy preemption · 2d1febd7

Thomas Gleixner authored 12 years ago


It has become an obsession to mitigate the determinism vs. throughput
loss of RT. Looking at the mainline semantics of preemption points
gives a hint why RT sucks throughput wise for ordinary SCHED_OTHER
tasks. One major issue is the wakeup of tasks which are right away
preempting the waking task while the waking task holds a lock on which
the woken task will block right after having preempted the waker. In
mainline this is prevented due to the implicit preemption disable of
spin/rw_lock held regions. On RT this is not possible due to the fully
preemptible nature of sleeping spinlocks.

Though for a SCHED_OTHER task preempting another SCHED_OTHER task this
is really not a correctness issue. RT folks are concerned about
SCHED_FIFO/RR tasks preemption and not about the purely fairness
driven SCHED_OTHER preemption latencies.

So I introduced a lazy preemption mechanism which only applies to
SCHED_OTHER tasks preempting another SCHED_OTHER task. Aside from the
existing preempt_count each task now sports a preempt_lazy_count
which is manipulated on lock acquisition and release. This is slightly
incorrect as for laziness reasons I coupled this to
migrate_disable/enable so some other mechanisms get the same treatment
(e.g. get_cpu_light).

Now on the scheduler side instead of setting NEED_RESCHED this sets
NEED_RESCHED_LAZY in case of a SCHED_OTHER/SCHED_OTHER preemption and
therefore allows the waking task to exit the lock held region before
the woken task preempts. That also works better for cross CPU wakeups
as the other side can stay in the adaptive spinning loop.

For RT class preemption there is no change. This simply sets
NEED_RESCHED and forgoes the lazy preemption counter.

Initial tests do not expose any observable latency increase, but
history shows that I've been proven wrong before :)

The lazy preemption mode is on by default, but with
CONFIG_SCHED_DEBUG enabled it can be disabled via:

 # echo NO_PREEMPT_LAZY >/sys/kernel/debug/sched_features

and reenabled via

 # echo PREEMPT_LAZY >/sys/kernel/debug/sched_features

The test results so far are very machine and workload dependent, but
there is a clear trend that it enhances the non RT workload
performance.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

2d1febd7
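
The decision described in this commit message can be modelled in a few lines. This is a simplified user-space sketch under stated assumptions, not the scheduler code: the `Task` class and both functions are hypothetical, and it assumes the lazy flag is only acted upon once `preempt_lazy_count` has dropped back to zero, which is how the message's "exit the lock held region before the woken task preempts" is read here.

```python
from enum import Enum


class Policy(Enum):
    OTHER = "SCHED_OTHER"
    FIFO = "SCHED_FIFO"
    RR = "SCHED_RR"


class Task:
    """Hypothetical stand-in for the per-task state the message mentions."""

    def __init__(self, policy: Policy, preempt_lazy_count: int = 0):
        self.policy = policy
        self.preempt_lazy_count = preempt_lazy_count  # raised while a lock is held
        self.need_resched = False                     # models NEED_RESCHED
        self.need_resched_lazy = False                # models NEED_RESCHED_LAZY


def request_preemption(curr: Task, woken: Task) -> None:
    """Scheduler side: pick the lazy flag only for OTHER-vs-OTHER preemption."""
    if curr.policy is Policy.OTHER and woken.policy is Policy.OTHER:
        curr.need_resched_lazy = True
    else:
        curr.need_resched = True  # RT class preemption is unchanged


def preemption_due(curr: Task) -> bool:
    """Assumption: a lazy request waits until the lazy counter is back at zero."""
    if curr.need_resched:
        return True
    return curr.need_resched_lazy and curr.preempt_lazy_count == 0


if __name__ == "__main__":
    holder = Task(Policy.OTHER, preempt_lazy_count=1)  # currently holds a lock
    request_preemption(holder, Task(Policy.OTHER))
    print(preemption_due(holder))   # deferred while the lock is held
    holder.preempt_lazy_count -= 1  # lock released
    print(preemption_due(holder))   # now the woken task may preempt
```

In this model a SCHED_FIFO or SCHED_RR wakeup sets `need_resched` directly, so `preemption_due()` is true regardless of the lazy counter.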

Revert "drm/i915: Depend on !PREEMPT_RT." · 7951c15a

Sebastian Andrzej Siewior authored 2 years ago


Once the known issues are addressed, it should be safe to enable the
driver.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

7951c15a

drm/i915: Drop the irqs_disabled() check · 82361798

Sebastian Andrzej Siewior authored 3 years ago


The !irqs_disabled() check triggers on PREEMPT_RT even with
i915_sched_engine::lock acquired. The reason is the lock is transformed
into a sleeping lock on PREEMPT_RT and does not disable interrupts.

There is no need to check for disabled interrupts. The lockdep
annotation below already checks if the lock has been acquired by the
caller and will yell if the interrupts are not disabled.

Remove the !irqs_disabled() check.

Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

82361798

drm/i915/gt: Use spin_lock_irq() instead of local_irq_disable() + spin_lock() · 1815c275

Sebastian Andrzej Siewior authored 3 years ago


execlists_dequeue() is invoked from a function which uses
local_irq_disable() to disable interrupts so the spin_lock() behaves
like spin_lock_irq().
This breaks PREEMPT_RT because local_irq_disable() + spin_lock() is not
the same as spin_lock_irq().

execlists_dequeue_irq() and execlists_dequeue() each have one caller
only. If intel_engine_cs::active::lock is acquired and released with the
_irq suffix then it behaves almost as if execlists_dequeue() would be
invoked with disabled interrupts. The difference is the last part of the
function which is then invoked with enabled interrupts.
I can't tell if this makes a difference. From looking at it, it might
work to move the last unlock at the end of the function as I didn't find
anything that would acquire the lock again.

Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

1815c275

drm/i915/gt: Queue and wait for the irq_work item. · ebeec035

Sebastian Andrzej Siewior authored 3 years ago


Disabling interrupts and invoking the irq_work function directly breaks
on PREEMPT_RT.
PREEMPT_RT does not invoke all irq_work from hardirq context because
some of the users have spinlock_t locking in the callback function.
These locks are then turned into sleeping locks which can not be
acquired with disabled interrupts.

Using irq_work_queue() has the benefit that the irq_work will be invoked
in the regular context. In general there is "no" delay between enqueuing
the callback and its invocation because the interrupt is raised right
away on architectures which support it (which includes x86).

Use irq_work_queue() + irq_work_sync() instead of invoking the callback
directly.

Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

ebeec035

drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with NOTRACE · f80c7f87

Sebastian Andrzej Siewior authored 6 years ago


The order of the header files is important. If this header file is
included after tracepoint.h was included then the NOTRACE here becomes a
nop. Currently this happens for two .c files which use the tracepoints
behind DRM_I915_LOW_LEVEL_TRACEPOINTS.

Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

f80c7f87

drm/i915: Disable tracing points on PREEMPT_RT · dfbb4baa

Sebastian Andrzej Siewior authored 6 years ago


Luca Abeni reported this:
| BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003
| CPU: 1 PID: 15203 Comm: kworker/u8:2 Not tainted 4.19.1-rt3 #10
| Call Trace:
|  rt_spin_lock+0x3f/0x50
|  gen6_read32+0x45/0x1d0 [i915]
|  g4x_get_vblank_counter+0x36/0x40 [i915]
|  trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915]

The tracing events, trace_i915_pipe_update_start() among others,
use functions which acquire spinlock_t locks which are transformed into
sleeping locks on PREEMPT_RT. A few trace points use
intel_get_crtc_scanline(), others use ->get_vblank_counter() which also
might acquire sleeping locks on PREEMPT_RT.
At the time the arguments are evaluated within a trace point, preemption
is disabled and so the locks must not be acquired on PREEMPT_RT.

Based on this I don't see any other way than to disable trace points on
PREEMPT_RT.

Reported-by: Luca Abeni <lucabe72@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

dfbb4baa

drm/i915: Don't check for atomic context on PREEMPT_RT · ebed9177

Sebastian Andrzej Siewior authored 3 years ago

The !in_atomic() check in _wait_for_atomic() triggers on PREEMPT_RT
because the uncore::lock is a spinlock_t and does not disable
preemption or interrupts.

Changing the uncore::lock to a raw_spinlock_t doubles the worst case
latency on an otherwise idle testbox during testing. Therefore I'm
currently unsure about changing this.

Link: https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/


Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

ebed9177

drm/i915: Don't disable interrupts on PREEMPT_RT during atomic updates · 44083240

Mike Galbraith authored 8 years ago


Commit
   8d7849db ("drm/i915: Make sprite updates atomic")

started disabling interrupts across atomic updates. This breaks on PREEMPT_RT
because within this section the code attempts to acquire spinlock_t locks which
are sleeping locks on PREEMPT_RT.

According to the comment the interrupts are disabled to avoid random delays and
not required for protection or synchronisation.
If this needs to happen with disabled interrupts on PREEMPT_RT, and the
whole section is restricted to register access then all sleeping locks
need to be acquired before interrupts are disabled and some functions
may be moved after enabling interrupts again.
This includes:
- prepare_to_wait() + finish_wait() due to its wake queue.
- drm_crtc_vblank_put() -> vblank_disable_fn() due to drm_device::vbl_lock.
- skl_pfit_enable(), intel_update_plane(), vlv_atomic_update_fifo() and
  maybe others due to intel_uncore::lock
- drm_crtc_arm_vblank_event() due to drm_device::event_lock and
  drm_device::vblank_time_lock.

Don't disable interrupts on PREEMPT_RT during atomic updates.

[bigeasy: drop local locks, commit message]

Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

44083240

drm/i915: Use preempt_disable/enable_rt() where recommended · 86566f4c

Mike Galbraith authored 8 years ago


Mario Kleiner suggests in commit
  ad3543ed ("drm/intel: Push get_scanout_position() timestamping into kms driver.")

spots where preemption should be disabled on PREEMPT_RT. The
difference is that on PREEMPT_RT the intel_uncore::lock disables neither
preemption nor interrupts and so the region remains preemptible.

The area covers only register reads and writes. The part that worries me
is:
- __intel_get_crtc_scanline() the worst case is 100us if no match is
  found.

- intel_crtc_scanlines_since_frame_timestamp() not sure how long this
  may take in the worst case.

It was in the RT queue for a while and nobody complained.
Disable preemption on PREEMPT_RT during timestamping.

[bigeasy: patch description.]

Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

86566f4c

printk: avoid preempt_disable() for PREEMPT_RT · 8ea35f13

John Ogness authored 2 years ago


During non-normal operation, printk() calls will attempt to
write the messages directly to the consoles. This involves
using console_trylock() to acquire @console_sem.

Preemption is disabled while directly printing to the consoles
in order to ensure that the printing task is not scheduled away
while holding @console_sem, thus blocking all other printers
and causing delays in printing.

Commit fd5f7cde ("printk: Never set console_may_schedule in
console_trylock()") specifically reverted a previous attempt at
allowing preemption while printing.

However, on PREEMPT_RT systems, disabling preemption while
printing is not allowed because console drivers typically
acquire a spin lock (which under PREEMPT_RT is an rtmutex).
Since direct printing is only used during early boot and
non-panic dumps, the risks of delayed print output for these
scenarios will be accepted under PREEMPT_RT.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

8ea35f13

serial: 8250: implement write_atomic · 337fa844

John Ogness authored 2 years ago


Implement a non-sleeping NMI-safe write_atomic() console function in
order to support atomic console printing during a panic.

Transmitting data requires disabling interrupts. Since write_atomic()
can be called from any context, it may be called while another CPU
is executing in console code. In order to maintain the correct state
of the IER register, use the global cpu_sync to synchronize all
access to the IER register. This synchronization is only necessary
for serial ports that are being used as consoles.

The global cpu_sync is also used to synchronize between the write()
and write_atomic() callbacks. write() synchronizes per character,
write_atomic() synchronizes per line.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

337fa844

printk: add infrastructure for atomic consoles · e24930fd

John Ogness authored 2 years ago


Many times it is not possible to see the console output on
panic because printing threads cannot be scheduled and/or the
console is already taken and forcibly overtaking/busting the
locks does not provide the hoped-for results.

Introduce a new infrastructure to support "atomic consoles".
A new optional callback in struct console, write_atomic(), is
available for consoles to provide an implementation for writing
console messages. The implementation must be NMI safe if it
can run on an architecture where NMIs exist.

Console drivers implementing the write_atomic() callback must
also select CONFIG_HAVE_ATOMIC_CONSOLE in order to enable the
atomic console code within the printk subsystem.

If atomic consoles are available, panic() will flush the kernel
log only to the atomic consoles (before busting spinlocks).
Afterwards, panic() will continue as before, which includes
attempting to flush the other (non-atomic) consoles.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

e24930fd

printk: Bring back the RT bits. · 9684feaa

Sebastian Andrzej Siewior authored 2 years ago


This is a revert of the commits:
| 07a22b61 Revert "printk: add functions to prefer direct printing"
| 5831788a Revert "printk: add kthread console printers"
| 2d9ef940 Revert "printk: extend console_lock for per-console locking"
| 007eeab7 Revert "printk: remove @console_locked"
| 05c96b37 Revert "printk: Block console kthreads when direct printing will be required"
| 20fb0c82 Revert "printk: Wait for the global console lock when the system is going down"

which is needed for the atomic consoles which are used on PREEMPT_RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

9684feaa

locking/lockdep: Remove lockdep_init_map_crosslock. · 81e7c331

Sebastian Andrzej Siewior authored 2 years ago


The cross-release bits have been removed, lockdep_init_map_crosslock() is
a leftover.

Remove lockdep_init_map_crosslock.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20220311164457.46461-1-bigeasy@linutronix.de
Link: https://lore.kernel.org/r/YqITgY+2aPITu96z@linutronix.de

81e7c331

zram: Replace bit spinlocks with spinlock_t for PREEMPT_RT. · 61f7b0f8

Mike Galbraith authored 8 years ago


The bit spinlock disables preemption on PREEMPT_RT. With disabled preemption it
is not allowed to acquire other sleeping locks which includes invoking
zs_free().

Use a spinlock_t on PREEMPT_RT for locking and set/clear ZRAM_LOCK after the
lock has been acquired/dropped.

Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lkml.kernel.org/r/YqIbMuHCPiQk+Ac2@linutronix.de

61f7b0f8

tpm_tis: fix stall after iowrite*()s · 09c75ddd

Haris Okanovic authored 7 years ago

ioread8() operations to TPM MMIO addresses can stall the cpu when
immediately following a sequence of iowrite*()'s to the same region.

For example, cyclictest measures ~400us latency spikes when a non-RT
usermode application communicates with an SPI-based TPM chip (Intel Atom
E3940 system, PREEMPT_RT kernel). The spikes are caused by a
stalling ioread8() operation following a sequence of 30+ iowrite8()s to
the same address. I believe this happens because the write sequence is
buffered (in cpu or somewhere along the bus), and gets flushed on the
first LOAD instruction (ioread*()) that follows.

The enclosed change appears to fix this issue: read the TPM chip's
access register (status code) after every iowrite*() operation to
amortize the cost of flushing data to chip across multiple instructions.

Signed-off-by: Haris Okanovic <haris.okanovic@ni.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

09c75ddd

tick: Fix timer storm since introduction of timersd · 0fff0ee1

Frederic Weisbecker authored 2 years ago


If timers are pending while the tick is reprogrammed on nohz_mode, the
next expiry is not armed to fire now, it is delayed one jiffy forward
instead so as not to raise an inextinguishable timer storm with such
scenario:

1) IRQ triggers and queues a timer
2) ksoftirqd() is woken up
3) IRQ tail: timer is reprogrammed to fire now
4) IRQ exit
5) TIMER interrupt
6) goto 3)

...all that until we finally reach ksoftirqd.

Unfortunately we are checking the wrong softirq vector bitmask since
timersd kthread has split from ksoftirqd. Timers now have their own
vector state field that must be checked separately. As a result, the
old timer storm is back. This shows up early on boot with extremely long
initcalls:

	[  333.004807] initcall dquot_init+0x0/0x111 returned 0 after 323822879 usecs

and the cause is uncovered with the right trace events showing just
10 microseconds between ticks (~100 000 Hz):

|swapper/-1 1dn.h111 60818582us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415486608
|swapper/-1 1dn.h111 60818592us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415496082
|swapper/-1 1dn.h111 60818601us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415505550

Fix this by checking the right timer vector state from the nohz code.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lkml.kernel.org/r/20220405010752.1347437-2-frederic@kernel.org

0fff0ee1
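
The "10 microseconds between ticks (~100 000 Hz)" figure can be checked directly from the three trace lines quoted in the message. The short calculation below uses only the timestamps and `now=` values shown there; the helper name `mean_interval` is made up for this sketch.

```python
# Values copied from the three hrtimer_expire_entry trace lines above.
timestamps_us = [60818582, 60818592, 60818601]    # trace timestamps, microseconds
now_ns = [60415486608, 60415496082, 60415505550]  # "now=" fields, nanoseconds


def mean_interval(samples):
    """Average distance between consecutive samples."""
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    return sum(deltas) / len(deltas)


interval_us = mean_interval(timestamps_us)  # deltas are 10 us and 9 us
interval_ns = mean_interval(now_ns)         # deltas are 9474 ns and 9468 ns
tick_rate_hz = 1e9 / interval_ns

if __name__ == "__main__":
    print(f"mean interval: {interval_us} us ({interval_ns} ns)")
    print(f"tick rate: about {tick_rate_hz:.0f} Hz")
```

Both columns agree on an interval of roughly 9.5 microseconds, that is, a tick rate a little above 100 kHz, which matches the order of magnitude given in the message.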

rcutorture: Also force sched priority to timersd on boosting test. · c2c66eb8

Frederic Weisbecker authored 2 years ago


ksoftirqd is statically boosted to the priority level right above the
one of rcu_torture_boost() so that timers, which torture readers rely on,
get a chance to run while rcu_torture_boost() is polling.

However timer processing got split from ksoftirqd into its own kthread
(timersd) that isn't boosted. It has the same SCHED_FIFO low prio as
rcu_torture_boost() and therefore timers can't preempt it and may
starve.

The issue can be triggered in practice on v5.17.1-rt17 using:

	./kvm.sh --allcpus --configs TREE04 --duration 10m --kconfig "CONFIG_EXPERT=y CONFIG_PREEMPT_RT=y"

Fix this by statically boosting timersd, just as is done with
ksoftirqd in commit
   ea6d962e ("rcutorture: Judge RCU priority boosting on grace periods, not callbacks")

Suggested-by: Mel Gorman <mgorman@suse.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lkml.kernel.org/r/20220405010752.1347437-1-frederic@kernel.org


Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

c2c66eb8

softirq: Use a dedicated thread for timer wakeups. · 61581707

Sebastian Andrzej Siewior authored 3 years ago

A timer/hrtimer softirq is raised in-IRQ context. With threaded
interrupts enabled or on PREEMPT_RT this leads to waking the ksoftirqd
for the processing of the softirq.
Once the ksoftirqd is marked as pending (or is running) it will collect
all raised softirqs. This in turn means that a softirq which would have
been processed at the end of the threaded interrupt, which runs at an
elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
priority and competes with every regular task for CPU resources.
This introduces long delays on heavily loaded systems and is not desired
especially if the system is not overloaded by the softirqs.

Split the TIMER_SOFTIRQ and HRTIMER_SOFTIRQ processing into a dedicated
timers thread and let it run at the lowest SCHED_FIFO priority.
RT tasks are woken up from hardirq context so only timer_list timers
and hrtimers for "regular" tasks are processed here. The higher priority
ensures that wakeups are performed before scheduling SCHED_OTHER tasks.

Using a dedicated variable to store the pending softirq bit values
ensures that the timers are not accidentally picked up by ksoftirqd and
other threaded interrupts.
It shouldn't be picked up by ksoftirqd since it runs at lower priority.
However if the timer bits are ORed while a threaded interrupt is
running, then the timer softirq would be performed at higher priority.
The new timer thread will block on the softirq lock before it starts
softirq work. This "race window" isn't closed because while timer thread
is performing the softirq it can get PI-boosted via the softirq lock by
a random force-threaded thread.
The timer thread can pick up pending softirqs from ksoftirqd but only
if the softirq load is high. It is not desired that the picked up
softirqs are processed at SCHED_FIFO priority under high softirq load
but this can already happen by a PI-boost by a force-threaded interrupt.

Reported-by: kernel test robot <lkp@intel.com> [ static timer_threads ]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

61581707
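
The effect of the "dedicated variable" can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel implementation: the class and method names are hypothetical, the vector numbers follow the mainline softirq enum as far as I know (TIMER = 1, NET_RX = 3, HRTIMER = 8), and the model ignores the high-load case mentioned above in which the timer thread also picks up softirqs pending for ksoftirqd.

```python
# Softirq vector numbers, assumed to match the mainline enum.
TIMER_SOFTIRQ = 1
NET_RX_SOFTIRQ = 3
HRTIMER_SOFTIRQ = 8

TIMER_MASK = (1 << TIMER_SOFTIRQ) | (1 << HRTIMER_SOFTIRQ)


class CpuSoftirqState:
    """Toy per-CPU state: timer vectors live in their own pending word."""

    def __init__(self):
        self.pending = 0         # seen by ksoftirqd and threaded interrupts
        self.pending_timers = 0  # seen only by the dedicated timer thread

    def raise_softirq(self, nr: int) -> None:
        bit = 1 << nr
        if bit & TIMER_MASK:
            self.pending_timers |= bit
        else:
            self.pending |= bit

    def ksoftirqd_collect(self) -> int:
        """Take everything ksoftirqd is allowed to see."""
        work, self.pending = self.pending, 0
        return work

    def timersd_collect(self) -> int:
        """Take the timer vectors for the dedicated timer thread."""
        work, self.pending_timers = self.pending_timers, 0
        return work


if __name__ == "__main__":
    cpu = CpuSoftirqState()
    cpu.raise_softirq(TIMER_SOFTIRQ)
    cpu.raise_softirq(NET_RX_SOFTIRQ)
    print(bin(cpu.ksoftirqd_collect()))  # only the NET_RX bit
    print(bin(cpu.timersd_collect()))    # only the TIMER bit
```

Keeping the timer bits out of the shared pending word is what prevents whichever thread collects softirqs first from also running the timer work at its own priority.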

x86: Enable RT also on 32bit · 4c30525f

Sebastian Andrzej Siewior authored 5 years ago


Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

4c30525f
