  1. 29 Nov, 2016 1 commit
      sched/idle: Add support for tasks that inject idle · c1de45ca
      Peter Zijlstra authored
      Idle injection drivers such as Intel powerclamp and ACPI PAD drivers use
      realtime tasks to take control of CPU then inject idle. There are two
      issues with this approach:
       1. Low efficiency: injected idle task is treated as busy so sched ticks
          do not stop during injected idle period, the result of these
          unwanted wakeups can be ~20% loss in power savings.
       2. Idle accounting: injected idle time is presented to user as busy.
      This patch addresses the issues by introducing a new PF_IDLE flag which
      allows any given task to be treated as idle task while the flag is set.
      Therefore, idle injection tasks can run through the normal flow of NOHZ
      idle enter/exit to get the correct accounting as well as tick stop when
      possible.
      The implication is that idle task is then no longer limited to PID == 0.
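      A minimal user-space sketch of the idea, not the kernel code: the
      struct, the helper names and the PF_IDLE bit value below are stand-ins
      chosen for illustration. It only contrasts "idle means PID == 0" with
      "idle means the flag is set".

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit value; the real definition lives in the kernel headers. */
#define PF_IDLE 0x00000002u

/* Hypothetical stand-in for the kernel's task structure. */
struct task {
    int pid;
    unsigned int flags;
};

/* Old rule: only the per-CPU idle task (PID 0) is treated as idle. */
static inline bool is_idle_task_by_pid(const struct task *t)
{
    return t->pid == 0;
}

/* New rule: any task is treated as idle while PF_IDLE is set. */
static inline bool is_idle_task_by_flag(const struct task *t)
{
    return (t->flags & PF_IDLE) != 0;
}

/* An idle injector would set the flag around the injected idle period. */
static inline void idle_inject_begin(struct task *t)
{
    t->flags |= PF_IDLE;
}

static inline void idle_inject_end(struct task *t)
{
    t->flags &= ~PF_IDLE;
}
```

      With the flag-based test, an injector task with a non-zero PID is
      accounted as idle only between idle_inject_begin() and
      idle_inject_end().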
      Acked-by: Ingo Molnar <>
      Signed-off-by: Peter Zijlstra <>
      Signed-off-by: Jacob Pan <>
      Signed-off-by: Rafael J. Wysocki <>
  2. 08 Oct, 2016 1 commit
  3. 06 Sep, 2016 1 commit
  4. 26 Aug, 2016 1 commit
      cpu/hotplug: Allow suspend/resume CPU to be specified · d391e552
      James Morse authored
      disable_nonboot_cpus() assumes that the lowest numbered online CPU is
      the boot CPU, and that this is the correct CPU to run any power
      management code on.
      On x86 this is always correct, as CPU0 cannot (easily) be taken offline.
      On arm64 CPU0 can be taken offline. For hibernate/resume this means we
      may hibernate on a CPU other than CPU0. If the system is rebooted with
      kexec 'CPU0' will be assigned to a different physical CPU. This
      complicates hibernate/resume as now we can't trust the CPU numbers.
      Arch code can find the correct physical CPU, and ensure it is online
      before resume from hibernate begins, but also needs to influence
      disable_nonboot_cpus()'s choice of CPU.
      Rename disable_nonboot_cpus() as freeze_secondary_cpus() and add an
      argument indicating which CPU should be left standing. Follow the logic
      in migrate_to_reboot_cpu() to use the lowest numbered online CPU if the
      requested CPU is not online.
      Add disable_nonboot_cpus() as an inline function that has the existing
      behaviour.
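      The selection rule can be sketched in user space as follows. This is
      an illustration under assumptions, not the patch itself: struct cpu_map,
      NR_CPUS = 8 and every function name ending in _sketch are hypothetical,
      and the wrapper assumes that "existing behaviour" means asking for CPU 0.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Hypothetical online map; the index is the logical CPU number. */
struct cpu_map {
    bool online[NR_CPUS];
};

/* Lowest numbered online CPU, or -1 if nothing is online. */
static inline int first_online_cpu(const struct cpu_map *m)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (m->online[cpu])
            return cpu;
    return -1;
}

/* Keep the requested CPU if it is online, else fall back to the lowest one. */
static inline int pick_primary_cpu(const struct cpu_map *m, int requested)
{
    if (requested >= 0 && requested < NR_CPUS && m->online[requested])
        return requested;
    return first_online_cpu(m);
}

/* Take every CPU except the chosen one offline; returns the survivor. */
static inline int freeze_secondary_cpus_sketch(struct cpu_map *m, int requested)
{
    int primary = pick_primary_cpu(m, requested);

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu != primary)
            m->online[cpu] = false;
    return primary;
}

/* Wrapper with the old interface: request CPU 0 (an assumption here). */
static inline int disable_nonboot_cpus_sketch(struct cpu_map *m)
{
    return freeze_secondary_cpus_sketch(m, 0);
}
```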
      Cc: Rafael J. Wysocki <>
      Reviewed-by: Thomas Gleixner <>
      Signed-off-by: James Morse <>
      Signed-off-by: Will Deacon <>
  5. 14 Jul, 2016 2 commits
  6. 06 May, 2016 3 commits
  7. 01 Mar, 2016 5 commits
      rcu: Make CPU_DYING_IDLE an explicit call · 27d50c7e
      Thomas Gleixner authored
      Make the RCU CPU_DYING_IDLE callback an explicit function call, so it gets
      invoked at the proper place.
      Signed-off-by: Thomas Gleixner <>
      Cc: Rik van Riel <>
      Cc: Rafael Wysocki <>
      Cc: "Srivatsa S. Bhat" <>
      Cc: Peter Zijlstra <>
      Cc: Arjan van de Ven <>
      Cc: Sebastian Siewior <>
      Cc: Rusty Russell <>
      Cc: Steven Rostedt <>
      Cc: Oleg Nesterov <>
      Cc: Tejun Heo <>
      Cc: Andrew Morton <>
      Cc: Paul McKenney <>
      Cc: Linus Torvalds <>
      Cc: Paul Turner <>
      Signed-off-by: Thomas Gleixner <>
      cpu/hotplug: Make wait for dead cpu completion based · e69aab13
      Thomas Gleixner authored
      Kill the busy spinning on the control side and just wait for the hotplugged
      cpu to tell that it reached the dead state.
      Signed-off-by: Thomas Gleixner <>
      Cc: Rik van Riel <>
      Cc: Rafael Wysocki <>
      Cc: "Srivatsa S. Bhat" <>
      Cc: Peter Zijlstra <>
      Cc: Arjan van de Ven <>
      Cc: Sebastian Siewior <>
      Cc: Rusty Russell <>
      Cc: Steven Rostedt <>
      Cc: Oleg Nesterov <>
      Cc: Tejun Heo <>
      Cc: Andrew Morton <>
      Cc: Paul McKenney <>
      Cc: Linus Torvalds <>
      Cc: Paul Turner <>
      Signed-off-by: Thomas Gleixner <>
      cpu/hotplug: Unpark smpboot threads from the state machine · 931ef163
      Thomas Gleixner authored
      Handle the smpboot threads in the state machine.
      Signed-off-by: Thomas Gleixner <>
      Cc: Rik van Riel <>
      Cc: Rafael Wysocki <>
      Cc: "Srivatsa S. Bhat" <>
      Cc: Peter Zijlstra <>
      Cc: Arjan van de Ven <>
      Cc: Sebastian Siewior <>
      Cc: Rusty Russell <>
      Cc: Steven Rostedt <>
      Cc: Oleg Nesterov <>
      Cc: Tejun Heo <>
      Cc: Andrew Morton <>
      Cc: Paul McKenney <>
      Cc: Linus Torvalds <>
      Cc: Paul Turner <>
      Signed-off-by: Thomas Gleixner <>
      cpu/hotplug: Convert to a state machine for the control processor · cff7d378
      Thomas Gleixner authored
      Move the split out steps into a callback array and let the cpu_up/down
      code iterate through the array functions. For now most of the
      callbacks are asymmetric to resemble the current hotplug maze.
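      As a rough illustration of a callback array walked by the up and down
      paths, consider the user-space sketch below. The step names, struct
      hp_step and the trace buffer are hypothetical and are not taken from
      the patch; the "prepare" step deliberately has no teardown to show an
      asymmetric entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step descriptor: one bring-up and one tear-down callback. */
struct hp_step {
    const char *name;
    int (*startup)(unsigned int cpu);
    int (*teardown)(unsigned int cpu);
};

/* Records which callbacks ran, so the order can be inspected. */
static int trace[16];
static int trace_len;

static int prep_up(unsigned int cpu)     { (void)cpu; trace[trace_len++] = 1;  return 0; }
static int online_up(unsigned int cpu)   { (void)cpu; trace[trace_len++] = 2;  return 0; }
static int online_down(unsigned int cpu) { (void)cpu; trace[trace_len++] = -2; return 0; }

static const struct hp_step steps[] = {
    { "prepare", prep_up,   NULL },        /* asymmetric: nothing to undo */
    { "online",  online_up, online_down },
};
#define NR_STEPS (sizeof(steps) / sizeof(steps[0]))

/* Bring-up walks the array forwards and stops at the first error. */
static inline int cpu_up_sketch(unsigned int cpu)
{
    for (size_t i = 0; i < NR_STEPS; i++) {
        if (steps[i].startup) {
            int ret = steps[i].startup(cpu);
            if (ret)
                return ret;
        }
    }
    return 0;
}

/* Tear-down walks the same array backwards. */
static inline int cpu_down_sketch(unsigned int cpu)
{
    for (size_t i = NR_STEPS; i-- > 0; ) {
        if (steps[i].teardown) {
            int ret = steps[i].teardown(cpu);
            if (ret)
                return ret;
        }
    }
    return 0;
}
```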
      Signed-off-by: Thomas Gleixner <>
      Cc: Rik van Riel <>
      Cc: Rafael Wysocki <>
      Cc: "Srivatsa S. Bhat" <>
      Cc: Peter Zijlstra <>
      Cc: Arjan van de Ven <>
      Cc: Sebastian Siewior <>
      Cc: Rusty Russell <>
      Cc: Steven Rostedt <>
      Cc: Oleg Nesterov <>
      Cc: Tejun Heo <>
      Cc: Andrew Morton <>
      Cc: Paul McKenney <>
      Cc: Linus Torvalds <>
      Cc: Paul Turner <>
      Signed-off-by: Thomas Gleixner <>
      cpu/hotplug: Restructure FROZEN state handling · 090e77c3
      Thomas Gleixner authored
      There are only a few callbacks which really care about FROZEN
      vs. !FROZEN. No need to have extra states for this.
      Publish the frozen state in an extra variable which is updated under
      the hotplug lock and let the users interested deal with it w/o
      imposing those extra state checks on everyone.
      Signed-off-by: Thomas Gleixner <>
      Cc: Rik van Riel <>
      Cc: Rafael Wysocki <>
      Cc: "Srivatsa S. Bhat" <>
      Cc: Peter Zijlstra <>
      Cc: Arjan van de Ven <>
      Cc: Sebastian Siewior <>
      Cc: Rusty Russell <>
      Cc: Steven Rostedt <>
      Cc: Oleg Nesterov <>
      Cc: Tejun Heo <>
      Cc: Andrew Morton <>
      Cc: Paul McKenney <>
      Cc: Linus Torvalds <>
      Cc: Paul Turner <>
      Signed-off-by: Thomas Gleixner <>
  8. 07 Oct, 2015 1 commit
  9. 17 Jul, 2015 1 commit
      include, lib: add __printf attributes to several function prototypes · 8db14860
      Nicolas Iooss authored
      Using __printf attributes helps to detect several format string issues
      at compile time (even though -Wformat-security is currently disabled in
      Makefile).  For example it can detect when formatting a pointer as a
      number, like the issue fixed in commit a3fa71c4 ("wl18xx: show
      rx_frames_per_rates as an array as it really is"), or when the arguments
      do not match the format string, cf. for example commit 5ce1aca8
      ("reiserfs: fix __RASSERT format string").
      To prevent similar bugs in the future, add a __printf attribute to every
      function prototype which needs one in include/linux/ and lib/.  These
      functions were mostly found by using gcc's -Wsuggest-attribute=format
      flag.
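      For readers unfamiliar with the attribute, here is a small user-space
      sketch. The macro my_printf_attr and the function log_to_buf are
      hypothetical names; the macro is assumed to expand the same way the
      kernel's __printf helper does on GCC and Clang, using the compilers'
      format(printf, ...) function attribute.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Assumed equivalent of a __printf(a, b) helper macro for GCC/Clang. */
#define my_printf_attr(fmt_idx, first_arg) \
    __attribute__((format(printf, fmt_idx, first_arg)))

/*
 * Hypothetical logging helper: argument 3 is the format string and the
 * variadic arguments start at position 4, so the compiler can check them.
 */
my_printf_attr(3, 4)
int log_to_buf(char *buf, size_t size, const char *fmt, ...);

int log_to_buf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

      With the attribute in place, a call such as
      log_to_buf(buf, size, "%d", "text") can be diagnosed by -Wformat at
      compile time instead of misbehaving at run time.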
      Signed-off-by: Nicolas Iooss <>
      Cc: Greg Kroah-Hartman <>
      Cc: Felipe Balbi <>
      Cc: Joel Becker <>
      Signed-off-by: Andrew Morton <>
      Signed-off-by: Linus Torvalds <>
  10. 13 Apr, 2015 2 commits
      cpu: Provide smpboot_thread_init() on !CONFIG_SMP kernels as well · 590ee7db
      Ingo Molnar authored
      Now that we are using smpboot_thread_init() in init/main.c,
      provide it for !CONFIG_SMP kernels as well.
      This addresses a !CONFIG_SMP build failure.
      Cc: Paul E. McKenney <>
      Cc: Borislav Petkov <>
      Cc: Andrew Morton <>
      Cc: Linus Torvalds <>
      Cc: Peter Zijlstra <>
      Cc: Thomas Gleixner <>
      Signed-off-by: Ingo Molnar <>
      cpu: Defer smpboot kthread unparking until CPU known to scheduler · 00df35f9
      Paul E. McKenney authored
      Currently, smpboot_unpark_threads() is invoked before the incoming CPU
      has been added to the scheduler's runqueue structures.  This might
      potentially cause the unparked kthread to run on the wrong CPU, since the
      correct CPU isn't fully set up yet.
      That causes a sporadic, hard to debug boot crash triggering on some
      systems, reported by Borislav Petkov, and bisected down to:
       ("x86: Use common outgoing-CPU-notification code")
      This patch places smpboot_unpark_threads() in a CPU hotplug
      notifier with priority set so that these kthreads are unparked just after
      the CPU has been added to the runqueues.
      Reported-and-tested-by: Borislav Petkov <>
      Signed-off-by: Paul E. McKenney <>
      Cc: Andrew Morton <>
      Cc: Linus Torvalds <>
      Cc: Peter Zijlstra <>
      Cc: Thomas Gleixner <>
      Signed-off-by: Ingo Molnar <>
  11. 12 Mar, 2015 1 commit
      rcu: Handle outgoing CPUs on exit from idle loop · 88428cc5
      Paul E. McKenney authored
      This commit informs RCU of an outgoing CPU just before that CPU invokes
      arch_cpu_idle_dead() during its last pass through the idle loop (via a
      new CPU_DYING_IDLE notifier value).  This change means that RCU need not
      deal with outgoing CPUs passing through the scheduler after informing
      RCU that they are no longer online.  Note that removing the CPU from
      the rcu_node ->qsmaskinit bit masks is done at CPU_DYING_IDLE time,
      and orphaning callbacks is still done at CPU_DEAD time, the reason being
      that at CPU_DEAD time we have another CPU that can adopt them.
      Signed-off-by: Paul E. McKenney <>
  12. 11 Mar, 2015 1 commit
      smpboot: Add common code for notification from dying CPU · 8038dad7
      Paul E. McKenney authored
      RCU ignores offlined CPUs, so they cannot safely run RCU read-side code.
      (They -can- use SRCU, but not RCU.)  This means that any use of RCU
      during or after the call to arch_cpu_idle_dead() is unsafe.  Unfortunately,
      commit 2ed53c0d added a complete() call, which will contain RCU
      read-side critical sections if there is a task waiting to be awakened.
      Which, as it turns out, there almost never is.  In my qemu/KVM testing,
      the to-be-awakened task is not yet asleep more than 99.5% of the time.
      In current mainline, failure is even harder to reproduce, requiring a
      virtualized environment that delays the outgoing CPU by at least three
      jiffies between the time it exits its stop_machine() task at CPU_DYING
      time and the time it calls arch_cpu_idle_dead() from the idle loop.
      However, this problem really can occur, especially in virtualized
      environments, and therefore really does need to be fixed.
      This suggests moving back to the polling loop, but using a much shorter
      wait, with gentle exponential backoff instead of the old 100-millisecond
      wait.  Most of the time, the loop will exit without waiting at all,
      and almost all of the remaining uses will wait only five microseconds.
      If the outgoing CPU is preempted, a loop will wait one jiffy, then
      increase the wait by a factor of 11/10ths, rounding up.  As before, there
      is a five-second timeout.
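      The backoff arithmetic described above can be sketched as follows. This
      is a user-space simulation under assumptions, not the kernel loop: time
      is a plain counter in jiffies, the five-microsecond fast path is left
      out, and the function names and the dies_at parameter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Grow the sleep by a factor of 11/10, rounding up. */
static inline unsigned long next_sleep(unsigned long sleep_jf)
{
    return (sleep_jf * 11 + 9) / 10;
}

/*
 * Simulated wait for an outgoing CPU that reaches the dead state after
 * dies_at jiffies. Returns true if the CPU was seen dead before the
 * budget (the timeout, in jiffies) ran out; *waited gets the time spent.
 */
static inline bool wait_for_death(unsigned long dies_at,
                                  unsigned long budget_jf,
                                  unsigned long *waited)
{
    unsigned long sleep_jf = 1;   /* first sleep is one jiffy */
    unsigned long elapsed = 0;

    while (elapsed < dies_at) {   /* poll: CPU not dead yet */
        if (elapsed >= budget_jf) {
            *waited = elapsed;
            return false;         /* timed out */
        }
        elapsed += sleep_jf;
        sleep_jf = next_sleep(sleep_jf);
    }
    *waited = elapsed;
    return true;
}
```

      Starting from one jiffy, the sleep lengths grow as 1, 2, 3, 4, 5, ...
      at first, because rounding up dominates for small values; the 11/10
      factor only becomes visible once the sleep exceeds ten jiffies.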
      This commit therefore provides common-code infrastructure to do the
      dying-to-surviving CPU handoff in a safe manner.  This code also
      provides an indication at CPU-online of whether the CPU to be onlined
      previously timed out on offline.  The new cpu_check_up_prepare() function
      returns -EBUSY if this CPU previously took more than five seconds to
      go offline, or -EAGAIN if it has not yet managed to go offline.  The
      rationale for -EAGAIN is that it might still be preempted, so an additional
      wait might well find it correctly offlined.  Architecture-specific code
      can decide how to handle these conditions.  Systems in which CPUs take
      themselves completely offline might respond to an -EBUSY return as if
      it was a zero (success) return.  Systems in which the surviving CPU must
      take some action might take it at this time, or might simply mark the
      other CPU as unusable.
      Note that architectures that take the easy way out and simply pass the
      -EBUSY and -EAGAIN upwards will change the sysfs API.
      Signed-off-by: Paul E. McKenney <>
      Cc: <>
      Cc: <>
      [ paulmck: Fixed state machine for architectures that don't check earlier
        CPU-hotplug results as suggested by James Hogan. ]
  13. 07 Nov, 2014 1 commit
      drivers: base: add cpu_device_create to support per-cpu devices · 3d52943b
      Sudeep Holla authored
      This patch adds a new function to create per-cpu devices.
      This helps in:
      1. reusing the device infrastructure to create any cpu related
         attributes and corresponding sysfs instead of creating and
         dealing with raw kobjects directly
      2. retaining the legacy path(/sys/devices/system/cpu/..) to support
         existing sysfs ABI
      3. avoiding creating links in the bus directory pointing to the
         device as there would be per-cpu instance of these devices with
         the same name since dev->bus is not populated to cpu_sysbus on
      Signed-off-by: Sudeep Holla <>
      Tested-by: Stephen Boyd <>
      Cc: Greg Kroah-Hartman <>
      Cc: David Herrmann <>
      Cc: Kay Sievers <>
      Signed-off-by: Greg Kroah-Hartman <>
  14. 18 Sep, 2014 1 commit
      rcu: Eliminate deadlock between CPU hotplug and expedited grace periods · dd56af42
      Paul E. McKenney authored
      Currently, the expedited grace-period primitives do get_online_cpus().
      This greatly simplifies their implementation, but means that calls
      to them holding locks that are acquired by CPU-hotplug notifiers (to
      say nothing of calls to these primitives from CPU-hotplug notifiers)
      can deadlock.  But this is starting to become inconvenient, as can be
      seen here:
      .  The problem in this
      case is that some developers need to acquire a mutex from a CPU-hotplug
      notifier, but also need to hold it across a synchronize_rcu_expedited().
      As noted above, this currently results in deadlock.
      This commit avoids the deadlock and retains the simplicity by creating
      a try_get_online_cpus(), which returns false if the get_online_cpus()
      reference count could not immediately be incremented.  If a call to
      try_get_online_cpus() returns true, the expedited primitives operate as
      before.  If a call returns false, the expedited primitives fall back to
      normal grace-period operations.  This falling back of course results in
      increased grace-period latency, but only during times when CPU hotplug
      operations are actually in flight.  The effect should therefore be
      negligible during normal operation.
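      The try-then-fall-back shape can be modelled in user space as below.
      This is a single-threaded sketch under assumptions: struct
      hotplug_state, the writer_active flag and all names ending in _sketch
      are hypothetical, and no real locking or RCU machinery is involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the hotplug reader reference count. */
struct hotplug_state {
    bool writer_active;   /* a CPU hotplug operation is in flight */
    int refcount;         /* readers currently holding a reference */
};

/* Take a reference only if that can be done immediately. */
static inline bool try_get_online_cpus_sketch(struct hotplug_state *hp)
{
    if (hp->writer_active)
        return false;
    hp->refcount++;
    return true;
}

static inline void put_online_cpus_sketch(struct hotplug_state *hp)
{
    hp->refcount--;
}

enum gp_kind { GP_EXPEDITED, GP_NORMAL };

/* Expedited path when the reference is available, normal path otherwise. */
static inline enum gp_kind synchronize_expedited_sketch(struct hotplug_state *hp)
{
    if (!try_get_online_cpus_sketch(hp))
        return GP_NORMAL;     /* fall back: higher latency, no deadlock */

    /* ... the expedited grace-period work would run here ... */
    put_online_cpus_sketch(hp);
    return GP_EXPEDITED;
}
```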
      Signed-off-by: Paul E. McKenney <>
      Cc: Josh Triplett <>
      Cc: "Rafael J. Wysocki" <>
      Tested-by: Lan Tianyu <>
  15. 06 Jun, 2014 1 commit
  16. 20 Mar, 2014 1 commit
      CPU hotplug: Provide lockless versions of callback registration functions · 93ae4f97
      Srivatsa S. Bhat authored
      The following method of CPU hotplug callback registration is not safe
      due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
      and the cpu_hotplug.lock.
      The deadlock is shown below:
                CPU 0                                         CPU 1
                -----                                         -----
         Acquire cpu_hotplug.lock
         [via get_online_cpus()]
                                                    CPU online/offline operation
                                                    takes cpu_add_remove_lock
                                                    [via cpu_maps_update_begin()]
         Try to acquire cpu_add_remove_lock
         [via register_cpu_notifier()]
                                                    CPU online/offline operation
                                                    tries to acquire cpu_hotplug.lock
                                                    [via cpu_hotplug_begin()]
                                  *** DEADLOCK! ***
      The problem here is that callback registration takes the locks in one order
      whereas the CPU hotplug operations take the same locks in the opposite order.
      To avoid this issue and to provide a race-free method to register CPU hotplug
      callbacks (along with initialization of already online CPUs), introduce new
      variants of the callback registration APIs that simply register the callbacks
      without holding the cpu_add_remove_lock during the registration. That way,
      we can avoid the ABBA scenario. However, we will need to hold the
      cpu_add_remove_lock throughout the entire critical section, to protect updates
      to the callback/notifier chain.
      This can be achieved by writing the callback registration code as follows:
      	cpu_maps_update_begin(); [ or cpu_notifier_register_begin(); see below ]
      	/* This doesn't take the cpu_add_remove_lock */
      	cpu_maps_update_done();  [ or cpu_notifier_register_done(); see below ]
      Note that we can't use get_online_cpus() here instead of cpu_maps_update_begin()
      because the cpu_hotplug.lock is dropped during the invocation of CPU_POST_DEAD
      notifiers, and hence get_online_cpus() cannot provide the necessary
      synchronization to protect the callback/notifier chains against concurrent
      reads and writes. On the other hand, since the cpu_add_remove_lock protects
      the entire hotplug operation (including CPU_POST_DEAD), we can use
      cpu_maps_update_begin/done() to guarantee proper synchronization.
      Also, since cpu_maps_update_begin/done() is like a super-set of
      get/put_online_cpus(), the former naturally protects the critical sections
      from concurrent hotplug operations.
      Since the names cpu_maps_update_begin/done() don't make much sense in CPU
      hotplug callback registration scenarios, we'll introduce new APIs named
      cpu_notifier_register_begin/done() and map them to cpu_maps_update_begin/done().
      In summary, introduce the lockless variants of un/register_cpu_notifier() and
      also export the cpu_notifier_register_begin/done() APIs for use by modules.
      This way, we provide a race-free way to register hotplug callbacks as well as
      perform initialization for the CPUs that are already online.
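      The registration pattern can be modelled in user space as follows.
      This is a single-threaded sketch under assumptions: struct hp_model and
      every function name below are hypothetical stand-ins, and the "lock" is
      a boolean used only to show which calls expect it to be held.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical model of the state guarded by cpu_add_remove_lock. */
struct hp_model {
    bool add_remove_locked;
    bool online[NR_CPUS];
    bool initialized[NR_CPUS];
    int nr_notifiers;
};

/* Stand-ins for the begin/done pair that takes and drops the lock. */
static inline void notifier_register_begin(struct hp_model *m)
{
    assert(!m->add_remove_locked);
    m->add_remove_locked = true;
}

static inline void notifier_register_done(struct hp_model *m)
{
    assert(m->add_remove_locked);
    m->add_remove_locked = false;
}

/* Lockless variant: takes no lock itself, the caller must hold it. */
static inline int register_notifier_nolock(struct hp_model *m)
{
    if (!m->add_remove_locked)
        return -1;
    m->nr_notifiers++;
    return 0;
}

/* Initialize already-online CPUs and register, all in one critical section. */
static inline int register_with_init(struct hp_model *m)
{
    int ret;

    notifier_register_begin(m);
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (m->online[cpu])
            m->initialized[cpu] = true;
    ret = register_notifier_nolock(m);
    notifier_register_done(m);
    return ret;
}
```

      Because only one lock is taken, and it is the same one the hotplug
      path takes first, the opposite-order acquisition from the diagram
      cannot occur in this pattern.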
      Cc: Thomas Gleixner <>
      Cc: Andrew Morton <>
      Cc: Peter Zijlstra <>
      Cc: Ingo Molnar <>
      Acked-by: Oleg Nesterov <>
      Acked-by: Toshi Kani <>
      Reviewed-by: Gautham R. Shenoy <>
      Signed-off-by: Srivatsa S. Bhat <>
      Signed-off-by: Rafael J. Wysocki <>
  17. 18 Feb, 2014 1 commit
  18. 15 Oct, 2013 1 commit
  19. 30 Sep, 2013 1 commit
      hotplug, powerpc, x86: Remove cpu_hotplug_driver_lock() · 6dedcca6
      Toshi Kani authored
      cpu_hotplug_driver_lock() serializes CPU online/offline operations
      when ARCH_CPU_PROBE_RELEASE is set.  This lock interface is no longer
      necessary for the following reasons:
       - lock_device_hotplug() now protects CPU online/offline operations,
         including the probe & release interfaces enabled by
         ARCH_CPU_PROBE_RELEASE.  The use of cpu_hotplug_driver_lock() is
         redundant.
       - cpu_hotplug_driver_lock() is only valid when ARCH_CPU_PROBE_RELEASE
         is defined, which is misleading and is only enabled on powerpc.
      This patch removes the cpu_hotplug_driver_lock() interface.  As
      a result, ARCH_CPU_PROBE_RELEASE only enables / disables the cpu
      probe & release interface as intended.  There is no functional change
      in this patch.
      Signed-off-by: Toshi Kani <>
      Reviewed-by: Nathan Fontenot <>
      Signed-off-by: Rafael J. Wysocki <>
  20. 21 Aug, 2013 1 commit
      of: move of_get_cpu_node implementation to DT core library · 183912d3
      Sudeep KarkadaNagesha authored
      This patch moves the generalized implementation of of_get_cpu_node from
      PowerPC to DT core library, thereby adding support for retrieving cpu
      node for a given logical cpu index on any architecture.
      The CPU subsystem can now use this function to assign of_node in the
      cpu device while registering CPUs.
      It is recommended to use this helper function only in pre-SMP/early
      initialisation stages to retrieve CPU device node pointers in logical
      ordering. Once the cpu devices are registered, it can be retrieved easily
      from cpu device of_node which avoids unnecessary parsing and matching.
      Cc: Benjamin Herrenschmidt <>
      Cc: Grant Likely <>
      Acked-by: Rob Herring <>
      Signed-off-by: Sudeep KarkadaNagesha <>
  21. 13 Aug, 2013 1 commit
      ACPI / processor: Acquire writer lock to update CPU maps · b9d10be7
      Toshi Kani authored
      CPU system maps are protected with reader/writer locks.  The reader
      lock, get_online_cpus(), assures that the maps are not updated while
      holding the lock.  The writer lock, cpu_hotplug_begin(), is used to
      update the cpu maps along with cpu_maps_update_begin().
      However, the ACPI processor handler updates the cpu maps without
      holding the writer lock.
      acpi_map_lsapic() is called from acpi_processor_hotadd_init() to
      update cpu_possible_mask and cpu_present_mask.  acpi_unmap_lsapic()
      is called from acpi_processor_remove() to update cpu_possible_mask.
      Currently, they are either unprotected or protected with the reader
      lock, which is not correct.
      For example, the get_online_cpus() below is supposed to assure that
      cpu_possible_mask is not changed while the code is iterating with
              for_each_possible_cpu(cpu) {
      However, this lock has no protection with CPU hotplug since the ACPI
      processor handler does not use the writer lock when it updates
      cpu_possible_mask.  The reader lock does not serialize within the
      This patch protects them with the writer lock with cpu_hotplug_begin()
      along with cpu_maps_update_begin(), which must be held before calling
      cpu_hotplug_begin().  It also protects arch_register_cpu() /
      arch_unregister_cpu(), which creates / deletes a sysfs cpu device
      interface.  For this purpose it changes cpu_hotplug_begin() and
      cpu_hotplug_done() to global and exports them in cpu.h.
      Signed-off-by: Toshi Kani <>
      Signed-off-by: Rafael J. Wysocki <>
  22. 14 Jul, 2013 1 commit
      kernel: delete __cpuinit usage from all core kernel files · 0db0628d
      Paul Gortmaker authored
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      This removes all the uses of the __cpuinit macros from C files in
      the core kernel directories (kernel, init, lib, mm, and include)
      that don't really have a specific maintainer.
      Signed-off-by: Paul Gortmaker <>
  23. 12 Jun, 2013 1 commit
  24. 28 May, 2013 1 commit
  25. 08 Apr, 2013 2 commits
  26. 17 Jul, 2012 1 commit
      workqueue: perform cpu down operations from low priority cpu_notifier() · 65758202
      Tejun Heo authored
      Currently, all workqueue cpu hotplug operations run off
      CPU_PRI_WORKQUEUE which is higher than normal notifiers.  This is to
      ensure that workqueue is up and running while bringing up a CPU before
      other notifiers try to use workqueue on the CPU.
      Per-cpu workqueues are supposed to remain working and bound to the CPU
      for normal CPU_DOWN_PREPARE notifiers.  This holds mostly true even
      with workqueue offlining running with higher priority because
      workqueue CPU_DOWN_PREPARE only creates a bound trustee thread which
      runs the per-cpu workqueue without concurrency management without
      explicitly detaching the existing workers.
      However, if the trustee needs to create new workers, it creates
      unbound workers which may wander off to other CPUs while
      CPU_DOWN_PREPARE notifiers are in progress.  Furthermore, if the CPU
      down is cancelled, the per-CPU workqueue may end up with workers which
      aren't bound to the CPU.
      While reliably reproducible with a convoluted artificial test-case
      involving scheduling and flushing CPU burning work items from CPU down
      notifiers, this isn't very likely to happen in the wild, and, even
      when it happens, the effects are likely to be hidden by the following
      successful CPU down.
      Fix it by using different priorities for up and down notifiers - high
      priority for up operations and low priority for down operations.
      Workqueue cpu hotplug operations will soon go through further cleanup.
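      The effect of splitting the priorities can be illustrated with a toy
      notifier chain. This is a user-space sketch under assumptions: the
      priority values 5 and -5, the callback names and the order buffer are
      chosen for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>

enum hp_event { EV_UP, EV_DOWN };

/* Hypothetical notifier entry: higher priority runs earlier. */
struct notifier {
    int priority;
    void (*fn)(enum hp_event ev);
};

/* Records the order in which callbacks acted: 'W' workqueue, 'N' normal. */
static char order[8];
static int order_len;

static void wq_up(enum hp_event ev)   { if (ev == EV_UP)   order[order_len++] = 'W'; }
static void wq_down(enum hp_event ev) { if (ev == EV_DOWN) order[order_len++] = 'W'; }
static void normal(enum hp_event ev)  { (void)ev; order[order_len++] = 'N'; }

/* Sort helper: descending priority. */
static int by_priority_desc(const void *a, const void *b)
{
    const struct notifier *x = a, *y = b;

    return (y->priority > x->priority) - (y->priority < x->priority);
}

static struct notifier chain[] = {
    {  0, normal  },   /* ordinary notifier */
    { -5, wq_down },   /* workqueue down handling: low priority */
    {  5, wq_up   },   /* workqueue up handling: high priority */
};

/* Deliver one event to every notifier, highest priority first. */
static inline void run_chain(enum hp_event ev)
{
    size_t n = sizeof(chain) / sizeof(chain[0]);

    qsort(chain, n, sizeof(chain[0]), by_priority_desc);
    for (size_t i = 0; i < n; i++)
        chain[i].fn(ev);
}
```

      On the way up the workqueue callback acts before the ordinary
      notifier; on the way down it acts after it, so ordinary down
      notifiers still see a working, bound workqueue.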
      Signed-off-by: Tejun Heo <>
      Acked-by: "Rafael J. Wysocki" <>
  27. 01 Jun, 2012 1 commit
      cpu: introduce clear_tasks_mm_cpumask() helper · cb79295e
      Anton Vorontsov authored
      Many architectures clear tasks' mm_cpumask like this:
      	for_each_process(p) {
      		if (p->mm)
      			cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
      	}
      Depending on the context, the code above may have several problems,
      such as:
      1. Working with task->mm w/o getting mm or grabbing the task lock is
         dangerous as ->mm might disappear (exit_mm() assigns NULL under
         task_lock(), so tasklist lock is not enough).
      2. Checking for process->mm is not enough because process' main
         thread may exit or detach its mm via use_mm(), but other threads
         may still have a valid mm.
      This patch implements a small helper function that does things
      correctly, i.e.:
      1. We take the task's lock while we handle its mm (we can't use
         get_task_mm()/mmput() pair as mmput() might sleep);
      2. To catch exited main thread case, we use find_lock_task_mm(),
         which walks up all threads and returns an appropriate task
         (with task lock held).
      Also, per Peter Zijlstra's idea, now we don't grab tasklist_lock in
      the new helper, instead we take the rcu read lock. We can do this
      because the function is called after the cpu is taken down and marked
      offline, so no new tasks will get this cpu set in their mm mask.
      Signed-off-by: Anton Vorontsov <>
      Cc: Richard Weinberger <>
      Cc: Oleg Nesterov <>
      Cc: Peter Zijlstra <>
      Cc: Russell King <>
      Cc: Benjamin Herrenschmidt <>
      Cc: Mike Frysinger <>
      Cc: Paul Mundt <>
      Signed-off-by: Andrew Morton <>
      Signed-off-by: Linus Torvalds <>
  28. 17 May, 2012 1 commit
      sched: Remove stale power aware scheduling remnants and dysfunctional knobs · 8e7fbcbc
      Peter Zijlstra authored
      It's been broken forever (i.e. it's not scheduling in a power
      aware fashion), as reported by Suresh and others sending
      patches, and nobody cares enough to fix it properly ...
      so remove it to make space free for something better.
      There's various problems with the code as it stands today, first
      and foremost the user interface which is bound to topology
      levels and has multiple values per level. This results in a
      state explosion which the administrator or distro needs to
      master and almost nobody does.
      Furthermore large configuration state spaces aren't good, it
      means the thing doesn't just work right because it's either
      under so many impossible to meet constraints, or even if
      there's an achievable state workloads have to be aware of
      it precisely and can never meet it for dynamic workloads.
      So pushing this kind of decision to user-space was a bad idea
      even with a single knob - it's exponentially worse with knobs
      on every node of the topology.
      There is a proposal to replace the user interface with a single
      3 state knob:
       sched_balance_policy := { performance, power, auto }
      where 'auto' would be the preferred default which looks at things
      like Battery/AC mode and possible cpufreq state or whatever the hw
      exposes to show us power use expectations - but there's been no
      progress on it in the past many months.
      Aside from that, the actual implementation of the various knobs
      is known to be broken. There have been sporadic attempts at
      fixing things but these always stop short of reaching a mergeable
      state.
      Therefore this wholesale removal with the hopes of spurring
      people who care to come forward once again and work on a
      coherent replacement.
      Signed-off-by: Peter Zijlstra <>
      Cc: Suresh Siddha <>
      Cc: Arjan van de Ven <>
      Cc: Vincent Guittot <>
      Cc: Vaidyanathan Srinivasan <>
      Cc: Linus Torvalds <>
      Cc: Andrew Morton <>
      Signed-off-by: Ingo Molnar <>
  29. 16 Mar, 2012 1 commit
      device.h: audit and cleanup users in main include dir · 313162d0
      Paul Gortmaker authored
      The <linux/device.h> header includes a lot of stuff, and
      it in turn gets a lot of use just for the basic "struct device"
      which appears so often.
      Clean up the users as follows:
      1) For those headers only needing "struct device" as a pointer
      in fcn args, replace the include with exactly that.
      2) For headers not really using anything from device.h, simply
      delete the include altogether.
      3) For headers relying on getting device.h implicitly before
      being included themselves, now explicitly include device.h
      4) For files in which doing #1 or #2 uncovers an implicit
      dependency on some other header, fix by explicitly adding
      the required header(s).
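      Case 1 above relies on a forward declaration being enough when a
      header only passes pointers around. A minimal single-file sketch,
      in which widget_attach() and the one-field struct device are
      hypothetical and stand in for a real header and its .c user:

```c
#include <assert.h>
#include <stddef.h>

/*
 * "Header" side: a forward declaration is enough because the prototype
 * only uses a pointer to struct device, never its size or members.
 */
struct device;

int widget_attach(struct device *dev);

/*
 * "C file" side: the full definition is needed only where members are
 * actually accessed. This one-field struct is a stand-in for illustration.
 */
struct device {
    int id;
};

int widget_attach(struct device *dev)
{
    return dev ? dev->id : -1;
}
```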
      Any C files that were implicitly relying on device.h to be
      present have already been dealt with in advance.
      Total removals from #1 and #2: 51.  Total additions coming
      from #3: 9.  Total other implicit dependencies from #4: 7.
      As of 3.3-rc1, there were 110, so a net removal of 42 gives
      about a 38% reduction in device.h presence in include/*
      Signed-off-by: Paul Gortmaker <>
  30. 27 Jan, 2012 1 commit
  31. 21 Dec, 2011 1 commit
      cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem · 8a25a2fd
      Kay Sievers authored
      This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
      and converts the devices to regular devices. The sysdev drivers are
      implemented as subsystem interfaces now.
      After all sysdev classes are ported to regular driver core entities, the
      sysdev implementation will be entirely removed from the kernel.
      Userspace relies on events and generic sysfs subsystem infrastructure
      from sysdev devices, which are made available with this conversion.
      Cc: Haavard Skinnemoen <>
      Cc: Hans-Christian Egtvedt <>
      Cc: Tony Luck <>
      Cc: Fenghua Yu <>
      Cc: Arnd Bergmann <>
      Cc: Benjamin Herrenschmidt <>
      Cc: Paul Mackerras <>
      Cc: Martin Schwidefsky <>
      Cc: Heiko Carstens <>
      Cc: Paul Mundt <>
      Cc: "David S. Miller" <>
      Cc: Chris Metcalf <>
      Cc: Thomas Gleixner <>
      Cc: Ingo Molnar <>
      Cc: "H. Peter Anvin" <>
      Cc: Borislav Petkov <>
      Cc: Tigran Aivazian <>
      Cc: Len Brown <>
      Cc: Zhang Rui <>
      Cc: Dave Jones <>
      Cc: Peter Zijlstra <>
      Cc: Russell King <>
      Cc: Andrew Morton <>
      Cc: Arjan van de Ven <>
      Cc: "Rafael J. Wysocki" <>
      Cc: "Srivatsa S. Bhat" <>
      Signed-off-by: Kay Sievers <>
      Signed-off-by: Greg Kroah-Hartman <>