Skip to content
Snippets Groups Projects
This project is mirrored from https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git. Pull mirroring updated .
  1. Jun 16, 2011
  2. Jun 15, 2011
  3. Jun 14, 2011
    • Shaohua Li's avatar
      rcu: Use softirq to address performance regression · 09223371
      Shaohua Li authored
      
      Commit a26ac245(rcu: move TREE_RCU from softirq to kthread)
      introduced performance regression. In an AIM7 test, this commit degraded
      performance by about 40%.
      
      The commit runs rcu callbacks in a kthread instead of softirq. We observed
      high rate of context switch which is caused by this. Out test system has
      64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
      which is caused by RCU's per-CPU kthread.  A trace showed that most of
      the time the RCU per-CPU kthread doesn't actually handle any callbacks,
      but instead just does a very small amount of work handling grace periods.
      This means that RCU's per-CPU kthreads are making the scheduler do quite
      a bit of work in order to allow a very small amount of RCU-related
      processing to be done.
      
      Alex Shi's analysis determined that this slowdown is due to lock
      contention within the scheduler.  Unfortunately, as Peter Zijlstra points
      out, the scheduler's real-time semantics require global action, which
      means that this contention is inherent in real-time scheduling.  (Yes,
      perhaps someone will come up with a workaround -- otherwise, -rt is not
      going to do well on large SMP systems -- but this patch will work around
      this issue in the meantime.  And "the meantime" might well be forever.)
      
      This patch therefore re-introduces softirq processing to RCU, but only
      for core RCU work.  RCU callbacks are still executed in kthread context,
      so that only a small amount of RCU work runs in softirq context in the
      common case.  This should minimize ksoftirqd execution, allowing us to
      skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.
      
      Signed-off-by: default avatarShaohua Li <shaohua.li@intel.com>
      Tested-by: default avatar"Alex,Shi" <alex.shi@intel.com>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      09223371
  4. Jun 09, 2011
  5. Jun 03, 2011
  6. Jun 02, 2011
  7. May 30, 2011
  8. May 27, 2011
  9. May 26, 2011
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Fix build on older systems · 75911c9b
      Arnaldo Carvalho de Melo authored
      
      Where /usr/include/linux/const.h is not present, e.g. RHEL5.
      
      Reported-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Link: http://lkml.kernel.org/n/tip-ypcw2mu0w7dl1rrc6ncz3pee@git.kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      75911c9b
    • Arnaldo Carvalho de Melo's avatar
      perf symbols: Handle /proc/sys/kernel/kptr_restrict · ec80fde7
      Arnaldo Carvalho de Melo authored
      
      Perf uses /proc/modules to figure out where kernel modules are loaded.
      
      With the advent of kptr_restrict, non root users get zeroes for all module
      start addresses.
      
      So check if kptr_restrict is non zero and don't generate the syntethic
      PERF_RECORD_MMAP events for them.
      
      Warn the user about it in perf record and in perf report.
      
      In perf report the reference relocation symbol being zero means that
      kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't
      use it to fixup symbol addresses when using a valid kallsyms (in the buildid
      cache) or vmlinux (in the vmlinux path) build-id located automatically or
      specified by the user.
      
      Provide an explanation about it in 'perf report' if kernel samples were taken,
      checking if a suitable vmlinux or kallsyms was found/specified.
      
      Restricted /proc/kallsyms don't go to the buildid cache anymore.
      
      Example:
      
       [acme@emilia ~]$ perf record -F 100000 sleep 1
      
       WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check
       /proc/sys/kernel/kptr_restrict.
      
       Samples in kernel functions may not be resolved if a suitable vmlinux file is
       not found in the buildid cache or in the vmlinux path.
      
       Samples in kernel modules won't be resolved at all.
      
       If some relocation was applied (e.g. kexec) symbols may be misresolved even
       with a suitable vmlinux or kallsyms file.
      
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ]
       [acme@emilia ~]$
      
       [acme@emilia ~]$ perf report --stdio
       Kernel address maps (/proc/{kallsyms,modules}) were restricted,
       check /proc/sys/kernel/kptr_restrict before running 'perf record'.
      
       If some relocation was applied (e.g. kexec) symbols may be misresolved.
      
       Samples in kernel modules can't be resolved as well.
      
       # Events: 13  cycles
       #
       # Overhead  Command      Shared Object                 Symbol
       # ........  .......  .................  .....................
       #
          20.24%    sleep  [kernel.kallsyms]  [k] page_fault
          20.04%    sleep  [kernel.kallsyms]  [k] filemap_fault
          19.78%    sleep  [kernel.kallsyms]  [k] __lru_cache_add
          19.69%    sleep  ld-2.12.so         [.] memcpy
          14.71%    sleep  [kernel.kallsyms]  [k] dput
           4.70%    sleep  [kernel.kallsyms]  [k] flush_signal_handlers
           0.73%    sleep  [kernel.kallsyms]  [k] perf_event_comm
           0.11%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe
      
       #
       # (For a higher level overview, try: perf report --sort comm,dso)
       #
       [acme@emilia ~]$
      
      This is because it found a suitable vmlinux (build-id checked) in
      /lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long
      file name).
      
      If we remove that file from the vmlinux path:
      
       [root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \
      		     /lib/modules/2.6.39-rc7+/build/vmlinux.OFF
       [acme@emilia ~]$ perf report --stdio
       [kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562
       not found, continuing without symbols
      
       Kernel address maps (/proc/{kallsyms,modules}) were restricted, check
       /proc/sys/kernel/kptr_restrict before running 'perf record'.
      
       As no suitable kallsyms nor vmlinux was found, kernel samples can't be
       resolved.
      
       Samples in kernel modules can't be resolved as well.
      
       # Events: 13  cycles
       #
       # Overhead  Command      Shared Object  Symbol
       # ........  .......  .................  ......
       #
          80.31%    sleep  [kernel.kallsyms]  [k] 0xffffffff8103425a
          19.69%    sleep  ld-2.12.so         [.] memcpy
      
       #
       # (For a higher level overview, try: perf report --sort comm,dso)
       #
       [acme@emilia ~]$
      
      Reported-by: default avatarStephane Eranian <eranian@google.com>
      Suggested-by: default avatarDavid Miller <davem@davemloft.net>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ec80fde7
    • Jesper Juhl's avatar
      perf: Remove duplicate headers · ea7659fb
      Jesper Juhl authored
      
      Signed-off-by: default avatarJesper Juhl <jj@chaosbits.net>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: trivial@kernel.org
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1105261011290.17400@swampdragon.chaosbits.net
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ea7659fb
  10. May 24, 2011
  11. May 23, 2011
  12. May 22, 2011
  13. May 20, 2011
    • Linus Torvalds's avatar
      sanitize <linux/prefetch.h> usage · 268bb0ce
      Linus Torvalds authored
      
      Commit e66eed65 ("list: remove prefetching from regular list
      iterators") removed the include of prefetch.h from list.h, which
      uncovered several cases that had apparently relied on that rather
      obscure header file dependency.
      
      So this fixes things up a bit, using
      
         grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
         grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')
      
      to guide us in finding files that either need <linux/prefetch.h>
      inclusion, or have it despite not needing it.
      
      There are more of them around (mostly network drivers), but this gets
      many core ones.
      
      Reported-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      268bb0ce
    • Steven Rostedt's avatar
      ktest: Allow options to be used by other options · 2a62512b
      Steven Rostedt authored
      
      There are cases where one ktest option may be used within another
      ktest option. Allow them to be reused just like config variables
      but there are evaluated at time of test not config processing time.
      
      Thus having something like:
      
      MAKE_CMD = make ARCH=${ARCH}
      
      TEST_START
      ARCH = powerpc
      
      TEST_START
      ARCH = arm
      
      Will have the arch defined for each test iteration.
      
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      2a62512b
    • Steven Rostedt's avatar
      ktest: Create variables for the ktest config files · 77d942ce
      Steven Rostedt authored
      
      I found that I constantly reuse information for each test case.
      It would be nice to just define a variable to reuse.
      
      For example I may have:
      
      TEST_START
      [...]
      TEST = ssh root@mybox /path/to/my/script
      
      TEST_START
      [...]
      TEST = ssh root@mybox /path/to/my/script
      
      [etc]
      
      The issue is, I may wont to change that script or one of the other
      fields. Then I need to update each line individually.
      
      With the addition of config variables (variables only used during parsing
      the config) we can simplify the config files. These variables can
      also be defined multiple times and each time the new value will
      overwrite the old value.
      
      The convention to use a config variable over a ktest option is to use :=
      instead of =.
      
      Now we could do:
      
      USER := root
      TARGET := mybox
      TEST_SCRIPT := /path/to/my/script
      TEST_CASE := ${USER}@${TARGET} ${TEST_SCRIPT}
      
      TEST_START
      [...]
      TEST = ${TEST_CASE}
      
      TEST_START
      [...]
      TEST = ${TEST_CASE}
      
      [etc]
      
      Now we just need to update the variables at the top.
      
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      77d942ce
    • Steven Rostedt's avatar
      ktest: Reboot after each patchcheck run · 27d934b2
      Steven Rostedt authored
      
      The patches being checked may not leave the kernel in a state
      that the next run will allow the new kernel to be copied to the
      machine. Reboot to a known good kernel before continuing to the
      next kernel to test.
      
      Added option PATCHCHECK_SLEEP_TIME for the max time to sleep between
      patchcheck reboots.
      
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      27d934b2
    • Steven Rostedt's avatar
      ktest: Reboot to good kernel after every bisect run · 4025bc62
      Steven Rostedt authored
      
      Reboot after each bisect run regardless if the bisect passed
      or failed. The test may just be to boot the kernel and that kernel
      may not have a way to copy the next kerne to it. Reboot to a known
      good kernel after each bisect run.
      
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      4025bc62
    • Steven Rostedt's avatar
      ktest: If test failed due to timeout, print that · 4d62bf51
      Steven Rostedt authored
      
      If the test failed due to timeout for boot, print a message saying
      so. Otherwise the user will be confused to why their test just failed.
      
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      4d62bf51
    • Steven Rostedt's avatar
      ktest: Fix post install command · ca6a21f8
      Steven Rostedt authored
      
      The command to run post install (for those that want initrds) was
      broken. Instead of doing a substitution for the $KERNEL_VERSION
      variable. It was replacing the entire command with nothing.
      
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      ca6a21f8
  14. May 19, 2011
    • Ingo Molnar's avatar
      perf stat: Add more cache-miss percentage printouts · c3305257
      Ingo Molnar authored
      Print out the cache-miss percentage as well if the cache refs were
      collected, for all the generic cache event types.
      
      Before:
      
         11,103,723,230 dTLB-loads                #  622.471 M/sec                    ( +-  0.30% )
             87,065,337 dTLB-load-misses          #    4.881 M/sec                    ( +-  0.90% )
      
      After:
      
         11,353,713,242 dTLB-loads                #  626.020 M/sec                    ( +-  0.35% )
            113,393,472 dTLB-load-misses          #    1.00% of all dTLB cache hits   ( +-  0.49% )
      
      Also ASCII color highlight too high percentages, them when it's executed on the console.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-lkhwxsevdbd9a8nymx0vxc3y@git.kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      c3305257
    • Ingo Molnar's avatar
      perf stat: Add -d -d and -d -d -d options to show more CPU events · 2cba3ffb
      Ingo Molnar authored
      Print even more detailed statistics if requested via perf stat -d:
      
             -d:          detailed events, L1 and LLC data cache
          -d -d:     more detailed events, dTLB and iTLB events
       -d -d -d:     very detailed events, adding prefetch events
      
      Full output looks like this now:
      
       Performance counter stats for '/home/mingo/hackbench 10' (5 runs):
      
             1703.674707 task-clock                #    8.709 CPUs utilized            ( +-  4.19% )
                  49,068 context-switches          #    0.029 M/sec                    ( +- 16.66% )
                   8,303 CPU-migrations            #    0.005 M/sec                    ( +- 24.90% )
                  17,397 page-faults               #    0.010 M/sec                    ( +-  0.46% )
           2,345,389,239 cycles                    #    1.377 GHz                      ( +-  4.61% ) [55.90%]
           1,884,503,527 stalled-cycles-frontend   #   80.35% frontend cycles idle     ( +-  5.67% ) [50.39%]
             743,919,737 stalled-cycles-backend    #   31.72% backend  cycles idle     ( +-  8.75% ) [49.91%]
           1,314,416,379 instructions              #    0.56  insns per cycle
                                                   #    1.43  stalled cycles per insn  ( +-  2.53% ) [60.87%]
             272,592,567 branches                  #  160.003 M/sec                    ( +-  1.74% ) [56.56%]
               3,794,846 branch-misses             #    1.39% of all branches          ( +-  6.59% ) [58.50%]
             449,982,778 L1-dcache-loads           #  264.125 M/sec                    ( +-  2.47% ) [49.88%]
              22,404,961 L1-dcache-load-misses     #    4.98% of all L1-dcache hits    ( +-  6.08% ) [55.05%]
               6,204,750 LLC-loads                 #    3.642 M/sec                    ( +-  8.91% ) [43.75%]
               1,837,411 LLC-load-misses           #    1.078 M/sec                    ( +-  7.27% ) [12.07%]
             411,440,421 L1-icache-loads           #  241.502 M/sec                    ( +-  5.60% ) [36.52%]
              27,556,832 L1-icache-load-misses     #   16.175 M/sec                    ( +-  7.46% ) [46.72%]
             464,067,627 dTLB-loads                #  272.392 M/sec                    ( +-  4.46% ) [54.17%]
              10,765,648 dTLB-load-misses          #    6.319 M/sec                    ( +-  3.18% ) [48.68%]
           1,273,080,386 iTLB-loads                #  747.256 M/sec                    ( +-  3.38% ) [47.53%]
                 117,481 iTLB-load-misses          #    0.069 M/sec                    ( +- 14.99% ) [47.01%]
               4,590,653 L1-dcache-prefetches      #    2.695 M/sec                    ( +-  4.49% ) [46.19%]
               1,712,660 L1-dcache-prefetch-misses #    1.005 M/sec                    ( +-  3.75% ) [44.82%]
      
              0.195622057  seconds time elapsed  ( +-  6.84% )
      
      Also clean up the attribute construction code to be appending, and factor
      it out into add_default_attributes().
      
      Tweak the coverage percentage printout a bit, so that it's easier to view it
      alongside the +- sttddev colum.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-to3kgu04449s64062val8b62@git.kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      2cba3ffb
  15. May 18, 2011
  16. May 17, 2011
    • Stephane Eranian's avatar
      perf: Fix multi-event parsing bug · 94692349
      Stephane Eranian authored
      
      This patch fixes an issue with event parsing.
      The following commit appears to have broken the
      ability to specify a comma separated list of events:
      
         commit ceb53fbf
         Author: Ingo Molnar <mingo@elte.hu>
         Date:   Wed Apr 27 04:06:33 2011 +0200
      
             perf stat: Fail more clearly when an invalid modifier is specified
      
      This patch fixes this while preserving the desired effect:
      
      $ perf stat -e instructions:u,instructions:k ls /dev/null /dev/null
      
       Performance counter stats for 'ls /dev/null':
      
                  365956 instructions:u           #    0.00  insns per cycle
                  731806 instructions:k           #    0.00  insns per cycle
      
              0.001108862  seconds time elapsed
      
      $ perf stat -e task-clock-msecs true
      invalid event modifier: '-msecs'
      Run 'perf list' for a list of valid events and modifiers
      
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Cc: acme@redhat.com
      Cc: peterz@infradead.org
      Cc: fweisbec@gmail.com
      Link: http://lkml.kernel.org/r/20110517133619.GA6999@quad
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      94692349
Loading