Commit Graph

169202 Commits

Author SHA1 Message Date
Pekka Enberg c10edee2e1 perf tools: Fix permission checks
The perf_event_open() system call returns EACCES if the user is
not root which results in a very confusing error message:

  $ perf record -A -a -f

    Error: perfcounter syscall returned with -1 (Permission denied)

    Fatal: No CONFIG_PERF_EVENTS=y kernel support configured?

It turns out that's because perf tools are checking only for
EPERM. Fix that up to get a much better error message:

  $ perf record -A -a -f
    Fatal: Permission error - are you root?
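
The check described above can be sketched as follows. This is a minimal Python model, not the perf tools' C code: the helper name `describe_open_error` is hypothetical, and the two message strings are taken from the output quoted in this commit message.

```python
import errno


def describe_open_error(err: int) -> str:
    """Map an errno from a failed perf_event_open() to a user-facing message.

    Hypothetical helper for illustration only.
    """
    # Both EPERM and EACCES mean the caller lacks the required privileges,
    # so both should produce the permission hint.
    if err in (errno.EPERM, errno.EACCES):
        return "Permission error - are you root?"
    # Any other failure falls through to the generic hint.
    return "No CONFIG_PERF_EVENTS=y kernel support configured?"
```

Checking only `errno.EPERM` in the first branch would reproduce the confusing message for `EACCES`.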

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1257696066-4046-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 17:04:54 +01:00
Jan Beulich eb647138ac x86/PCI: Adjust GFP mask handling for coherent allocations
Rather than forcing GFP flags and DMA mask to be inconsistent,
GFP flags should be determined even for the fallback device
through dma_alloc_coherent_mask()/dma_alloc_coherent_gfp_flags().

This restores 64-bit behavior as it was prior to commits
8965eb1938 and
4a367f3a9d (not sure why there are
two of them), where GFP_DMA was forced on for 32-bit, but not
for 64-bit, with the slight adjustment that afaict even 32-bit
doesn't need this without CONFIG_ISA.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
LKML-Reference: <4AF18187020000780001D8AA@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-11-08 07:44:30 -08:00
Li Zefan 30ff21e31f ksym_tracer: Remove KSYM_SELFTEST_ENTRY
The macro used to be used in both trace_selftest.c and
trace_ksym.c, but no longer, so remove it from header file.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-11-08 16:21:01 +01:00
Frederic Weisbecker ba1c813a6b hw-breakpoints: Arbitrate access to pmu following registers constraints
Allow or refuse to build a counter using the breakpoints pmu following
given constraints.

We keep track of the pmu users by using three per cpu variables:

- nr_cpu_bp_pinned stores the number of pinned cpu breakpoints counters
  in the given cpu

- nr_bp_flexible stores the number of non-pinned breakpoints counters
  in the given cpu.

- task_bp_pinned stores the number of pinned task breakpoints in a cpu

The latter is not a simple counter but gathers the number of tasks that
have n pinned breakpoints.
Considering HBP_NUM the number of available breakpoint address
registers:
   task_bp_pinned[0] is the number of tasks having 1 breakpoint
   task_bp_pinned[1] is the number of tasks having 2 breakpoints
   [...]
   task_bp_pinned[HBP_NUM - 1] is the number of tasks having the
   maximum number of registers (HBP_NUM).

When a breakpoint counter is created and wants an access to the pmu,
we evaluate the following constraints:

== Non-pinned counter ==

- If attached to a single cpu, check:

    (per_cpu(nr_bp_flexible, cpu) || (per_cpu(nr_cpu_bp_pinned, cpu)
         + max(per_cpu(task_bp_pinned, cpu)))) < HBP_NUM

       -> If there are already non-pinned counters in this cpu, it
          means there is already a free slot for them.
          Otherwise, we check that the maximum number of per task
          breakpoints (for this cpu) plus the number of per cpu
          breakpoints (for this cpu) doesn't cover every register.

- If attached to every cpu, check:

    (per_cpu(nr_bp_flexible, *) || (max(per_cpu(nr_cpu_bp_pinned, *))
           + max(per_cpu(task_bp_pinned, *)))) < HBP_NUM

       -> This is roughly the same, except we check the number of per
          cpu bp for every cpu and we keep the max one. Same for the
          per tasks breakpoints.

== Pinned counter ==

- If attached to a single cpu, check:

       ((per_cpu(nr_bp_flexible, cpu) > 1)
            + per_cpu(nr_cpu_bp_pinned, cpu)
            + max(per_cpu(task_bp_pinned, cpu))) < HBP_NUM

       -> Same checks as before. But now the nr_bp_flexible, if any,
          must keep one register at least (or flexible breakpoints will
          never be fed).

- If attached to every cpu, check:

      ((per_cpu(nr_bp_flexible, *) > 1)
           + max(per_cpu(nr_cpu_bp_pinned, *))
           + max(per_cpu(task_bp_pinned, *))) < HBP_NUM
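
The constraints above can be modelled in a few lines. This is a Python sketch under stated assumptions, not the kernel implementation: `HBP_NUM = 4` is assumed (as on x86), the `CpuState` type and function names are hypothetical, and the logic follows the prose explanations (a flexible counter is admitted if one already exists; flexible counters, "if any", reserve one register) rather than the literal C-like expressions.

```python
from dataclasses import dataclass, field
from typing import List

HBP_NUM = 4  # assumption: four debug address registers, as on x86


@dataclass
class CpuState:
    nr_cpu_bp_pinned: int = 0
    nr_bp_flexible: int = 0
    # task_bp_pinned[i] = number of tasks having i + 1 pinned breakpoints
    task_bp_pinned: List[int] = field(default_factory=lambda: [0] * HBP_NUM)


def max_task_pinned(cpu: CpuState) -> int:
    """Largest number of pinned breakpoints held by any single task."""
    for i in range(len(cpu.task_bp_pinned) - 1, -1, -1):
        if cpu.task_bp_pinned[i] > 0:
            return i + 1
    return 0


def can_add_flexible(cpu: CpuState) -> bool:
    """Non-pinned counter attached to a single cpu."""
    if cpu.nr_bp_flexible > 0:
        return True  # a slot is already shared by flexible counters
    return cpu.nr_cpu_bp_pinned + max_task_pinned(cpu) < HBP_NUM


def can_add_pinned(cpu: CpuState) -> bool:
    """Pinned counter attached to a single cpu."""
    reserved = 1 if cpu.nr_bp_flexible > 0 else 0  # keep one for flexible
    return reserved + cpu.nr_cpu_bp_pinned + max_task_pinned(cpu) < HBP_NUM


def can_add_pinned_everywhere(cpus: List[CpuState]) -> bool:
    """Pinned counter attached to every cpu: take the worst case per term."""
    reserved = 1 if any(c.nr_bp_flexible > 0 for c in cpus) else 0
    worst_cpu = max(c.nr_cpu_bp_pinned for c in cpus)
    worst_task = max(max_task_pinned(c) for c in cpus)
    return reserved + worst_cpu + worst_task < HBP_NUM
```

For example, a cpu with two pinned cpu breakpoints and one task holding two pinned breakpoints has all four registers covered, so both checks refuse a new counter.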

Changes in v2:

- Counter -> event rename

Changes in v5:

- Fix unreleased non-pinned task-bound-only counters. We only released
  it in the first cpu. (Thanks to Paul Mackerras for reporting that)

Changes in v6:

- Currently, event scheduling is done in this order: cpu context
  pinned + cpu context non-pinned + task context pinned + task context
  non-pinned events. Then our current constraints are right theoretically
  but not in practice, because non-pinned counters may be scheduled
  before we can apply every possible pinned counters. So consider
  non-pinned counters as pinned for now.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jan Kiszka <jan.kiszka@web.de>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
2009-11-08 16:20:47 +01:00
Frederic Weisbecker 24f1e32c60 hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events
This patch rebases the implementation of the breakpoints API on top of
perf events instances.

Each breakpoint is now a perf event that handles the
register scheduling, thread/cpu attachment, etc.

The new layering is now made as follows:

       ptrace       kgdb      ftrace   perf syscall
          \          |          /         /
           \         |         /         /
                                        /
            Core breakpoint API        /
                                      /
                     |               /
                     |              /

              Breakpoints perf events

                     |
                     |

               Breakpoints PMU ---- Debug Register constraints handling
                                    (Part of core breakpoint API)
                     |
                     |

             Hardware debug registers

Reasons for this rewrite:

- Use the centralized/optimized pmu registers scheduling,
  implying an easier arch integration
- More powerful register handling: perf attributes (pinned/flexible
  events, exclusive/non-exclusive, tunable period, etc...)

Impact:

- New perf ABI: the hardware breakpoints counters
- Ptrace breakpoints setting remains tricky and still needs some per
  thread breakpoints references.

Todo (in order):

- Support breakpoints perf counter events for perf tools (ie: implement
  perf_bpcounter_event())
- Support from perf tools

Changes in v2:

- Follow the perf "event" rename
- The ptrace regression has been fixed (ptrace breakpoint perf events
  weren't released when a task ended)
- Drop the struct hw_breakpoint and store generic fields in
  perf_event_attr.
- Separate core and arch specific headers, drop
  asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h
- Use new generic len/type for breakpoint
- Handle off case: when breakpoints api is not supported by an arch

Changes in v3:

- Fix broken CONFIG_KVM, we need to propagate the breakpoint api
  changes to kvm when we exit the guest and restore the bp registers
  to the host.

Changes in v4:

- Drop the hw_breakpoint_restore() stub as it is only used by KVM
- EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a
  module
- Restore the breakpoints unconditionally on kvm guest exit:
  TIF_DEBUG_THREAD no longer covers every case of running
  breakpoints and vcpu->arch.switch_db_regs might not always be
  set when the guest used debug registers.
  (Waiting for a reliable optimization)

Changes in v5:

- Split-up the asm-generic/hw-breakpoint.h moving to
  linux/hw_breakpoint.h into a separate patch
- Optimize the breakpoints restoring while switching from kvm guest
  to host. We only want to restore the state if we have active
  breakpoints to the host, otherwise we don't care about messed-up
  address registers.
- Add asm/hw_breakpoint.h to Kbuild
- Fix bad breakpoint type in trace_selftest.c

Changes in v6:

- Fix wrong header inclusion in trace.h (triggered a build
  error with CONFIG_FTRACE_SELFTEST)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jan Kiszka <jan.kiszka@web.de>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
2009-11-08 15:34:42 +01:00
Cyrill Gorcunov e9036b36ee sched: Use root_task_group_empty only with FAIR_GROUP_SCHED
root_task_group_empty is used only with FAIR_GROUP_SCHED
so if we use other scheduler options we get:

  kernel/sched.c:314: warning: 'root_task_group_empty' defined but not used

So move CONFIG_FAIR_GROUP_SCHED up so that it covers
root_task_group_empty().

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20091026192414.GB5321@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 13:15:48 +01:00
Nicolas Pitre 158bc5af3d ARM: 5784/1: fix early boot machine ID mismatch error display
That code was refactored a long time ago, but one particular label
didn't get adjusted properly which broke the listing of supported
machines.

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-11-08 11:58:54 +00:00
Xiaotian Feng de2a47cf2b x86: Fix error return sequence in __ioremap_caller()
The kernel failed to free the memtype if get_vm_area_caller failed in
__ioremap_caller.

This patch introduces an error path to fix this and cleans up the
repetitive error return sequences that contributed to the
creation of the bug.
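
The idea of one shared error path can be sketched as follows. This is a Python model with hypothetical callables standing in for the kernel routines; it does not reproduce __ioremap_caller() and, as a simplification, does not model releasing the vm area itself.

```python
def ioremap_sketch(reserve_memtype, free_memtype, get_vm_area, map_range):
    """Reserve a memtype, then run the steps that may fail.

    All four parameters are hypothetical callables used for illustration.
    Every failure goes through the same cleanup, so no failing step can
    forget to free the memtype.
    """
    reserve_memtype()
    try:
        area = get_vm_area()
        if area is None:
            raise MemoryError("no vm area available")
        if not map_range(area):
            raise MemoryError("mapping failed")
        return area
    except MemoryError:
        # Single shared error path: undo the reservation, report failure.
        free_memtype()
        return None
```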

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <1257389031-20429-1-git-send-email-dfeng@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 12:48:58 +01:00
Randy Dunlap 968c86458a sched: Fix kernel-doc function parameter name
Fix variable name in sched.c kernel-doc notation.

Fixes this DocBook warning:

 Warning(kernel/sched.c:2008): No description found for parameter 'p'
 Warning(kernel/sched.c:2008): Excess function parameter 'k' description in 'kthread_bind'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <4AF4B1BC.8020604@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 11:26:25 +01:00
Ryusuke Konishi c083234f15 nilfs2: fix missing cleanup of gc cache on error cases
This fixes an -rc1 regression brought by the commit:
1cf58fa840 ("nilfs2: shorten freeze
period due to GC in write operation v3").

Although the patch moved out a function call of
nilfs_ioctl_move_blocks() to nilfs_ioctl_clean_segments() from
nilfs_ioctl_prepare_clean_segments(), it didn't move corresponding
cleanup job needed for the error case.

This will move the missing cleanup job to the destination function.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Jiro SEKIBA <jir@unicus.jp>
2009-11-08 19:04:25 +09:00
Ryusuke Konishi 5399dd1fc8 nilfs2: fix kernel oops in error case of nilfs_ioctl_move_blocks
This fixes a kernel oops reported by Markus Trippelsdorf in the email
titled "[NILFS users] kernel Oops while running nilfs_cleanerd".

The oops was caused by a bug of error path in
nilfs_ioctl_move_blocks() function, which was inlined in
nilfs_ioctl_clean_segments().

nilfs_ioctl_move_blocks checks duplication of blocks which will be
moved in garbage collection.  But, the check should have been done
within nilfs_ioctl_move_inode_block() to prevent list corruption among
buffers storing the target blocks.

To fix the kernel oops, this moves forward the duplication check
before the list insertion.

I also tested this for stable trees [2.6.30, 2.6.31].

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: stable <stable@kernel.org>
2009-11-08 19:01:35 +09:00
Arnaldo Carvalho de Melo 8d06367fa7 perf symbols: Use the buildids if present
With this change 'perf record' will intercept PERF_RECORD_MMAP
calls, creating a linked list of DSOs, then when the session
finishes, it will traverse this list and read the buildids,
stashing them at the end of the file and will set up a new
feature bit in the header bitmask.

'perf report' will then notice this feature and populate the
'dsos' list and set the build ids.

When reading the symtabs it will refuse to load from a file that
doesn't have the same build id. This improves the
reliability of the profiler output, as symbols and profiling
data are more likely to match.
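
The matching rule can be sketched as follows. This is a Python model with hypothetical names (`Dso`, `may_load_symbols`); treating a DSO with no recorded build id as loadable is an assumption based on the "if present" wording of the subject line.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dso:
    name: str
    build_id: Optional[str]  # hex build id stored at record time, if any


def may_load_symbols(dso: Dso, file_build_id: Optional[str]) -> bool:
    """Decide whether a symtab read from disk may be used for this DSO."""
    if dso.build_id is None:
        # Assumption: nothing was recorded, so there is nothing to compare.
        return True
    # Refuse files whose build id differs from the recorded one.
    return file_build_id == dso.build_id
```

A file that was rebuilt or replaced after the recording session carries a different build id and is rejected, which is what produces the "continuing without symbols" line in the example output.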

Example:

 [root@doppio ~]# perf report | head
 /home/acme/bin/perf with build id b1ea544ac3746e7538972548a09aadecc5753868 not found, continuing without symbols
  # Samples: 2621434559
  #
  # Overhead          Command                  Shared Object  Symbol
  # ........  ...............  .............................  ......
  #
       7.91%             init  [kernel]        [k] read_hpet
       7.64%             init  [kernel]        [k] mwait_idle_with_hints
       7.60%          swapper  [kernel]        [k] read_hpet
       7.60%          swapper  [kernel]        [k] mwait_idle_with_hints
       3.65%             init  [kernel]        [k] 0xffffffffa02339d9
[root@doppio ~]#

In this case the 'perf' binary was an older one that has since vanished,
so its symbols probably wouldn't match or would cause subtly
different (and misleading) output.

Next patches will support the kernel as well, reading the build
id notes for it and the modules from /sys.

Another patch should also introduce a new plumbing command:

'perf list-buildids'

that will then be used in porcelain that is distro specific to
fetch -debuginfo packages where such buildids are present. This
will in turn allow one to run 'perf record' on one machine
and 'perf report' on another.

Future work on having the buildid sent directly from the kernel
in the PERF_RECORD_MMAP event is needed to close races, as the
DSO can be changed during a 'perf record' session, but this
patch at least helps with non-corner cases and current/older
kernels.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: K. Prasad <prasad@linux.vnet.ibm.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1257367843-26224-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 10:44:36 +01:00
Frederic Weisbecker 444a2a3bcd tracing, perf_events: Protect the buffer from recursion in perf
While tracing using events with perf, if one enables the
lockdep:lock_acquire event, it will infect every other perf
trace event.

Basically, you can enable whatever set of trace events through
perf but if this event is part of the set, the only result we
can get is a long list of lock_acquire events of rcu read lock,
and only that.

This is because of a recursion inside perf.

1) When a trace event is triggered, it will fill a per cpu
   buffer and submit it to perf.

2) Perf will commit this event but will also protect some data
   using rcu_read_lock

3) A recursion appears: rcu_read_lock triggers a lock_acquire
   event that will fill the per cpu event and then submit the
   buffer to perf.

4) Perf detects a recursion and ignores it

5) Perf continues its work on the previous event, but its buffer
   has been overwritten by the lock_acquire event, it has then
   been turned into a lock_acquire event of rcu read lock

The same scenario also happens with lock_release with
rcu_read_unlock().

We could turn the rcu_read_lock() into __rcu_read_lock() to drop
the lock debugging from perf fast path, but that would make us
lose the rcu debugging and that wouldn't prevent other
possible kinds of recursion from perf in the future.

This patch adds a recursion protection based on a counter on the
perf trace per cpu buffers to solve the problem.
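
A counter-based recursion guard of this kind can be sketched as follows. This is a Python model of one per cpu buffer; the class and method names are hypothetical, and interrupt and preemption handling in the real code are not modelled.

```python
class PerCpuTraceBuffer:
    """Toy model of one per cpu trace buffer with a recursion counter."""

    def __init__(self):
        self.recursion = 0
        self.data = None
        self.submitted = []

    def trace(self, event, nested=None):
        """Fill the buffer with `event` and submit it.

        `nested` models a trace event fired while perf handles this one
        (for example lock_acquire triggered by rcu_read_lock()).
        """
        if self.recursion:
            # Already filling this buffer: drop the nested event so the
            # outer event's data is not overwritten.
            return False
        self.recursion += 1
        try:
            self.data = event
            if nested is not None:
                nested()
            self.submitted.append(self.data)
        finally:
            self.recursion -= 1
        return True
```

With the guard in place, a nested `lock_acquire` fired during the submission of another event is dropped, and the outer event is submitted unchanged.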

-v2: Fixed lost whitespace, added reviewed-by tag

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Jason Baron <jbaron@redhat.com>
LKML-Reference: <1257477185-7838-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 10:31:42 +01:00
Hitoshi Mitake bfde82ef51 perf bench: Add subcommand 'bench' to the Makefile
This patch modifies Makefile for new files related to 'bench'
subcommand. The new code is active from this point on.

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257381097-4743-8-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 10:19:20 +01:00
Hitoshi Mitake dcba8848d3 perf bench: Add new subcommand 'bench' to perf.c
This patch modifies perf.c for invoking 'bench' subcommand.

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257381097-4743-7-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 10:19:19 +01:00
Hitoshi Mitake 11bd341c04 perf bench: Modify builtin.h for new prototype
This patch modifies builtin.h to add prototype of cmd_bench().

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257381097-4743-6-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 10:19:19 +01:00
Hitoshi Mitake 629cc35665 perf bench: Add builtin-bench.c: General framework for benchmark suites
This patch adds builtin-bench.c.
builtin-bench.c is a general framework for benchmark suites.

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257381097-4743-5-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 10:19:18 +01:00
Hitoshi Mitake c7d9300f36 perf bench: Add sched-pipe.c: Benchmark for pipe() system call
This patch adds bench/sched-pipe.c.

bench/sched-pipe.c is a benchmark program
to measure performance of pipe() system call.
This benchmark is based on pipe-test-1m.c by Ingo Molnar:

   http://people.redhat.com/mingo/cfs-scheduler/tools/pipe-test-1m.c

Example of use:

% perf bench sched pipe
  (executing 1000000 pipe operations between two tasks)

          Total time:4.499 sec
                  4.499179 usecs/op
                  222262 ops/sec

% perf bench sched pipe -s -l 1000
0.015
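
The measurement idea, bouncing a token between two parties over a pair of pipes and timing the round trips, can be sketched as follows. This is a Python illustration, not bench/sched-pipe.c: it uses two threads rather than two tasks, and the function name `pipe_pingpong` is hypothetical, so its numbers are not comparable with the output above.

```python
import os
import threading
import time


def pipe_pingpong(loops: int = 1000) -> float:
    """Bounce one byte back and forth over two pipes; return elapsed seconds."""
    r1, w1 = os.pipe()  # main -> responder
    r2, w2 = os.pipe()  # responder -> main

    def responder():
        for _ in range(loops):
            os.read(r1, 1)
            os.write(w2, b"x")

    worker = threading.Thread(target=responder)
    start = time.perf_counter()
    worker.start()
    for _ in range(loops):
        os.write(w1, b"x")
        os.read(r2, 1)
    worker.join()
    elapsed = time.perf_counter() - start

    for fd in (r1, w1, r2, w2):
        os.close(fd)
    return elapsed
```

Dividing the returned time by `loops` gives a per-operation figure analogous to the usecs/op line.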

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257381097-4743-4-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 10:19:18 +01:00
Hitoshi Mitake e27454cc63 perf bench: Add sched-messaging.c: Benchmark for scheduler and IPC mechanisms based on hackbench
This patch adds bench/sched-messaging.c.

This benchmark measures performance of scheduler and IPC
mechanisms, and is based on hackbench by Rusty Russell.

Example of usage:

  % perf bench sched messaging -g 20 -l 1000 -s
  5.432                                        # in sec

  % perf bench sched messaging                 # run with default options
  (20 sender and receiver processes per group)
  (10 groups == 400 processes run)

        Total time:0.308 sec

  % perf bench sched messaging -t -g 20        # be multi-thread, with 20 groups
  (20 sender and receiver threads per group)
  (20 groups == 800 threads run)

        Total time:0.582 sec

( Rusty is the original author of hackbench.c and he said the code is
  and was under the GPLv2 so fine to be merged. )

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257381097-4743-3-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 10:19:17 +01:00
Hitoshi Mitake c426bba069 perf bench: Add new directory and header for new subcommand 'bench'
This patch adds bench/ directory and bench/bench.h.

bench/ directory will contain modules for bench subcommand.
bench/bench.h is for listing prototypes of module functions.

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: fweisbec@gmail.com
Cc: Jiri Kosina <jkosina@suse.cz>
LKML-Reference: <1257381097-4743-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 10:19:15 +01:00
Sebastian Siewior 2606289779 net/fsl_pq_mdio: add module license GPL
or it will taint the kernel and fail to load because
of_address_to_resource() is GPL only.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 00:49:04 -08:00
Wolfgang Grandegger 53a0ef866d can: fix WARN_ON dump in net/core/rtnetlink.c:rtmsg_ifinfo()
On older kernels, e.g. 2.6.27, a WARN_ON dump in rtmsg_ifinfo()
is thrown when the CAN device is registered due to insufficient
skb space, as reported by various users. This patch adds the
rtnl_link_ops "get_size" to fix the problem. I think this patch
is required for more recent kernels as well, even if no WARN_ON
dumps are triggered. Maybe we also need "get_xstats_size" for
the CAN xstats.

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 00:45:48 -08:00
Eric Dumazet 6755aebaaf can: should not use __dev_get_by_index() without locks
bcm_proc_getifname() is called with RTNL and dev_base_lock
not held. It calls __dev_get_by_index() without locks, and
this is illegal (might crash).

Close the race by holding dev_base_lock and copying dev->name
in the protected section.
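
The pattern of looking up the device and copying its name inside the same locked section can be sketched as follows. This is a Python model: the `threading.Lock` stands in for dev_base_lock, and the `devices` table, the function name and the "???" placeholder are assumptions made for the illustration.

```python
import threading

dev_base_lock = threading.Lock()  # stands in for the kernel's dev_base_lock
devices = {1: "can0"}             # hypothetical table: ifindex -> device name


def get_ifname(ifindex: int) -> str:
    """Return a private copy of the device name, or a placeholder."""
    with dev_base_lock:
        name = devices.get(ifindex)
        # Take the copy while the lock is held; after the lock is dropped
        # the device could be renamed or unregistered.
        result = str(name) if name is not None else "???"
    return result
```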

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 00:33:43 -08:00
Roel Kluin 88b938e63e sparc64: replace parentheses in pmul()
`>>' has a higher precedence than `?' so src2 evaluated to
either 16 or 0 depending on the bits set in rs2.
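
The effect of the precedence rule can be shown with a small model. This is a Python sketch that assumes the expression had the shape `rs2 >> flag ? 16 : 0`; the exact source line is not quoted in this message, and the function names are hypothetical.

```python
def shift_as_parsed(rs2: int, flag: bool) -> int:
    """How C groups `rs2 >> flag ? 16 : 0`: as `(rs2 >> flag) ? 16 : 0`."""
    return 16 if (rs2 >> int(flag)) else 0


def shift_as_intended(rs2: int, flag: bool) -> int:
    """With the parentheses moved: `rs2 >> (flag ? 16 : 0)`."""
    return rs2 >> (16 if flag else 0)
```

For `rs2 = 0x12340000` with the flag set, the first form yields 16 while the second yields `0x1234`.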

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 00:26:56 -08:00
Takashi Iwai dede17b8e9 Merge branch 'fix/hda' into for-linus 2009-11-08 09:16:15 +01:00
Takashi Iwai f645073961 Merge branch 'fix/misc' into for-linus 2009-11-08 09:16:06 +01:00
Ben Hutchings f37325a956 ALSA: snd-aica: declare MODULE_FIRMWARE
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-11-08 09:13:51 +01:00
Nicolas Pitre 3293576c6b [ARM] orion5x: update defconfig
Signed-off-by: Nicolas Pitre <nico@marvell.com>
2009-11-07 20:59:20 -05:00
Nicolas Pitre ffbfe093b6 [ARM] Kirkwood: update defconfig
Signed-off-by: Nicolas Pitre <nico@marvell.com>
2009-11-07 20:36:02 -05:00
Lennert Buytenhek a1897fa67c [ARM] Kirkwood: clarify PCIe MEM bus/physical address distinction
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
2009-11-07 20:18:24 -05:00
Lennert Buytenhek 35f029e251 [ARM] kirkwood: fix PCI I/O port assignment
Instead of allocating PCI devices I/O port bus addresses from the
000xxxxx I/O port range as intended, due to a bus versus physical
address mixup, the Kirkwood PCIe handling code inadvertently
allocated I/O port bus addresses from the f20xxxxx address range
(which is the physical address range of the PCIe I/O mapping window),
but then directed all I/O port accesses to bus addresses 000xxxxx,
which would then not be decoded at all.

Fix this by setting the base address of the PCIe I/O space struct
resource to KIRKWOOD_PCIE_IO_BUS_BASE instead of the incorrect
KIRKWOOD_PCIE_IO_PHYS_BASE, and fix up __io() to expect addresses
offset by the former instead of the latter.

(The suggested fix of directing I/O port accesses from the host to
bus addresses f20xxxxx instead has the problem that assigning full
32bit I/O port bus addresses (f20xxxxx) doesn't work on all PCI
devices, as not all PCI devices implement full 32 bit BAR registers
for I/O ports.  We should really try to allocate I/O port bus
addresses that fit in 16 bits.)

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
2009-11-07 20:14:21 -05:00
Yong Zhang e7e7e0c084 genirq: try_one_irq() must be called with irq disabled
Prarit reported:
=================================
[ INFO: inconsistent lock state ]
2.6.32-rc5 #1
---------------------------------
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (&irq_desc_lock_class){?.-...}, at: [<ffffffff810c264e>] try_one_irq+0x32/0x138
{IN-HARDIRQ-W} state was registered at:
 [<ffffffff81095160>] __lock_acquire+0x2fc/0xd5d
 [<ffffffff81095cb4>] lock_acquire+0xf3/0x12d
 [<ffffffff814cdadd>] _spin_lock+0x40/0x89
 [<ffffffff810c3389>] handle_level_irq+0x30/0x105
 [<ffffffff81014e0e>] handle_irq+0x95/0xb7
 [<ffffffff810141bd>] do_IRQ+0x6a/0xe0
 [<ffffffff81012813>] ret_from_intr+0x0/0x16
irq event stamp: 195096
hardirqs last  enabled at (195096): [<ffffffff814cd7f7>] _spin_unlock_irq+0x3a/0x5c
hardirqs last disabled at (195095): [<ffffffff814cdbdd>] _spin_lock_irq+0x29/0x95
softirqs last  enabled at (195088): [<ffffffff81068c92>] __do_softirq+0x1c1/0x1ef
softirqs last disabled at (195093): [<ffffffff8101304c>] call_softirq+0x1c/0x30

other info that might help us debug this:
1 lock held by swapper/0:
 #0:  (kernel/irq/spurious.c:21){+.-...}, at: [<ffffffff81070cf2>]
run_timer_softirq+0x1a9/0x315

stack backtrace:
Pid: 0, comm: swapper Not tainted 2.6.32-rc5 #1
Call Trace:
 <IRQ>  [<ffffffff81093e94>] valid_state+0x187/0x1ae
 [<ffffffff81093fe4>] mark_lock+0x129/0x253
 [<ffffffff810951d4>] __lock_acquire+0x370/0xd5d
 [<ffffffff81095cb4>] lock_acquire+0xf3/0x12d
 [<ffffffff814cdadd>] _spin_lock+0x40/0x89
 [<ffffffff810c264e>] try_one_irq+0x32/0x138
 [<ffffffff810c2795>] poll_all_shared_irqs+0x41/0x6d
 [<ffffffff810c27dd>] poll_spurious_irqs+0x1c/0x49
 [<ffffffff81070d82>] run_timer_softirq+0x239/0x315
 [<ffffffff81068bd3>] __do_softirq+0x102/0x1ef
 [<ffffffff8101304c>] call_softirq+0x1c/0x30
 [<ffffffff81014b65>] do_softirq+0x59/0xca
 [<ffffffff810686ad>] irq_exit+0x58/0xae
 [<ffffffff81029b84>] smp_apic_timer_interrupt+0x94/0xba
 [<ffffffff81012a33>] apic_timer_interrupt+0x13/0x20

The reason is that try_one_irq() is called from hardirq context with
interrupts disabled and from softirq context (poll_all_shared_irqs())
with interrupts enabled.

Disable interrupts before calling it from poll_all_shared_irqs().

Reported-and-tested-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
LKML-Reference: <1257563773-4620-1-git-send-email-yong.zhang0@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-11-07 21:44:45 +01:00
Michael Krufky 22370ef503 V4L/DVB (13314): saa7134: set ts_force_val for the Hauppauge WinTV HVR-1150
The Hauppauge WinTV HVR-1150 retail boards require the FORCE_TS_VALID bit
to be set in order to function properly. This change will work on the early
revisions of the board as well, but the final revision will not function
without this change.

Signed-off-by: Michael Krufky <mkrufky@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:15 -02:00
Michael Krufky 4007a672ab V4L/DVB (13313): saa7134: add support for FORCE_TS_VALID mode for mpeg ts input
When FORCE_TS_VALID mode is enabled, the saa713x will accept MPEG TS input
without requiring TS_VALID set high.  This is required for some new boards
to function properly, due to the hardware design implementation.

The configuration is toggled within the board setup configuration.  Boards
that do not have this bit set will function as before with no change.

Signed-off-by: Michael Krufky <mkrufky@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:15 -02:00
Laurent Pinchart 2e8961330e V4L/DVB (13311): uvcvideo: Fix compilation warning with 2.6.32 due to type mismatch with abs()
The abs() macro has changed in 2.6.32 and returns a long instead of an
int. Fix the driver to avoid compilation warnings.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:14 -02:00
Laurent Pinchart a2e35af6c7 V4L/DVB (13309): uvcvideo: Ignore the FIX_BANDWIDTH for compressed video
The FIX_BANDWIDTH quirk tries to work around cameras requesting the
maximum bandwidth regardless of the image size by computing a bandwidth
estimate. This works only for uncompressed frames.
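
Why the estimate only makes sense for uncompressed video can be seen from the arithmetic. This is a simplified Python sketch, not the uvcvideo driver's formula; the function name and parameters are hypothetical.

```python
from typing import Optional


def estimate_bandwidth(width: int, height: int, bits_per_pixel: int,
                       fps: int, compressed: bool) -> Optional[int]:
    """Estimate the payload bandwidth in bytes per second.

    For compressed formats the frame size is not determined by the image
    dimensions, so no estimate is made and None is returned.
    """
    if compressed:
        return None
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * fps
```

For a 640x480 stream at 16 bits per pixel and 30 frames per second this gives 18432000 bytes per second.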

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:14 -02:00
Jean Delvare 7157fbd0ed V4L/DVB (13287): ce6230 - saa7164-cmd: Fix wrong sizeof
Which is why I have always preferred sizeof(struct foo) over
sizeof(var).

Cc: Antti Palosaari <crope@iki.fi>
Acked-by: Steven Toth <stoth@kernellabs.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:13 -02:00
Jonathan Cameron d514edac5d V4L/DVB (13286): pxa-camera: Fix missing sched.h
Required for wakeup call.

Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk>
Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:13 -02:00
Theodore Kilgore 32345b0596 V4L/DVB (13264): gspca_mr97310a: Change vstart for CIF sensor type 1 cams
This fixes the distortion at the end of the frame, and avoids the bad frame
dropping done because of this distortion, tripling the framerate!

Signed-off-by: Theodore Kilgore <kilgota@banach.math.auburn.edu>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:12 -02:00
Erik Andrén 81191f694c V4L/DVB (13257): gspca - m5602-s5k4aa: Add vflip for Fujitsu Amilo Xi 2528
Adds a vflip quirk for the Fujitsu Amilo Xi 2528. Thanks to Evgeny for the report.

Signed-off-by: Erik Andrén <erik.andren@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:12 -02:00
Erik Andrén 2339a1887d V4L/DVB (13256): gspca - m5602-s5k4aa: Add another MSI GX700 vflip quirk
Adds another vflip quirk for the MSI GX700.
Thanks to John Katzmaier for reporting.

Signed-off-by: Erik Andrén <erik.andren@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:11 -02:00
Erik Andrén b6ef8836c1 V4L/DVB (13255): gspca - m5602-s5k4aa: Add vflip quirk for the Bruneinit laptop
Adds a vflip quirk for the Bruneinit laptop. Thanks to Jörg for the report.

Signed-off-by: Erik Andrén <erik.andren@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:11 -02:00
Stefan Richter dcff9cfe43 V4L/DVB (13240): firedtv: fix regression: tuning fails due to bogus error return
Since 2.6.32(-rc1), DVB core checks the return value of
dvb_frontend_ops.set_frontend.  Now it becomes apparent that firedtv
always returned a bogus value from its set_frontend method.

CC: stable@kernel.org
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:11 -02:00
Henrik Kurelid c94115ffc4 V4L/DVB (13237): firedtv: length field corrupt in ca2host if length>127
This solves a problem in firedtv that has become major for Swedish DVB-T
users the last month or so.  It will most likely solve issues seen by
other users as well.

If the length of an AVC message is greater than 127, the length field
should be encoded in LV mode instead of V mode. V mode can only be used
if the length is 127 or less. This patch ensures that the CA_PMT
message is always encoded in LV mode so that PMT messages of greater length
can be supported.
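
The two encodings can be sketched as follows. The byte layout is an assumption
modelled on the ASN.1 BER-style length field (one byte for 0..127, otherwise
0x80 plus the count of length bytes, then the length big-endian); the function
names are illustrative and are not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "V mode": a single byte holding the length, valid for 0..127 only. */
static size_t encode_len_short(uint8_t *out, uint16_t len)
{
	assert(len <= 127);
	out[0] = (uint8_t)len;
	return 1;
}

/* "LV mode": 0x80 | number of length bytes, then the length, big-endian. */
static size_t encode_len_long(uint8_t *out, uint16_t len)
{
	out[0] = 0x80 | 2;
	out[1] = (uint8_t)(len >> 8);
	out[2] = (uint8_t)(len & 0xff);
	return 3;
}
```

Always using the long form costs two extra bytes for short messages but
removes the need to branch on the length.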

Signed-off-by: Henrik Kurelid <henrik@kurelid.se>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:10 -02:00
Mike Isely 1f95725755 V4L/DVB (13230): s2255drv: Don't conditionalize video buffer completion on waiting processes
The s2255 driver had logic which aborted processing of a video frame
if there was no process waiting on the video buffer in question.  That
simply doesn't work when the application is doing things in an
asynchronous manner.  If the application went to the trouble to queue
the buffer in the first place, then the driver should always attempt
to complete it - even if the application at that moment has its
attention turned elsewhere.  Applications which always blocked waiting
for I/O on the capture device would not have been affected by this.
Applications which *mostly* blocked waiting for I/O on the capture
device probably only would have been somewhat affected (frame lossage,
at a rate which goes up as the application blocks less).  Applications
which never blocked on the capture device (e.g. polling only) however
would never have been able to receive any video frames, since in that
case this "is anyone waiting on this?" check on the buffer never would
have evaluated true.  This patch just deletes that harmful check
against the buffer's wait queue.

Signed-off-by: Mike Isely <isely@pobox.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
CC: stable@kernel.org
2009-11-07 12:55:10 -02:00
Michael Krufky 78c948ab0c V4L/DVB (13202): smsusb: add autodetection support for three additional Hauppauge USB IDs
Add support for three new Hauppauge Device USB IDs:

2040:b900
2040:b910
2040:c000
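
As a standalone illustration of matching on these IDs (struct usb_id and
is_new_hauppauge_id() are hypothetical; the driver itself uses the kernel's
USB device table rather than this lookup):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vendor:product pair, for illustration only. */
struct usb_id {
	uint16_t vendor;
	uint16_t product;
};

/* The three IDs listed above. */
static const struct usb_id new_hauppauge_ids[] = {
	{ 0x2040, 0xb900 },
	{ 0x2040, 0xb910 },
	{ 0x2040, 0xc000 },
};

/* Returns 1 if vendor:product is one of the newly listed IDs, else 0. */
static int is_new_hauppauge_id(uint16_t vendor, uint16_t product)
{
	size_t i;
	size_t n = sizeof(new_hauppauge_ids) / sizeof(new_hauppauge_ids[0]);

	for (i = 0; i < n; i++) {
		if (new_hauppauge_ids[i].vendor == vendor &&
		    new_hauppauge_ids[i].product == product)
			return 1;
	}
	return 0;
}
```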

Signed-off-by: Michael Krufky <mkrufky@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:09 -02:00
Devin Heitmueller 96fbf771d8 V4L/DVB (13190): em28xx: fix panic that can occur when starting audio streaming
Because the counters were not reset when starting up streaming, they would
be reused from the previous run.  As a result, when a second instance of
streaming starts up, the "cnt" variable in em28xx_audio_isocirq() can end up
being negative, resulting in an attempt to write to memory before the start
of runtime->dma_area (as well as having a negative number of bytes to copy).
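
The shape of the fix can be sketched as below. struct audio_state and its
field names are hypothetical, not the driver's; the point is only that every
position counter is zeroed each time streaming starts, so nothing carries over
from the previous run.

```c
/* Hypothetical per-stream audio state; field names are illustrative. */
struct audio_state {
	unsigned int hwptr_done;	/* position already handed to the consumer */
	unsigned int transfer_done;	/* progress within the current period */
};

/* Reset all counters at stream start so a new run begins from zero. */
static void audio_stream_start(struct audio_state *s)
{
	s->hwptr_done = 0;
	s->transfer_done = 0;
}
```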

Signed-off-by: Devin Heitmueller <dheitmueller@kernellabs.com>
CC: stable@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:08 -02:00
Mike Isely 2de26c0a4a V4L/DVB (13170): bttv: Fix reversed polarity error when switching video standard
The bttv driver function which handles switching of the video standard
(set_tvnorm() in bttv-driver.c) includes a check which can optionally
also reset the cropping configuration to a default value.  It is
"optional" based on a comparison of the cropcap parameters of the
previous vs the newly requested video standard.  The comparison is
being done with a memcmp(), a function which only returns a true value
if the comparison actually fails.

This if-statement appears to have been written to assume wrong
memcmp() semantics.  That is, it was re-initializing the cropping
configuration only if the new video standard did NOT have different
cropcap values.  That doesn't make any sense.  One definitely should
reset things if the cropcap parameters are different - if there's any
comparison to be made at all.

The effect of this problem was that a transition from, say, PAL to
NTSC would leave in place old cropping setup that made sense for the
PAL geometry but not for NTSC.  If the application doesn't care about
cropping it also won't try to reset the cropping configuration,
resulting in an improperly cropped video frame.  In the case I was
testing this actually caused black video frames to be displayed.

Another interesting effect of this bug is that if one does something
which does NOT change the video standard and this function is run,
then the cropping setup gets reset anyway - again because of the
backwards comparison.  It turns out that just running anything which
merely opens and closes the video device node (e.g. v4l-info) will
cause this to happen.  One can argue that simply opening the device
node and not doing anything to it should not mess with any of its
state - but because of this behavior, any TV app which does such
things (e.g. xawtv) probably therefore doesn't see the problem.

The solution is to fix the sense of the if-statement.  It's easy to
see how this mistake could have been made given how memcmp() works.
The patch is therefore removal of a single "!" character from the
if-statement in set_tvnorm in bttv-driver.c.
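
The memcmp() semantics in question, as a standalone sketch (struct
crop_params and crop_needs_reset() are hypothetical stand-ins, not the bttv
code): memcmp() returns 0 when the two regions are equal and nonzero when they
differ, so "reset if different" must test for a nonzero result.

```c
#include <string.h>

/* Hypothetical stand-in for per-standard cropcap parameters (all ints, no padding). */
struct crop_params {
	int left;
	int top;
	int width;
	int height;
};

/* Returns 1 when the cropping setup should be reset, i.e. the parameters differ. */
static int crop_needs_reset(const struct crop_params *old_std,
			    const struct crop_params *new_std)
{
	/* memcmp() == 0 means "equal"; writing !memcmp() here would invert the test. */
	return memcmp(old_std, new_std, sizeof(*old_std)) != 0;
}
```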

Signed-off-by: Mike Isely <isely@pobox.com>
CC: stable@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:08 -02:00
Mike Isely 66349b4e7a V4L/DVB (13169): bttv: Fix potential out-of-order field processing
There is a subtle interaction in the bttv driver which can result in
fields being repeatedly processed out of order.  This is a problem
specifically when running in V4L2_FIELD_ALTERNATE mode (probably the
most common case).

1. The determination of which fields are associated with which buffers
happens in videobuf, before the bttv driver gets a chance to queue the
corresponding DMA.  Thus by the point when the DMA is queued for a
given buffer, the algorithm has to do the queuing based on the
buffer's already assigned field type - not based on which field is
"next" in the video stream.

2. The driver normally tries to queue both the top and bottom fields
at the same time (see bttv_irq_next_video()).  It tries to sort out
top vs bottom by looking at the field type for the next 2 available
buffers and assigning them appropriately.

3. However the bttv driver *always* actually processes the top field
first.  There's even an interrupt set aside for specifically
recognizing when the top field has been processed so that it can be
marked done even while the bottom field is still being DMAed.

Given all of the above, if one gets into a situation where
bttv_irq_next_video() gets entered when the first available buffer has
been pre-associated as a bottom field, then the function is going to
process the buffers out of order.  That first available buffer will be
put into the bottom field slot and the buffer after that will be put
into the top field slot.  Problem is, since the top field is always
processed first by the driver, then that second buffer (the one after
the first available buffer) will be the first one to be finished.
Because of the strict fifo handling of all video buffers, then that
top field won't be seen by the app until after the bottom field is
also processed.  Worse still, the app will get back the
chronologically later bottom field first, *before* the top field is
received.  The buffer's timestamps will even be backwards.

While not fatal to most TV apps, this behavior can subtly degrade
userspace deinterlacing (probably will cause jitter).  That's probably
why it has gone unnoticed.  But it will also cause serious problems if
the app in question discards all but the latest received buffer (a
latency minimizing tactic) - causing one field to only ever be
displayed since the other is now always late.  Unfortunately once you
get into this state, you're stuck this way - because having consumed
two buffers, now the next time around the "first" available buffer
will again be a bottom field and the same thing happens.

How can we get into this state?  In a perfect world, where there's
always a few free buffers queued to the driver, it should be
impossible.  However if something disrupts streaming, e.g. if the
userspace app can't queue free buffers fast enough for a moment due
perhaps to a CPU scheduling glitch, then the driver can get
momentarily starved and some number of fields will be dropped.  That's
OK.  But if an odd number of fields get dropped, then that "first"
available buffer might be the bottom field and now we're stuck...

This patch fixes that problem by deliberately only setting up a single
field for one frame if we don't get a top field as the first available
buffer.  By purposely skipping the other field, then we only handle a
single buffer thus bringing things back into proper sync (i.e. top
field first) for the next frame.  To do this we just drop the few
lines in bttv_irq_next_video() that attempt to set up the second
buffer when that second buffer isn't for the bottom field.

This is definitely a problem when in V4L2_FIELD_ALTERNATE mode.  In
the other modes this change either has no effect or doesn't harm
things any further anyway.
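
The resulting pairing rule can be sketched as a small decision function. The
enum and buffers_to_queue() are hypothetical and only model the behavior
described above; they are not the code in bttv_irq_next_video().

```c
/* Field type pre-assigned to a buffer; FIELD_NONE means no buffer is available. */
enum field { FIELD_NONE, FIELD_TOP, FIELD_BOTTOM };

/* Returns how many of the next two available buffers to set up for one frame. */
static int buffers_to_queue(enum field first, enum field second)
{
	if (first == FIELD_NONE)
		return 0;
	if (first == FIELD_TOP && second == FIELD_BOTTOM)
		return 2;	/* in sync: top field, then bottom field */

	/*
	 * First buffer is a bottom field (or a top field with no bottom
	 * field behind it): set up only that one buffer, so the next
	 * frame starts on a top field again.
	 */
	return 1;
}
```

With buffers arriving as bottom, top, bottom, top after an odd number of drops,
the first call consumes one buffer and every later call sees a top field first
and consumes two.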

Signed-off-by: Mike Isely <isely@pobox.com>
CC: stable@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:07 -02:00
HIRANO Takahito e0a7e8a621 V4L/DVB (13167): pt1: Fix a compile error on arm
The lack of #include <linux/vmalloc.h> caused a compile error on some
architectures.

Signed-off-by: HIRANO Takahito <hiranotaka@zng.info>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2009-11-07 12:55:07 -02:00