Commit Graph

37322 Commits

Author SHA1 Message Date
Ingo Molnar 8178d00050 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perfcounters into perfcounters/core 2009-08-18 09:06:25 +02:00
Paul Mackerras 20002ded4d perf_counter: powerpc: Add callchain support
This adds support for tracing callchains for powerpc, both 32-bit
and 64-bit, and both in the kernel and userspace, from PMU interrupt
context.

The first three entries stored for each callchain are the NIP (next
instruction pointer), LR (link register), and the contents of the LR
save area in the second stack frame (the first is ignored because the
ABI convention on powerpc is that functions save their return address
in their caller's stack frame).  Because leaf functions don't have to
save their return address (LR value) and don't have to establish a
stack frame, it's possible for either or both of LR and the second
stack frame's LR save area to have valid return addresses in them.
This is basically impossible to disambiguate without either reading
the code or looking at auxiliary information such as CFI tables.
Since we don't want to do either of those things at interrupt time,
we store both LR and the second stack frame's LR save area.

Once we get past the second stack frame, there is no ambiguity; all
return addresses we get are reliable.

For kernel traces, we check whether they are valid kernel instruction
addresses and store zero instead if they are not (rather than
omitting them, which would make it impossible for userspace to know
which was which).  We also store zero instead of the second stack
frame's LR save area value if it is the same as LR.

For kernel traces, we check for interrupt frames, and for user traces,
we check for signal frames.  In each case, since we're starting a new
trace, we store a PERF_CONTEXT_KERNEL/USER marker so that userspace
knows that the next three entries are NIP, LR and the second stack frame
for the interrupted context.

We read user memory with __get_user_inatomic.  On 64-bit, if this
PMU interrupt occurred while interrupts are soft-disabled, and
there is no MMU hash table entry for the page, we will get an
-EFAULT return from __get_user_inatomic even if there is a valid
Linux PTE for the page, since hash_page isn't reentrant.  Thus we
have code here to read the Linux PTE and access the page via the
kernel linear mapping.  Since 64-bit doesn't use (or need) highmem
there is no need to do kmap_atomic.  On 32-bit, we don't do soft
interrupt disabling, so this complication doesn't occur and there
is no need to fall back to reading the Linux PTE, since hash_page
(or the TLB miss handler) will get called automatically if necessary.

Note that we cannot get PMU interrupts in the interval during
context switch between switch_mm (which switches the user address
space) and switch_to (which actually changes current to the new
process).  On 64-bit this is because interrupts are hard-disabled
in switch_mm and stay hard-disabled until they are soft-enabled
later, after switch_to has returned.  So there is no possibility
of trying to do a user stack trace when the user address space is
not current's address space.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-08-18 14:48:47 +10:00
Paul Mackerras 9c1e105238 powerpc: Allow perf_counters to access user memory at interrupt time
This provides a mechanism to allow the perf_counters code to access
user memory in a PMU interrupt routine.  Such an access can cause
various kinds of interrupt: SLB miss, MMU hash table miss, segment
table miss, or TLB miss, depending on the processor.  This commit
only deals with 64-bit classic/server processors, which use an MMU
hash table.  32-bit processors are already able to access user memory
at interrupt time.  Since we don't soft-disable on 32-bit, we avoid
the possibility of reentering hash_page or the TLB miss handlers,
since they run with interrupts disabled.

On 64-bit processors, an SLB miss interrupt on a user address will
update the slb_cache and slb_cache_ptr fields in the paca.  This is
OK except in the case where a PMU interrupt occurs in switch_slb,
which also accesses those fields.  To prevent this, we hard-disable
interrupts in switch_slb.  Interrupts are already soft-disabled at
this point, and will get hard-enabled when they get soft-enabled
later.

This also reworks slb_flush_and_rebolt: to avoid hard-disabling twice,
and to make sure that it clears the slb_cache_ptr when called from
other callers than switch_slb, the existing routine is renamed to
__slb_flush_and_rebolt, which is called by switch_slb and the new
version of slb_flush_and_rebolt.

Similarly, switch_stab (used on POWER3 and RS64 processors) gets a
hard_irq_disable() to protect the per-cpu variables used there and
in ste_allocate.

If a MMU hashtable miss interrupt occurs, normally we would call
hash_page to look up the Linux PTE for the address and create a HPTE.
However, hash_page is fairly complex and takes some locks, so to
avoid the possibility of deadlock, we check the preemption count
to see if we are in a (pseudo-)NMI handler, and if so, we don't call
hash_page but instead treat it like a bad access that will get
reported up through the exception table mechanism.  An interrupt
whose handler runs even though the interrupt occurred when
soft-disabled (such as the PMU interrupt) is considered a pseudo-NMI
handler, which should use nmi_enter()/nmi_exit() rather than
irq_enter()/irq_exit().

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-08-18 14:48:43 +10:00
Paul Mackerras 1660e9d3d0 powerpc/32: Always order writes to halves of 64-bit PTEs
On 32-bit systems with 64-bit PTEs, the PTEs have to be written in two
32-bit halves.  On SMP we write the higher-order half and then the
lower-order half, with a write barrier between the two halves, but on
UP there was no particular ordering of the writes to the two halves.

This extends the ordering that we already do on SMP to the UP case as
well.  The reason is that with the perf_counter subsystem potentially
accessing user memory at interrupt time to get stack traces, we have
to be careful not to create an incorrect but apparently valid PTE even
on UP.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-08-18 14:48:39 +10:00
Ingo Molnar be750231ce Merge branch 'perfcounters/urgent' into perfcounters/core
Conflicts:
	kernel/perf_counter.c

Merge reason: update to latest upstream (-rc6) and resolve
              the conflict with urgent fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15 12:06:12 +02:00
Linus Torvalds 3493e84de6 Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf_counter: Report the cloning task as parent on perf_counter_fork()
  perf_counter: Fix an ipi-deadlock
  perf: Rework/fix the whole read vs group stuff
  perf_counter: Fix swcounter context invariance
  perf report: Don't show unresolved DSOs and symbols when -S/-d is used
  perf tools: Add a general option to enable raw sample records
  perf tools: Add a per tracepoint counter attribute to get raw sample
  perf_counter: Provide hw_perf_counter_setup_online() APIs
  perf list: Fix large list output by using the pager
  perf_counter, x86: Fix/improve apic fallback
  perf record: Add missing -C option support for specifying profile cpu
  perf tools: Fix dso__new handle() to handle deleted DSOs
  perf tools: Fix fallback to cplus_demangle() when bfd_demangle() is not available
  perf report: Show the tid too in -D
  perf record: Fix .tid and .pid fill-in when synthesizing events
  perf_counter, x86: Fix generic cache events on P6-mobile CPUs
  perf_counter, x86: Fix lapic printk message
2009-08-13 12:24:33 -07:00
Linus Torvalds 1c2ffff407 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix oops in identify_cpu() on CPUs without CPUID
  x86: Clear incorrectly forced X86_FEATURE_LAHF_LM flag
  x86, mce: therm_throt - change when we print messages
  x86: Add reboot quirk for every 5 series MacBook/Pro
2009-08-13 12:08:44 -07:00
Magnus Damm dbefd606a3 sh: fix i2c init order on ap325rxa V2
Convert the AP325RXA board code to register devices at
arch_initcall() time instead of device_initcall(). This
fix unbreaks pcf8563 RTC driver support.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-13 11:43:43 +09:00
Magnus Damm ba3a170191 sh: fix i2c init order on Migo-R V2
Convert the Migo-R board code to register devices at
arch_initcall() time instead of __initcall(). This fix
unbreaks migor_ts touch screen driver support.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-13 11:39:02 +09:00
Magnus Damm ba9a633787 sh: convert processor device setup functions to arch_initcall()
Convert the processor platform device setup
functions from __initcall() and sometimes
device_initcall() to arch_initcall().

This makes sure that the platform devices are
registered a bit earlier so the devices are
available when drivers register using initcall
levels earlier than device_initcall().

A good example is platform devices needed by
i2c-sh_mobile.c which registers a bit earlier
using subsys_initcall().

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-13 11:36:33 +09:00
Ingo Molnar 04da8a43da perf_counter, x86: Fix/improve apic fallback
Johannes Stezenbach reported that his Pentium-M based
laptop does not have the local APIC enabled by default,
and hence perfcounters do not get initialized.

Add a fallback for this case: allow non-sampled counters
and return with an error on sampled counters. This allows
'perf stat' to work out of box - and allows 'perf top'
and 'perf record' to fall back on a hrtimer based sampling
method.

( Passing 'lapic' on the boot line will allow hardware
  sampling to occur - but if the APIC is disabled
  permanently by the hardware then this fallback still
  allows more systems to use perfcounters. )

Also decouple perfcounter support from X86_LOCAL_APIC.

-v2: fix typo breaking counters on all other systems ...

Reported-by: Johannes Stezenbach <js@sig21.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-12 14:12:49 +02:00
Ondrej Zary e8055139d9 x86: Fix oops in identify_cpu() on CPUs without CPUID
Kernel is broken for x86 CPUs without CPUID since 2.6.28. It
crashes with NULL pointer dereference in identify_cpu():

766        generic_identify(c);
767
768-->     if (this_cpu->c_identify)
769               this_cpu->c_identify(c);

this_cpu is NULL. This is because it's only initialized in
get_cpu_vendor() function, which is not called if the CPU has
no CPUID instruction.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
LKML-Reference: <200908112000.15993.linux@rainbow-software.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-12 11:49:41 +02:00
Roel Kluin e7369e01eb arch/ia64/kernel/iosapic: missing test after ioremap()
Missing test after ioremap()

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Fenghua Yu <fenghua.yu@intel.com>
2009-08-11 14:52:11 -07:00
Fenghua Yu 5359dffd43 ia64/topology.c: exit cache_add_dev when kobject_init_and_add fails
Make cache_add_dev exit sysfs when kobject_init_and_add returns an error.

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
2009-08-11 14:52:11 -07:00
Fenghua Yu bf2a4c7270 arch/ia64/Makefile: Remove -mtune=merced in IA64 kernel build
Between GCC version 3.4.0 and 4.3.3 (including 3.4.0 and 4.3.3), -mtune=merced
is implemented in GCC. Starting from 4.4.0, -mtune=merced is deprecated.

Even implemented in versions between 3.4.0 and 4.3.3, the -mtune=merced
feature has been broken in some of the versions. For example, GCC 4.1.2 reports
interanl tuning function errors during kernel building with -mtune=merced. Or
GCC Bugzilla 16130 reports another -mtune=merced issue on GCC 3.4.1.

So I would remove the -mtune=merced from IA64 kernel build. Without this option,
kernel on Merced will remain the same except losing an unstable and out-of-date
performance tunning feature.

Since GCC version 3.4.0, -mtune=mckinley has been implemented. The
-mtune=mckinley option functions the same as mtune=itanium2. And mtune=itanium2
is the default option. So we don't need to add mtune=mckinley either since its
been the default option in any GCC version which implements this option.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
2009-08-11 14:52:11 -07:00
Jaswinder Singh Rajput b5a8879347 IA64: includecheck fix: ia64, pgtable.h
fix the following 'make includecheck' warning:

  arch/ia64/include/asm/pgtable.h: asm/processor.h is included more than once.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Acked-by: Fenghua Yu <fenghua.yu@intel.com>
2009-08-11 14:52:11 -07:00
Jaswinder Singh Rajput cfa5f809e3 IA64: includecheck fix: ia64, ia64_ksyms.c
fix the following 'make includecheck' warning:

  arch/ia64/kernel/ia64_ksyms.c: asm/page.h is included more than once.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Acked-by: Fenghua Yu <fenghua.yu@intel.com>
2009-08-11 14:52:10 -07:00
Johannes Weiner 8d6f9af919 ia64: boolean __test_and_clear_bit
__test_and_clear_bit() returns a bitfield with the tested-for bit set.
Make it consistent with the other bitops - of ia64 but also every
other architecture - and return a boolean value.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Fenghua Yu <fenghua.yu@intel.com>
2009-08-11 14:52:10 -07:00
Fenghua Yu 51b89f7a66 Bug Fix arch/ia64/kernel/pci-dma.c: fix recursive dma_supported() call in iommu_dma_supported()
In commit 160c1d8e40,
dma_ops->dma_supported = iommu_dma_supported;

This dma_ops->dma_supported is first called in platform_dma_init() during kernel
boot. Then dma_ops->dma_supported will be called recursively in
iommu_dma_supported.

Kernel can not boot because kernel can not get out of iommu_dma_supported until
it runs out of stack memory.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
2009-08-11 14:52:10 -07:00
Kevin Winchester fbd8b1819e x86: Clear incorrectly forced X86_FEATURE_LAHF_LM flag
Due to an erratum with certain AMD Athlon 64 processors, the
BIOS may need to force enable the LAHF_LM capability.
Unfortunately, in at least one case, the BIOS does this even
for processors that do not support the functionality.

Add a specific check that will clear the feature bit for
processors known not to support the LAHF/SAHF instructions.

Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
Acked-by: Borislav Petkov <petkovbb@googlemail.com>
LKML-Reference: <4A80A5AD.2000209@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-11 13:34:54 +02:00
Ingo Molnar f64ccccb8a perf_counter, x86: Fix generic cache events on P6-mobile CPUs
Johannes Stezenbach reported that 'perf stat' does not count
cache-miss and cache-references events on his Pentium-M based
laptop.

This is because we left them blank in p6_perfmon_event_map[],
fill them in.

Reported-by: Johannes Stezenbach <js@sig21.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-11 11:35:26 +02:00
Ingo Molnar 3c581a7f94 perf_counter, x86: Fix lapic printk message
Instead of this garbled bootup on UP Pentium-M systems:

[    0.015048] Performance Counters:
[    0.016004] no Local APIC, try rebooting with lapicno PMU driver, software counters only.

Print:

[    0.015050] Performance Counters:
[    0.016004] no APIC, boot with the "lapic" boot parameter to force-enable it.
[    0.017003] no PMU driver, software counters only.

Cf: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-11 10:58:25 +02:00
Dmitry Torokhov 0d01f31439 x86, mce: therm_throt - change when we print messages
My Latitude d630 seems to be handling thermal events in SMI by
lowering the max frequency of the CPU till it cools down but
still leaks the "everything is normal" events.

This spams the console and with high priority printks.

Adjust therm_throt driver to only print messages about the fact
that temperatire returned back to normal when leaving the
throttling state.

Also lower the severity of "back to normal" message from
KERN_CRIT to KERN_INFO.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Acked-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <20090810051513.0558F526EC9@mailhub.coreip.homeip.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-11 09:54:17 +02:00
Shunichi Fuji 3e03bbeac5 x86: Add reboot quirk for every 5 series MacBook/Pro
Reboot does not work on my MacBook Pro 13 inch (MacBookPro5,5)
too. It seems all unibody MacBook and MacBookPro require
PCI reboot handling, i guess.

Following model/machine ID list shows unibody MacBook/Pro have
the 5 series of model number:

   http://www.everymac.com/systems/by_capability/macs-by-machine-model-machine-id.html

Signed-off-by: Shunichi Fuji <palglowr@gmail.com>
Cc: Ozan Çağlayan <ozan@pardus.org.tr>
LKML-Reference: <30046e3b0908101134p6487ddbftd8776e4ddef204be@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-10 20:59:42 +02:00
Linus Torvalds d00aa6695b Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
  perf_counter: Zero dead bytes from ftrace raw samples size alignment
  perf_counter: Subtract the buffer size field from the event record size
  perf_counter: Require CAP_SYS_ADMIN for raw tracepoint data
  perf_counter: Correct PERF_SAMPLE_RAW output
  perf tools: callchain: Fix bad rounding of minimum rate
  perf_counter tools: Fix libbfd detection for systems with libz dependency
  perf: "Longum est iter per praecepta, breve et efficax per exempla"
  perf_counter: Fix a race on perf_counter_ctx
  perf_counter: Fix tracepoint sampling to be part of generic sampling
  perf_counter: Work around gcc warning by initializing tracepoint record unconditionally
  perf tools: callchain: Fix sum of percentages to be 100% by displaying amount of ignored chains in fractal mode
  perf tools: callchain: Fix 'perf report' display to be callchain by default
  perf tools: callchain: Fix spurious 'perf report' warnings: ignore empty callchains
  perf record: Fix the -A UI for empty or non-existent perf.data
  perf util: Fix do_read() to fail on EOF instead of busy-looping
  perf list: Fix the output to not include tracepoints without an id
  perf_counter/powerpc: Fix oops on cpus without perf_counter hardware support
  perf stat: Fix tool option consistency: rename -S/--scale to -c/--scale
  perf report: Add debug help for the finding of symbol bugs - show the symtab origin (DSO, build-id, kernel, etc)
  perf report: Fix per task mult-counter stat reporting
  ...
2009-08-10 11:48:51 -07:00
Linus Torvalds a3263969b0 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix serialization in pit_expect_msb()
2009-08-10 11:11:40 -07:00
Linus Torvalds b6e61eef4f x86: Fix serialization in pit_expect_msb()
Wei Chong Tan reported a fast-PIT-calibration corner-case:

| pit_expect_msb() is vulnerable to SMI disturbance corner case
| in some platforms which causes /proc/cpuinfo to show wrong
| CPU MHz value when quick_pit_calibrate() jumps to success
| section.

I think that the real issue isn't even an SMI - but the fact
that in the very last iteration of the loop, there's no
serializing instruction _after_ the last 'rdtsc'. So even in
the absense of SMI's, we do have a situation where the cycle
counter was read without proper serialization.

The last check should be done outside the outer loop, since
_inside_ the outer loop, we'll be testing that the PIT has
the right MSB value has the right value in the next iteration.

So only the _last_ iteration is special, because that's the one
that will not check the PIT MSB value any more, and because the
final 'get_cycles()' isn't serialized.

In other words:

 - I'd like to move the PIT MSB check to after the last
   iteration, rather than in every iteration

 - I think we should comment on the fact that it's also a
   serializing instruction and so 'fences in' the TSC read.

Here's a suggested replacement.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: "Tan, Wei Chong" <wei.chong.tan@intel.com>
Tested-by: "Tan, Wei Chong" <wei.chong.tan@intel.com>
LKML-Reference: <B28277FD4E0F9247A3D55704C440A140D5D683F3@pgsmsx504.gar.corp.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-10 19:56:57 +02:00
Linus Torvalds 2c661a669b Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/dma: pci_set_dma_mask() shouldn't fail if mask fits in RAM
2009-08-10 08:59:56 -07:00
Jaswinder Singh Rajput 04e35357e2 MN10300: includecheck fix: mn10300, pci.h
Fix the following 'make includecheck' warning:

  arch/mn10300/include/asm/pci.h: linux/mm.h is included more than once.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-10 08:54:27 -07:00
Benjamin Herrenschmidt b2f2e8fee3 powerpc/dma: pci_set_dma_mask() shouldn't fail if mask fits in RAM
On an iMac G5, the b43 driver is failing to initialise because trying to
set the dma mask to 30-bit fails. Even though there's only 512MiB of RAM
in the machine anyway:
	https://bugzilla.redhat.com/show_bug.cgi?id=514787

We should probably let it succeed if the available RAM in the system
doesn't exceed the requested limit.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-10 16:36:38 +10:00
Linus Torvalds 17d11ba149 Merge branch 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Avoid redelivery of edge interrupt before next edge
  KVM: MMU: limit rmap chain length
  KVM: ia64: fix build failures due to ia64/unsigned long mismatches
  KVM: Make KVM_HPAGES_PER_HPAGE unsigned long to avoid build error on powerpc
  KVM: fix ack not being delivered when msi present
  KVM: s390: fix wait_queue handling
  KVM: VMX: Fix locking imbalance on emulation failure
  KVM: VMX: Fix locking order in handle_invalid_guest_state
  KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in kvm_mmu_change_mmu_pages
  KVM: SVM: force new asid on vcpu migration
  KVM: x86: verify MTRR/PAT validity
  KVM: PIT: fix kpit_elapsed division by zero
  KVM: Fix KVM_GET_MSR_INDEX_LIST
2009-08-09 14:58:21 -07:00
Linus Torvalds 413dd8768a Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix buffer overflow in efi_init()
  x86: Add quirk to make Apple MacBookPro5,1 use reboot=pci
  x86: Fix MSI-X initialization by using online_mask for x2apic target_cpus
  x86: Fix VMI && stack protector
2009-08-09 14:57:09 -07:00
Markus Metzger 30dd568c91 x86, perf_counter, bts: Add BTS support to perfcounters
Implement a performance counter with:

    attr.type           = PERF_TYPE_HARDWARE
    attr.config         = PERF_COUNT_HW_BRANCH_INSTRUCTIONS
    attr.sample_period  = 1

Using branch trace store (BTS) on x86 hardware, if available.

The from and to address for each branch can be sampled using:

    PERF_SAMPLE_IP      for the from address
    PERF_SAMPLE_ADDR    for the to address

[ v2: address review feedback, fix bugs ]

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 13:04:14 +02:00
Paul Mackerras f36a1a133a perf_counter/powerpc: Fix oops on cpus without perf_counter hardware support
If we have the powerpc perf_counter backend compiled in, but
the cpu we are running on is one where we don't support the
PMU, we currently oops in hw_perf_group_sched_in if we try to
use any counters, because ppmu is NULL in that case, and we
unconditionally dereference ppmu.

This fixes the problem by adding a check if ppmu is NULL at the
beginning of hw_perf_group_sched_in, and also at the beginning
of the other functions that get called from the perf_counter
core, i.e. hw_perf_disable, hw_perf_enable, and
hw_perf_counter_setup.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: benh@kernel.crashing.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:37 +02:00
Roel Kluin fdb8a42742 x86: fix buffer overflow in efi_init()
If the vendor name (from c16) can be longer than 100 bytes (or missing a
terminating null), then the null is written past the end of vendor[].

Found with Parfait, http://research.sun.com/projects/parfait/

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Ying <ying.huang@intel.com>
2009-08-09 01:08:42 -07:00
Ozan Çağlayan 498cdbfbcf x86: Add quirk to make Apple MacBookPro5,1 use reboot=pci
MacBookPro5,1 is not able to reboot unless reboot=pci is set.
This patch forces it through a DMI quirk specific to this
device.

Signed-off-by: Ozan Çağlayan <ozan@pardus.org.tr>
LKML-Reference: <1249403971-6543-1-git-send-email-ozan@pardus.org.tr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08 17:09:11 +02:00
Yinghai Lu 087d7e56de x86: Fix MSI-X initialization by using online_mask for x2apic target_cpus
found a system where x2apic reports an MSI-X irq initialization
failure:

[  302.859446] igbvf 0000:81:10.4: enabling device (0000 -> 0002)
[  302.874369] igbvf 0000:81:10.4: using 64bit DMA mask
[  302.879023] igbvf 0000:81:10.4: using 64bit consistent DMA mask
[  302.894386] igbvf 0000:81:10.4: enabling bus mastering
[  302.898171] igbvf 0000:81:10.4: setting latency timer to 64
[  302.914050] reserve_memtype added 0xefb08000-0xefb0c000, track uncached-minus, req uncached-minus, ret uncached-minus
[  302.933839] reserve_memtype added 0xefb28000-0xefb29000, track uncached-minus, req uncached-minus, ret uncached-minus
[  302.940367]   alloc irq_desc for 265 on node 4
[  302.956874]   alloc kstat_irqs on node 4
[  302.959452] alloc irq_2_iommu on node 0
[  302.974328] igbvf 0000:81:10.4: irq 265 for MSI/MSI-X
[  302.977778]   alloc irq_desc for 266 on node 4
[  302.980347]   alloc kstat_irqs on node 4
[  302.995312] free_memtype request 0xefb28000-0xefb29000
[  302.998816] igbvf 0000:81:10.4: Failed to initialize MSI-X interrupts.

... it turns out that when trying to enable MSI-X,
__assign_irq_vector(new, cfg_new, apic->target_cpus()) can not
get vector because for x2apic target-cpus returns cpumask_of(0)

Update that to online_mask like xapic.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <4A785AFF.3050902@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08 17:04:58 +02:00
Gupta, Ajay Kumar e8e2ff462d USB: musb: fix the nop registration for OMAP3EVM
OMAP3EVM uses ISP1504 phy which doesn't require any programming and
thus has to use NOP otg transceiver.

Cleanups being done:
	- Remove unwanted code in usb-musb.c file
	- Register NOP in OMAP3EVM board file using
	  usb_nop_xceiv_register().
	- Select NOP_USB_XCEIV for OMAP3EVM boards.
	- Don't enable TWL4030_USB in omap3_evm_defconfig

Signed-off-by: Ajay Kumar Gupta <ajay.gupta@ti.com>
Signed-off-by: Eino-Ville Talvala <talvala@stanford.edu>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-07 16:05:13 -07:00
Linus Torvalds 36b8659f93 Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm: (30 commits)
  ARM: 5639/1: arm: clkdev.c should include <linux/clk.h>
  ARM: 5638/1: arch/arm/kernel/signal.c: use correct address space for CRUNCH
  ARM: 5637/1: [KS8695] Don't reference CLOCK_TICK_RATE in drivers
  ARM: S3C64XX: serial: Fix section mismatch warning
  ARM: S3C24XX: serial: Fix section mismatch warnings
  ARM: S3C: PWM fix for low duty cycle
  ARM: 5597/1: [PCI] reset all internal hardware prior PCI initialization
  ARM: 5627/1: Fix restoring of lr at the end of mcount
  ARM: 5624/1: Document cache aliasing region
  S3C64XX: Fix ARMCLK configuration
  S3C64XX: Fix get_rate() for ARMCLK
  S3C24XX: GPIO: Fix pin range check in s3c_gpiolib_getchip
  mx3 defconfig update
  mx27 defconfig update
  ARM: 5623/1: Treo680: ir shutdown typo fix
  ARM: includecheck fix: plat-stmp3xxx/pinmux.c
  ARM: includecheck fix: plat-s3c64xx/pm.c
  ARM: includecheck fix: mach-omap2/mcbsp.c
  ARM: includecheck fix: mach-omap1/mcbsp.c
  ARM: includecheck fix: board-sffsdr.c
  ...
2009-08-07 10:46:51 -07:00
Linus Torvalds 9cf9d28e9b Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] KVM: Read buffer overflow
  [S390] kernel: Storing machine flags early in lowcore
2009-08-07 10:46:09 -07:00
Roel Kluin 53cb780adb [S390] KVM: Read buffer overflow
Check whether index is within bounds before testing the element.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-08-07 10:40:40 +02:00
Hendrik Brueckner 677c1dd706 [S390] kernel: Storing machine flags early in lowcore
Currently, the machine_flags are stored late in the startup
initialization which results in failing machine type checks
(e.g. for MACHINE_IS_VM).
To allow these checks, store the machine flags in the lowcore
when the machine type has been detected.

Moving the machine_flags to the lowcore has been introduced with
git commit 25097bf153

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-08-07 10:40:39 +02:00
Benjamin Herrenschmidt e0d82a0a4e perf_counter/powerpc: Check oprofile_cpu_type for NULL before using it
If the current CPU doesn't support performance counters,
cur_cpu_spec->oprofile_cpu_type can be NULL. The current
perf_counter modules don't test for that case and would thus
crash at boot time.

Bug reported by David Woodhouse.

Reported-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul Mackerras <paulus@samba.org>
LKML-Reference: <19066.48028.446975.501454@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-06 13:55:09 +02:00
Marcelo Tosatti 53a27b39ff KVM: MMU: limit rmap chain length
Otherwise the host can spend too long traversing an rmap chain, which
happens under a spinlock.

Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-06 12:06:54 +03:00
Hartley Sweeten c0c60c4b9a ARM: 5639/1: arm: clkdev.c should include <linux/clk.h>
<linux/clk.h> should be included to get the base API prototypes.

This fixes the following sparse warnings:

  arch/arm/common/clkdev.c:65:12:
    warning: symbol 'clk_get_sys' was not declared. Should it be static?

  arch/arm/common/clkdev.c:79:12:
    warning: symbol 'clk_get' was not declared. Should it be static?

  arch/arm/common/clkdev.c:87:6:
    warning: symbol 'clk_put' was not declared. Should it be static?

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-08-05 22:06:58 +01:00
Hartley Sweeten 65a5053b76 ARM: 5638/1: arch/arm/kernel/signal.c: use correct address space for CRUNCH
preserve_crunch_context() calls __copy_to_user() which expects the
destination address to be in __user space.  setup_sigframe() properly
passes the destination address.

restore_crunch_context() calls __copy_from_user() which expects the
source address to be in __user space.  restore_sigframe() properly
passes the source address.

This fixes {preserve/restore}_crunch_context() to accept the
address as __user space and resolves the following sparse warnings:

  arch/arm/kernel/signal.c:146:31:
     warning: incorrect type in argument 1 (different address spaces)
        expected void [noderef] <asn:1>*to
        got struct crunch_sigframe *frame

  arch/arm/kernel/signal.c:156:38:
     warning: incorrect type in argument 2 (different address spaces)
        expected void const [noderef] <asn:1>*from
        got struct crunch_sigframe *frame

  arch/arm/kernel/signal.c:250:48:
     warning: incorrect type in argument 1 (different address spaces)
        expected struct crunch_sigframe *frame
        got struct crunch_sigframe [noderef] <asn:1>*<noident>

  arch/arm/kernel/signal.c:365:49:
     warning: incorrect type in argument 1 (different address spaces)
        expected struct crunch_sigframe *frame
        got struct crunch_sigframe [noderef] <asn:1>*<noident>

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-08-05 22:06:58 +01:00
Andrew Victor 0a51810aa0 ARM: 5637/1: [KS8695] Don't reference CLOCK_TICK_RATE in drivers
Stop referencing CLOCK_TICK_RATE in the KS8695 drivers, rather refer
to a KS8695_CLOCK_RATE.
Issue pointed out by Russell King on arm-linux-kernel mailing list.

Signed-off-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-08-05 22:06:56 +01:00
Avi Kivity e9cbde8c15 KVM: ia64: fix build failures due to ia64/unsigned long mismatches
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05 15:04:16 +03:00
Stephen Rothwell c428dcc9b9 KVM: Make KVM_HPAGES_PER_HPAGE unsigned long to avoid build error on powerpc
Eliminates this compiler warning:

arch/powerpc/kvm/../../../virt/kvm/kvm_main.c:1178: error: integer overflow in expression

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05 14:51:33 +03:00
Christian Borntraeger d3bc2f91b4 KVM: s390: fix wait_queue handling
There are two waitqueues in kvm for wait handling:
vcpu->wq for virt/kvm/kvm_main.c and
vpcu->arch.local_int.wq for the s390 specific wait code.

the wait handling in kvm_s390_handle_wait was broken by using different
wait_queues for add_wait queue and remove_wait_queue.

There are two options to fix the problem:
o  move all the s390 specific code to vcpu->wq and remove
   vcpu->arch.local_int.wq
o  move all the s390 specific code to vcpu->arch.local_int.wq

This patch chooses the 2nd variant for two reasons:
o  s390 does not use kvm_vcpu_block but implements its own enabled wait
   handling.
   Having a separate wait_queue make it clear, that our wait mechanism is
   different
o  the patch is much smaller

Report-by:  Julia Lawall <julia@diku.dk>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05 13:59:46 +03:00