Commit Graph

1690 Commits

Author SHA1 Message Date
Joerg Roedel ee893c24ed AMD IOMMU: save pci segment from ACPI tables
This patch adds the pci_seg field to the amd_iommu structure and fills
it with the corresponding value from the ACPI table.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-19 12:59:12 +02:00
Joerg Roedel 335503e57b AMD IOMMU: add event buffer allocation
This patch adds the allocation of a event buffer for each AMD IOMMU in
the system. The hardware will log events like device page faults or
other errors to this buffer once this is enabled.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-19 12:59:11 +02:00
Joerg Roedel 1c65577398 AMD IOMMU: implement lazy IO/TLB flushing
The IO/TLB flushing on every unmaping operation is the most expensive
part in AMD IOMMU code and not strictly necessary. It is sufficient to
do the flush before any entries are reused. This is patch implements
lazy IO/TLB flushing which does exactly this.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-19 12:59:07 +02:00
Joerg Roedel 2842e5bf31 x86: move GART TLB flushing options to generic code
The GART currently implements the iommu=[no]fullflush command line
parameters which influence its IO/TLB flushing strategy. This patch
makes these parameters generic so that they can be used by the AMD IOMMU
too.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-19 12:59:06 +02:00
FUJITA Tomonori 982162602b x86: convert dma_alloc_coherent to use is_device_dma_capable
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 13:57:02 +02:00
FUJITA Tomonori 8a493d37f0 x86: remove duplicated extern force_iommu
Both iommu.h and dma-mapping.h have extern force_iommu. The latter
doesn't need to do.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 13:57:01 +02:00
Ingo Molnar e92b4fdacc Merge commit 'v2.6.27-rc6' into x86/iommu 2008-09-10 11:32:52 +02:00
Steven Noonan 9fcaff0e66 x86: unused variable in dma_alloc_coherent_gfp_flags()
Fixed a warning caused by a badly placed ifdef.

Signed-off-by: Steven Noonan <steven@uplinklabs.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-09 08:57:09 +02:00
FUJITA Tomonori 823e7e8c6e x86: dma_alloc_coherent sets gfp flags properly
Non real IOMMU implemenations (which doesn't do virtual mappings,
e.g. swiotlb, pci-nommu, etc) need to use proper gfp flags and
dma_mask to allocate pages in their own dma_alloc_coherent()
(allocated page need to be suitable for device's coherent_dma_mask).

This patch makes dma_alloc_coherent do this job so that IOMMUs don't
need to take care of it any more.

Real IOMMU implemenataions can simply ignore the gfp flags.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:50:07 +02:00
FUJITA Tomonori 8a53ad675f x86: fix nommu_alloc_coherent allocation with NULL device argument
We need to use __GFP_DMA for NULL device argument (fallback_dev) with
pci-nommu. It's a hack for ISA (and some old code) so we need to use
GFP_DMA.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:50:06 +02:00
FUJITA Tomonori de9f521fb7 x86: move pci-nommu's dma_mask check to common code
The check to see if dev->dma_mask is NULL in pci-nommu is more
appropriate for dma_alloc_coherent().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:50:06 +02:00
H. Peter Anvin b6734c35af x86: add NOPL as a synthetic CPU feature bit
The long noops ("NOPL") are supposed to be detected by family >= 6.
Unfortunately, several non-Intel x86 implementations, both hardware
and software, don't obey this dictum.  Instead, probe for NOPL
directly by executing a NOPL instruction and see if we get #UD.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-05 16:13:52 -07:00
David Woodhouse e51af66308 x86: blacklist DMAR on Intel G31/G33 chipsets
Some BIOSes (the Intel DG33BU, for example) wrongly claim to have DMAR
when they don't. Avoid the resulting crashes when it doesn't work as
expected.

I'd still be grateful if someone could test it on a DG33BU with the old
BIOS though, since I've killed mine. I tested the DMI version, but not
this one.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 20:20:25 +02:00
Linus Torvalds e52c8857e0 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: update defconfigs
  x86: msr: fix bogus return values from rdmsr_safe/wrmsr_safe
  x86: cpuid: correct return value on partial operations
  x86: msr: correct return value on partial operations
  x86: cpuid: propagate error from smp_call_function_single()
  x86: msr: propagate errors from smp_call_function_single()
  smp: have smp_call_function_single() detect invalid CPUs
2008-08-28 12:30:59 -07:00
H. Peter Anvin 08970fc4e0 x86: msr: fix bogus return values from rdmsr_safe/wrmsr_safe
Impact: bogus error codes (+other?) on x86-64

The rdmsr_safe/wrmsr_safe routines have macros for the handling of the
edx:eax arguments.  Those macros take a variable number of assembly
arguments.  This is rather inherently incompatible with using
%digit-style escapes in the inline assembly; replace those with
%[name]-style escapes.

This fixes miscompilation on x86-64, which at the very least caused
bogus return values.  It is possible that this could also corrupt the
return value; I am not sure.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-25 22:39:15 -07:00
H. Peter Anvin c6f31932d0 x86: msr: propagate errors from smp_call_function_single()
Propagate error (-ENXIO) from smp_call_function_single().  These
errors can happen when a CPU is unplugged while the MSR driver is
open.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-25 17:45:48 -07:00
Linus Torvalds ec73adba51 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: add X86_FEATURE_XMM4_2 definitions
  x86: fix cpufreq + sched_clock() regression
  x86: fix HPET regression in 2.6.26 versus 2.6.25, check hpet against BAR, v3
  x86: do not enable TSC notifier if we don't need it
  x86 MCE: Fix CPU hotplug problem with multiple multicore AMD CPUs
  x86: fix: make PCI ECS for AMD CPUs hotplug capable
  x86: fix: do not run code in amd_bus.c on non-AMD CPUs
2008-08-25 11:26:33 -07:00
Austin Zhang 2a61812af2 x86: add X86_FEATURE_XMM4_2 definitions
Added Intel processor SSE4.2 feature flag.

No in-tree user at the moment, but makes the tree-merging life easier
for the crypto tree.

Signed-off-by: Austin Zhang <austin.zhang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 17:28:16 +02:00
Eduardo Habkost 18b13e5457 KVM: Use .fixup instead of .text.fixup on __kvm_handle_fault_on_reboot
vmlinux.lds expects the fixup code to be on a section named .fixup. The
.text.fixup section is not mentioned on vmlinux.lds, and is included on
the resulting vmlinux (just after .text) only because of ld heuristics on
placing orphan sections.

However, placing .text.fixup outside .text breaks the definition of
_etext, making it exclude the .text.fixup contents. That makes .text.fixup
be ignored by the kernel initialization code that needs to know about
section locations, such as the code setting page protection bits.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-08-25 17:22:57 +03:00
Ingo Molnar f58899bb02 Merge branch 'linus' into x86/urgent 2008-08-25 14:39:12 +02:00
Adrian Bunk 7a8fc9b248 removed unused #include <linux/version.h>'s
This patch lets the files using linux/version.h match the files that
#include it.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-23 12:14:12 -07:00
Rafael J. Wysocki 8735728ef8 x86 MCE: Fix CPU hotplug problem with multiple multicore AMD CPUs
During CPU hot-remove the sysfs directory created by
threshold_create_bank(), defined in
arch/x86/kernel/cpu/mcheck/mce_amd_64.c, has to be removed before
its parent directory, created by mce_create_device(), defined in
arch/x86/kernel/cpu/mcheck/mce_64.c .  Moreover, when the CPU in
question is hotplugged again, obviously the latter has to be created
before the former.  At present, the right ordering is not enforced,
because all of these operations are carried out by CPU hotplug
notifiers which are not appropriately ordered with respect to each
other.  This leads to serious problems on systems with two or more
multicore AMD CPUs, among other things during suspend and hibernation.

Fix the problem by placing threshold bank CPU hotplug callbacks in
mce_cpu_callback(), so that they are invoked at the right places,
if defined.  Additionally, use kobject_del() to remove the sysfs
directory associated with the kobject created by
kobject_create_and_add() in threshold_create_bank(), to prevent the
kernel from crashing during CPU hotplug operations on systems with
two or more multicore AMD CPUs.

This patch fixes bug #11337.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-23 17:49:19 +02:00
Marcin Slusarz c4bd1fdab0 x86: fix section mismatch warning - uv_cpu_init
WARNING: vmlinux.o(.cpuinit.text+0x3cc4): Section mismatch in reference from the function uv_cpu_init() to the function .init.text:uv_system_init()
The function __cpuinit uv_cpu_init() references
a function __init uv_system_init().
If uv_system_init is only used by uv_cpu_init then
annotate uv_system_init with a matching annotation.

uv_system_init was ment to be called only once, so do it from codepath
(native_smp_prepare_cpus) which is called once, right before activation
of other cpus (smp_init).

Note: old code relied on uv_node_to_blade being initialized to 0,
but it'a not initialized from anywhere.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 14:12:20 +02:00
FUJITA Tomonori 766af9fa81 dma-mapping.h, x86: remove last user of dma_mapping_ops->map_simple
pci-dma.c doesn't use map_simple hook any more so we can remove it
from struct dma_mapping_ops now.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 08:43:25 +02:00
Joerg Roedel 6c505ce393 x86: move dma_*_coherent functions to include file
All the x86 DMA-API functions are defined in asm/dma-mapping.h. This patch
moves the dma_*_coherent functions also to this header file because they are
now small enough to do so.
This is done as a separate patch because it also includes some renaming and
restructuring of the dma-mapping.h file.

Signed-off-by: Joerg Roedel <joerg.roede@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 08:34:51 +02:00
Cliff Wickman 99dd871330 x86, SGI UV: hardcode the TLB flush interrupt system vector
The UV TLB shootdown mechanism needs a system interrupt vector.

Its vector had been hardcoded as 200, but needs to moved to the reserved
system vector range so that it does not collide with some device vector.

This is still temporary until dynamic system IRQ allocation is provided.
But it will be needed when real UV hardware becomes available and runs 2.6.27.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 12:36:03 +02:00
Linus Torvalds a7f5aaf36d Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix build warnings in real mode code
  x86, calgary: fix section mismatch warning - get_tce_space_from_tar
  x86: silence section mismatch warning - get_local_pda
  x86, percpu: silence section mismatch warnings related to EARLY_PER_CPU variables
  x86: fix i486 suspend to disk CR4 oops
  x86: mpparse.c: fix section mismatch warning
  x86: mmconf: fix section mismatch warning
  x86: fix MP_processor_info section mismatch warning
  x86, tsc: fix section mismatch warning
  x86: correct register constraints for 64-bit atomic operations
2008-08-18 12:10:14 -07:00
Marcin Slusarz c6a92a2501 x86, percpu: silence section mismatch warnings related to EARLY_PER_CPU variables
Quoting Mike Travis in "x86: cleanup early per cpu variables/accesses v4"
(23ca4bba3e):

    The DEFINE macro defines the per_cpu variable as well as the early
    map and pointer.  It also initializes the per_cpu variable and map
    elements to "_initvalue".  The early_* macros provide access to
    the initial map (usually setup during system init) and the early
    pointer.  This pointer is initialized to point to the early map
    but is then NULL'ed when the actual per_cpu areas are setup.  After
    that the per_cpu variable is the correct access to the variable.

As these variables are NULL'ed before __init sections are dropped
(in setup_per_cpu_maps), they can be safely annotated as __ref.

This change silences following section mismatch warnings:

WARNING: vmlinux.o(.data+0x46c0): Section mismatch in reference from the variable x86_cpu_to_apicid_early_ptr to the variable .init.data:x86_cpu_to_apicid_early_map
The variable x86_cpu_to_apicid_early_ptr references
the variable __initdata x86_cpu_to_apicid_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: vmlinux.o(.data+0x46c8): Section mismatch in reference from the variable x86_bios_cpu_apicid_early_ptr to the variable .init.data:x86_bios_cpu_apicid_early_map
The variable x86_bios_cpu_apicid_early_ptr references
the variable __initdata x86_bios_cpu_apicid_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: vmlinux.o(.data+0x46d0): Section mismatch in reference from the variable x86_cpu_to_node_map_early_ptr to the variable .init.data:x86_cpu_to_node_map_early_map
The variable x86_cpu_to_node_map_early_ptr references
the variable __initdata x86_cpu_to_node_map_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 09:10:55 +02:00
Marcin Slusarz c72a5efec1 x86: mmconf: fix section mismatch warning
WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1591): Section mismatch in reference from the function init_amd() to the function .init.text:check_enable_amd_mmconf_dmi()
The function __cpuinit init_amd() references
a function __init check_enable_amd_mmconf_dmi().
If check_enable_amd_mmconf_dmi is only used by init_amd then
annotate check_enable_amd_mmconf_dmi with a matching annotation.

check_enable_amd_mmconf_dmi is only called from init_amd which is __cpuinit

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 07:49:06 +02:00
Mathieu Desnoyers 3c3b5c3b0b x86: correct register constraints for 64-bit atomic operations
x86_64 add/sub atomic ops does not seems to accept integer values bigger
than 32 bits as immediates. Intel's add/sub documentation specifies they
have to be passed as registers.

The only operations in the x86-64 architecture which accept arbitrary
64-bit immediates is "movq" to any register; similarly, the only
operation which accept arbitrary 64-bit displacement is "movabs" to or
from al/ax/eax/rax.

http://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Machine-Constraints.html

states :

e
    32-bit signed integer constant, or a symbolic reference known to fit
    that range (for immediate operands in sign-extending x86-64
    instructions).
Z
    32-bit unsigned integer constant, or a symbolic reference known to
    fit that range (for immediate operands in zero-extending x86-64
    instructions).

Since add/sub does sign extension, using the "e" constraint seems appropriate.

It applies to 2.6.27-rc, 2.6.26, 2.6.25...

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 07:47:30 +02:00
Linus Torvalds 0473b79929 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (32 commits)
  x86: add MAP_STACK mmap flag
  x86: fix section mismatch warning - spp_getpage()
  x86: change init_gdt to update the gdt via write_gdt, rather than a direct write.
  x86-64: fix overlap of modules and fixmap areas
  x86, geode-mfgpt: check IRQ before using MFGPT as clocksource
  x86, acpi: cleanup, temp_stack is used only when CONFIG_SMP is set
  x86: fix spin_is_contended()
  x86, nmi: clean UP NMI watchdog failure message
  x86, NMI: fix watchdog failure message
  x86: fix /proc/meminfo DirectMap
  x86: fix readb() et al compile error with gcc-3.2.3
  arch/x86/Kconfig: clean up, experimental adjustement
  x86: invalidate caches before going into suspend
  x86, perfctr: don't use CCCR_OVF_PMI1 on Pentium 4Ds
  x86, AMD IOMMU: initialize dma_ops after sysfs registration
  x86m AMD IOMMU: cleanup: replace LOW_U32 macro with generic lower_32_bits
  x86, AMD IOMMU: initialize device table properly
  x86, AMD IOMMU: use status bit instead of memory write-back for completion wait
  x86: silence mmconfig printk
  x86, msr: fix NULL pointer deref due to msr_open on nonexistent CPUs
  ...
2008-08-16 17:14:07 -07:00
Ingo Molnar cd98a04a59 x86: add MAP_STACK mmap flag
as per this discussion:

   http://lkml.org/lkml/2008/8/12/423

Pardo reported that 64-bit threaded apps, if their stacks exceed the
combined size of ~4GB, slow down drastically in pthread_create() - because
glibc uses MAP_32BIT to allocate the stacks. The use of MAP_32BIT is
a legacy hack - to speed up context switching on certain early model
64-bit P4 CPUs.

So introduce a new flag to be used by glibc instead, to not constrain
64-bit apps like this.

glibc can switch to this new flag straight away - it will be ignored
by the kernel. If those old CPUs ever matter to anyone, support for
it can be implemented.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Ulrich Drepper <drepper@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 11:45:19 -07:00
Ingo Molnar 2fdc86901d x86: add MAP_STACK mmap flag
as per this discussion:

   http://lkml.org/lkml/2008/8/12/423

Pardo reported that 64-bit threaded apps, if their stacks exceed the
combined size of ~4GB, slow down drastically in pthread_create() - because
glibc uses MAP_32BIT to allocate the stacks. The use of MAP_32BIT is
a legacy hack - to speed up context switching on certain early model
64-bit P4 CPUs.

So introduce a new flag to be used by glibc instead, to not constrain
64-bit apps like this.

glibc can switch to this new flag straight away - it will be ignored
by the kernel. If those old CPUs ever matter to anyone, support for
it can be implemented.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Ulrich Drepper <drepper@gmail.com>
2008-08-15 19:17:33 +02:00
Ingo Molnar 529d0e402e Merge branch 'x86/geode' into x86/urgent 2008-08-15 17:53:07 +02:00
Huang Ying fb45daa69d kexec jump: check code size in control page
Kexec/Kexec-jump require code size in control page is less than
PAGE_SIZE/2.  This patch add link-time checking for this.

ASSERT() of ld link script is used as the link-time checking mechanism.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Huang Ying 163f6876f5 kexec jump: rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE
Rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE, because control
page is used for not only code on some platform.  For example in kexec
jump, it is used for data and stack too.

[akpm@linux-foundation.org: unbreak powerpc and arm, finish conversion]
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Jan Beulich 66d4bdf22b x86-64: fix overlap of modules and fixmap areas
Plus add a build time check so this doesn't go unnoticed again.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:31:50 +02:00
Jens Rottmann 0d5cdc97e2 x86, geode-mfgpt: check IRQ before using MFGPT as clocksource
Adds a simple IRQ autodetection to the AMD Geode MFGPT driver, and more
importantly, adds some checks, if IRQs can actually be received on the
chosen line.  This fixes cases where MFGPT is selected as clocksource
though not producing any ticks, so the kernel simply starves during
boot.

Signed-off-by: Jens Rottmann <JRottmann@LiPPERTEmbedded.de>
Cc: Andres Salomon <dilinger@debian.org>
Cc: linux-geode@bombadil.infradead.org
Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:12:32 +02:00
Jan Beulich 7bc069c6bc x86: fix spin_is_contended()
The masked difference is what needs to be compared against 1, rather
than the difference of masked values (which can be negative).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 16:26:51 +02:00
Mikael Pettersson 1c5b0eb66d x86: fix readb() et al compile error with gcc-3.2.3
Building 2.6.27-rc1 on x86 with gcc-3.2.3 fails with:

In file included from include/asm/dma.h:12,
                 from include/linux/bootmem.h:8,
                 from init/main.c:26:
include/asm/io.h: In function `readb':
include/asm/io.h:32: syntax error before string constant
include/asm/io.h: In function `readw':
include/asm/io.h:33: syntax error before string constant
include/asm/io.h: In function `readl':
include/asm/io.h:34: syntax error before string constant
include/asm/io.h: In function `__readb':
include/asm/io.h:36: syntax error before string constant
include/asm/io.h: In function `__readw':
include/asm/io.h:37: syntax error before string constant
include/asm/io.h: In function `__readl':
include/asm/io.h:38: syntax error before string constant
make[1]: *** [init/main.o] Error 1
make: *** [init] Error 2

Starting with 2.6.27-rc1 readb() et al are generated by a
build_mmio_read() macro, which generates asm() statements with
output register constraints like "=" "q", i.e. as two adjacent
string literals. This doesn't work with gcc-3.2.3.

Fixed by moving the "=" part into the callers' reg parameter
(as suggested by Ingo).

Build and boot-tested with gcc-3.2.3 on 32 and 64-bit x86.

Fixes <http://bugzilla.kernel.org/show_bug.cgi?id=11205>.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 14:30:32 +02:00
Mark Langsdorf 394a15051c x86: invalidate caches before going into suspend
When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache.  On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.

Relocate the cache flush to be the very last thing done before
halting.  Tie into an assembly line so the compile will not
reorder it.  Add some documentation explaining what is going
on and why we're doing this.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Acked-by: Mark Borden <mark.borden@amd.com>
Acked-by: Michael Hohmuth <michael.hohmuth@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 14:04:30 +02:00
Ingo Molnar 975439fe73 Merge branch 'x86/amd-iommu' into x86/urgent 2008-08-15 13:57:32 +02:00
Joerg Roedel 8a456695c5 x86m AMD IOMMU: cleanup: replace LOW_U32 macro with generic lower_32_bits
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:56:56 +02:00
Joerg Roedel 9f5f5fb35d x86, AMD IOMMU: initialize device table properly
This patch adds device table initializations which forbids memory accesses
for devices per default and disables all page faults.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:56:54 +02:00
Joerg Roedel 519c31bacf x86, AMD IOMMU: use status bit instead of memory write-back for completion wait
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:56:46 +02:00
Ingo Molnar 3167761965 Merge branch 'x86/fpu' into x86/urgent 2008-08-14 11:18:08 +02:00
Suresh Siddha e49140120c crypto: padlock - fix VIA PadLock instruction usage with irq_ts_save/restore()
Wolfgang Walter reported this oops on his via C3 using padlock for
AES-encryption:

##################################################################

BUG: unable to handle kernel NULL pointer dereference at 000001f0
IP: [<c01028c5>] __switch_to+0x30/0x117
*pde = 00000000
Oops: 0002 [#1] PREEMPT
Modules linked in:

Pid: 2071, comm: sleep Not tainted (2.6.26 #11)
EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0
EIP is at __switch_to+0x30/0x117
EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300
ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000)
Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046
       c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000
       c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0
Call Trace:
 [<c03b5b43>] ? schedule+0x285/0x2ff
 [<c0131856>] ? pm_qos_requirement+0x3c/0x53
 [<c0239f54>] ? acpi_processor_idle+0x0/0x434
 [<c01025fe>] ? cpu_idle+0x73/0x7f
 [<c03a4dcd>] ? rest_init+0x61/0x63
 =======================

Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end()
around the padlock instructions fix the oops.

Suresh wrote:

These padlock instructions though don't use/touch SSE registers, but it behaves
similar to other SSE instructions. For example, it might cause DNA faults
when cr0.ts is set. While this is a spurious DNA trap, it might cause
oops with the recent fpu code changes.

This is the code sequence  that is probably causing this problem:

a) new app is getting exec'd and it is somewhere in between
   start_thread() and flush_old_exec() in the load_xyz_binary()

b) At pont "a", task's fpu state (like TS_USEDFPU, used_math() etc) is
   cleared.

c) Now we get an interrupt/softirq which starts using these encrypt/decrypt
   routines in the network stack. This generates a math fault (as
   cr0.ts is '1') which sets TS_USEDFPU and restores the math that is
   in the task's xstate.

d) Return to exec code path, which does start_thread() which does
   free_thread_xstate() and sets xstate pointer to NULL while
   the TS_USEDFPU is still set.

e) At the next context switch from the new exec'd task to another task,
   we have a scenarios where TS_USEDFPU is set but xstate pointer is null.
   This can cause an oops during unlazy_fpu() in __switch_to()

Now:

1) This should happen with or with out pre-emption. Viro also encountered
   similar problem with out CONFIG_PREEMPT.

2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because
   kernel_fpu_begin() will manually do a clts() and won't run in to the
   situation of setting TS_USEDFPU in step "c" above.

3) This was working before the fpu changes, because its a spurious
   math fault  which doesn't corrupt any fpu/sse registers and the task's
   math state was always in an allocated state.

With out the recent lazy fpu allocation changes, while we don't see oops,
there is a possible race still present in older kernels(for example,
while kernel is using kernel_fpu_begin() in some optimized clear/copy
page and an interrupt/softirq happens which uses these padlock
instructions generating DNA fault).

This is the failing scenario that existed even before the lazy fpu allocation
changes:

0. CPU's TS flag is set

1. kernel using FPU in some optimized copy  routine and while doing
kernel_fpu_begin() takes an interrupt just before doing clts()

2. Takes an interrupt and ipsec uses padlock instruction. And we
take a DNA fault as TS flag is still set.

3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts

4. We complete the padlock routine

5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes
the optimized copy routine and does kernel_fpu_end(). At this point,
we have cr0.ts again set to '1' but the task's TS_USEFPU is stilll
set and not cleared.

6. Now kernel resumes its user operation. And at the next context
switch, kernel sees it has do a FP save as TS_USEDFPU is still set
and then will do a unlazy_fpu() in __switch_to(). unlazy_fpu()
will take a DNA fault, as cr0.ts is '1' and now, because we are
in __switch_to(), math_state_restore() will get confused and will
restore the next task's FP state and will save it in prev tasks's FP state.
Remember, in __switch_to() we are already on the stack of the next task
but take a DNA fault for the prev task.

This causes the fpu leakage.

Fix the padlock instruction usage by calling them inside the
context of new routines irq_ts_save/restore(), which clear/restore cr0.ts
manually in the interrupt context. This will not generate spurious DNA
in the  context of the interrupt which will fix the oops encountered and
the possible FPU leakage issue.

Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-13 22:02:26 +10:00
Ingo Molnar a12e61df4f Merge commit 'v2.6.27-rc3' into x86/urgent 2008-08-13 13:08:47 +02:00
Johannes Weiner 0ed89b06e4 x86: propagate new nonpanic bootmem macros to CONFIG_HAVE_ARCH_BOOTMEM_NODE
Commit 74768ed833 "page allocator: use no-panic variant of
alloc_bootmem() in alloc_large_system_hash()" introduced two new
_nopanic macros which are undefined for CONFIG_HAVE_ARCH_BOOTMEM_NODE.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: "Jan Beulich" <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-13 11:57:18 +02:00
Linus Torvalds 7019b1b500 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix 2.6.27rc1 cannot boot more than 8CPUs
  x86: make "apic" an early_param() on 32-bit, NULL check
  EFI, x86: fix function prototype
  x86, pci-calgary: fix function declaration
  x86: work around gcc 3.4.x bug
  x86: make "apic" an early_param() on 32-bit
  x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk
  x86_64: restore the proper NR_IRQS define so larger systems work.
  x86: Restore proper vector locking during cpu hotplug
  x86: Fix broken VMI in 2.6.27-rc..
  x86: fdiv bug detection fix
2008-08-11 16:44:35 -07:00