Commit Graph

1387 Commits

Author SHA1 Message Date
Ingo Molnar 95492e4646 [PATCH] x86: rewrite SMP TSC sync code
make the TSC synchronization code more robust, and unify it between x86_64 and
i386.

The biggest change is the removal of the 'fix up TSCs' code on x86_64 and
i386, in some rare cases it was /causing/ time-warps on SMP systems.

The new code only checks for TSC asynchronity - and if it can prove a
time-warp (if it can observe the TSC going backwards when going from one CPU
to another within a critical section), then the TSC clock-source is turned
off.

The TSC synchronization-checking code also got moved into a separate file.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
Thomas Gleixner 92c7e00254 [PATCH] Simplify the registration of clocksources
Enqueue clocksources in rating order to make selection of the clocksource
easier.  Also check the match with an user override at enqueue time.

Preparatory patch for the generic clocksource verification.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
Thomas Gleixner 26a08eb301 [PATCH] i386 Remove useless code in tsc.c
The delayed work code in arch/i386/kernel/tsc.c is an unused leftover of the
GTOD conversion. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
John Stultz c1d370e167 [PATCH] i386: use GTOD persistent clock support
Persistent clock support: do proper timekeeping across suspend/resume, i386
arch support.

[bunk@stusta.de: cleanup]
Build-fixes-from: Andrew Morton <akpm@osdl.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
Thomas Gleixner 950f4427c2 [PATCH] Add irq flag to disable balancing for an interrupt
Add a flag so we can prevent the irq balancing of an interrupt.  Move the
bits, so we have room for more :)

Necessary for the ability to setup clocksources more flexible (e.g.  use the
different HPET channels per CPU)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Rafa³ Bilski 2b8c0e1302 [CPUFREQ] Longhaul - Redo Longhaul ver. 2
Start using v2 version of Longhaul when available. It provides
voltage scaling and can use ACPI C3 state. That's curious. CPU
will not change frequency on ACPI C3 when v1 is in use, but it will
when v2 is used. Driver will return max frequency all the time if
this isn't true for all processors. There is strange thing with
mobile voltage. Looks like only Nehemiah (C3-M) supports it.
Earlier processors have different mobile VRM (in docs), but I can't
find any which is using it. Looks like all are using VRM 8.5. So
fail for non Nehemiah with mobile VRM.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-14 17:32:06 -05:00
Rafa³ Bilski b6f45a4b07 [CPUFREQ] EPS - Correct 2nd brand test
Solution for small, but nasty bug: access beyond end of f_table for C7 brand.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-13 22:58:26 -05:00
Jan Beulich 22c5ace729 [PATCH] i386: Fix broken CONFIG_COMPAT_VDSO on i386
After updating several machines to 2.6.20, I can't boot  anymore the single
one of them that supports the NX bit and is configured as a 32-bit system.

My understanding is that the VDSO changes in 2.6.20-rc7 were not fully
cooked, in that with that config option enabled VDSO_SYM(x) now equals
x, meaning that an address in the fixmap area is now being passed to
apps via AT_SYSINFO. However, the page is mapped with PAGE_READONLY
rather than PAGE_READONLY_EXEC.

I'm not certain whether having app code go through the fixmap area is
intended, but in case it is here is the simple patch that makes things work
again.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:26 +01:00
Giuliano Procida 98838ec984 [PATCH] i386: fix 32-bit ioctls on x64_32
[MTRR] fix 32-bit ioctls on x64_32

Signed-off-by: Giuliano Procida <giuliano.procida@googlemail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:26 +01:00
Andi Kleen 62cc49396e [PATCH] x86: Unify pcspeaker platform device code between i386/x86-64
Trivial cleanup.

Only change is that it is always compiled in now on x86-64 like on i386.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:26 +01:00
Rusty Russell 2a57ff1a70 [PATCH] i386: Rename cpu_gdt_descr and remove extern declaration from smpboot.c
When I implemented the DECLARE_PER_CPU(var) macros, I was careful that
people couldn't use "var" in a non-percpu context, by prepending
percpu__.  I never considered that this would allow them to overload
the same name for a per-cpu and a non-percpu variable.

It is only one of many horrors in the i386 boot code, but let's rename
the non-perpcu cpu_gdt_descr to early_gdt_descr (not boot_gdt_descr,
that's something else...)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>

===================================================================
2007-02-13 13:26:26 +01:00
Rusty Russell 105fddb862 [PATCH] i386: Move mce_disabled to asm/mce.h
Allows external actors to disable mce.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>

===================================================================
2007-02-13 13:26:26 +01:00
Rusty Russell 992af68147 [PATCH] i386: paravirt unhandled fallthrough
The current code simply calls "start_kernel" directly if we're under a
hypervisor and no paravirt_ops backend wants us, because paravirt.c
registers that as a backend.

This was always a vain hope; start_kernel won't get far without setup.
It's also impossible for paravirt_ops backends which don't sit in the
arch/i386/kernel directory: they can't link before paravirt.o anyway.

Keep it simple: if we pass all the registered paravirt probes, BUG().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:26 +01:00
Andi Kleen 9fbbd4dd17 [PATCH] x86: Don't require the vDSO for handling a.out signals
and in other strange binfmts. vDSO is not necessarily mapped there.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:26 +01:00
Alan 120fad7240 [PATCH] i386: Fix Cyrix MediaGX detection
The old Cyrix 5520 CPU detection code relied upon the PCI layer setup being
done earlier than the CPU setup, which is no longer true.  Fortunately we
know that if the processor is a MediaGX we can do type 1 pci config
accesses to check the companion chip.  We thus do those directly and from
this find the 5520 and implement the workarounds for the timer problem

Original report from takada@mbf.nifty.com, I sent a proposed patch which
Takara then corrected, tested and sent back to the list on 10th January.

Submitting for merging as it seems to have been missed

AK: Changed to use pci-direct.h and fix warning for !CONFIG_PCI (later
AK: originally from akpm)

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: <takada@mbf.nifty.com>
Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-02-13 13:26:26 +01:00
Andi Kleen 7de6d3618b [PATCH] i386: Fix warning in cpu initialization
Fix bogus warning

linux/arch/i386/kernel/cpu/transmeta.c:12: warning: ‘cpu_freq’ may be used uninitialized in this function

Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:25 +01:00
Andi Kleen 2ba1ff2b79 [PATCH] i386: Fix warning in microcode.c
Fix bogus gcc warning

linux/arch/i386/kernel/microcode.c:387: warning: ‘new_mc’ may be used uninitialized in this function

Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:25 +01:00
Andi Kleen 0a4599c894 [PATCH] x86: Enable NMI watchdog for AMD Family 0x10 CPUs
For i386/x86-64.

Straight forward -- just reuse the Family 0xf code.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:25 +01:00
Andi Kleen f790cd30d0 [PATCH] x86: Add new CPUID bits for AMD Family 10 CPUs in /proc/cpuinfo
Just various new acronyms. The new popcnt bit is in the middle
of Intel space. This looks a little weird, but I've been assured
it's ok.

Also I fixed RDTSCP for i386 which was at the wrong place.

For i386 and x86-64.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:25 +01:00
Andi Kleen 1a1eecd1c2 [PATCH] i386: Remove fastcall in paravirt.[ch]
Not needed because fastcall is always default now

Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:25 +01:00
TAKADA Yoshihito bcde1ebb81 [PATCH] i386: geode configuration fixes
Original code doesn't write back to CCR4 register.  This patch reflects a
value of a register.

Cc: Jordan Crouse <jordan.crouse@amd.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:25 +01:00
Chuck Ebbert 86c4183742 [PATCH] i386: add option to show more code in oops reports
Sometimes developers need to see more object code in an oops report,
e.g. when kernel may be corrupted at runtime.

Add the "code_bytes" option for this.

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-02-13 13:26:25 +01:00
Jan Beulich 47a55cd795 [PATCH] i386: entry.S END/ENDPROC annotations
Annotate i386/kernel/entry.S with END/ENDPROC to assist disassemblers and
other analysis tools.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-02-13 13:26:24 +01:00
takada 2632f01a66 [PATCH] i386: support Classic MediaGXm
I hope to support "classic" MediaGXm in kernel.

The DIR1 register of MediaGXm( or Geode) shows the following values for
identify CPU.  For example, My MediaGXm shows 0x42.

We can read National Semiconductor's datasheet without any NDAs.
  http://www.national.com/pf/GX/GXLV.html

from datasheets:
DIR1
0x30 - 0x33 GXm rev. 1.0 - 2.3
0x34 - 0x4f GXm rev. 2.4 - 3.x
0x5x        GXm rev. 5.0 - 5.4
0x6x        GXLV
0x7x         (unknow)
0x8x	    Gx1

In nsc driver of X, accept 0x30 through 0x82. What will 0x7x mean?

Cc: Jordan Crouse <jordan.crouse@amd.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:24 +01:00
H. Peter Anvin 30b82ea08c [PATCH] i386: All Transmeta CPUs have constant TSCs
All Transmeta CPUs ever produced have constant-rate TSCs.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-02-13 13:26:24 +01:00
Ingo Molnar 5d0e600d90 [PATCH] x86: fix laptop bootup hang in init_acpi()
During kernel bootup, a new T60 laptop (CoreDuo, 32-bit) hangs about
10%-20% of the time in acpi_init():

 Calling initcall 0xc055ce1a: topology_init+0x0/0x2f()
 Calling initcall 0xc055d75e: mtrr_init_finialize+0x0/0x2c()
 Calling initcall 0xc05664f3: param_sysfs_init+0x0/0x175()
 Calling initcall 0xc014cb65: pm_sysrq_init+0x0/0x17()
 Calling initcall 0xc0569f99: init_bio+0x0/0xf4()
 Calling initcall 0xc056b865: genhd_device_init+0x0/0x50()
 Calling initcall 0xc056c4bd: fbmem_init+0x0/0x87()
 Calling initcall 0xc056dd74: acpi_init+0x0/0x1ee()

It's a hard hang that not even an NMI could punch through!  Frustratingly,
adding printks or function tracing to the ACPI code made the hangs go away
...

After some time an additional detail emerged: disabling the NMI watchdog
made these occasional hangs go away.

So i spent the better part of today trying to debug this and trying out
various theories when i finally found the likely reason for the hang: if
acpi_ns_initialize_devices() executes an _INI AML method and an NMI
happens to hit that AML execution in the wrong moment, the machine would
hang.  (my theory is that this must be some sort of chipset setup method
doing stores to chipset mmio registers?)

Unfortunately given the characteristics of the hang it was sheer
impossible to figure out which of the numerous AML methods is impacted
by this problem.

As a workaround i wrote an interface to disable chipset-based NMIs while
executing _INI sections - and indeed this fixed the hang.  I did a
boot-loop of 100 separate reboots and none hung - while without the patch
it would hang every 5-10 attempts.  Out of caution i did not touch the
nmi_watchdog=2 case (it's not related to the chipset anyway and didnt
hang).

I implemented this for both x86_64 and i686, tested the i686 laptop both
with nmi_watchdog=1 [which triggered the hangs] and nmi_watchdog=2, and
tested an Athlon64 box with the 64-bit kernel as well. Everything builds
and works with the patch applied.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-02-13 13:26:24 +01:00
Andreas Herrmann 6c5806cae5 [PATCH] i386: fix size_or_mask and size_and_mask
mtrr: fix size_or_mask and size_and_mask

This fixes two bugs in /proc/mtrr interface:
o If physical address size crosses the 44 bit boundary
  size_or_mask is evaluated wrong.
o size_and_mask limits width of physical base
  address for an MTRR to be less than 44 bits.

TBD: later patch had one more change, but I think that was bogus.
TBD: need to double check

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:23 +01:00
Alexey Dobriyan 016d6f3580 [PATCH] i386: Convert /proc/apm to seqfile
Byte-to-byte identical /proc/apm here.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:23 +01:00
Alexey Dobriyan ad4e680fb2 [PATCH] i386: use smp_call_function_single()
It will execure cpuid only on the cpu we need.

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:23 +01:00
Alexey Dobriyan d958f14332 [PATCH] i386: use smp_call_function_single()
It will execute rdmsr and wrmsr only on the cpu we need.

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:23 +01:00
Andi Kleen 8c40ad02e5 [PATCH] i386: Small cleanup to TLB flush code
- Remove outdated comment
- Use cpu_relax() in a busy loop

Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:23 +01:00
Venkatesh Pallipadi 90ce4bc454 [PATCH] i386: Handle 32 bit PerfMon Counter writes cleanly in i386 nmi_watchdog
Change i386 nmi handler to handle 32 bit perfmon counter MSR writes cleanly.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:22 +01:00
Rene Herman 3b3d5e1db6 [PATCH] i386: romsignature/checksum cleanup
Use adding __init to romsignature() (it's only called from probe_roms()
which is itself __init) as an excuse to submit a pedantic cleanup.

Signed-off-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:22 +01:00
Ingo Molnar f9690982b8 [PATCH] i386: improve sched_clock() on i686
Clean up sched_clock() on i686: it will use the TSC if available and falls
back to jiffies only if the user asked for it to be disabled via notsc or
the CPU calibration code didnt figure out the right cpu_khz.

This generally makes the scheduler timestamps more finegrained, on all
hardware.  (the current scheduler is pretty resistant against asynchronous
sched_clock() values on different CPUs, it will allow at most up to a jiffy
of jitter.)

Also simplify sched_clock()'s check for TSC availability: propagate the
desire and ability to use the TSC into the tsc_disable flag, previously
this flag only indicated whether the notsc option was passed.  This makes
the rare low-res sched_clock() codepath a single branch off a read-mostly
flag.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:22 +01:00
Stephane Eranian 2ff2d3d747 [PATCH] i386: add idle notifier
Add a notifier mechanism to the low level idle loop.  You can register a
callback function which gets invoked on entry and exit from the low level idle
loop.  The low level idle loop is defined as the polling loop, low-power call,
or the mwait instruction.  Interrupts processed by the idle thread are not
considered part of the low level loop.

The notifier can be used to measure precisely how much is spent in useless
execution (or low power mode).  The perfmon subsystem uses it to turn on/off
monitoring.

Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:22 +01:00
Adrian Bunk 86a978837c [PATCH] i386: arch/i386/kernel/cpu/mcheck/mce.c should #include <asm/mce.h>
Every file should include the headers containing the prototypes for
it's global functions.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:22 +01:00
Vivek Goyal f8657e1b55 [PATCH] i386: move startup_32() in text.head section
o Entry startup_32 was in .text section but it was accessing some init
  data too and it prompts MODPOST to generate compilation warnings.

WARNING: vmlinux - Section mismatch: reference to .init.data:boot_params from
.text between '_text' (at offset 0xc0100029) and 'startup_32_smp'
WARNING: vmlinux - Section mismatch: reference to .init.data:boot_params from
.text between '_text' (at offset 0xc0100037) and 'startup_32_smp'
WARNING: vmlinux - Section mismatch: reference to
.init.data:init_pg_tables_end from .text between '_text' (at offset
0xc0100099) and 'startup_32_smp'

o Can't move startup_32 to .init.text as this entry point has to be at the
  start of bzImage. Hence moved startup_32 to a new section .text.head and
  instructed MODPOST to not to generate warnings if init data is being
  accessed from .text.head section. This code has been audited.

o SMP boot up code (startup_32_smp) can go into .init.text if CPU hotplug
  is not supported. Otherwise it generates more warnings

WARNING: vmlinux - Section mismatch: reference to .init.data:new_cpu_data from
.text between 'checkCPUtype' (at offset 0xc0100126) and 'is486'
WARNING: vmlinux - Section mismatch: reference to .init.data:new_cpu_data from
.text between 'checkCPUtype' (at offset 0xc0100130) and 'is486'

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:22 +01:00
Zachary Amsden 90736e20e3 [PATCH] i386: Vmi timer race
Because timer code moves around, and we might eventually move our init to a
late_time_init hook, save and restore IRQs around this code because it is
definitely not interrupt safe.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:21 +01:00
Zachary Amsden ac3b6faff9 [PATCH] i386: Kprobe rpl fix
Kprobes bugfix for paravirt compatibility - RPL on the CS when inserting
BPs must match running kernel.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
CC: Eric Biederman <ebiederm@xmission.com>
2007-02-13 13:26:21 +01:00
Zachary Amsden 7b35520243 [PATCH] i386: Profile pc badness
Profile_pc was broken when using paravirtualization because the
assumption the kernel was running at CPL 0 was violated, causing
bad logic to read a random value off the stack.

The only way to be in kernel lock functions is to be in kernel
code, so validate that assumption explicitly by checking the CS
value.  We don't want to be fooled by BIOS / APM segments and
try to read those stacks, so only match KERNEL_CS.

I moved some stuff in segment.h to make it prettier.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:21 +01:00
Zachary Amsden bbab4f3bb7 [PATCH] i386: vMI timer patches
VMI timer code.  It works by taking over the local APIC clock when APIC is
configured, which requires a couple hooks into the APIC code.  The backend
timer code could be commonized into the timer infrastructure, but there are
some pieces missing (stolen time, in particular), and the exact semantics of
when to do accounting for NO_IDLE need to be shared between different
hypervisors as well.  So for now, VMI timer is a separate module.

[Adrian Bunk: cleanups]

Subject: VMI timer patches
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden 7ce0bcfd16 [PATCH] i386: vMI backend for paravirt-ops
Fairly straightforward implementation of VMI backend for paravirt-ops.

[Adrian Bunk: some cleanups]

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden ae5da273fe [PATCH] i386: SMP boot hook for paravirt
Add VMI SMP boot hook.  We emulate a regular boot sequence and use the same
APIC IPI initiation, we just poke magic values to load into the CPU state when
the startup IPI is received, rather than having to jump through a real mode
trampoline.

This is all that was needed to get SMP to work.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden 8b15114434 [PATCH] i386: iOPL handling for paravirt guests
I found a clever way to make the extra IOPL switching invisible to
non-paravirt compiles - since kernel_rpl is statically defined to be zero
there, and only non-zero rpl kernel have a problem restoring IOPL, as popf
does not restore IOPL flags unless run at CPL-0.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden 9226d125d9 [PATCH] i386: paravirt CPU hypercall batching mode
The VMI ROM has a mode where hypercalls can be queued and batched.  This turns
out to be a significant win during context switch, but must be done at a
specific point before side effects to CPU state are visible to subsequent
instructions.  This is similar to the MMU batching hooks already provided.
The same hooks could be used by the Xen backend to implement a context switch
multicall.

To explain a bit more about lazy modes in the paravirt patches, basically, the
idea is that only one of lazy CPU or MMU mode can be active at any given time.
 Lazy MMU mode is similar to this lazy CPU mode, and allows for batching of
multiple PTE updates (say, inside a remap loop), but to avoid keeping some
kind of state machine about when to flush cpu or mmu updates, we just allow
one or the other to be active.  Although there is no real reason a more
comprehensive scheme could not be implemented, there is also no demonstrated
need for this extra complexity.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden c119ecce89 [PATCH] MM: page allocation hooks for VMI backend
The VMI backend uses explicit page type notification to track shadow page
tables.  The allocation of page table roots is especially tricky.  We need to
clone the root for non-PAE mode while it is protected under the pgd lock to
correctly copy the shadow.

We don't need to allocate pgds in PAE mode, (PDPs in Intel terminology) as
they only have 4 entries, and are cached entirely by the processor, which
makes shadowing them rather simple.

For base page table level allocation, pmd_populate provides the exact hook
point we need.  Also, we need to allocate pages when splitting a large page,
and we must release pages before returning the page to any free pool.

Despite being required with these slightly odd semantics for VMI, Xen also
uses these hooks to determine the exact moment when page tables are created or
released.

AK: All nops for other architectures

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Adrian Bunk 90611fe923 [PATCH] i386: arch/i386/kernel/e820.c should #include <asm/setup.h
Every file should #include the headers containing the prototypes for
its global functions.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:21 +01:00
Maciej W. Rozycki 2e188938ab [PATCH] i386: Fix a typo in an IRQ handler name
The "fasteoi" IRQ handler is named "fasteio" incorrectly.  This is a fix.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:20 +01:00
Jeremy Fitzhardinge 464d1a78fb [PATCH] i386: Convert i386 PDA code to use %fs
Convert the PDA code to use %fs rather than %gs as the segment for
per-processor data.  This is because some processors show a small but
measurable performance gain for reloading a NULL segment selector (as %fs
generally is in user-space) versus a non-NULL one (as %gs generally is).

On modern processors the difference is very small, perhaps undetectable.
Some old AMD "K6 3D+" processors are noticably slower when %fs is used
rather than %gs; I have no idea why this might be, but I think they're
sufficiently rare that it doesn't matter much.

This patch also fixes the math emulator, which had not been adjusted to
match the changed struct pt_regs.

[frederik.deweerdt@gmail.com: fixit with gdb]
[mingo@elte.hu: Fix KVM too]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Ian Campbell <Ian.Campbell@XenSource.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Zachary Amsden <zach@vmware.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:20 +01:00
Len Brown 7f8f97c3cc ACPI: acpi_table_parse() now returns success/fail, not count
Returning count for tables that are supposed to be unique
was useless and confusing.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-13 02:58:52 -05:00
Arjan van de Ven 5dfe4c964a [PATCH] mark struct file_operations const 2
Many struct file_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

[akpm@osdl.org: sparc64 fix]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:44 -08:00
Alon Bar-Lev 4e498b6610 [PATCH] Dynamic kernel command-line: i386
1. Rename saved_command_line into boot_command_line.
2. Set command_line as __initdata.

Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:38 -08:00
Jean-Paul Saman 67d38229df [PATCH] disable init/initramfs.c: architectures
Update all arch/*/kernel/vmlinux.lds.S to not include space for initramfs
when CONFIG_BLK_DEV_INITRAMFS is not selected.  This saves another 4 kbytes
on most platfoms (some reserve PAGE_SIZE for initramfs).

Signed-off-by: Jean-Paul Saman <jean-paul.saman@nxp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:25 -08:00
Dave Jones bd0561c9d8 [CPUFREQ] Fix up merge conflicts with recent ACPI changes.
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-10 20:36:29 -05:00
Rafa³ Bilski 348f31ed2b [CPUFREQ] Longhaul - Separate frequency and voltage transition
This change should make Longhaul more compatible with
both ver. 2 and Powersaver processors. Voltage transitions
will be done before or after frequency transition. That depends
on direction of change. I don't know how to force conservative
governor when voltage scaling is enabled, so there is only
a warning for user. Minimal voltage is calculated in different
way now because in this way more power is saved at lower
multipliers.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-10 20:05:50 -05:00
Rafa³ Bilski e57501c15f [CPUFREQ] Longhaul - Models of Nehemiah
Borowed from VIA driver.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-10 20:05:04 -05:00
Rafa³ Bilski 9addf3b638 [CPUFREQ] Longhaul - Simplier minmult
Simple cleanup in code which is setting minmult.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-10 20:01:48 -05:00
Rafa³ Bilski 86acd49aa1 [CPUFREQ] Enhanced PowerSaver driver
This is driver for Enhanced Powersaver which is present in VIA C7
processors. Beta tested by Jorgen (jorgen (at) greven dot dk).
Thanks! Based on documentation provided by Dave Jones (Thanks!)
and C7 Eden datasheet available from www.via.com.tw. Looks like all
these C7 Eden CPU's don't have P-states in BIOS. I know that 2
p-states is low, but Jorgen finds it usefull anyway because board
is passive cooled.
There are 3 different types of C7 processors (called brands):
0. C7-M - these processors can set any maultiplier between min and
max, any voltage between min and max.
1. C7 - only min and max states are supported. Voltage is different
for min and max states.
2. Eden - only min and max states are supported. Looks like this
brand can only change multiplier. Voltage seems to be the same for
min and max frequency.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-10 20:01:47 -05:00
Roland McGrath 7d91d53190 [PATCH] i386 vDSO: use install_special_mapping
This patch uses install_special_mapping for the i386 vDSO setup, consolidating
duplicated code.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-09 09:25:47 -08:00
Al Viro cb468984f6 [PATCH] io_apic: trivial __iomem annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-09 09:14:07 -08:00
Al Viro 5b71bddb78 [PATCH] hpet: trivial __iomem annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-09 09:14:07 -08:00
Linus Torvalds 78149df6d5 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (41 commits)
  Revert "PCI: remove duplicate device id from ata_piix"
  msi: Make MSI useable more architectures
  msi: Kill the msi_desc array.
  msi: Remove attach_msi_entry.
  msi: Fix msi_remove_pci_irq_vectors.
  msi: Remove msi_lock.
  msi: Kill msi_lookup_irq
  MSI: Combine pci_(save|restore)_msi/msix_state
  MSI: Remove pci_scan_msi_device()
  MSI: Replace pci_msi_quirk with calls to pci_no_msi()
  PCI: remove duplicate device id from ipr
  PCI: remove duplicate device id from ata_piix
  PCI: power management: remove noise on non-manageable hw
  PCI: cleanup MSI code
  PCI: make isa_bridge Alpha-only
  PCI: remove quirk_sis_96x_compatible()
  PCI: Speed up the Intel SMBus unhiding quirk
  PCI Quirk: 1k I/O space IOBL_ADR fix on P64H2
  shpchp: delete trailing whitespace
  shpchp: remove DBG_XXX_ROUTINE
  ...
2007-02-07 19:23:44 -08:00
Eric W. Biederman f7feaca77d msi: Make MSI useable more architectures
The arch hooks arch_setup_msi_irq and arch_teardown_msi_irq are now
responsible for allocating and freeing the linux irq in addition to
setting up the the linux irq to work with the interrupt.

arch_setup_msi_irq now takes a pci_device and a msi_desc and returns
an irq.

With this change in place this code should be useable by all platforms
except those that won't let the OS touch the hardware like ppc RTAS.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:08 -08:00
Len Brown 57e1c5c87d Pull test into release branch 2007-02-06 15:31:00 -05:00
Rafa³ Bilski 786f46b262 [CPUFREQ] Longhaul - Add VT8235 support
I don't know why it is working and how, but it is working. On my
Epia transition time is by default set to 100us. I'm changing it to
200us. After that I can change frequency from min (x4.0) to max (x7.5)
without lockup. Many times.
There is a paranoid check at a beginning of a patch. Probably dead
code, but I don't have better ideas for CL10000 case at the moment.
Only way to to detect broken chip seems to be looking in log for
spurious interrupts.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-04 18:09:19 -05:00
Rafa³ Bilski 46ef955f5c [CPUFREQ] Longhaul - Fix guess_fsb function
This is bug reported by John-Marc Chandonia:
> Detected 1002.292 MHz processor.
> longhaul: VIA C3 'Nehemiah B' [C5N] CPU detected.  Powersaver supported.
> longhaul: Using throttling support.
> longhaul: Invalid (reserved) FSB!
FSB is correcly guessed for 999.554 MHz CPU.
To fix this error:
- ROUNDING should be range, not mask - at it's current value it is +7 -8,
- more precise calculations inside guess_fsb - 7.5x133MHz is 1000MHz now.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-04 18:09:19 -05:00
Frédéric Riss 40c373cc3a [PATCH] EFI x86: pass firmware call parameters on the stack
When calling into the EFI firmware, the parameters need to be passed on
the stack. The recent change to use -mregparm=3 breaks x86 EFI support.
This patch is needed to allow the new Intel-based Macs to suspend to ram
(efi.get_time is called during the suspend phase).

Signed-off-by: Frederic Riss <frederic.riss@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-04 10:27:10 -08:00
Rafa³ Bilski 0d44b2ba28 [CPUFREQ] Longhaul - Remove duplicate tables
Now there is no need to depend on -1 in Nehemiah tables. After
previous change code is eliminating multipliers lower then 5.0
by minmult for Nehemiah A and B.

Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-03 17:25:19 -05:00
Rafa³ Bilski 980342a7eb [CPUFREQ] Longhaul - Introduce Nehemiah C
Looks like some time ago I introduced a bug to Longhaul.
I had report that 9x133Mhz CPU is seen as 5x133MHz. So I
changed multipliers table. That was a mistake. According to
documentation table was correct. So only way to avoid 5 or 9
dilema is not use MaxMHzBR for PowerSaver 1.0. One code that
works on all processors. To do it I need also separate flag for
Nehemiah C (min = x4.0) and Nehemiah (min = x5.0).

Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-03 17:25:19 -05:00
Joachim Deguara 58389a86df [CPUFREQ] fix cpuinfo_cur_freq for CPU_HW_PSTATE
This fixes the cpuinfo_cur_freq value by using the correct
find_khz_freq_from_fiddid() when the CPU uses hardware p-states.

Signed-off-by: Joachim Deguara <joachim.deguara@amd.com>
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-03 17:25:19 -05:00
Rafa³ Bilski 1479672283 [CPUFREQ] Longhaul - Remove "ignore_latency" option
There is no need to have this option in Longhaul anymore.
It was for laptop with CLE266 chipset in times, when only
ACPI C3 was used to switch frequency. Now we have native
support not only for CLE266, but CN400 too. Would be good
to have support for PN266, but I can't find datasheet for it.
Looks like BIOS for CPU's faster then 1GHz don't support
ACPI C2 nor C3.

Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-03 17:24:08 -05:00
Alexey Starikovskiy 0e5683350f ACPI: build fix for IBM x440 - CONFIG_X86_SUMMIT
i386 srat.c broke due to re-names from ACPICA table-manager re-write.

Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:47:33 -05:00
Alexey Starikovskiy f18c5a08bf ACPICA: Allow ACPI id to be u32 instead of u8.
Allow ACPI id to be u32 instead of u8.
Requires drop of conversion tables with the acpiid as index.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:31 -05:00
Alexey Starikovskiy 15a58ed121 ACPICA: Remove duplicate table definitions (non-conflicting), cont
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:29 -05:00
Alexey Starikovskiy 5f3b1a8b67 ACPICA: Remove duplicate table definitions (non-conflicting)
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:29 -05:00
Alexey Starikovskiy ad363f80c3 ACPICA: Remove duplicate table definitions.
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:28 -05:00
Alexey Starikovskiy cee324b145 ACPICA: use new ACPI headers.
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:28 -05:00
Alexey Starikovskiy ceb6c46839 ACPICA: Remove duplicate table manager
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:28 -05:00
Alexey Starikovskiy ad71860a17 ACPICA: minimal patch to integrate new tables into Linux
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:22 -05:00
Linus Torvalds 435f8a605d Revert "[PATCH] fix typo in geode_configre()@cyrix.c"
This reverts commit e4f0ae0ea6.

It's not wrong, but it's not right either, and everybody seems to agree
that the right fix is probably to do the ccr3 write after the ccr4 one
(and that we also should clean it up a bit).  And after that we need to
really validate that all the bits that we write to ccr4 actually do
work.

The old 2.6.19 code was insane, and basically didn't change ccr4 at all
(even though it certainly looks like it was the *intent* to do so).  So
let's revert the change that may fix things, just because it's not what
was actually ever tested when the code was written, even if it _was_ the
intent.

There's a discussion on http://lkml.org/lkml/2007/1/9/63 that was
started by the patch that now gets reverted, and that discussion may
well contain the proper long-term fix.

Suggested-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-02 08:07:42 -08:00
Linus Torvalds ad2e62a038 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Remove unneeded errata workaround from p4-clockmod.
  [CPUFREQ] check sysfs_create_link return value
2007-01-30 08:44:08 -08:00
Eric W. Biederman 8339f0008c [PATCH] i386: In assign_irq_vector look at all vectors before giving up
When the world was a simple and static place setting up irqs was easy.
It sufficed to allocate a linux irq number and a find a free cpu
vector we could receive that linux irq on.  In those days it was
a safe assumption that any allocated vector was actually in use
so after one global pass through all of the vectors we would have
none left.

These days things are much more dynamic with interrupt controllers
(in the form of MSI or MSI-X) appearing on plug in cards and linux
irqs appearing and disappearing.  As these irqs come and go vectors
are allocated and freed,  invalidating the ancient assumption that all
allocated vectors stayed in use forever.

So this patch modifies the vector allocator to walk through every
possible vector before giving up, and to check to see if a vector
is in use before assigning it.  With these changes we stop leaking
freed vectors and it becomes possible to allocate and free irq vectors
all day long.

This changed was modeled after the vector allocator on x86_64 where
this limitation has already been removed.  In essence we don't update
the static variables that hold the position of the last vector we
allocated until have successfully allocated another vector.  This
allows us to detect if we have completed one complete scan through
all of the possible vectors.

Acked-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30 08:29:58 -08:00
Dave Jones 3453c8478a [CPUFREQ] Remove unneeded errata workaround from p4-clockmod.
This workaround unnecessarily cripples functionality to work
around an errata that doesn't seem possible to hit due to
us using the automatic clock throttling in the p4 mcheck code.

See http://lkml.org/lkml/2006/10/28/148 for complete reasoning
and lack of disconsent.

Signed-off-by: Dave Jones <davej@redhat.com>
2007-01-29 00:07:04 -05:00
Roland McGrath f47aef55d9 [PATCH] i386 vDSO: use VM_ALWAYSDUMP
This patch fixes core dumps to include the vDSO vma, which is left out now.
It removes the special-case core writing macros, which were not doing the
right thing for the vDSO vma anyway.  Instead, it uses VM_ALWAYSDUMP in the
vma; there is no need for the fixmap page to be installed.  It handles the
CONFIG_COMPAT_VDSO case by making elf_core_dump use the fake vma from
get_gate_vma after real vmas in the same way the /proc/PID/maps code does.

This changes core dumps so they no longer include the non-PT_LOAD phdrs from
the vDSO.  I made the change to add them in the first place, but in turned out
that nothing ever wanted them there since the advent of NT_AUXV.  It's cleaner
to leave them out, and just let the phdrs inside the vDSO image speak for
themselves.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26 13:50:58 -08:00
Roland McGrath a1f3bb9ae4 [PATCH] Fix CONFIG_COMPAT_VDSO
I wouldn't mind if CONFIG_COMPAT_VDSO went away entirely.  But if it's there,
it should work properly.  Currently it's quite haphazard: both real vma and
fixmap are mapped, both are put in the two different AT_* slots, sysenter
returns to the vma address rather than the fixmap address, and core dumps yet
are another story.

This patch makes CONFIG_COMPAT_VDSO disable the real vma and use the fixmap
area consistently.  This makes it actually compatible with what the old vdso
implementation did.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26 13:50:58 -08:00
Ingo Molnar 0dbe5a1113 [PATCH] paravirt: mark the paravirt_ops export internal
The paravirt subsystem is still in flux so all exports from it are
definitely internal use only.  The APIs around this /will/ change.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-23 07:52:06 -08:00
Venkatesh Pallipadi 58d9ce7d75 [PATCH] Revert nmi_known_cpu() check during boot option parsing
Commit f2802e7f57 and its x86 version
(b7471c6da9) adds nmi_known_cpu() check
while parsing boot options in x86_64 and i386.

With that, "nmi_watchdog=2" stops working for me on Intel Core 2 CPU
based system.

The problem is, setup_nmi_watchdog is called while parsing the boot
option and identify_cpu is not done yet.  So, the return value of
nmi_known_cpu() is not valid at this point.

So revert that check.  This should not have any adverse effect as the
nmi_known_cpu() check is done again later in enable_lapic_nmi_watchdog().

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-23 07:52:05 -08:00
James Bottomley 9ee79a3d37 [PATCH] x86: fix PDA variables to work during boot
The current PDA code, which went in in post 2.6.19 has a flaw in that it
doesn't correctly cycle the GDT and %GS segment through the boot PDA,
the CPU PDA and finally the per-cpu PDA.

The bug generally doesn't show up if the boot CPU id is zero, but
everything falls apart for a non zero boot CPU id.  The basically kills
voyager which is perfectly capable of doing non zero CPU id boots, so
voyager currently won't boot without this.

The fix is to be careful and actually do the GDT setups correctly.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-22 19:39:36 -08:00
Linus Torvalds e947382ed3 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  Revert "ACPI: ibm-acpi: make non-generic bay support optional"
  ACPI: update MAINTAINERS
  ACPI: schedule obsolete features for deletion
  ACPI: delete two spurious ACPI messages
  ACPI: rename cstate_entry_s to cstate_entry
  ACPI: ec: enable printk on cmdline use
  ACPI: Altix: ACPI _PRT support
2007-01-11 18:25:44 -08:00
takada e4f0ae0ea6 [PATCH] fix typo in geode_configre()@cyrix.c
We write back the wrong register when configuring the Geode processor.
Instead of storing to CCR4, it stores to CCR3.

Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-11 18:18:21 -08:00
Vivek Goyal 664c0d3d57 [PATCH] i386: sched_clock using init data tsc_disable fix
o sched_clock() a non-init function is using init data tsc_disable. This
  is flagged by MODPOST on i386 if CONFIG_RELOCATABLE=y

WARNING: vmlinux - Section mismatch: reference to .init.data:tsc_disable from .text between 'sched_clock' (at offset 0xc0109d58) and 'tsc_update_callback'

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-11 18:18:20 -08:00
Len Brown 85f4544fbf Pull trivial into release branch 2007-01-11 01:55:25 -05:00
Venkatesh Pallipadi 5d65131fa8 ACPI: rename cstate_entry_s to cstate_entry
style change only.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-01-10 23:08:38 -05:00
Vivek Goyal 88d20328cd [PATCH] i386: Convert some functions to __init to avoid MODPOST warnings
o Some functions which should have been in init sections as they are called
  only once. Put them in init sections. Otherwise MODPOST generates warning
  as these functions are placed in .text and they end up accessing something
  in init sections.

WARNING: vmlinux - Section mismatch: reference to .init.text:migration_init
from .text between 'do_pre_smp_initcalls' (at offset 0xc01000d1) and
'run_init_process'

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-01-11 01:52:44 +01:00
Vivek Goyal 4a5d107a9a [PATCH] i386: cpu hotplug/smpboot misc MODPOST warning fixes
o Misc smpboot/cpu hotplug path cleanups. I did those to supress the
  warnings generated by MODPOST. These warnings are visible only
  if CONFIG_RELOCATABLE=y.

o CONFIG_RELOCATABLE compiles the kernel with --emit-relocs option. This
  option retains relocation information in vmlinux file and MODPOST
  is quick to spit out "Section mismatch" warnings.

o This patch fixes some of those warnings. Many of the functions in
  smpboot case are __devinit type and they in turn accesses text/data which
  if of type __cpuinit. Now if CONFIG_HOTPLUG=y and CONFIG_HOTPLUG_CPU=n
  then we end up in cases where a function in .text segment is calling
  another function in .init.text segment and MODPOST emits warning.

WARNING: vmlinux - Section mismatch: reference to .init.text:identify_cpu from .text between 'smp_store_cpu_info' (at offset 0xc011020d) and 'do_boot_cpu'
WARNING: vmlinux - Section mismatch: reference to .init.text:init_gdt from .text between 'do_boot_cpu' (at offset 0xc01102ca) and '__cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.text:print_cpu_info from .text between 'do_boot_cpu' (at offset 0xc01105d0) and '__cpu_up'

o It also fixes the issues where CONFIG_HOTPLUG_CPU=y and start_secondary()
  is calling smp_callin() which in-turn calls synchronize_tsc_ap() which is
  of type __init. This should have meant broken CPU hotplug.

WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc011603f) and 'initialize_secondary'
WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'MP_processor_info' (at offset 0xc0116a4f) and 'mp_register_lapic'
WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'MP_processor_info' (at offset 0xc0116a4f) and 'mp_register_lapic'

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-01-11 01:52:44 +01:00
Vivek Goyal 3771a450cf [PATCH] i386: modpost smpboot code warning fix
o Currently synchronize_tsc_ap() is of type __init. It is called by
  smp_callin() which is of type __cpuinit. So synchronize_tsc_ap()
  should be of type __cpuinit.

o Modpost generates warnings for i386 if CONFIG_RELOCATABLE=y and
  CONFIG_HOTPLUG_CPU=y

WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164dc) and 'initialize_secondary'
WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164e8) and 'initialize_secondary'

o tsc is of type __initdata. It should be of type __cpuinitdata.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05 23:55:23 -08:00
Vivek Goyal 9dbeeec91e [PATCH] i386: fix another modpost warning
o MODPOST generates warning for i386 if kernel is compiled with
  CONFIG_RELOCATABLE=y

WARNING: vmlinux - Section mismatch: reference to .init.data: from .data between 'this_cpu' (at offset 0xc05194d0) and 'cpuinfo_op'

o this_cpu pointer should be of type __cpuinitdata.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05 23:55:23 -08:00
Vivek Goyal 1119a33a96 [PATCH] i386: fix modpost warning in SMP trampoline code
o MODPOST generates warning for i386 if kernel is compiled with
  CONFIG_RELOCATABLE=y

WARNING: vmlinux - Section mismatch: reference to .init.text:startup_32_smp
from .data between 'trampoline_data' (at offset 0xc0519cf8) and 'boot_gdt'

o trampoline code/data can go into init section is CPU hotplug is not
  enabled.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05 23:55:23 -08:00
Linus Torvalds d1398a6ff5 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI: asus_acpi: new MAINTAINER
  ACPI: fix section mis-match build warning
  ACPI: increase ACPI_MAX_REFERENCE_COUNT for larger systems
  ACPI: EC: move verbose printk to debug build only
  backlight: fix backlight_device_register compile failures
2007-01-04 08:55:57 -08:00
Linus Torvalds de9e957f12 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] longhaul: Kill off warnings introduced by recent changes.
  [CPUFREQ] Uninitialized use of cmd.val in arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c:acpi_cpufreq_target()
  [CPUFREQ] Longhaul - Always guess FSB
  [CPUFREQ] Longhaul - Fix up powersaver assumptions.
  [CPUFREQ] longhaul: Fix up unreachable code.
  [CPUFREQ] speedstep-centrino: missing space and bracket
  [CPUFREQ] Bug fix for acpi-cpufreq and cpufreq_stats oops on frequency change notification
  [CPUFREQ] select consistently
2007-01-03 17:34:12 -08:00
Dave Jones 43c8f12f9f [CPUFREQ] longhaul: Kill off warnings introduced by recent changes.
Bunch of unused vars + one case where gcc isn't smart enough.

Signed-off-by: Dave Jones <davej@redhat.com>
2007-01-02 23:42:16 -05:00
Guillaume Chazarain 76ff28c941 [CPUFREQ] Uninitialized use of cmd.val in arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c:acpi_cpufreq_target()
cmd.val was used uninitialized on the line below.

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-01-02 23:38:39 -05:00
Rafa³ Bilski 24ebead82b [CPUFREQ] Longhaul - Always guess FSB
This is patch that solves Ebox mini PC issue and make
FSB code more specification compilant. At start guess_fsb
function is guessing 200MHz FSB too. It is better to
make it in this way because, thanks to this function, driver
will fail for bogus FSB values caused by bogus multiplier
value. For PowerSaver processors we can't depend on Max /
MinMHzFSB because these values are only used for
PowerSaver 2.0 and 3.0. Most processors on which Longhaul
is used are PowerSaver 1.0 only. I'm changing code for older
CPU's too, but not so much as previously, and this code was
already used for Ezra. Using MinMHzBR for Ezra-T is outside
spec. It is for voltage scaling purpose and don't have to
be equal to minmult (but it is). Same for Nehemiah (it
isn't for sure). Added mult - current multiplier value.

Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-01-02 23:38:39 -05:00
Len Brown e82c354bb2 ACPI: fix section mis-match build warning
Dunno why this pops out in only in the allmodconfig build.
Though the warning is accurate, all the callers of the flagged
non __init function are __init, this is not a functional change.

WARNING: vmlinux - Section mismatch: reference to .init.data:acpi_sci_flags from .text between 'acpi_sci_ioapic_setup' (at offset 0xc010f0a
6) and 'acpi_gsi_to_irq'                                                                                                                   WARNING: vmlinux - Section mismatch: reference to .init.text:mp_override_legacy_irq from .text between 'acpi_sci_ioapic_setup' (at offset 0
xc010f0de) and 'acpi_gsi_to_irq'                                                                                                           WARNING: vmlinux - Section mismatch: reference to .init.data:acpi_sci_override_gsi from .text between 'acpi_sci_ioapic_setup' (at offset 0x
c010f0e4) and 'acpi_gsi_to_irq'

Signed-off-by: Len Brown <len.brown@intel.com>
2007-01-02 00:19:05 -05:00
Rafa³ Bilski 264166e604 [CPUFREQ] Longhaul - Fix up powersaver assumptions.
ACPI PM2 register was fallback for "Longhaul ver. 1" CPU's.
My assumption that this register isn't present at
"PowerSaver" motherboards is so far true, but current code
will not work correctly in other case. There are three possible
supports: ACPI C3, PM2 and northbridge. That was my assumption
that ACPI C3 and northbridge is for PS and northbridge and PM2
is for V1. In current code we can only check if it is ACPI
support or not by port22_en. So remove port22_en and add
longhaul_flags. If USE_ACPI_C3 and USE_NORTHBRIDGE are both
clear then it means ACPI PM2 support. Also change order of
support probe from ACPI C3, PM2, northbridge to ACPI C3,
northbridge, ACPI PM2. Paranoid protection against port 0x22
cast as ACPI PM2 register. Bit 1 clear in such case - lockup
on AGP DMA. And obvious (now) fixup for do_powersaver. Use
cx->address only for ACPI C3 ("PowerSaver" processor using
PM2 support).

Signed-off-by: Rafa¿ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-29 15:56:44 -05:00
Dave Jones 1cfe201426 [CPUFREQ] longhaul: Fix up unreachable code.
Signed-off-by: Rafał Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-28 22:30:16 -05:00
Brice Goglin d349c4a5ae [CPUFREQ] speedstep-centrino: missing space and bracket
A space and a bracket are missing (and indentation is wrong).

Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-22 22:45:41 -05:00
Venkatesh Pallipadi 8edc59d939 [CPUFREQ] Bug fix for acpi-cpufreq and cpufreq_stats oops on frequency change notification
Fixes the oops in cpufreq_stats with acpi_cpufreq driver.  The issue was
that the frequency was reported as 0 in acpi-cpufreq.c.  The bug is due to
different indicies for freq_table and ACPI perf table.

Also adds a check in cpufreq_stats to check for error return from
freq_table_get_index() and avoid using the error return value.

Patch fixes the issue reported at
http://www.ussg.iu.edu/hypermail/linux/kernel/0611.2/0629.html
and also other similar issue here
http://bugme.osdl.org/show_bug.cgi?id=7383 comment 53

Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-22 22:45:41 -05:00
Linus Torvalds 18ed1c0513 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (68 commits)
  ACPI: replace kmalloc+memset with kzalloc
  ACPI: Add support for acpi_load_table/acpi_unload_table_id
  fbdev: update after backlight argument change
  ACPI: video: Add dev argument for backlight_device_register
  ACPI: Implement acpi_video_get_next_level()
  ACPI: Kconfig - depend on PM rather than selecting it
  ACPI: fix NULL check in drivers/acpi/osl.c
  ACPI: make drivers/acpi/ec.c:ec_ecdt static
  ACPI: prevent processor module from loading on failures
  ACPI: fix single linked list manipulation
  ACPI: ibm_acpi: allow clean removal
  ACPI: fix git automerge failure
  ACPI: ibm_acpi: respond to workqueue update
  ACPI: dock: add uevent to indicate change in device status
  ACPI: ec: Lindent once again
  ACPI: ec: Change #define to enums there possible.
  ACPI: ec: Style changes.
  ACPI: ec: Acquire Global Lock under EC mutex.
  ACPI: ec: Drop udelay() from poll mode. Loop by reading status field instead.
  ACPI: ec: Rename gpe_bit to gpe
  ...
2006-12-22 18:46:56 -08:00
Ingo Molnar 0888f06ac9 [PATCH] sched: fix bad missed wakeups in the i386, x86_64, ia64, ACPI and APM idle code
Fernando Lopez-Lezcano reported frequent scheduling latencies and audio
xruns starting at the 2.6.18-rt kernel, and those problems persisted all
until current -rt kernels. The latencies were serious and unjustified by
system load, often in the milliseconds range.

After a patient and heroic multi-month effort of Fernando, where he
tested dozens of kernels, tried various configs, boot options,
test-patches of mine and provided latency traces of those incidents, the
following 'smoking gun' trace was captured by him:

                 _------=> CPU#
                / _-----=> irqs-off
               | / _----=> need-resched
               || / _---=> hardirq/softirq
               ||| / _--=> preempt-depth
               |||| /
               |||||     delay
   cmd     pid ||||| time  |   caller
      \   /    |||||   \   |   /
  IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup (try_to_wake_up)
  IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup <<...>-5856> (37 0)
  IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup (c01262ba 0 0)
  IRQ_19-1479  1D..1    0us : resched_task (try_to_wake_up)
  IRQ_19-1479  1D..1    0us : __spin_unlock_irqrestore (try_to_wake_up)
  ...
  <idle>-0     1...1   11us!: default_idle (cpu_idle)
  ...
  <idle>-0     0Dn.1  602us : smp_apic_timer_interrupt (c0103baf 1 0)
  ...
   <...>-5856  0D..2  618us : __switch_to (__schedule)
   <...>-5856  0D..2  618us : __schedule <<idle>-0> (20 162)
   <...>-5856  0D..2  619us : __spin_unlock_irq (__schedule)
   <...>-5856  0...1  619us : trace_stop_sched_switched (__schedule)
   <...>-5856  0D..1  619us : trace_stop_sched_switched <<...>-5856> (37 0)

what is visible in this trace is that CPU#1 ran try_to_wake_up() for
PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task()
for CPU#0. But it decided to not send an IPI that no CPU - due to
TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set,
and only rescheduled to PID:5856 upon the next lapic timer IRQ. The
result was a 600+ usecs latency and a missed wakeup!

the bug turned out to be an idle-wakeup bug introduced into the mainline
kernel this summer via an optimization in the x86_64 tree:

    commit 495ab9c045
    Author: Andi Kleen <ak@suse.de>
    Date:   Mon Jun 26 13:59:11 2006 +0200

    [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status

    During some profiling I noticed that default_idle causes a lot of
    memory traffic. I think that is caused by the atomic operations
    to clear/set the polling flag in thread_info. There is actually
    no reason to make this atomic - only the idle thread does it
    to itself, other CPUs only read it. So I moved it into ti->status.

the problem is this type of change:

        if (!hlt_counter && boot_cpu_data.hlt_works_ok) {
-               clear_thread_flag(TIF_POLLING_NRFLAG);
+               current_thread_info()->status &= ~TS_POLLING;
                smp_mb__after_clear_bit();
                while (!need_resched()) {
                        local_irq_disable();

this changes clear_thread_flag() to an explicit clearing of TS_POLLING.
clear_thread_flag() is defined as:

        clear_bit(flag, &ti->flags);

and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms:

  static inline void clear_bit(int nr, volatile unsigned long * addr)
  {
          __asm__ __volatile__( LOCK_PREFIX
                  "btrl %1,%0"

hence smp_mb__after_clear_bit() is defined as a simple compile barrier:

  #define smp_mb__after_clear_bit()       barrier()

but the explicit TS_POLLING clearing introduced by the patch:

+               current_thread_info()->status &= ~TS_POLLING;

is not an atomic op! So the clearing of the TS_POLLING bit is freely
reorderable with the reading of the NEED_RESCHED bit - and both now
reside in different memory addresses.

CPU idle wakeup very much depends on ordered memory ops, the clearing of
the TS_POLLING flag must always be done before we test need_resched()
and hit the idle instruction(s). [Symmetrically, the wakeup code needs
to set NEED_RESCHED before it tests the TS_POLLING flag, so memory
ordering is paramount.]

Fernando's dual-core Athlon64 system has a sufficiently advanced memory
ordering model so that it triggered this scenario very often.

( And it also turned out that the reason why these latencies never
  triggered on my testsystems is that i routinely use idle=poll, which
  was the only idle variant not affected by this bug. )

The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to
act as an absolute barrier between the TS_POLLING write and the
NEED_RESCHED read. This affects almost all idling methods (default,
ACPI, APM), on all 3 x86 architectures: i386, x86_64, ia64.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22 08:55:51 -08:00
Jeremy Fitzhardinge 8701ea957d [PATCH] ptrace: Fix EFL_OFFSET value according to i386 pda changes
The PDA patches introduced a bug in ptrace: it reads eflags from the wrong
place on the target's stack, but writes it back to the correct place.  The
result is a corrupted eflags, which is most visible when it turns interrupts
off unexpectedly.

This patch fixes this by making the ptrace code a little less fragile.  It
changes [gs]et_stack_long to take a straightforward byte offset into struct
pt_regs, rather than requiring all callers to do a sizeof(struct pt_regs)
offset adjustment.  This means that the eflag's offset (EFL_OFFSET) on the
target stack can be simply computed with offsetof().

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Frederik Deweerdt <deweerdt@free.fr>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22 08:55:51 -08:00
Jean Delvare be31f9cbc8 [PATCH] microcode: fix mc_cpu_notifier section warning
Structure mc_cpu_notifier references a __cpuinit function, but isn't
declared __cpuinitdata itself:

WARNING: arch/i386/kernel/microcode.o - Section mismatch: reference
to .init.text: from .data after 'mc_cpu_notifier' (at offset 0x118)

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22 08:55:50 -08:00
Yasunori Goto 5c95da9f5a [PATCH] compile error of register_memory()
register_memory() becomes double definition in 2.6.20-rc1.  It is defined
in arch/i386/kernel/setup.c as static definition in 2.6.19.  But it is
moved to arch/i386/kernel/e820.c in 2.6.20-rc1.  And same name function is
defined in driver/base/memory.c too.  So, it becomes cause of compile error
of duplicate definition if memory hotplug option is on.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22 08:55:49 -08:00
Len Brown 9774f33841 merge linus into test branch 2006-12-20 02:53:13 -05:00
Linus Torvalds e25db641c0 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] longhaul compile fix.
  [CPUFREQ] Advise not to use longhaul on VIA C7.
  [CPUFREQ] set policy->curfreq on initialization
  [CPUFREQ] Trivial cleanup for acpi read/write port in acpi-cpufreq.c
  [CPUFREQ] fixes typo in cpufreq.c
2006-12-17 19:08:11 -08:00
Dave Jones 928ee513c2 [CPUFREQ] longhaul compile fix.
Some gcc's are more anal than others about empty switch labels.
error: label at end of compound statement

Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-17 19:09:59 -05:00
Dave Jones 8ec9822dd1 [CPUFREQ] Advise not to use longhaul on VIA C7.
C7's are centrino speedstep-alike.

Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-17 19:07:35 -05:00
Mattia Dongili a507ac4b01 [CPUFREQ] set policy->curfreq on initialization
Check the correct variable and set policy->cur upon acpi-cpufreq
initialization to allow the userspace governor to be used as default.

Signed-off-by: Mattia Dongili <malattia@linux.it>
Acked-by: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-17 18:09:26 -05:00
Len Brown 7e244322cd ACPI: fix git automerge failure
Signed-off-by: Len Brown <len.brown@intel.com>
2006-12-16 00:59:38 -05:00
Len Brown 463e7c7cf9 Pull trivial into test branch
Conflicts:

	drivers/acpi/ec.c
2006-12-16 00:45:07 -05:00
Linus Torvalds d1526e2cda Remove stack unwinder for now
It has caused more problems than it ever really solved, and is
apparently not getting cleaned up and fixed.  We can put it back when
it's stable and isn't likely to make warning or bug events worse.

In the meantime, enable frame pointers for more readable stack traces.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-15 08:47:51 -08:00
Venkatesh Pallipadi 4e581ff165 [CPUFREQ] Trivial cleanup for acpi read/write port in acpi-cpufreq.c
Small cleanup in acpi-cpufreq.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-13 18:12:31 -05:00
Robert P. J. Day 5cbded585d [PATCH] getting rid of all casts of k[cmz]alloc() calls
Run this:

	#!/bin/sh
	for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do
	  echo "De-casting $f..."
	  perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" $f
	done

And then go through and reinstate those cases where code is casting pointers
to non-pointers.

And then drop a few hunks which conflicted with outstanding work.

Cc: Russell King <rmk@arm.linux.org.uk>, Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:58 -08:00
Tigran Aivazian 69688262fb [PATCH] update Tigran's email addresses
As Adrian pointed out recently, there were still a couple of places where
I should have fixed my email address.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:53 -08:00
Andrew Morton 24d34dc564 [PATCH] arch/i386/kernel/smpboot.c: remove unneeded ifdef
#ifdef CONFIG_SMP in a file which isn't compiled in non-SMP kernels.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:46 -08:00
Dave Jones c4366889dd Merge ../linus
Conflicts:

	drivers/cpufreq/cpufreq.c
2006-12-12 17:41:41 -05:00
Rafa³ Bilski db2fb9db57 [CPUFREQ] Longhaul - Add support for CN400
Support for CN400 northbridge when ACPI C3 isn't available.
Tested on Epia SP13000. Thanks to Robert for testing it.

Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-12 17:33:10 -05:00
Rafa³ Bilski 3f4a25f17e [CPUFREQ] Longhaul - fix 200MHz FSB
On board of Epia SP13000 is 10x133Mhz VIA Nehemiah. It is reported
as 10x200MHz. This patch is fixing this issue.

Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-12 17:33:10 -05:00
Dominik Brodowski e11952b971 [CPUFREQ] p4-clockmod: fix support for Core
Support for Core CPUs was broken in two ways in speedstep-lib: for x86_64,
we missed a MSR definition; for both x86_64 and i386, the FSB calculation
was wrong by four (it's a quad-pumped bus). Also increase the accuracy
of the calculation.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-12 17:29:04 -05:00
Venkatesh Pallipadi 3d4a7ef3d3 [CPUFREQ] Fix the bug in duplicate freq elimination code in acpi-cpufreq
Fix the bug in duplicate states elimination in acpi-cpufreq.

Bug: Due to duplicate state elimiation in the loop earlier, the number
of valid_states can be less than perf->state_count, in which case
freq_table was ending up with some garbage/uninitialized entries
in the table.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
From:  Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-12 17:20:50 -05:00
Gary Hade 8b9c6671f8 [CPUFREQ] speedstep-centrino should ignore upper performance control bits
On some systems there could be bits set in the upper half of
the control value provided by the _PSS object.  These bits are
only relevant for cpufreq drivers that use IO ports which are not
currently supported by the speedstep-centrino driver.  The current
MSR oriented code assumes that upper bits are not set and thus
fails to work correctly when they are.  e.g. the control and status
value equality check failed on the IBM x3650 even though the ACPI
spec allows inequality.

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-12 17:20:49 -05:00
Jean Delvare 55e337345d [CPUFREQ] Optimize gx-suspmod revision ID fetching
We don't need a temporary variable to get the PCI revision ID.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-12-12 17:20:49 -05:00
Andi Kleen 306a22c2a2 [PATCH] i386: Fix io_apic.c warning
gcc 4.2 warns

linux/arch/i386/kernel/io_apic.c: In function ‘create_irq’:
linux/arch/i386/kernel/io_apic.c:2488: warning: ‘vector’ may be used uninitialized in this function

The warning is false, but somewhat legitimate so work around it.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-09 21:33:36 +01:00
Randy Dunlap 7e74437cf6 [PATCH] i386: export smp_num_siblings for oprofile
oprofile uses smp_num_siblings without testing for CONFIG_X86_HT.
I looked at modifying oprofile, but this way is cleaner & simpler
and I didn't see a good reason not to just export it when CONFIG_SMP.

WARNING: "smp_num_siblings" [arch/i386/oprofile/oprofile.ko] undefined!

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-09 21:33:36 +01:00
Andi Kleen 1bac3b383a [PATCH] x86: Work around gcc 4.2 over aggressive optimizer
The new PDA code uses a dummy _proxy_pda variable to describe
memory references to the PDA. It is never referenced
in inline assembly, but exists as input/output arguments.
gcc 4.2 in some cases can CSE references to this which causes
unresolved symbols.  Define it to zero to avoid this.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-09 21:33:36 +01:00
Ravikiran G Thirumalai 92715e282b [PATCH] x86: Fix boot hang due to nmi watchdog init code
2.6.19  stopped booting (or booted based on build/config) on our x86_64
systems due to a bug introduced in 2.6.19.  check_nmi_watchdog schedules an
IPI on all cpus to  busy wait on a flag, but fails to set the busywait
flag if NMI functionality is disabled.  This causes the secondary cpus
to spin in an endless loop, causing the kernel bootup to hang.
Depending upon the build, the  busywait flag got overwritten (stack variable)
and caused  the kernel to bootup on certain builds.  Following patch fixes
the bug by setting the busywait flag before returning from check_nmi_watchdog.
I guess using a stack variable is not good here as the calling function could
potentially return while the busy wait loop is still spinning on the flag.

AK: I redid the patch significantly to be cleaner

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-09 21:33:35 +01:00
Andi Kleen 16d279d277 [PATCH] x86: Fix verify_quirk_intel_irqbalance()
Fix verify_quirk_intel_irqbalance(). genapic checks should really
happen only on affected versions of the E7520/E7320/E7525 based platforms.

AK: This should akpm's Coyote SDV

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-09 21:33:35 +01:00
Shaohua Li 3b1bdf4e08 [PATCH] CPU hotplug broken with 2GB VMSPLIT
In VMSPLIT mode, kernel PGD might have more entries than user space.

Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08 08:29:09 -08:00
Josef "Jeff" Sipek aab4c5a51c [PATCH] i386: change uses of f_{dentry, vfsmnt} to use f_path
Change all the uses of f_{dentry,vfsmnt} to f_path.{dentry,mnt} in the i386
arch code.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08 08:28:42 -08:00
Jeremy Fitzhardinge 91768d6c2b [PATCH] Generic BUG for i386
This makes i386 use the generic BUG machinery.  There are no functional
changes from the old i386 implementation.

The main advantage in using the generic BUG machinery for i386 is that the
inlined overhead of BUG is just the ud2a instruction; the file+line(+function)
information are no longer inlined into the instruction stream.  This reduces
cache pollution, and makes disassembly work properly.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Hugh Dickens <hugh@veritas.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08 08:28:39 -08:00
Linus Torvalds 4522d58275 Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (156 commits)
  [PATCH] x86-64: Export smp_call_function_single
  [PATCH] i386: Clean up smp_tune_scheduling()
  [PATCH] unwinder: move .eh_frame to RODATA
  [PATCH] unwinder: fully support linker generated .eh_frame_hdr section
  [PATCH] x86-64: don't use set_irq_regs()
  [PATCH] x86-64: check vector in setup_ioapic_dest to verify if need setup_IO_APIC_irq
  [PATCH] x86-64: Make ix86 default to HIGHMEM4G instead of NOHIGHMEM
  [PATCH] i386: replace kmalloc+memset with kzalloc
  [PATCH] x86-64: remove remaining pc98 code
  [PATCH] x86-64: remove unused variable
  [PATCH] x86-64: Fix constraints in atomic_add_return()
  [PATCH] x86-64: fix asm constraints in i386 atomic_add_return
  [PATCH] x86-64: Correct documentation for bzImage protocol v2.05
  [PATCH] x86-64: replace kmalloc+memset with kzalloc in MTRR code
  [PATCH] x86-64: Fix numaq build error
  [PATCH] x86-64: include/asm-x86_64/cpufeature.h isn't a userspace header
  [PATCH] unwinder: Add debugging output to the Dwarf2 unwinder
  [PATCH] x86-64: Clarify error message in GART code
  [PATCH] x86-64: Fix interrupt race in idle callback (3rd try)
  [PATCH] x86-64: Remove unwind stack pointer alignment forcing again
  ...

Fixed conflict in include/linux/uaccess.h manually

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:59:11 -08:00
Magnus Damm 85916f8166 [PATCH] Kexec / Kdump: Unify elf note code
The elf note saving code is currently duplicated over several
architectures.  This cleanup patch simply adds code to a common file and
then replaces the arch-specific code with calls to the newly added code.

The only drawback with this approach is that s390 doesn't fully support
kexec-on-panic which for that arch leads to introduction of unused code.

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:46 -08:00
Adrian Bunk cd6ed52568 [PATCH] arch/i386/kernel/reboot.c should #include <linux/reboot.h>
Every file should #include the headers containing the prototypes for
its global functions.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:44 -08:00
Ingo Molnar 0231606785 [PATCH] hotplug CPU: clean up hotcpu_notifier() use
There was lots of #ifdef noise in the kernel due to hotcpu_notifier(fn,
prio) not correctly marking 'fn' as used in the !HOTPLUG_CPU case, and thus
generating compiler warnings of unused symbols, hence forcing people to add
#ifdefs.

the compiler can skip truly unused functions just fine:

    text    data     bss     dec     hex filename
 1624412  728710 3674856 6027978  5bfaca vmlinux.before
 1624412  728710 3674856 6027978  5bfaca vmlinux.after

[akpm@osdl.org: topology.c fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:39 -08:00
Andrew Morton a38a44c1a9 [PATCH] smp_call_function_single() check that local interrupts are enabled
smp_call_function_single() can deadlock if the caller disabled local
interrupts (the target CPU could be spinning on call_lock).  Check for that.

Why on earth do these functions use spin_lock_bh()??

Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Cc: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:39 -08:00
Masami Hiramatsu b4c6c34a53 [PATCH] kprobes: enable booster on the preemptible kernel
When we are unregistering a kprobe-booster, we can't release its
instruction buffer immediately on the preemptive kernel, because some
processes might be preempted on the buffer.  The freeze_processes() and
thaw_processes() functions can clean most of processes up from the buffer.
There are still some non-frozen threads who have the PF_NOFREEZE flag.  If
those threads are sleeping (not preempted) at the known place outside the
buffer, we can ensure safety of freeing.

However, the processing of this check routine takes a long time.  So, this
patch introduces the garbage collection mechanism of insn_slot.  It also
introduces the "dirty" flag to free_insn_slot because of efficiency.

The "clean" instruction slots (dirty flag is cleared) are released
immediately.  But the "dirty" slots which are used by boosted kprobes, are
marked as garbages.  collect_garbage_slots() will be invoked to release
"dirty" slots if there are more than INSNS_PER_PAGE garbage slots or if
there are no unused slots.

Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "bibo,mao" <bibo.mao@intel.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Yumiko Sugita <yumiko.sugita.yf@hitachi.com>
Cc: Satoshi Oshima <soshima@redhat.com>
Cc: Hideo Aoki <haoki@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:38 -08:00
Henry Nestler 19e5d9c0d2 [PATCH] initrd: remove unused false condition for initrd_start
After LOADER_TYPE && INITRD_START are true, the short if-condition
for INITRD_START can never be false.

Remove unused code from the else condition.

Signed-off-by: Henry Nestler <henry.ne@arcor.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:38 -08:00
Peter Zijlstra 6cfd76a26d [PATCH] lockdep: name some old style locks
Name some of the remaning 'old_style_spin_init' locks

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:36 -08:00
Nigel Cunningham 7dfb71030f [PATCH] Add include/linux/freezer.h and move definitions from sched.h
Move process freezing functions from include/linux/sched.h to freezer.h, so
that modifications to the freezer or the kernel configuration don't require
recompiling just about everything.

[akpm@osdl.org: fix ueagle driver]
Signed-off-by: Nigel Cunningham <nigel@suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:27 -08:00
Christoph Lameter e94b176609 [PATCH] slab: remove SLAB_KERNEL
SLAB_KERNEL is an alias of GFP_KERNEL.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:24 -08:00
Alan Stern a120586873 [PATCH] Allow NULL pointers in percpu_free
The patch (as824b) makes percpu_free() ignore NULL arguments, as one would
expect for a deallocation routine.  (Note that free_percpu is #defined as
percpu_free in include/linux/percpu.h.) A few callers are updated to remove
now-unneeded tests for NULL.  A few other callers already seem to assume
that passing a NULL pointer to percpu_free() is okay!

The patch also removes an unnecessary NULL check in percpu_depopulate().

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:22 -08:00
Adrian Bunk d9408cefe6 [PATCH] i386: Clean up smp_tune_scheduling()
- remove the write-only local variable "bandwidth"
- don't set "max_cache_size" in the (cachesize < 0) case:
  that's already handled in kernel/sched.c:measure_migration_cost()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:19 +01:00
Jan Beulich b65780e123 [PATCH] unwinder: move .eh_frame to RODATA
The .eh_frame section contents is never written to, so it can as well
benefit from CONFIG_DEBUG_RODATA.

Diff-ed against firstfloor tree.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:19 +01:00
Oleg Nesterov 6bedb2ccb0 [PATCH] x86-64: don't use set_irq_regs()
We don't need to setup _irq_regs in smp_xxx_interrupt (except apic timer).
These handlers run with irqs disabled and do not call functions which need
"struct pt_regs".

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Acked-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:19 +01:00
Burman Yan 116780fc04 [PATCH] i386: replace kmalloc+memset with kzalloc
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:19 +01:00
Adrian Bunk d7fb027128 [PATCH] x86-64: remove remaining pc98 code
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:19 +01:00
David Rientjes f475ff352c [PATCH] x86-64: remove unused variable
Remove unused variable in msr_write().

Reported by D Binderman <dcb314@hotmail.com>.

Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: David Rientjes <rientjes@cs.washington.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:13 +01:00
Burman Yan 9cfa5b5dfa [PATCH] x86-64: replace kmalloc+memset with kzalloc in MTRR code
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:13 +01:00
Jan Beulich 359ad0d401 [PATCH] unwinder: more sanity checks in Dwarf2 unwinder
Tighten the requirements on both input to and output from the Dwarf2
unwinder.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:13 +01:00
Dave Jones a36df98ab1 [PATCH] i386: touch softlockup during backtracing
Sometimes the soft watchdog fires after we're done oopsing.
See http://projects.info-pull.com/mokb/MOKB-25-11-2006.html for an example.

AK: changed to touch_nmi_watchdog()

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:12 +01:00
Chuck Ebbert 0741f4d207 [PATCH] x86: add sysctl for kstack_depth_to_print
Add sysctl for kstack_depth_to_print. This lets users change
the amount of raw stack data printed in dump_stack() without
having to reboot.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:11 +01:00
Dave Jones 6df0532eef [PATCH] i386: remove duplicate printk
We do the exact same printk about a dozen lines above
with no intermediate printk's.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:11 +01:00
Artiom Myaskouvskey bf7e6a1963 [PATCH] i386: Preserve EFI run time regions with memmap parameter
When using memmap kernel parameter in EFI boot we should also add to memory map
memory regions of runtime services to enable their mapping later.

AK: merged and cleaned up the patch

Signed-off-by: Artiom Myaskouvskey <artiom.myaskouvskey@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:11 +01:00
Karsten Wiese f990fff427 [PATCH] x86: Regard MSRs in lapic_suspend()/lapic_resume()
Read/Write APIC_LVTPC and APIC_LVTTHMR only,
if get_maxlvt() returns certain values.
This is done like everywhere else in i386/kernel/apic.c,
so I guess its correct.
Suspends/Resumes to disk fine and eleminates an smp_error_interrupt()
here on a K8.

AK: ported to x86-64 too

Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:11 +01:00
Stephane Eranian 538f188e03 [PATCH] i386: i386 add Intel BTS cpufeature bit and detection (take 2)
Here is a small patch for i386 which adds a cpufeature flag and
detection code for Intel's Branch Trace Store (BTS) feature. This
feature can be found on Intel P4 and Core 2 processors among others.
It can also be used by perfmon.

changelog:
	- add CPU_FEATURE_BTS
	- add Branch Trace Store detection

signed-off-by: stephane eranian <eranian@hpl.hp.com>

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:11 +01:00
Adrian Bunk 7e95b593a1 [PATCH] i386: Make irq_vector static
irq_vector[] can now become static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:11 +01:00
Adrian Bunk 956fb53197 [PATCH] i386: handle a negative return value
The Coverity checker noted that bad things might happen if
find_isa_irq_apic() returned -1.

[akpm@osdl.org: add debugging checks]
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:11 +01:00
Artiom Myaskouvskey e1cccf48b1 [PATCH] i386: call efi_get_time during suspend
Function efi_get_time called not only during init kernel phase but also
during suspend (from get_cmos_time).

When it is called from get_cmos_time the corresponding runtime service
should be called in virtual and not in physical mode.

Signed-off-by: Artiom Myaskouvskey <artiom.myaskouvskey@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: "Narayanan, Chandramouli" <chandramouli.narayanan@intel.com>
Cc: "Jiossy, Rami" <rami.jiossy@intel.com>
Cc: "Satt, Shai" <shai.satt@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:11 +01:00
Siddha, Suresh B b0d0a4ba45 [PATCH] x86: fix the irqbalance quirk for E7320/E7520/E7525
Move the irqbalance quirks for E7320/E7520/E7525(Errata 23 in
http://download.intel.com/design/chipsets/specupdt/30304203.pdf) to early
quirks.

And add a PCI quirk for these platforms to check(which happens very late
during the boot) if the APIC routing is indeed set to default flat mode.

This fixes the breakage(in x86_64) of this quirk due to cpu hotplug which
selects physical mode instead of the logical flat(as needed for this errata
workaround).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:10 +01:00
Siddha, Suresh B 72486f1f8f [PATCH] i386: change the 'no_control' field to 'hotpluggable' in the struct cpu
Change the 'no_control' field in the cpu struct to a more positive
and better term 'hotpluggable'. And change(/cleanup) the logic accordingly.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:10 +01:00
Siddha, Suresh B fd6d7d2689 [PATCH] i386: introduce the mechanism of disabling cpu hotplug control
Add 'enable_cpu_hotplug' flag and when cleared, the hotplug control file
("online") will not be added under /sys/devices/system/cpu/cpuX/

Next patch doing PCI quirks will use this.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:10 +01:00
Vivek Goyal 79929fd1c1 [PATCH] i386: Convert more absolute symbols to section relative
o Convert more absolute symbols to section relative to keep the theme in
  vmlinux.lds.S file and to avoid problem if kernel is relocated.

o Also put a message so that in future people can be aware of it and
  avoid introducing absolute symbols.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:10 +01:00
Adrian Bunk ba10650a88 [PATCH] i386: alloc_gdt() static
Make the needlessly global alloc_gdt() static.

(against) pda-percpu-init

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:10 +01:00
Jan Beulich 365bff806e [PATCH] i386: fix MTRR code
Until not so long ago, there were system log messages pointing to
inconsistent MTRR setup of the video frame buffer caused by the way vesafb
and X worked. While vesafb was fixed meanwhile, I believe fixing it there
only hides a shortcoming in the MTRR code itself, in that that code is not
symmetric with respect to the ordering of attempts to set up two (or more)
regions where one contains the other. In the current shape, it permits
only setting up sub-regions of pre-exisiting ones. The patch below makes
this symmetric.

While working on that I noticed a few more inconsistencies in that code,
namely
- use of 'unsigned int' for sizes in many, but not all places (the patch
  is converting this to use 'unsigned long' everywhere, which specifically
  might be necessary for x86-64 once a processor supporting more than 44
  physical address bits would become available)
- the code to correct inconsistent settings during secondary processor
  startup tried (if necessary) to correct, among other things, the value
  in IA32_MTRR_DEF_TYPE, however the newly computed value would never get
  used (i.e. stored in the respective MSR)
- the generic range validation code checked that the end of the
  to-be-added range would be above 1MB; the value checked should have been
  the start of the range
- when contained regions are detected, previously this was allowed only
  when the old region was uncacheable; this can be symmetric (i.e. the new
  region can also be uncacheable) and even further as per Intel's
  documentation write-trough and write-back for either region is also
  compatible with the respective opposite in the other

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:09 +01:00
Jan Beulich 475850c86b [PATCH] i386: conditionalize inclusion of some MTRR flavors
Avoid inclusion of code that's dead for x86-64.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:09 +01:00
Jan Beulich c6ea396de6 [PATCH] i386: Don't touch per cpu memory of offline CPUs in touch_nmi_watchdog
Just like on x86-64, don't touch foreign CPUs' memory if the watchdog
isn't enabled at all.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:09 +01:00
Zachary Amsden 8542b200cb [PATCH] paravirt: Add option to allow skipping the timer check
Add a way to disable the timer IRQ routing check via a boot option.  The
VMI timer code uses this to avoid triggering the pester Mingo code, which
probes for some very unusual and broken motherboard routings.  It fires
100% of the time when using a paravirtual delay mechanism instead of using
a realtime delay, since there is no elapsed real time, and the 4 timer IRQs
have not yet been delivered.

In addition, it is entirely possible, though improbable, that this bug
could surface on real hardware which picks a particularly bad time to enter
SMM mode, causing a long latency during one of the timer IRQs.

While here, make check_timer be __init.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
[chrisw: use no_timer_check to bring inline with x86_64 as per Andi's request]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:09 +01:00
Rusty Russell bd472c794b [PATCH] paravirt: Be careful about touching BIOS address space
BIOS ROM areas may not be mapped into the guest address space, so be careful
when touching those addresses to make sure they appear to be mapped.

[akpm@osdl.org: fix unused var warning]
AK: Changed __get_user to probe_kernel_address

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell da181a8b39 [PATCH] paravirt: Add MMU virtualization to paravirt_ops
Add the three bare TLB accessor functions to paravirt-ops.  Most amusingly,
flush_tlb is redefined on SMP, so I can't call the paravirt op flush_tlb.
Instead, I chose to indicate the actual flush type, kernel (global) vs. user
(non-global).  Global in this sense means using the global bit in the page
table entry, which makes TLB entries persistent across CR3 reloads, not
global as in the SMP sense of invoking remote shootdowns, so the term is
confusingly overloaded.

AK: folded in fix from Zach for PAE compilation

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell 13623d7930 [PATCH] paravirt: Add APIC accessors to paravirt-ops.
Add APIC accessors to paravirt-ops.  Unfortunately, we need two write
functions, as some older broken hardware requires workarounds for
Pentium APIC errata - this is the purpose of apic_write_atomic.

AK: replaced __inline with inline

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell 6020c8f315 [PATCH] paravirt: Allow disable power management under hypervisor
Two legacy power management modes are much easier to just explicitly disable
when running in paravirtualized mode - neither APM nor PnP is still relevant.
The status of ACPI is still debatable, and noacpi is still a common enough
boot parameter that it is not necessary to explicitly disable ACPI.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Andi Kleen 3bbf547254 [PATCH] paravirt: Disable vdso by default when CONFIG_PARAVIRT is enabled
They don't work together and this way even glibc still works.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:08 +01:00
Rusty Russell 4f205fd45a [PATCH] paravirt: Allow selected bug checks to be
Allow selected bug checks to be skipped by paravirt kernels.  The two most
important are the F00F workaround (which is either done by the hypervisor,
or not required), and the 'hlt' instruction check, which can break under
some hypervisors.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell c9ccf30d77 [PATCH] paravirt: Add startup infrastructure for paravirtualization
1) Each hypervisor writes a probe function to detect whether we are
   running under that hypervisor.  paravirt_probe() registers this
   function.

2) If vmlinux is booted with ring != 0, we call all the probe
   functions (with registers except %esp intact) in link order: the
   winner will not return.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell d7cd56111f [PATCH] i386: cpu_detect extraction
Both lhype and Xen want to call the core of the x86 cpu detect code before
calling start_kernel.

(extracted from larger patch)

AK: folded in start_kernel header patch

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell 139ec7c416 [PATCH] paravirt: Patch inline replacements for paravirt intercepts
It turns out that the most called ops, by several orders of magnitude,
are the interrupt manipulation ops.  These are obvious candidates for
patching, so mark them up and create infrastructure for it.

The method used is that the ops structure has a patch function, which
is called for each place which needs to be patched: this returns a
number of instructions (the rest are NOP-padded).

Usually we can spare a register (%eax) for the binary patched code to
use, but in a couple of critical places in entry.S we can't: we make
the clobbers explicit at the call site, and manually clobber the
allowed registers in debug mode as an extra check.

And:

Don't abuse CONFIG_DEBUG_KERNEL, add CONFIG_DEBUG_PARAVIRT.

And:

AK:  Fix warnings in x86-64 alternative.c build

And:

AK: Fix compilation with defconfig

And:

^From: Andrew Morton <akpm@osdl.org>

Some binutlises still like to emit references to __stop_parainstructions and
__start_parainstructions.

And:

AK: Fix warnings about unused variables when PARAVIRT is disabled.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:08 +01:00
Rusty Russell d3561b7fa0 [PATCH] paravirt: header and stubs for paravirtualisation
Create a paravirt.h header for all the critical operations which need to be
replaced with hypervisor calls, and include that instead of defining native
operations, when CONFIG_PARAVIRT.

This patch does the dumbest possible replacement of paravirtualized
instructions: calls through a "paravirt_ops" structure.  Currently these are
function implementations of native hardware: hypervisors will override the ops
structure with their own variants.

All the pv-ops functions are declared "fastcall" so that a specific
register-based ABI is used, to make inlining assember easier.

And:

+From: Andy Whitcroft <apw@shadowen.org>

The paravirt ops introduce a 'weak' attribute onto memory_setup().
Code ordering leads to the following warnings on x86:

    arch/i386/kernel/setup.c:651: warning: weak declaration of
                `memory_setup' after first use results in unspecified behavior

Move memory_setup() to avoid this.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
2006-12-07 02:14:07 +01:00
Nicolas Kaiser db91b882aa [PATCH] i386: Fix double #includes in arch/i386
Fix double #includes in arch/i386

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:07 +01:00
Chuck Ebbert 8c89812684 [PATCH] i386: remove IOPL check on task switch
IOPL is implicitly saved and restored on task switch,
so explicit check is no longer needed.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:07 +01:00
Andi Kleen d15512f442 [PATCH] i386: Fix race in IO-APIC routing entry setup.
Interrupt could happen between setting the IO-APIC entry
and setting its interrupt data.

Pointed out by Linus.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:07 +01:00
bibo,mao cef518e88b [PATCH] i386: Move memory map printing and other code to e820.c
This patch moves e820 memory map print and memmap boot param
parsing function from setup.c to e820.c, also adds limit_regions
and print_memory_map declaration in header file.

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

 arch/i386/kernel/e820.c  |  152 +++++++++++++++++++++++++++
 arch/i386/kernel/setup.c |  158 ---------------------------------
 include/asm-i386/e820.h  |    2
 arch/i386/kernel/e820.c  |  152 ++++++++++++++++++++++++++++++++++++++++++++++
 arch/i386/kernel/setup.c |  153 -----------------------------------------------
 include/asm-i386/e820.h  |    2
 3 files changed, 155 insertions(+), 152 deletions(-)
2006-12-07 02:14:06 +01:00
bibo,mao b5b2405706 [PATCH] i386: Move e820/efi memmap walking code to e820.c
This patch moves e820/efi memmap table walking function from
setup.c to e820.c, also this patch adds extern declaration in
header file.

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

 arch/i386/kernel/e820.c  |  115 +++++++++++++++++++++++++++++++++
 arch/i386/kernel/setup.c |  118 -----------------------------------
 include/asm-i386/e820.h  |    2
 arch/i386/kernel/e820.c  |  115 +++++++++++++++++++++++++++++++++++++++++++++
 arch/i386/kernel/setup.c |  118 -----------------------------------------------
 include/asm-i386/e820.h  |    2
 3 files changed, 117 insertions(+), 118 deletions(-)
2006-12-07 02:14:06 +01:00
bibo,mao b2dff6a88c [PATCH] i386: Move find_max_pfn function to e820.c
Move more code from setup.c into e820.c

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:06 +01:00
bibo,mao 8e3342f736 [PATCH] i386: create e820.c for e820 map sanitize and copy function
This patch moves bios e820 map sanitize and copy function from
setup.c to e820.c

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

 arch/i386/kernel/e820.c  |  252 +++++++++++++++++++++++++++++++++++++++++++++++
 arch/i386/kernel/setup.c |  240 --------------------------------------------
 2 files changed, 252 insertions(+), 240 deletions(-)
2006-12-07 02:14:06 +01:00
bibo,mao 269c2d81ed [PATCH] i386: i386 create e820.c to handle standard io/mem resources
This patch creates new file named e820.c to hanle standard io/mem
resources, moving request_standard_resources function from setup.c
to e820.c. Also this patch modifies Makfile to compile file e820.c.

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

 Makefile |    2
 arch/i386/kernel/Makefile |    2
 arch/i386/kernel/e820.c   |  289 ++++++++++++++++++++++++++++++++++++++++++++++
 arch/i386/kernel/setup.c  |  276 -------------------------------------------
 3 files changed, 293 insertions(+), 274 deletions(-)
2006-12-07 02:14:06 +01:00
Andi Kleen 11a4180c0b [PATCH] i386: Use probe_kernel_address instead of __get_user in fault paths
Makes the intention of the code cleaner to read and avoids
a potential deadlock on mmap_sem. Also change the types of
the arguments to not include __user because they're really
not user addresses.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:06 +01:00
Andi Kleen 770d132f03 [PATCH] i386: Retrieve CLFLUSH size from CPUID
Also report it in /proc/cpuinfo similar to x86-64.

Needed for followon patch

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:05 +01:00
Joe Korty 74b47a7844 [PATCH] i386: Fix entry.S code with !CONFIG_VM86
The entry.S code at work_notifysig is surely wrong.  It drops into unrelated
code if the branch to work_notifysig_v86 is taken, and CONFIG_VM86=n.

	[PATCH] Make vm86 support optional
	tree 9b5daef528
	pushed to git Jan 8, 2006, and first appears in 2.6.16

The 'fix' here is to also compile out the vm86 test & branch when
CONFIG_VM86=n.

Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:04 +01:00
Vivek Goyal e69f202d0a [PATCH] i386: Implement CONFIG_PHYSICAL_ALIGN
o Now CONFIG_PHYSICAL_START is being replaced with CONFIG_PHYSICAL_ALIGN.
  Hardcoding the kernel physical start value creates a problem in relocatable
  kernel context due to boot loader limitations. For ex, if somebody
  compiles a relocatable kernel to be run from address 4MB, but this kernel
  will run from location 1MB as grub loads the kernel at physical address
  1MB. Kernel thinks that I am a relocatable kernel and I should run from
  the address I have been loaded at. So somebody wanting to run kernel
  from 4MB alignment location (for improved performance regions) can't do
  that.

o Hence, Eric proposed that probably CONFIG_PHYSICAL_ALIGN will make
  more sense in relocatable kernel context. At run time kernel will move
  itself to a physical addr location which meets user specified alignment
  restrictions.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:04 +01:00
Eric W. Biederman 2a43f3ede4 [PATCH] i386: CONFIG_PHYSICAL_START cleanup
Defining __PHYSICAL_START and __KERNEL_START in asm-i386/page.h works but
it triggers a full kernel rebuild for the silliest of reasons.  This
modifies the users to directly use CONFIG_PHYSICAL_START and linux/config.h
which prevents the full rebuild problem, which makes the code much
more maintainer and hopefully user friendly.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:04 +01:00
Eric W. Biederman 8621b81c74 [PATCH] i386: Reserve kernel memory starting from _text
Currently when we are reserving the memory the kernel text
resides in we start at __PHYSICAL_START which happens to be
correct but not very obvious.  In addition when we start relocating
the kernel __PHYSICAL_START is the wrong value, as it is an
absolute symbol that does not get relocated.

By starting the reservation at __pa_symbol(_text)
the code is clearer and will be correct when relocated.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00
Vivek Goyal 6ed018845f [PATCH] i386: Add comment for align to vmlinux.lds
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00
Vivek Goyal 6569580de7 [PATCH] i386: Distinguish absolute symbols
Ld knows about 2 kinds of symbols,  absolute and section
relative.  Section relative symbols symbols change value
when a section is moved and absolute symbols do not.

Currently in the linker script we have several labels
marking the beginning and ending of sections that
are outside of sections, making them absolute symbols.
Having a mixture of absolute and section relative
symbols refereing to the same data is currently harmless
but it is confusing.

This must be done carefully as newer revs of ld do not place
symbols that appear in sections without data and instead
ld makes those symbols global :(

My ultimate goal is to build a relocatable kernel.  The
safest and least intrusive technique is to generate
relocation entries so the kernel can be relocated at load
time.  The only penalty would be an increase in the size
of the kernel binary.  The problem is that if absolute and
relocatable symbols are not properly specified absolute symbols
will be relocated or section relative symbols won't be, which
is fatal.

The practical motivation is that when generating kernels that
will run from a reserved area for analyzing what caused
a kernel panic, it is simpler if you don't need to hard code
the physical memory location they will run at, especially
for the distributions.

[AK: and merged:]

o Also put a message so that in future people can be aware of it and
  avoid introducing absolute symbols.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00
Andi Kleen 9c5f8be462 [PATCH] x86: Mention PCI instead of RAM in NMI parity error message
On modern systems RAM errors don't cause NMIs, but it's usually
caused by PCI SERR. Mention PCI instead of RAM in the printk.

Reported by r_hayashi@ctc-g.co.jp (Ryutaro Hayashi)

Cc:  r_hayashi@ctc-g.co.jp

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00
Andi Kleen 72690a2118 [PATCH] x86: Don't use nested idle loops
Currently the idle loop has two nested loops -- one high level
in cpu_idle and in some low level idle functions another one.

Looping in the low level idle functions breaks the idle notifiers
because interrupts waking up sleep states need to execute
exit_idle() which is only in cpu_idle().

So don't do that, only loop in cpu_idle(). This only removes
code.

In some cases e.g. poll_idle the idle loop is a little longer
now because cpu_idle checks more things. I hope that isn't a problem
ACPI idle doesn't change behaviour because it never looped anyways.

Cc: len.brown@intel.com
Cc: eranian@hpl.hp.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00
Jeremy Fitzhardinge ec7fcaabbf [PATCH] i386: Implement "current" with the PDA
Use the pcurrent field in the PDA to implement the "current" macro.  This ends
up compiling down to a single instruction to get the current task.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:03 +01:00
Jeremy Fitzhardinge b2938f8808 [PATCH] i386: Implement smp_processor_id() with the PDA
Use the cpu_number in the PDA to implement raw_smp_processor_id.  This is a
little simpler than using thread_info, though the cpu field in thread_info
cannot be removed since it is used for things other than getting the current
CPU in common code.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:03 +01:00
Jeremy Fitzhardinge 49d26b6eaa [PATCH] i386: Update sys_vm86 to cope with changed pt_regs and %gs usage
sys_vm86 uses a struct kernel_vm86_regs, which is identical to pt_regs, but
adds an extra space for all the segment registers.  Previously this structure
was completely independent, so changes in pt_regs had to be reflected in
kernel_vm86_regs.  This changes just embeds pt_regs in kernel_vm86_regs, and
makes the appropriate changes to vm86.c to deal with the new naming.

Also, since %gs is dealt with differently in the kernel, this change adjusts
vm86.c to reflect this.

While making these changes, I also cleaned up some frankly bizarre code which
was added when auditing was added to sys_vm86.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:03 +01:00
Jeremy Fitzhardinge 66e10a44d7 [PATCH] i386: Fix places where using %gs changes the usermode ABI
There are a few places where the change in struct pt_regs and the use of %gs
affect the userspace ABI.  These are primarily debugging interfaces where
thread state can be inspected or extracted.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:02 +01:00
Jeremy Fitzhardinge f95d47caae [PATCH] i386: Use %gs as the PDA base-segment in the kernel
This patch is the meat of the PDA change.  This patch makes several related
changes:

1: Most significantly, %gs is now used in the kernel.  This means that on
   entry, the old value of %gs is saved away, and it is reloaded with
   __KERNEL_PDA.

2: entry.S constructs the stack in the shape of struct pt_regs, and this
   is passed around the kernel so that the process's saved register
   state can be accessed.

   Unfortunately struct pt_regs doesn't currently have space for %gs
   (or %fs). This patch extends pt_regs to add space for gs (no space
   is allocated for %fs, since it won't be used, and it would just
   complicate the code in entry.S to work around the space).

3: Because %gs is now saved on the stack like %ds, %es and the integer
   registers, there are a number of places where it no longer needs to
   be handled specially; namely context switch, and saving/restoring the
   register state in a signal context.

4: And since kernel threads run in kernel space and call normal kernel
   code, they need to be created with their %gs == __KERNEL_PDA.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:02 +01:00
Jeremy Fitzhardinge 6211119580 [PATCH] i386: Initialize the per-CPU data area
When a CPU is brought up, a PDA and GDT are allocated for it.  The GDT's
__KERNEL_PDA entry is pointed to the allocated PDA memory, so that all
references using this segment descriptor will refer to the PDA.

This patch rearranges CPU initialization a bit, so that the GDT/PDA are set up
as early as possible in cpu_init().  Also for secondary CPUs, GDT+PDA are
preallocated and initialized so all the secondary CPU needs to do is set up
the ldt and load %gs.  This will be important once smp_processor_id() and
current use the PDA.

In all cases, the PDA is set up in head.S, before a CPU starts running C code,
so the PDA is always available.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:02 +01:00
Jeremy Fitzhardinge 9ca36101a8 [PATCH] i386: Basic definitions for i386-pda
This patch has the basic definitions of struct i386_pda, and the segment
selector in the GDT.

asm-i386/pda.h is more or less a direct copy of asm-x86_64/pda.h.  The most
interesting difference is the use of _proxy_pda, which is used to give gcc a
model for the actual memory operations on the real pda structure.  No actual
reference is ever made to _proxy_pda, so it is never defined.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:02 +01:00
Jeremy Fitzhardinge eb5b7b9d86 [PATCH] i386: Use asm-offsets for the offsets of registers into the pt_regs struct
Use asm-offsets for the offsets of registers into the pt_regs struct, rather
than having hard-coded constants

I left the constants in the comments of entry.S because they're useful for
reference; the code in entry.S is very dependent on the layout of pt_regs,
even when using asm-offsets.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:02 +01:00
Amol Lad fa5cecd111 [PATCH] i386: add missing iounmap in i386 hpet clocksource code
ioremap must be balanced by an iounmap and failing to do so can result
in a memory leak.

Tested (compilation only):
- using allmodconfig
- making sure the files are compiling without any warning/error due to
new changes

Signed-off-by: Amol Lad <amol@verismonetworks.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:02 +01:00
Amol Lad c0e84b9901 [PATCH] i386: Add iounmap in error paths in hpet code
Signed-off-by: Amol Lad <amol@verismonetworks.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:02 +01:00
Chuck Ebbert acc207616a [PATCH] i386: add sleazy FPU optimization
i386 port of the sLeAZY-fpu feature.  Chuck reports that this gives him a +/-
0.4% improvement on his simple benchmark

x86_64 description follows:

Right now the kernel on x86-64 has a 100% lazy fpu behavior: after *every*
context switch a trap is taken for the first FPU use to restore the FPU
context lazily.  This is of course great for applications that have very
sporadic or no FPU use (since then you avoid doing the expensive save/restore
all the time).  However for very frequent FPU users...  you take an extra trap
every context switch.

The patch below adds a simple heuristic to this code: After 5 consecutive
context switches of FPU use, the lazy behavior is disabled and the context
gets restored every context switch.  If the app indeed uses the FPU, the trap
is avoided.  (the chance of the 6th time slice using FPU after the previous 5
having done so are quite high obviously).

After 256 switches, this is reset and lazy behavior is returned (until there
are 5 consecutive ones again).  The reason for this is to give apps that do
longer bursts of FPU use still the lazy behavior back after some time.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:01 +01:00
Stas Sergeev be44d2aabc [PATCH] i386: espfix cleanup
Clean up the espfix code:

- Introduced PER_CPU() macro to be used from asm
- Introduced GET_DESC_BASE() macro to be used from asm
- Rewrote the fixup code in asm, as calling a C code with the altered %ss
  appeared to be unsafe
- No longer altering the stack from a .fixup section
- 16bit per-cpu stack is no longer used, instead the stack segment base
  is patched the way so that the high word of the kernel and user %esp
  are the same.
- Added the limit-patching for the espfix segment. (Chuck Ebbert)

[jeremy@goop.org: use the x86 scaling addressing mode rather than shifting]
Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Zachary Amsden <zach@vmware.com>
Acked-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:01 +01:00
Andrew Morton bb81a09e55 [PATCH] x86: all cpu backtrace
When a spinlock lockup occurs, arrange for the NMI code to emit an all-cpu
backtrace, so we get to see which CPU is holding the lock, and where.

Cc: Andi Kleen <ak@muc.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:01 +01:00
Jeremy Fitzhardinge e5e3a04289 [PATCH] i386: remove default_ldt, and simplify ldt-setting.
This patch removes the default_ldt[] array, as it has been unused since
iBCS stopped being supported.  This means it is now possible to actually
set an empty LDT segment.

In order to deal with this, the set_ldt_desc/load_LDT pair has been
replaced with a single set_ldt() operation which is responsible for both
setting up the LDT descriptor in the GDT, and reloading the LDT register.
If there are no LDT entries, the LDT register is loaded with a NULL
descriptor.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Acked-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:01 +01:00
Stephane Eranian 42ed458aa5 [PATCH] i386: i386 add X86_FEATURE_PEBS and detection
Here is a patch (used by perfmon2) to detect the presence of the Precise Event
Based Sampling (PEBS) feature for i386.  The patch also adds the cpu_has_pebs
macro.

- adds X86_FEATURE_PEBS

- adds cpu_has_pebs to test for X86_FEATURE_PEBS

Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:01 +01:00
Dave Jones a63954b5ca [PATCH] i386: remove pointless printk from i386 oops output
This just got removed on x86-64, do the same on 32bit.
It always annoyed me when this ate a line of oops output pushing
interesting stuff off the screen.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:00 +01:00
Andi Kleen dd315df176 [PATCH] x86: Compress stack unwinder output
The unwinder has some extra newlines, which eat up loads of screen
space when it spews. (See https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=137900
for a nasty example).

warning_symbol-> and warning-> already printk a newline, so don't add one
in the strings passed to them.

[AK: redone for new code]

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:00 +01:00
Andi Kleen b615ebdac9 [PATCH] x86: shorten lines in unwinder to be <= 80 characters
Andrew complained about > 80 character lines in the new unwinder.
Fix that.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:00 +01:00
Andreas Mohr 9b48341752 [PATCH] i386: fix buggy MTRR address checks
Fix checks that failed to realize that values are 4-kB-unit-sized (note the
format strings in this same diff context which *do* realize the unit size,
via appended "000"!).  Also fix an incorrect below-1MB area check (as
gathered from Jan Beulich's unapplied patch at
http://www.ussg.iu.edu/hypermail/linux/kernel/0411.1/1378.html ) Update
mtrr_add_page() docu to make 4-kB-sized calculation more obvious.

Given several further items mentioned in Jan's patch mail, all in all MTRR
code seems surprisingly buggy, for a surprisingly long period of time (many
years).  Further work/investigation would be useful.

TBD Note that my patch is pretty much UNTESTED, since I can only verify that it
TBD successfully boots my machine, but I cannot test against actual buggy
TBD hardware which would require these (formerly broken) checks.  Long -mm
TBD simmering would make sense, especially since these now-working checks might
TBD turn out to have adverse effects on unaffected hardware.

Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:00 +01:00
David Howells 9db7372445 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/ata/libata-scsi.c
	include/linux/libata.h

Futher merge of Linus's head and compilation fixups.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-05 17:01:28 +00:00
David Howells 4c1ac1b491 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/infiniband/core/iwcm.c
	drivers/net/chelsio/cxgb2.c
	drivers/net/wireless/bcm43xx/bcm43xx_main.c
	drivers/net/wireless/prism54/islpci_eth.c
	drivers/usb/core/hub.h
	drivers/usb/input/hid-core.c
	net/core/netpoll.c

Fix up merge failures with Linus's head and fix new compilation failures.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-05 14:37:56 +00:00
Al Viro 914e26379d [PATCH] severing fs.h, radix-tree.h -> sched.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-12-04 02:00:24 -05:00
Al Viro f6a570333e [PATCH] severing module.h->sched.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-12-04 02:00:22 -05:00
Linus Torvalds 37043318b1 Merge branch 'release' of master.kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of master.kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  Revert "ACPI: SCI interrupt source override"
2006-12-02 08:28:28 -08:00
Len Brown 7bdd21cef9 Revert "ACPI: SCI interrupt source override"
This reverts commit 281ea49b0c,
which broke ACPI Interrupt source overrides that move
the SCI from one IRQ in PIC mode to another in IOAPIC mode.

If the SCI shared an interrupt line with another device,
this would result in a "irq 18: nobody cared" type failure.

http://bugzilla.kernel.org/show_bug.cgi?id=7601

Signed-off-by: Len Brown <len.brown@intel.com>
2006-12-02 02:27:46 -05:00
Linus Torvalds 72a73a69f6 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (28 commits)
  PCI: make arch/i386/pci/common.c:pci_bf_sort static
  PCI: ibmphp_pci.c: fix NULL dereference
  pciehp: remove unnecessary pci_disable_msi
  pciehp: remove unnecessary free_irq
  PCI: rpaphp: change device tree examination
  PCI: Change memory allocation for acpiphp slots
  i2c-i801: SMBus patch for Intel ICH9
  PCI: irq: irq and pci_ids patch for Intel ICH9
  PCI: pci_{enable,disable}_device() nestable ports
  PCI: switch pci_{enable,disable}_device() to be nestable
  PCI: arch/i386/kernel/pci-dma.c: ioremap balanced with iounmap
  pci/i386: style cleanups
  PCI: Block on access to temporarily unavailable pci device
  pci: fix __pci_register_driver error handling
  pci: clear osc support flags if no _OSC method
  acpiphp: fix missing acpiphp_glue_exit()
  acpiphp: fix use of list_for_each macro
  Altix: Initial ACPI support - ROM shadowing.
  Altix: SN ACPI hotplug support.
  Altix: Add initial ACPI IO support
  ...
2006-12-01 16:41:27 -08:00
Greg Kroah-Hartman 07accdc18e Driver core: convert cpuid code to use struct device
Converts from using struct "class_device" to "struct device" making
everything show up properly in /sys/devices/ with symlinks from the
/sys/class directory.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-12-01 14:52:00 -08:00
Greg Kroah-Hartman a271aaf15f Driver core: convert msr code to use struct device
Converts from using struct "class_device" to "struct device" making
everything show up properly in /sys/devices/ with symlinks from the
/sys/class directory.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-12-01 14:52:00 -08:00
Amol Lad 039d09a845 PCI: arch/i386/kernel/pci-dma.c: ioremap balanced with iounmap
ioremap must be balanced by an iounmap and failing to do so can result
in a memory leak.

Tested (compilation only):
- using allmodconfig
- making sure the files are compiling without any warning/error due to
new changes

Signed-off-by: Amol Lad <amol@verismonetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-12-01 14:36:59 -08:00
David Howells c4028958b6 WorkStruct: make allyesconfig
Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-11-22 14:57:56 +00:00
Linus Torvalds 808dbbb6bb x86: be more careful when walking back the frame pointer chain
When showing the stack backtrace, make sure that we never accept not
only an unchanging frame pointer, but also a frame pointer that moves
back down the stack frame.  It must always grow up (toward older stack
frames).

I doubt this has triggered, but a subtly corrupt stack with extremely
unlucky contents could cause us to loop forever on a bogus endless frame
pointer chain.

This review was triggered by much worse problems happening in some of
the other stack unwinding code.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-17 11:14:56 -08:00
Ingo Molnar dc1829a4c3 [PATCH] i386/x86_64: ACPI cpu_idle_wait() fix
The scheduler on Andreas Friedrich's hyperthreading system stopped
working properly: the scheduler would never move tasks to another CPU!
The lask known working kernel was 2.6.8.

After a couple of attempts to corner the bug, the following smoking gun
was found:

  BIOS reported wrong ACPI idfor the processor
  CPU#1: set_cpus_allowed(), swapper:1, 3 -> 2
   [<c0103bbe>] show_trace_log_lvl+0x34/0x4a
   [<c0103ceb>] show_trace+0x2c/0x2e
   [<c01045f8>] dump_stack+0x2b/0x2d
   [<c0116a77>] set_cpus_allowed+0x52/0xec
   [<c0101d86>] cpu_idle_wait+0x2e/0x100
   [<c0259c57>] acpi_processor_power_exit+0x45/0x58
   [<c0259752>] acpi_processor_remove+0x46/0xea
   [<c025c6fb>] acpi_start_single_object+0x47/0x54
   [<c025cee5>] acpi_bus_register_driver+0xa4/0xd3
   [<c04ab2d7>] acpi_processor_init+0x57/0x77
   [<c01004d7>] init+0x146/0x2fd
   [<c0103a87>] kernel_thread_helper+0x7/0x10

a quick look at cpu_idle_wait() shows how broken that code is
on i386: it changes the init task's affinity map but never
restores it ...

and because all userspace tasks get forked by init, they all
inherited that single-CPU affinity mask. x86_64 cloned this
bug too.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andreas Friedrich <andreas.friedrich@fujitsu-siemens.com>
Cc: Wolfgang Erig <Wolfgang.Erig@fujitsu-siemens.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-17 08:20:09 -08:00
Eric W. Biederman 45c9953325 [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
Komuro reports that ISA interrupts do not work after a disable_irq(),
causing some PCMCIA drivers to not work, with messages like

	eth0: Asix AX88190: io 0x300, irq 3, hw_addr xx:xx:xx:xx:xx:xx
	eth0: found link beat
	eth0: autonegotiation complete: 100baseT-FD selected
	eth0: interrupt(s) dropped!
	eth0: interrupt(s) dropped!
	eth0: interrupt(s) dropped!
	...

Linus Torvalds <torvalds@osdl.org> said:

  "Now, edge-triggered interrupts are a _lot_ harder to mask, because the
   Intel APIC is an unbelievable piece of sh*t, and has the edge-detect logic
   _before_ the mask logic, so if a edge happens _while_ the device is
   masked, you'll never ever see the edge ever again (unmasking will not
   cause a new edge, so you simply lost the interrupt).

   So when you "mask" an edge-triggered IRQ, you can't really mask it at all,
   because if you did that, you'd lose it forever if the IRQ comes in while
   you masked it. Instead, we're supposed to leave it active, and set a flag,
   and IF the IRQ comes in, we just remember it, and mask it at that point
   instead, and then on unmasking, we have to replay it by sending a
   self-IPI."

This trivial patch solves the problem.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@redhat.com>
Acked-by: Komuro <komurojun-mbn@nifty.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-15 09:04:32 -08:00
Andi Kleen fa18f477d0 [PATCH] x86: Add acpi_user_timer_override option for Asus boards
Timer overrides are normally disabled on Nvidia board because
they are commonly wrong, except on new ones with HPET support.
Unfortunately there are quite some Asus boards around that
don't have HPET, but need a timer override.

We don't know yet how to handle this transparently,
but at least add a command line option to force the timer override
and let them boot.

Cc: len.brown@intel.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-11-14 16:57:46 +01:00
Eric W. Biederman ec68307cc5 [PATCH] htirq: refactor so we only have one function that writes to the chip
This refactoring actually optimizes the code a little by caching the value
that we think the device is programmed with instead of reading it back from
the hardware.  Which simplifies the code a little and should speed things up a
bit.

This patch introduces the concept of a ht_irq_msg and modifies the
architecture read/write routines to update this code.

There is a minor consistency fix here as well as x86_64 forgot to initialize
the htirq as masked.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Acked-by: Bryan O'Sullivan <bos@pathscale.com>
Cc: <olson@pathscale.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-08 18:29:24 -08:00
Masami Hiramatsu 8bdc052ecc [PATCH] kretprobe: fix kretprobe-booster to save regs and set status
There are two bugs in the kretprobe-booster.

1) It doesn't make room for gs registers.

2) It doesn't change status of the current kprobe.  This status will
   effect the fault handling.

This patch fixes these bugs and, additionally, saves skipped registers for
compatibility with the original kretprobe.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-08 18:29:24 -08:00
Vivek Goyal c06cb8b1c4 [PATCH] i386: Force data segment to be 4K aligned
o Currently there is no specific alignment restriction in linker script
  and in some cases it can be placed non 4K aligned addresses. This fails
  kexec which checks that segment to be loaded is page aligned.

o I guess, it does not harm data segment to be 4K aligned.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-08 18:29:23 -08:00
Arjan van de Ven d654c673d6 [PATCH] Regression in 2.6.19-rc microcode driver
If the microcode driver is built in (rather than module) there are some,
ehm, interesting effects happening due to the new "call out to userspace"
behavior that is introduced..  and which runs too early.  The result is a
boot hang; which is really nasty.

The patch below is a minimally safe patch to fix this regression for 2.6.19
by just not requesting actual microcode updates during early boot.  (That
is a good idea in general anyway)

The "real" fix is a lot more complex given the entire cpu hotplug scenario
(during cpu hotplug you normally need to load the microcode as well); but
the interactions for that are just really messy at this point; this fix at
least makes it work and avoids a full detangle of hotplug.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-08 18:29:22 -08:00
Dave Jones 474a14df3f Revert "[CPUFREQ] speedstep-centrino should ignore upper performance control bits"
This reverts commit d7a1944e8d.
2006-11-08 19:22:45 -05:00
Alexey Dobriyan 6fdc2d0750 [CPUFREQ] gx-suspmod: fix "&& 0xff" typo
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-11-08 17:14:31 -05:00
akpm@osdl.org caede347c3 [CPUFREQ] Fix build failure on x86-64
arch/x86_64/kernel/cpufreq/../../../i386/kernel/cpu/cpufreq/speedstep-lib.c:131: error: 'MSR_FSB_FREQ' undeclared (first use in this function)

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-11-08 17:14:31 -05:00
Gary Hade d7a1944e8d [CPUFREQ] speedstep-centrino should ignore upper performance control bits
On some systems such as the IBM x3650 there are bits set in the
upper half of the control values provided by the _PSS object.
These bits are only relevant for cpufreq drivers that use IO ports
which are not currently supported by the speedstep-centrino driver.
The current MSR oriented code assumes that upper bits are not set
and thus fails to work correctly when they are.  e.g. the control
and status value equality check fails even though the ACPI spec
allows the inequality.

Signed-off-by: Gary Hade <garyh@us.ibm.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-11-08 17:14:30 -05:00
Dominik Brodowski 4e74663c5d [CPUFREQ] p4-clockmod: add more CPUs
Several more Intel CPUs are now capable using the p4-clockmod cpufreq
driver. As it is of limited use most of the time, print a big bold warning
if a better cpufreq driver might be available.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-11-06 19:17:40 -05:00
Andrew Morton 90d5390944 [PATCH] acpi_noirq section fix
WARNING: vmlinux - Section mismatch: reference to .init.data:acpi_noirq from .text between 'pcibios_penalize_isa_irq' (at offset 0xc026ffa1) and 'pirq_serverworks_get'

Acked-by: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-03 12:27:58 -08:00
Linus Torvalds f9dadfa71b i386: write IO APIC irq routing entries in correct order
Since the "mask" bit is in the low word, when we write a new entry, we
need to write the high word first, before we potentially unmask it.

The exception is when we actually want to mask the interrupt, in which
case we want to write the low word first to make sure that the high word
doesn't change while the interrupt routing is still active.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-01 10:06:52 -08:00