Commit Graph

22779 Commits

Author SHA1 Message Date
Haavard Skinnemoen b13d618b44 avr32: Clean up and optimize the TLB operations
This and the following patches aim to optimize the code dealing with
page tables and TLB operations. Each patch reduces the time it takes
to gzip a 16 MB file slightly, but I expect things like fork() and
mmap() will improve somewhat more.

This patch deals with the low-level TLB operations:

  * Remove unused _TLBEHI_I define
  * Use gcc builtins instead of inline assembly
  * Remove a few unnecessary pipeline flushes and nops
  * Introduce NR_TLB_ENTRIES define and use it instead of hardcoding it
    to 32 a few places throughout the code.
  * Use sysreg bitops instead of hardcoded shifts and masks
  * Make a few needlessly global functions static

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-07-02 11:01:28 +02:00
Haavard Skinnemoen d7ff2a4a28 avr32: Rename at32ap.c -> pdc.c
The only thing left in at32ap.c is some PDC stuff. Rename the file to
reflect what it actually does.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2008-06-28 15:08:48 +02:00
Haavard Skinnemoen 65033ed740 avr32: Move setup_platform() into chip-specific file
Combine at32_clock_init() and at32_portmux_init() into
setup_platform() and remove setup_platform() from at32ap.c. No
functional change since all setup_platform() ever did was call those
two functions.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2008-06-28 15:08:39 +02:00
Haavard Skinnemoen d704fb0cc0 avr32: Kill special exception handler sections
Kill the special exception handler sections .tlbx.ex.text,
.tlbr.ex.text, tlbw.ex.text and .scall.text. Use .org instead to place
the handlers at the required offsets from EVBA.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 17:48:06 +02:00
Haavard Skinnemoen 9dbef285a3 avr32: Clean up time.c #includes
Remove lots of unneeded #includes, add #include <linux/kernel.h> and
sort alphabetically.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 17:48:04 +02:00
Sedji Gaouaou 8405996ff6 atmel_pwm: Rename the "mck" clock to "pwm_clk"
The name "mck" causes a conflict on AT91. Call it "pwm_clk" instead.

Signed-off-by: Sedji Gaouaou <sedji.gaouaou@atmel.com>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 15:32:30 +02:00
David Brownell 9c2baf785e at32ap700x spi: enable pullups on MISO
This is a minor tweak to the at32ap700x pin configuration for the SPI
input pin (MISO), enabling the on-chip weak pullup (typical 190K) to

  (a) ensure a fixed data value for missing or input-only slaves;

  (b) prevent power waste associated with inputs floating near VDDIO/2.

Atmel's boards have no external pullup or pulldown on these pins, so
it's unlikely other boards would address these issues with external
pulldowns.  Were there trouble, board-specific code could turn off
the relevant pullup(s).

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 15:32:29 +02:00
David Brownell 7ef31e9c4e avr32: improve NGW100 I2C/PMBus setup
Basic I2C initialization for the NGW100 board:

  - Provide empty i2c device table. Daughtercards may add devices,
    and the ATtiny24 could do stuff too.

  - Set up EXTINT(3) so the ATtiny24 can interrupt the AP7000.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 15:32:29 +02:00
Hans-Christian Egtvedt d86d314f67 avr32: Add PSIF platform devices
This patch adds the PS/2 interface (PSIF) to the device code, split into
two platform devices, one for each port.

The function for adding the PSIF platform device is also added to the
board header file.

Signed-off-by: Hans-Christian Egtvedt <hcegtvedt@atmel.com>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 15:32:28 +02:00
Hans-Christian Egtvedt 47882cf620 avr32: Add pin configuration choice to LCDC peripheral
This patch lets the board code choose which pin out to use for the LCD
interface.

On AT32AP7000 the LCDC is wired to two sets of pins, which lets the user
choose between dual ethernet and 32-bit EBI. For the ATNGW100 board it
is vital to have the choice to select the alternative pinout since this
pinout is routed to the external headers.

Update ATSTK1002 and ATSTK1004 to use the new interface.

Signed-off-by: Hans-Christian Egtvedt <hcegtvedt@atmel.com>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 15:32:27 +02:00
David Brownell aafafddb01 avr32: minor GPIO handling updates
On the odd chance some code uses a pin as a GPIO IRQ without calling
gpio_request() or gpio_direction_input(), the debug dump should still
show its pin status.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 15:32:27 +02:00
Haavard Skinnemoen c1f24ac99f avr32: Fix wrong I/O access size in __raw_readsb
__raw_readsb() should always use byte accesses, never halfword accesses,
to I/O memory.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 15:07:50 +02:00
Martin Koegler 7c1b90a1e9 avr32: Fix sigaltstack behaviour
A signal handler should be able to change the signal stack used for the
next signal by altering the ucontext_t passed as a parameter to the
handler. This does not currently work on avr32 since it doesn't update
the in-kernel signal context from the ucontext_t upon signal handler
return.

Fix it by adding a call to do_sigaltstack() from sys_rt_sigreturn(),
bringing it in line with most other architectures.

Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
[haavard.skinnemoen@atmel.com: changed patch description]
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 15:07:35 +02:00
Alex 60ed7951d0 avr32: Allow board to define oscillator rates
On our custom board we have other oscillator rates than on atngw100 and
atstk100x.

Currently these rates are hardcoded in arch/avr32/mach-at32ap/at32ap700x.c.

This patch moves them into board specific code.

Signed-off-by: Alex Raimondi <raimondi@miromico.ch>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 15:07:16 +02:00
Haavard Skinnemoen 8bd8974fcd avr32: export empty_zero_page
Fixes one of two ext4 build problems:
ERROR: "empty_zero_page" [fs/ext4/ext4dev.ko] undefined!

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-06-27 15:07:07 +02:00
Linus Torvalds bd8c540fe8 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Eliminate NULL test after alloc_bootmem in iosapic_alloc_rte()
  [IA64] Handle count==0 in sn2_ptc_proc_write()
  [IA64] Fix boot failure on ia64/sn2
2008-06-24 18:12:33 -07:00
Linus Torvalds 919c0d14ae Merge branch 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: Remove now unused structs from kvm_para.h
  x86: KVM guest: Use the paravirt clocksource structs and functions
  KVM: Make kvm host use the paravirt clocksource structs
  x86: Make xen use the paravirt clocksource structs and functions
  x86: Add structs and functions for paravirt clocksource
  KVM: VMX: Fix host msr corruption with preemption enabled
  KVM: ioapic: fix lost interrupt when changing a device's irq
  KVM: MMU: Fix oops on guest userspace access to guest pagetable
  KVM: MMU: large page update_pte issue with non-PAE 32-bit guests (resend)
  KVM: MMU: Fix rmap_write_protect() hugepage iteration bug
  KVM: close timer injection race window in __vcpu_run
  KVM: Fix race between timer migration and vcpu migration
2008-06-24 18:09:06 -07:00
Linus Torvalds 9bf8a943ad Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  xen: remove support for non-PAE 32-bit
2008-06-24 11:21:47 -07:00
Gerd Hoffmann f6e16d5ad4 x86: KVM guest: Use the paravirt clocksource structs and functions
This patch updates the kvm host code to use the pvclock structs
and functions, thereby making it compatible with Xen.

The patch also fixes an initialization bug: on SMP systems the
per-cpu has two different locations early at boot and after CPU
bringup.  kvmclock must take that in account when registering the
physical address within the host.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24 21:02:33 +03:00
Gerd Hoffmann 50d0a0f987 KVM: Make kvm host use the paravirt clocksource structs
This patch updates the kvm host code to use the pvclock structs.
It also makes the paravirt clock compatible with Xen.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24 21:02:32 +03:00
Gerd Hoffmann 1c7b67f757 x86: Make xen use the paravirt clocksource structs and functions
This patch updates the xen guest to use the pvclock structs
and helper functions.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24 21:02:32 +03:00
Gerd Hoffmann 7af192c954 x86: Add structs and functions for paravirt clocksource
This patch adds structs for the paravirt clocksource ABI
used by both xen and kvm (pvclock-abi.h).

It also adds some helper functions to read system time and
wall clock time from a paravirtual clocksource (pvclock.[ch]).
They are based on the xen code.  They are enabled using
CONFIG_PARAVIRT_CLOCK.

Subsequent patches of this series will put the code in use.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24 21:02:31 +03:00
Julia Lawall e2569b7e57 [IA64] Eliminate NULL test after alloc_bootmem in iosapic_alloc_rte()
As noted by Akinobu Mita alloc_bootmem and related functions never return
NULL and always return a zeroed region of memory.  Thus a NULL test or
memset after calls to these functions is unnecessary.

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-06-24 10:28:55 -07:00
Cliff Wickman 8097110d17 [IA64] Handle count==0 in sn2_ptc_proc_write()
The fix applied in e0c6d97c65
"security hole in sn2_ptc_proc_write" didn't take into account
the case where count==0 (which results in a buffer underrun
when adding the trailing '\0').  Thanks to Andi Kleen for
pointing this out.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-06-24 10:20:06 -07:00
Jes Sorensen 2826f8c0f4 [IA64] Fix boot failure on ia64/sn2
Call check_sal_cache_flush() after platform_setup() as
check_sal_cache_flush() now relies on being able to call platform
vector code.

Problem was introduced by: 3463a93def
"Update check_sal_cache_flush to use platform_send_ipi()"

Signed-off-by: Jes Sorensen <jes@sgi.com>
Tested-by: Alex Chiang: <achiang@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-06-24 10:16:27 -07:00
Jeremy Fitzhardinge 2849914393 xen: remove support for non-PAE 32-bit
Non-PAE operation has been deprecated in Xen for a while, and is
rarely tested or used.  xen-unstable has now officially dropped
non-PAE support.  Since Xen/pvops' non-PAE support has also been
broken for a while, we may as well completely drop it altogether.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-24 17:00:55 +02:00
Avi Kivity a9b21b6229 KVM: VMX: Fix host msr corruption with preemption enabled
Switching msrs can occur either synchronously as a result of calls to
the msr management functions (usually in response to the guest touching
virtualized msrs), or asynchronously when preempting a kvm thread that has
guest state loaded.  If we're unlucky enough to have the two at the same
time, host msrs are corrupted and the machine goes kaput on the next syscall.

Most easily triggered by Windows Server 2008, as it does a lot of msr
switching during bootup.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24 12:26:17 +03:00
Avi Kivity 6bf6a9532f KVM: MMU: Fix oops on guest userspace access to guest pagetable
KVM has a heuristic to unshadow guest pagetables when userspace accesses
them, on the assumption that most guests do not allow userspace to access
pagetables directly. Unfortunately, in addition to unshadowing the pagetables,
it also oopses.

This never triggers on ordinary guests since sane OSes will clear the
pagetables before assigning them to userspace, which will trigger the flood
heuristic, unshadowing the pagetables before the first userspace access. One
particular guest, though (Xenner) will run the kernel in userspace, triggering
the oops.  Since the heuristic is incorrect in this case, we can simply
remove it.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24 12:20:12 +03:00
Marcelo Tosatti 3094538739 KVM: MMU: large page update_pte issue with non-PAE 32-bit guests (resend)
kvm_mmu_pte_write() does not handle 32-bit non-PAE large page backed
guests properly. It will instantiate two 2MB sptes pointing to the same
physical 2MB page when a guest large pte update is trapped.

Instead of duplicating code to handle this, disallow directory level
updates to happen through kvm_mmu_pte_write(), so the two 2MB sptes
emulating one guest 4MB pte can be correctly created by the page fault
handling path.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24 12:18:18 +03:00
Marcelo Tosatti 6597ca09e6 KVM: MMU: Fix rmap_write_protect() hugepage iteration bug
rmap_next() does not work correctly after rmap_remove(), as it expects
the rmap chains not to change during iteration.  Fix (for now) by restarting
iteration from the beginning.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24 12:17:10 +03:00
Marcelo Tosatti 06e0564566 KVM: close timer injection race window in __vcpu_run
If a timer fires after kvm_inject_pending_timer_irqs() but before
local_irq_disable() the code will enter guest mode and only inject such
timer interrupt the next time an unrelated event causes an exit.

It would be simpler if the timer->pending irq conversion could be done
with IRQ's disabled, so that the above problem cannot happen.

For now introduce a new vcpu requests bit to cancel guest entry.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24 12:16:59 +03:00
Marcelo Tosatti d4acf7e7ab KVM: Fix race between timer migration and vcpu migration
A guest vcpu instance can be scheduled to a different physical CPU
between the test for KVM_REQ_MIGRATE_TIMER and local_irq_disable().

If that happens, the timer will only be migrated to the current pCPU on
the next exit, meaning that guest LAPIC timer event can be delayed until
a host interrupt is triggered.

Fix it by cancelling guest entry if any vcpu request is pending.  This
has the side effect of nicely consolidating vcpu->requests checks.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24 12:16:52 +03:00
Linus Torvalds ee5c2ab09b Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  xen: don't drop NX bit
  xen: mask unwanted pte bits in __supported_pte_mask
  xen: Use wmb instead of rmb in xen_evtchn_do_upcall().
  x86: fix NULL pointer deref in __switch_to
2008-06-23 12:48:17 -07:00
Linus Torvalds b732d9680b Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] SN2: security hole in sn2_ptc_proc_write
2008-06-20 17:10:04 -07:00
Ivan Kokshaysky a744e0160a alpha: resurrect Cypress IDE quirk
Which was removed in the hope that generic legacy IDE quirk in
drivers/pci/probe.c is sufficient for Cypress IDE.
It isn't, as this controller has non-standard BAR layout:
secondary channel registers are in the BAR0-1 of the second
PCI function - not in the BAR2-3 of the same function, as the
generic quirk routine assumes.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-20 16:46:10 -07:00
Ivan Kokshaysky d559d4a24a alpha: fix compile failures with gcc-4.3 (bug #10438)
Vast majority of these build failures are gcc-4.3 warnings
about static functions and objects being referenced from
non-static (read: "extern inline") functions, in conjunction
with our -Werror.

We cannot just convert "extern inline" to "static inline",
as people keep suggesting all the time, because "extern inline"
logic is crucial for generic kernel build.
So
- just make sure that all callees of critical "extern inline"
  functions are also "extern inline";
- use "static inline", wherever it's possible.

traps.c: work around gcc-4.3 being too smart about array
bounds-checking.

TODO: add "gnu_inline" attribute to all our "extern inline"
functions to ensure desired behaviour with future compilers.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-20 16:46:10 -07:00
Ivan Kokshaysky ede426923b alpha: link failure fix
With built-in scsi disk driver, the final link fails with a following
error:
`.exit.text' referenced in section `.rodata' of drivers/built-in.o:
defined in discarded section `.exit.text' of drivers/built-in.o

This happens with -Os (CONFIG_CC_OPTIMIZE_FOR_SIZE=y) with all gcc-4
versions, and also with -O2 and gcc-4.3.

The problem is in sd.c:sd_major() being inlined into __exit function
exit_sd(), and the compiler generating a jump table in .rodata section
for the 'switch' statement in sd_major(). So we have references to
discarded section.

Fixed with a big hammer in the form of -fno-jump-tables.

Note that jump tables vs. discarded sections is a generic problem,
other architectures are just lucky not to suffer from it. But with
a slightly more complex switch/case statement it can be reproduced
on x86 as well. So maybe at some point we should consider
-fno-jump-tables as a generic compile option...

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-20 16:46:10 -07:00
Linus Torvalds b1ae8d3a00 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, geode: add a VSA2 ID for General Software
  x86: use BOOTMEM_EXCLUSIVE on 32-bit
  x86, 32-bit: fix boot failure on TSC-less processors
  x86: fix NULL pointer deref in __switch_to
  x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits.
2008-06-20 12:36:38 -07:00
Cliff Wickman e0c6d97c65 [IA64] SN2: security hole in sn2_ptc_proc_write
Security hole in sn2_ptc_proc_write

It is possible to overrun a buffer with a write to this /proc file.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-06-20 12:02:00 -07:00
Linus Torvalds 89f5b7da2a Reinstate ZERO_PAGE optimization in 'get_user_pages()' and fix XIP
KAMEZAWA Hiroyuki and Oleg Nesterov point out that since the commit
557ed1fa26 ("remove ZERO_PAGE") removed
the ZERO_PAGE from the VM mappings, any users of get_user_pages() will
generally now populate the VM with real empty pages needlessly.

We used to get the ZERO_PAGE when we did the "handle_mm_fault()", but
since fault handling no longer uses ZERO_PAGE for new anonymous pages,
we now need to handle that special case in follow_page() instead.

In particular, the removal of ZERO_PAGE effectively removed the core
file writing optimization where we would skip writing pages that had not
been populated at all, and increased memory pressure a lot by allocating
all those useless newly zeroed pages.

This reinstates the optimization by making the unmapped PTE case the
same as for a non-existent page table, which already did this correctly.

While at it, this also fixes the XIP case for follow_page(), where the
caller could not differentiate between the case of a page that simply
could not be used (because it had no "struct page" associated with it)
and a page that just wasn't mapped.

We do that by simply returning an error pointer for pages that could not
be turned into a "struct page *".  The error is arbitrarily picked to be
EFAULT, since that was what get_user_pages() already used for the
equivalent IO-mapped page case.

[ Also removed an impossible test for pte_offset_map_lock() failing:
  that's not how that function works ]

Acked-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-20 11:18:25 -07:00
Jeremy Fitzhardinge ebb9cfe20f xen: don't drop NX bit
Because NX is now enforced properly, we must put the hypercall page
into the .text segment so that it is executable.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 14:56:41 +02:00
Jeremy Fitzhardinge 05345b0f00 xen: mask unwanted pte bits in __supported_pte_mask
[ Stable: this isn't a bugfix in itself, but it's a pre-requiste
  for "xen: don't drop NX bit" ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 14:56:36 +02:00
Jordan Crouse ffe6e1da86 x86, geode: add a VSA2 ID for General Software
General Software writes their own VSA2 module for their version
of the Geode BIOS, which returns a different ID then the standard
VSA2.  This was causing the framebuffer driver to break for most
GSW boards.

Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Cc: tglx@linutronix.de
Cc: linux-geode@lists.infradead.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-19 14:19:03 +02:00
Bernhard Walle d3942cff62 x86: use BOOTMEM_EXCLUSIVE on 32-bit
This patch uses the BOOTMEM_EXCLUSIVE for crashkernel reservation also for
i386 and prints a error message on failure.

The patch is still for 2.6.26 since it is only bug fixing. The unification
of reserve_crashkernel() between i386 and x86_64 should be done for 2.6.27.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
2008-06-19 10:08:48 +02:00
Mikael Pettersson df17b1d990 x86, 32-bit: fix boot failure on TSC-less processors
Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6"
(invalid opcode) and a kernel halt immediately after the
kernel has been uncompressed. The BUG shows EIP pointing
to an rdtsc instruction in native_read_tsc(), invoked from
native_sched_clock().

(This error occurs so early that not even the serial console
can capture it.)

A bisection showed that this bug first occurs in 2.6.26-rc3-git7,
via commit 9ccc906c97e34fd91dc6aaf5b69b52d824386910:

>x86: distangle user disabled TSC from unstable
>
>tsc_enabled is set to 0 from the command line switch "notsc" and from
>the mark_tsc_unstable code. Seperate those functionalities and replace
>tsc_enable with tsc_disable. This makes also the native_sched_clock()
>decision when to use TSC understandable.
>
>Preparatory patch to solve the sched_clock() issue on 32 bit.
>
>Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

The core reason for this bug is that native_sched_clock() gets
called before tsc_init().

Before the commit above, tsc_32.c used a "tsc_enabled" variable
which defaulted to 0 == disabled, and which only got enabled late
in tsc_init(). Thus early calls to native_sched_clock() would skip
the TSC and use jiffies instead.

After the commit above, tsc_32.c uses a "tsc_disabled" variable
which defaults to 0, meaning that the TSC is Ok to use. Early calls
to native_sched_clock() now erroneously try to use the TSC on
!cpu_has_tsc processors, leading to invalid opcode exceptions.

My proposed fix is to initialise tsc_disabled to a "soft disabled"
state distinct from the hard disabled state set up by the "notsc"
kernel option. This fixes the native_sched_clock() problem. It also
allows tsc_init() to be simplified: instead of setting tsc_disabled = 1
on every error return, we just set tsc_disabled = 0 once when all
checks have succeeded.

I've verified that this lets my 486 boot again. I've also verified
that a Core2 machine still uses the TSC as clocksource after the patch.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-19 10:08:47 +02:00
Suresh Siddha 75118a82e2 x86: fix NULL pointer deref in __switch_to
Patrick McHardy reported a crash:

> > I get this oops once a day, its apparently triggered by something
> > run by cron, but the process is a different one each time.
> >
> > Kernel is -git from yesterday shortly before the -rc6 release
> > (last commit is the usb-2.6 merge, the x86 patches are missing),
> > .config is attached.
> >
> > I'll retry with current -git, but the patches that have gone in
> > since I last updated don't look related.
> >
> > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at
> > 000001ff
> > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118
> > [62060.043009] *pde = 00000000
> > [62060.043009] Oops: 0002 [#1] PREEMPT

Vegard Nossum analyzed it:

> This decodes to
>
>    0:   0f ae 00                fxsave (%eax)
>
> so it's related to the floating-point context. This is the exact
> location of the crash:
>
> $ addr2line -e arch/x86/kernel/process_32.o -i ab0
> include/asm/i387.h:232
> include/asm/i387.h:262
> arch/x86/kernel/process_32.c:595
>
> ...so it looks like prev_task->thread.xstate->fxsave has become NULL.
> Or maybe it never had any other value.

Somehow (as described below) TS_USEDFPU is set but the fpu is not
allocated or freed.

Another possible FPU pre-emption issue with the sleazy FPU optimization
which was benign before but not so anymore, with the dynamic FPU allocation
patch.

New task is getting exec'd and it is prempted at the below point.

flush_thread() {
	...
	/*
	* Forget coprocessor state..
	*/
	clear_fpu(tsk);
		<----- Preemption point
	clear_used_math();
	...
}

Now when it context switches in again, as the used_math() is still set
and fpu_counter can be > 5, we will do a math_state_restore() which sets
the task's TS_USEDFPU. After it continues from the above preemption point
it does clear_used_math() and much later free_thread_xstate().

Now, at the next context switch, it is quite possible that xstate is
null, used_math() is not set and TS_USEDFPU is still set. This will
trigger unlazy_fpu() causing kernel oops.

Fix this  by clearing tsk's fpu_counter before clearing task's fpu.

Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-19 10:08:45 +02:00
Paul Mackerras 65ba6cdc83 [POWERPC] Clear sub-page HPTE present bits when demoting page size
When we demote a slice from 64k to 4k, and we are about to insert an
HPTE for a 4k subpage and we notice that there is an existing 64k
HPTE, we first invalidate that HPTE before inserting the new 4k
subpage HPTE.  Since the bits that encode which hash bucket the old
HPTE was in overlap with the bits that encode which of the 16 subpages
have HPTEs, we need to clear out the subpage HPTE-present bits before
starting to insert HPTEs for the 4k subpages.  If we don't do that, we
can erroneously think that a subpage already has an HPTE when it
doesn't.

That in itself wouldn't be such a problem except that when we go to
update the HPTE that we think is present on machines with a
hypervisor, the hypervisor can tell us that the HPTE we think is there
is actually there even though it isn't, which can lead to a process
getting stuck in a loop, continually faulting.  The reason for the
confusion is that the AVPN (abbreviated virtual page number) we are
looking for in the HPTE for a 4k subpage can actually match the AVPN
in a stale HPTE for another 64k page.  For example, the HPTE for
the 4k subpage at 0x84000f000 will be in the same hash bucket and have
the same AVPN as the HPTE for the 64k page at 0x8400f0000.

This fixes the code to clear out the subpage HPTE-present bits.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-18 21:40:43 +10:00
Josh Boyer b17879f71c [POWERPC] 4xx: Clear new TLB cache attribute bits in Data Storage vector
A recent commit added support for the new 440x6 and 464 cores that have the
added WL1, IL1I, IL1D, IL2I, and ILD2 bits for the caching attributes in the
TLBs.  The new bits were cleared in the finish_tlb_load function, however a
similar bit of code was missed in the DataStorage interrupt vector.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-06-18 21:40:43 +10:00
Linus Torvalds 42a886af72 x86-64: Fix "bytes left to copy" return value for copy_from_user()
Most users by far do not care about the exact return value (they only
really care about whether the copy succeeded in its entirety or not),
but a few special core routines actually care deeply about exactly how
many bytes were copied from user space.

And the unrolled versions of the x86-64 user copy routines would
sometimes report that it had copied more bytes than it actually had.

Very few uses actually have partial copies to begin with, but to make
this bug even harder to trigger, most x86 CPU's use the "rep string"
instructions for normal user copies, and that version didn't have this
issue.

To make it even harder to hit, the one user of this that really cared
about the return value (and used the uncached version of the copy that
doesn't use the "rep string" instructions) was the generic write
routine, which pre-populated its source, once more hiding the problem by
avoiding the exception case that triggers the bug.

In other words, very special thanks to Bron Gondwana who not only
triggered this, but created a test-program to show it, and bisected the
behavior down to commit 08291429cf ("mm:
fix pagecache write deadlocks") which changed the access pattern just
enough that you can now trigger it with 'writev()' with multiple
iovec's.

That commit itself was not the cause of the bug, it just allowed all the
stars to align just right that you could trigger the problem.

[ Side note: this is just the minimal fix to make the copy routines
  (with __copy_from_user_inatomic_nocache as the particular version that
  was involved in showing this) have the right return values.

  We really should improve on the exceptional case further - to make the
  copy do a byte-accurate copy up to the exact page limit that causes it
  to fail.  As it is, the callers have to do extra work to handle the
  limit case gracefully. ]

Reported-by: Bron Gondwana <brong@fastmail.fm>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

 (which didn't have this problem), and since
most users that do the carethis was very hard to trigger, but
2008-06-17 17:47:50 -07:00
Linus Torvalds c8988f9682 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Fix CONFIG_IA64_SGI_UV build error
  [IA64] Update check_sal_cache_flush to use platform_send_ipi()
  [IA64] perfmon: fix async exit bug
2008-06-16 11:52:43 -07:00