Commit Graph

1764 Commits

Author SHA1 Message Date
Mark M. Hoffman 1236441f38 [PATCH] I2C hwmon: hwmon sysfs class
This patch adds the sysfs class "hwmon" for use by hardware monitoring
(sensors) chip drivers.  It also fixes up the related Kconfig/Makefile
bits.

Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:07 -07:00
bgardner@wabtec.com a61fc683ae [PATCH] I2C: add kobj_to_i2c_client
Move the inline function kobj_to_i2c_client() from max6875.c to i2c.h.

Signed-off-by: Ben Gardner <bgardner@wabtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-05 09:14:05 -07:00
Linus Torvalds 94f8c66e5e Merge branch 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev 2005-09-05 05:50:36 -07:00
Russell King 9d88347758 [ARM] Remove unused DYN_TICK_* macros
Neither DYN_TICK_SKIPPING nor DYN_TICK_SUITABLE are used on ARM.
Remove them.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-09-05 10:21:04 +01:00
Jeff Garzik d0bd99299b /spare/repo/libata-dev branch 'iomap-try3' 2005-09-05 05:20:33 -04:00
Jeff Garzik 586a4ac509 /spare/repo/libata-dev branch 'master' 2005-09-05 05:16:50 -04:00
Linus Torvalds da1f136c26 Merge master.kernel.org:/home/rmk/linux-2.6-mmc 2005-09-05 00:18:09 -07:00
Linus Torvalds 64c4813d9e Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-09-05 00:17:25 -07:00
Linus Torvalds e766f1cc59 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 2005-09-05 00:12:58 -07:00
Linus Torvalds 48467641bc Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-09-05 00:11:50 -07:00
Heiko Carstens 9513e5e3f5 [PATCH] s390: spinlock corner case
On s390 the lock value used for spinlocks consists of the lower 32 bits of the
PSW that holds the lock.  If this address happens to be on a four gigabyte
boundary the lock is left unlocked.  This allows other cpus to grab the same
lock and enter a lock protected code path concurrently.  In theory this can
happen if the vmalloc area for the code of a module crosses a 4 GB boundary.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:29 -07:00
Michael Holzheu 942eaabd5d [PATCH] s390: debug feature changes
debug feature changes/bug fixes:

- Use get_clock() function instead of private inline assembly.

- Use 'struct timeval' instead of 'struct timespec' for call to
  tod_to_timeval().  Now the microsecond part of the timestamp is correct
  again.

- Fix a locking problem: when creating a snapshot of the current content
  of the debug areas, lock the entire debug_info object.

Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:26 -07:00
Martin Schwidefsky ae6aa2ea89 [PATCH] s390: machine check handler bugs
The new machine check handler still has a few bugs.

1) The system entry time has to be stored in the machine check handler,

2) the machine check return psw may not be stored at the usual place
   because it might overwrite the return psw of the interrupted context,

3) the return address for the call to s390_handle_mcck in the i/o interrupt
   handler is not correct,

4) the system call cleanup has to take the different save area of the
   machine check handler into account,

5) the machine check handler may not call UPDATE_VTIME before
   CREATE_STACK_FRAME, and

6) the io leave path needs a critical section cleanup to make sure that the
   TIF_MCCK_PENDING bit is really checked before switching back to user space.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:25 -07:00
Adrian Bunk 4c139862b8 [PATCH] xtensa: delete accidental file
This file seems to be an accident.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:25 -07:00
Adrian Bunk d99cf715a0 [PATCH] xtensa: replace 'extern inline' with 'static inline'
"extern inline" doesn't make sense.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:25 -07:00
Jeff Dike 7ef9390541 [PATCH] uml: fix x86_64 page leak
We were leaking pmd pages when 3_LEVEL_PGTABLES was enabled.  This fixes that.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:24 -07:00
Jeff Dike 08964c565b [PATCH] uml: merge duplicated page table code
There is a lot of code which is duplicated between the 2 and 3 level
implementation, with the only difference that the 3-level implementation is a
bit more generalized (instead of accessing directly pte_t.pte, it uses the
appropriate access macros).

So this code is joined together.

As obvious, a "core code nice cleanup" is not a "stability-friendly patch" so
usual care applies.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:22 -07:00
Paolo 'Blaisorblade' Giarrusso 1e40cd383c [PATCH] uml: fixes performance regression in activate_mm and thus exec()
Normally, activate_mm() is called from exec(), and thus it used to be a
no-op because we use a completely new "MM context" on the host (for
instance, a new process), and so we didn't need to flush any "TLB entries"
(which for us are the set of memory mappings for the host process from the
virtual "RAM" file).

Kernel threads, instead, are usually handled in a different way.  So, when
for AIO we call use_mm(), things used to break and so Benjamin implemented
activate_mm().  However, that is only needed for AIO, and could slow down
exec() inside UML, so be smart: detect being called for AIO (via
PF_BORROWED_MM) and do the full flush only in that situation.

Comment also the caller so that people won't go breaking UML without
noticing.  I also rely on the caller's locks for testing current->flags.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:21 -07:00
Bodo Stroesser 1b38f0064e [PATCH] Uml support: add PTRACE_SYSEMU_SINGLESTEP option to i386
This patch implements the new ptrace option PTRACE_SYSEMU_SINGLESTEP, which
can be used by UML to singlestep a process: it will receive SINGLESTEP
interceptions for normal instructions and syscalls, but syscall execution will
be skipped just like with PTRACE_SYSEMU.

Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:20 -07:00
Laurent Vivier ed75e8d580 [PATCH] UML Support - Ptrace: adds the host SYSEMU support, for UML and general usage
Jeff Dike <jdike@addtoit.com>,
      Paolo 'Blaisorblade' Giarrusso <blaisorblade_spam@yahoo.it>,
      Bodo Stroesser <bstroesser@fujitsu-siemens.com>

Adds a new ptrace(2) mode, called PTRACE_SYSEMU, resembling PTRACE_SYSCALL
except that the kernel does not execute the requested syscall; this is useful
to improve performance for virtual environments, like UML, which want to run
the syscall on their own.

In fact, using PTRACE_SYSCALL means stopping child execution twice, on entry
and on exit, and each time you also have two context switches; with SYSEMU you
avoid the 2nd stop and so save two context switches per syscall.

Also, some architectures don't have support in the host for changing the
syscall number via ptrace(), which is currently needed to skip syscall
execution (UML turns any syscall into getpid() to avoid it being executed on
the host).  Fixing that is hard, while SYSEMU is easier to implement.

* This version of the patch includes some suggestions of Jeff Dike to avoid
  adding any instructions to the syscall fast path, plus some other little
  changes, by myself, to make it work even when the syscall is executed with
  SYSENTER (but I'm unsure about them). It has been widely tested for quite a
  lot of time.

* Various fixed were included to handle the various switches between
  various states, i.e. when for instance a syscall entry is traced with one of
  PT_SYSCALL / _SYSEMU / _SINGLESTEP and another one is used on exit.
  Basically, this is done by remembering which one of them was used even after
  the call to ptrace_notify().

* We're combining TIF_SYSCALL_EMU with TIF_SYSCALL_TRACE or TIF_SINGLESTEP
  to make do_syscall_trace() notice that the current syscall was started with
  SYSEMU on entry, so that no notification ought to be done in the exit path;
  this is a bit of a hack, so this problem is solved in another way in next
  patches.

* Also, the effects of the patch:
"Ptrace - i386: fix Syscall Audit interaction with singlestep"
are cancelled; they are restored back in the last patch of this series.

Detailed descriptions of the patches doing this kind of processing follow (but
I've already summed everything up).

* Fix behaviour when changing interception kind #1.

  In do_syscall_trace(), we check the status of the TIF_SYSCALL_EMU flag
  only after doing the debugger notification; but the debugger might have
  changed the status of this flag because he continued execution with
  PTRACE_SYSCALL, so this is wrong.  This patch fixes it by saving the flag
  status before calling ptrace_notify().

* Fix behaviour when changing interception kind #2:
  avoid intercepting syscall on return when using SYSCALL again.

  A guest process switching from using PTRACE_SYSEMU to PTRACE_SYSCALL
  crashes.

  The problem is in arch/i386/kernel/entry.S.  The current SYSEMU patch
  inhibits the syscall-handler to be called, but does not prevent
  do_syscall_trace() to be called after this for syscall completion
  interception.

  The appended patch fixes this.  It reuses the flag TIF_SYSCALL_EMU to
  remember "we come from PTRACE_SYSEMU and now are in PTRACE_SYSCALL", since
  the flag is unused in the depicted situation.

* Fix behaviour when changing interception kind #3:
  avoid intercepting syscall on return when using SINGLESTEP.

  When testing 2.6.9 and the skas3.v6 patch, with my latest patch and had
  problems with singlestepping on UML in SKAS with SYSEMU.  It looped
  receiving SIGTRAPs without moving forward.  EIP of the traced process was
  the same for all SIGTRAPs.

What's missing is to handle switching from PTRACE_SYSCALL_EMU to
PTRACE_SINGLESTEP in a way very similar to what is done for the change from
PTRACE_SYSCALL_EMU to PTRACE_SYSCALL_TRACE.

I.e., after calling ptrace(PTRACE_SYSEMU), on the return path, the debugger is
notified and then wake ups the process; the syscall is executed (or skipped,
when do_syscall_trace() returns 0, i.e.  when using PTRACE_SYSEMU), and
do_syscall_trace() is called again.  Since we are on the return path of a
SYSEMU'd syscall, if the wake up is performed through ptrace(PTRACE_SYSCALL),
we must still avoid notifying the parent of the syscall exit.  Now, this
behaviour is extended even to resuming with PTRACE_SINGLESTEP.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:20 -07:00
Roman Zippel 072dffda1d [PATCH] m68k: cleanup inline mem functions
Use the builtin functions for memset/memclr/memcpy, special optimizations for
page operations have dedicated functions now.  Uninline memmove/memchr and
move all functions into a single file and clean it up a little.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:19 -07:00
Roman Zippel 2855b97020 [PATCH] m68k: move cache functions into separate file
Move a few cache functions into its own file and fix flush_icache_range() so
it can handle both kernel and user addresses correctly (assuming context is
set correctly).

Turn copy_to_user_page/copy_from_user_page into inline functions and add a
missing cache flush.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:19 -07:00
Shaohua Li c3c433e4f3 [PATCH] add suspend/resume for timer
The timers lack .suspend/.resume methods.  Because of this, jiffies got a
big compensation after a S3 resume.  And then softlockup watchdog reports
an oops.  This occured with HPET enabled, but it's also possible for other
timers.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:18 -07:00
Pavel Machek ca078bae81 [PATCH] swsusp: switch pm_message_t to struct
This adds type-checking to pm_message_t, so that people can't confuse it
with int or u32.  It also allows us to fix "disk yoyo" during suspend (disk
spinning down/up/down).

[We've tried that before; since that cpufreq problems were fixed and I've
tried make allyes config and fixed resulting damage.]

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:16 -07:00
Zwane Mwaikambo 4ad8d38342 [PATCH] i386 boottime for_each_cpu broken
for_each_cpu walks through all processors in cpu_possible_map, which is
defined as cpu_callout_map on i386 and isn't initialised until all
processors have been booted. This breaks things which do for_each_cpu
iterations early during boot. So, define cpu_possible_map as a bitmap with
NR_CPUS bits populated. This was triggered by a patch i'm working on which
does alloc_percpu before bringing up secondary processors.

From: Alexander Nyberg <alexn@telia.com>

i386-boottime-for_each_cpu-broken.patch
i386-boottime-for_each_cpu-broken-fix.patch

The SMP version of __alloc_percpu checks the cpu_possible_map before
allocating memory for a certain cpu.  With the above patches the BSP cpuid
is never set in cpu_possible_map which breaks CONFIG_SMP on uniprocessor
machines (as soon as someone tries to dereference something allocated via
__alloc_percpu, which in fact is never allocated since the cpu is not set
in cpu_possible_map).

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:13 -07:00
Zachary Amsden d7271b14b2 [PATCH] i386: encapsulate copying of pgd entries
Add a clone operation for pgd updates.

This helps complete the encapsulation of updates to page tables (or pages
about to become page tables) into accessor functions rather than using
memcpy() to duplicate them.  This is both generally good for consistency
and also necessary for running in a hypervisor which requires explicit
updates to page table entries.

The new function is:

clone_pgd_range(pgd_t *dst, pgd_t *src, int count);

   dst - pointer to pgd range anwhere on a pgd page
   src - ""
   count - the number of pgds to copy.

   dst and src can be on the same page, but the range must not overlap
   and must not cross a page boundary.

Note that I ommitted using this call to copy pgd entries into the
software suspend page root, since this is not technically a live paging
structure, rather it is used on resume from suspend.  CC'ing Pavel in case
he has any feedback on this.

Thanks to Chris Wright for noticing that this could be more optimal in
PAE compiles by eliminating the memset.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:13 -07:00
George Anzinger 748f2edb52 [PATCH] x86 NMI: better support for debuggers
This patch adds a notify to the die_nmi notify that the system is about to
be taken down.  If the notify is handled with a NOTIFY_STOP return, the
system is given a new lease on life.

We also change the nmi watchdog to carry on if die_nmi returns.

This give debug code a chance to a) catch watchdog timeouts and b) possibly
allow the system to continue, realizing that the time out may be due to
debugger activities such as single stepping which is usually done with
"other" cpus held.

Signed-off-by: George Anzinger<george@mvista.com>
Cc: Keith Owens <kaos@ocs.com.au>
Signed-off-by: George Anzinger <george@mvista.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:13 -07:00
Zachary Amsden f2f30ebca6 [PATCH] x86: introduce a write acessor for updating the current LDT
Introduce a write acessor for updating the current LDT.  This is required
for hypervisors like Xen that do not allow LDT pages to be directly
written.

Testing - here's a fun little LDT test that can be trivially modified to
test limits as well.

/*
 * Copyright (c) 2005, Zachary Amsden (zach@vmware.com)
 * This is licensed under the GPL.
 */

#include <stdio.h>
#include <signal.h>
#include <asm/ldt.h>
#include <asm/segment.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/mman.h>
#define __KERNEL__
#include <asm/page.h>

void main(void)
{
        struct user_desc desc;
        char *code;
        unsigned long long tsc;

        code = (char *)mmap(0, 8192, PROT_EXEC|PROT_READ|PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        desc.entry_number = 0;
        desc.base_addr = code;
        desc.limit = 1;
        desc.seg_32bit = 1;
        desc.contents = MODIFY_LDT_CONTENTS_CODE;
        desc.read_exec_only = 0;
        desc.limit_in_pages = 1;
        desc.seg_not_present = 0;
        desc.useable = 1;
        if (modify_ldt(1, &desc, sizeof(desc)) != 0) {
                perror("modify_ldt");
        }
        printf("code base is 0x%08x\n", (unsigned)code);
        code[0x0ffe] = 0x0f;  /* rdtsc */
        code[0x0fff] = 0x31;
        code[0x1000] = 0xcb;  /* lret */
        __asm__ __volatile("lcall $7,$0xffe" : "=A" (tsc));
        printf("TSC is 0x%016llx\n", tsc);
}

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:13 -07:00
Zachary Amsden a520112930 [PATCH] x86: make IOPL explicit
The pushf/popf in switch_to are ONLY used to switch IOPL.  Making this
explicit in C code is more clear.  This pushf/popf pair was added as a
bugfix for leaking IOPL to unprivileged processes when using
sysenter/sysexit based system calls (sysexit does not restore flags).

When requesting an IOPL change in sys_iopl(), it is just as easy to change
the current flags and the flags in the stack image (in case an IRET is
required), but there is no reason to force an IRET if we came in from the
SYSENTER path.

This change is the minimal solution for supporting a paravirtualized Linux
kernel that allows user processes to run with I/O privilege.  Other
solutions require radical rewrites of part of the low level fault / system
call handling code, or do not fully support sysenter based system calls.

Unfortunately, this added one field to the thread_struct.  But as a bonus,
on P4, the fastest time measured for switch_to() went from 312 to 260
cycles, a win of about 17% in the fast case through this performance
critical path.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:12 -07:00
Zachary Amsden 0998e4228a [PATCH] x86: privilege cleanup
Privilege checking cleanup.  Originally, these diffs were much greater, but
recent cleanups in Linux have already done much of the cleanup.  I added
some explanatory comments in places where the reasoning behind certain
tests is rather subtle.

Also, in traps.c, we can skip the user_mode check in handle_BUG().  The
reason is, there are only two call chains - one via die_if_kernel() and one
via do_page_fault(), both entering from die().  Both of these paths already
ensure that a kernel mode failure has happened.  Also, the original check
here, if (user_mode(regs)) was insufficient anyways, since it would not
rule out BUG faults from V8086 mode execution.

Saving the %ss segment in show_regs() rather than assuming a fixed value
also gives better information about the current kernel state in the
register dump.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:12 -07:00
Zachary Amsden f2ab446124 [PATCH] x86: more asm cleanups
Some more assembler cleanups I noticed along the way.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:12 -07:00
Ingo Molnar 4f0cb8d978 [PATCH] i386: fix incorrect TSS entry for LDT
Noticed by Chuck Ebbert: the .ldt entry of the TSS was set up incorrectly.
It never mattered since this was a leftover from old times, so remove it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:12 -07:00
Zachary Amsden c9b02a2413 [PATCH] i386: use set_pte macros in a couple places where they were missing
Also, setting PDPEs in PAE mode does not require atomic operations, since the
PDPEs are cached by the processor, and only reloaded on an explicit or
implicit reload of CR3.

Since the four PDPEs must always be present in an active root, and the kernel
PDPE is never updated, we are safe even from SMIs and interrupts / NMIs using
task gates (which reload CR3).  Actually, much of this is moot, since the user
PDPEs are never updated either, and the only usage of task gates is by the
doublefault handler.  It appears the only place PGDs get updated in PAE mode
is in init_low_mappings() / zap_low_mapping() for initial page table creation
and recovery from ACPI sleep state, and these sites are safe by inspection.
Getting rid of the cmpxchg8b saves code space and 720 cycles in pgd_alloc on
P4.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:12 -07:00
Zachary Amsden 2f2984eb4a [PATCH] i386: generate better code around descriptor update and access functions
GCC can generate better code around descriptor update and access functions
when there is not an explicit "eax" register constraint.

Testing: You won't boot if this is messed up, since the TSS descriptor will be
corrupted.  Verified the assembler and booted.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:11 -07:00
Zachary Amsden 4d37e7e3fd [PATCH] i386: inline assembler: cleanup and encapsulate descriptor and task register management
i386 inline assembler cleanup.

This change encapsulates descriptor and task register management.  Also,
it is possible to improve assembler generation in two cases; savesegment
may store the value in a register instead of a memory location, which
allows GCC to optimize stack variables into registers, and MOV MEM, SEG
is always a 16-bit write to memory, making the casting in math-emu
unnecessary.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:11 -07:00
Zachary Amsden 245067d167 [PATCH] i386: cleanup serialize msr
i386 arch cleanup.  Introduce the serialize macro to serialize processor
state.  Why the microcode update needs it I am not quite sure, since wrmsr()
is already a serializing instruction, but it is a microcode update, so I will
keep the semantic the same, since this could be a timing workaround.  As far
as I can tell, this has always been there since the original microcode update
source.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:11 -07:00
Zachary Amsden 4bb0d3ec3e [PATCH] i386: inline asm cleanup
i386 Inline asm cleanup.  Use cr/dr accessor functions.

Also, a potential bugfix.  Also, some CR accessors really should be volatile.
Reads from CR0 (numeric state may change in an exception handler), writes to
CR4 (flipping CR4.TSD) and reads from CR2 (page fault) prevent instruction
re-ordering.  I did not add memory clobber to CR3 / CR4 / CR0 updates, as it
was not there to begin with, and in no case should kernel memory be clobbered,
except when doing a TLB flush, which already has memory clobber.

I noticed that page invalidation does not have a memory clobber.  I can't find
a bug as a result, but there is definitely a potential for a bug here:

#define __flush_tlb_single(addr) \
	__asm__ __volatile__("invlpg %0": :"m" (*(char *) addr))

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:11 -07:00
Natalie.Protasevich@unisys.com 56f1d5d52a [PATCH] ES7000 platform update (i386)
This is subarch update for ES7000.  I've modified platform check code and
removed unnecessary OEM table parsing for newer systems that don't use OEM
information during boot.  Parsing the table in fact is causing problems,
and the platform doesn't get recognized.  The patch only affects the ES7000
subach.

Signed-off-by: <Natalie.Protasevich@unisys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:10 -07:00
Venkatesh Pallipadi 911a62d423 [PATCH] x86: sutomatically enable bigsmp when we have more than 8 CPUs
i386 generic subarchitecture requires explicit dmi strings or command line
to enable bigsmp mode.  The patch below removes that restriction, and uses
bigsmp as soon as it finds more than 8 logical CPUs, Intel processors and
xAPIC support.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:10 -07:00
Matt Tolentino 7ae65fd334 [PATCH] x86: fix EFI memory map parsing
The memory descriptors that comprise the EFI memory map are not fixed in
stone such that the size could change in the future.  This uses the memory
descriptor size obtained from EFI to iterate over the memory map entries
during boot.  This enables the removal of an x86 specific pad (and ifdef)
in the EFI header.  I also couldn't stomach the broken up nature of the
function to put EFI runtime calls into virtual mode any longer so I fixed
that up a bit as well.

For reference, this patch only impacts x86.

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:09 -07:00
Yoichi Yuasa 6fe7f2578f [PATCH] mips: remove timex.h for vr41xx
vr41xx doesn't need mach-vr41xx/timex.h.  This patch has removed
mach-vr41xx/timex.h.

Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:08 -07:00
Yoichi Yuasa 766160c29f [PATCH] mips: fix build warnings
This patch has fixed the following warnings.

arch/mips/kernel/genex.S:250:5: warning: "CONFIG_64BIT" is not defined
arch/mips/math-emu/cp1emu.c:1128:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:1206:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:1270:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:323:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:808:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:953:5: warning: "__mips64" is not defined
arch/mips/mm/tlbex.c:519:5: warning: "CONFIG_64BIT" is not defined
include/asm/reg.h:73:5: warning: "CONFIG_64BIT" is not defined

Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:08 -07:00
Yoichi Yuasa e63ea56fe2 [PATCH] mips: add pcibios_bus_to_resource
This patch has added pcibios_bus_to_resource to MIPS.

Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:08 -07:00
Yoichi Yuasa e2de84920d [PATCH] mips: add pcibios_select_root
Add pcibios_select_root to MIPS.

Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:08 -07:00
Ralf Baechle 4ce588cd56 [PATCH] mips: fix coherency configuration
Fix the MIPS coherency configuration such that we always keep the mapping
state in <asm/pci.h> when we need to on non-coherent platforms.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:07 -07:00
Ralf Baechle 42a3b4f25a [PATCH] mips: nuke trailing whitespace
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:07 -07:00
Ralf Baechle 875d43e72b [PATCH] mips: clean up 32/64-bit configuration
Start cleaning 32-bit vs. 64-bit configuration.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:06 -07:00
Ralf Baechle dc4ec916f6 [PATCH] MIPS Technologies PCI ID bits
- MIPS Denmark does no longer exist; the PCI vendor ID is now owned by
  MIPS Technologies.

- Add ID for SOC-it, MIPS's system controller.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:04 -07:00
Ralf Baechle 07119621e6 [PATCH] mips: add support for Qemu system architecture
Add support for the virtual MIPS system that is emulated by Qemu.  See
http://www.linux-mips.org/wiki/Qemu for a detailed current status.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:04 -07:00
Ralf Baechle 7901c79982 [PATCH] DEC PMAGB B framebuffer update
Revive HX frame buffer support for 2.6.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:03 -07:00
Ralf Baechle af690a948c [PATCH] DEC PMAG BA frame buffer update
Rewrite PMAG BA frame buffer driver for 2.6.

Acked-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:03 -07:00
Ralf Baechle 003b54925e [PATCH] mips: remove HP Laserjet remains
Remove the one file which managed to survive the removel of HP Laserjet
support.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:03 -07:00
Ralf Baechle 0fdda107e1 [PATCH] mips: remove VR4181 support
There seem to be no more users or interest in the NEC Osprey evaluation
system for the NEC VR4181 SOC which is an old part anyway, so remove the
code.  More information on the Osprey can be found at
http://www.linux-mips.org/wiki/Osprey.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:02 -07:00
Yoichi Yuasa 979934da9e [PATCH] mips: update IRQ handling for vr41xx
This patch has updated IRQ handling for vr41xx.
o added common IRQ dispatch
o changed IRQ number in int-handler.S
o added resource management to icu.c

Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:02 -07:00
Olof Johansson 233ccd0d04 [PATCH] ppc64: Add VMX save flag to VPA
We need to indicate to the hypervisor that it needs to save our VMX
registers when switching partitions on a shared-processor system, just as
it needs to for FP and PMC registers.

This could be made to be on-demand when VMX is used, but we don't do that
for FP nor PMC right now either so let's not overcomplicate things.

Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: <engebret@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:01 -07:00
Mark A. Greer d01c08c9ae [PATCH] ppc32: mv64x60 updates & enhancements
Updates and enhancement to the ppc32 mv64x60 code:
- move code to get mem size from mem ctlr to bootwrapper
- address some errata in the mv64360 pic code
- some minor cleanups
- export one of the bridge's regs via sysfs so user daemon can watch for
  extraction events

Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:00 -07:00
Eugene Surovegin e8834801bf [PATCH] ppc32: export cacheable_memcpy()
Add declaration and cacheable_memcpy().  I'll be needing this function in
new 4xx EMAC driver I'm going to submit to netdev soon.

IMHO, the better place for the declaration would be asm-powerpc/string.h,
unfortunately, ppc64 doesn't have this function, so asm-ppc/system.h is the
next best place.

Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:00 -07:00
Eugene Surovegin 3a0a401b40 [PATCH] ppc32: add dcr_base field to ocp_func_mal_data
Add dcr_base field to ocp_func_mal_data.  This is preparation step for the
new EMAC driver.

Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:00 -07:00
Eugene Surovegin 28fa031e76 [PATCH] ppc32: move 4xx PHY_MODE_XXX defines to ibm_ocp.h
Move 4xx PHY_MODE_XXX defines to asm-ppc/ibm_ocp.h.  This is a preparation
step for the new EMAC driver.

Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:59 -07:00
Kumar Gala 164ada643d [PATCH] ppc32: add CONFIG_HZ
While ppc32 has the CONFIG_HZ Kconfig option, it wasnt actually being used.
Connect it up and set all platforms to 250Hz.  This pretty much mimics the
ppc64 patch from Anton Blanchard.

Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:58 -07:00
Kumar Gala 88adfe70c6 [PATCH] ppc32: ppc_sys system on chip identification additions
Add the ability to identify an SOC by a name and id.  There are cases in
which the integer identifier is not sufficient to specify a specific SOC.
In these cases we can use a string to further qualify the match.

Signed-off-by: Vitaly Bordug <vbordug@ru.mvista.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:58 -07:00
Roland Dreier 5a6a4d4320 [PATCH] ppc32: Don't sleep in flush_dcache_icache_page()
flush_dcache_icache_page() will be called on an instruction page fault.  We
can't sleep in the fault handler, so use kmap_atomic() instead of just
kmap() for the Book-E case.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:57 -07:00
Kumar Gala 39cdc4bfb5 [PATCH] ppc32: Cleaned up global namespace of Book-E watchdog variables
Renamed global variables used to convey if the watchdog is enabled and
periodicity of the timer and moved the declarations into a header for these
variables

Signed-off-by: Matt McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:57 -07:00
Matt Porter 2698ebcb43 [PATCH] ppc32: add phy excluded features to ocp_func_emac_data
This patch adds a field to struct ocp_func_emac_data that allows
platform-specific unsupported PHY features to be passed in to the ibm_emac
ethernet driver.

This patch also adds some logic for the Bamboo eval board to populate this
field based on the dip switches on the board.  This is a workaround for the
improperly biased RJ-45 sockets on the Rev.  0 Bamboo.

Signed-off-by: Wade Farnsworth <wfarnsworth@mvista.com>
Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:56 -07:00
Kumar Gala 8e8fff0975 [PATCH] ppc32: Add ppc_sys descriptions for PowerQUICC II devices
Added ppc_sys device and system definitions for PowerQUICC II devices.
This will allow drivers for PQ2 to be proper platform device drivers.
Which can be shared on PQ3 processors with the same peripherals.

Signed-off-by: Matt McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:56 -07:00
Christoph Hellwig d27477c225 [PATCH] ppc32: fix asm-ppc/dma-mapping.h sparse warning
GFP flags must be passed as unisgned int __nocast these days, else we'll
get tons of sparse warnings in every driver.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:55 -07:00
Kumar Gala f4ad35a34b [PATCH] ppc32: Remove board support for SPD823TS
Support for the SPD823TS board is no longer maintained and thus being removed

Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:55 -07:00
Kumar Gala ea08dcfa54 [PATCH] ppc32: Remove board support for REDWOOD
Support for the REDWOOD board is no longer maintained and thus being removed

Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:54 -07:00
Kumar Gala 37330c9146 [PATCH] ppc32: Remove board support for OAK
Support for the OAK board is no longer maintained and thus being removed

Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:54 -07:00
Kumar Gala 89d7f53030 [PATCH] ppc32: Remove board support for MCPN765
Support for the MCPN765 board is no longer maintained and thus being removed

Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:54 -07:00
Kumar Gala f4f1269cb3 [PATCH] ppc32: Remove board support for ASH
Support for the ASH board is no longer maintained and thus being removed

Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:53 -07:00
Martin Hicks c07e02db76 [PATCH] VM: add page_state info to per-node meminfo
Add page_state info to the per-node meminfo file in sysfs.  This is mostly
just for informational purposes.

The lack of this information was brought up recently during a discussion
regarding pagecache clearing, and I put this patch together to test out one
of the suggestions.

It seems like interesting info to have, so I'm submitting the patch.

Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:49 -07:00
Zachary Amsden 61e06037e7 [PATCH] x86_64: avoid some atomic operations during address space destruction
Any architecture that has hardware updated A/D bits that require
synchronization against other processors during PTE operations can benefit
from doing non-atomic PTE updates during address space destruction.
Originally done on i386, now ported to x86_64.

Doing a read/write pair instead of an xchg() operation saves the implicit
lock, which turns out to be a big win on 32-bit (esp w PAE).

Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:48 -07:00
Zachary Amsden a600388d28 [PATCH] x86: ptep_clear optimization
Add a new accessor for PTEs, which passes the full hint from the mmu_gather
struct; this allows architectures with hardware pagetables to optimize away
atomic PTE operations when destroying an address space.  Removing the
locked operation should allow better pipelining of memory access in this
loop.  I measured an average savings of 30-35 cycles per zap_pte_range on
the first 500 destructions on Pentium-M, but I believe the optimization
would win more on older processors which still assert the bus lock on xchg
for an exclusive cacheline.

Update: I made some new measurements, and this saves exactly 26 cycles over
ptep_get_and_clear on Pentium M.  On P4, with a PAE kernel, this saves 180
cycles per ptep_get_and_clear, for a whopping 92160 cycles savings for a
full address space destruction.

pte_clear_full is not yet used, but is provided for future optimizations
(in particular, when running inside of a hypervisor that queues page table
updates, the full hint allows us to avoid queueing unnecessary page table
update for an address space in the process of being destroyed.

This is not a huge win, but it does help a bit, and sets the stage for
further hypervisor optimization of the mm layer on all architectures.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: Christoph Lameter <christoph@lameter.com>
Cc: <linux-mm@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:48 -07:00
Kyle Moffett fa5b08d5f8 [PATCH] sab: consolidate kmem_bufctl_t
This is used only in slab.c and each architecture gets to define whcih
underlying type is to be used.

Seems a bit silly - move it to slab.c and use the same type for all
architectures: unsigned int.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:48 -07:00
Chen, Kenneth W 0e5c9f39f6 [PATCH] remove hugetlb_clean_stale_pgtable() and fix huge_pte_alloc()
I don't think we need to call hugetlb_clean_stale_pgtable() anymore
in 2.6.13 because of the rework with free_pgtables().  It now collect
all the pte page at the time of munmap.  It used to only collect page
table pages when entire one pgd can be freed and left with staled pte
pages.  Not anymore with 2.6.13.  This function will never be called
and We should turn it into a BUG_ON.

I also spotted two problems here, not Adam's fault :-)
(1) in huge_pte_alloc(), it looks like a bug to me that pud is not
    checked before calling pmd_alloc()
(2) in hugetlb_clean_stale_pgtable(), it also missed a call to
    pmd_free_tlb.  I think a tlb flush is required to flush the mapping
    for the page table itself when we clear out the pmd pointing to a
    pte page.  However, since hugetlb_clean_stale_pgtable() is never
    called, so it won't trigger the bug.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:46 -07:00
Adam Litke 32e51a8c97 [PATCH] hugetlb: add pte_huge() macro
This patch adds a macro pte_huge(pte) for i386/x86_64 which is needed by a
patch later in the series.  Instead of repeating (_PAGE_PRESENT |
_PAGE_PSE), I've added __LARGE_PTE to i386 to match x86_64.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: <linux-mm@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:46 -07:00
Deepak Saxena fd195c49fb [PATCH] arm: allow for arch-specific IOREMAP_MAX_ORDER
Version 6 of the ARM architecture introduces the concept of 16MB pages
(supersections) and 36-bit (40-bit actually, but nobody uses this) physical
addresses.  36-bit addressed memory and I/O and ARMv6 can only be mapped
using supersections and the requirement on these is that both virtual and
physical addresses be 16MB aligned.  In trying to add support for ioremap()
of 36-bit I/O, we run into the issue that get_vm_area() allows for a
maximum of 512K alignment via the IOREMAP_MAX_ORDER constant.  To work
around this, we can:

- Allocate a larger VM area than needed (size + (1ul << IOREMAP_MAX_ORDER))
  and then align the pointer ourselves, but this ends up with 512K of
  wasted VM per ioremap().

- Provide a new __get_vm_area_aligned() API and make __get_vm_area() sit
  on top of this. I did this and it works but I don't like the idea
  adding another VM API just for this one case.

- My preferred solution which is to allow the architecture to override
  the IOREMAP_MAX_ORDER constant with it's own version.

Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:46 -07:00
Paolo 'Blaisorblade' Giarrusso 9b4ee40ebb [PATCH] mm: correct _PAGE_FILE comment
_PAGE_FILE does not indicate whether a file is in page / swap cache, it is
set just for non-linear PTE's.  Correct the comment for i386, x86_64, UML.
Also clearify _PAGE_NONE.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:45 -07:00
Paolo 'Blaisorblade' Giarrusso e83a959671 [PATCH] comment typo fix
smp_entry_t -> swap_entry_t

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:45 -07:00
Martin Hicks bce5f6ba34 [PATCH] VM: add capabilites check to set_zone_reclaim
Add a capability check to sys_set_zone_reclaim().  This syscall is not
something that should be available to a user.

Signed-off-by:  Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:44 -07:00
Nick Piggin 242e546862 [PATCH] mm: remove atomic
This bitop does not need to be atomic because it is performed when there will
be no references to the page (ie.  the page is being freed).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:44 -07:00
Christoph Lameter 6e21c8f145 [PATCH] /proc/<pid>/numa_maps to show on which nodes pages reside
This patch was recently discussed on linux-mm:
http://marc.theaimsgroup.com/?t=112085728500002&r=1&w=2

I inherited a large code base from Ray for page migration.  There was a
small patch in there that I find to be very useful since it allows the
display of the locality of the pages in use by a process.  I reworked that
patch and came up with a /proc/<pid>/numa_maps that gives more information
about the vma's of a process.  numa_maps is indexes by the start address
found in /proc/<pid>/maps.  F.e.  with this patch you can see the page use
of the "getty" process:

margin:/proc/12008 # cat maps
00000000-00004000 r--p 00000000 00:00 0
2000000000000000-200000000002c000 r-xp 00000000 08:04 516                /lib/ld-2.3.3.so
2000000000038000-2000000000040000 rw-p 00028000 08:04 516                /lib/ld-2.3.3.so
2000000000040000-2000000000044000 rw-p 2000000000040000 00:00 0
2000000000058000-2000000000260000 r-xp 00000000 08:04 54707842           /lib/tls/libc.so.6.1
2000000000260000-2000000000268000 ---p 00208000 08:04 54707842           /lib/tls/libc.so.6.1
2000000000268000-2000000000274000 rw-p 00200000 08:04 54707842           /lib/tls/libc.so.6.1
2000000000274000-2000000000280000 rw-p 2000000000274000 00:00 0
2000000000280000-20000000002b4000 r--p 00000000 08:04 9126923            /usr/lib/locale/en_US.utf8/LC_CTYPE
2000000000300000-2000000000308000 r--s 00000000 08:04 60071467           /usr/lib/gconv/gconv-modules.cache
2000000000318000-2000000000328000 rw-p 2000000000318000 00:00 0
4000000000000000-4000000000008000 r-xp 00000000 08:04 29576399           /sbin/mingetty
6000000000004000-6000000000008000 rw-p 00004000 08:04 29576399           /sbin/mingetty
6000000000008000-600000000002c000 rw-p 6000000000008000 00:00 0          [heap]
60000fff7fffc000-60000fff80000000 rw-p 60000fff7fffc000 00:00 0
60000ffffff44000-60000ffffff98000 rw-p 60000ffffff44000 00:00 0          [stack]
a000000000000000-a000000000020000 ---p 00000000 00:00 0                  [vdso]

cat numa_maps
2000000000000000 default MaxRef=43 Pages=11 Mapped=11 N0=4 N1=3 N2=2 N3=2
2000000000038000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
2000000000040000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
2000000000058000 default MaxRef=43 Pages=61 Mapped=61 N0=14 N1=15 N2=16 N3=16
2000000000268000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
2000000000274000 default MaxRef=1 Pages=3 Mapped=3 Anon=3 N0=3
2000000000280000 default MaxRef=8 Pages=3 Mapped=3 N0=3
2000000000300000 default MaxRef=8 Pages=2 Mapped=2 N0=2
2000000000318000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N2=1
4000000000000000 default MaxRef=6 Pages=2 Mapped=2 N1=2
6000000000004000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
6000000000008000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
60000fff7fffc000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
60000ffffff44000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1

getty uses ld.so.  The first vma is the code segment which is used by 43
other processes and the pages are evenly distributed over the 4 nodes.

The second vma is the process specific data portion for ld.so.  This is
only one page.

The display format is:

<startaddress>	 Links to information in /proc/<pid>/map
<memory policy>  This can be "default" "interleave={}", "prefer=<node>" or "bind={<zones>}"
MaxRef=		<maximum reference to a page in this vma>
Pages=		<Nr of pages in use>
Mapped=		<Nr of pages with mapcount >
Anon=		<nr of anonymous pages>
Nx=		<Nr of pages on Node x>

The content of the proc-file is self-evident.  If this would be tied into
the sparsemem system then the contents of this file would not be too
useful.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:43 -07:00
Hugh Dickins 5d337b9194 [PATCH] swap: swap_lock replace list+device
The idea of a swap_device_lock per device, and a swap_list_lock over them all,
is appealing; but in practice almost every holder of swap_device_lock must
already hold swap_list_lock, which defeats the purpose of the split.

The only exceptions have been swap_duplicate, valid_swaphandles and an
untrodden path in try_to_unuse (plus a few places added in this series).
valid_swaphandles doesn't show up high in profiles, but swap_duplicate does
demand attention.  However, with the hold time in get_swap_pages so much
reduced, I've not yet found a load and set of swap device priorities to show
even swap_duplicate benefitting from the split.  Certainly the split is mere
overhead in the common case of a single swap device.

So, replace swap_list_lock and swap_device_lock by spinlock_t swap_lock
(generally we seem to prefer an _ in the name, and not hide in a macro).

If someone can show a regression in swap_duplicate, then probably we should
add a hashlock for the swap_map entries alone (shorts being anatomic), so as
to help the case of the single swap device too.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:42 -07:00
Hugh Dickins 52b7efdbe5 [PATCH] swap: scan_swap_map drop swap_device_lock
get_swap_page has often shown up on latency traces, doing lengthy scans while
holding two spinlocks.  swap_list_lock is already dropped, now scan_swap_map
drop swap_device_lock before scanning the swap_map.

While scanning for an empty cluster, don't worry that racing tasks may
allocate what was free and free what was allocated; but when allocating an
entry, check it's still free after retaking the lock.  Avoid dropping the lock
in the expected common path.  No barriers beyond the locks, just let the
cookie crumble; highest_bit limit is volatile, but benign.

Guard against swapoff: must check SWP_WRITEOK before allocating, must raise
SWP_SCANNING reference count while in scan_swap_map, swapoff wait for that to
fall - just use schedule_timeout, we don't want to burden scan_swap_map
itself, and it's very unlikely that anyone can really still be in
scan_swap_map once swapoff gets this far.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:41 -07:00
Hugh Dickins 6eb396dc4a [PATCH] swap: swap unsigned int consistency
The swap header's unsigned int last_page determines the range of swap pages,
but swap_info has been using int or unsigned long in some cases: use unsigned
int throughout (except, in several places a local unsigned long is useful to
avoid overflows when adding).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:41 -07:00
Hugh Dickins 53092a7402 [PATCH] swap: show span of swap extents
The "Adding %dk swap" message shows the number of swap extents, as a guide to
how fragmented the swapfile may be.  But a useful further guide is what total
extent they span across (sometimes scarily large).

And there's no need to keep nr_extents in swap_info: it's unused after the
initial message, so save a little space by keeping it on stack.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:40 -07:00
Hugh Dickins 11d31886db [PATCH] swap: swap extent list is ordered
There are several comments that swap's extent_list.prev points to the lowest
extent: that's not so, it's extent_list.next which points to it, as you'd
expect.  And a couple of loops in add_swap_extent which go all the way through
the list, when they should just add to the other end.

Fix those up, and let map_swap_page search the list forwards: profiles shows
it to be twice as quick that way - because prefetch works better on how the
structs are typically kmalloc'ed?  or because usually more is written to than
read from swap, and swap is allocated ascendingly?

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:40 -07:00
Stephen Rothwell fd4fd5aac1 [PATCH] mm: consolidate get_order
Someone mentioned that almost all the architectures used basically the same
implementation of get_order.  This patch consolidates them into
asm-generic/page.h and includes that in the appropriate places.  The
exceptions are ia64 and ppc which have their own (presumably optimised)
versions.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:39 -07:00
Dave Hansen 28ae55c98e [PATCH] sparsemem extreme: hotplug preparation
This splits up sparse_index_alloc() into two pieces.  This is needed
because we'll allocate the memory for the second level in a different place
from where we actually consume it to keep the allocation from happening
underneath a lock

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:38 -07:00
Bob Picco 3e347261a8 [PATCH] sparsemem extreme implementation
With cleanups from Dave Hansen <haveblue@us.ibm.com>

SPARSEMEM_EXTREME makes mem_section a one dimensional array of pointers to
mem_sections.  This two level layout scheme is able to achieve smaller
memory requirements for SPARSEMEM with the tradeoff of an additional shift
and load when fetching the memory section.  The current SPARSEMEM
implementation is a one dimensional array of mem_sections which is the
default SPARSEMEM configuration.  The patch attempts isolates the
implementation details of the physical layout of the sparsemem section
array.

SPARSEMEM_EXTREME requires bootmem to be functioning at the time of
memory_present() calls.  This is not always feasible, so architectures
which do not need it may allocate everything statically by using
SPARSEMEM_STATIC.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:38 -07:00
Bob Picco 802f192e4a [PATCH] SPARSEMEM EXTREME
A new option for SPARSEMEM is ARCH_SPARSEMEM_EXTREME.  Architecture
platforms with a very sparse physical address space would likely want to
select this option.  For those architecture platforms that don't select the
option, the code generated is equivalent to SPARSEMEM currently in -mm.
I'll be posting a patch on ia64 ml which uses this new SPARSEMEM feature.

ARCH_SPARSEMEM_EXTREME makes mem_section a one dimensional array of
pointers to mem_sections.  This two level layout scheme is able to achieve
smaller memory requirements for SPARSEMEM with the tradeoff of an
additional shift and load when fetching the memory section.  The current
SPARSEMEM -mm implementation is a one dimensional array of mem_sections
which is the default SPARSEMEM configuration.  The patch attempts isolates
the implementation details of the physical layout of the sparsemem section
array.

ARCH_SPARSEMEM_EXTREME depends on 64BIT and is by default boolean false.

I've boot tested under aim load ia64 configured for ARCH_SPARSEMEM_EXTREME.
 I've also boot tested a 4 way Opteron machine with !ARCH_SPARSEMEM_EXTREME
and tested with aim.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:38 -07:00
Russell King 664399e1fb [ARM] Wrap calls to descriptor handlers
This is part of Thomas Gleixner's generic IRQ patch, which converts
ARM to use the generic IRQ subsystem.  Here, we wrap calls to
desc->handler() in an inline function, desc_handle_irq().  This
reduces the size of Thomas' patch since the changes become more
localised.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-09-04 19:45:00 +01:00
Russell King 7801907b8c [ARM] Change irq_chip wake/type methods to set_wake/set_type
This is part of Thomas Gleixner's generic IRQ patch, which converts
ARM to use the generic IRQ subsystem.  Here, we rename two of the
irq_chip methods - wake becomes set_wake, and type becomes set_type.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-09-04 19:43:13 +01:00
Pierre Ossman 865e9f13c9 [MMC] ios for mmc chip select
Adds a new ios for setting the chip select pin on MMC cards. Needed on
SD controllers which use this pin for other things and therefore cannot
have it pulled high at all times.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-09-03 16:45:02 +01:00
Rolf Eike Beer d51fe1be3f [PATCH] remove driverfs references from include/linux/cpu.h and net/sunrpc/rpc_pipe.c
This patch is against 2.6.10, but still applies cleanly. It's just
s/driverfs/sysfs/ in these two files.

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-02 00:57:31 -07:00
Greg Ungerer e70bd11601 [PATCH] m68knommu: need pfn_valid macro
Need pfn_valid macro, even on MMUless platforms.
Enclose the macro args of __pa and __va in parentheses.

Signed-off-by: Greg Ungerer <gerg@uclinux.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-02 00:57:30 -07:00
Linus Torvalds 138307b475 Merge HEAD from master.kernel.org:/home/rmk/linux-2.6-serial 2005-09-02 00:53:36 -07:00
Linus Torvalds 66f3767376 Merge HEAD from master.kernel.org:/home/rmk/linux-2.6-arm 2005-09-02 00:52:05 -07:00
Linus Torvalds 5d8c397f30 Merge refs/heads/ieee80211-wifi from master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-09-02 00:48:33 -07:00