Commit Graph

13408 Commits

Author SHA1 Message Date
Alexey Dobriyan a5c43dae7a Fix race between cat /proc/slab_allocators and rmmod
Same story as with cat /proc/*/wchan race vs rmmod race, only
/proc/slab_allocators want more info than just symbol name.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:08 -07:00
Alexey Dobriyan 9d65cb4a17 Fix race between cat /proc/*/wchan and rmmod et al
kallsyms_lookup() can go iterating over modules list unprotected which is OK
for emergency situations (oops), but not OK for regular stuff like
/proc/*/wchan.

Introduce lookup_symbol_name()/lookup_module_symbol_name() which copy symbol
name into caller-supplied buffer or return -ERANGE.  All copying is done with
module_mutex held, so...

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:08 -07:00
Alexey Dobriyan ea07890a68 Fix race between rmmod and cat /proc/kallsyms
module_get_kallsym() leaks "struct module *" outside of module_mutex which is
no-no, because module can dissapear right after mutex unlock.

Copy all needed information from inside module_mutex into caller-supplied
space.

[bunk@stusta.de: is_exported() can now become static]
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:08 -07:00
Alexey Dobriyan ae84e32470 Simplify module_get_kallsym() by dropping length arg
module_get_kallsym() could in theory truncate module symbol name to fit in
buffer, but nobody does this.  Always use KSYM_NAME_LEN + 1 bytes for name.

Suggested by lg^WRusty.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:08 -07:00
David Brownell 55955aad7c PNPACPI sets pnpdev->dev.archdata
Teach PNPACPI how to hook up its devices to their ACPI nodes, so that
pnpdev->dev.archdata points to the parallel acpi device node.  Previously
this only worked for PCI, leaving a notable hole.

Export "acpi_bus_type" so this can work.

Remove some extraneous whitespace.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Adam Belay <ambx1@neo.rr.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:08 -07:00
Ananth N Mavinakayanahalli 0f95b7fc83 Kprobes: print details of kretprobe on assertion failure
In certain cases like when the real return address can't be found or when
the number of tracked calls to a kretprobed function is less than the
number of returns, we may not be able to find the correct return address
after processing a kretprobe.  Currently we just do a BUG_ON, but no
information is provided about the actual failing kretprobe.

Print out details of the kretprobe before calling BUG().

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:08 -07:00
Simon Horman 6672f76a5a kdump/kexec: calculate note size at compile time
Currently the size of the per-cpu region reserved to save crash notes is
set by the per-architecture value MAX_NOTE_BYTES.  Which in turn is
currently set to 1024 on all supported architectures.

While testing ia64 I recently discovered that this value is in fact too
small.  The particular setup I was using actually needs 1172 bytes.  This
lead to very tedious failure mode where the tail of one elf note would
overwrite the head of another if they ended up being alocated sequentially
by kmalloc, which was often the case.

It seems to me that a far better approach is to caclculate the size that
the area needs to be.  This patch does just that.

If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X) is
needed then this should be as easy as making MAX_NOTE_BYTES larger in
arch/asm-ia64/kexec.h.  Perhaps 2048 would be a good choice.  However, I
think that the approach in this patch is a much more robust idea.

Acked-by:  Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Ken Chen 7328508274 remove artificial software max_loop limit
Remove artificial maximum 256 loop device that can be created due to a
legacy device number limit.  Searching through lkml archive, there are
several instances where users complained about the artificial limit that
the loop driver impose.  There is no reason to have such limit.

This patch rid the limit entirely and make loop device and associated block
queue instantiation on demand.  With on-demand instantiation, it also gives
the benefit of not wasting memory if these devices are not in use (compare
to current implementation that always create 8 loop devices), a net
improvement in both areas.  This version is both tested with creation of
large number of loop devices and is compatible with existing losetup/mount
user land tools.

There are a number of people who worked on this and provided valuable
suggestions, in no particular order, by:

Jens Axboe
Jan Engelhardt
Christoph Hellwig
Thomas M

Signed-off-by: Ken Chen <kenchen@google.com>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Jeremy Fitzhardinge 04c9167f91 add touch_all_softlockup_watchdogs()
Add touch_all_softlockup_watchdogs() to allow the softlockup watchdog
timers on all cpus to be updated.  This is used to prevent sysrq-t from
generating a spurious watchdog message when generating lots of output.

Softlockup watchdogs use sched_clock() as its timebase, which is inherently
per-cpu (at least, when it is measuring unstolen time).  Because of this,
it isn't possible for one CPU to directly update the other CPU's timers,
but it is possible to tell the other CPUs to do update themselves
appropriately.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Rick Lindsley <ricklind@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:06 -07:00
john stultz 8524070b79 Move timekeeping code to timekeeping.c
Move the timekeeping code out of kernel/timer.c and into
kernel/time/timekeeping.c.  I made no cleanups or other changes in transit.

[akpm@linux-foundation.org: build fix]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:06 -07:00
Eric Dumazet 329c8d84ca time: SMP friendly alignment of struct clocksource
struct clocksource is a critical data structure.

Most of its fields are read only, some of them are heavily modified at each
timer interrupt.

It makes sense to separate those fields and make sure they all share one
cache line, or at least the minimum for machines with small cache lines.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:06 -07:00
Ralf Baechle 3367b994fe <linux/sysdev.h> needs to include <linux/module.h>
sysdev.h uses THIS_MODULE so should include <linux/module.h>.

[akpm@linux-foundation.org: couple of fixes]
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:05 -07:00
Venki Pallipadi 28287033e1 Add a new deferrable delayed work init
Add a new deferrable delayed work init.  This can be used to schedule work
that are 'unimportant' when CPU is idle and can be called later, when CPU
eventually comes out of idle.

Use this init in cpufreq ondemand governor.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:05 -07:00
Venki Pallipadi 6e453a6751 Add support for deferrable timers
Introduce a new flag for timers - deferrable: Timers that work normally
when system is busy.  But, will not cause CPU to come out of idle (just to
service this timer), when CPU is idle.  Instead, this timer will be
serviced when CPU eventually wakes up with a subsequent non-deferrable
timer.

The main advantage of this is to avoid unnecessary timer interrupts when
CPU is idle.  If the routine currently called by a timer can wait until
next event without any issues, this new timer can be used to setup timer
event for that routine.  This, with dynticks, allows CPUs to be lazy,
allowing them to stay in idle for extended period of time by reducing
unnecesary wakeup and thereby reducing the power consumption.

This patch:

Builds this new timer on top of existing timer infrastructure.  It uses
last bit in 'base' pointer of timer_list structure to store this deferrable
timer flag.  __next_timer_interrupt() function skips over these deferrable
timers when CPU looks for next timer event for which it has to wake up.

This is exported by a new interface init_timer_deferrable() that can be
called in place of regular init_timer().

[akpm@linux-foundation.org: Privatise a #define]
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:05 -07:00
David Brownell c15a3837d2 parport->dev driver model support
Currently a parport_driver can't get a handle on the device node for the
underlying parport (PNPACPI, PCI, etc).  That prevents correct placement of
sysfs child nodes, which can affect things like power management.

This patch adds a field to "struct parport" pointing to that device node, and
updates non-legacy port drivers to initialize that device pointer.  That field
replaces the analagous PCI-only support in parport_pc.

[akpm@linux-foundation.org: fix powerpc build]
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:05 -07:00
Robert P. J. Day c467a388ae Delete unused header file linux/awe_voice.h
Delete the unused header file include/linux/awe_voice.h, as well as
its corresponding Kbuild entry.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:05 -07:00
Adrian Bunk e5f00f42f3 make remove_inode_dquot_ref() static
remove_inode_dquot_ref() can now become static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:05 -07:00
Robert P. J. Day 757dea93e1 Delete unused header file math-emu/extended.h
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:05 -07:00
Mark Fasheh ef51c97623 Remove do_sync_file_range()
Remove do_sync_file_range() and convert callers to just use
do_sync_mapping_range().

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:04 -07:00
Christoph Hellwig 1eeb66a1bb move die notifier handling to common code
This patch moves the die notifier handling to common code.  Previous
various architectures had exactly the same code for it.  Note that the new
code is compiled unconditionally, this should be understood as an appel to
the other architecture maintainer to implement support for it aswell (aka
sprinkling a notify_die or two in the proper place)

arm had a notifiy_die that did something totally different, I renamed it to
arm_notify_die as part of the patch and made it static to the file it's
declared and used at.  avr32 used to pass slightly less information through
this interface and I brought it into line with the other architectures.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
[bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-arch@vger.kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:04 -07:00
Eric W. Biederman 98a27ba485 tty: introduce no_tty and use it in selinux
While researching the tty layer pid leaks I found a weird case in selinux when
we drop a controlling tty because of inadequate permissions we don't do the
normal hangup processing.  Which is a problem if it happens the session leader
has exec'd something that can no longer access the tty.

We already have code in the kernel to handle this case in the form of the
TIOCNOTTY ioctl.  So this patch factors out a helper function that is the
essence of that ioctl and calls it from the selinux code.

This removes the inconsistency in handling dropping of a controlling tty and
who knows it might even make some part of user space happy because it received
a SIGHUP it was expecting.

In addition since this removes the last user of proc_set_tty outside of
tty_io.c proc_set_tty is made static and removed from tty.h

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:04 -07:00
Andrew Morton 6ae9200f2c enlarge console.name
console.name[] is eight chars, but so is "earlyvga".  So when we try to print
console->name when using earlyvga it runs off the end of the string.

Make it bigger.

Diagnosed-by: Gerd Hoffmann <kraxel@redhat.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:04 -07:00
Rusty Russell 9adef58b1d futex: get_futex_key, get_key_refs and drop_key_refs
lguest uses the convenient futex infrastructure for inter-domain I/O, so
expose get_futex_key, get_key_refs (renamed get_futex_key_refs) and
drop_key_refs (renamed drop_futex_key_refs).  Also means we need to expose the
union that these use.

No code changes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:03 -07:00
Klaus Kudielka 1a86b5e34e cyclades: remove custom types
Switch from private uclong, etc over to standard types.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:03 -07:00
Klaus Kudielka 7c4e95bf48 fix cyclades.h for x86_64 (and probably others)
At least on x86_64 the present cyclades.h is broken due to the wrong size
of uclong.  This affects, of course, both the kernel and the user-level
utilities.  The symptom is that cyzload refuses to load the firmware.  I
also managed to freeze the machine when unloading the module.

The patch below fixes this in an architecture-independent way.  I have
tested it with 2.6.19 and the driver works fine again with a Cyclades-Z on
an Athlon 64 X2.

[akpm@linux-foundation.org: fix warnings]

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:03 -07:00
Ananth N Mavinakayanahalli 9b3af29bf3 Kprobes: Make kprobe.symbol_name const
Kprobes doesn't scribble the kprobe.symbol_name field.  Its only set by the
module when registering the probe.  Modules that exercise good hygiene
using the "const" qualifier will see warnings...

	warning: assignment discards qualifiers from pointer target type

Make struct kprobe.symbol_name const char *

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:03 -07:00
Alan Cox 6de02123bf tty: i386/x86_64 arbitary speed support
Adds the needed TCGETS2/TCSETS2 ioctl calls, structures, defines and the like.
Tested against the test suite and passes.  Other platforms should need
roughly the same change.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:03 -07:00
Eric Dumazet c23fbb6bcb VFS: delay the dentry name generation on sockets and pipes
1) Introduces a new method in 'struct dentry_operations'.  This method
   called d_dname() might be called from d_path() to build a pathname for
   special filesystems.  It is called without locks.

   Future patches (if we succeed in having one common dentry for all
   pipes/sockets) may need to change prototype of this method, but we now
   use : char *d_dname(struct dentry *dentry, char *buffer, int buflen);

2) Adds a dynamic_dname() helper function that eases d_dname() implementations

3) Defines d_dname method for sockets : No more sprintf() at socket
   creation.  This is delayed up to the moment someone does an access to
   /proc/pid/fd/...

4) Defines d_dname method for pipes : No more sprintf() at pipe
   creation.  This is delayed up to the moment someone does an access to
   /proc/pid/fd/...

A benchmark consisting of 1.000.000 calls to pipe()/close()/close() gives a
*nice* speedup on my Pentium(M) 1.6 Ghz :

3.090 s instead of 3.450 s

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Christoph Hellwig <hch@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:03 -07:00
Alexey Dobriyan 7695650a92 Fix race between proc_get_inode() and remove_proc_entry()
proc_lookup				remove_proc_entry
===========				=================

lock_kernel();
spin_lock(&proc_subdir_lock);
[find PDE with refcount 0]
spin_unlock(&proc_subdir_lock);
					spin_lock(&proc_subdir_lock);
					[find PDE with refcount 0]
					[check refcount and free PDE]
					spin_unlock(&proc_subdir_lock);
proc_get_inode:
	de_get(de); /* boom */

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:01 -07:00
Miklos Szeredi 79c0b2df79 add filesystem subtype support
There's a slight problem with filesystem type representation in fuse
based filesystems.

From the kernel's view, there are just two filesystem types: fuse and
fuseblk.  From the user's view there are lots of different filesystem
types.  The user is not even much concerned if the filesystem is fuse based
or not.  So there's a conflict of interest in how this should be
represented in fstab, mtab and /proc/mounts.

The current scheme is to encode the real filesystem type in the mount
source.  So an sshfs mount looks like this:

  sshfs#user@server:/   /mnt/server    fuse   rw,nosuid,nodev,...

This url-ish syntax works OK for sshfs and similar filesystems.  However
for block device based filesystems (ntfs-3g, zfs) it doesn't work, since
the kernel expects the mount source to be a real device name.

A possibly better scheme would be to encode the real type in the type
field as "type.subtype".  So fuse mounts would look like this:

  /dev/hda1       /mnt/windows   fuseblk.ntfs-3g   rw,...
  user@server:/   /mnt/server    fuse.sshfs        rw,nosuid,nodev,...

This patch adds the necessary code to the kernel so that this can be
correctly displayed in /proc/mounts.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:01 -07:00
David Brownell 2e17c5508f init dma masks in pnp_dev
PNP now initializes device dma masks, which prevents oopses when generic
dma calls are made using pnp device nodes.

This assumes PNP only uses ISA DMA, with 24 bit addresses; and that it's
safe to init those masks for all devices (rather than finding out which
devices have been assigned DMA channels, and handling only those).

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Adam Belay <abelay@novell.com>
Cc: Jaroslav Kysela <perex@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:00 -07:00
Badari Pulavarty e3222c4ecc Merge sys_clone()/sys_unshare() nsproxy and namespace handling
sys_clone() and sys_unshare() both makes copies of nsproxy and its associated
namespaces.  But they have different code paths.

This patch merges all the nsproxy and its associated namespace copy/clone
handling (as much as possible).  Posted on container list earlier for
feedback.

- Create a new nsproxy and its associated namespaces and pass it back to
  caller to attach it to right process.

- Changed all copy_*_ns() routines to return a new copy of namespace
  instead of attaching it to task->nsproxy.

- Moved the CAP_SYS_ADMIN checks out of copy_*_ns() routines.

- Removed unnessary !ns checks from copy_*_ns() and added BUG_ON()
  just incase.

- Get rid of all individual unshare_*_ns() routines and make use of
  copy_*_ns() instead.

[akpm@osdl.org: cleanups, warning fix]
[clg@fr.ibm.com: remove dup_namespaces() declaration]
[serue@us.ibm.com: fix CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) retval]
[akpm@linux-foundation.org: fix build with CONFIG_SYSVIPC=n]
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: <containers@lists.osdl.org>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:00 -07:00
Monakhov Dmitriy 616883df78 IRQ: add __must_check to request_irq
This could help to find buggy drivers where request_irq return value wasn't
checked.  There's just no reason to ignore errors which can and do occur.
Anyone who got warning during compilation have to realise what it is't
realy safe code.

Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:00 -07:00
Robert P. J. Day c761c84154 kconfig: centralize the selection of semaphore debugging in lib/Kconfig.debug
Remove the Kconfig selection of semaphore debugging from the ALPHA and FRV
Kconfig files, and centralize it in lib/Kconfig.debug.

There doesn't seem to be much point in letting individual architectures
independently define the same Kconfig option when it can just as easily be
put in a single Kconfig file and made dependent on a subset of
architectures.  that way, at least the option shows up in the same relative
location in the menu each time.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:00 -07:00
Alexey Dobriyan fe08a9d498 reiserfs: shrink superblock if no xattrs
This makes in-core superblock fit into one cacheline here.

Before:
    struct dentry *            xattr_root;           /*   124     4 */
    /* --- cacheline 1 boundary (128 bytes) --- */
    struct rw_semaphore        xattr_dir_sem;        /*   128    12 */
    int                        j_errno;              /*   140     4 */
    }; /* size: 144, cachelines: 2 */
       /* sum members: 142, holes: 1, sum holes: 2 */
       /* last cacheline: 16 bytes */

After:
    int                        j_errno;              /*   124     4 */
    /* --- cacheline 1 boundary (128 bytes) --- */
    }; /* size: 128, cachelines: 1 */
       /* sum members: 126, holes: 1, sum holes: 2 */

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:00 -07:00
Michal Schmidt ee7b9e3706 Fix compilation of drivers with -O0
It is sometimes useful to compile individual drivers with optimization
disabled for easier debugging.  Currently drivers which use htonl() and
similar functions don't compile with -O0.  This patch fixes it.  It also
removes obsolete and misleading comments.  This header is not for
userspace, so we don't have to care about strange programs these comments
mention.

(akpm: -O0 probably isn't a good idea, but this code looks pretty crufty and
unuseful)

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:00 -07:00
Adrian Bunk 46595390e9 init/do_mounts.c: proper prepare_namespace() prototype
Add a proper protype for prepare_namespace() in include/linux/init.h.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:00 -07:00
Chris Snook 1ae7075bcd use use SEEK_MAX to validate user lseek arguments
Add SEEK_MAX and use it to validate lseek arguments from userspace.

Signed-off-by: Chris Snook <csnook@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:59 -07:00
Trent Piepho 8e2c20023f Fix constant folding and poor optimization in byte swapping code
Constant folding does not work for the swabXX() byte swapping functions,
and the C versions optimize poorly.

Attempting to initialize a global variable to swab16(0x1234) or put
something like "case swab32(42):" in a switch statement will not compile.
It can work, swab.h just isn't doing it correctly.  This patch fixes that.

Contrary to the comment in asm-i386/byteorder.h, gcc does not recognize the
"C" version of swab16 and turn it into efficient code.  gcc can do this,
just not with the current code.  The simple function:

u16 foo(u16 x) { return swab16(x); }

Would compile to:
        movzwl  %ax, %eax
        movl    %eax, %edx
        shrl    $8, %eax
        sall    $8, %edx
        orl     %eax, %edx

With this patch, it will compile to:
        rolw    $8, %ax

I also attempted to document the maze different macros/inline functions
that are used to create the final product.

Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Cc: Francois-Rene Rideau <fare@tunes.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:59 -07:00
Corey Minyard f64da958df ipmi: add new IPMI nmi watchdog handling
Convert over to the new NMI handling for getting IPMI watchdog timeouts via an
NMI.  This add config options to know if there is the ability to receive NMIs
and if it has an NMI post processing call.  Then it modifies the IPMI watchdog
to take advantage of this so that it can know if an NMI comes in.

It also adds testing that the IPMI NMI watchdog works.

Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:58 -07:00
William Cohen 97dc32cdb1 reduce size of task_struct on 64-bit machines
This past week I was playing around with that pahole tool
(http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the size
of various struct in the kernel.  I was surprised by the size of the
task_struct on x86_64, approaching 4K.  I looked through the fields in
task_struct and found that a number of them were declared as "unsigned
long" rather than "unsigned int" despite them appearing okay as 32-bit
sized fields.  On x86_64 "unsigned long" ends up being 8 bytes in size and
forces 8 byte alignment.  Is there a reason there a reason they are
"unsigned long"?

The patch below drops the size of the struct from 3808 bytes (60 64-byte
cachelines) to 3760 bytes (59 64-byte cachelines).  A couple other fields
in the task struct take a signficant amount of space:

struct thread_struct       thread;               688
struct held_lock           held_locks[30];       1680

CONFIG_LOCKDEP is turned on in the .config

[akpm@linux-foundation.org: fix printk warnings]
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:58 -07:00
Christoph Hellwig ab1b6f03a1 simplify the stacktrace code
Simplify the stacktrace code:

 - remove the unused task argument to save_stack_trace, it's always
   current
 - remove the all_contexts flag, it's alwasy 0

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@suse.de>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:58 -07:00
Guillaume Chazarain 3e9f45bd18 Factor outstanding I/O error handling
Cleanup: setting an outstanding error on a mapping was open coded too many
times.  Factor it out in mapping_set_error().

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:57 -07:00
Dmitriy Monakhov 0ceb331433 mm: move common segment checks to separate helper function
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org>
Cc: Christoph Hellwig <hch@lst.de>
Acked-by: Anton Altaparmakov <aia21@cam.ac.uk>
Acked-by: David Chinner <dgc@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:57 -07:00
David Woodhouse b46b8f19c9 Increase slab redzone to 64bits
There are two problems with the existing redzone implementation.

Firstly, it's causing misalignment of structures which contain a 64-bit
integer, such as netfilter's 'struct ipt_entry' -- causing netfilter
modules to fail to load because of the misalignment.  (In particular, the
first check in
net/ipv4/netfilter/ip_tables.c::check_entry_size_and_hooks())

On ppc32 and sparc32, amongst others, __alignof__(uint64_t) == 8.

With slab debugging, we use 32-bit redzones. And allocated slab objects
aren't sufficiently aligned to hold a structure containing a uint64_t.

By _just_ setting ARCH_KMALLOC_MINALIGN to __alignof__(u64) we'd disable
redzone checks on those architectures.  By using 64-bit redzones we avoid that
loss of debugging, and also fix the other problem while we're at it.

When investigating this, I noticed that on 64-bit platforms we're using a
32-bit value of RED_ACTIVE/RED_INACTIVE in the 64-bit memory location set
aside for the redzone.  Which means that the four bytes immediately before
or after the allocated object at 0x00,0x00,0x00,0x00 for LE and BE
machines, respectively.  Which is probably not the most useful choice of
poison value.

One way to fix both of those at once is just to switch to 64-bit
redzones in all cases.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:57 -07:00
Linus Torvalds a989705c4c Merge git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] update memory attribute aliasing documentation & test cases
  [IA64] fail mmaps that span areas with incompatible attributes
  [IA64] allow WB /sys/.../legacy_mem mmaps
  [IA64] make ioremap avoid unsupported attributes
  [IA64] rename ioremap variables to match i386
  [IA64] relax per-cpu TLB requirement to DTC
  [IA64] remove per-cpu ia64_phys_stacked_size_p8
  [IA64] Fix example error injection program
  [IA64] Itanium MC Error Injection Tool: pal_mc_error_inject() interface
  [IA64] Itanium MC Error Injection Tool: Makefile changes
  [IA64] Itanium MC Error Injection Tool: Driver sysfs interface
  [IA64] Itanium MC Error Injection Tool: Doc and sample application
  [IA64] Itanium MC Error Injection Tool: Kernel configuration
2007-05-07 12:34:57 -07:00
Linus Torvalds 2d56d3c43c Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux
* 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux:
  gfs2: nfs lock support for gfs2
  lockd: add code to handle deferred lock requests
  lockd: always preallocate block in nlmsvc_lock()
  lockd: handle test_lock deferrals
  lockd: pass cookie in nlmsvc_testlock
  lockd: handle fl_grant callbacks
  lockd: save lock state on deferral
  locks: add fl_grant callback for asynchronous lock return
  nfsd4: Convert NFSv4 to new lock interface
  locks: add lock cancel command
  locks: allow {vfs,posix}_lock_file to return conflicting lock
  locks: factor out generic/filesystem switch from setlock code
  locks: factor out generic/filesystem switch from test_lock
  locks: give posix_test_lock same interface as ->lock
  locks: make ->lock release private data before returning in GETLK case
  locks: create posix-to-flock helper functions
  locks: trivial removal of unnecessary parentheses
2007-05-07 12:34:24 -07:00
Linus Torvalds 5cefcab3db Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (34 commits)
  [GFS2] Uncomment sprintf_symbol calling code
  [DLM] lowcomms style
  [GFS2] printk warning fixes
  [GFS2] Patch to fix mmap of stuffed files
  [GFS2] use lib/parser for parsing mount options
  [DLM] Lowcomms nodeid range & initialisation fixes
  [DLM] Fix dlm_lowcoms_stop hang
  [DLM] fix mode munging
  [GFS2] lockdump improvements
  [GFS2] Patch to detect corrupt number of dir entries in leaf and/or inode blocks
  [GFS2] bz 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump)
  [DLM] fs/dlm/ast.c should #include "ast.h"
  [DLM] Consolidate transport protocols
  [DLM] Remove redundant assignment
  [GFS2] Fix bz 234168 (ignoring rgrp flags)
  [DLM] change lkid format
  [DLM] interface for purge (2/2)
  [DLM] add orphan purging code (1/2)
  [DLM] split create_message function
  [GFS2] Set drop_count to 0 (off) by default
  ...
2007-05-07 12:26:27 -07:00
Linus Torvalds 9fa0853a85 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: rfkill: add support for input key to control wireless radio
  [NET] net/core: Fix error handling
  [TG3]: Update version and reldate.
  [TG3]: Eliminate spurious interrupts.
  [TG3]: Add ASPM workaround.
  [Bluetooth] Correct SCO buffer for another Broadcom based dongle
  [Bluetooth] Add support for Targus ACB10US USB dongle
  [Bluetooth] Disconnect L2CAP connection after last RFCOMM DLC
  [Bluetooth] Check that device is in rfcomm_dev_list before deleting
  [Bluetooth] Use in-kernel sockets API
  [Bluetooth] Attach host adapters to the Bluetooth bus
  [Bluetooth] Fix L2CAP and HCI setsockopt() information leaks
2007-05-07 12:23:31 -07:00
Linus Torvalds ef93127e4c Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SERIAL] sunsu: Fix section mismatch warnings.
  [SPARC64]: pgtable_cache_init() should be __init.
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/prom.c
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/pci.c
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/console.c
  [MM]: sparse_init() should be __init.
  [SPARC64]: Update defconfig.
  [VIDEO]: Add Sun XVR-2500 framebuffer driver.
  [VIDEO]: Add Sun XVR-500 framebuffer driver.
  [SPARC64]: SUN4U PCI-E controller support.
  [SPARC]: Fix comment typo in smp4m_blackbox_current().
  [SCSI] SUNESP: sun_esp.c needs linux/delay.h

Fix up conflict in arch/sparc64/mm/init.c manually due to removal of
pgtable_cache_init() through the -mm patches (even though that patch was
also by David ;)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:22:48 -07:00