167924 Commits

Author SHA1 Message Date
KOSAKI Motohiro
ab8a3e14e6 mbind(): fix leak of never putback pages
If mbind() receives an invalid address, do_mbind leaks a page.  The
following test program detects this leak.

This patch fixes it.

migrate_efault.c
=======================================
 #include <numaif.h>
 #include <numa.h>
 #include <sys/mman.h>
 #include <stdio.h>
 #include <unistd.h>
 #include <stdlib.h>
 #include <string.h>

static unsigned long pagesize;

static void* make_hole_mapping(void)
{

	void* addr;

	addr = mmap(NULL, pagesize*3, PROT_READ|PROT_WRITE,
		    MAP_ANON|MAP_PRIVATE, 0, 0);
	if (addr == MAP_FAILED)
		return NULL;

	/* make page populate */
	memset(addr, 0, pagesize*3);

	/* make memory hole */
	munmap(addr+pagesize, pagesize);

	return addr;
}

int main(int argc, char** argv)
{
	void* addr;
	int ch;
	int node;
	struct bitmask *nmask = numa_allocate_nodemask();
	int err;
	int node_set = 0;

	while ((ch = getopt(argc, argv, "n:")) != -1){
		switch (ch){
		case 'n':
			node = strtol(optarg, NULL, 0);
			numa_bitmask_setbit(nmask, node);
			node_set = 1;
			break;
		default:
			;
		}
	}
	argc -= optind;
	argv += optind;

	if (!node_set)
		numa_bitmask_setbit(nmask, 0);

	pagesize = getpagesize();

	addr = make_hole_mapping();

	err = mbind(addr, pagesize*3, MPOL_BIND, nmask->maskp, nmask->size, MPOL_MF_MOVE_ALL);
	if (err)
		perror("mbind ");

	return 0;
}
=======================================

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:29 -07:00
Jeff Mahoney
47f365eb57 hfs: fix oops on mount with corrupted btree extent records
A particular fsfuzzer run caused an hfs file system to crash on mount.
This is due to a corrupted MDB extent record causing a miscalculation of
HFS_I(inode)->first_blocks for the extent tree.  If the extent records are
zereod out, it won't trigger the first_blocks special case.  Instead it
falls through to the extent code which we're still in the middle of
initializing.

This patch catches the 0 size extent records, reports the corruption, and
fails the mount.

Reported-by: Ramon de Carvalho Valle <rcvalle@linux.vnet.ibm.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:29 -07:00
Alexey Dobriyan
cf6e693212 loop: fix NULL dereference if mount fails
Commit bb21488482bd36eae6b30b014d93619063773fd4 ("[PATCH] switch loop")
started to pass NULL bdev to ioctl hook.

Steps to reproduce:

	[boot with loop.max_part=1]
	[mount -o loop something so mount fails]

BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
IP: [<ffffffff811486ee>] blkdev_ioctl+0x2e/0xa30
PGD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:35/ACPI0003:00/power_supply/ACAD/online
CPU 0
Modules linked in: zfs nvidia(P) [last unloaded: zfs]
Pid: 15177, comm: mount Tainted: P           2.6.32-rc4-zfs #2 Satellite X200
RIP: 0010:[<ffffffff811486ee>]  [<ffffffff811486ee>] blkdev_ioctl+0x2e/0xa30
RSP: 0018:ffff88003b3d5bb8  EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 000000000000125f RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88003b3d5ce8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 00007ffffffff000
R13: 0000000000000000 R14: ffff880071cef280 R15: 00000000000200da
FS:  00007fd77cfe7740(0000) GS:ffff880001600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000b8 CR3: 0000000001001000 CR4: 00000000000026f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mount (pid: 15177, threadinfo ffff88003b3d4000, task ffff88007572f920)
Stack:
 ffff88003b3d5c38 ffffffff812f95f5 ffff88007eeb6600 0000000000000000
<0> 0000000000000000 ffff88003b3d5c18 ffffffff811547d9 ffff88001bf11ef0
<0> 7fffffffffffffff ffff88001bf11ee8 ffff88001bf11ef0 0000000000000000
Call Trace:
 [<ffffffff812f95f5>] ? schedule_timeout+0x1f5/0x250
 [<ffffffff811547d9>] ? rb_insert_color+0x109/0x140
 [<ffffffff812fb754>] ? _spin_unlock_irq+0x14/0x40
 [<ffffffff812f84c6>] ? wait_for_common+0x66/0x170
 [<ffffffff8105a280>] ? default_wake_function+0x0/0x10
 [<ffffffff810f8258>] ioctl_by_bdev+0x38/0x50
 [<ffffffff811d2481>] loop_clr_fd+0x1e1/0x210
 [<ffffffff811d2522>] lo_release+0x72/0x80
 [<ffffffff810f934c>] __blkdev_put+0x1ac/0x1d0
 [<ffffffff810f937b>] blkdev_put+0xb/0x10
 [<ffffffff810f93b9>] blkdev_close+0x39/0x60
 [<ffffffff810ccef3>] __fput+0xd3/0x230
 [<ffffffff810cd06d>] fput+0x1d/0x30
 [<ffffffff810c9680>] filp_close+0x50/0x80
 [<ffffffff81061f11>] put_files_struct+0x81/0x100
 [<ffffffff81061fde>] exit_files+0x4e/0x60
 [<ffffffff81063ec5>] do_exit+0x6b5/0x730
 [<ffffffff8107b279>] ? up_read+0x9/0x10
 [<ffffffff8104c86e>] ? do_page_fault+0x18e/0x2a0
 [<ffffffff81063f81>] do_group_exit+0x41/0xc0
 [<ffffffff81064012>] sys_exit_group+0x12/0x20
 [<ffffffff81030deb>] system_call_fastpath+0x16/0x1b
Code: f8 48 89 e5 48 81 ec 30 01 00 00 48 89 5d d8 4c 89 6d e8 4c 89 65 e0 4c 89 75 f0 4c 89 7d f8 48 89 bd e8 fe ff ff 49 89 cd 89 f3 <49> 8b 88 b8 00 00 00 81 fa 68 12 00 00 0f 84 57 05 00 00 0f 86
RIP  [<ffffffff811486ee>] blkdev_ioctl+0x2e/0xa30
 RSP <ffff88003b3d5bb8>
CR2: 00000000000000b8
---[ end trace c0b4d3c3118d1427 ]---
Fixing recursive fault but reboot is needed!

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:27 -07:00
Wu Fengguang
41e20983fe vmscan: limit VM_EXEC protection to file pages
It is possible to have !Anon but SwapBacked pages, and some apps could
create huge number of such pages with MAP_SHARED|MAP_ANONYMOUS.  These
pages go into the ANON lru list, and hence shall not be protected: we only
care mapped executable files.  Failing to do so may trigger OOM.

Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:27 -07:00
Andrew Morton
b76146ed1a revert "mm: oom analysis: add buffer cache information to show_free_areas()"
Revert

    commit 71de1ccbe1fb40203edd3beb473f8580d917d2ca
    Author:     KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    AuthorDate: Mon Sep 21 17:01:31 2009 -0700
    Commit:     Linus Torvalds <torvalds@linux-foundation.org>
    CommitDate: Tue Sep 22 07:17:27 2009 -0700

        mm: oom analysis: add buffer cache information to show_free_areas()

show_free_areas() is called during page allocation failures, and page
allocation failures can occur in any calling context.

But nr_blockdev_pages() takes VFS locks which should not be taken from
hard IRQ context (at least).  The result is lockdep warnings (and
deadlockability) during page allocation failures.

Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:27 -07:00
Ben Hutchings
5c36fe3d87 hfsplus: refuse to mount volumes larger than 2TB
As found in <http://bugs.debian.org/550010>, hfsplus is using type u32
rather than sector_t for some sector number calculations.

In particular, hfsplus_get_block() does:

        u32 ablock, dblock, mask;
...
        map_bh(bh_result, sb, (dblock << HFSPLUS_SB(sb).fs_shift) + HFSPLUS_SB(sb).blockoffset + (iblock & mask));

I am not confident that I can find and fix all cases where a sector number
may be truncated.  For now, avoid data loss by refusing to mount HFS+
volumes with more than 2^32 sectors (2TB).

[akpm@linux-foundation.org: fix 32 and 64-bit issues]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:27 -07:00
Bartlomiej Zolnierkiewicz
b5654f5e7f MAINTAINERS: rt2x00 list is moderated
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:27 -07:00
Grant Likely
860c44c196 MAINTAINERS: add Open Firmware / Flattened Device Tree entry
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:27 -07:00
Joe Perches
c7c4fb18d0 MAINTAINERS: document new "K:" entry type
K: is for keyword.  Syntax is perl extended regex.

Reorganized header documentation and indent the section entry descriptions
so that the first K: would not be considered a regex to match by
get_maintainer.pl

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:26 -07:00
Joe Perches
dcf36a92f5 scripts/get_maintainer.pl: add patch/file search for keywords
Based on an idea from Wolfram Sang.

Add search for MAINTAINERS line "K:" regex pattern match in a patch or file
Matches are added after file pattern matches
Add --keywords command line switch (default 1, on)
Change version to 0.21

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:26 -07:00
Joe Perches
27480ccc29 MAINTAINERS: update WOLFSON MICROELECTRONICS
Integrate P:/M: lines
Remove L: linux-kernel@vger.kernel.org

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:26 -07:00
Joe Perches
b55ef929cb MAINTAINERS: fix up PERIPHERAL spelling
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Li Yang <leoli@freescale.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:26 -07:00
Joe Perches
364e9e1887 MAINTAINERS: WINBOND CIR - Integrate P:/M: lines, fixup David Härdeman's name
Signed-off-by: Joe Perches <joe@perches.com>
Cc: David Härdeman <david@hardeman.nu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:26 -07:00
Joe Perches
2bf822d79f MAINTAINERS: SIMPLE FIRMWARE INTERFACE: update email style
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:26 -07:00
Joe Perches
a2681a7542 MAINTAINERS: update SCORE architecture name style and add file pattern
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:26 -07:00
Joe Perches
ee709b0c62 MAINTAINERS: update Kernel Janitors after mismerge
Fix the mismerge of the W: URL and the S: status fields.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:26 -07:00
Joe Perches
7a241d6ecd MAINTAINERS: use tab not spaces after field types
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:26 -07:00
Joe Perches
476604de5a MAINTAINERS: change ATM mailing list to moderated
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:25 -07:00
Joe Perches
0e24bdd49d MAINTAINERS: update OMAP Tony Lindgren email name
Which had an embedded and duplicated email address

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:25 -07:00
Joe Perches
d6f005a108 MAINTAINERS: update TRACING section
Move to alphabetic position
Use single line F: entries

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:25 -07:00
Joe Perches
bda2562c34 MAINTAINERS: update GENERIC UIO FOR PCI DEVICES
Quote a name with a period
remove L: linux-kernel@vger.kernel.org

Signed-off-by: Joe Perches <joe@perches.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:25 -07:00
Roger Quadros
8753298a11 omap_hsmmc: add missing probe handler hook
The missing probe handler hook will never probe the driver. Add it back.
Fixes broken MMC on OMAP.

We use platform_driver_probe() API since omap_hsmmc is not a hot-pluggable
device.

Signed-off-by: Roger Quadros <ext-roger.quadros@nokia.com>
Tested-by: Felipe Contreras <felipe.contreras@gmail.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Felipe Contreras <felipe.contreras@gmail.com>
Cc: Denis Karpov <ext-denis.2.karpov@nokia.com>
Cc: Madhusudhan Chikkature <madhu.cr@ti.com>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:25 -07:00
KOSAKI Motohiro
0a1b71b400 strstrip(): mark as as must_check
strstrip() can return a modified value of its input argument, when
removing elading whitesapce.  So it is surely bug for this function's
return value to be ignored.  The caller is probably going to use the
incorrect original pointer.

So mark it __must_check to prevent this frm happening (as it has before).

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:25 -07:00
KOSAKI Motohiro
478988d3b2 cgroup: fix strstrip() misuse
cgroup_write_X64() and cgroup_write_string() ignore the return value of
strstrip().  it makes small inconsistent behavior.

example:
=========================
 # cd /mnt/cgroup/hoge
 # cat memory.swappiness
 60
 # echo "59 " > memory.swappiness
 # cat memory.swappiness
 59
 # echo " 58" > memory.swappiness
 bash: echo: write error: Invalid argument

This patch fixes it.

Cc: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:25 -07:00
KOSAKI Motohiro
58355c7876 congestion_wait(): don't use WRITE
commit 8aa7e847d (Fix congestion_wait() sync/async vs read/write
confusion) replace WRITE with BLK_RW_ASYNC.  Unfortunately, concurrent mm
development made the unchanged place accidentally.

This patch fixes it too.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:25 -07:00
Christian Borntraeger
0d0df599f1 connector: fix regression introduced by sid connector
Since commit 02b51df1b07b4e9ca823c89284e704cadb323cd1 (proc connector: add
event for process becoming session leader) we have the following warning:

Badness at kernel/softirq.c:143
[...]
Krnl PSW : 0404c00180000000 00000000001481d4 (local_bh_enable+0xb0/0xe0)
[...]
Call Trace:
([<000000013fe04100>] 0x13fe04100)
 [<000000000048a946>] sk_filter+0x9a/0xd0
 [<000000000049d938>] netlink_broadcast+0x2c0/0x53c
 [<00000000003ba9ae>] cn_netlink_send+0x272/0x2b0
 [<00000000003baef0>] proc_sid_connector+0xc4/0xd4
 [<0000000000142604>] __set_special_pids+0x58/0x90
 [<0000000000159938>] sys_setsid+0xb4/0xd8
 [<00000000001187fe>] sysc_noemu+0x10/0x16
 [<00000041616cb266>] 0x41616cb266

The warning is
--->    WARN_ON_ONCE(in_irq() || irqs_disabled());

The network code must not be called with disabled interrupts but
sys_setsid holds the tasklist_lock with spinlock_irq while calling the
connector.

After a discussion we agreed that we can move proc_sid_connector from
__set_special_pids to sys_setsid.

We also agreed that it is sufficient to change the check from
task_session(curr) != pid into err > 0, since if we don't change the
session, this means we were already the leader and return -EPERM.

One last thing:
There is also daemonize(), and some people might want to get a
notification in that case. Since daemonize() is only needed if a user
space does kernel_thread this does not look important (and there seems
to be no consensus if this connector should be called in daemonize). If
we really want this, we can add proc_sid_connector to daemonize() in an
additional patch (Scott?)

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Scott James Remnant <scott@ubuntu.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:25 -07:00
Hugh Dickins
370c28def6 hwpoison: fix/proc/meminfo alignment
Given such a long name, the kB count in /proc/meminfo's HardwareCorrupted
line is being shown too far right (it does align with x86_64's VmallocChunk
above, but I hope nobody will ever have that much corrupted!).  Align it.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:25 -07:00
Hugh Dickins
92f7ba70ee hwpoison: fix oops on ksm pages
Memory failure on a KSM page currently oopses on its NULL anon_vma in
page_lock_anon_vma(): that may not be much worse than the consequence of
ignoring it, but it is better to be consistent with how ZERO_PAGE and
hugetlb pages and other awkward cases are treated.  Just skip it.

We could fix it for 2.6.32 at the KSM end, by putting a dummy anon_vma
pointer in there; but that would get harder next time, when KSM will put a
pointer to something else there (and I'm not currently planning to do any
work to open that up to memory_failure).  So I would prefer this simple
PageKsm test, until the other exceptions are handled.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:24 -07:00
Randy Dunlap
2eca40a8cc cpufreq: add cpufreq_get() stub for CONFIG_CPU_FREQ=n
When CONFIG_CPU_FREQ is disabled, cpufreq_get() needs a stub.  Used by kvm
(although it looks like a bit of the kvm code could be omitted when
CONFIG_CPU_FREQ is disabled).

arch/x86/built-in.o: In function `kvm_arch_init':
(.text+0x10de7): undefined reference to `cpufreq_get'

(Needed in linux-next's KVM tree, but it's correct in 2.6.32).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Tested-by: Eric Paris <eparis@redhat.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:24 -07:00
Jens Axboe
592b09a42f backing-dev: ensure that a removed bdi no longer has super_block referencing it
When the bdi is being removed, we have to ensure that no super_blocks
currently have that cached in sb->s_bdi. Normally this is ensured by
the sb having a longer life span than the bdi, but if the device is
suddenly yanked, we have to kill this reference. sb->s_bdi is pointed
to freed memory at that point.

This fixes a problem with sync(1) hanging when a USB stick is pulled
without cleanly umounting it first.

Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-29 11:46:12 +01:00
Rusty Russell
3c7d76e371 param: fix setting arrays of bool
We create a dummy struct kernel_param on the stack for parsing each
array element, but we didn't initialize the flags word.  This matters
for arrays of type "bool", where the flag indicates if it really is
an array of bools or unsigned int (old-style).

Reported-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2009-10-29 08:56:20 +10:30
Rusty Russell
d553ad864e param: fix NULL comparison on oom
kp->arg is always true: it's the contents of that pointer we care about.

Reported-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2009-10-29 08:56:18 +10:30
Rusty Russell
65afac7d80 param: fix lots of bugs with writing charp params from sysfs, by leaking mem.
e180a6b7759a "param: fix charp parameters set via sysfs" fixed the case
where charp parameters written via sysfs were freed, leaving drivers
accessing random memory.

Unfortunately, storing a flag in the kparam struct was a bad idea: it's
rodata so setting it causes an oops on some archs.  But that's not all:

1) module_param_array() on charp doesn't work reliably, since we use an
   uninitialized temporary struct kernel_param.
2) there's a fundamental race if a module uses this parameter and then
   it's changed: they will still access the old, freed, memory.

The simplest fix (ie. for 2.6.32) is to never free the memory.  This
prevents all these problems, at cost of a memory leak.  In practice, there
are only 18 places where a charp is writable via sysfs, and all are
root-only writable.

Reported-by: Takashi Iwai <tiwai@suse.de>
Cc: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2009-10-29 08:56:17 +10:30
Michael S. Tsirkin
2d61ba9503 virtio: order used ring after used index read
On SMP guests, reads from the ring might bypass used index reads. This
causes guest crashes because host writes to used index to signal ring
data readiness.  Fix this by inserting rmb before used ring reads.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2009-10-29 08:50:37 +10:30
Michael S. Tsirkin
0b22bd0ba0 virtio-pci: fix per-vq MSI-X request logic
Commit f68d24082e22ccee3077d11aeb6dc5354f0ca7f1
in 2.6.32-rc1 broke requesting IRQs for per-VQ MSI-X vectors:
- vector number was used instead of the vector itself
- we try to request an IRQ for VQ which does not
  have a callback handler

This is a regression that causes warnings in kernel log,
potentially lower performance as we need to scan vq list,
and might cause system failure if the interrupt
requested is in fact needed by another system.

This was not noticed earlier because in most cases
we were falling back on shared interrupt for all vqs.

The warnings often look like this:

virtio-pci 0000:00:03.0: irq 26 for MSI/MSI-X
virtio-pci 0000:00:03.0: irq 27 for MSI/MSI-X
virtio-pci 0000:00:03.0: irq 28 for MSI/MSI-X
IRQ handler type mismatch for IRQ 1
current handler: i8042
Pid: 2400, comm: modprobe Tainted: G        W
2.6.32-rc3-11952-gf3ed8d8-dirty #1
Call Trace:
 [<ffffffff81072aed>] ? __setup_irq+0x299/0x304
 [<ffffffff81072ff3>] ? request_threaded_irq+0x144/0x1c1
 [<ffffffff813455af>] ? vring_interrupt+0x0/0x30
 [<ffffffff81346598>] ? vp_try_to_find_vqs+0x583/0x5c7
 [<ffffffffa0015188>] ? skb_recv_done+0x0/0x34 [virtio_net]
 [<ffffffff81346609>] ? vp_find_vqs+0x2d/0x83
 [<ffffffff81345d00>] ? vp_get+0x3c/0x4e
 [<ffffffffa0016373>] ? virtnet_probe+0x2f1/0x428 [virtio_net]
 [<ffffffffa0015188>] ? skb_recv_done+0x0/0x34 [virtio_net]
 [<ffffffffa00150d8>] ? skb_xmit_done+0x0/0x39 [virtio_net]
 [<ffffffff8110ab92>] ? sysfs_do_create_link+0xcb/0x116
 [<ffffffff81345cc2>] ? vp_get_status+0x14/0x16
 [<ffffffff81345464>] ? virtio_dev_probe+0xa9/0xc8
 [<ffffffff8122b11c>] ? driver_probe_device+0x8d/0x128
 [<ffffffff8122b206>] ? __driver_attach+0x4f/0x6f
 [<ffffffff8122b1b7>] ? __driver_attach+0x0/0x6f
 [<ffffffff8122a9f9>] ? bus_for_each_dev+0x43/0x74
 [<ffffffff8122a374>] ? bus_add_driver+0xea/0x22d
 [<ffffffff8122b4a3>] ? driver_register+0xa7/0x111
 [<ffffffffa001a000>] ? init+0x0/0xc [virtio_net]
 [<ffffffff81009051>] ? do_one_initcall+0x50/0x148
 [<ffffffff8106e117>] ? sys_init_module+0xc5/0x21a
 [<ffffffff8100af02>] ? system_call_fastpath+0x16/0x1b
virtio-pci 0000:00:03.0: irq 26 for MSI/MSI-X
virtio-pci 0000:00:03.0: irq 27 for MSI/MSI-X

Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Reported-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-29 08:50:36 +10:30
Dave Airlie
77de0846ae drm/kms: fix kms/fbdev colormap support properly.
This sets the fbcon to use TRUECOLOR by default, it then
only modifies the pseudo palette for fbcon, and only touches
the real palette when in 8-bit pseudo color mode.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-10-28 11:23:48 +10:00
Zhao Yakui
fcb4561144 drm: Add the basic check for the detailed timing in EDID
Sometimes we will get the incorrect display modeline when parsing the detailed
timing in EDID. For example:
   >hsync/vsync width is zero
   >sync is beyond the blank.

So add the basic check for the detailed timing in EDID to avoid the incorrect
display modeline.

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-10-28 11:23:39 +10:00
Dave Airlie
93239ea158 drm/radeon/kms: ignore vga arbiter return.
Since we register all radeon devices, and the arbiter only cares about
VGA class ones, we will fail to startup on display controller class devices.
We don't gain anything by using the return value here.

this helps kms on sparc64 get started.

Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-10-28 11:09:58 +10:00
Benjamin Herrenschmidt
40578fca24 Merge commit 'gcl/merge' into merge 2009-10-28 09:56:18 +11:00
Jesse Barnes
55a1098476 Revert "PCI: get larger bridge ranges when space is available"
This reverts commit 308cf8e13f42f476dfd6552aeff58fdc0788e566.  This
patch had trouble with transparent bridges, among other things.  A more
readable and correct version should land in 2.6.33.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-10-27 09:39:18 -07:00
Benjamin Herrenschmidt
4f917ba3d5 powerpc/ppc64: Use preempt_schedule_irq instead of preempt_schedule
Based on an original patch by Valentine Barshak <vbarshak@ru.mvista.com>

Use preempt_schedule_irq to prevent infinite irq-entry and
eventual stack overflow problems with fast-paced IRQ sources.

This kind of problems has been observed on the PASemi Electra IDE
controller. We have to make sure we are soft-disabled before calling
preempt_schedule_irq and hard disable interrupts after that
to avoid unrecoverable exceptions.

This patch also moves the "clrrdi r9,r1,THREAD_SHIFT" out of
the #ifdef CONFIG_PPC_BOOK3E scope, since r9 is clobbered
and has to be restored in both cases.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:43 +11:00
Kumar Gala
01deab98e3 powerpc: Minor cleanup to lib/Kconfig.debug
We don't need an explicit PPC64 in the DEBUG_PREEMPT dependancies as all
PPC platforms now support TRACE_IRQFLAGS_SUPPORT.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:42 +11:00
Kumar Gala
f8a3ae6c84 powerpc: Minor cleanup to sound/ppc/Kconfig
We can replace PPC32 || PPC64 as a dependancy with just PPC as all
powerpc platforms (32-bit and 64-bit) define PPC now.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:42 +11:00
Kumar Gala
022382a561 powerpc: Minor cleanup to init/Kconfig
We dont need to depend on PPC64 explicitly as all powerpc platforms
(32-bit and 64-bit) define PPC now.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:41 +11:00
Kumar Gala
ed84a07a12 powerpc: Limit memory hotplug support to PPC64 Book-3S machines
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:41 +11:00
Kumar Gala
0cd9ad73b8 powerpc: Limit hugetlbfs support to PPC64 Book-3S machines
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:41 +11:00
Kumar Gala
ce7a35c73a powerpc: Fix compile errors found by new ppc64e_defconfig
Fix the following 3 issues:

arch/powerpc/kernel/process.c: In function 'arch_randomize_brk':
arch/powerpc/kernel/process.c:1183: error: 'mmu_highuser_ssize' undeclared (first use in this function)
arch/powerpc/kernel/process.c:1183: error: (Each undeclared identifier is reported only once
arch/powerpc/kernel/process.c:1183: error: for each function it appears in.)
arch/powerpc/kernel/process.c:1183: error: 'MMU_SEGSIZE_1T' undeclared (first use in this function)

In file included from arch/powerpc/kernel/setup_64.c:60:
arch/powerpc/include/asm/mmu-hash64.h:132: error: redefinition of 'struct mmu_psize_def'
arch/powerpc/include/asm/mmu-hash64.h:159: error: expected identifier or '(' before numeric constant
arch/powerpc/include/asm/mmu-hash64.h:396: error: conflicting types for 'mm_context_t'
arch/powerpc/include/asm/mmu-book3e.h:184: error: previous declaration of 'mm_context_t' was here

cc1: warnings being treated as errors
arch/powerpc/kernel/pci_64.c: In function 'pcibios_unmap_io_space':
arch/powerpc/kernel/pci_64.c💯 error: unused variable 'res'

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:41 +11:00
Kumar Gala
fafbe983d9 powerpc: Add a Book-3E 64-bit defconfig
This defconfig's purpose at this time is to help catch compile errors
between Book-3S and Book-3E support in ppc64.  It is based on the
ppc64_defconfig with some things disabled that we dont support on
Book-3E right now (hugetlbfs, slices, etc.)

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:41 +11:00
Josh Boyer
cdd3904dcc powerpc/booke: Fix xmon single step on PowerPC Book-E
Prior to the arch/ppc -> arch/powerpc transition, xmon had support for single
stepping on 4xx boards.  The functionality was lost when arch/ppc was removed.
This patch restores single step support for 44x boards, and Book-E in general.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:40 +11:00
Andreas Schwab
348aa30300 powerpc: Align vDSO base address
The ABI specifies a 64K alignment, we need to map the vDSO accordingly

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:40 +11:00