Commit Graph

97326 Commits

Author SHA1 Message Date
Ingo Molnar 7e09b2a02d x86: fix canary of the boot CPU's idle task
the boot CPU's idle task has a zero stackprotector canary value.

this is a special task that is never forked, so the fork code
does not randomize its canary. Do it when we hit cpu_idle().

Academic sidenote: this means that the early init code runs with a
zero canary and hence the canary becomes predictable for this short,
boot-only amount of time.

Although attack vectors against early init code are very rare, it might
make sense to move this initialization to an earlier point.
(to one of the early init functions that never return - such as
start_kernel())

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-26 16:15:31 +02:00
Arjan van de Ven ce22bd92cb x86: setup stack canary for the idle threads
The idle threads for non-boot CPUs are a bit special in how they
are created; the result is that these don't have the stack canary
set up properly in their PDA. Easiest fix is to just always set
the PDA up correctly when entering the idle thread; this is a NOP
for the boot cpu.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-26 16:15:31 +02:00
Ingo Molnar e00320875d x86: fix stackprotector canary updates during context switches
fix a bug noticed and fixed by pageexec@freemail.hu.

if built with -fstack-protector-all then we'll have canary checks built
into the __switch_to() function. That does not work well with the
canary-switching code there: while we already use the %rsp of the
new task, we still call __switch_to() whith the previous task's canary
value in the PDA, hence the __switch_to() ssp prologue instructions
will store the previous canary. Then we update the PDA and upon return
from __switch_to() the canary check triggers and we panic.

so update the canary after we have called __switch_to(), where we are
at the same stackframe level as the last stackframe of the next
(and now freshly current) task.

Note: this means that we call __switch_to() [and its sub-functions]
still with the old canary, but that is not a problem, both the previous
and the next task has a high-quality canary. The only (mostly academic)
disadvantage is that the canary of one task may leak onto the stack of
another task, increasing the risk of information leaks, were an attacker
able to read the stack of specific tasks (but not that of others).

To solve this we'll have to reorganize the way we switch tasks, and move
the PDA setting into the switch_to() assembly code. That will happen in
another patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-26 16:15:31 +02:00
Ingo Molnar 4c7f8900f1 x86: stackprotector & PARAVIRT fix
on paravirt enabled 64-bit kernels the paravirt ops do function
calls themselves - which is bad with the stackprotector - for
example pda_init() loads 0 into %gs and then does MSR_GS_BASE
write (which modifies gs.base) - but that MSR write is a function
call on paravirt, which with stackprotector tries to read the
stack canary from the PDA ... crashing the bootup.

the solution was suggested by Arjan van de Ven: to exclude paravirt.c
from stackprotector, too many lowlevel functionality is in it. It's
not like we'll have paravirt functions with character arrays on
their stack anyway...

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-26 16:15:31 +02:00
Linus Torvalds b3733034f1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
  Kconfig: introduce ARCH_DEFCONFIG to DEFCONFIG_LIST
  .gitignore: match ncscope.out
  scripts/ver_linux use 'gcc -dumpversion'
2008-05-25 15:00:27 -07:00
Linus Torvalds c8ff99a7c2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  [WATCHDOG] Add ICH9DO into the iTCO_wdt.c driver
  [WATCHDOG] Fix booke_wdt.c on MPC85xx SMP system's
  [WATCHDOG] Add a watchdog driver based on the CS5535/CS5536 MFGPT timers
  [WATCHDOG] hpwdt: Fix NMI handling.
  [WATCHDOG] Blackfin Watchdog Driver: split platform device/driver
  [WATCHDOG] Add w83697h_wdt early_disable option
  [WATCHDOG] Make w83697h_wdt timeout option string similar to others
  [WATCHDOG] Make w83697h_wdt void-like functions void
2008-05-25 14:59:59 -07:00
Linus Torvalds 32522bfdae Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  [ALSA] hda - Fix capture mute Widget for stac9250/9251
  [ALSA] snd-pcsp - fix pcsp_treble_info() to honour an item number
  [ALSA] hda - Added support for Foxconn P35AX-S mainboard
  [ALSA] hda - Fix COEF and EAPD in ALC889 auto-configuration mode
  [ALSA] hda - Fix noise on VT1708 codec
  [ALSA] hda - Add model for ASUS P5K-E/WIFI-AP
2008-05-25 14:59:27 -07:00
Sam Ravnborg 73531905ed Kconfig: introduce ARCH_DEFCONFIG to DEFCONFIG_LIST
init/Kconfig contains a list of configs that are searched
for if 'make *config' are used with no .config present.
Extend this list to look at the config identified by
ARCH_DEFCONFIG.

With this change we now try the defconfig targets last.

This fixes a regression reported
by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2008-05-25 23:03:18 +02:00
Jike Song 9723c046bd .gitignore: match ncscope.out
Sometimes I got this:

    $ git-status
    {snip}
    # On branch master
    # Untracked files:
    #   (use "git add <file>..." to include in what will be committed)
    #
    #       ncscope.out
    nothing added to commit but untracked files present (use "git add"
to track)

Fix it.

Signed-off-by: Jike Song <albcamus@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-05-25 23:02:48 +02:00
Gabriel C 656a3f7978 scripts/ver_linux use 'gcc -dumpversion'
These magic greps and hacks in ver_linux to get the gcc version always break after some gcc releases.

Since now gcc >4.3 allows compiling with '--with-pkgversion' ( which can be everything 'My Cool Gcc' or something )
ver_linux will report random junk for these.

Simply use 'gcc -dumpversion' to get the gcc version which should always work.

Signed-off-by: Gabriel C <nix.or.die@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-05-25 23:02:43 +02:00
Mauro Carvalho Chehab 587755f1f6 [ALSA] hda - Fix capture mute Widget for stac9250/9251
Fix capture mute widget for STAC9250/9251 codecs.  The widget 0x09
has no mute but 0x14 does actually.

Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-05-25 18:21:18 +02:00
Stas Sergeev 97e08f5d73 [ALSA] snd-pcsp - fix pcsp_treble_info() to honour an item number
This solves the problem with mixers wrongly displaying the PWM freq.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-05-25 18:21:10 +02:00
Gabriel C a49056da03 [WATCHDOG] Add ICH9DO into the iTCO_wdt.c driver
Add the Intel ICH9DO controller ID's for the iTCO_wdt kernel driver and bump
the driver version.

Tested on an P5E-VM DO ASUS motherboard.

Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-05-25 09:45:39 +00:00
Chen Gong f172ddc61a [WATCHDOG] Fix booke_wdt.c on MPC85xx SMP system's
On Book-E SMP systems each core has its own private watchdog.  If only one
watchdog is enabled, when the core that doesn't enable the watchdog is hung,
system can't reset because no watchdog is running on it.  That's bad.  It
means we must enable watchdogs on both cores.

We can use smp_call_function() to send appropriate messages to all the other
cores to enable and update the watchdog.

Signed-off-by: Chen Gong <g.chen@freescale.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-05-25 09:43:06 +00:00
Jordan Crouse 0b36086b5d [WATCHDOG] Add a watchdog driver based on the CS5535/CS5536 MFGPT timers
Add a watchdog timer based on the MFGPT timers in the CS5535/CS5536
companion chips to the AMD Geode GX and LX processors.  Only caveat
is that the BIOS must provide at least a one free timer, and most
do not.

Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2008-05-25 09:02:17 +00:00
Mingarelli, Thomas 7f7f894c6d [WATCHDOG] hpwdt: Fix NMI handling.
I need to just return in case it's not my NMI so someone else can take a look
at it (and reset die_nmi_called to 0 in case I actually do get one that's mine
to handle).

Signed-off-by: Thomas Mingarelli <thomas.mingarelli@hp.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2008-05-25 09:01:48 +00:00
Mike Frysinger 93539b1946 [WATCHDOG] Blackfin Watchdog Driver: split platform device/driver
- split platform device/driver registering from actual watchdog device/driver
   registering so that we can cleanly load/unload
 - fixup __initdata with __initconst and __devinitdata with __devinitconst

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2008-05-25 09:01:36 +00:00
Samuel Tardieu 6fd656012b [WATCHDOG] Add w83697h_wdt early_disable option
Pádraig Brady requested the possibility of not disabling the watchdog
at module load time or kernel boot time if it had been previously enabled
in the bios. It may help rebooting the machine if it freezes before the
userland daemon kicks in.

Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Cc: Pádraig Brady <P@draigBrady.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2008-05-25 09:00:51 +00:00
Samuel Tardieu 5794a9f412 [WATCHDOG] Make w83697h_wdt timeout option string similar to others
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2008-05-25 09:00:49 +00:00
Samuel Tardieu 03315adca7 [WATCHDOG] Make w83697h_wdt void-like functions void
Some non-exported functions always returned 0. Mark them void instead.

Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2008-05-25 09:00:47 +00:00
Linus Torvalds eb90d81d03 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
  x86: prevent PGE flush from interruption/preemption
  x86: use explicit copy in vdso_gettimeofday()
  namespacecheck: automated fixes
  x86/xen: fix arbitrary_virt_to_machine()
  x86: don't read maxlvt before checking if APIC is mapped
  x86: disable TSC for sched_clock() when calibration failed
  x86: distangle user disabled TSC from unstable
  x86: fix setup of cyc2ns in tsc_64.c
2008-05-24 10:20:00 -07:00
Linus Torvalds d3c5f8b93f Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] integrator: fix build warnings and errors
  [ARM] fix OMAP include loops
  Revert "[ARM] pxa: spitz wants PXA27x UDC definitions"
  [ARM] 5053/1: define before use of processor_id
  [ARM] 5052/1: export clock functions for the at91x40
  [ARM] 5051/1: define pgtable_t for the !CONFIG_MMU case too
  [ARM] omap: fix omap clk support build errors
  [ARM] 5039/1: S3C244X: Rename SDI device if running on S3C244X.
  [ARM] 5043/1: pxafb: remove unused mode variable in pxafb_init_fbinfo
  [ARM] 5041/1: VR1000: Fix DM9000 IRQ flags initialisation
  [ARM] 5040/1: BAST: Fix DM9000 IRQ flags initialisation
  [ARM] 5038/1: ARM: OMAP: Remove tsc2102 references from board-palmte.c
  [ARM] 5025/2: fix collie cpu initialisation
2008-05-24 10:13:16 -07:00
David Brownell 25d5cb4b03 spi: remove some spidev oops-on-rmmod paths
Somehow the spidev code forgot to include a critical mechanism: when the
underlying device is removed (e.g.  spi_master rmmod), open file
descriptors must be prevented from issuing new I/O requests to that
device.  On penalty of the oopsing reported by Sebastian Siewior
<bigeasy@tglx.de> ...

This is a partial fix, adding handshaking between the lower level (SPI
messaging) and the file operations using the spi_dev.  (It also fixes an
issue where reads and writes didn't return the number of bytes sent or
received.)

There's still a refcounting issue to be addressed (separately).

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Reported-by: Sebastian Siewior <bigeasy@tglx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:14 -07:00
Cedric Le Goater 5c02b57578 cgroups: remove node_ prefix_from ns subsystem
This is a slight change in the namespace cgroup subsystem api.

The change is that previously when cgroup_clone() was called (currently
only from the unshare path in ns_proxy cgroup, you'd get a new group named
"node_$pid" whereas now you'll get a group named after just your pid.)

The only users who would notice it are those who are using the ns_proxy
cgroup subsystem to auto-create cgroups when namespaces are unshared -
something of an experimental feature, which I think really needs more
complete container/namespace support in order to be useful.  I suspect the
only users are Cedric and Serge, or maybe a few others on
containers@lists.linux-foundation.org.  And in fact it would only be
noticed by the users who make the assumption about how the name is
generated, rather than getting it from the /proc/<pid>/cgroups file for
the process in question.

Whether the change is actually needed or not I'm fairly agnostic on, but I
guess it is more elegant to just use the pid as the new group name rather
than adding a fairly arbitrary "node_" prefix on the front.

[menage@google.com: provided changelog]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: "Paul Menage" <menage@google.com>
Cc: "Serge E. Hallyn" <serue@us.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:14 -07:00
Fernando Luis Vazquez Cao 12d15f0d51 for_each_online_pgdat(): kerneldoc fix
for_each_pgdat() was renamed to for_each_online_pgdat() and kerneldoc
comments should be updated accordingly.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:13 -07:00
Adrian Bunk fb56f0f992 frv: export empty_zero_page
Fix the following build error:

ERROR: "empty_zero_page" [fs/ext4/ext4dev.ko] undefined!

Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:13 -07:00
Shi Weihua 7b26655f62 sys_prctl(): fix return of uninitialized value
If none of the switch cases match, the PR_SET_PDEATHSIG and
PR_SET_DUMPABLE cases of the switch statement will never write to local
variable `error'.

Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com>
Cc: Andrew G. Morgan <morgan@kernel.org>
Acked-by: "Serge E. Hallyn" <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:13 -07:00
Kumar Gala f99c90094b edac: mpc85xx: fix building as a module
including of <asm/mpc85xx.h> causes build problems since it doesn't exist.

Also removed warning:
drivers/edac/mpc85xx_edac.c:45: warning: 'mpc85xx_ctl_name' defined but not used

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Acked-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:13 -07:00
David Brownell 6ea0205b56 gpio: build fixes
This fixes various gpio-related build errors (mostly potential)
reported in part by Russell King and Uwe Kleine-König.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:13 -07:00
Ben Dooks ee29420aca S3C2410: fix driver MODULE_ALIAS()
Add a correct MODULE_ALIAS() entry for this driver to enable udev module
loading.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:13 -07:00
Ben Dooks 6a0e4ec7bc S3C2410: clean out changelog header and tidy
Remove the old changelog entries which are now out of date and should be
extractable from git anyway.  Also tidy up the copyright for the driver.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:13 -07:00
Ben Dooks d585dfe840 S3C2410: add error print if we cannot add attribute
Fix the following warning by checking the result of device_create_file and
printing an error but not removing the device (loss of debug registers is
not fatal).

drivers/video/s3c2410fb.c:905: warning: ignoring return value of 'device_create_file', declared with attribute warn_unused_result

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:12 -07:00
Ben Dooks 673b4600e3 S3C2410: ensure that FB_BLANK_POWERDOWN shuts down the controller
When a blank level of FB_BLANK_POWERDOWN is used, we should shut down the
controller so that it no longer tries to produce any panel signals or
data, and shuts down the DMA which is not needed.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:12 -07:00
Ben Dooks cdc83ae245 SM501: reverse FPEN/VBIASEN flags behaviour
To keep backwards compatibility, reverse the meanings of these flags so
that when they are not set, the driver uses the original behvaiour.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:12 -07:00
Heiko Carstens cd94b9dbfa memory hotplug: fix early allocation handling
Trying to add memory via add_memory() from within an initcall function
results in

bootmem alloc of 163840 bytes failed!
Kernel panic - not syncing: Out of memory

This is caused by zone_wait_table_init() which uses system_state to decide
if it should use the bootmem allocator or not.

When initcalls are handled the system_state is still SYSTEM_BOOTING but
the bootmem allocator doesn't work anymore.  So the allocation will fail.

To fix this use slab_is_available() instead as indicator like we do it
everywhere else.

[akpm@linux-foundation.org: coding-style fix]
Reviewed-by: Andy Whitcroft <apw@shadowen.org>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:12 -07:00
Andy Whitcroft 7eb54824b7 zonelists: handle a node zonelist with no applicable entries
When booting 2.6.26-rc3 on a multi-node x86_32 numa system we are seeing
panics when trying node local allocations:

 BUG: unable to handle kernel NULL pointer dereference at 0000034c
 IP: [<c1042507>] get_page_from_freelist+0x4a/0x18e
 *pdpt = 00000000013a7001 *pde = 0000000000000000
 Oops: 0000 [#1] SMP
 Modules linked in:

 Pid: 0, comm: swapper Not tainted (2.6.26-rc3-00003-g5abc28d #82)
 EIP: 0060:[<c1042507>] EFLAGS: 00010282 CPU: 0
 EIP is at get_page_from_freelist+0x4a/0x18e
 EAX: c1371ed8 EBX: 00000000 ECX: 00000000 EDX: 00000000
 ESI: f7801180 EDI: 00000000 EBP: 00000000 ESP: c1371ec0
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
 Process swapper (pid: 0, ti=c1370000 task=c12f5b40 task.ti=c1370000)
 Stack: 00000000 00000000 00000000 00000000 000612d0 000412d0 00000000 000412d0
        f7801180 f7c0101c f7c01018 c10426e4 f7c01018 00000001 00000044 00000000
        00000001 c12f5b40 00000001 00000010 00000000 000412d0 00000286 000412d0
 Call Trace:
  [<c10426e4>] __alloc_pages_internal+0x99/0x378
  [<c10429ca>] __alloc_pages+0x7/0x9
  [<c105e0e8>] kmem_getpages+0x66/0xef
  [<c105ec55>] cache_grow+0x8f/0x123
  [<c105f117>] ____cache_alloc_node+0xb9/0xe4
  [<c105f427>] kmem_cache_alloc_node+0x92/0xd2
  [<c122118c>] setup_cpu_cache+0xaf/0x177
  [<c105e6ca>] kmem_cache_create+0x2c8/0x353
  [<c13853af>] kmem_cache_init+0x1ce/0x3ad
  [<c13755c5>] start_kernel+0x178/0x1ee

This occurs when we are scanning the zonelists looking for a ZONE_NORMAL
page.  In this system there is only ZONE_DMA and ZONE_NORMAL memory on
node 0, all other nodes are mapped above 4GB physical.  Here is a dump
of the zonelists from this system:

    zonelists pgdat=c1400000
     0: c14006c0:2 f7c006c0:2 f7e006c0:2 c1400360:1 c1400000:0
     1: c14006c0:2 c1400360:1 c1400000:0
    zonelists pgdat=f7c00000
     0: f7c006c0:2 f7e006c0:2 c14006c0:2 c1400360:1 c1400000:0
     1: f7c006c0:2
    zonelists pgdat=f7e00000
     0: f7e006c0:2 c14006c0:2 f7c006c0:2 c1400360:1 c1400000:0
     1: f7e006c0:2

When performing a node local allocation we call get_page_from_freelist()
looking for a page.  It in turn calls first_zones_zonelist() which returns
a preferred_zone.  Where there are no applicable zones this will be NULL.
However we use this unconditionally, leading to this panic.

Where there are no applicable zones there is no possibility of a successful
allocation, so simply fail the allocation.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:11 -07:00
Arjan van de Ven 03a74dcc7e serial: fix enable_irq_wake/disable_irq_wake imbalance in serial_core.c
enable_irq_wake() and disable_irq_wake() need to be balanced.  However,
serial_core.c calls these for different conditions during the suspend and
resume functions...

This is causing a regular WARN_ON() as found at
http://www.kerneloops.org/search.php?search=set_irq_wake

This patch makes the conditions for triggering the _wake enable/disable
sequence identical.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:11 -07:00
Denis V. Lunev c4185a0e01 proc: proc_get_inode() should get module only once
Any file under /proc/net opened more than once leaked the refcounter
on the module it belongs to.

The problem is that module_get is called for each file opening while
module_put is called only when /proc inode is destroyed. So, lets put
module counter if we are dealing with already initialised inode.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=10737

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: David Miller <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Robert Olsson <robert.olsson@its.uu.se>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Reported-by: Roland Kletzing <devzero@web.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:11 -07:00
Marcin Krol 53978d0a7a brd: don't show ramdisks in /proc/partitions
In 2.6.25, ramdisk devices show up in /proc/partitions, which is a
behaviour change from the old rd.c.  Add GENHD_FL_SUPPRESS_PARTITION_INFO,
which was present in rd.c.

All kernels prior to 2.6.25 weren't displaying ramdisks in
/proc/partitions.  Since there are many userspace tools using information
from /proc/partitions some of them may now behave incorrectly (I didn't
tested any though).  For example before 2.6.25 /proc/partitions was empty
if no block devices like hard disks and such were detected by kernel.  Now
all 16 ramdisks are always visible there.  Some software may rely on such
information (I mean, on empty /proc/partitions).

There was quite similar situation back in 2004, and ramdisks were excluded
back from displaying.  Thats why I called this a regression (maybe a bit
unfortunate).  See this patch for info:
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.3-rc2/2.6.3-rc2-mm1/broken-out/nbd-proc-partitions-fix.patch

I also think that someone somewhere (long time ago) excluded ramdisks from
/proc/partitions for good reasons.  It is possible that now such new
"feature" is harmless, but I think there are more chances that someone
will say "hey, /proc/partitions has changed, now my software doesn't work"
then "hey where did my new 2.6.25 feature go".  nbd devices are also
excluded, maybe for very same (unknown to me) reasons.

Signed-off-by: Marcin Krol <hawk@pld-linux.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:11 -07:00
Alan Cox 6089093e58 ip2: fix crashes on load/unload
This doesn't need to be two modules, and making it one cleans up the
problem

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:11 -07:00
Trent Piepho bff5fda972 gpiolib: fix off by one errors
The last gpio belonging to a chip is chip->base + chip->ngpios - 1.  Some
places in the code, but not all, forgot the critical minus one.

Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:11 -07:00
Roel Kluin 1d1c1d9b55 gpio: mcp23s08 debug fix
The return value of mcp23s08_read_regs() can only be evaluated when signed

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:11 -07:00
David Brownell 69292b3421 gpio: pca953x driver handles pca9554 too
Teach drivers/gpio/pca953x.c about PCA9554, another compatible chip.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:11 -07:00
Oleg Nesterov da7978b034 signals: fix sigqueue_free() vs __exit_signal() race
__exit_signal() does flush_sigqueue(tsk->pending) outside of ->siglock.
This can race with another thread doing sigqueue_free(), we can free the
same SIGQUEUE_PREALLOC sigqueue twice or corrupt the pending->list.

Note that even sys_exit_group() can trigger this race, not only
sys_timer_delete().

Move the callsite of flush_sigqueue(tsk->pending) under ->siglock.

This patch doesn't touch flush_sigqueue(->shared_pending) below, it is
called when there are no other threads which can play with signals, and
sigqueue_free() can't be used outside of our thread group.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:10 -07:00
NeilBrown dfc7064500 md: restart recovery cleanly after device failure.
When we get any IO error during a recovery (rebuilding a spare), we abort
the recovery and restart it.

For RAID6 (and multi-drive RAID1) it may not be best to restart at the
beginning: when multiple failures can be tolerated, the recovery may be
able to continue and re-doing all that has already been done doesn't make
sense.

We already have the infrastructure to record where a recovery is up to
and restart from there, but it is not being used properly.
This is because:
  - We sometimes abort with MD_RECOVERY_ERR rather than just MD_RECOVERY_INTR,
    which causes the recovery not be be checkpointed.
  - We remove spares and then re-added them which loses important state
    information.

The distinction between MD_RECOVERY_ERR and MD_RECOVERY_INTR really isn't
needed.  If there is an error, the relevant drive will be marked as
Faulty, and that is enough to ensure correct handling of the error.  So we
first remove MD_RECOVERY_ERR, changing some of the uses of it to
MD_RECOVERY_INTR.

Then we cause the attempt to remove a non-faulty device from an array to
fail (unless recovery is impossible as the array is too degraded).  Then
when remove_and_add_spares attempts to remove the devices on which
recovery can continue, it will fail, they will remain in place, and
recovery will continue on them as desired.

Issue:  If we are halfway through rebuilding a spare and another drive
fails, and a new spare is immediately available,  do we want to:
 1/ complete the current rebuild, then go back and rebuild the new spare or
 2/ restart the rebuild from the start and rebuild both devices in
    parallel.

Both options can be argued for.  The code currently takes option 2 as
  a/ this requires least code change
  b/ this results in a minimally-degraded array in minimal time.

Cc: "Eivind Sarto" <ivan@kasenna.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:10 -07:00
Bernd Schubert 90b08710e4 md: allow parallel resync of md-devices.
In some configurations, a raid6 resync can be limited by CPU speed
(Calculating P and Q and moving data) rather than by device speed.  In
these cases there is nothing to be gained byt serialising resync of arrays
that share a device, and doing the resync in parallel can provide benefit.
 So add a sysfs tunable to flag an array as being allowed to resync in
parallel with other arrays that use (a different part of) the same device.

Signed-off-by: Bernd Schubert <bs@q-leap.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:10 -07:00
Dan Williams 4f54b0e948 md: notify userspace on 'stop' events
This additional notification to 'array_state' is needed to allow the
monitor application to learn about stop events via sysfs.  The
sysfs_notify("sync_action") call that comes at the end of do_md_stop()
(via md_new_event) is insufficient since the 'sync_action' attribute has
been removed by this point.

(Seems like a sysfs-notify-on-removal patch is a better fix.  Currently
removal updates the event count but does not wake up waiters)

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:10 -07:00
NeilBrown 09a44cc150 md: notify userspace on 'write-pending' changes to array_state
When an array enters write pending, 'array_state' changes, so we must be
sure to sysfs_notify.

Also, when waiting for user-space to acknowledge 'write-pending' by
marking the metadata as dirty, we don't want to wait for MD_CHANGE_DEVS to
be cleared as that might not happen.  So explicity test for the bits that
we are really interested in.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:10 -07:00
NeilBrown 698b18c1e8 md: raid1: Fix restoration of bio between failed read and write.
When performing a "recovery" or "check" pass on a RAID1 array, we read
from each device and possible, if there is a difference or a read error,
write back to some devices.

We use the same 'bio' for both read and write, resetting various fields
between the two operations.

We forgot to reset bv_offset and bv_len however.  These are often left
unchanged, but in the case where there is an IO error one or two sectors
into a page, they are changed.

This results in correctable errors not being corrected properly.  It does
not result in any data corruption.

Cc: "Fairbanks, David" <David.Fairbanks@stratus.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:10 -07:00
Bernd Schubert 6be9d49401 md: md: raid5 rate limit error printk
Last night we had scsi problems and a hardware raid unit was offlined
during heavy i/o.  While this happened we got for about 3 minutes a huge
number messages like these

Apr 12 03:36:07 pfs1n14 kernel: [197510.696595] raid5:md7: read error not correctable (sector 2993096568 on sdj2).

I guess the high error rate is responsible for not scheduling other events
- during this time the system was not pingable and in the end also other
devices run into scsi command timeouts causing problems on these unrelated
devices as well.

Signed-off-by: Bernd Schubert <bernd-schubert@gmx.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:10 -07:00