254741 Commits

Author SHA1 Message Date
Mel Gorman da175d06b4 mm: vmscan: evaluate the watermarks against the correct classzone
When deciding if kswapd is sleeping prematurely, the classzone is taken
into account but this is different to what balance_pgdat() and the
allocator are doing.  Specifically, the DMA zone will be checked based on
the classzone used when waking kswapd which could be for a GFP_KERNEL or
GFP_HIGHMEM request.  The lowmem reserve limit kicks in, the watermark is
not met, and kswapd concludes that it is sleeping prematurely, so it is
kept awake in error.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Andrew Lutomirski <luto@mit.edu>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-08 21:14:43 -07:00
Mel Gorman d7868dae89 mm: vmscan: do not apply pressure to slab if we are not applying pressure to zone
During allocator-intensive workloads, kswapd will be woken frequently
causing free memory to oscillate between the high and min watermark.  This
is expected behaviour.

When kswapd applies pressure to zones during node balancing, it checks if
the zone is above a high+balance_gap threshold.  If it is, it does not
apply pressure but it unconditionally shrinks slab on a global basis which
is excessive.  In the event kswapd is being kept awake by a small,
unreclaimable high zone, it skips zone shrinking but still calls
shrink_slab().

Once pressure has been applied, the check for whether the zone is
unreclaimable is made before the check for whether all_unreclaimable
should be set.  Missing the unreclaimable state this way can cause
has_under_min_watermark_zone to be set because of an unreclaimable zone,
preventing kswapd from backing off in congestion_wait().

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Andrew Lutomirski <luto@mit.edu>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-08 21:14:43 -07:00
Mel Gorman 08951e5459 mm: vmscan: correct check for kswapd sleeping in sleeping_prematurely
During allocator-intensive workloads, kswapd will be woken frequently
causing free memory to oscillate between the high and min watermark.  This
is expected behaviour.  Unfortunately, if the highest zone is small, a
problem occurs.

This seems to happen most with recent sandybridge laptops but it's
probably a coincidence as some of these laptops just happen to have a
small Normal zone.  The reproduction case is almost always copying large
files, during which kswapd pegs at 100% CPU until the file is deleted or
the cache is dropped.

The problem is mostly down to sleeping_prematurely() keeping kswapd awake
when the highest zone is small and unreclaimable.  It is compounded by the
fact that we shrink slabs even when not shrinking zones, causing a lot of
time to be spent in shrinkers and a lot of memory to be reclaimed.

Patch 1 corrects sleeping_prematurely to check the zones matching
	the classzone_idx instead of all zones.

Patch 2 avoids shrinking slab when we are not shrinking a zone.

Patch 3 notes that sleeping_prematurely is checking lower zones against
	a high classzone which is not what allocators or balance_pgdat()
	is doing, leading to an artificial belief that kswapd should be
	still awake.

Patch 4 notes that when balance_pgdat() gives up on a high zone, the
	decision is not communicated to sleeping_prematurely().

This problem affects 2.6.38.8 for certain and is expected to affect 2.6.39
and 3.0-rc4 as well.  If accepted, they need to go to -stable to be picked
up by distros and this series is against 3.0-rc4.  I've cc'd people that
reported similar problems recently to see if they still suffer from the
problem and if this fixes it.

This patch: correct the check for kswapd sleeping in sleeping_prematurely()

During allocator-intensive workloads, kswapd will be woken frequently
causing free memory to oscillate between the high and min watermark.  This
is expected behaviour.

A problem occurs if the highest zone is small.  balance_pgdat() only
considers unreclaimable zones when priority is DEF_PRIORITY but
sleeping_prematurely considers all zones.  It's possible for this sequence
to occur:

  1. kswapd wakes up and enters balance_pgdat()
  2. At DEF_PRIORITY, marks highest zone unreclaimable
  3. At DEF_PRIORITY-1, ignores highest zone setting end_zone
  4. At DEF_PRIORITY-1, calls shrink_slab freeing memory from
     highest zone, clearing all_unreclaimable. Highest zone
     is still unbalanced
  5. kswapd returns and calls sleeping_prematurely
  6. sleeping_prematurely looks at *all* zones, not just the ones
     being considered by balance_pgdat. The highest small zone
     has all_unreclaimable cleared but the zone is not
     balanced. all_zones_ok is false so kswapd stays awake

This patch corrects the behaviour of sleeping_prematurely to check the
zones balance_pgdat() checked.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Andrew Lutomirski <luto@mit.edu>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-08 21:14:42 -07:00
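
The six-step sequence in commit 08951e5459 above can be illustrated with a
toy user-space model.  This is only a sketch: struct toy_zone and both
functions are hypothetical stand-ins and not the kernel's data structures,
and the assumption that the highest zone lies above classzone_idx is taken
from the scenario in the message.  It shows why scanning every zone keeps
kswapd awake while scanning only up to the classzone does not.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memory zone; not the kernel's struct zone. */
struct toy_zone {
    bool balanced;          /* free pages are above the high watermark */
    bool all_unreclaimable; /* reclaim has given up on this zone */
};

/* Returns true if an unbalanced, reclaimable zone is found in [0, end). */
static bool any_unbalanced(const struct toy_zone *zones, size_t end)
{
    for (size_t i = 0; i < end; i++) {
        if (zones[i].all_unreclaimable)
            continue;           /* ignored, as in balance_pgdat() */
        if (!zones[i].balanced)
            return true;        /* kswapd would stay awake */
    }
    return false;
}

/* Behaviour described as the bug: every zone in the node is inspected. */
bool toy_sleeping_prematurely_all(const struct toy_zone *zones,
                                  size_t nr_zones)
{
    return any_unbalanced(zones, nr_zones);
}

/* Behaviour described as the fix: stop at classzone_idx (inclusive). */
bool toy_sleeping_prematurely_classzone(const struct toy_zone *zones,
                                        size_t nr_zones,
                                        size_t classzone_idx)
{
    size_t end = classzone_idx + 1 < nr_zones ? classzone_idx + 1 : nr_zones;

    return any_unbalanced(zones, end);
}
```

With three zones where only the highest one is unbalanced and has had
all_unreclaimable cleared, the first function reports "premature" while the
second, called with classzone_idx 1, does not.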
Linus Torvalds 902daf6580 Merge branch 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6
* 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6:
  gpio/langwell_gpio: ack the correct bit for langwell gpio interrupts
2011-07-08 09:01:11 -07:00
Linus Torvalds 54af2bd25c Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: unpin stale inodes directly in IOP_COMMITTED
2011-07-08 09:00:51 -07:00
Linus Torvalds c60ffcbb62 Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
  omap: drop __initdata tags from static struct platform_device declarations
2011-07-08 09:00:02 -07:00
Linus Torvalds 3546eea837 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/kms: allow drm_mode_group with no objects
  drm/radeon/kms: free ib pool on module unloading
  drm/radeon/kms: fix typo in evergreen disp int status register
  drm/radeon/kms: fix typo in IH_CNTL swap bitfield
2011-07-08 08:59:39 -07:00
Mathias Nyman 2345b20fd9 gpio/langwell_gpio: ack the correct bit for langwell gpio interrupts
The wrong bit was masked when acking langwell gpio interrupts.

The reason for masking the wrong bit was probably that the __ffs() and
ffs() functions return bit indexes differently (0..31 vs 1..32).

This stops langwell based devices from hanging when a gpio interrupt is
triggered and undoes the breakage which occurred in change set
732063b92b

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-08 09:32:01 -06:00
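
The 0..31 versus 1..32 difference mentioned in commit 2345b20fd9 above can
be reproduced in user space.  The sketch below is not the driver code: it
uses POSIX ffs() from <strings.h> and, as an assumption, the GCC/Clang
builtin __builtin_ctz() as a stand-in for the kernel's __ffs().  The helper
names are hypothetical, and the buggy variant shows only one way such a
mix-up can select the neighbouring bit.

```c
#include <strings.h>   /* ffs(): 1-based index of the lowest set bit */

/*
 * User-space stand-in for the kernel's __ffs(): 0-based index of the
 * lowest set bit.  Like __ffs(), the result is undefined for word == 0.
 */
static unsigned int toy_zero_based_ffs(unsigned int word)
{
    return (unsigned int)__builtin_ctz(word);
}

/* Mask for the lowest pending bit, built from the 0-based index. */
unsigned int toy_ack_mask(unsigned int pending)
{
    return 1u << toy_zero_based_ffs(pending);
}

/*
 * Off-by-one variant: the 1-based ffs() result is used as a shift count,
 * so the bit *above* the pending one is selected.  (The shift would be
 * undefined if only bit 31 were set; callers here avoid that case.)
 */
unsigned int toy_ack_mask_off_by_one(unsigned int pending)
{
    return 1u << ffs((int)pending);
}
```

For pending == 0x8, the first helper yields 0x8 and the second yields 0x10.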
Linus Torvalds f1bb20a836 Merge branch 'for-30-rc5/all-i2c' of git://git.fluff.org/bjdooks/linux
* 'for-30-rc5/all-i2c' of git://git.fluff.org/bjdooks/linux:
  i2c-bfin-twi: abort transfer if MEM bit is reset unexpectedly
  i2c-s3c2410: Remove useless break code
  i2c-s3c2410: Fix typo 'i2s' -> 'i2c'
  i2c: tegra: Assign unused slave address
2011-07-07 16:29:29 -07:00
Linus Torvalds 90c69064c9 Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: additional regression fix for device removal
2011-07-07 15:10:33 -07:00
Alan Stern ca5c485f55 USB: additional regression fix for device removal
Commit e534c5b831 (USB: fix regression
occurring during device removal) didn't go far enough.  It failed to
take into account that when a driver claims multiple interfaces, it may
release them all at the same time.  As a result, some interfaces can
get released before they are unregistered, and we deadlock trying to
acquire the bandwidth_mutex that we already own.

This patch (asl478) handles this case by setting the "unregistering"
flag on all the interfaces before removing any of them.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Tested-by: Éric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-07 13:29:33 -07:00
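
The ordering change described in commit ca5c485f55 above (flag every
interface first, then remove) can be sketched with a toy model.  Nothing
here is USB core code: struct toy_intf, the counter and all functions are
hypothetical.  A release of an interface whose flag is not yet set stands in
for the path that would try to take the already-held bandwidth_mutex.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_intf {
    bool unregistering;
};

/* Counts releases that would have tried to retake the held mutex. */
static int risky_releases;

static void toy_release(const struct toy_intf *intf)
{
    if (!intf->unregistering)
        risky_releases++;
}

/* A driver that claimed all interfaces releases them all on first removal. */
static void toy_remove(const struct toy_intf *intfs, size_t n,
                       bool *driver_bound)
{
    if (!*driver_bound)
        return;
    *driver_bound = false;
    for (size_t i = 0; i < n; i++)
        toy_release(&intfs[i]);
}

/* Flag and remove one interface at a time. */
int toy_remove_one_pass(struct toy_intf *intfs, size_t n)
{
    bool driver_bound = true;

    risky_releases = 0;
    for (size_t i = 0; i < n; i++) {
        intfs[i].unregistering = true;
        toy_remove(intfs, n, &driver_bound);
    }
    return risky_releases;
}

/* Flag every interface before removing any of them. */
int toy_remove_two_pass(struct toy_intf *intfs, size_t n)
{
    bool driver_bound = true;

    risky_releases = 0;
    for (size_t i = 0; i < n; i++)
        intfs[i].unregistering = true;
    for (size_t i = 0; i < n; i++)
        toy_remove(intfs, n, &driver_bound);
    return risky_releases;
}
```

With three interfaces, the one-pass variant reports two risky releases and
the two-pass variant reports none.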
Linus Torvalds 31cb852809 Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PM / Hibernate: Fix free_unnecessary_pages()
2011-07-07 13:22:41 -07:00
Linus Torvalds 2a9d6df425 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  drbd: we should write meta data updates with FLUSH FUA
  drbd: fix limit define, we support 1 PiByte now
  drbd: when receive times out on meta socket, also check last receive time on data socket
  drbd: account bitmap IO during resync as resync-(related-)-io
  drbd: don't cond_resched_lock with IRQs disabled
  drbd: add missing spinlock to bitmap receive
  drbd: Use the correct max_bio_size when creating resync requests
  cfq-iosched: make code consistent
  cfq-iosched: fix a rcu warning
2011-07-07 13:22:26 -07:00
David Howells c902ce1bfb FS-Cache: Add a helper to bulk uncache pages on an inode
Add an FS-Cache helper to bulk uncache pages on an inode.  This will
only work for the circumstance where the pages in the cache correspond
1:1 with the pages attached to an inode's page cache.

This is required for CIFS and NFS: when disabling the inode cookie, we were
returning the cookie and setting cifsi->fscache to NULL but failed to
invalidate any previously mapped pages.  This resulted in "Bad page
state" errors and manifested as other kinds of errors when running
fsstress.  Fix it by uncaching mapped pages when we disable the inode
cookie.

This patch should fix the following oops and "Bad page state" errors
seen during fsstress testing.

  ------------[ cut here ]------------
  kernel BUG at fs/cachefiles/namei.c:201!
  invalid opcode: 0000 [#1] SMP
  Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs
  RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  RSP: 0018:ffff88002ce6dd00  EFLAGS: 00010282
  RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282
  RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300
  R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840
  R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0
  FS:  00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0)
  Stack:
   0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00
   ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380
   ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56
  Call Trace:
   cachefiles_lookup_object+0x78/0xd4 [cachefiles]
   fscache_lookup_object+0x131/0x16d [fscache]
   fscache_object_work_func+0x1bc/0x669 [fscache]
   process_one_work+0x186/0x298
   worker_thread+0xda/0x15d
   kthread+0x84/0x8c
   kernel_thread_helper+0x4/0x10
  RIP  cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  ---[ end trace 1d481c9af1804caa ]---

I tested the uncaching by the following means:

 (1) Create a big file on my NFS server (104857600 bytes).

 (2) Read the file into the cache with md5sum on the NFS client.  Look in
     /proc/fs/fscache/stats:

	Pages  : mrk=25601 unc=0

 (3) Open the file for read/write ("bash 5<>/warthog/bigfile").  Look in proc
     again:

	Pages  : mrk=25601 unc=25601

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-07 13:21:56 -07:00
Linus Torvalds 075d9db131 Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pci: Move check for acpi_sci_override_gsi to xen_setup_acpi_sci.
2011-07-07 13:19:04 -07:00
Linus Torvalds e55f1b1c00 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Don't use the EFI reboot method by default
  x86, suspend: Restore MISC_ENABLE MSR in realmode wakeup
  x86, reboot: Acer Aspire One A110 reboot quirk
  x86-32, NUMA: Fix boot regression caused by NUMA init unification on highmem machines
2011-07-07 13:18:13 -07:00
Linus Torvalds 27a3b735b7 Merge branches 'core-urgent-for-linus', 'perf-urgent-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debugobjects: Fix boot crash when kmemleak and debugobjects enabled

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  jump_label: Fix jump_label update for modules
  oprofile, x86: Fix race in nmi handler while starting counters

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Disable (revert) SCHED_LOAD_SCALE increase
  sched, cgroups: Fix MIN_SHARES on 64-bit boxen
2011-07-07 13:17:45 -07:00
Linus Torvalds 85746e429f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (31 commits)
  sctp: fix missing send up SCTP_SENDER_DRY_EVENT when subscribe it
  net: refine {udp|tcp|sctp}_mem limits
  vmxnet3: round down # of queues to power of two
  net: sh_eth: fix the parameter for the ETHER of SH7757
  net: sh_eth: fix cannot work half-duplex mode
  net: vlan: enable soft features regardless of underlying device
  vmxnet3: fix starving rx ring when alloc_skb fails
  bridge: Always flood broadcast packets
  greth: greth_set_mac_add would corrupt the MAC address.
  net: bind() fix error return on wrong address family
  natsemi: silence dma-debug warnings
  net: 8139too: Initial necessary vlan_features to support vlan
  Fix call trace when interrupts are disabled while sleeping function kzalloc is called
  qlge:Version change to v1.00.00.29
  qlge: Fix printk priority so chip fatal errors are always reported.
  qlge:Fix crash caused by mailbox execution on wedged chip.
  xfrm4: Don't call icmp_send on local error
  ipv4: Don't use ufo handling on later transformed packets
  xfrm: Remove family arg from xfrm_bundle_ok
  ipv6: Don't put artificial limit on routing table size.
  ...
2011-07-07 13:16:21 -07:00
Konrad Rzeszutek Wilk ee339fe63a xen/pci: Move check for acpi_sci_override_gsi to xen_setup_acpi_sci.
Previously we would check for acpi_sci_override_gsi == gsi every time
a PCI device was enabled. That works during early bootup, but later
on it could lead to unnecessarily triggering the acpi_gsi_to_irq(..) lookup.
The reason is that acpi_sci_override_gsi was declared in __initdata and
after early bootup could contain bogus values.

This patch moves the check for acpi_sci_override_gsi to the
site where the ACPI SCI is preset.

CC: stable@kernel.org
Reported-by: Raghavendra D Prabhu <rprabhu@wnohang.net>
Tested-by: Raghavendra D Prabhu <rprabhu@wnohang.net>
[http://lists.xensource.com/archives/html/xen-devel/2011-07/msg00154.html]
Suggested-by:  Ian Campbell <ijc@hellion.org.uk>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-07-07 12:19:08 -04:00
Wei Yongjun 949123016a sctp: fix missing send up SCTP_SENDER_DRY_EVENT when subscribe it
We forgot to send up the SCTP_SENDER_DRY_EVENT notification when
a user app subscribes to this event and there is no data to be
sent or retransmitted.

This is required by the Socket API and used by the DTLS/SCTP
implementation.

Reported-by: Michael Tüxen <Michael.Tuexen@lurchi.franken.de>
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Tested-by: Robin Seggelmann <seggelmann@fh-muenster.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-07 04:10:26 -07:00
Matthew Garrett f70e957cda x86: Don't use the EFI reboot method by default
Testing suggests that at least some Lenovos and some Intels will
fail to reboot via EFI, attempting to jump to an unmapped
physical address. In the long run we could handle this by
providing a page table with a 1:1 mapping of physical addresses,
but for now it's probably just easier to assume that ACPI or
legacy methods will be present and reboot via those.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alan Cox <alan@linux.intel.com>
Link: http://lkml.kernel.org/r/1309985557-15350-1-git-send-email-mjg@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-07 11:35:05 +02:00
Ben Skeggs d61a06862b drm/kms: allow drm_mode_group with no objects
Sometimes we could be controlling a device (such as an NVIDIA Tesla) that
has no crtcs/encoders/connectors.

One could argue that the driver should unset DRIVER_MODESET in this case,
but that changes a whole heap of the DRM's other behaviours, and it's much
easier to just be a modesetting driver without any outputs.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-07-07 17:49:00 +10:00
Jerome Glisse ccd6895d40 drm/radeon/kms: free ib pool on module unloading
The ib pool wasn't freed for various newer asics on module unload.
This doesn't cause much harm but could still be a candidate for
stable.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
cc: stable@kernel.org
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-07-07 17:48:27 +10:00
Alex Deucher 37cba6c6f4 drm/radeon/kms: fix typo in evergreen disp int status register
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-07-07 17:47:44 +10:00
Alex Deucher fcb857abc4 drm/radeon/kms: fix typo in IH_CNTL swap bitfield
Only affects BE systems.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-07-07 17:47:12 +10:00
Eric Dumazet f03d78db65 net: refine {udp|tcp|sctp}_mem limits
Current tcp/udp/sctp global memory limits are not taking into account
hugepages allocations, and allow 50% of ram to be used by buffers of a
single protocol [ not counting space used by sockets / inodes ...]

Let's use nr_free_buffer_pages() and allow a default of 1/8 of kernel ram
per protocol, and a minimum of 128 pages.
Sysadmins of heavy duty machines probably need to tweak limits anyway.


References: https://bugzilla.stlinux.com/show_bug.cgi?id=38032
Reported-by: starlight <starlight@binnacle.cx>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-07 00:27:05 -07:00
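
The default described in commit f03d78db65 above (1/8 of the available
buffer pages per protocol, never less than 128 pages) amounts to the
following arithmetic.  This is a sketch only: toy_default_proto_limit() is a
hypothetical helper, its argument stands in for whatever
nr_free_buffer_pages() would return, and how the patch derives the
individual sysctl values from this limit is not reproduced here.

```c
/* Floor on the per-protocol limit, in pages (value from the message). */
#define TOY_MIN_LIMIT_PAGES 128UL

/*
 * Hypothetical helper: default per-protocol memory limit in pages,
 * given the number of free buffer pages.
 */
unsigned long toy_default_proto_limit(unsigned long free_buffer_pages)
{
    unsigned long limit = free_buffer_pages / 8;

    return limit < TOY_MIN_LIMIT_PAGES ? TOY_MIN_LIMIT_PAGES : limit;
}
```

For example, 1048576 free pages give a limit of 131072 pages, while a tiny
system with 512 free pages is raised to the 128-page floor.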
Shreyas Bhatewara eebb02b1f0 vmxnet3: round down # of queues to power of two
The vmxnet3 device supports only a power-of-two number of queues. The
driver therefore needs to check this and round down the number of queues
to the nearest power of two.

Signed-off-by: Yong Wang <yongwang@vmware.com>
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Reviewed-by: Dmitry Torokhov <dtor@vmware.com>
2011-07-07 00:25:52 -07:00
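
Rounding down to a power of two, as commit eebb02b1f0 above describes for
the queue count, can be sketched in portable C.  toy_round_down_pow2() is a
hypothetical helper written for illustration; the kernel has its own
rounddown_pow_of_two() in <linux/log2.h>, and whether the driver uses it is
not stated in the message.

```c
/*
 * Largest power of two that is <= n.  For n == 0 this sketch returns 1,
 * on the assumption that at least one queue is always wanted.
 */
unsigned int toy_round_down_pow2(unsigned int n)
{
    unsigned int p = 1;

    /* Comparing against n / 2 keeps p * 2 from overflowing. */
    while (p <= n / 2)
        p *= 2;
    return p;
}
```

A request for 6 queues becomes 4, while 8 stays 8.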
Kees Cook 7a3136666b x86, suspend: Restore MISC_ENABLE MSR in realmode wakeup
Some BIOSes will reset the Intel MISC_ENABLE MSR (specifically the
XD_DISABLE bit) when resuming from S3, which can interact poorly with
ebba638ae7. In 32bit PAE mode, this can
lead to a fault when EFER is restored by the kernel wakeup routines,
due to it setting the NX bit for a CPU that (thanks to the BIOS reset)
now incorrectly thinks it lacks the NX feature. (64bit is not affected
because it uses a common CPU bring-up that specifically handles the
XD_DISABLE bit.)

The need for MISC_ENABLE being restored so early is specific to the S3
resume path. Normally, MISC_ENABLE is saved in save_processor_state(),
but this happens after the resume header is created, so just reproduce
the logic here. (acpi_suspend_lowlevel() creates the header, calls
do_suspend_lowlevel, which calls save_processor_state(), so the saved
processor context isn't available during resume header creation.)

[ hpa: Consider for stable if OK in mainline ]

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Link: http://lkml.kernel.org/r/20110707011034.GA8523@outflux.net
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: <stable@kernel.org> 2.6.38+
2011-07-06 20:09:34 -07:00
Linus Torvalds 4dd1b49c6d Merge branch 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6
* 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6:
  gpio: tps65910: add missing breaks in tps65910_gpio_init
2011-07-06 18:36:53 -07:00
Dave Chinner 1316d4da3f xfs: unpin stale inodes directly in IOP_COMMITTED
When inodes are marked stale in a transaction, they are treated
specially when the inode log item is being inserted into the AIL.
It tries to avoid moving the log item forward in the AIL due to a
race condition with the writing of the underlying buffer back to disk.
This was "fixed" in commit de25c18 ("xfs: avoid moving stale inodes
in the AIL").

To avoid moving the item forward, we return a LSN smaller than the
commit_lsn of the completing transaction, thereby trying to trick
the commit code into not moving the inode forward at all. I'm not
sure this ever worked as intended - it assumes the inode is already
in the AIL, but I don't think the returned LSN would have been small
enough to prevent moving the inode. It appears that the reason it
worked is that the lower LSN of the inodes meant they were inserted
into the AIL and flushed before the inode buffer (which was moved to
the commit_lsn of the transaction).

The big problem is that with delayed logging, the returning of the
different LSN means insertion takes the slow, non-bulk path.  Worse
yet is that insertion is to a position -before- the commit_lsn so it
is doing an AIL traversal on every insertion, and has to walk over
all the items that have already been inserted into the AIL. It's
expensive.

To compound the matter further, with delayed logging inodes are
likely to go from clean to stale in a single checkpoint, which means
they aren't even in the AIL at all when we come across them at AIL
insertion time. Hence these were all getting inserted into the AIL
when they simply do not need to be as inodes marked XFS_ISTALE are
never written back.

Transactional/recovery integrity is maintained in this case by the
other items in the unlink transaction that were modified (e.g. the
AGI btree blocks) and committed in the same checkpoint.

So to fix this, simply unpin the stale inodes directly in
xfs_inode_item_committed() and return -1 to indicate that the AIL
insertion code does not need to do any further processing of these
inodes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-06 15:44:40 -05:00
Andrea Righi 9b61fc4cf3 Documentation: fix cgroup blkio throttle filenames
All the blkio.throttle.* file names are incorrectly reported without
".throttle" in the documentation. Fix it.

Signed-off-by: Andrea Righi <andrea@betterlinux.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06 13:17:51 -07:00
Jesper Juhl 316b379988 Documentation: update CodingStyle memory allocators
The list of available general purpose memory allocators in
Documentation/CodingStyle chapter 14 is incomplete. This patch adds
the missing vzalloc() to the list.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06 13:17:51 -07:00
Randy Dunlap 0dcb6d737c MAINTAINERS: move kernel-doc patches location
Move location of quilt series for kernel-doc patches.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06 13:17:51 -07:00
Linus Torvalds de3796e77a Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (46 commits)
  [media] rc: call input_sync after scancode reports
  [media] imon: allow either proto on unknown 0xffdc
  [media] imon: auto-config ffdc 7e device
  [media] saa7134: fix raw IR timeout value
  [media] rc: fix ghost keypresses with certain hw
  [media] [staging] lirc_serial: allocate irq at init time
  [media] lirc_zilog: fix spinning rx thread
  [media] keymaps: fix table for pinnacle pctv hd devices
  [media] ite-cir: 8709 needs to use pnp resource 2
  [media] V4L: mx1-camera: fix uninitialized variable
  [media] omap_vout: Added check in reqbuf & mmap for buf_size allocation
  [media] OMAP_VOUT: Change hardcoded device node number to -1
  [media] OMAP_VOUTLIB: Fix wrong resizer calculation
  [media] uvcvideo: Disable the queue when failing to start
  [media] uvcvideo: Remove buffers from the queues when freeing
  [media] uvcvideo: Ignore entities for terminals with no supported format
  [media] v4l: Don't access media entity after is has been destroyed
  [media] media: omap3isp: fix a potential NULL deref
  [media] media: vb2: fix allocation failure check
  [media] media: vb2: reset queued_count value during queue reinitialization
  ...

Fix up trivial conflict in MAINTAINERS as per Mauro
2011-07-06 12:16:49 -07:00
Davidlohr Bueso bcb65a797e FDPIC: Fix memory leak
The shdr4extnum variable isn't being freed in the cleanup process of
elf_fdpic_core_dump().

Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06 12:15:16 -07:00
Rafael J. Wysocki 4d4cf23cdd PM / Hibernate: Fix free_unnecessary_pages()
There is a bug in free_unnecessary_pages() that causes it to
attempt to free too many pages in some cases, which triggers the
BUG_ON() in memory_bm_clear_bit() for copy_bm.  Namely, if
count_data_pages() is initially greater than alloc_normal, we get
to_free_normal equal to 0 and "save" greater than 0.  In that case,
if the sum of "save" and count_highmem_pages() is greater than
alloc_highmem, we subtract a positive number from to_free_normal.
Hence, since to_free_normal was 0 before the subtraction and is
an unsigned int, the result is converted to a huge positive number
that is used as the number of pages to free.

Fix this bug by checking if to_free_normal is actually greater
than or equal to the number we're going to subtract from it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@kernel.org
2011-07-06 20:15:23 +02:00
Ram Pai 23c570a674 resource: ability to resize an allocated resource
Provides the ability to resize a resource that is already allocated.
This functionality is put in place to support reallocation needs of
pci resources.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06 10:54:08 -07:00
Miklos Szeredi a51cb91d81 fs: fix lock initialization
locks_alloc_lock() assumed that the members of the allocated struct
file_lock are already initialized to zero.  This is only true for the
first allocation of the structure; after reuse, some of the members will
have random values.

This will for example result in passing random fl_start values to
userspace in fuse for FL_FLOCK locks, which is an information leak at
best.

Fix by reinitializing those members which may be non-zero after freeing.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06 10:41:13 -07:00
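
The reuse problem in commit a51cb91d81 above can be modelled with a
one-slot object pool.  Everything below is hypothetical (struct toy_lock,
the pool and the three functions); it only mirrors the idea that an object
is zeroed the first time it is handed out but keeps its old contents when
it is recycled, unless the allocator reinitializes the members itself.

```c
#include <stdbool.h>

struct toy_lock {
    long long fl_start;
    long long fl_end;
};

/* One-slot "cache": static storage, so zeroed before first use only. */
static struct toy_lock toy_slot;
static bool toy_slot_busy;

/* Hands the slot out as-is; contents survive from the previous user. */
struct toy_lock *toy_alloc_raw(void)
{
    if (toy_slot_busy)
        return 0;
    toy_slot_busy = true;
    return &toy_slot;
}

/* Fixed variant: reinitialize members that may be non-zero after reuse. */
struct toy_lock *toy_alloc_init(void)
{
    struct toy_lock *fl = toy_alloc_raw();

    if (fl) {
        fl->fl_start = 0;
        fl->fl_end = 0;
    }
    return fl;
}

void toy_free(struct toy_lock *fl)
{
    (void)fl;               /* only one slot, nothing else to track */
    toy_slot_busy = false;
}
```

After one allocate/modify/free cycle, toy_alloc_raw() returns the stale
fl_start value, whereas toy_alloc_init() returns zeroed members.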
Yoshihiro Shimoda 2e98e7974d net: sh_eth: fix the parameter for the ETHER of SH7757
If the driver doesn't set this parameter on the ETHER, the CPU will
encounter the "data address error" exception.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-05 23:41:17 -07:00
Yoshihiro Shimoda 91a5615203 net: sh_eth: fix cannot work half-duplex mode
When the link was down, the DM bit in ECMR was always set, so we could
not use half-duplex mode on the controller.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-05 23:41:17 -07:00
Axel Lin 58956ba23e gpio: tps65910: add missing breaks in tps65910_gpio_init
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-05 23:17:08 -06:00
Linus Torvalds a2fa83faf4 Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: fix regression occurring during device removal
  USB: fsl_udc_core: fix build breakage when building for ARM arch
2011-07-05 20:57:45 -07:00
Linus Torvalds 2d12a18b89 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
  mfd: Add Makefile and Kconfig Entries for tps65911 comparator
  mfd: Fix build error for tps65911-comparator.c
  Revert "mfd: Add omap-usbhs runtime PM support"
  input: pmic8xxx-pwrkey: Do not use mfd_get_data()
  input: pmic8xxx-keypad: Do not use mfd_get_data()
2011-07-05 20:57:08 -07:00
Shan Wei 712ae51afd net: vlan: enable soft features regardless of underlying device
If the gso/gro feature of the underlying device is turned off,
a newly created vlan device can never turn gso/gro on.

Even if the underlying device doesn't support TSO, we should
still use software segmentation for the vlan device.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-05 20:43:12 -07:00
Peter Chubb b49c78d482 x86, reboot: Acer Aspire One A110 reboot quirk
Since git commit 660e34cebf ("x86: reorder reboot method preferences"),
my Acer Aspire One hangs on reboot.  It appears that its ACPI method
for rebooting is broken.  The attached patch adds a quirk so that the
machine will reboot via the BIOS.

[ hpa: verified that the ACPI control on this machine is just plain broken. ]

Signed-off-by: Peter Chubb <peter.chubb@nicta.com.au>
Link: http://lkml.kernel.org/r/w439iki5vl.wl%25peter@chubb.wattle.id.au
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2011-07-05 19:43:23 -07:00
Shreyas Bhatewara 5318d809d7 vmxnet3: fix starving rx ring when alloc_skb fails
If the rx ring is completely empty, then the device may never fire an rx
interrupt. Unfortunately, the rx interrupt is what triggers populating the
rx ring with fresh buffers, so this will cause networking to lock up.

This patch replenishes the skb in the recv descriptor as soon as it is
peeled off while processing rx completions. If the skb/buffer
allocation fails, the existing one is recycled and the packet in hand is
dropped. This way no RX descriptor is ever left empty, thus avoiding
starvation.

Signed-off-by: Scott J. Goldman <scottjg@vmware.com>
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-05 18:39:40 -07:00
Herbert Xu 44661462ee bridge: Always flood broadcast packets
As is_multicast_ether_addr returns true on broadcast packets as
well, we need to explicitly exclude broadcast packets so that
they're always flooded.  This wasn't an issue before as broadcast
packets were considered to be an unregistered multicast group,
which were always flooded.  However, as we now only flood such
packets to router ports, this is no longer acceptable.

Reported-by: Michael Guntsche <mike@it-loops.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-05 18:39:39 -07:00
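
The point made in commit 44661462ee above, that a broadcast address also
passes a multicast test, follows from the Ethernet address format: the
multicast test looks only at the low bit of the first octet, and
ff:ff:ff:ff:ff:ff has that bit set.  The helpers below are hypothetical
user-space equivalents written for illustration (the kernel's own helpers
are is_multicast_ether_addr() and is_broadcast_ether_addr()), and
toy_needs_mdb_lookup() is only a sketch of the "multicast but not
broadcast" condition, not the bridge code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Group bit: low bit of the first octet. True for broadcast as well. */
bool toy_is_multicast(const uint8_t addr[6])
{
    return (addr[0] & 0x01) != 0;
}

/* Broadcast: all six octets are 0xff. */
bool toy_is_broadcast(const uint8_t addr[6])
{
    for (int i = 0; i < 6; i++) {
        if (addr[i] != 0xff)
            return false;
    }
    return true;
}

/* Multicast handling applies only when the frame is not a broadcast. */
bool toy_needs_mdb_lookup(const uint8_t dest[6])
{
    return toy_is_multicast(dest) && !toy_is_broadcast(dest);
}
```

A broadcast destination is reported as multicast by the first helper but is
excluded by the third, so it would take the always-flood path.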
Linus Torvalds 121782a248 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: fix sync and dio writes across stripe boundaries
  libceph: fix page calculation for non-page-aligned io
  ceph: fix page alignment corrections
2011-07-05 13:15:57 -07:00
Linus Torvalds a8728d3554 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/hfsplus
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/hfsplus:
  hfsplus: Fix double iput of the same inode in hfsplus_fill_super()
  hfsplus: add missing call to bio_put()
2011-07-05 10:04:27 -07:00
Peter Zijlstra e4c2fb0d57 sched: Disable (revert) SCHED_LOAD_SCALE increase
Alex reported that commit c8b281161d ("sched: Increase
SCHED_LOAD_SCALE resolution") caused a power usage regression
under light load as it increases the number of load-balance
operations and keeps idle cpus from staying idle.

Time has run out to find the root cause for this release so
disable the feature for v3.0 until we can figure out what
causes the problem.

Reported-by: "Alex, Shi" <alex.shi@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nikhil Rao <ncrao@google.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-m4onxn0sxnyn5iz9o88eskc3@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-05 11:28:18 +02:00