Commit Graph

200413 Commits

Author SHA1 Message Date
Linus Torvalds 90ec781973 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  module: fix bne2 "gave up waiting for init of module libcrc32c"
  module: verify_export_symbols under the lock
  module: move find_module check to end
  module: make locking more fine-grained.
  module: Make module sysfs functions private.
  module: move sysfs exposure to end of load_module
  module: fix kdb's illicit use of struct module_use.
  module: Make the 'usage' lists be two-way
2010-06-04 21:09:48 -07:00
Rusty Russell 9bea7f2395 module: fix bne2 "gave up waiting for init of module libcrc32c"
Problem: it's hard to avoid an init routine stumbling over a
request_module these days.  And it's not clear it's always a bad idea:
for example, a module like kvm with dynamic dependencies on kvm-intel
or kvm-amd would be neater if it could simply request_module the right
one.

In this particular case, it's libcrc32c:

	libcrc32c_mod_init
	 crypto_alloc_shash
	  crypto_alloc_tfm
	   crypto_find_alg
	    crypto_alg_mod_lookup
	     crypto_larval_lookup
	      request_module

If another module is waiting inside resolve_symbol() for libcrc32c to
finish initializing (i.e. bne2 depends on libcrc32c) then it does so
holding the module lock, and our request_module() can't make progress
until that is released.

Waiting inside resolve_symbol() without the lock isn't all that hard:
we just need to pass the -EBUSY up the call chain so we can sleep
where we don't hold the lock.  Error reporting is a bit trickier: we
need to copy the name of the unfinished module before releasing the
lock.

Other notes:
1) This also fixes a theoretical issue where a weak dependency would allow
   symbol version mismatches to be ignored.
2) We rename use_module to ref_module to make life easier for the only
   external user (the out-of-tree ksplice patches).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tim Abbot <tabbott@ksplice.com>
Tested-by: Brandon Philips <bphilips@suse.de>
2010-06-05 11:17:37 +09:30
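
The "pass -EBUSY up the call chain and sleep without the lock" idea from commit 9bea7f2395 above can be sketched as a user-space Python model. It is only a sketch: the names `coming`, `resolve_symbol_wait`, `sleep` and `max_tries` are illustrative assumptions, and a retry loop stands in for whatever wait primitive the kernel actually uses.

```python
import threading

EBUSY = 16
module_mutex = threading.Lock()
coming = {"libcrc32c"}  # modules whose init routine has not finished yet


def resolve_symbol(owner):
    """Must be called with module_mutex held; never sleeps."""
    assert module_mutex.locked()
    if owner in coming:
        # Copy the name while locked so it can be reported after unlocking.
        return -EBUSY, str(owner)
    return 0, None


def resolve_symbol_wait(owner, sleep, max_tries=3):
    """Retry resolve_symbol(), sleeping only while the lock is dropped."""
    busy_name = None
    for _ in range(max_tries):
        with module_mutex:
            err, busy_name = resolve_symbol(owner)
        if err != -EBUSY:
            return err, None
        sleep()  # the lock is not held here, so other loaders can progress
    return -EBUSY, "gave up waiting for init of module %s" % busy_name


lock_held_while_sleeping = []


def init_finishes():
    # Stand-in for the other module finishing its init while we sleep.
    lock_held_while_sleeping.append(module_mutex.locked())
    coming.discard("libcrc32c")


result = resolve_symbol_wait("libcrc32c", init_finishes)
```

Because `sleep` runs outside the `with module_mutex` block, the code that finishes the other module's init could itself take the lock, which is exactly what the old scheme prevented.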
Rusty Russell be593f4ce4 module: verify_export_symbols under the lock
It disabled preempt so it was "safe", but nothing stops another module
slipping in before this module is added to the global list, now that we
don't hold the lock the whole time.

So we check this just after we check for duplicate modules, and just
before we put the module in the global list.

(find_symbol finds symbols in coming and going modules, too).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-06-05 11:17:37 +09:30
Linus Torvalds 3bafeb6247 module: move find_module check to end
I think Rusty may have made the lock a bit _too_ finegrained there, and
didn't add it to some places that needed it. It looks, for example, like
PATCH 1/2 actually drops the lock in places where it's needed
("find_module()" is documented to need it, but now load_module() didn't
hold it at all when it did the find_module()).

Rather than adding a new "module_loading" list, I think we should be able
to just use the existing "modules" list, and just fix up the locking a
bit.

In fact, maybe we could just move the "look up existing module" a bit
later - optimistically assuming that the module doesn't exist, and then
just undoing the work if it turns out that we were wrong, just before
adding ourselves to the list.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-06-05 11:17:37 +09:30
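
A minimal Python sketch of the "optimistically assume the module doesn't exist, then undo" ordering proposed in commit 3bafeb6247 above. The `modules` dict, the `log` list and the "setup"/"undo" steps are hypothetical stand-ins; only the ordering (expensive work unlocked, duplicate check under the lock just before insertion) comes from the message.

```python
import threading

EEXIST = 17
module_mutex = threading.Lock()
modules = {}  # toy stand-in for the global "modules" list


def load_module(name, log):
    log.append("setup " + name)  # expensive work, done without the lock
    with module_mutex:
        # The duplicate lookup happens under the lock, right before insertion.
        duplicate = name in modules
        if not duplicate:
            modules[name] = object()
    if duplicate:
        log.append("undo " + name)  # we guessed wrong: roll the work back
        return -EEXIST
    return 0


log = []
first = load_module("a", log)
second = load_module("a", log)
```

The cost of this design is wasted setup work when a duplicate does exist, in exchange for holding the lock only around the check and the insertion.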
Rusty Russell 75676500f8 module: make locking more fine-grained.
Kay Sievers <kay.sievers@vrfy.org> reports that we still have some
contention over module loading which is slowing boot.

Linus also disliked a previous "drop lock and regrab" patch to fix the
bne2 "gave up waiting for init of module libcrc32c" message.

This is more ambitious: we only grab the lock where we need it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Brandon Philips <brandon@ifup.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-05 11:17:36 +09:30
Rusty Russell 6407ebb271 module: Make module sysfs functions private.
These were placed in the header in ef665c1a06 to get the various
SYSFS/MODULE config combinations to compile.

That may have been necessary then, but it's not now.  These functions
are all local to module.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
2010-06-05 11:17:36 +09:30
Rusty Russell 80a3d1bb41 module: move sysfs exposure to end of load_module
This means a little extra work, but is more logical: we don't put
anything in sysfs until we're about to put the module into the
global list and parse its parameters.

This also gives us a logical place to put duplicate module detection
in the next patch.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-06-05 11:17:36 +09:30
Rusty Russell c8e21ced08 module: fix kdb's illicit use of struct module_use.
Linus changed the structure, and luckily this didn't compile any more.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Martin Hicks <mort@sgi.com>
2010-06-05 11:17:36 +09:30
Linus Torvalds 2c02dfe7fe module: Make the 'usage' lists be two-way
When adding a module that depends on another one, we used to create a
one-way list of "modules_which_use_me", so that module unloading could
see who needs a module.

It's actually quite simple to make that list go both ways: so that we
not only can see "who uses me", but also see a list of modules that are
"used by me".

In fact, we always wanted that list in "module_unload_free()": when we
unload a module, we want to also release all the other modules that are
used by that module.  But because we didn't have that list, we used to
first iterate over all modules, and then iterate over each "used by me"
list of that module.

By making the list two-way, we simplify module_unload_free(), and it
allows for some trivial fixes later too.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cleaned & rebased)
2010-06-05 11:17:35 +09:30
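
The two-way usage list from commit 2c02dfe7fe above can be illustrated with a small Python model. The names here (`Module`, `used_by_me`, `users_of_me`, `ref_module`, `unload_free`, and the module "netdrv") are assumptions made for the sketch, not the kernel's identifiers. The point is only that unloading walks the module's own list instead of scanning every module.

```python
class Module:
    """Toy stand-in for a loaded module (not the kernel's struct module)."""

    def __init__(self, name):
        self.name = name
        self.used_by_me = []   # modules this module depends on
        self.users_of_me = []  # modules that depend on this module


def ref_module(user, target):
    """Record that `user` uses `target`, linking both directions once."""
    if target not in user.used_by_me:
        user.used_by_me.append(target)
        target.users_of_me.append(user)


def unload_free(mod):
    """Release everything `mod` uses by walking only its own list."""
    for target in mod.used_by_me:
        target.users_of_me.remove(mod)
    mod.used_by_me.clear()


crc = Module("libcrc32c")
net = Module("netdrv")  # hypothetical dependent module
ref_module(net, crc)
users_before = [m.name for m in crc.users_of_me]
unload_free(net)
users_after = [m.name for m in crc.users_of_me]
```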
Huang Weiyi ca7335948e X25: remove duplicated #include
Remove duplicated #include('s) in drivers/net/wan/x25_asy.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-04 16:14:48 -07:00
Eric Dumazet c446492165 tcp: use correct net ns in cookie_v4_check()
It's better to make the route lookup in the appropriate namespace.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-04 15:56:03 -07:00
Eric Dumazet ca55158c6e rps: tcp: fix rps_sock_flow_table table updates
I believe a moderate SYN flood attack can corrupt RFS flow table
(rps_sock_flow_table), making RPS/RFS much less effective.

Even in a normal situation, servers handling short-lived sessions suffer
from bad steering for the first data packet of a session if another SYN
packet is received for another session.

We do the following in tcp_v4_rcv():

	sock_rps_save_rxhash(sk, skb->rxhash);

We should _not_ do this if sk is a LISTEN socket, as nearly every
packet received on a LISTEN socket has a different rxhash than the
previous one.
 -> RPS_NO_CPU markers are spread all over rps_sock_flow_table.

Also, it makes sense to protect sk->rxhash field changes with socket
lock (We currently can change it even if user thread owns the lock
and might use rxhash)

This patch moves sock_rps_save_rxhash() to a sock locked section,
and only for non LISTEN sockets.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-04 15:56:02 -07:00
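
A toy Python model of the effect described in commit ca55158c6e above. It assumes, as the message implies, that saving a new rxhash on a socket first resets the flow-table slot of the socket's previous hash to RPS_NO_CPU. The table size, the marker value and the helper names are illustrative, and the socket-lock part of the patch is not modelled.

```python
RPS_NO_CPU = 0xFFFF  # "no CPU recorded" marker; the value is assumed here


class Sock:
    def __init__(self, state):
        self.state = state
        self.rxhash = 0


def save_rxhash(table, sk, rxhash):
    """Remember rxhash on the socket, invalidating the previous hash's slot."""
    if sk.rxhash != rxhash:
        if sk.rxhash:
            table[sk.rxhash & (len(table) - 1)] = RPS_NO_CPU
        sk.rxhash = rxhash


def tcp_rcv(table, sk, rxhash, skip_listen):
    if skip_listen and sk.state == "LISTEN":
        return  # patched behaviour: never save rxhash on a LISTEN socket
    save_rxhash(table, sk, rxhash)


def syn_burst(skip_listen):
    table = [1] * 8  # every slot currently steers to CPU 1
    listener = Sock("LISTEN")
    for rxhash in (1, 2, 3, 4, 5):  # each SYN arrives with a different hash
        tcp_rcv(table, listener, rxhash, skip_listen)
    return table


before = syn_burst(skip_listen=False)
after = syn_burst(skip_listen=True)
```

In this model five SYNs on one listener wipe four slots that other flows may have been relying on. With the LISTEN check the table is left alone.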
Ben McKeegan 536e00e570 ppp_generic: fix multilink fragment sizes
Fix bug in multilink fragment size calculation introduced by
commit 9c705260fe
"ppp: ppp_mp_explode() redesign"

Signed-off-by: Ben McKeegan <ben@netservers.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-04 15:56:01 -07:00
Florian Westphal 57f1553ee5 syncookies: remove Kconfig text line about disabled-by-default
syncookies default to on since
e994b7c901
(tcp: Don't make syn cookies initial setting depend on CONFIG_SYSCTL).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-04 15:56:01 -07:00
John Fastabend ca73948166 ixgbe: only check pfc bits in hang logic if pfc is enabled
Only check pfc bits in hang logic if PFC is enabled.  Previously,
if DCB was enabled but PFC was disabled the incorrect pause
bits would be checked.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-04 15:56:00 -07:00
Steffen Klassert 8764ab2ca7 net: check for refcount if pop a stacked dst_entry
xfrm triggers a warning if dst_pop() drops a refcount
on a noref dst. This patch changes dst_pop() to
skb_dst_pop(). skb_dst_pop() drops the refcnt only
on a refcounted dst. Also we don't clone the child
dst_entry, so it is not refcounted and we can use
skb_dst_set_noref() in xfrm_output_one().

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-04 15:56:00 -07:00
Linus Torvalds 8ce655e737 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: wacom - add Cintiq 21UX2 and Intuos4 WL
  Input: ads7846 - fix compiler warning in ads7846_probe()
  Input: tps6507x-ts - a couple work queue cleanups
  Input: s3c2410_ts - tone down logging
  Input: s3c2410_ts - fix build error due to ADC Kconfig rename
2010-06-04 15:42:30 -07:00
Linus Torvalds 999fd1ab34 Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (23 commits)
  sh: Make intc messages consistent via pr_fmt.
  sh: make sure static declaration on ms7724se
  sh: make sure static declaration on mach-migor
  sh: make sure static declaration on mach-ecovec24
  sh: make sure static declaration on mach-ap325rxa
  clocksource: sh_cmt: compute mult and shift before registration
  clocksource: sh_tmu: compute mult and shift before registration
  sh: PIO disabling for x3proto and urquell.
  sh: mach-sdk7786: conditionally disable PIO support.
  sh: support for platforms without PIO.
  usb: r8a66597-hcd pio to mmio accessor conversion.
  usb: gadget: r8a66597-udc pio to mmio accessor conversion.
  usb: gadget: m66592-udc pio to mmio accessor conversion.
  sh: add romImage MMCIF boot for sh7724 and Ecovec V2
  sh: add boot code to MMCIF driver header
  sh: prepare MMCIF driver header file
  sh: allow romImage data between head.S and the zero page
  sh: Add support MMCIF for ecovec
  sh: remove duplicated #include
  input: serio: disable i8042 for non-cayman sh platforms.
  ...
2010-06-04 15:42:09 -07:00
Linus Torvalds 9a9620db07 Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core
* 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: (83 commits)
  i7core_edac: Better describe the supported devices
  Add support for Westmere to i7core_edac driver
  i7core_edac: don't free on success
  i7core_edac: Add support for X5670
  Always call i7core_[ur]dimm_check_mc_ecc_err
  i7core_edac: fix memory leak of i7core_dev
  EDAC: add __init to i7core_xeon_pci_fixup
  i7core_edac: Fix wrong device id for channel 1 devices
  i7core: add support for Lynnfield alternate address
  i7core_edac: Add initial support for Lynnfield
  i7core_edac: do not export static functions
  edac: fix i7core build
  edac: i7core_edac produces undefined behaviour on 32bit
  i7core_edac: Use a more generic approach for probing PCI devices
  i7core_edac: PCI device is called NONCORE, instead of NOCORE
  i7core_edac: Fix ringbuffer maxsize
  i7core_edac: First store, then increment
  i7core_edac: Better parse "any" addrmask
  i7core_edac: Use a lockless ringbuffer
  edac: Create an unique instance for each kobj
  ...
2010-06-04 15:39:54 -07:00
Linus Torvalds e620d1e39a Merge branch 'v4l_for_2.6.35' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_2.6.35' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (87 commits)
  V4L/DVB: ivtv: Timing tweaks and code re-order to try and improve stability
  V4L/DVB: ivtv: Avoid accidental video standard change
  V4L/DVB: ivtvfb : Module load / unload fixes
  V4L/DVB: cx2341x: Report correct temporal setting for log-status
  V4L/DVB: cx18, cx23885, v4l2 doc, MAINTAINERS: Update Andy Walls' email address
  V4L/DVB: drivers/media: Eliminate a NULL pointer dereference
  V4L/DVB: dvb-core: Fix ULE decapsulation bug
  V4L/DVB: Bug fix: make IR work again for dm1105
  V4L/DVB: media/IR: nec-decoder needs to select BITREV
  V4L/DVB: video/saa7134: change dprintk() to i2cdprintk()
  V4L/DVB: video/saa7134: remove duplicate break
  V4L/DVB: IR/imon: add auto-config for 0xffdc rf device
  V4L/DVB: IR/imon: clean up usage of bools
  V4L/DVB: em28xx: remove unneeded null checks
  V4L/DVB: ngene: remove unused #include <linux/version.h>
  V4L/DVB: ak881x needs slab.h
  V4L/DVB: FusionHDTV: Use quick reads for I2C IR device probing
  V4L/DVB: Technotrend S2-3200 ships with a TT 1500 remote
  V4L/DVB: drivers/media: Use kzalloc
  V4L/DVB: m920x: Select simple tuner
  ...
2010-06-04 15:38:12 -07:00
Linus Torvalds d2dd328b7f Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (27 commits)
  block: make blk_init_free_list and elevator_init idempotent
  block: avoid unconditionally freeing previously allocated request_queue
  pipe: change /proc/sys/fs/pipe-max-pages to byte sized interface
  pipe: change the privilege required for growing a pipe beyond system max
  pipe: adjust minimum pipe size to 1 page
  block: disable preemption before using sched_clock()
  cciss: call BUG() earlier
  Preparing 8.3.8rc2
  drbd: Reduce verbosity
  drbd: use drbd specific ratelimit instead of global printk_ratelimit
  drbd: fix hang on local read errors while disconnected
  drbd: Removed the now empty w_io_error() function
  drbd: removed duplicated #includes
  drbd: improve usage of MSG_MORE
  drbd: need to set socket bufsize early to take effect
  drbd: improve network latency, TCP_QUICKACK
  drbd: Revert "drbd: Create new current UUID as late as possible"
  brd: support discard
  Revert "writeback: fix WB_SYNC_NONE writeback from umount"
  Revert "writeback: ensure that WB_SYNC_NONE writeback with sb pinned is sync"
  ...
2010-06-04 15:37:44 -07:00
Linus Torvalds c1518f12ba Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  gconfig: fix build failure on fedora 13
2010-06-04 15:37:21 -07:00
Linus Torvalds a094c0afc3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (27 commits)
  Staging: sep: return -EFAULT on copy_to_user errors
  Staging: rc2860: return -EFAULT on copy_to_user errors
  Staging: Eliminate a NULL pointer dereference
  staging: Use GFP_ATOMIC when a lock is held
  Staging: comedi - correct parameter gainlkup for DAQCard-6024E in driver ni_mio_cs.c
  Staging: comedi: fixing ni_labpc to mite dependancy
  Staging: wlags49_h2, wlags49_h25: fixed Kconfig dependencies
  Staging: phison: depends on ATA_BMDMA
  Staging: iio-utils: fix memory overflow for dynamically allocateded memory to hold filename
  Staging: adis16255: add proper section markings to hotplug funcs
  Staging: adis16255: fix typo in Kconfig
  Staging: batman-adv: Don't allocate icmp packet with GFP_KERNEL
  Staging: batman-adv: Don't call free_netdev twice
  Staging: batman-adv: Call unregister_netdev on failures to get rtnl lock
  Staging: batman-adv: fix rogue packets on shutdown
  Staging: add MSM framebuffer driver
  Staging: comedi: fixing ni_tio to mite PCI dependancy
  Staging: comedi: fix 8255 and DAS08 Kconfig dependancies.
  Staging: comedi: For COMEDI_BUFINFO, check access to command
  Staging: comedi: COMEDI_BUFINFO with no async - report no bytes read or written
  ...
2010-06-04 15:27:59 -07:00
Linus Torvalds f9196e7c03 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  fix setattr error handling in sysfs, configfs
  kobject: free memory if netlink_kernel_create() fails
  lib/kobject_uevent.c: fix CONIG_NET=n warning
2010-06-04 15:27:27 -07:00
Linus Torvalds bf4282cbcf Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  serial: add support for various Titan PCI cards
  vt_ioctl: return -EFAULT on copy_from_user errors
  serial: altera_uart: Proper section for altera_uart_remove
  tty: fix a little bug in scrup, vt.c
  altera_uart: Simplify altera_uart_console_putc
  altera_uart: Don't take spinlock in already protected functions
  TTY/n_gsm: potential double lock
  serial: bfin_5xx: fix typo in IER check
  serial: bfin_5xx: IRDA is not affected by anomaly 05000230
  serial_cs: add and sort IDs for serial and modem cards
  msm_serial: fix serial on trout
2010-06-04 15:23:07 -07:00
Linus Torvalds d7940b04fa Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: unbind all interfaces before rebinding them
  USB: serial: digi_acceleport: Eliminate a NULL pointer dereference
  usb: fix ehci_hcd build failure when both generic-OF and xilinx is selected
  USB: cdc-acm: fix resource reclaim in error path of acm_probe
  USB: ftdi_sio: fix DTR/RTS line modes
  USB: s3c-hsotg: Ensure FIFOs are fully flushed after layout
  USB: s3c-hsotg: SoftDisconnect minimum 3ms
  USB: s3c-hsotg: Ensure TX FIFO addresses setup when initialising FIFOs
  USB: s3c_hsotg: define USB_GADGET_DUALSPEED in Kconfig
  USB: s3c: Enable soft disconnect during initialization
  USB: xhci: Print NEC firmware version.
  USB: xhci: Wait for host to start running.
  USB: xhci: Wait for controller to be ready after reset.
  USB: isp1362: fix inw warning on Blackfin systems
  USB: mos7840: fix null-pointer dereference
2010-06-04 15:22:31 -07:00
Cory Maccarrone 683eb94777 omap: remove BUG_ON for disabled interrupts
Remove a BUG_ON for when interrupts are disabled during an MMC request.

During boot, interrupts can be disabled when a request is made, causing
this bug to be triggered.  In reality, there's no reason this should halt
the kernel, as the driver has proved reliable in spite of disabled
interrupts, and additionally, there's nothing in this code that would
require interrupts to be enabled.

The only setup I've managed to make it trigger on is on the HTC Herald
during bootup when the driver is built into the kernel (mostly because
that's all I have).  I believe it's related to the fact that on bootup I
get many timeout errors on "CMD5" while initializing the card.  Each CMD5
timeout triggers that bug (I changed it to a WARN_ON to get it to boot in)
due to the fact that part of the timeout code involves sending the request
again.  With interrupts turned off, that BUG would be triggered.

Signed-off-by: Cory Maccarrone <darkstar6262@gmail.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
KOSAKI Motohiro bb21c7ce18 vmscan: fix do_try_to_free_pages() return value when priority==0 reclaim failure
Greg Thelen reported that Johannes's recent stack diet patch makes the
kernel hang.  His test is as follows.

  mount -t cgroup none /cgroups -o memory
  mkdir /cgroups/cg1
  echo $$ > /cgroups/cg1/tasks
  dd bs=1024 count=1024 if=/dev/null of=/data/foo
  echo $$ > /cgroups/tasks
  echo 1 > /cgroups/cg1/memory.force_empty

Actually, this try-hard-before-OOM logic has been broken since the
following two-year-old patch.

	commit a41f24ea9f
	Author: Nishanth Aravamudan <nacc@us.ibm.com>
	Date:   Tue Apr 29 00:58:25 2008 -0700

	    page allocator: smarter retry of costly-order allocations

The original intention was "return success if the system has shrinkable
zones even though priority==0 reclaim failed".  But the above patch changed
this to "return nr_reclaimed if .....", forgetting that nr_reclaimed may be
0 when priority==0 reclaim fails.

And Johannes's patch 0aeb2339e5 ("vmscan: remove all_unreclaimable scan
control") made it worse.  Originally, a priority==0 reclaim failure
on memcg returned 0, but that patch changed it to return 1, which totally
confused memcg.

This patch fixes it completely.

Reported-by: Greg Thelen <gthelen@google.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Greg Thelen <gthelen@google.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
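
The return-value rule that commit bb21c7ce18 above describes can be written as a small decision function. This Python sketch is one reading of the stated intent, not the patch itself, and the parameter names `global_reclaim` and `zones_still_shrinkable` are assumptions.

```python
def reclaim_result(nr_reclaimed, global_reclaim, zones_still_shrinkable):
    """Return value of a reclaim attempt once priority has reached 0."""
    if nr_reclaimed:
        return nr_reclaimed  # real progress was made: report it
    if global_reclaim and zones_still_shrinkable:
        return 1  # nothing freed, but not hopeless: report success, no OOM yet
    return 0  # memcg reclaim, or nothing left to shrink: report failure


global_no_progress = reclaim_result(0, True, True)
memcg_no_progress = reclaim_result(0, False, True)
```

Returning plain `nr_reclaimed` in the first of these two cases would yield 0 and look like a failure, which is the regression the message attributes to the older patch.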
Akinobu Mita 9e506f7adc kernel/: fix BUG_ON checks for cpu notifier callbacks direct call
The commit 80b5184cc5 ("kernel/: convert cpu
notifier to return encapsulate errno value") changed the return value of
cpu notifier callbacks.

Those callbacks don't return NOTIFY_BAD on failures anymore.  But there
are a few callbacks which are called directly at init time, and their
callers check the return value.

I forgot to change the BUG_ON checks in those direct callers in that commit.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
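
Why a check against NOTIFY_BAD stops catching failures once callbacks return an encapsulated errno (commit 9e506f7adc above) can be shown with a Python transcription of the encoding helpers. The constants and helper bodies follow my understanding of include/linux/notifier.h and should be treated as assumptions. The message does not say which check the patch substitutes.

```python
NOTIFY_OK = 0x0001
NOTIFY_STOP_MASK = 0x8000
NOTIFY_BAD = NOTIFY_STOP_MASK | 0x0002
ENOMEM = 12


def notifier_from_errno(err):
    """Pack a negative errno into a notifier return value."""
    if err:
        return NOTIFY_STOP_MASK | (NOTIFY_OK - err)
    return NOTIFY_OK


def notifier_to_errno(ret):
    """Unpack a notifier return value back into 0 or a negative errno."""
    ret &= ~NOTIFY_STOP_MASK
    return NOTIFY_OK - ret if ret > NOTIFY_OK else 0


ret = notifier_from_errno(-ENOMEM)             # what a failing callback returns
old_check_fires = (ret == NOTIFY_BAD)          # stale BUG_ON condition
decoded_check_fires = (notifier_to_errno(ret) != 0)
```

A failing callback now returns 0x800d for -ENOMEM, which is not NOTIFY_BAD, so a direct caller that still compares against NOTIFY_BAD silently treats the failure as success.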
Greg Thelen 94b3dd0f7b cgroups: alloc_css_id() increments hierarchy depth
Child groups should have a greater depth than their parents.  Prior to
this change, the parent would incorrectly report zero memory usage for
child cgroups when use_hierarchy is enabled.

test script:
  mount -t cgroup none /cgroups -o memory
  cd /cgroups
  mkdir cg1

  echo 1 > cg1/memory.use_hierarchy
  mkdir cg1/cg11

  echo $$ > cg1/cg11/tasks
  dd if=/dev/zero of=/tmp/foo bs=1M count=1

  echo
  echo CHILD
  grep cache cg1/cg11/memory.stat

  echo
  echo PARENT
  grep cache cg1/memory.stat

  echo $$ > tasks
  rmdir cg1/cg11 cg1
  cd /
  umount /cgroups

Using fae9c79, a recent patch that changed alloc_css_id() depth computation,
the parent incorrectly reports zero usage:
  root@ubuntu:~# ./test
  1+0 records in
  1+0 records out
  1048576 bytes (1.0 MB) copied, 0.0151844 s, 69.1 MB/s

  CHILD
  cache 1048576
  total_cache 1048576

  PARENT
  cache 0
  total_cache 0

With this patch, the parent correctly includes child usage:
  root@ubuntu:~# ./test
  1+0 records in
  1+0 records out
  1048576 bytes (1.0 MB) copied, 0.0136827 s, 76.6 MB/s

  CHILD
  cache 1052672
  total_cache 1052672

  PARENT
  cache 0
  total_cache 1052672

Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Paul Menage <menage@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: <stable@kernel.org>		[2.6.34.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
Heiko Carstens 007d08678e lib: add s390 to atomic64_dec_if_positive archs
Add s390 to list of architectures that have atomic64_dec_if_positive
implemented so we get rid of this warning:

lib/atomic64_test.c:129:2: warning: #warning Please implement
atomic64_dec_if_positive for your architecture, and add it to the IF above

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Luca Barbieri <luca@luca-barbieri.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
Thadeu Lima de Souza Cascardo b1413357d9 fbdev: fix frame buffer devices menu
Commit f601441916 ("imxfb: add support for
i.MX25") has inserted the symbol HAVE_FB_IMX, which does not depend on FB,
after the menuconfig FB.  This breaks the menu, presenting most of the
drivers outside of it, when using menuconfig.

Moving the symbol to the start of the file, just like HAVE_FB_ATMEL, fixes
the problem without breaking it for iMX25 configurations (tested with
ARCH=arm, no build).

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Baruch Siach <baruch@tkos.co.il>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
Cesar Eduardo Barros fc0ccfceb8 arch/um: fix kunmap_atomic() call in skas/uaccess.c
kunmap_atomic() takes a pointer to within the page, not the struct page.

Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
Oleg Nesterov 485d527686 sys_personality: change sys_personality() to accept "unsigned int" instead of u_long
task_struct->personality is "unsigned int", but sys_personality() paths use
"unsigned long personality".  This means that every assignment or
comparison is not right.  In particular, if this argument does not fit
into "unsigned int", __set_personality() changes the caller's personality
and then sys_personality() returns -EINVAL.

Turn this argument into "unsigned int" and avoid overflows.  Obviously,
this is a user-visible change: we just ignore the upper bits.  But this
can't break any sane application.

There is another thing which can confuse poorly written applications.
User-space thinks that this syscall returns int, not long.  This means
that the returned value can be negative and look like the error code.  But
note that libc won't be confused and thus errno won't be set, and with
this patch the user-space can never get -1 unless sys_personality() really
fails.  And, most importantly, the negative RET != -1 is only possible if
that app previously called personality(RET).

Pointed-out-by: Wenming Zhang <wezhang@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
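
The truncation problem in commit 485d527686 above can be reproduced with a Python model that emulates a 32-bit field and a 64-bit argument using masks. `Task`, `sys_personality_old` and `sys_personality_new` are illustrative names, and the bodies model the behaviour the message describes rather than the kernel source.

```python
EINVAL = 22
U32_MASK = 0xFFFFFFFF


class Task:
    def __init__(self):
        self.personality = 0  # models an "unsigned int" field


def sys_personality_old(task, arg):
    """Pre-patch model: arg is a 64-bit 'unsigned long'."""
    old = task.personality
    if arg != 0xFFFFFFFF:
        task.personality = arg & U32_MASK  # the field keeps only 32 bits
        if task.personality != arg:
            return -EINVAL  # reported as failure, yet already changed
    return old


def sys_personality_new(task, arg):
    """Post-patch model: arg is truncated to 'unsigned int' on entry."""
    arg &= U32_MASK
    old = task.personality
    if arg != 0xFFFFFFFF:
        task.personality = arg
    return old


t_old, t_new = Task(), Task()
ret_old = sys_personality_old(t_old, 0x100000008)
ret_new = sys_personality_new(t_new, 0x100000008)
```

Both models end with personality 8, but only the old one tells the caller the request failed.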
Albert Herranz d6d03f9158 fb_defio: redo fix for non-dirty ptes
As pointed out by Nick Piggin, ->page_mkwrite provides a way to keep a page
locked until the associated PTE is marked dirty.

Re-implement the fix by using this mechanism.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Acked-by: Jaya Kumar <jayakumar.lkml@gmail.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
Albert Herranz 3f505ca457 Revert "fb_defio: fix for non-dirty ptes"
This reverts commit 49bbd815fd ("fb_defio:
fix for non-dirty ptes").

Although the fix provided is correct, it's been suggested to avoid the
underlying race in the same way as it is currently done in filesystems
like NFS, for maintainability.

A following patch "fb_defio: redo fix for non-dirty ptes" will provide
such an alternate fix.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Cc: Jaya Kumar <jayakumar.lkml@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
Mike Frysinger 1da083c9b2 flat: fix unmap len in load error path
The data chunk is mmaped with 'len' which remains unchanged, so use that
when unmapping in the error path rather than trying to recalculate (and
incorrectly so) the value used originally.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: David McCullough <davidm@snapgear.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
Mike Frysinger 2e94de8acb fs/binfmt_flat.c: split the stack & data alignments
The stack and data have different alignment requirements, so don't force
them to wear the same shoe.  Increase the data alignment to match that
which the elf2flt linker script has always been using: 0x20 bytes.  Not
only does this bring the kernel loader in line with the toolchain, but it
also fixes a swath of gcc tests which try to force larger alignment values
but randomly fail when the FLAT loader fails to deliver.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: David McCullough <davidm@snapgear.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Tested-by: Michal Simek <monstr@monstr.eu>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jie Zhang <jie@codesourcery.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:44 -07:00
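
Splitting the stack and data alignments, as commit 2e94de8acb above does, amounts to rounding each address up with its own power-of-two constant. In this Python sketch the 0x20 data alignment comes from the message. `STACK_ALIGN_EXAMPLE`, `align_up` and the address are hypothetical.

```python
FLAT_DATA_ALIGN = 0x20    # data alignment stated in the commit message
STACK_ALIGN_EXAMPLE = 4   # hypothetical stack alignment, for contrast only


def align_up(addr, align):
    """Round addr up to the next multiple of align (align is a power of two)."""
    return (addr + align - 1) & ~(align - 1)


addr = 0x1001
data_start = align_up(addr, FLAT_DATA_ALIGN)
stack_start = align_up(addr, STACK_ALIGN_EXAMPLE)
```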
Dmitry Torokhov 55adaa495e vmware balloon: clamp number of collected non-balloonable pages
Limit number of accumulated non-balloonable pages during inflation cycle,
otherwise there is a chance we will be spinning and growing the list
forever.  This happens during torture tests when balloon target changes
while we are in the middle of inflation cycle and monitor starts refusing
to lock pages (since they are not needed anymore).

Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Acked-by: Bhavesh Davda <bhavesh@vmware.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:44 -07:00
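
The clamp from commit 55adaa495e above can be sketched as a bounded loop. In this Python model `lock_page` stands in for the monitor's decision, and `max_refused` is a hypothetical cap; the message does not give the driver's actual limit.

```python
def inflate(target, lock_page, max_refused):
    """Try to lock `target` pages, giving up after too many refusals."""
    locked, refused = [], []
    page = 0
    while len(locked) < target:
        if lock_page(page):
            locked.append(page)
        else:
            refused.append(page)
            if len(refused) >= max_refused:
                break  # without this clamp the list would grow without bound
        page += 1
    return locked, refused


# The monitor refuses everything, as when the target shrinks mid-inflation.
locked, refused = inflate(10, lambda page: False, 16)
```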
Nick Piggin f76f5d7104 xtensa: invoke oom-killer from page fault
As explained in commit 1c0fe6e3bd ("mm: invoke oom-killer from page
fault"), we want to call the architecture independent oom killer when
getting an unexplained OOM from handle_mm_fault, rather than simply
killing current.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:44 -07:00
Nick Piggin c421b08ef5 mn10300: invoke oom-killer from page fault
As explained in commit 1c0fe6e3bd ("mm: invoke oom-killer from page
fault"), we want to call the architecture independent oom killer when
getting an unexplained OOM from handle_mm_fault, rather than simply
killing current.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:44 -07:00
Nick Piggin 68db30ce60 m32r: invoke oom-killer from page fault
As explained in commit 1c0fe6e3bd ("mm: invoke oom-killer from page
fault"), we want to call the architecture-independent OOM killer when
getting an unexplained OOM from handle_mm_fault, rather than simply
killing current.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:44 -07:00
Nick Piggin f9c497c4ae frv: invoke oom-killer from page fault
As explained in commit 1c0fe6e3bd ("mm: invoke oom-killer from page
fault"), we want to call the architecture-independent OOM killer when
getting an unexplained OOM from handle_mm_fault, rather than simply
killing current.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:44 -07:00
Heiko Carstens b7e5d1f041 ramoops: add HAS_IOMEM dependency
The driver fails to compile on s390:

drivers/char/ramoops.c: In function 'ramoops_init':
drivers/char/ramoops.c:122: error: implicit declaration of function 'ioremap'

Since we won't make use of the driver on s390 anyway, just let it depend
on HAS_IOMEM.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:44 -07:00
Heiko Carstens 7cbe17701a fs/compat_rw_copy_check_uvector: add missing compat_ptr call
A call to access_ok is missing a compat_ptr conversion.  Introduced with
b83733639a ("compat: factor out compat_rw_copy_check_uvector from
compat_do_readv_writev").

fs/compat.c: In function 'compat_rw_copy_check_uvector':
fs/compat.c:629: warning: passing argument 1 of '__access_ok' makes pointer from integer without a cast

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:44 -07:00
Maurus Cuelenaere eaa6e4dd4b rtc: s3c: initialize s3c_rtc_cpu_type before using it
Make sure s3c_rtc_cpu_type is initialised _before_ it's used in an if()
check.

Reported-by: Jiri Pinkava <jiri.pinkava@vscht.cz>
Signed-off-by: Maurus Cuelenaere <mcuelenaere@gmail.com>
Cc: Paul Gortmaker <p_gortmaker@yahoo.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Maurus Cuelenaere <mcuelenaere@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:44 -07:00
Maurus Cuelenaere e893de59a4 rtc: s3c: initialize driver data before using it
s3c_rtc_setfreq() uses the platform driver data to derive struct rtc_device,
so make sure drvdata is set _before_ s3c_rtc_setfreq() is called.

Signed-off-by: Maurus Cuelenaere <mcuelenaere@gmail.com>
Cc: Paul Gortmaker <p_gortmaker@yahoo.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Maurus Cuelenaere <mcuelenaere@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:44 -07:00
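The ordering requirement in the commit above can be shown with a small stand-alone sketch. All names here (fake_pdev, fake_rtc, fake_setfreq, fake_probe) are hypothetical stand-ins, not the s3c driver's code or the kernel's platform-device API. The sketch also returns an error where real code would dereference a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct rtc_device and struct platform_device. */
struct fake_rtc  { int freq; };
struct fake_pdev { void *drvdata; };

static void fake_set_drvdata(struct fake_pdev *pdev, void *data)
{
    pdev->drvdata = data;
}

static void *fake_get_drvdata(const struct fake_pdev *pdev)
{
    return pdev->drvdata;
}

/* Derives the rtc from the driver data, as the commit says setfreq does.
 * Returns 0 on success, -1 if the driver data has not been set yet. */
int fake_setfreq(struct fake_pdev *pdev, int freq)
{
    struct fake_rtc *rtc = fake_get_drvdata(pdev);

    if (rtc == NULL)
        return -1;
    rtc->freq = freq;
    return 0;
}

/* Probe in the corrected order: publish drvdata first, then use it. */
int fake_probe(struct fake_pdev *pdev, struct fake_rtc *rtc)
{
    assert(pdev != NULL && rtc != NULL);

    fake_set_drvdata(pdev, rtc);
    return fake_setfreq(pdev, 1);
}
```

Calling fake_setfreq before fake_set_drvdata fails in this sketch, which mirrors the bug the commit describes; fake_probe succeeds because the driver data is set first.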
Andrew Hendry 01afaf6198 Minix: Clean up left over label
Remove a leftover fail label.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-06-04 17:16:30 -04:00
Nick Piggin af5a30d8cf fix truncate inode time modification breakage
mtime and ctime should be changed only if the file size has actually
changed. Patches changing ext2 and tmpfs from vmtruncate to the new truncate
sequence have caused regressions where they always update timestamps.

There are some strange cases in POSIX where truncate(2) must not update
times unless the size has actually changed, see 6e656be89.

This area is all still rather buggy in different ways in a lot of
filesystems and needs a cleanup and audit (ideally the vfs will provide
a simple attribute or call to direct all filesystems exactly which
attributes to change). But coming up with the best solution will take a
while and is not appropriate for rc anyway.

So fix the recent regression for now.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-06-04 17:16:30 -04:00
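The rule stated in the commit above (update mtime and ctime only when the size really changes) can be sketched in user space as follows. The struct fake_inode and the function fake_truncate are hypothetical; this is not the ext2 or tmpfs code, and it does not model any other POSIX distinctions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified inode: only the fields this sketch needs. */
struct fake_inode {
    long long size;
    long mtime;
    long ctime;
};

/* Truncate to new_size; touch the timestamps only if the size changes.
 * Returns 0 on success, -1 for a negative size. */
int fake_truncate(struct fake_inode *inode, long long new_size, long now)
{
    assert(inode != NULL);

    if (new_size < 0)
        return -1;

    if (new_size != inode->size) {
        inode->size = new_size;
        inode->mtime = now;
        inode->ctime = now;
    }
    /* Same size: leave mtime and ctime untouched. */
    return 0;
}
```

Truncating a 100-byte file to 100 bytes leaves both timestamps alone, while truncating it to 50 bytes updates them.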
Nick Piggin 8718d36cf9 fix setattr error handling in sysfs, configfs
sysfs and configfs setattr functions have error cases after the generic inode's
attributes have been changed. Fix consistency by changing the generic inode
attributes only once the operation is guaranteed to succeed.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-06-04 17:16:29 -04:00
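The pattern described in the commit above (run every step that can fail first, and only then modify the generic inode attributes) might look like the following sketch. The types fake_attrs and fake_node and the helper update_backing are hypothetical and do not correspond to the sysfs or configfs code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute set and node. */
struct fake_attrs { unsigned mode; unsigned uid; };

struct fake_node {
    struct fake_attrs generic;  /* stands in for the generic inode attributes */
    struct fake_attrs backing;  /* filesystem-private copy */
    int backing_fails;          /* test knob: make the fallible step fail */
};

/* The step that can fail (for example, an allocation in real code). */
static int update_backing(struct fake_node *node, const struct fake_attrs *attrs)
{
    if (node->backing_fails)
        return -1;
    node->backing = *attrs;
    return 0;
}

/* Fallible work first; commit to the generic attributes last. */
int fake_setattr(struct fake_node *node, const struct fake_attrs *attrs)
{
    int err;

    assert(node != NULL && attrs != NULL);

    err = update_backing(node, attrs);
    if (err)
        return err;  /* generic attributes are still unchanged */

    node->generic = *attrs;
    return 0;
}
```

If update_backing fails, the caller sees the error and node->generic still holds its old values, so the two copies never disagree.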