Commit Graph

44032 Commits

Author SHA1 Message Date
Paul Parsons 74e32d1b68 mfd: Fix ASIC3 SD Host Controller Configuration size
The size of the TC6380AF SD Host Controller Configuration area is 0x200 bytes (assuming registers are aligned on 32-bit boundaries), not 0x400 bytes. Source: Toshiba TC6380AF Specification sections 4.2 and 4.3.1

Signed-off-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:51 +02:00
Paul Parsons 7d9e7e9fbd leds: Add ASIC3 LED support
Add LED support for the HTC ASIC3. Underlying support is provided by the mfd/asic3 and leds/leds-asic3 drivers. An example configuration is provided by the pxa/hx4700 platform.

Signed-off-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:46 +02:00
Peter Ujfalusi 4a7c00cd94 mfd: Update twl4030-code maintainer e-mail address
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:45 +02:00
Linus Walleij 863dde5bfa mfd: Provide ab8500-core enumerators for chip cuts
Since functionality in MFD cells may need to be adjusted according to
chip revision, let's enumerate them and keep track of them.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:42 +02:00
Graeme Gregory 6523b148b4 mfd: Fix twl6030 irq definitions
The charger fault IRQs from the twl will in future patches be handled
by a seperate IRQ handler in the charger driver than the general charger
IRQ. Give them different IRQ numbers now to allow the charger driver to
be merged in the future.

Signed-off-by: Graeme Gregory <gg@slimlogic.co.uk>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:40 +02:00
Graeme Gregory 521d8ec3f0 mfd: Add phoenix lite (twl6025) support to twl6030
Phoenix Lite is based on the twl6030 family of PMICs. It has mostly the
same feature set of twl6030 but with small changes. The codec block has
also been removed. It also has a new charger block and new features in
its ADC block. VUSB handling also differs.

Signed-off-by: Graeme Gregory <gg@slimlogic.co.uk>
Reviewed-by: Mark Brown <broonie@opensource.wolfsonicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:39 +02:00
Haojian Zhuang 008b30408c mfd: Add rtc support to 88pm860x
Enable rtc function in 88pm860x PMIC.

Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:34 +02:00
Abhijeet Dharmapurikar c013f0a56c mfd: Add pm8xxx irq support
Add support for the irq controller in Qualcomm 8xxx pmic. The 8xxx
interrupt controller provides control for gpio and mpp configured as
interrupts in addition to other subdevice interrupts. The interrupt
controller also provides a way to read the real time status of an
interrupt. This real time status is the only way one can get the
input values of gpio and mpp lines.

Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:28 +02:00
Abhijeet Dharmapurikar cbdb53e1f3 mfd: Add Qualcomm PMIC 8921 core driver
Add support for the Qualcomm PM8921 PMIC chip. The core driver
will communicate with the PMIC chip via the MSM SSBI bus.

Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:27 +02:00
Lesly A M ca972d1338 mfd: TWL5030 version checking in twl-core
Added API to get the TWL5030 Si version from the IDCODE register.
It is used for enabling the workaround for TWL erratum 27.

Signed-off-by: Lesly A M <leslyam@ti.com>
Cc: Nishanth Menon <nm@ti.com>
Cc: David Derrick <dderrick@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:24 +02:00
Lesly A M d7ac829fa3 mfd: Modifying the twl4030-power macro name Main_Ref to all caps
Modifying the macro name Main_Ref to all caps(MAIN_REF).

Suggested by Nishanth Menon <nm@ti.com>

Signed-off-by: Lesly A M <leslyam@ti.com>
Cc: Nishanth Menon <nm@ti.com>
Cc: David Derrick <dderrick@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:23 +02:00
Mark Brown 0b14c22ea1 mfd: Provide platform data for WM831x GPIO configuration
Allow the GPIO mode of WM831x devices to be configured using platform data.
Users may provide a table of GPIO register values in gpio_defaults[]. In
order to allow 0 to be set explicitly out of range values are accepted and
masked off, with a WM831X_GPIO_CONFIGURE define provided to set an out of
range value.

This can be used to configure higher numbered GPIOs or override values set
in OTP for GPIOs configured using OTP.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:18 +02:00
Mark Brown 8997619a04 mfd: Remove compatibility interface for WM831x specific IRQ API
The last user was removed in the merge window.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:17 +02:00
Samuel Ortiz ba279f58c6 mfd: Remove mfd_data
Cell pointers are passed through device->mfd_cell and platform data
is passed through the MFD cell platform_data pointer.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:16 +02:00
Samuel Ortiz 1f235a3785 mfd: Use mfd cell platform_data for ab3550 cells platform bits
With the addition of a platform device mfd_cell pointer, MFD drivers
can go back to passing platform data back to their sub drivers.
This allows for an mfd_cell->mfd_data removal and thus keep the sub drivers
MFD agnostic. This is mostly needed for non MFD aware sub drivers.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:44:57 +02:00
Samuel Ortiz eb8956074e mfd: Add platform data pointer back
Now that we have a way to pass MFD cells down to the sub drivers,
we can gradually get rid of mfd_data by putting the platform pointer
back in place.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:44:56 +02:00
Jozsef Kadlecsik b141c242ff netfilter: ipset: remove unused variable from type_pf_tdel()
Variable 'ret' is set in type_pf_tdel() but not used, remove.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-05-26 19:08:05 +02:00
Jozsef Kadlecsik 249ddc79a3 netfilter: ipset: Use proper timeout value to jiffies conversion
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-05-26 19:07:32 +02:00
Linus Torvalds 35806b4f7c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits)
  jbd2: Add MAINTAINERS entry
  jbd2: fix a potential leak of a journal_head on an error path
  ext4: teach ext4_ext_split to calculate extents efficiently
  ext4: Convert ext4 to new truncate calling convention
  ext4: do not normalize block requests from fallocate()
  ext4: enable "punch hole" functionality
  ext4: add "punch hole" flag to ext4_map_blocks()
  ext4: punch out extents
  ext4: add new function ext4_block_zero_page_range()
  ext4: add flag to ext4_has_free_blocks
  ext4: reserve inodes and feature code for 'quota' feature
  ext4: add support for multiple mount protection
  ext4: ensure f_bfree returned by ext4_statfs() is non-negative
  ext4: protect bb_first_free in ext4_trim_all_free() with group lock
  ext4: only load buddy bitmap in ext4_trim_fs() when it is needed
  jbd2: Fix comment to match the code in jbd2__journal_start()
  ext4: fix waiting and sending of a barrier in ext4_sync_file()
  jbd2: Add function jbd2_trans_will_send_data_barrier()
  jbd2: fix sending of data flush on journal commit
  ext4: fix ext4_ext_fiemap_cb() to handle blocks before request range correctly
  ...
2011-05-26 09:53:20 -07:00
Linus Torvalds 32e51f141f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (25 commits)
  cifs: remove unnecessary dentry_unhash on rmdir/rename_dir
  ocfs2: remove unnecessary dentry_unhash on rmdir/rename_dir
  exofs: remove unnecessary dentry_unhash on rmdir/rename_dir
  nfs: remove unnecessary dentry_unhash on rmdir/rename_dir
  ext2: remove unnecessary dentry_unhash on rmdir/rename_dir
  ext3: remove unnecessary dentry_unhash on rmdir/rename_dir
  ext4: remove unnecessary dentry_unhash on rmdir/rename_dir
  btrfs: remove unnecessary dentry_unhash in rmdir/rename_dir
  ceph: remove unnecessary dentry_unhash calls
  vfs: clean up vfs_rename_other
  vfs: clean up vfs_rename_dir
  vfs: clean up vfs_rmdir
  vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
  libfs: drop unneeded dentry_unhash
  vfs: update dentry_unhash() comment
  vfs: push dentry_unhash on rename_dir into file systems
  vfs: push dentry_unhash on rmdir into file systems
  vfs: remove dget() from dentry_unhash()
  vfs: dentry_unhash immediately prior to rmdir
  vfs: Block mmapped writes while the fs is frozen
  ...
2011-05-26 09:52:14 -07:00
KOSAKI Motohiro ca16d140af mm: don't access vm_flags as 'int'
The type of vma->vm_flags is 'unsigned long'. Neither 'int' nor
'unsigned int'. This patch fixes such misuse.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
[ Changed to use a typedef - we'll extend it to cover more cases
  later, since there has been discussion about making it a 64-bit
  type..                      - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 09:20:31 -07:00
Dan Magenheimer 5bc20fc597 xen: cleancache shim to Xen Transcendent Memory
This patch provides a shim between the kernel-internal cleancache
API (see Documentation/mm/cleancache.txt) and the Xen Transcendent
Memory ABI (see http://oss.oracle.com/projects/tmem).

Xen tmem provides "hypervisor RAM" as an ephemeral page-oriented
pseudo-RAM store for cleancache pages, shared cleancache pages,
and frontswap pages.  Tmem provides enterprise-quality concurrency,
full save/restore and live migration support, compression
and deduplication.

A presentation showing up to 8% faster performance and up to 52%
reduction in sectors read on a kernel compile workload, despite
aggressive in-kernel page reclamation ("self-ballooning") can be
found at:

http://oss.oracle.com/projects/tmem/dist/documentation/presentations/TranscendentMemoryXenSummit2010.pdf

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik Van Riel <riel@redhat.com>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Andreas Dilger <adilger@sun.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Nitin Gupta <ngupta@vflare.org>
2011-05-26 10:02:21 -06:00
Dan Magenheimer 077b1f83a6 mm: cleancache core ops functions and config
This third patch of eight in this cleancache series provides
the core code for cleancache that interfaces between the hooks in
VFS and individual filesystems and a cleancache backend.  It also
includes build and config patches.

Two new files are added: mm/cleancache.c and include/linux/cleancache.h.

Note that CONFIG_CLEANCACHE can default to on; in systems that do
not provide a cleancache backend, all hooks devolve to a simple
check of a global enable flag, so performance impact should
be negligible but can be reduced to zero impact if config'ed off.
However for this first commit, it defaults to off.

Details and a FAQ can be found in Documentation/vm/cleancache.txt

Credits: Cleancache_ops design derived from Jeremy Fitzhardinge
design for tmem

[v8: dan.magenheimer@oracle.com: fix exportfs call affecting btrfs]
[v8: akpm@linux-foundation.org: use static inline function, not macro]
[v7: dan.magenheimer@oracle.com: cleanup sysfs and remove cleancache prefix]
[v6: JBeulich@novell.com: robustly handle buggy fs encode_fh actor definition]
[v5: jeremy@goop.org: clean up global usage and static var names]
[v5: jeremy@goop.org: simplify init hook and any future fs init changes]
[v5: hch@infradead.org: cleaner non-global interface for ops registration]
[v4: adilger@sun.com: interface must support exportfs FS's]
[v4: hch@infradead.org: interface must support 64-bit FS on 32-bit kernel]
[v3: akpm@linux-foundation.org: use one ops struct to avoid pointer hops]
[v3: akpm@linux-foundation.org: document and ensure PageLocked reqts are met]
[v3: ngupta@vflare.org: fix success/fail codes, change funcs to void]
[v2: viro@ZenIV.linux.org.uk: use sane types]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Acked-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Andreas Dilger <adilger@sun.com>
Acked-by: Jan Beulich <JBeulich@novell.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik Van Riel <riel@redhat.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
2011-05-26 10:01:36 -06:00
Dan Magenheimer 9fdfdcf171 fs: add field to superblock to support cleancache
This second patch of eight in this cleancache series adds a field to
the generic superblock to squirrel away a pool identifier that is
dynamically provided by cleancache-enabled filesystems at mount time
to uniquely identify files and pages belonging to this mounted filesystem.

Details and a FAQ can be found in Documentation/vm/cleancache.txt

[v8: trivial merge conflict update]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik Van Riel <riel@redhat.com>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Andreas Dilger <adilger@sun.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Nitin Gupta <ngupta@vflare.org>
2011-05-26 10:01:19 -06:00
Ingo Molnar de66ee979d Merge branch 'linus' into x86/urgent
Merge reason: we want to queue up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-26 13:51:35 +02:00
Jan Kara ea13a86463 vfs: Block mmapped writes while the fs is frozen
We should not allow file modification via mmap while the filesystem is
frozen. So block in block_page_mkwrite() while the filesystem is frozen.
We cannot do the blocking wait in __block_page_mkwrite() since e.g. ext4
will want to call that function with transaction started in some cases
and that would deadlock. But we can at least do the non-blocking reliable
check in __block_page_mkwrite() which is the hardest part anyway.

We have to check for frozen filesystem with the page marked dirty and under
page lock with which we then return from ->page_mkwrite(). Only that way we
cannot race with writeback done by freezing code - either we mark the page
dirty after the writeback has started, see freezing in progress and block, or
writeback will wait for our page lock which is released only when the fault is
done and then writeback will writeout and writeprotect the page again.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26 07:26:45 -04:00
Jan Kara 24da4fab5a vfs: Create __block_page_mkwrite() helper passing error values back
Create __block_page_mkwrite() helper which does all what block_page_mkwrite()
does except that it passes back errors from __block_write_begin /
block_commit_write calls.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-26 07:26:44 -04:00
Will Deacon 7b7bf499f7 ARM: 6913/1: sparsemem: allow pfn_valid to be overridden when using SPARSEMEM
In commit eb33575c ("[ARM] Double check memmap is actually valid with a
memmap has unexpected holes V2"), a new function, memmap_valid_within,
was introduced to mmzone.h so that holes in the memmap which pass
pfn_valid in SPARSEMEM configurations can be detected and avoided.

The fix to this problem checks that the pfn <-> page linkages are
correct by calculating the page for the pfn and then checking that
page_to_pfn on that page returns the original pfn. Unfortunately, in
SPARSEMEM configurations, this results in reading from the page flags to
determine the correct section. Since the memmap here has been freed,
junk is read from memory and the check is no longer robust.

In the best case, reading from /proc/pagetypeinfo will give you the
wrong answer. In the worst case, you get SEGVs, Kernel OOPses and hung
CPUs. Furthermore, ioremap implementations that use pfn_valid to
disallow the remapping of normal memory will break.

This patch allows architectures to provide their own pfn_valid function
instead of using the default implementation used by sparsemem. The
architecture-specific version is aware of the memmap state and will
return false when passed a pfn for a freed page within a valid section.

Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-05-26 10:23:24 +01:00
Milton Miller ce2a40458e Remove unused MSG_ flags in linux/smp.h
Now that powerpc has removed its use of MSG_ALL_BUT_SELF and MSG_ALL
all these MSG_ flags are unused.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:58 +10:00
Steven Rostedt b1cff0ad10 ftrace: Add internal recursive checks
Witold reported a reboot caused by the selftests of the dynamic function
tracer. He sent me a config and I used ktest to do a config_bisect on it
(as my config did not cause the crash). It pointed out that the problem
config was CONFIG_PROVE_RCU.

What happened was that if multiple callbacks are attached to the
function tracer, we iterate a list of callbacks. Because the list is
managed by synchronize_sched() and preempt_disable, the access to the
pointers uses rcu_dereference_raw().

When PROVE_RCU is enabled, the rcu_dereference_raw() calls some
debugging functions, which happen to be traced. The tracing of the debug
function would then call rcu_dereference_raw() which would then call the
debug function and then... well you get the idea.

I first wrote two different patches to solve this bug.

1) add a __rcu_dereference_raw() that would not do any checks.
2) add notrace to the offending debug functions.

Both of these patches worked.

Talking with Paul McKenney on IRC, he suggested to add recursion
detection instead. This seemed to be a better solution, so I decided to
implement it. As the task_struct already has a trace_recursion to detect
recursion in the ring buffer, and that has a very small number it
allows, I decided to use that same variable to add flags that can detect
the recursion inside the infrastructure of the function tracer.

I plan to change it so that the task struct bit can be checked in
mcount, but as that requires changes to all archs, I will hold that off
to the next merge window.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1306348063.1465.116.camel@gandalf.stny.rr.com
Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-25 22:13:49 -04:00
liubo 7f34b746f7 tracing: Update btrfs's tracepoints to use u64 interface
To avoid 64->32 truncating WARNING, update btrfs's tracepoints.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/4DACE6E3.8080200@cn.fujitsu.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-25 22:13:47 -04:00
liubo 2fc1b6f0d0 tracing: Add __print_symbolic_u64 to avoid warnings on 32bit machine
Filesystem, like Btrfs, has some "ULL" macros, and when these macros are passed
to tracepoints'__print_symbolic(), there will be 64->32 truncate WARNINGS during
compiling on 32bit box.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/4DACE6E0.7000507@cn.fujitsu.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-25 22:13:44 -04:00
Linus Torvalds 14d74e0cab Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd:
  net: fix get_net_ns_by_fd for !CONFIG_NET_NS
  ns proc: Return -ENOENT for a nonexistent /proc/self/ns/ entry.
  ns: Declare sys_setns in syscalls.h
  net: Allow setting the network namespace by fd
  ns proc: Add support for the ipc namespace
  ns proc: Add support for the uts namespace
  ns proc: Add support for the network namespace.
  ns: Introduce the setns syscall
  ns: proc files for namespace naming policy.
2011-05-25 18:10:16 -07:00
Hans Petter Selasky 5d8f290c05 [media] Add missing include guard to header file
Signed-off-by: Hans Petter Selasky <hselasky@c2i.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-05-25 21:42:30 -03:00
Matthew Garrett 916f676f8d x86, efi: Retain boot service code until after switching to virtual mode
UEFI stands for "Unified Extensible Firmware Interface", where "Firmware"
is an ancient African word meaning "Why do something right when you can
do it so wrong that children will weep and brave adults will cower before
you", and "UEI" is Celtic for "We missed DOS so we burned it into your
ROMs". The UEFI specification provides for runtime services (ie, another
way for the operating system to be forced to depend on the firmware) and
we rely on these for certain trivial tasks such as setting up the
bootloader. But some hardware fails to work if we attempt to use these
runtime services from physical mode, and so we have to switch into virtual
mode. So far so dreadful.

The specification makes it clear that the operating system is free to do
whatever it wants with boot services code after ExitBootServices() has been
called. SetVirtualAddressMap() can't be called until ExitBootServices() has
been. So, obviously, a whole bunch of EFI implementations call into boot
services code when we do that. Since we've been charmingly naive and
trusted that the specification may be somehow relevant to the real world,
we've already stuffed a picture of a penguin or something in that address
space. And just to make things more entertaining, we've also marked it
non-executable.

This patch allocates the boot services regions during EFI init and makes
sure that they're executable. Then, after SetVirtualAddressMap(), it
discards them and everyone lives happily ever after. Except for the ones
who have to work on EFI, who live sad lives haunted by the knowledge that
someone's eventually going to write yet another firmware specification.

[ hpa: adding this to urgent with a stable tag since it fixes currently-broken
  hardware.  However, I do not know what the dependencies are and so I do
  not know which -stable versions this may be a candidate for. ]

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Link: http://lkml.kernel.org/r/1306331593-28715-1-git-send-email-mjg@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: <stable@kernel.org>
2011-05-25 17:03:53 -07:00
Linus Torvalds 3f5785ec31 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (89 commits)
  bonding: documentation and code cleanup for resend_igmp
  bonding: prevent deadlock on slave store with alb mode (v3)
  net: hold rtnl again in dump callbacks
  Add Fujitsu 1000base-SX PCI ID to tg3
  bnx2x: protect sequence increment with mutex
  sch_sfq: fix peek() implementation
  isdn: netjet - blacklist Digium TDM400P
  via-velocity: don't annotate MAC registers as packed
  xen: netfront: hold RTNL when updating features.
  sctp: fix memory leak of the ASCONF queue when free asoc
  net: make dev_disable_lro use physical device if passed a vlan dev (v2)
  net: move is_vlan_dev into public header file (v2)
  bug.h: Fix build with CONFIG_PRINTK disabled.
  wireless: fix fatal kernel-doc error + warning in mac80211.h
  wireless: fix cfg80211.h new kernel-doc warnings
  iwlagn: dbg_fixed_rate only used when CONFIG_MAC80211_DEBUGFS enabled
  dst: catch uninitialized metrics
  be2net: hash key for rss-config cmd not set
  bridge: initialize fake_rtable metrics
  net: fix __dst_destroy_metrics_generic()
  ...

Fix up trivial conflicts in drivers/staging/brcm80211/brcmfmac/wl_cfg80211.c
2011-05-25 17:00:17 -07:00
Steven Rostedt f29c50419c maccess,probe_kernel: Make write/read src const void *
The functions probe_kernel_write() and probe_kernel_read() do not modify
the src pointer. Allow const pointers to be passed in without the need
of a typecast.

Acked-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1305824936.1465.4.camel@gandalf.stny.rr.com
2011-05-25 19:56:23 -04:00
Linus Torvalds 8c1c77ff9b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (75 commits)
  mmc: core: eMMC bus width may not work on all platforms
  mmc: sdhci: Auto-CMD23 fixes.
  mmc: sdhci: Auto-CMD23 support.
  mmc: core: Block CMD23 support for UHS104/SDXC cards.
  mmc: sdhci: Implement MMC_CAP_CMD23 for SDHCI.
  mmc: core: Use CMD23 for multiblock transfers when we can.
  mmc: quirks: Add/remove quirks conditional support.
  mmc: Add new VUB300 USB-to-SD/SDIO/MMC driver
  mmc: sdhci-pxa: Add quirks for DMA/ADMA to match h/w
  mmc: core: duplicated trial with same freq in mmc_rescan_try_freq()
  mmc: core: add support for eMMC Dual Data Rate
  mmc: core: eMMC signal voltage does not use CMD11
  mmc: sdhci-pxa: add platform code for UHS signaling
  mmc: sdhci: add hooks for setting UHS in platform specific code
  mmc: core: clear MMC_PM_KEEP_POWER flag on resume
  mmc: dw_mmc: fixed wrong regulator_enable in suspend/resume
  mmc: sdhi: allow powering down controller with no card inserted
  mmc: tmio: runtime suspend the controller, where possible
  mmc: sdhi: support up to 3 interrupt sources
  mmc: sdhi: print physical base address and clock rate
  ...
2011-05-25 16:55:55 -07:00
Linus Torvalds 0c63e38a12 Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: New driver for the SMSC EMC6W201
  hwmon: (abituguru) Depend on DMI
  hwmon: (it87) Use request_muxed_region
  hwmon: (sch5627) Trigger Vbat measurements
  hwmon: (sch5627) Add sch5627_send_cmd function
  i8k: Integrate with the hwmon subsystem
  hwmon: (max6650) Properly support the MAX6650
  hwmon: (max6650) Drop device detection
  Move ACPI power meter driver to hwmon
  hwmon: (f71882fg) Add support for F71808A
  hwmon: (f71882fg) Split has_beep in fan_has_beep and temp_has_beep
  hwmon: (asc7621) Drop duplicate dependency
  hwmon: (jc42) Change detection class
  hwmon: Add driver for AMD family 15h processor power information
  hwmon: (k10temp) Add support for Fam15h (Bulldozer)
  hwmon: Use helper functions to set and get driver data
  i8k: Avoid lahf in 64-bit code
2011-05-25 16:52:50 -07:00
Linus Torvalds 0798b1dbfb Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (26 commits)
  arch/tile: prefer "tilepro" as the name of the 32-bit architecture
  compat: include aio_abi.h for aio_context_t
  arch/tile: cleanups for tilegx compat mode
  arch/tile: allocate PCI IRQs later in boot
  arch/tile: support signal "exception-trace" hook
  arch/tile: use better definitions of xchg() and cmpxchg()
  include/linux/compat.h: coding-style fixes
  tile: add an RTC driver for the Tilera hypervisor
  arch/tile: finish enabling support for TILE-Gx 64-bit chip
  compat: fixes to allow working with tile arch
  arch/tile: update defconfig file to something more useful
  tile: do_hardwall_trap: do not play with task->sighand
  tile: replace mm->cpu_vm_mask with mm_cpumask()
  tile,mn10300: add device parameter to dma_cache_sync()
  audit: support the "standard" <asm-generic/unistd.h>
  arch/tile: clarify flush_buffer()/finv_buffer() function names
  arch/tile: kernel-related cleanups from removing static page size
  arch/tile: various header improvements for building drivers
  arch/tile: disable GX prefetcher during cache flush
  arch/tile: tolerate disabling CONFIG_BLK_DEV_INITRD
  ...
2011-05-25 15:35:32 -07:00
Linus Torvalds ad363e0916 Merge branch 'for-torvalds' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson
* 'for-torvalds' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson:
  mach-ux500: voltage domain regulators for DB8500
  cpufreq: make DB8500 cpufreq driver compile
  cpufreq: update DB8500 cpufreq driver
  mach-ux500: move CPUfreq driver to cpufreq subsystem
  mfd: add DB5500 PRCMU driver
  mfd: update DB8500 PRCMU driver
  mach-ux500: move the DB8500 PRCMU driver to MFD
  mach-ux500: make PRCMU base address dynamic
  mach-ux500: rename PRCMU driver per SoC
  mach-ux500: update ASIC version detection
  mach-ux500: update SoC and board IRQ handling
  mach-ux500: update the DB5500 register file
  mach-ux500: update the DB8500 register file
2011-05-25 15:35:03 -07:00
Neil Horman 6dcbbe25dc net: move is_vlan_dev into public header file (v2)
Migrate is_vlan_dev() to if_vlan.h so that core networkig can use it

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: davem@davemloft.net
CC: bhutchings@solarflare.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-25 17:55:23 -04:00
Andrei Warkentin 8edf63710b mmc: sdhci: Auto-CMD23 support.
Enables Auto-CMD23 support where available (SDHCI 3.0 controllers)

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Tested-by: Arindam Nath <arindam.nath@amd.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-25 16:51:40 -04:00
Andrei Warkentin f0d89972b0 mmc: core: Block CMD23 support for UHS104/SDXC cards.
SD cards operating at UHS104 or better support SET_BLOCK_COUNT.

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Reviewed-by: Arindam Nath <arindam.nath@amd.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-25 16:49:03 -04:00
Andrei Warkentin e89d456fcd mmc: sdhci: Implement MMC_CAP_CMD23 for SDHCI.
Implements support for multiblock transfers bounded
by SET_BLOCK_COUNT (CMD23).

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-25 16:49:00 -04:00
Andrei Warkentin d0c97cfb81 mmc: core: Use CMD23 for multiblock transfers when we can.
CMD23-prefixed instead of open-ended multiblock transfers
have a performance advantage on some MMC cards.

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-25 16:48:46 -04:00
Nir Muchtar 753f618ae0 RDMA/cma: Add support for netlink statistics export
Add callbacks and data types for statistics export of all current
devices/ids.  The schema for RDMA CM is a series of netlink messages.
Each one contains an rdma_cm_stat struct.  Additionally, two netlink
attributes are created for the addresses for each message (if
applicable).

Their types used are:
RDMA_NL_RDMA_CM_ATTR_SRC_ADDR (The source address for this ID)
RDMA_NL_RDMA_CM_ATTR_DST_ADDR (The destination address for this ID)
sockaddr_* structs are encapsulated within these attributes.

In other words, every transaction contains a series of messages like:

-------message 1-------
struct rdma_cm_id_stats {
       __u32 qp_num;
       __u32 bound_dev_if;
       __u32 port_space;
       __s32 pid;
       __u8 cm_state;
       __u8 node_type;
       __u8 port_num;
       __u8 reserved;
}
RDMA_NL_RDMA_CM_ATTR_SRC_ADDR attribute - contains the source address
RDMA_NL_RDMA_CM_ATTR_DST_ADDR attribute - contains the destination address
-------end 1-------
-------message 2-------
struct rdma_cm_id_stats
RDMA_NL_RDMA_CM_ATTR_SRC_ADDR attribute
RDMA_NL_RDMA_CM_ATTR_DST_ADDR attribute
-------end 2-------

Signed-off-by: Nir Muchtar <nirm@voltaire.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25 13:46:23 -07:00
Sean Hefty b26f9b9949 RDMA/cma: Pass QP type into rdma_create_id()
The RDMA CM currently infers the QP type from the port space selected
by the user.  In the future (eg with RDMA_PS_IB or XRC), there may not
be a 1-1 correspondence between port space and QP type.  For netlink
export of RDMA CM state, we want to export the QP type to userspace,
so it is cleaner to explicitly associate a QP type to an ID.

Modify rdma_create_id() to allow the user to specify the QP type, and
use it to make our selections of datagram versus connected mode.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25 13:46:23 -07:00
Roland Dreier 9a7147b506 RDMA: Update exported headers list
Various RDMA headers are intended to be exported to userspace, so add
them to the headers-y list.  Add a (strictly speaking, superfluous)
include of <linux/types.h> to avoid a headers_check warning.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25 13:46:23 -07:00
Nir Muchtar 550e5ca77e RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h>
Move cma.c's internal definition of enum cma_state to enum rdma_cm_state
in an exported header so that it can be exported via RDMA netlink.

Signed-off-by: Nir Muchtar <nirm@voltaire.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25 13:46:22 -07:00
Linus Torvalds 57bb559574 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits)
  ceph: fix cap flush race reentrancy
  libceph: subscribe to osdmap when cluster is full
  libceph: handle new osdmap down/state change encoding
  rbd: handle online resize of underlying rbd image
  ceph: avoid inode lookup on nfs fh reconnect
  ceph: use LOOKUPINO to make unconnected nfs fh more reliable
  rbd: use snprintf for disk->disk_name
  rbd: cleanup: make kfree match kmalloc
  rbd: warn on update_snaps failure on notify
  ceph: check return value for start_request in writepages
  ceph: remove useless check
  libceph: add missing breaks in addr_set_port
  libceph: fix TAG_WAIT case
  ceph: fix broken comparison in readdir loop
  libceph: fix osdmap timestamp assignment
  ceph: fix rare potential cap leak
  libceph: use snprintf for unknown addrs
  libceph: use snprintf for formatting object name
  ceph: use snprintf for dirstat content
  libceph: fix uninitialized value when no get_authorizer method is set
  ...
2011-05-25 11:46:31 -07:00
Jean Delvare 774466add7 hwmon: (jc42) Change detection class
While the JC42-compatible chips are temperature sensors, I2C_CLASS_SPD
makes more sense because these chips always live on memory modules.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
2011-05-25 20:43:32 +02:00
David S. Miller 22e95ac87d Merge branch 'for-davem' of ssh://master.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-05-25 13:28:55 -04:00
Linus Torvalds 2a651c7f8d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: update Documentation pointers
  net/9p: enable 9p to work in non-default network namespace
  net/9p: p9_idpool_get return -1 on error
  fs/9p: Don't clunk dentry fid when we fail to get a writeback inode
  9p: Small cleanup in <net/9p/9p.h>
  9p: remove experimental tag from tested configurations
  9p: typo fixes and minor cleanups
  net/9p: Change linuxdoc names to match functions.
2011-05-25 09:21:56 -07:00
Linus Torvalds 929cfdd5d3 Merge branch 'for-2.6.40/drivers' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.40/drivers' of git://git.kernel.dk/linux-2.6-block: (110 commits)
  loop: handle on-demand devices correctly
  loop: limit 'max_part' module param to DISK_MAX_PARTS
  drbd: fix warning
  drbd: fix warning
  drbd: Fix spelling
  drbd: fix schedule in atomic
  drbd: Take a more conservative approach when deciding max_bio_size
  drbd: Fixed state transitions after async outdate-peer-handler returned
  drbd: Disallow the peer_disk_state to be D_OUTDATED while connected
  drbd: Fix for the connection problems on high latency links
  drbd: fix potential activity log refcount imbalance in error path
  drbd: Only downgrade the disk state in case of disk failures
  drbd: fix disconnect/reconnect loop, if ping-timeout == ping-int
  drbd: fix potential distributed deadlock
  lru_cache.h: fix comments referring to ts_ instead of lc_
  drbd: Fix for application IO with the on-io-error=pass-on policy
  xen/p2m: Add EXPORT_SYMBOL_GPL to the M2P override functions.
  xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override.
  xen/blkback: don't fail empty barrier requests
  xen/blkback: fix xenbus_transaction_start() hang caused by double xenbus_transaction_end()
  ...
2011-05-25 09:15:35 -07:00
Linus Torvalds 798ce8f1cc Merge branch 'for-2.6.40/core' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.40/core' of git://git.kernel.dk/linux-2.6-block: (40 commits)
  cfq-iosched: free cic_index if cfqd allocation fails
  cfq-iosched: remove unused 'group_changed' in cfq_service_tree_add()
  cfq-iosched: reduce bit operations in cfq_choose_req()
  cfq-iosched: algebraic simplification in cfq_prio_to_maxrq()
  blk-cgroup: Initialize ioc->cgroup_changed at ioc creation time
  block: move bd_set_size() above rescan_partitions() in __blkdev_get()
  block: call elv_bio_merged() when merged
  cfq-iosched: Make IO merge related stats per cpu
  cfq-iosched: Fix a memory leak of per cpu stats for root group
  backing-dev: Kill set but not used var in  bdi_debug_stats_show()
  block: get rid of on-stack plugging debug checks
  blk-throttle: Make no throttling rule group processing lockless
  blk-cgroup: Make cgroup stat reset path blkg->lock free for dispatch stats
  blk-cgroup: Make 64bit per cpu stats safe on 32bit arch
  blk-throttle: Make dispatch stats per cpu
  blk-throttle: Free up a group only after one rcu grace period
  blk-throttle: Use helper function to add root throtl group to lists
  blk-throttle: Introduce a helper function to fill in device details
  blk-throttle: Dynamically allocate root group
  blk-cgroup: Allow sleeping while dynamically allocating a group
  ...
2011-05-25 09:14:07 -07:00
Linus Torvalds 22e12bbc9b Merge branch 'timers-ptp-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-ptp-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ptp: Fix dp83640 build warning when building statically
  ptp: Added a clock driver for the National Semiconductor PHYTER.
  ptp: Added a clock driver for the IXP46x.
  ptp: Added a clock that uses the eTSEC found on the MPC85xx.
  ptp: Added a brand new class driver for ptp clocks.
2011-05-25 08:59:42 -07:00
Linus Torvalds 19426a8f81 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  posix-timers: RCU conversion
2011-05-25 08:58:50 -07:00
Linus Torvalds 0f1493a601 Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6: (126 commits)
  sh_mobile_meram: Safely disable MERAM operation when not initialized
  video: mb862xxfb: add support for L1 displaying
  video: mb862xx: add support for controller's I2C bus adapter
  video: mb862xxfb: relocate register space to get contiguous vram
  video: mb862xxfb: use pre-initialized configuration for PCI GDCs
  video: mb862xxfb: correct fix.smem_len field initialization
  video: s3c-fb: correct transparency checking in 32bpp
  video: s3c-fb: add gpio setup function to resume function
  fbdev/amifb: Remove superfluous alignment of frame buffer memory
  fbdev/amifb: Do not call panic() if there's not enough Chip RAM
  fbdev/amifb: Correct check for video memory size
  video: mb862xxfb: Require either FB_MB862XX_PCI_GDC or FB_MB862XX_LIME
  video: s3c-fb: add window variant information for S5P
  video: s3c-fb: add additional validate bpps
  video: s3c-fb: correct window osd size offset values
  udlfb: include prefetch.h explicitly
  drivers/video/s3c2410fb.c: Convert release_resource to release_mem_region
  drivers/video/sm501fb.c: Convert release_resource to release_mem_region
  drivers/video: Convert release_resource to release_mem_region
  video, udlfb: Fix two build warnings about 'ignoring return value'
  ...
2011-05-25 08:42:37 -07:00
Shaohua Li c84598bbfa percpu_counter: change return value and add comments
The percpu_counter_*_positive() API in UP case doesn't check if return
value is positive.  Add comments to explain why we don't.  Also if count <
0, returns 0 instead of 1 for *read_positive().

[akpm@linux-foundation.org: tweak comment]
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:54 -07:00
Jean-Christophe PLAGNIOL-VILLARD 3c8f370ded lib/genalloc.c: add support for specifying the physical address
So we can specify the virtual address as the base of the pool chunk and
then get physical addresses for hardware IP.

For example on at91 we will use this on spi, uart or macb

Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Patrice VILCHEZ <patrice.vilchez@atmel.com>
Cc: Jes Sorensen <jes@wildopensource.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:54 -07:00
Jean-Christophe PLAGNIOL-VILLARD 6aae6e0304 include/linux/genalloc.h: add multiple-inclusion guards
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Patrice VILCHEZ <patrice.vilchez@atmel.com>
Cc: Jes Sorensen <jes@wildopensource.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:53 -07:00
Alexey Dobriyan c196e32a11 lib: add kstrto*_from_user()
There is quite a lot of code which does copy_from_user() + strict_strto*()
or simple_strto*() combo in slightly different ways.

Before doing conversions all over tree, let's get final API correct.

Enter kstrtoull_from_user() and friends.

Typical code which uses them looks very simple:

	TYPE val;
	int rv;

	rv = kstrtoTYPE_from_user(buf, count, 0, &val);
	if (rv < 0)
		return rv;
	[use val]
	return count;

There is a tiny semantic difference from the plain kstrto*() API -- the
latter allows any amount of leading zeroes, while the former copies data
into buffer on stack and thus allows leading zeroes as long as it fits
into buffer.

This shouldn't be a problem for typical usecase "echo 42 > /proc/x".

The point is to make reading one integer from userspace _very_ simple and
very bug free.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:52 -07:00
Uwe Kleine-König 4440673a95 leds: provide helper to register "leds-gpio" devices
This function makes a deep copy of the platform data to allow it to live
in init memory.  For a kernel that supports several machines and so
includes the definition for several leds-gpio devices this saves quite
some memory because all but one definition can be free'd after boot.

As the function is used by arch code it must be builtin and so cannot go
into leds-gpio.c.

[akpm@linux-foundation.org: s/CONFIG_LED_REGISTER_GPIO/CONFIG_LEDS_REGISTER_GPIO/]
Signed-off-by: Uwe Kleine-König  <u.kleine-koenig@pengutronix.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Acked-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: H Hartley Sweeten <hartleys@visionengravers.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:51 -07:00
Joachim Eastwood 3c1ab50d0a drivers/leds/leds-pca9532.c: add gpio capability
Allow unused leds on pca9532 to be used as gpio.  The board I am working
on now has no less than 6 pca9532 chips.  One chips is used for only leds,
one has 14 leds and 2 gpio and the rest of the chips are gpio only.

There is also one board in mainline which could use this capabilty;
arch/arm/mach-iop32x/n2100.c
 232         {       .type = PCA9532_TYPE_NONE }, /* power OFF gpio */
 233         {       .type = PCA9532_TYPE_NONE }, /* reset gpio */

This patch defines a new pin type, PCA9532_TYPE_GPIO, and registers a
gpiochip if any pin has this type set.  The gpio will registers all chip
pins but will filter on gpio_request.

[randy.dunlap@oracle.com: fix build when GPIOLIB is not enabled]
Signed-off-by: Joachim Eastwood <joachim.eastwood@jotron.com>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Reviewed-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Jan Weitzel <j.weitzel@phytec.de>
Cc: Juergen Kilb <j.kilb@phytec.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:50 -07:00
Mike Travis 162a7e7500 printk: allocate kernel log buffer earlier
On larger systems, because of the numerous ACPI, Bootmem and EFI messages,
the static log buffer overflows before the larger one specified by the
log_buf_len param is allocated.  Minimize the overflow by allocating the
new log buffer as soon as possible.

On kernels without memblock, a later call to setup_log_buf from
kernel/init.c is the fallback.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix CONFIG_PRINTK=n build]
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:48 -07:00
Yinghai Lu 95dde50190 memblock: add error return when CONFIG_HAVE_MEMBLOCK is not set
On larger systems, information in the kernel log is lost because there is
so much early text printed, that it overflows the static log buffer before
the log_buf_len kernel parameter can be processed, and a bigger log buffer
allocated.

Distros are relunctant to increase memory usage by increasing the size of
the static log buffer, so minimize the problem by allocating the new log
buffer as early as possible.

This patch:

Add an error return if CONFIG_HAVE_MEMBLOCK is not set instead of having
to add #ifdef CONFIG_HAVE_MEMBLOCK around blocks of code calling that
function.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:48 -07:00
KOSAKI Motohiro 746a2a838d sparse: Undef __compiletime_{warning,error} if __CHECKER__ is defined
sparse can't parse warning and error attribute.  then they should be
hidden from sparse.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:47 -07:00
KOSAKI Motohiro 5bd7e6a30e sparse: define __must_be_array() for __CHECKER__
commit c5e631cf65 ("ARRAY_SIZE: check for type") added __must_be_array().
But sparse can't parse this gcc extention.

Now make C=2 makes following sparse errors a lot.

  kernel/futex.c:2699:25: error: No right hand side of '+'-expression

Because __must_be_array() is used for ARRAY_SIZE() macro and it is
used very widely.

This patch fixes it.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:46 -07:00
KOSAKI Motohiro 903c0c7cdc sparse: define dummy BUILD_BUG_ON definition for sparse
BUILD_BUG_ON() causes a syntax error to detect coding errors.  So it
causes sparse to detect an error too.  This reduces sparse's usefulness.

This patch makes a dummy BUILD_BUG_ON() definition for sparse.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:46 -07:00
Eric Paris 1dbe39424a xattr.h: expose string defines to userspace
af4f136056 ("security: move LSM xattrnames to xattr.h") moved the
XATTR_CAPS_SUFFIX define from capability.h to xattr.h.  This makes sense
except it was previously exports to userspace but xattr.h does not export
it to userspace.  This patch exports these headers to userspace to fix the
ABI regression.

There is some slight possibility that this will cause problems in other
applications which used these #defines differently (wrongly) and I could
JUST export the capabilities xattr name that we broke.  Does anyonehave an
idea how exposing these headers could cause a problem?

Below is what is being exposed to userspace, included here since it isn't
clear exactly what is going to be made available from the patch.

/* Namespaces */
#define XATTR_OS2_PREFIX "os2."
#define XATTR_OS2_PREFIX_LEN (sizeof (XATTR_OS2_PREFIX) - 1)

#define XATTR_SECURITY_PREFIX   "security."
#define XATTR_SECURITY_PREFIX_LEN (sizeof (XATTR_SECURITY_PREFIX) - 1)

#define XATTR_SYSTEM_PREFIX "system."
#define XATTR_SYSTEM_PREFIX_LEN (sizeof (XATTR_SYSTEM_PREFIX) - 1)

#define XATTR_TRUSTED_PREFIX "trusted."
#define XATTR_TRUSTED_PREFIX_LEN (sizeof (XATTR_TRUSTED_PREFIX) - 1)

#define XATTR_USER_PREFIX "user."
#define XATTR_USER_PREFIX_LEN (sizeof (XATTR_USER_PREFIX) - 1)

/* Security namespace */
#define XATTR_SELINUX_SUFFIX "selinux"
#define XATTR_NAME_SELINUX XATTR_SECURITY_PREFIX XATTR_SELINUX_SUFFIX

#define XATTR_SMACK_SUFFIX "SMACK64"
#define XATTR_SMACK_IPIN "SMACK64IPIN"
#define XATTR_SMACK_IPOUT "SMACK64IPOUT"
#define XATTR_NAME_SMACK XATTR_SECURITY_PREFIX XATTR_SMACK_SUFFIX
#define XATTR_NAME_SMACKIPIN    XATTR_SECURITY_PREFIX XATTR_SMACK_IPIN
#define XATTR_NAME_SMACKIPOUT   XATTR_SECURITY_PREFIX XATTR_SMACK_IPOUT

#define XATTR_CAPS_SUFFIX "capability"
#define XATTR_NAME_CAPS XATTR_SECURITY_PREFIX XATTR_CAPS_SUFFIX

Reported-by: Ozan Çaglayan <ozan@pardus.org.tr>
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: Mimi Zohar <zohar@us.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:45 -07:00
Mike Travis 4b060420a5 bitmap, irq: add smp_affinity_list interface to /proc/irq
Manually adjusting the smp_affinity for IRQ's becomes unwieldy when the
cpu count is large.

Setting smp affinity to cpus 256 to 263 would be:

	echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity

instead of:

	echo 256-263 > smp_affinity_list

Think about what it looks like for cpus around say, 4088 to 4095.

We already have many alternate "list" interfaces:

/sys/devices/system/cpu/cpuX/indexY/shared_cpu_list
/sys/devices/system/cpu/cpuX/topology/thread_siblings_list
/sys/devices/system/cpu/cpuX/topology/core_siblings_list
/sys/devices/system/node/nodeX/cpulist
/sys/devices/pci***/***/local_cpulist

Add a companion interface, smp_affinity_list to use cpu lists instead of
cpu maps.  This conforms to other companion interfaces where both a map
and a list interface exists.

This required adding a bitmap_parselist_user() function in a manner
similar to the bitmap_parse_user() function.

[akpm@linux-foundation.org: make __bitmap_parselist() static]
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:45 -07:00
Amerigo Wang e50c1f609c fscache: remove dead code under CONFIG_WORKQUEUE_DEBUGFS
There is no CONFIG_WORKQUEUE_DEBUGFS any more, so this code is dead.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:44 -07:00
Wanlong Gao 21b7d815b2 include/linux/c2port.h: remove wrong and never used macros
The macro to_class_dev() uses the deprecated structure class_device, and
the c2port_device has no member named class in the definition of the macro
to_c2port_device.

Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
Cc: Rodolfo Giometti <giometti@linux.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:43 -07:00
Tim Gardner 0ac1ee0bfe ulimit: raise default hard ulimit on number of files to 4096
Apps are increasingly using more than 1024 file descriptors.  See
discussion in several distro bug trackers, e.g.  BugLink:
http://bugs.launchpad.net/bugs/663090
https://issues.rpath.com/browse/RPL-2054

You don't want to raise the default soft limit, since that might break
apps that use select(), but it's safe to raise the default hard limit;
that way, apps that know they need lots of file descriptors can raise
their soft limit without needing root, and without user intervention.

Ubuntu is doing this with a kernel change because they have a policy of
not changing kernel defaults in userland.

While 4096 might not be enough for *all* apps, it seems to be plenty for
the apps I've seen lately that are unhappy with 1024.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Cc: Dan Kegel <dank@kegel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:43 -07:00
Mike Frysinger f68aa5b445 asm-generic/cacheflush.h: flush icache when copying to user pages
The copy_to_user_page() function is supposed to flush the icache on the
memory that was written, but the current asm-generic version lacks that
logic.  While normally it isn't a big deal as the asm-generic version of
icache flushing is a stub, it is a deal for ports that want to use the
asm-generic version as a baseline and then overlay its own specific parts
(like icache flushing).

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:37 -07:00
Daniel Kiper a539f3533b mm: add SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macro
Add SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macro which aligns given
pfn to upper section and lower section boundary accordingly.

Required for the latest memory hotplug support for the Xen balloon driver.

Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:36 -07:00
Stephen Wilson f2beb79836 proc: make struct proc_maps_private truly private
Now that mm/mempolicy.c is no longer implementing /proc/pid/numa_maps
there is no need to export struct proc_maps_private to the world.  Move it
to fs/proc/internal.h instead.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:35 -07:00
Stephen Wilson 13057efb0a mm: declare mpol_to_str() when CONFIG_TMPFS=n
When CONFIG_TMPFS=n mpol_to_str() is not declared in mempolicy.h.
However, in the NUMA case, the definition is always compiled.

Since it is not strictly true that tmpfs is the only client, and since the
symbol was always lurking around anyways, export mpol_to_str()
unconditionally.  Furthermore, this will allow us to move show_numa_map()
out of mempolicy.c and into the procfs subsystem.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:34 -07:00
Stephen Wilson d98f6cb67f mm: export get_vma_policy()
In commit 48fce3429d ("mempolicies: unexport get_vma_policy()")
get_vma_policy() was marked static as all clients were local to
mempolicy.c.

However, the decision to generate /proc/pid/numa_maps in the numa memory
policy code and outside the procfs subsystem introduces an artificial
interdependency between the two systems.  Exporting get_vma_policy() once
again is the first step to clean up this interdependency.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:32 -07:00
Hugh Dickins c856507f2b mm: remove last trace of shmem_get_unmapped_area
Remove noMMU declaration of shmem_get_unmapped_area() from mm.h: it fell
out of use in 2.6.21 and ceased to exist in 2.6.29.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:31 -07:00
Eric Paris b09e0fa4b4 tmpfs: implement generic xattr support
Implement generic xattrs for tmpfs filesystems.  The Feodra project, while
trying to replace suid apps with file capabilities, realized that tmpfs,
which is used on the build systems, does not support file capabilities and
thus cannot be used to build packages which use file capabilities.  Xattrs
are also needed for overlayfs.

The xattr interface is a bit odd.  If a filesystem does not implement any
{get,set,list}xattr functions the VFS will call into some random LSM hooks
and the running LSM can then implement some method for handling xattrs.
SELinux for example provides a method to support security.selinux but no
other security.* xattrs.

As it stands today when one enables CONFIG_TMPFS_POSIX_ACL tmpfs will have
xattr handler routines specifically to handle acls.  Because of this tmpfs
would loose the VFS/LSM helpers to support the running LSM.  To make up
for that tmpfs had stub functions that did nothing but call into the LSM
hooks which implement the helpers.

This new patch does not use the LSM fallback functions and instead just
implements a native get/set/list xattr feature for the full security.* and
trusted.* namespace like a normal filesystem.  This means that tmpfs can
now support both security.selinux and security.capability, which was not
previously possible.

The basic implementation is that I attach a:

struct shmem_xattr {
	struct list_head list; /* anchored by shmem_inode_info->xattr_list */
	char *name;
	size_t size;
	char value[0];
};

Into the struct shmem_inode_info for each xattr that is set.  This
implementation could easily support the user.* namespace as well, except
some care needs to be taken to prevent large amounts of unswappable memory
being allocated for unprivileged users.

[mszeredi@suse.cz: new config option, suport trusted.*, support symlinks]
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Tested-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Hugh Dickins <hughd@google.com>
Tested-by: Jordi Pujol <jordipujolp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:31 -07:00
Yinghai Lu 8bba154ef2 memblock/nobootmem: allow alloc_bootmem() to take 0 as low limit
The bootmem wrapper with memblock supports top-down now, so we do not need
to set the low limit to __pa(MAX_DMA_ADDRESS).

The logic should be: good to allocate above __pa(MAX_DMA_ADDRESS), but it
is ok if we can not find memory above 16M on system that has a small
amount of RAM.

Signed-off-by: Yinghai LU <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Olaf Hering <olaf@aepfle.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:30 -07:00
Matt Fleming 172703b08c mm: delete non-atomic mm counter implementation
The problem with having two different types of counters is that developers
adding new code need to keep in mind whether it's safe to use both the
atomic and non-atomic implementations.  For example, when adding new
callers of the *_mm_counter() functions a developer needs to ensure that
those paths are always executed with page_table_lock held, in case we're
using the non-atomic implementation of mm counters.

Hugh Dickins introduced the atomic mm counters in commit f412ac08c9
("[PATCH] mm: fix rss and mmlist locking").  When asked why he left the
non-atomic counters around he said,

  | The only reason was to avoid adding costly atomic operations into a
  | configuration that had no need for them there: the page_table_lock
  | sufficed.
  |
  | Certainly it would be simpler just to delete the non-atomic variant.
  |
  | And I think it's fair to say that any configuration on which we're
  | measuring performance to that degree (rather than "does it boot fast?"
  | type measurements), would already be going the split ptlocks route.

Removing the non-atomic counters eases the maintenance burden because
developers no longer have to mindful of the two implementations when using
*_mm_counter().

Note that all architectures provide a means of atomically updating
atomic_long_t variables, even if they have to revert to the generic
spinlock implementation because they don't support 64-bit atomic
instructions (see lib/atomic64.c).

Signed-off-by: Matt Fleming <matt.fleming@linux.intel.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:30 -07:00
Daniel Kiper ba93fa81b5 mm: do not define PFN_SECTION_SHIFT if !CONFIG_SPARSEMEM
Do not define PFN_SECTION_SHIFT if !CONFIG_SPARSEMEM.

Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:29 -07:00
Daniel Kiper e3c40f379a mm: pfn_to_section_nr()/section_nr_to_pfn() is valid only in CONFIG_SPARSEMEM context
pfn_to_section_nr()/section_nr_to_pfn() is valid only in CONFIG_SPARSEMEM
context.  Move it to proper place.

Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:29 -07:00
Daniel Kiper bf4e8902ee mm: enable set_page_section() only if CONFIG_SPARSEMEM and !CONFIG_SPARSEMEM_VMEMMAP
set_page_section() is meaningful only in CONFIG_SPARSEMEM and
!CONFIG_SPARSEMEM_VMEMMAP context.  Move it to proper place and amend
accordingly functions which are using it.

Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:28 -07:00
Ying Han 1495f230fa vmscan: change shrinker API by passing shrink_control struct
Change each shrinker's API by consolidating the existing parameters into
shrink_control struct.  This will simplify any further features added w/o
touching each file of shrinker.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: fix warning]
[kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
[akpm@linux-foundation.org: fix xfs warning]
[akpm@linux-foundation.org: update gfs2]
Signed-off-by: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:26 -07:00
Ying Han a09ed5e000 vmscan: change shrink_slab() interfaces by passing shrink_control
Consolidate the existing parameters to shrink_slab() into a new
shrink_control struct.  This is needed later to pass the same struct to
shrinkers.

Signed-off-by: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:25 -07:00
Wu Fengguang 7b1de5868b readahead: readahead page allocations are OK to fail
Pass __GFP_NORETRY|__GFP_NOWARN for readahead page allocations.

readahead page allocations are completely optional.  They are OK to fail
and in particular shall not trigger OOM on themselves.

Reported-by: Dave Young <hidave.darkstar@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:25 -07:00
Dave Hansen a238ab5b02 mm: break out page allocation warning code
This originally started as a simple patch to give vmalloc() some more
verbose output on failure on top of the plain page allocator messages.
Johannes suggested that it might be nicer to lead with the vmalloc() info
_before_ the page allocator messages.

But, I do think there's a lot of value in what __alloc_pages_slowpath()
does with its filtering and so forth.

This patch creates a new function which other allocators can call instead
of relying on the internal page allocator warnings.  It also gives this
function private rate-limiting which separates it from other
printk_ratelimit() users.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:21 -07:00
KOSAKI Motohiro de03c72cfc mm: convert mm->cpu_vm_cpumask into cpumask_var_t
cpumask_t is very big struct and cpu_vm_mask is placed wrong position.
It might lead to reduce cache hit ratio.

This patch has two change.
1) Move the place of cpumask into last of mm_struct. Because usually cpumask
   is accessed only front bits when the system has cpu-hotplug capability
2) Convert cpu_vm_mask into cpumask_var_t. It may help to reduce memory
   footprint if cpumask_size() will use nr_cpumask_bits properly in future.

In addition, this patch change the name of cpu_vm_mask with cpu_vm_mask_var.
It may help to detect out of tree cpu_vm_mask users.

This patch has no functional change.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:21 -07:00
Peter Zijlstra 9547d01bfb mm: uninline large generic tlb.h functions
Some of these functions have grown beyond inline sanity, move them
out-of-line.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Requested-by: Andrew Morton <akpm@linux-foundation.org>
Requested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:20 -07:00
Peter Zijlstra 2b575eb64f mm: convert anon_vma->lock to a mutex
Straightforward conversion of anon_vma->lock to a mutex.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:19 -07:00
Peter Zijlstra 25aeeb046e mm: revert page_lock_anon_vma() lock annotation
Its beyond ugly and gets in the way.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:18 -07:00
Peter Zijlstra 3d48ae45e7 mm: Convert i_mmap_lock to a mutex
Straightforward conversion of i_mmap_lock to a mutex.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:18 -07:00
Peter Zijlstra 97a894136f mm: Remove i_mmap_lock lockbreak
Hugh says:
 "The only significant loser, I think, would be page reclaim (when
  concurrent with truncation): could spin for a long time waiting for
  the i_mmap_mutex it expects would soon be dropped? "

Counter points:
 - cpu contention makes the spin stop (need_resched())
 - zap pages should be freeing pages at a higher rate than reclaim
   ever can

I think the simplification of the truncate code is definitely worth it.

Effectively reverts: 2aa15890f3 ("mm: prevent concurrent
unmap_mapping_range() on the same inode") and takes out the code that
caused its problem.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:17 -07:00
Peter Zijlstra e4c70a6629 lockdep, mutex: provide mutex_lock_nest_lock
In order to convert i_mmap_lock to a mutex we need a mutex equivalent to
spin_lock_nest_lock(), thus provide the mutex_lock_nest_lock() annotation.

As with spin_lock_nest_lock(), mutex_lock_nest_lock() allows annotation of
the locking pattern where an outer lock serializes the acquisition order
of nested locks.  That is, if every time you lock multiple locks A, say A1
and A2 you first acquire N, the order of acquiring A1 and A2 is
irrelevant.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:17 -07:00
Peter Zijlstra e303297e6c mm: extended batches for generic mmu_gather
Instead of using a single batch (the small on-stack, or an allocated
page), try and extend the batch every time it runs out and only flush once
either the extend fails or we're done.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Requested-by: Nick Piggin <npiggin@kernel.dk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:16 -07:00
Peter Zijlstra 2672391169 mm, powerpc: move the RCU page-table freeing into generic code
In case other architectures require RCU freed page-tables to implement
gup_fast() and software filled hashes and similar things, provide the
means to do so by moving the logic into generic code.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Requested-by: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:16 -07:00
Peter Zijlstra d16dfc550f mm: mmu_gather rework
Rework the existing mmu_gather infrastructure.

The direct purpose of these patches was to allow preemptible mmu_gather,
but even without that I think these patches provide an improvement to the
status quo.

The first 9 patches rework the mmu_gather infrastructure.  For review
purpose I've split them into generic and per-arch patches with the last of
those a generic cleanup.

The next patch provides generic RCU page-table freeing, and the followup
is a patch converting s390 to use this.  I've also got 4 patches from
DaveM lined up (not included in this series) that uses this to implement
gup_fast() for sparc64.

Then there is one patch that extends the generic mmu_gather batching.

After that follow the mm preemptibility patches, these make part of the mm
a lot more preemptible.  It converts i_mmap_lock and anon_vma->lock to
mutexes which together with the mmu_gather rework makes mmu_gather
preemptible as well.

Making i_mmap_lock a mutex also enables a clean-up of the truncate code.

This also allows for preemptible mmu_notifiers, something that XPMEM I
think wants.

Furthermore, it removes the new and universially detested unmap_mutex.

This patch:

Remove the first obstacle towards a fully preemptible mmu_gather.

The current scheme assumes mmu_gather is always done with preemption
disabled and uses per-cpu storage for the page batches.  Change this to
try and allocate a page for batching and in case of failure, use a small
on-stack array to make some progress.

Preemptible mmu_gather is desired in general and usable once i_mmap_lock
becomes a mutex.  Doing it before the mutex conversion saves us from
having to rework the code by moving the mmu_gather bits inside the
pte_lock.

Also avoid flushing the tlb batches from under the pte lock, this is
useful even without the i_mmap_lock conversion as it significantly reduces
pte lock hold times.

[akpm@linux-foundation.org: fix comment tpyo]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:12 -07:00
Michal Hocko d05f3169c0 mm: make expand_downwards() symmetrical with expand_upwards()
Currently we have expand_upwards exported while expand_downwards is
accessible only via expand_stack or expand_stack_downwards.

check_stack_guard_page is a nice example of the asymmetry.  It uses
expand_stack for VM_GROWSDOWN while expand_upwards is called for
VM_GROWSUP case.

Let's clean this up by exporting both functions and make those names
consistent.  Let's use expand_{upwards,downwards} because expanding
doesn't always involve stack manipulation (an example is
ia64_do_page_fault which uses expand_upwards for registers backing store
expansion).  expand_downwards has to be defined for both
CONFIG_STACK_GROWS{UP,DOWN} because get_arg_page calls the downwards
version in the early process initialization phase for growsup
configuration.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:12 -07:00
Dave Hansen 82d4b5779a include/linux/gfp.h: convert BUG_ON() into VM_BUG_ON()
VM_BUG_ON() if effectively a BUG_ON() undef #ifdef CONFIG_DEBUG_VM.  That
is exactly what we have here now, and two different folks have suggested
doing it this way.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:11 -07:00
Dave Hansen 15fa8f4255 include/linux/gfp.h: work around apparent sparse confusion
Running sparse on page_alloc.c today, it errors out:
        include/linux/gfp.h:254:17: error: bad constant expression
        include/linux/gfp.h:254:17: error: cannot size expression

which is a line in gfp_zone():

        BUILD_BUG_ON((GFP_ZONE_BAD >> bit) & 1);

That's really unfortunate, because it ends up hiding all of the other
legitimate sparse messages like this:
        mm/page_alloc.c:5315:59: warning: incorrect type in argument 1 (different base types)
        mm/page_alloc.c:5315:59:    expected unsigned long [unsigned] [usertype] size
        mm/page_alloc.c:5315:59:    got restricted gfp_t [usertype] <noident>
...

Having sparse be able to catch these very oopsable bugs is a lot more
important than keeping a BUILD_BUG_ON().  Kill the BUILD_BUG_ON().

Compiles on x86_64 with and without CONFIG_DEBUG_VM=y.  defconfig boots
fine for me.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:11 -07:00
David Rientjes 72788c3856 oom: replace PF_OOM_ORIGIN with toggling oom_score_adj
There's a kernel-wide shortage of per-process flags, so it's always
helpful to trim one when possible without incurring a significant penalty.
 It's even more important when you're planning on adding a per- process
flag yourself, which I plan to do shortly for transparent hugepages.

PF_OOM_ORIGIN is used by ksm and swapoff to prefer current since it has a
tendency to allocate large amounts of memory and should be preferred for
killing over other tasks.  We'd rather immediately kill the task making
the errant syscall rather than penalizing an innocent task.

This patch removes PF_OOM_ORIGIN since its behavior is equivalent to
setting the process's oom_score_adj to OOM_SCORE_ADJ_MAX.

The process's old oom_score_adj is stored and then set to
OOM_SCORE_ADJ_MAX during the time it used to have PF_OOM_ORIGIN.  The old
value is then reinstated when the process should no longer be considered a
high priority for oom killing.

Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:10 -07:00
KOSAKI Motohiro a6cccdc36c mm, mem-hotplug: update pcp->stat_threshold when memory hotplug occur
Currently, cpu hotplug updates pcp->stat_threshold, but memory hotplug
doesn't.  There is no reason for this.

[akpm@linux-foundation.org: fix CONFIG_SMP=n build]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:09 -07:00
KOSAKI Motohiro 1b79acc911 mm, mem-hotplug: recalculate lowmem_reserve when memory hotplug occurs
Currently, memory hotplug calls setup_per_zone_wmarks() and
calculate_zone_inactive_ratio(), but doesn't call
setup_per_zone_lowmem_reserve().

It means the number of reserved pages aren't updated even if memory hot
plug occur.  This patch fixes it.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:09 -07:00
KOSAKI Motohiro 37b23e0525 x86,mm: make pagefault killable
When an oom killing occurs, almost all processes are getting stuck at the
following two points.

	1) __alloc_pages_nodemask
	2) __lock_page_or_retry

1) is not very problematic because TIF_MEMDIE leads to an allocation
failure and getting out from page allocator.

2) is more problematic.  In an OOM situation, zones typically don't have
page cache at all and memory starvation might lead to greatly reduced IO
performance.  When a fork bomb occurs, TIF_MEMDIE tasks don't die quickly,
meaning that a fork bomb may create new process quickly rather than the
oom-killer killing it.  Then, the system may become livelocked.

This patch makes the pagefault interruptible by SIGKILL.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:08 -07:00
KOSAKI Motohiro f62e00cc3a mm: introduce wait_on_page_locked_killable()
commit 2687a356 ("Add lock_page_killable") introduced killable
lock_page().  Similarly this patch introdues killable
wait_on_page_locked().

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:08 -07:00
KOSAKI Motohiro fa25c503df mm: per-node vmstat: show proper vmstats
commit 2ac390370a ("writeback: add
/sys/devices/system/node/<node>/vmstat") added vmstat entry.  But
strangely it only show nr_written and nr_dirtied.

        # cat /sys/devices/system/node/node20/vmstat
        nr_written 0
        nr_dirtied 0

Of course, It's not adequate.  With this patch, the vmstat show all vm
stastics as /proc/vmstat.

        # cat /sys/devices/system/node/node0/vmstat
	nr_free_pages 899224
	nr_inactive_anon 201
	nr_active_anon 17380
	nr_inactive_file 31572
	nr_active_file 28277
	nr_unevictable 0
	nr_mlock 0
	nr_anon_pages 17321
	nr_mapped 8640
	nr_file_pages 60107
	nr_dirty 33
	nr_writeback 0
	nr_slab_reclaimable 6850
	nr_slab_unreclaimable 7604
	nr_page_table_pages 3105
	nr_kernel_stack 175
	nr_unstable 0
	nr_bounce 0
	nr_vmscan_write 0
	nr_writeback_temp 0
	nr_isolated_anon 0
	nr_isolated_file 0
	nr_shmem 260
	nr_dirtied 1050
	nr_written 938
	numa_hit 962872
	numa_miss 0
	numa_foreign 0
	numa_interleave 8617
	numa_local 962872
	numa_other 0
	nr_anon_transparent_hugepages 0

[akpm@linux-foundation.org: no externs in .c files]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:07 -07:00
David Rientjes 7bf02ea22c arch, mm: filter disallowed nodes from arch specific show_mem functions
Architectures that implement their own show_mem() function did not pass
the filter argument to show_free_areas() to appropriately avoid emitting
the state of nodes that are disallowed in the current context.  This patch
now passes the filter argument to show_free_areas() so those nodes are now
avoided.

This patch also removes the show_free_areas() wrapper around
__show_free_areas() and converts existing callers to pass an empty filter.

ia64 emits additional information for each node, so skip_free_areas_zone()
must be made global to filter disallowed nodes and it is converted to use
a nid argument rather than a zone for this use case.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Helge Deller <deller@gmx.de>
Cc: James Bottomley <jejb@parisc-linux.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:03 -07:00
Sasha Levin 08bb3a5076 9p: Small cleanup in <net/9p/9p.h>
There are two small cleanups in this patch:
 - p9_errstr2errno was declared twice - remove one declaration.
 - A uint8_t type was mixed in, change it to u8 to match
with the rest of the type names and remove dependency.

Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-05-25 08:46:38 -05:00
Rob Landley aca0076336 9p: typo fixes and minor cleanups
Typo fixes and minor cleanups for v9fs

Signed-off-by: Rob Landley <rob@landley.net>
Reviewed-by: Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-05-25 08:46:37 -05:00
Vinod Koul f2889fee8c Merge branch 'next' into for-linus 2011-05-25 18:34:07 +05:30
Viresh Kumar aecb7b64dd dmaengine/dw_dmac: Update maintainer-ship
Nobody is currently maintaining dw_dmac. We are using dw_dmac for SPEAr13xx and
are currently maintaining it. After discussing with Vinod, sending this patch to
update maintainer-ship of dw_dmac.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Havard Skinnemoen <hskinnemoen@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-05-25 18:30:37 +05:30
HeungJun, Kim bc125106f8 [media] Add support for M-5MOLS 8 Mega Pixel camera ISP
Add I2C/V4L2 subdev driver for M-5MOLS integrated image signal processor
with 8 Mega Pixel sensor.

Signed-off-by: HeungJun, Kim <riverful.kim@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-05-25 07:51:18 -03:00
Andrei Warkentin c59de92879 mmc: quirks: Add/remove quirks conditional support.
Conditional add/remove quirks for MMC and SD.

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:54:01 -04:00
Philip Rakity 4c4cb17105 mmc: core: add support for eMMC Dual Data Rate
eMMC voltage change not required for 1.8V.  3.3V and 1.8V vcc
are capable of doing DDR. vccq of 1.8v is not required.

Signed-off-by: Philip Rakity <prakity@marvell.com>
Reviewed-by: Arindam Nath <arindam.nath@amd.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:58 -04:00
Guennadi Liakhovetski 2595880481 mmc: sdhi: allow powering down controller with no card inserted
Supply a link to TMIO private data for platforms to implement their
own card detection.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:55 -04:00
Guennadi Liakhovetski 7311bef069 mmc: tmio: runtime suspend the controller, where possible
The TMIO MMC controller cannot be powered off to save power, when no
card is plugged in, because then it will not be able to detect a new
card-insertion event. On some implementations, however, it is
possible to switch to using another source to detect card insertion.
This patch adds support for such implementations.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:55 -04:00
Stefan Nilsson XK 06e8935feb mmc: sdio: optimized SDIO IRQ handling for single irq
If there is only 1 function interrupt registered it is possible to
improve performance by directly calling the irq handler and avoiding
the overhead of reading the CCCR registers.

Signed-off-by: Per Forlin <per.forlin@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@stericsson.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:50 -04:00
Arindam Nath cf2b5eea1e mmc: sdhci: add support for retuning mode 1
Host Controller v3.00 can support retuning modes 1,2 or 3 depending on
the bits 46-47 of the Capabilities register. Also, the timer count for
retuning is indicated by bits 40-43 of the same register. We initialize
timer_list for retuning the first time we execute tuning procedure. This
condition is indicated by SDHCI_NEEDS_RETUNING not being set. Since
retuning mode 1 sets a limit of 4MB on the maximum data length, we set
max_blk_count appropriately. Once the tuning timer expires, we set
SDHCI_NEEDS_RETUNING flag, and if the flag is set, we execute tuning
procedure before sending the next command. We need to restore mmc_request
structure after executing retuning procedure since host->mrq is used
inside the procedure to send CMD19. We also disable and re-enable this
flag during suspend and resume respectively, as per the spec v3.00.

Tested by Zhangfei Gao with a Toshiba uhs card and general hs card,
on mmp2 in SDMA mode.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:48 -04:00
Arindam Nath c3ed387762 mmc: sdhci: add support for programmable clock mode
Host Controller v3.00 supports programmable clock mode as an optional
feature. The support for this mode is indicated by non-zero value in
bits 48-55 of the Capabilities register. If supported, the actual
value of Clock Multiplier is one more than the value provided in the
bit fields. We only set Clock Generator Select (bit 5) and SDCLK
Frequency Select (bits 8-15) of the Clock Control register in case
Preset Value Enable is not set, otherwise these fields are automatically
set by the Host Controller based on the UHS mode selected. Also, since
the maximum and minimum clock frequency in this mode can be
(Base Clock * Clock Mul) and (Base Clock * Clock Mul)/1024 respectively,
f_max and f_min have been recalculated to reflect this change.

Tested by Zhangfei Gao with a Toshiba uhs card and general hs card,
on mmp2 in SDMA mode.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:48 -04:00
Arindam Nath 4d55c5a13a mmc: sdhci: enable preset value after uhs initialization
According to the Host Controller spec v3.00, setting Preset Value Enable
in the Host Control2 register lets SDCLK Frequency Select, Clock Generator
Select and Driver Strength Select to be set automatically by the Host
Controller based on the UHS-I mode set. This patch enables this feature.
Since Preset Value Enable makes sense only for UHS-I cards, we enable this
feature after successfull UHS-I initialization. We also reset Preset Value
Enable next time before initialization.

Tested by Zhangfei Gao with a Toshiba uhs card and general hs card,
on mmp2 in SDMA mode.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:47 -04:00
Arindam Nath b513ea250e mmc: sd: add support for tuning during uhs initialization
Host Controller needs tuning during initialization to operate SDR50
and SDR104 UHS-I cards. Whether SDR50 mode actually needs tuning is
indicated by bit 45 of the Host Controller Capabilities register.
A new command CMD19 has been defined in the Physical Layer spec
v3.01 to request the card to send tuning pattern.

We enable Buffer Read Ready interrupt at the very begining of tuning
procedure, because that is the only interrupt generated by the Host
Controller during tuning. We program the block size to 64 in the
Block Size register. We make sure that DMA Enable and Multi Block
Select in the Transfer Mode register are set to 0 before actually
sending CMD19. The tuning block is sent by the card to the Host
Controller using DAT lines, so we set Data Present Select (bit 5) in
the Command register. The Host Controller is responsible for doing
the verfication of tuning block sent by the card at the hardware
level. After sending CMD19, we wait for Buffer Read Ready interrupt.
In case we don't receive an interrupt after the specified timeout
value, we fall back on fixed sampling clock by setting Execute
Tuning (bit 6) and Sampling Clock Select (bit 7) of Host Control2
register to 0. Before exiting the tuning procedure, we disable Buffer
Read Ready interrupt and re-enable other interrupts.

Tested by Zhangfei Gao with a Toshiba uhs card and general hs card,
on mmp2 in SDMA mode.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:46 -04:00
Arindam Nath 3a30351143 mmc: sd: report correct speed and capacity of uhs cards
Since only UHS-I cards respond with S18A set in response to ACMD41,
we set the card as ultra-high-speed after successfull initialization.
We need to decide whether a card is SDXC based on the C_SIZE field
of CSDv2.0 register. According to Physical Layer spec v3.01, the
minimum value of C_SIZE for SDXC card is 00FFFFh.

Tested by Zhangfei Gao with a Toshiba uhs card and general hs card,
on mmp2 in SDMA mode.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:46 -04:00
Arindam Nath 5371c927bc mmc: sd: set current limit for uhs cards
We decide on the current limit to be set for the card based on the
Capability of Host Controller to provide current at 1.8V signalling,
and the maximum current limit of the card as indicated by CMD6
mode 0. We then set the current limit for the card using CMD6 mode 1.
As per the Physical Layer Spec v3.01, the current limit switch is
only applicable for SDR50, SDR104, and DDR50 bus speed modes. For
other UHS-I modes, we set the default current limit of 200mA.

Tested by Zhangfei Gao with a Toshiba uhs card and general hs card,
on mmp2 in SDMA mode.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:45 -04:00
Arindam Nath 49c468fcf8 mmc: sd: add support for uhs bus speed mode selection
This patch adds support for setting UHS-I bus speed mode during UHS-I
initialization procedure. Since both the host and card can support
more than one bus speed, we select the highest speed based on both of
their capabilities. First we set the bus speed mode for the card using
CMD6 mode 1, and then we program the host controller to support the
required speed mode. We also set High Speed Enable in case one of the
UHS-I modes is selected. We take care to reset SD clock before setting
UHS mode in the Host Control2 register, and then re-enable it as per
the Host Controller spec v3.00. We then set the clock frequency for
the UHS-I mode selected.

Tested by Zhangfei Gao with a Toshiba uhs card and general hs card,
on mmp2 in SDMA mode.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:45 -04:00
Arindam Nath d6d50a15a2 mmc: sd: add support for driver type selection
This patch adds support for setting driver strength during UHS-I
initialization procedure. Since UHS-I cards set S18A (bit 24) in
response to ACMD41, we use this as a base for UHS-I initialization.
We modify the parameter list of mmc_sd_get_cid() so that we can
save the ROCR from ACMD41 to check whether bit 24 is set.

We decide whether the Host Controller supports A, C, or D driver
type depending on the Capabilities register. Driver type B is
suported by default. We then set the appropriate driver type for
the card using CMD6 mode 1. As per Host Controller spec v3.00, we
set driver type for the host only if Preset Value Enable in the
Host Control2 register is not set. SDHCI_HOST_CONTROL has been
renamed to SDHCI_HOST_CONTROL1 to conform to the spec.

Tested by Zhangfei Gao with a Toshiba uhs card and general hs card,
on mmp2 in SDMA mode.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 23:53:24 -04:00
Jamie Iles 6a8a98b22b mtd: kill CONFIG_MTD_PARTITIONS
Now that none of the drivers use CONFIG_MTD_PARTITIONS we can remove
it from Kconfig and the last remaining uses.

Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-05-25 02:25:35 +01:00
Jamie Iles eea72d5fdf mtd: remove add_mtd_partitions, add_mtd_device and friends
These symbols are replaced with mtd_device_register() (and removal with
mtd_device_unregister()) for public registration.

Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-05-25 02:25:16 +01:00
Jamie Iles 984e6d8ec5 mtd: physmap: convert to mtd_device_register()
Convert to mtd_device_register() and remove the CONFIG_MTD_PARTITIONS
preprocessor conditionals as partitioning is always available.

Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-05-25 02:15:37 +01:00
Jamie Iles 11b73c8b10 mtd: provide of_mtd_parse_partitions for !CONFIG_MTD_OF_PARTS
If we don't have OpenFirmware enabled then provide a stub
of_mtd_parse_partitions that returns no partitions so drivers don't need
ifdeffery inside.

Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-05-25 02:12:51 +01:00
Jamie Iles f5671ab3f6 mtd: introduce mtd_device_(un)register()
To prepare for the removal of add_mtd_device and add_mtd_partitions(),
introduce mtd_device_register().  This will create partitions if they
are supplied or register the whole device if there are no partitions.

Once all drivers are converted to use mtd_device_register(),
add_mtd_device() and add_mtd_partitions() will be made internal only.

v2: move kerneldoc to implementation file and fixup some kerneldoc
warnings.

Artem: tweak comments: remove junk tabs, use dots consistently.

Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-05-25 02:12:36 +01:00
Arindam Nath 013909c4ff mmc: sd: query function modes for uhs cards
SD cards which conform to Physical Layer Spec v3.01 can support
additional Bus Speed Modes, Driver Strength, and Current Limit
other than the default values. We use CMD6 mode 0 to read these
additional card functions. The values read here will be used
during UHS-I initialization steps.

Tested by Zhangfei Gao with a Toshiba uhs card and general hs card,
on mmp2 in SDMA mode.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:04:40 -04:00
Arindam Nath f2119df6b7 mmc: sd: add support for signal voltage switch procedure
Host Controller v3.00 adds another Capabilities register. Apart
from other things, this new register indicates whether the Host
Controller supports SDR50, SDR104, and DDR50 UHS-I modes. The spec
doesn't mention about explicit support for SDR12 and SDR25 UHS-I
modes, so the Host Controller v3.00 should support them by default.
Also if the controller supports SDR104 mode, it will also support
SDR50 mode as well. So depending on the host support, we set the
corresponding MMC_CAP_* flags. One more new register. Host Control2
is added in v3.00, which is used during Signal Voltage Switch
procedure described below.

Since as per v3.00 spec, UHS-I supported hosts should set S18R
to 1, we set S18R (bit 24) of OCR before sending ACMD41. We also
need to set XPC (bit 28) of OCR in case the host can supply >150mA.
This support is indicated by the Maximum Current Capabilities
register of the Host Controller.

If the response of ACMD41 has both CCS and S18A set, we start the
signal voltage switch procedure, which if successfull, will switch
the card from 3.3V signalling to 1.8V signalling. Signal voltage
switch procedure adds support for a new command CMD11 in the
Physical Layer Spec v3.01. As part of this procedure, we need to
set 1.8V Signalling Enable (bit 3) of Host Control2 register, which
if remains set after 5ms, means the switch to 1.8V signalling is
successfull. Otherwise, we clear bit 24 of OCR and retry the
initialization sequence. When we remove the card, and insert the
same or another card, we need to make sure that we start with 3.3V
signalling voltage. So we call mmc_set_signal_voltage() with
MMC_SIGNAL_VOLTAGE_330 set so that we are back to 3.3V signalling
voltage before we actually start initializing the card.

Tested by Zhangfei Gao with a Toshiba uhs card and general hs card,
on mmp2 in SDMA mode.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:04:38 -04:00
John Calixto cb87ea28ed mmc: core: Add mmc CMD+ACMD passthrough ioctl
Allows appropriately-privileged applications to send CMD (normal) and ACMD
(application-specific; preceded with CMD55) commands to cards/devices on
the mmc bus.  This is primarily useful for enabling the security
functionality built in to every SD card.

It can also be used as a generic passthrough (e.g. to enable virtual
machines to control mmc bus devices directly).  However, this use case has
not been tested rigorously.  Generic passthrough testing was only conducted
for a few non-security opcodes to prove the feasibility of the passthrough.

Since any opcode can be sent using this passthrough, it is very possible to
render the card/device unusable.  Applications that use this ioctl must
have CAP_SYS_RAWIO.

Security commands tested on TI PCIxx12 (SDHCI), Sigma Designs SMP8652 SoC,
TI OMAP3621/OMAP3630 SoC, Samsung S5PC110 SoC, Qualcomm MSM7200A SoC.

Signed-off-by: John Calixto <john.calixto@modsystems.com>
Reviewed-by: Andrei Warkentin <andreiw@motorola.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:02:54 -04:00
Takashi Iwai 82b0e23a29 mmc: sdhci: Fix read-only detection with JMicron 388 chip
On HP laptops with JMicron 388 chip, the write-locked SD card isn't
detected correctly as read-only in many cases.  This is because the
PRESENT_STATE register becomes unsable just after plugging, and it
returns the WRITE_PROTECT bit wrongly at the first read.

This patch fixes the read-only detection by adding a new sdhci quirk
indicating to check the register more intensively with a relatively
long delay.

The patch is tested with 2.6.39-rc4 kernel.

Cc: Aries Lee <arieslee@jmicron.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:02:42 -04:00
Andrei Warkentin 6a7a6b45f4 mmc: quirks: Fix erase/trim for certain SanDisk cards.
CMD38 argument is passed through EXT_CSD[113].

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:01:34 -04:00
Andrei Warkentin 371a689f64 mmc: MMC boot partitions support.
Allows device MMC boot partitions to be accessed. MMC partitions are
treated effectively as separate block devices on the same MMC card.

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:01:21 -04:00
Andrei Warkentin d3a8d95dcb mmc: core: Allow setting CMD timeout for CMD6 (SWITCH).
CMD6 is an R1B-type command, where DAT is used as busy. Depending
on register written using CMD6, timeout value can be different
as per spec.

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:01:13 -04:00
Andrei Warkentin eaa02f751f mmc: core: Rename erase_timeout to cmd_timeout_ms.
Renames erase_timeout to cmd_timeout_ms inside struct mmc_command.
First step to making host honor timeouts for non-data-transfer
commands. Cleans up erase timeout code.

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:01:05 -04:00
Randy Dunlap 853c6cac0d mmc: quirks: fix truncation warnings
Fix data truncation warnings: .manfid is not unsigned long:

drivers/mmc/core/quirks.c:36: warning: large integer implicitly truncated to unsigned type
drivers/mmc/core/quirks.c:40: warning: large integer implicitly truncated to unsigned type
drivers/mmc/core/quirks.c:43: warning: large integer implicitly truncated to unsigned type
drivers/mmc/core/quirks.c:46: warning: large integer implicitly truncated to unsigned type

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:00:59 -04:00
Andrei Warkentin 32780cd135 mmc: quirks: Extends card quirks with MMC/SD quirks matching the CID.
The current mechanism is SDIO-only. This allows us to create
function-specific quirks, without creating messy Kconfig dependencies,
or polluting core/ with function-specific code.

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:00:54 -04:00
Ohad Ben-Cohen 2059a02dcb mmc: add MMC_QUIRK_DISABLE_CD
006ebd5d introduced sdio_disable_cd(), which disconnects the pull-up
resistor on CD/DAT[3] (pin 1) of the card.

Make it possible to start using sdio_disable_cd() by introducing
MMC_QUIRK_DISABLE_CD.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 21:00:01 -04:00
Ohad Ben-Cohen eab4068795 mmc: add MMC_QUIRK_NONSTD_FUNC_IF
Introduce MMC_QUIRK_NONSTD_FUNC_IF to ignore the "SDIO Standard Function
interface code" as indicated by the card's FBR, and instead treat all
functions as non-standard interfaces.

This is required to prevent standard drivers from facing
errors when trying to communicate with SDIO cards that erroneously
indicate standard function interface codes.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 20:59:52 -04:00
Ohad Ben-Cohen 6b93d01fe5 mmc: do not switch to 1-bit mode if not required
6b5eda36 followed SDIO spec part E1 section 8, which states that
in case SDIO interrupts are being used to wake up a suspended host,
then it is required to switch to 1-bit mode before stopping the clock.

Before switching to 1-bit mode (or back to 4-bit mode on resume),
make sure that SDIO interrupts are really being used to wake the host.

This is helpful for devices which have an external irq line (e.g.
wl1271), and do not use SDIO interrupts to wake up the host.

In this case, switching to 1-bit mode (and back to 4-bit mode on resume)
is not necessary.

Reported-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 20:59:47 -04:00
Grant Erickson 33b53716bc mtd: create function to perform large allocations
Introduce a common function to handle large, contiguous kmalloc buffer
allocations by exponentially backing off on the size of the requested
kernel transfer buffer until it succeeds or until the requested
transfer buffer size falls below the page size.

This helps ensure the operation can succeed under low-memory, highly-
fragmented situations albeit somewhat more slowly.

Artem: so this patch solves the problem that the kernel tries to kmalloc too
large buffers, which (a) may fail and does fail - people complain about this,
and (b) slows down the system in case of high memory fragmentation, because
the kernel starts dropping caches, writing back, swapping, etc. But we do not
really have to allocate a lot of memory to do the I/O, we may do this even with
as little as one min. I/O unit (NAND page) of RAM. So the idea of this patch is
that if the user asks to read or write a lot, we try to kmalloc a lot, with GFP
flags which make the kernel _not_ drop caches, etc. If we can allocate it - good,
if not - we try to allocate twice as less, and so on, until we reach the min.
I/O unit size, which is our last resort allocation and use the normal
GFP_KERNEL flag.

Artem: re-write the allocation function so that it makes sure the allocated
buffer is aligned to the min. I/O size of the flash.

Signed-off-by: Grant Erickson <marathon96@gmail.com>
Tested-by: Ben Gardiner <bengardiner@nanometrics.ca>
Tested-by: Stefano Babic <sbabic@denx.de>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-05-25 01:59:43 +01:00
Ohad Ben-Cohen a5e9425d20 mmc: mmc_card_keep_power cleanups
mmc_card_is_powered_resumed is a mouthful; instead, simply use
mmc_card_keep_power, which also better explains the purpose of
the macro.

Employ mmc_card_keep_power() where possible.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 20:59:43 -04:00
Andrei Warkentin f4c5522b0a mmc: Reliable write support.
Allows reliable writes to be used for MMC writes. Reliable writes are used
to service write REQ_FUA/REQ_META requests. Handles both the legacy and
the enhanced reliable write support in MMC cards.

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-05-24 20:59:38 -04:00