Currently, I_SYNC can never be set when evict_inode() (and thus
end_writeback()) is called because flusher thread holds inode reference while
inode is under writeback. As a result inode_sync_wait() in those places
currently does nothing. However that is going to change and unveils problems
with calling inode_sync_wait() from end_writeback(). Several filesystems call
end_writeback() after they have deleted the inode (btrfs, gfs2, ...) and other
filesystems (ext3, ext4, reiserfs, ...) can deadlock when waiting for I_SYNC
because they call end_writeback() from within a transaction.
To avoid these issues, we move inode_sync_wait() into evict_inode() before
calling ->evict_inode(). That way we preserve the current property that
->evict_inode() and writeback never run in parallel and all filesystems are
safe.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
The code in writeback_single_inode() is relatively complex. The list requeing
logic makes sense only for flusher thread but not really for sync_inode() or
write_inode_now() callers. Also when we want to get rid of inode references
held by flusher thread, we will need a special I_SYNC handling there.
So separate part of writeback_single_inode() which does the real writeback work
into __writeback_single_inode() and make writeback_single_inode() do only stuff
necessary for callers writing only one inode, moving the special list handling
into writeback_sb_inodes(). As a sideeffect this fixes a possible race where we
could skip some inode during sync(2) because other writer refiled it from b_io
to b_dirty list. Also I_SYNC handling is moved into the callers of
__writeback_single_inode() to make locking easier.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
writeback_single_inode() doesn't need wb->list_lock for anything on entry now.
So remove the requirement. This makes locking of writeback_single_inode()
temporarily awkward (entering with i_lock, returning with i_lock and
wb->list_lock) but it will be sanitized in the next patch.
Also inode_wait_for_writeback() doesn't need wb->list_lock for anything. It was
just taking it to make usage convenient for callers but with
writeback_single_inode() changing it's not very convenient anymore. So remove
the lock from that function.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Move inode requeueing after inode has been written out into a separate
function.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Instead of clearing I_DIRTY_PAGES and resetting it when we didn't succeed in
writing them all, just clear the bit only when we succeeded writing all the
pages. We also move the clearing of the bit close to other i_state handling to
separate it from writeback list handling. This is desirable because list
handling will differ for flusher thread and other writeback_single_inode()
callers in future. No filesystem plays any tricks with I_DIRTY_PAGES (like
checking it in ->writepages or ->write_inode implementation) so this movement
is safe.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
When writeback_single_inode() is called on inode which has I_SYNC already
set while doing WB_SYNC_NONE, inode is moved to b_more_io list. However
this makes sense only if the caller is flusher thread. For other callers of
writeback_single_inode() it doesn't really make sense and may be even wrong
- flusher thread may be doing WB_SYNC_ALL writeback in parallel.
So we move requeueing from writeback_single_inode() to writeback_sb_inodes().
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Move clearing of I_SYNC into inode_sync_complete(). It is more logical to have
clearing of I_SYNC bit and waking of waiters in one place. Also later we will
have two places needing to clear I_SYNC and wake up waiters so this allows them
to use the common helper. Moving of I_SYNC clearing to a later stage of
writeback_single_inode() is safe since we hold i_lock all the time.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
This prevents global_dirty_limit from remaining 0 (the initial value)
for long time, since it's only updated in update_dirty_limit() when
above the dirty freerun area.
It will avoid unexpected consequences when some random code use it as a
convenient approximation of the global dirty threshold.
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Reorder structure writeback_control to remove 8 bytes of padding on 64
bit builds, this shrinks its size from 48 to 40 bytes.
This structure is always on the stack and uses C99 named initialisation,
so should be safe and have a small impact on stack usage.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
The function global_dirtyable_memory is only referenced in this file and
should be marked static to prevent it from being exposed globally.
This quiets the sparse warning:
warning: symbol 'global_dirtyable_memory' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Pull system.h fixups for less common arch's from Paul Gortmaker:
"Here is what is hopefully the last of the system.h related fixups.
The fixes for Alpha and ia64 are code relocations consistent with what
was done for the more mainstream architectures. Note that the
diffstat lines removed vs lines added are not the same since I've
fixed some of the whitespace issues in the relocated code blocks.
However they are functionally the same. Compile tested locally, plus
these two have been in linux-next for a while.
There is also a trivial one line system.h related fix for the Tilera
arch from Chris Metcalf to fix an implict include.."
* 'systemh-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
irq_work: fix compile failure on tile from missing include
ia64: populate the cmpxchg header with appropriate code
alpha: fix build failures from system.h dismemberment
It includes:
- a compile fix for au1*fb
- a fix to make kyrofb usable on x86_64
- a fix for uvesafb to prevent an oops due to NX-protection
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.12 (GNU/Linux)
iQIcBAABAgAGBQJPiFZ6AAoJECSVL5KnPj1P23cP/32/ZDNMdxmS5IJxm3dLyGX2
cXr03aetzB5nj+E4eEopKe2M5XWV1iwfKh7cR/B8drEMDsunpjy3d1WrBOQ/Dubz
QtCmqV8+XpFGhJWxWEWSK+BhU0sThzlqbQszZ4r/KOYC9TLLi5GpBt3UZ2gbs+Vc
EwwAMyw/yDisTxU7sy84dUKy13TYsh0fR/BlfTJE1PiI1VxvBmCMGHofJMJvpx/E
RgIsUFyTfFqeGzlagoQEcIlSDrqxIuDl61YXny6e1MX4NsMIF3x2YYZ+T/6UlXCg
JaU2zx5G5a2pqlry8mzqbQ9CVP7xtmjbpxdxrJk1N1eRQ1wlsTpZCqVyFKKCzUlI
lw6Og5Tq7IgKpx7PL15R2cLJd+x1wJAxEcBWeov004Gx3IFkRsvBABapAvyHnGWb
5JEL/N2ipqlUNXEol6mVt9kacY+jKt/KGAf7FvE6qPnTy5EY+RNLTYziTyp1oCq9
d21Yud2dHbOl6Q2WRDgw7i/DKjsBtZrHbuqWNgIsdhbVUT0qhRIaG6U6MjVzYLb2
cxyqUWL1rnsdj5PbooaFuxqKoCpwb20B+t1LJTwS2/KBZY02I7lyrKe3fg27asGz
KbBpeoG5WaKDhpQUrWz1RVg6Wadbn2GJKhsW85PFV97H9Tw2xZP9bujJbtyKKSN+
DPdT2vLcXMGmLJigtJW/
=PWxJ
-----END PGP SIGNATURE-----
Merge tag 'fbdev-fixes-for-3.4-1' of git://github.com/schandinat/linux-2.6
Pull fbdev fixes from Florian Tobias Schandinat:
- a compile fix for au1*fb
- a fix to make kyrofb usable on x86_64
- a fix for uvesafb to prevent an oops due to NX-protection
"The fix for kyrofb is a bit large but it's just replacing "unsigned
long" by "u32" for 64 bit compatibility."
* tag 'fbdev-fixes-for-3.4-1' of git://github.com/schandinat/linux-2.6:
video:uvesafb: Fix oops that uvesafb try to execute NX-protected page
fbdev: fix au1*fb builds
kyrofb: fix on x86_64
Pull the minimal btrfs branch from Chris Mason:
"We have a use-after-free in there, along with errors when mount -o
discard is enabled, and a BUG_ON(we should compile with UP more
often)."
* 'for-linus-min' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: use commit root when loading free space cache
Btrfs: fix use-after-free in __btrfs_end_transaction
Btrfs: check return value of bio_alloc() properly
Btrfs: remove lock assert from get_restripe_target()
Btrfs: fix eof while discarding extents
Btrfs: fix uninit variable in repair_eb_io_failure
Revert "Btrfs: increase the global block reserve estimates"
Pull block driver bits from Jens Axboe:
- A series of fixes for mtip32xx. Most from Asai at Micron, but also
one from Greg, getting rid of the dependency on PCIE_HOTPLUG.
- A few bug fixes for xen-blkfront, and blkback.
- A virtio-blk fix for Vivek, making resize actually work.
- Two fixes from Stephen, making larger transfers possible on cciss.
This is needed for tape drive support.
* 'for-3.4/drivers' of git://git.kernel.dk/linux-block:
block: mtip32xx: remove HOTPLUG_PCI_PCIE dependancy
mtip32xx: dump tagmap on failure
mtip32xx: fix handling of commands in various scenarios
mtip32xx: Shorten macro names
mtip32xx: misc changes
mtip32xx: Add new sysfs entry 'status'
mtip32xx: make setting comp_time as common
mtip32xx: Add new bitwise flag 'dd_flag'
mtip32xx: fix error handling in mtip_init()
virtio-blk: Call revalidate_disk() upon online disk resize
xen/blkback: Make optional features be really optional.
xen/blkback: Squash the discard support for 'file' and 'phy' type.
mtip32xx: fix incorrect value set for drv_cleanup_done, and re-initialize and start port in mtip_restart_port()
cciss: Fix scsi tape io with more than 255 scatter gather elements
cciss: Initialize scsi host max_sectors for tape drive support
xen-blkfront: make blkif_io_lock spinlock per-device
xen/blkfront: don't put bdev right after getting it
xen-blkfront: use bitmap_set() and bitmap_clear()
xen/blkback: Enable blkback on HVM guests
xen/blkback: use grant-table.c hypercall wrappers
Pull block core bits from Jens Axboe:
"It's a nice and quiet round this time, since most of the tricky stuff
has been pushed to 3.5 to give it more time to mature. After a few
hectic block IO core changes for 3.3 and 3.2, I'm quite happy with a
slow round.
Really minor stuff in here, the only real functional change is making
the auto-unplug threshold a per-queue entity. The threshold is set so
that it's low enough that we don't hold off IO for too long, but still
big enough to get a nice benefit from the batched insert (and hence
queue lock cost reduction). For raid configurations, this currently
breaks down."
* 'for-3.4/core' of git://git.kernel.dk/linux-block:
block: make auto block plug flush threshold per-disk based
Documentation: Add sysfs ABI change for cfq's target latency.
block: Make cfq_target_latency tunable through sysfs.
block: use lockdep_assert_held for queue locking
block: blk_alloc_queue_node(): use caller's GFP flags instead of GFP_KERNEL
The OMAP driver needs a 'depends on ARCH_OMAP2PLUS' since it only
builds for OMAP2+ platforms.
This 'depends on' was in the original patch from Russell King, but was
erroneously removed by me when making this option user-selectable in
commit b09db45c (cpufreq: OMAP driver depends CPUfreq tables.) This
patch remedies that.
Apologies to Russell King for breaking his originally working patch.
Also, thanks to Grazvydas Ignotas for reporting the same problem.
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull sparc fixes from David Miller.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Eliminate obsolete __handle_softirq() function
sparc64: Fix bootup crash on sun4v.
Miscellaneous bug fixes to GPIO drivers and for a corner case in the
gpio device tree parsing code.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPiG4HAAoJEEFnBt12D9kBekwQALuM73zU18cvoLgd+wg34RFy
4xJ23sm3QtSixXFBcQuLPwcPT5hMUMjQBP+LQ3mf8oshrr3zV5EdjuvVXA8xP78H
nBWqBoJ+HNEV01b1a1rqvXGgzHofCVYHS7dAZ6Ob0RPfTfIzzM7x9et9pWsq1Q/5
ribPBqaonhs0hnJkEoXB8SVl4R6brhubwzk38C9gzAG2/faVr6Wv5o43C1PlLiLg
WT+0bvSJmQwiV2U9fkq9HUkjoZLw2VS8BUYL8OjzeVZ0mdGJrP1KQiQuhuYOg6Rx
iBPJK7SyCRsIvGsqb9IkdLpZ3hdvM1g5OEmXIHXowNlwjiyX8hZgs0XF4cIZ+VCn
GOIrYG8GE3uy5WRnkROzCPyMvFYYidTp4ZcZoh1bRJWzt/6m7craAgSkcwk4/xTw
edbLTqW1szmI/pkwb3EjbUkA+ee6lMSlwKoKmvVz/LL+/UOdRWvm48vu1rdkQ1+1
+Ylgm9KpG1ywFyPsWG7owLWH2j4Us/4lBdWWVwtJaCORc2KGdxD7BVxhAcrZTIIt
RB8ODBvpYgs/jH/+tvfnvdNuXHQiPyd/UQwbGXs0RM/rHBTtDaUfxt8zXQpXkgRa
JBl+pvpnmJHsNBJ7i4by/ii62kF3rXjTMA8ZGo8zNGOBcto5jMBkIGn+zlx17c8V
xx880JZh7+8sZu3kTqRs
=fdGO
-----END PGP SIGNATURE-----
Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull GPIO bug fixes from Grant Likely:
"Miscellaneous bug fixes to GPIO drivers and for a corner case in the
gpio device tree parsing code."
* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6:
gpio/exynos: Fix compiler warning in gpio-samsung.c file
gpio: Fix range check in of_gpio_simple_xlate()
gpio: Fix uninitialized variable bit in adp5588_irq_handler
gpio/sodaville: Convert sodaville driver to new irqdomain API
Miscellaneous driver bug fixes. No major changes in this branch.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPiG0NAAoJEEFnBt12D9kBUekP/R1qIWR2kvwNKWqv+7f1HMda
cF0/evfraqNoEM2qGeq06fyF/7ows3NyWLvLqdut3+7rmbhl62vekYOt44efrYLA
J6KQGcIvo5C02wSiwK2McyjzkijvQ4hxiPGGeUwI7Pc1nj4JmrT3bXUJvlvwOh/4
lYeHMwEhhmg/yN0qMMqA9D+aRItMqXuxZXRAXfAQTm/o5Yg8EjjDMfMyo+SPUen+
BpF9qSU+vg33e/5l/sT0eXC5C0xDhEZx/p9gO1De0oqpWIRhusbDI74ZUVHK+8+G
OIJaMReuULdJcbJJl3Eyt9Mm/776w8+yoHNDTFlchtyzMplpybsWraJTKjeYjuGR
vzORvpY+mwX08XmOWTt3QJQdUg9cSjrNmtboDhggEeokGFDYjpk17WscsmQjxWUi
owgoSXMnft6c6CzF2hmvwKFKeE0ljogZhMLdp71hf9zEzr4JiGyTV1hzabWK1af7
DvqZ6/GNxgWnlso+DTT2EogC5n9f92k65gCaYuU29iQNKeNNPDychLK2KiETqowN
d0RHsH+sIFo6JdotacTNoH5QMqW6AM9emT6OSGmSWu2zlxg8tY9JO5U7HFIbEHsH
iak5K/DfWGMfgREXmX4srcfMZ7fQzdwpbaXLKkE6VwQDa5VSlgt6ezVDMOC4lh4P
3dSv07cV/Eo1My0YBGyJ
=tw/j
-----END PGP SIGNATURE-----
Merge tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull SPI bug fixes from Grant Likely:
"Miscellaneous driver bug fixes. No major changes in this branch."
* tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6:
spi/imx: prevent NULL pointer dereference in spi_imx_probe()
spi/imx: mark base member in spi_imx_data as __iomem
spi/mpc83xx: fix NULL pdata dereference bug
spi/davinci: Fix DMA API usage in davinci
spi/pL022: include types.h to remove compilation warnings
The invocation of softirq is now handled by irq_exit(), so there is no
need for sparc64 to invoke it on the trap-return path. In fact, doing so
is a bug because if the trap occurred in the idle loop, this invocation
can result in lockdep-RCU failures. The problem is that RCU ignores idle
CPUs, and the sparc64 trap-return path to the softirq handlers fails to
tell RCU that the CPU must be considered non-idle while those handlers
are executing. This means that RCU is ignoring any RCU read-side critical
sections in those handlers, which in turn means that RCU-protected data
can be yanked out from under those read-side critical sections.
The shiny new lockdep-RCU ability to detect RCU read-side critical sections
that RCU is ignoring located this problem.
The fix is straightforward: Make sparc64 stop manually invoking the
softirq handlers.
Reported-by: Meelis Roos <mroos@linux.ee>
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Meelis Roos <mroos@linux.ee>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
The DS driver registers as a subsys_initcall() but this can be too
early, in particular this risks registering before we've had a chance
to allocate and setup module_kset in kernel/params.c which is
performed also as a subsyts_initcall().
Register DS using device_initcall() insteal.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: stable@vger.kernel.org
Building with IRQ_WORK configured results in
kernel/irq_work.c: In function ‘irq_work_run’:
kernel/irq_work.c:110: error: implicit declaration of function ‘irqs_disabled’
The appropriate header just needs to be included.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
commit 93f378883c
"Fix ia64 build errors (fallout from system.h disintegration)"
introduced arch/ia64/include/asm/cmpxchg.h as a temporary
build fix and stated:
"... leave the migration of xchg() and cmpxchg() to this new
header file for a future patch."
Migrate the appropriate chunks from asm/intrinsics.h and fix
the whitespace issues in the migrated chunk.
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
commit ec2212088c
"Disintegrate asm/system.h for Alpha"
combined with commit b4816afa39
"Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h"
introduced the concept of asm/cmpxchg.h but the alpha arch
never got one. Fork the cmpxchg content out of the asm/atomic.h
file to create one.
Some minor whitespace fixups were done on the block of code that
created the new file.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
non-mlx4 hardware.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABCAAGBQJPh2i0AAoJEENa44ZhAt0hf3UQAJkb0Ol8earKatr3qWuDpWtf
aQ5AvdRIWAnxXrIc0pDagxgQ7mF/9OFuuRsJCnoyxbhlLGuDcBEs0b9fm5arhKCz
0fNTxcZSftpO7mNDjzcS71rsRga5GLZcKnJ/I1AWLpG8xJ5nPSfOzwTnC+U/kczm
GnAhcNN1pR7jmUXxfiafY4VFWcNQqhLQTa+TZdlzdPXoSfxk6M8RyGzVOn99V2rR
XSYo7xO0Zww5WKSgBmULwGJ3h1wqWXqA9GZrQnCVJ53BzNRt/oxrumEoKBohdnVR
7Ce8Ts8oVRr99RmAqTLKTzy7A7SJJpsrHUmYWDl7aDM5LJyFvk75chr0b0u7IaIC
SRBmdFTWbT+JX8lu4gGhwk9GBl6aMePnaUkOpbt2RbYVgVChUFHZD6mvmvA589JD
PspnQWb+FarM6wT8rXT2RXAxGNCZFj1W8jz120HFAi8hcWxMVPMWWCU/dWMfJsMS
42YfleWqmstPD9y5ir3Mj7kcrCTi1NXbIAoR4a255DNgD3ih79QzYT/W0Ln47D+P
XAEjPR0L5D3f4pbAAB3ogB3Z3sejRGi8DLK0bO3PsuvjkXq+XoSK+gunnVQwl8u4
q4vffLr6EMBUs6AZy+VNPAwzOwSojGfzWD0k7S/iFsJmY4zHTWtKjcVVW8X3Ivw4
HMZDpHkXI+rRICOF24A1
=fWhE
-----END PGP SIGNATURE-----
Merge tag 'srpt-srq-type' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband fix from Roland Dreier:
"Add a fix for a bug hit by Alexey Shvetsov in ib_srtp that hits on
non-mlx4 hardware."
* tag 'srpt-srq-type' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/srpt: Set srq_type to IB_SRQT_BASIC
We've now fixed IS_ENABLED() and friends to not require any special
"__enabled_" prefixed versions of the normal Kconfig options, so delete
the last traces of them being generated.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit 953742c8fe.
Dumping two lines into autoconf.h for all existing Kconfig options
results in a giant file (~16k lines) we have to process each time we
compile something. We've weaned IS_ENABLED() and similar off of
requiring the __enabled_ definitions so now we can revert the change
which caused all the extra lines.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Using IS_ENABLED() within C (vs. within CPP #if statements) in its
current form requires us to actually define every possible bool/tristate
Kconfig option twice (__enabled_* and __enabled_*_MODULE variants).
This results in a huge autoconf.h file, on the order of 16k lines for a
x86_64 defconfig.
Fixing IS_ENABLED to be able to work on the smaller subset of just
things that we really have defined is step one to fixing this. Which
means it has to not choke when fed non-enabled options, such as:
include/linux/netdevice.h:964:1: warning: "__enabled_CONFIG_FCOE_MODULE" is not defined [-Wundef]
The original prototype of how to implement a C and preprocessor
compatible way of doing this came from the Google+ user "comex ." in
response to Linus' crowdsourcing challenge for a possible improvement on
his earlier C specific solution:
#define config_enabled(x) (__stringify(x)[0] == '1')
In this implementation, I've chosen variable names that hopefully make
how it works more understandable.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A user reported that booting his box up with btrfs root on 3.4 was way
slower than on 3.3 because I removed the ideal caching code. It turns out
that we don't load the free space cache if we're in a commit for deadlock
reasons, but since we're reading the cache and it hasn't changed yet we are
safe reading the inode and free space item from the commit root, so do that
and remove all of the deadlock checks so we don't unnecessarily skip loading
the free space cache. The user reported this fixed the slowness. Thanks,
Tested-by: Calvin Walton <calvin.walton@kepstin.ca>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Here are a number of fixes for the USB core and drivers for 3.4-rc2
Lots of tiny xhci fixes here, a few usb-serial driver fixes and new device ids,
and a smattering of other minor fixes in different USB drivers.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iEYEABECAAYFAk+HVPUACgkQMUfUDdst+ykskwCfV0ypvXzg0JMTBqNDHb8BDmcZ
Ca4An0pZyzO2JsUHTfGUAvm6SFANqCnk
=KB61
-----END PGP SIGNATURE-----
Merge tag 'usb-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of fixes for the USB core and drivers for 3.4-rc2
Lots of tiny xhci fixes here, a few usb-serial driver fixes and new
device ids, and a smattering of other minor fixes in different USB
drivers.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'usb-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (30 commits)
USB: update usbtmc api documentation
xHCI: Correct the #define XHCI_LEGACY_DISABLE_SMI
xHCI: use gfp flags from caller instead of GFP_ATOMIC
xHCI: add XHCI_RESET_ON_RESUME quirk for VIA xHCI host
USB: fix bug of device descriptor got from superspeed device
xhci: Fix register save/restore order.
xhci: Restore event ring dequeue pointer on resume.
xhci: Don't write zeroed pointers to xHC registers.
xhci: Warn when hosts don't halt.
xhci: don't re-enable IE constantly
usb: xhci: fix section mismatch in linux-next
xHCI: correct to print the true HSEE of USBCMD
USB: serial: fix race between probe and open
UHCI: hub_status_data should indicate if ports are resuming
EHCI: keep track of ports being resumed and indicate in hub_status_data
USB: fix race between root-hub suspend and remote wakeup
USB: sierra: add support for Sierra Wireless MC7710
USB: ftdi_sio: fix race condition in TIOCMIWAIT, and abort of TIOCMIWAIT when the device is removed
USB: ftdi_sio: fix status line change handling for TIOCMIWAIT and TIOCGICOUNT
USB: don't ignore suspend errors for root hubs
...
Here are some tty and serial fixes for 3.4-rc2.
Most important here is the pl011 fix, which has been reported by about
100 different people, which means more people use it than I expected :)
There are also some 8250 driver reverts due to some problems reported by
them. And other minor fixes as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iEYEABECAAYFAk+HVDgACgkQMUfUDdst+ymKngCbBIqGrmq5PiDEV7JnkpLRcCEy
DQoAn1XH+Y4NR0Wu121AkJ7UwE8CiXSg
=h7iL
-----END PGP SIGNATURE-----
Merge tag 'tty-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty and serial fixes from Greg KH:
"Here are some tty and serial fixes for 3.4-rc2.
Most important here is the pl011 fix, which has been reported by about
100 different people, which means more people use it than I expected
:)
There are also some 8250 driver reverts due to some problems reported
by them. And other minor fixes as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'tty-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
pch_uart: Add Kontron COMe-mTT10 uart clock quirk
pch_uart: Fix MSI setting issue
serial/8250_pci: add a "force background timer" flag and use it for the "kt" serial port
Revert "serial/8250_pci: setup-quirk workaround for the kt serial controller"
Revert "serial/8250_pci: init-quirk msi support for kt serial controller"
tty/serial/omap: console can only be built-in
serial: samsung: fix omission initialize ulcon in reset port fn()
printk(): add KERN_CONT where needed in hpet and vt code
tty/serial: atmel_serial: fix RS485 half-duplex problem
tty: serial: altera_uart: Check for NULL platform_data in probe.
isdn/gigaset: use gig_dbg() for debugging output
omap-serial: Fix the error handling in the omap_serial probe
serial: PL011: move interrupt clearing
Here are a number of bugfixes for the drivers/staging/ portion of the kernel
that have been reported recently.
Nothing major here, with maybe the exception of the ramster code can now be
built so it is enabled in the build again, and lots of memory leaks that people
like to have fixed on their systems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iEYEABECAAYFAk+HTnkACgkQMUfUDdst+ylk8wCfXi65bqb6VEKr9QmfLlmBU5sf
E54AnRpxLNw7ckt7eWZMhtCUAvg4ts1I
=8siZ
-----END PGP SIGNATURE-----
Merge tag 'staging-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg KH:
"Here are a number of bugfixes for the drivers/staging/ portion of the
kernel that have been reported recently.
Nothing major here, with maybe the exception of the ramster code can
now be built so it is enabled in the build again, and lots of memory
leaks that people like to have fixed on their systems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'staging-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: android: fix mem leaks in __persistent_ram_init()
staging: vt6656: Don't leak memory in drivers/staging/vt6656/ioctl.c::private_ioctl()
staging: iio: hmc5843: Fix crash in probe function.
staging/xgifb: fix display on XGI Volari Z11m cards
Staging: android: timed_gpio: Fix resource leak in timed_gpio_probe error paths
android: make persistent_ram based drivers depend on HAVE_MEMBLOCK
staging: iio: ak8975: Remove i2c client data corruption
staging: drm/omap: move where DMM driver is registered
staging: zsmalloc: fix memory leak
Staging: rts_pstor: off by one in for loop
staging: ozwpan: Added new maintainer for ozwpan
staging:rts_pstor:Avoid "Bad target number" message when probing driver
staging:rts_pstor:Fix possible panic by NULL pointer dereference
Staging: vt6655-6: check keysize before memcpy()
staging/media/as102: Don't call release_firmware() on uninitialized variable
staging:iio:core add missing increment of loop index in iio_map_array_unregister()
staging: ramster: unbreak my heart
staging/vme: Fix module parameters
staging: sep: Fix sign of error
Here are some minor fixes for the driver core and kobjects that people have
reported recently.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iEYEABECAAYFAk+HTLAACgkQMUfUDdst+ylwlACcCcVRLdGIyW7Qjd1VB8hAVCSy
27EAniaKfzOiPAkrb9718D6gfdBTVU0B
=c1Bm
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and kobject fixes from Greg KH:
"Here are some minor fixes for the driver core and kobjects that people
have reported recently.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'driver-core-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
kobject: provide more diagnostic info for kobject_add_internal() failures
sysfs: handle 'parent deleted before child added'
sysfs: Prevent crash on unset sysfs group attributes
sysfs: Update the name hash for an entry after changing the namespace
drivers/base: fix compiler warning in SoC export driver - idr should be ida
drivers/base: Remove unneeded spin_lock_init call for soc_lock
Format string bug fix for irqdomain debug output on 64 bit platforms
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPh1bxAAoJEEFnBt12D9kBbJYP/1QVBjTuObbdoI4UQ8TMTueO
6Wh0hZ0zxRF+lznPJKJQdurIKJBtgM2m+M+HZl1fIrhIQbwzASc3whztqc30n1rj
qnqjzeGAdQv8NWvABgjZJM0s8SuCFlwvnm0BfdXGwe4Uh/E761rs3oz0YtZcXUC8
XXiWjY6FNExZ8dKFv94SDmFS8FGjz72gQW5rGB8wtyD/sl7rs59K6h2eOBm5HhUT
DDjsIlyUGev7QYMJNFRfYDEFKBXFH63v1q69kroOxEgd2CMwD2WfAguwBdFKhOrF
aWfOUJZaOkglGOfeGulEs0lohgWeehSZYwKNTDZh/FPqQmSXhixN9PIc4iYBXlqa
ZgyUF3Tt3BQ+s8rHNTk1psWxvzYvHzfA6+KGRdPmZ9fOmqdfCoAfj2wh5oWmSKsJ
ZwQygeU1ziI/deBRVL08qW1NeeYFf3iGY68wIV338XGBmMYxpWwzWjoWO3nKkSxm
nMUiiOyEVulLdzeXy+JCL39IbZ8atiDj/012CIsiHhssZRtoQNt2wyBxZ4yditze
6gZWtak1dn9ZAIZiGfzPh5SbbPOGjykqt0VSoyhKU1XEVAsGByWwZvqLFWwMRSeD
KIKp7zINy5p/ftoBhe3dKgDNw83FJF+IqubK5k6m/AtDY14WOoUbVIkfbxXhhXLK
Di5uxHBxhRIL5jhj1v27
=7g0b
-----END PGP SIGNATURE-----
Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull a fix for the recent irqdomain bug fixes from Grant Likely:
"I flubbed one patch in the last pull request which broke a format
string on 64 bit platforms. Here's the fix."
* tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6:
irq_domain: fix type mismatch in debugfs output format
sizeof(void*) returns an unsigned long, but it was being used as a width parameter to a "%-*s" format string which requires an int. On 64 bit platforms this causes a type mismatch:
linux/kernel/irq/irqdomain.c:575: warning: field width should have type
'int', but argument 6 has type 'long unsigned int'
This change casts the size to an int so printf gets the right data type.
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: David Daney <david.daney@cavium.com>
Pull trivial perf build failure fix from Thomas Gleixner.
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Fix getrusage() related build failure on glibc trunk
Pull timer fixes from Thomas Gleixner:
"The itimer removal one is not strictly a fix, but I really wanted to
avoid a rebase of the urgent ones."
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "clocksource: Load the ACPI PM clocksource asynchronously"
clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot()
itimer: Use printk_once instead of WARN_ONCE
nohz: Fix stale jiffies update in tick_nohz_restart()
tick: Document TICK_ONESHOT config option
proc: stats: Use arch_idle_time for idle and iowait times if available
itimer: Schedule silent NULL pointer fixup in setitimer() for removal
Pull x86 fixes from Thomas Gleixner.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Use correct byte-sized register constraint in __add()
x86: Use correct byte-sized register constraint in __xchg_op()
x86: vsyscall: Use NULL instead 0 for a pointer argument
If, in __persistent_ram_init(), the call to
persistent_ram_buffer_init() fails or the call to
persistent_ram_init_ecc() fails then we fail to free the memory we
allocated to 'prz' with kzalloc() - thus leaking it.
To prevent the leaks I consolidated all error exits from the function
at a 'err:' label at the end and made all error cases jump to that
label where we can then make sure we always free 'prz'. This is safe
since all the situations where the code bails out happen before 'prz'
has been stored anywhere and although we'll do a redundant kfree(NULL)
call in the case of kzalloc() itself failing that's OK since kfree()
deals gracefully with NULL pointers and I felt it was more important
to keep all error exits at a single location than to avoid that one
harmless/redundant kfree() on a error path.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Colin Cross <ccross@android.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If copy_to_user() fails in the WLAN_CMD_GET_NODE_LIST case of the
switch in drivers/staging/vt6656/ioctl.c::private_ioctl() we'll leak
the memory allocated to 'pNodeList'. Fix that by kfree'ing the memory
in the failure case.
Also remove a pointless cast (to type 'PSNodeList') of a kmalloc()
return value - kmalloc() returns a void pointer that is implicitly
converted, so there is no need for an explicit cast.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge fixes from Andrew Morton.
* emailed from Andrew Morton <akpm@linux-foundation.org>: (14 patches)
panic: fix stack dump print on direct call to panic()
drivers/rtc/rtc-pl031.c: enable clock on all ST variants
Revert "mm: vmscan: fix misused nr_reclaimed in shrink_mem_cgroup_zone()"
hugetlb: fix race condition in hugetlb_fault()
drivers/rtc/rtc-twl.c: use static register while reading time
drivers/rtc/rtc-s3c.c: add placeholder for driver private data
drivers/rtc/rtc-s3c.c: fix compilation error
MAINTAINERS: add PCDP console maintainer
memcg: do not open code accesses to res_counter members
drivers/rtc/rtc-efi.c: fix section mismatch warning
drivers/rtc/rtc-r9701.c: reset registers if invalid values are detected
drivers/char/random.c: fix boot id uniqueness race
memcg: fix broken boolen expression
memcg: fix up documentation on global LRU
Pull networking fixes from David Miller:
1) Fix bluetooth userland regression reported by Keith Packard, from
Gustavo Padovan.
2) Revert ath9k PS idle change, from Sujith Manoharan.
3) Correct default TCP memory limits (again), from Eric Dumazet.
4) Fix tcp_rcv_rtt_update() accidental use of unscaled RTT, from Neal
Cardwell.
5) We made a facility for layers like wireless to say how much tailroom
they need in the SKB for link layer stuff such as wireless
encryption etc., but TCP works hard to fill every SKB out to the end
defeating this specification.
This leads to every TCP packet getting reallocated by the wireless
code in order to have the right amount of tailroom available.
Fix TCP to only fill SKBs out to the real amount of data area it
asked for during the allocation, this way it won't eat into the
slack added for the device's tailroom needs.
Reported by Marc Merlin and fixed by Eric Dumazet.
6) Leaks, endian bugs, and new device IDs in bluetooth from Santosh
Nayak, João Paulo Rechi Vita, Cho, Yu-Chen, Andrei Emeltchenko,
AceLan Kao, and Andrei Emeltchenko.
7) OOPS on tty_close fix in bluetooth's hci_ldisc from Johan Hovold.
8) netfilter erroneously scales TCP window twice, fix from Changli Gao.
9) Memleak fix in wext-core from Julia Lawall.
10) Consistently handle invalid TCP packets in ipv4 vs. ipv6 conntrack,
from Jozsef Kadlecsik.
11) Validate IP header length properly in netfilter conntrack's
ipv4_get_l4proto().
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (39 commits)
NFC: Fix the LLCP Tx fragmentation loop
rtlwifi: Add missing DMA buffer unmapping for PCI drivers
rtlwifi: Preallocate USB read buffers and eliminate kalloc in read routine
tcp: avoid order-1 allocations on wifi and tx path
net: allow pskb_expand_head() to get maximum tailroom
bridge: Do not send queries on multicast group leaves
MAINTAINERS: Mark NATSEMI driver as orphan'd.
tcp: fix tcp_rcv_rtt_update() use of an unscaled RTT sample
tcp: restore correct limit
Revert "ath9k: fix going to full-sleep on PS idle"
rt2x00: Fix rfkill_polling register function.
bcma: fix build error on MIPS; implicit pcibios_enable_device
netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_net
netfilter: nf_ct_ipv4: packets with wrong ihl are invalid
netfilter: nf_ct_ipv4: handle invalid IPv4 and IPv6 packets consistently
net/wireless/wext-core.c: add missing kfree
rtlwifi: Fix oops on rate-control failure
mac80211: Convert WARN_ON to WARN_ON_ONCE
rtlwifi: rtl8192de: Fix firmware initialization
nl80211: ensure interface is up in various APIs
...
Pull drm fixes from Dave Airlie:
"Mostly exynos and intel.
Intel has 3 regression fixers (more info in intel merge commit), along
with some other make hw work fixes, exynos has some cleanups and an
ioctl fix.
A couple of radeon fixes, couple of build fixes, and a savage
userspace interface possible overflow fix."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (23 commits)
drm/exynos: fixed exynos broken ioctl
drm/i915: clear fencing tracking state when retiring requests
drm/exynos: fix to pointer manager member of struct exynos_drm_subdrv
drm/exynos: fix struct for operation callback functions to driver name
drm/exynos: use define instead of default_win member in struct mixer_context
drm/exynos: rename s/HDMI_OVERLAY_NUMBER/MIXER_WIN_NR
drm/exynos: remove unused codes in hdmi and mixer
drm/exynos: remove unnecessary type conversion of hdmi and mixer
drm/i915: make rc6 module parameter read-only
drm/i915: implement ColorBlt w/a
drm/i915/ringbuffer: Exclude last 2 cachlines of ring on 845g
Revert "drm/i915: reenable gmbus on gen3+ again"
drm/radeon: only add the mm i2c bus if the hw_i2c module param is set
vgaarb.h: fix build warnings
drm/i915: properly compute dp dithering for user-created modes
drm/radeon/kms: fix DVO setup on some r4xx chips
drm/savage: fix integer overflows in savage_bci_cmdbuf()
drm/radeon: replace udelay with mdelay for long timeouts
drm/i915: Finish any pending operations on the framebuffer before disabling
drm/i915: Removed IVB forced enable of sprite dest key.
...
Two are tagged for -stable. They can cause an oops, but very rarely.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQIVAwUAT4ZxRTnsnt1WYoG5AQKaaQ/9HCaffA6TpN2hiA/utmfzurQGqXkhWmxj
3nggDy221mHwVPQqouk0OniOIMObGPxKXHeiMMNZM3prLDPBemDx2UJ54NgQonyL
gB8siqhkwOMJ2G2dYW3gQiS6+euuxoEB0QpJJZ7Jye162FYmo3LGcFu2LQViMaNe
9eaneUddH3AsDsJ5YSID1ylV1zYBceQK7trtd7mStWILV+lqNU0HBWpGplT+BOsb
Jbe60nxnpWAgy0YUKDv4DVglM9YS7i3sJ1HfS8MZ30fCGAED+1Rkxk6DxOwVnKWL
ogS2xVNxBQ6hXBIwXPNBU4tEtT6VdI4GQRZkBKczRFhJ4B2LHMronoDXA1A5hrKP
+ELcFn4CWN9DfO9X0CpjU//JHh6vNl/p0RoOKRsABDP+VJfU7zSPn6lxrI/KCEGU
Q66WUEYWrd0RDihqTQ2gkdJL8m1JmNDx3J4fKqWR528v0bgWoJOK4oIUdvLJ3zlW
vAhejlKGOEbWKtwxi1mAVQ6XdFISkL4UR/tdjJkCj+H+SlL0HdTITCKcQju2qiQ/
CPwapez6sZVwUQ7852OhycCUyz7AztkURbio+clcGVAttRqxkfvt4uN/RPpQMtKO
QBoFz4pN6hqAVM8p+G8fgmDUJ9nC5PHzf3iPaLCEFR69CubGDF/O9rD6T7SwxUtS
FexjkBeW2OU=
=yJBa
-----END PGP SIGNATURE-----
Merge tag 'md-3.4-fixes' of git://neil.brown.name/md
Pull a few more fixes for md from NeilBrown:
"Two are tagged for -stable. They can cause an oops, but very rarely."
* tag 'md-3.4-fixes' of git://neil.brown.name/md:
md/bitmap: prevent bitmap_daemon_work running while initialising bitmap
md/raid1,raid10: Fix calculation of 'vcnt' when processing error recovery.
MD: Bitmap version cleanup.
Commit 6e6f0a1f0f ("panic: don't print redundant backtraces on oops")
causes a regression where no stack trace will be printed at all for the
case where kernel code calls panic() directly while not processing an
oops, and of course there are 100's of instances of this type of call.
The original commit executed the check (!oops_in_progress), but this will
always be false because just before the dump_stack() there is a call to
bust_spinlocks(1), which does the following:
void __attribute__((weak)) bust_spinlocks(int yes)
{
if (yes) {
++oops_in_progress;
The proper way to resolve the problem that original commit tried to
solve is to avoid printing a stack dump from panic() when the either of
the following conditions is true:
1) TAINT_DIE has been set (this is done by oops_end())
This indicates and oops has already been printed.
2) oops_in_progress > 1
This guards against the rare case where panic() is invoked
a second time, or in between oops_begin() and oops_end()
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: <stable@vger.kernel.org> [3.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ST variants of the PL031 all require bit 26 in the control register
to be set before they work properly. Discovered this when testing on
the Nomadik board where it would suprisingly just stand still.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Mian Yousaf Kaukab <mian.yousaf.kaukab@stericsson.com>
Cc: Alessandro Rubini <rubini@unipv.it>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit c38446cc65.
Before the commit, the code makes senses to me but not after the commit.
The "nr_reclaimed" is the number of pages reclaimed by scanning through
the memcg's lru lists. The "nr_to_reclaim" is the target value for the
whole function. For example, we like to early break the reclaim if
reclaimed 32 pages under direct reclaim (not DEF_PRIORITY).
After the reverted commit, the target "nr_to_reclaim" is decremented each
time by "nr_reclaimed" but we still use it to compare the "nr_reclaimed".
It just doesn't make sense to me...
Signed-off-by: Ying Han <yinghan@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The race is as follows:
Suppose a multi-threaded task forks a new process (on cpu A), thus
bumping up the ref count on all the pages. While the fork is occurring
(and thus we have marked all the PTEs as read-only), another thread in
the original process (on cpu B) tries to write to a huge page, taking an
access violation from the write-protect and calling hugetlb_cow(). Now,
suppose the fork() fails. It will undo the COW and decrement the ref
count on the pages, so the ref count on the huge page drops back to 1.
Meanwhile hugetlb_cow() also decrements the ref count by one on the
original page, since the original address space doesn't need it any
more, having copied a new page to replace the original page. This
leaves the ref count at zero, and when we call unlock_page(), we panic.
fork on CPU A fault on CPU B
============= ==============
...
down_write(&parent->mmap_sem);
down_write_nested(&child->mmap_sem);
...
while duplicating vmas
if error
break;
...
up_write(&child->mmap_sem);
up_write(&parent->mmap_sem); ...
down_read(&parent->mmap_sem);
...
lock_page(page);
handle COW
page_mapcount(old_page) == 2
alloc and prepare new_page
...
handle error
page_remove_rmap(page);
put_page(page);
...
fold new_page into pte
page_remove_rmap(page);
put_page(page);
...
oops ==> unlock_page(page);
up_read(&parent->mmap_sem);
The solution is to take an extra reference to the page while we are
holding the lock on it.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>