On a system with 1 local mount and 1 NFS mount, if the NFS server
becomes not responding when dd to the NFS mount, the NFS dirty pages may
exceed the global dirty limit and _every_ task involving writing will be
blocked. The whole system appears unresponsive.
The workaround is to permit through the bdi's that only has a small
number of dirty pages. The number chosen (bdi_stat_error pages) is not
enough to enable the local disk to run in optimal throughput, however is
enough to make the system responsive on a broken NFS mount. The user can
then kill the dirtiers on the NFS mount and increase the global dirty
limit to bring up the local disk's throughput.
It risks allowing dirty pages to grow much larger than the global dirty
limit when there are 1000+ mounts, however that's very unlikely to happen,
especially in low memory profiles.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
We do "floating proportions" to let active devices to grow its target
share of dirty pages and stalled/inactive devices to decrease its target
share over time.
It works well except in the case of "an inactive disk suddenly goes
busy", where the initial target share may be too small. To mitigate
this, bdi_position_ratio() has the below line to raise a small
bdi_thresh when it's safe to do so, so that the disk be feed with enough
dirty pages for efficient IO and in turn fast rampup of bdi_thresh:
bdi_thresh = max(bdi_thresh, (limit - dirty) / 8);
balance_dirty_pages() normally does negative feedback control which
adjusts ratelimit to balance the bdi dirty pages around the target.
In some extreme cases when that is not enough, it will have to block
the tasks completely until the bdi dirty pages drop below bdi_thresh.
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Currently write(2) to a file is not interruptible by any signal.
Sometimes this is desirable, e.g. when you want to quickly kill a
process hogging your disk. Also, with commit 499d05ecf9 ("mm: Make
task in balance_dirty_pages() killable"), it's necessary to abort the
current write accordingly to avoid it quickly dirtying lots more pages
at unthrottled rate.
This patch makes write interruptible by SIGKILL. We do not allow write
to be interruptible by any other signal because that has larger
potential of screwing some badly written applications.
Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Tested-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Acked-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Document the @reason parameter to make "make htmldocs" happy.
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Marcos Paulo de Souza <marcos.mage@gmail.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
In the case where CONFIG_PSTORE=n, the function efi_pstore_read() doesn't
have the correct list of parameters. This patch provides a definition
of efi_pstore_read() with 'char **buf' added to fix this warning:
"drivers/firmware/efivars.c:609: warning: initialization from".
problem introduced in commit f6f8285132
Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'for-3.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: explain why per_cpu_ptr_to_phys() is more complicated than necessary
percpu: fix chunk range calculation
percpu: rename pcpu_mem_alloc to pcpu_mem_zalloc
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hrtimer: Fix extra wakeups from __remove_hrtimer()
timekeeping: add arch_offset hook to ktime_get functions
clocksource: Avoid selecting mult values that might overflow when adjusted
time: Improve documentation of timekeeeping_adjust()
Commit fa27271bc8d2("genirq: Fixup poll handling") introduced a
regression that broke irqfixup/irqpoll for some hardware configurations.
Amidst reorganizing 'try_one_irq', that patch removed a test that
checked for 'action->handler' returning IRQ_HANDLED, before acting on
the interrupt. Restoring this test back returns the functionality lost
since 2.6.39. In the current set of tests, after 'action' is set, it
must precede '!action->next' to take effect.
With this and my previous patch to irq/spurious.c, c75d720fca, all
IRQ regressions that I have encountered are fixed.
Signed-off-by: Edward Donovan <edward.donovan@numble.net>
Reported-and-tested-by: Rogério Brito <rbrito@ime.usp.br>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org (2.6.39+)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'fbdev-for-linus' of git://github.com/schandinat/linux-2.6:
viafb: correct sync polarity for OLPC DCON
video:da8xx-fb: Disable and reset sequence on version2 of LCDC
OMAPDSS: DISPC: skip scaling calculations when not scaling
OMAPFB: fix compilation warnings due to missing include
OMAPDSS: HDMI: fix returned HDMI pixel clock
Revert a hunk in drivers/net/wireless/ath/ath9k/hw.c introduced by
commit 2577c6e8f2 ("ath9k_hw: Add support for AR946/8x chipsets") that
caused a nasty regression to appear on my Acer Ferrari One (the box
locks up entirely at random times after the wireless has been started
without any way to get debug information out of it).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
VT1708 has no support for unsolicited events per jack-plug, the driver
implements the workq for polling the jack-detection. The mixer element
"Jack Detect" was supposed to control this behavior on/off, but this
doesn't work properly as is now. The workq is always started and the
HP automute is always enabled.
This patch fixes the jack-detect control behavior by triggering / stopping
the work appropriately at the state change. Also the work checks the
internal state to continue scheduling or not.
Cc: <stable@kernel.org> [v3.1]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The CS420X_IMAC27 was copied from the line before but CS420X_APPLE
was clearly intented.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Fix build failure in staging iio driver:
.../drivers/staging/iio/industrialio-core.c: In function 'iio_event_getfd':
.../drivers/staging/iio/industrialio-core.c:262:32: error:
'ev_int' undeclared (first use in this function)
Also convert the rest of the function to use the new variable.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arch/powerpc/mm/hugetlbpage.c: In function 'reserve_hugetlb_gpages':
arch/powerpc/mm/hugetlbpage.c:312:2: error: implicit declaration of function 'parse_args'
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2d3cbf8b (cgroup_freezer: update_freezer_state() does incorrect state
transitions) removed is_task_frozen_enough and replaced it with a simple
frozen call. This, however, breaks freezing for a group with stopped tasks
because those cannot be frozen and so the group remains in CGROUP_FREEZING
state (update_if_frozen doesn't count stopped tasks) and never reaches
CGROUP_FROZEN.
Let's add is_task_frozen_enough back and use it at the original locations
(update_if_frozen and try_to_freeze_cgroup). Semantically we consider
stopped tasks as frozen enough so we should consider both cases when
testing frozen tasks.
Testcase:
mkdir /dev/freezer
mount -t cgroup -o freezer none /dev/freezer
mkdir /dev/freezer/foo
sleep 1h &
pid=$!
kill -STOP $pid
echo $pid > /dev/freezer/foo/tasks
echo FROZEN > /dev/freezer/foo/freezer.state
while true
do
cat /dev/freezer/foo/freezer.state
[ "`cat /dev/freezer/foo/freezer.state`" = "FROZEN" ] && break
sleep 1
done
echo OK
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Tomasz Buchert <tomasz.buchert@inria.fr>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Tejun Heo <htejun@gmail.com>
At this point, ehv_pic has been allocated but not stored anywhere, so it
should be freed before leaving the function.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@exists@
local idexpression x;
statement S,S1;
expression E;
identifier fl;
expression *ptr != NULL;
@@
x = \(kmalloc\|kzalloc\|kcalloc\)(...);
...
if (x == NULL) S
<... when != x
when != if (...) { <+...kfree(x)...+> }
when any
when != true x == NULL
x->fl
...>
(
if (x == NULL) S1
|
if (...) { ... when != x
when forall
(
return \(0\|<+...x...+>\|ptr\);
|
* return ...;
)
}
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
If Freescale LBC driver fails to initialise itself from device tree, then
internal structure is freed only but not NULL-fied. As result functions
fsl_lbc_find() after checking the structure is not NULL are trying to
access device registers.
Signed-off-by: Alexandre Rusev <arusev@dev.rtsoft.ru>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
compatible in dts has been changed, so the driver needs to be updated
accordingly.
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
QE_General4 should only round up the divisor iff divisor is > 3.
Rounding up lower divisors makes the error too big, causing USB
on MPC832x to fail.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Acked-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
arch/powerpc/platforms/85xx/p3060_qds.c: In function '__machine_initcall_p3060_qds_declare_of_platform_devices':
arch/powerpc/platforms/85xx/p3060_qds.c:73:1: error: implicit declaration of function 'declare_of_platform_devices'
declare_of_platform_devices should have been corenet_ds_publish_devices.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The driver for the Freescale P3060 QDS got added by commit 96cc017c5b
("[...] Add support for P3060QDS board"). Its Kconfig entry selects
MPC8xxx_GPIO. But at the time that driver got added MPC8xxx_GPIO was
already renamed to GPIO_MPC8XXX, by commit c68308dd50 ("gpio: move
mpc8xxx/512x gpio driver to drivers/gpio").
So make this driver select GPIO_MPC8XXX.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
P1023 external IRQ[4:6, 11] are not pin out, but the interrupts are
utilized by the PCIe controllers. As they are not exposed as pins we
need to set them as active-high (internal to the SoC these interrupts
are pulled down).
IRQs[0:3,7:10] are pulled up on the board so we have them set as
active-low.
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* git://github.com/rustyrussell/linux:
virtio-pci: make reset operation safer
virtio-mmio: Correct the name of the guest features selector
virtio: add HAS_IOMEM dependency to MMIO platform bus driver
virtio pci device reset actually just does an I/O
write, which in PCI is really posted, that is it
can complete on CPU before the device has received it.
Further, interrupts might have been pending on
another CPU, so device callback might get invoked after reset.
This conflicts with how drivers use reset, which is typically:
reset
unregister
a callback running after reset completed can race with
unregister, potentially leading to use after free bugs.
Fix by flushing out the write, and flushing pending interrupts.
This assumes that device is never reset from
its vq/config callbacks, or in parallel with being
added/removed, document this assumption.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fix this compile error on s390:
CC [M] drivers/virtio/virtio_mmio.o
drivers/virtio/virtio_mmio.c: In function 'vm_get_features':
drivers/virtio/virtio_mmio.c:107:2: error: implicit declaration of function 'writel'
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
PCI hotplug: shpchp: don't blindly claim non-AMD 0x7450 device IDs
PCI: pciehp: wait 100 ms after Link Training check
PCI: pciehp: wait 1000 ms before Link Training check
PCI: pciehp: Retrieve link speed after link is trained
PCI: Let PCI_PRI depend on PCI
PCI: Fix compile errors with PCI_ATS and !PCI_IOV
PCI / ACPI: Make acpiphp ignore root bridges using PCIe native hotplug
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
eCryptfs: Extend array bounds for all filename chars
eCryptfs: Flush file in vma close
eCryptfs: Prevent file create race condition
From mhalcrow's original commit message:
Characters with ASCII values greater than the size of
filename_rev_map[] are valid filename characters.
ecryptfs_decode_from_filename() will access kernel memory beyond
that array, and ecryptfs_parse_tag_70_packet() will then decrypt
those characters. The attacker, using the FNEK of the crafted file,
can then re-encrypt the characters to reveal the kernel memory past
the end of the filename_rev_map[] array. I expect low security
impact since this array is statically allocated in the text area,
and the amount of memory past the array that is accessible is
limited by the largest possible ASCII filename character.
This patch solves the issue reported by mhalcrow but with an
implementation suggested by Linus to simply extend the length of
filename_rev_map[] to 256. Characters greater than 0x7A are mapped to
0x00, which is how invalid characters less than 0x7A were previously
being handled.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Michael Halcrow <mhalcrow@google.com>
Cc: stable@kernel.org
Dirty pages weren't being written back when an mmap'ed eCryptfs file was
closed before the mapping was unmapped. Since f_ops->flush() is not
called by the munmap() path, the lower file was simply being released.
This patch flushes the eCryptfs file in the vm_ops->close() path.
https://launchpad.net/bugs/870326
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Cc: stable@kernel.org [2.6.39+]
The file creation path prematurely called d_instantiate() and
unlock_new_inode() before the eCryptfs inode info was fully
allocated and initialized and before the eCryptfs metadata was written
to the lower file.
This could result in race conditions in subsequent file and inode
operations leading to unexpected error conditions or a null pointer
dereference while attempting to use the unallocated memory.
https://launchpad.net/bugs/813146
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Cc: stable@kernel.org
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (31 commits)
drm: integer overflow in drm_mode_dirtyfb_ioctl()
drivers/gpu/vga/vgaarb.c: add missing kfree
drm/radeon/kms/atom: unify i2c gpio table handling
drm/radeon/kms: fix up gpio i2c mask bits for r4xx for real
ttm: Don't return the bo reserved on error path
drm/radeon/kms: add a CS ioctl flag not to rewrite tiling flags in the CS
drm/i915: Fix inconsistent backlight level during disabled
drm, i915: Fix memory leak in i915_gem_busy_ioctl().
drm/i915: Use DPCD value for max DP lanes.
drm/i915: Initiate DP link training only on the lanes we'll be using
drm/i915: Remove trailing white space
drm/i915: Try harder during dp pattern 1 link training
drm/i915: Make DP prepare/commit consistent with DP dpms
drm/i915: Let panel power sequencing hardware do its job
drm/i915: Treat PCH eDP like DP in most places
drm/i915: Remove link_status field from intel_dp structure
drm/i915: Move common PCH_PP_CONTROL setup to ironlake_get_pp_control
drm/i915: Module parameters using '-1' as default must be signed type
drm/i915: Turn on another required clock gating bit on gen6.
drm/i915: Turn on a required 3D clock gating bit on Sandybridge.
...