Matt added BPF_JIT support in commit 0ca87f05, but currently none of our
defconfigs build it. Turn that sucker on.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add the ability to inject IOMMU faults. We enable this per device
via a fail_iommu sysfs property, similar to fault injection on other
subsystems.
An example:
...
0003:01:00.1 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 02)
To inject one error to this device:
echo 1 > /sys/bus/pci/devices/0003:01:00.1/fail_iommu
echo 1 > /sys/kernel/debug/fail_iommu/probability
echo 1 > /sys/kernel/debug/fail_iommu/times
As feared, the first failure injected on the be3 results in an
unrecoverable error, taking down both functions of the card
permanently:
be2net 0003:01:00.1: Unrecoverable error in the card
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The DMA API debug code has hooks to verify all DMA entries have been
freed at time of hot unplug. We need to call dma_debug_add_bus for
this to work.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Similar to PCI, separate the bus probe from device probe. This allows
us to attach bus notifiers for DMA debug and IOMMU fault injection
before devices have been probed.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
During boot we see a number of these warnings:
vio 30000000: Warning: IOMMU dma not supported: mask 0xffffffffffffffff, table unavailable
The reason for this is that we set IOMMU properties for all VIO
devices even if they are not DMA capable.
Only set DMA ops, table and mask for devices with a DMA window.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We use SIAR or regs->nip for the instruction pointer depending on
the PMU configuration, but we always use regs->nip in the callchain.
Use perf_instruction_pointer so the backtrace is consistent.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
At the moment we always use the SIAR if the PMU supports continuous
sampling. Unfortunately the SIAR and the PMU exception are not
synchronised for non marked events so we can end up with callchains
that dont make sense.
The following patch checks the HV and PR bits for samples coming from
userspace and always uses pt_regs for them. Userspace will never have
interrupts off so there is no real advantage to using the SIAR for
non marked events in userspace.
I had experimented with a patch that did a similar thing for kernel
samples but we lost a significant amount of information. I was
unable to profile any of our early exception code for example.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The logic to choose whether to use the SIAR or get the information
out of pt_regs is going to get more complicated, so do it once in
perf_read_regs.
We overload regs->result which is gross but we are already doing it
with regs->dsisr.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We want to access the MMCRA_SIHV and MMCRA_SIPR bits elsewhere so
create mmcra_sihv and mmcra_sipr which hide the differences between
the old and new layout of the bits.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Some macros use RA where when RA=R0 the values is 0, so make this
the enforced mnemonic in the macro.
Idea suggested by Andreas Schwab.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Enforce the use of R0-R31 in macros where possible now we have all the
fixes in.
R0-R31 macros are removed here so that can't be used anymore. They
should not be defined anywhere.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Now have ___PPC_RA/B/S/T we can use it in some places. These are
places where we can't use the existing defines which will soon enforce
R0-R31 usage.
The macros being changed here are being used in inline asm, which
can't convert to enforce the R0-R31 usage.
bpf_jit uses a mix of both generated and non-generated with the same
code, so just convert all these to use the ___PPC_R versions which
won't enforce R usage later.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These are currently the same as __PPC_RA/B/S/T but we'll wrap them
soon.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We need to do this so we can enforce the name of a and b in called
macros PPC_RA/B later.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These macros are using integers where they could be using logical
names since they take registers.
We are going to enforce this soon, so fix these up now.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LOAD_REG_ADDR define is just a wrapper around real instructions so we
can just use real register names here (ie. lower case).
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
mtocrf define is just a wrapper around the real instructions so we can
just use real register names here (ie. lower case).
Also remove braces in macro so this is possible.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Move this duplicated definition to ppc_asm.h and remove the
braces which prevent the use of %rN register names
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Merge the defines of VCPU_GPR from different places.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Merge the defines of STACKFRAMESIZE, STK_REG, STK_PARAM from different
places.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Now all the fixes are in place, let's rock-n-roll!
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anything that uses a constructed instruction (ie. from ppc-opcode.h),
need to use the new R0 macro, as %r0 is not going to work.
Also convert usages of macros where we are just determining an offset
(usually for a load/store), like:
std r14,STK_REG(r14)(r1)
Can't use STK_REG(r14) as %r14 doesn't work in the STK_REG macro since
it's just calculating an offset.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The assembler doesn't take %r0 register arguments in braces, so remove them.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We are going to use these later and convert r0 to %r0 etc.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
udbg_init_debug_opal() should be udbg_init_debug_opal_raw() as
the caller in arch/powerpc/kernel/udbg.c expects
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Newer gcc are being a bit blind here (it's pretty obvious we don't
reach the code path using the array if we haven't initialized the
pointer) but none of that is performance critical so let's just
silence it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There was a typo, checking for CONFIG_TRACE_IRQFLAG instead of
CONFIG_TRACE_IRQFLAGS causing some useful debug code to not be
built
This in turns causes a build error on BookE 64-bit due to incorrect
semicolons at the end of a couple of macros, so let's fix that too
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@vger.kernel.org [v3.4]
Looks like we still have issues with pSeries and Cell idle code
vs. the lazy irq state. In fact, the reset fixes that went upstream
are exposing the problem more by causing BUG_ON() to trigger (which
this patch turns into a WARN_ON instead).
We need to be careful when using a variant of low power state that
has the side effect of turning interrupts back on, to properly set
all the SW & lazy state to look as if everything is enabled before
we enter the low power state with MSR:EE off as we will return with
MSR:EE on. If not, we have a discrepancy of state which can cause
things to go very wrong later on.
This patch moves the logic into a helper and uses it from the
pseries and cell idle code. The power4/970 idle code already got
things right (in assembly even !) so I'm not touching it. The power7
"bare metal" idle code is subtly different and correct. Remains PA6T
and some hypervisor based Cell platforms which have questionable
code in there, but they are mostly dead platforms so I'll fix them
when I manage to get final answers from the respective maintainers
about how the low power state actually works on them.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@vger.kernel.org [v3.4]
A smallish fix for a lock dependency issue which affects a bunch of
Qualcomm boards that do unusually complicated things with their
regulators, the API is unlikely to be called by any other system.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJP9yyGAAoJEBus8iNuMP3dI24P/RXFIHH1dYck49uQ17moYDY+
vQBM7amuu23Q9MJxU5rGWaqwxXR8qT3iT76BlL9bt8u91GM4tDrccLRW09+GJXNZ
1tpFK96iOaH9ju/+FMc0nR9P385YqxRh91ZWDfyAqectBU8nFAtcl+7BakuR/9j/
2XbftG3T555EylCfwAaw1ikOALxE47ALgKa/Hglu8RDZTQSaWWZb2o20PTXfrkUk
Mt1Pzs9PMfmgE7VUC6Q04vEydX47L8/iwywanzcbMTKw00d3o5WXTwNqDzcPWGZv
S9pSGQaZ10JVMO7eu4337fsfIuhUEU0Ilbaow1eLi1RymgDFIamr0SUKcpC1sPDF
O95l1GZmEwiyDJITL4K1Ssf5TZCKmR9eMh9e1iPmOrtpK4tGHxwqO5+OHmfncISp
g4x4G7o+aK5M9c3+9r3G7S1ZrV4LAcFLVxN9LHo0PLIG/lT7tl8e6dAlWZjKSRIC
w18lJ7mdOCXgJqOt4UzIRbWpI0pZaXt0tk1mfU+/vfzhl9C5AVMVNGqgaiafHWGG
Kh01vjjSIGhfE3k83eOGlkYpTGvFsZBMVUMHif8hRfrBjiHj7RnvAfXqV0ULpa50
v0huqrPwB+384c2V6K1f6IbpyLXZQ10h4HwctxprKbO+cZZ5oYlxLlu8tkJFvNMD
QxLC8aDIumuprilMgryd
=x509
-----END PGP SIGNATURE-----
Merge tag 'regulator-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"A smallish fix for a lock dependency issue which affects a bunch of
Qualcomm boards that do unusually complicated things with their
regulators, the API is unlikely to be called by any other system."
* tag 'regulator-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: Fix recursive mutex lockdep warning
Don't call v4l2_ctrl_g_ctrl on ctrls which the model cam in question
does not have.
Reported-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ Taken directly, since Mauro is on vacation ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJP+hnqAAoJENkgDmzRrbjxkpcP/jl53FUc6M8L9djVSu/5EIxV
2uLCoGL7kPq2f168VJ4002f1Fa5Irld9h7A4qlqxn8sE/5OGTEvSGnw3/rkoeMeB
NLjZAFCre+qN0kWFFeZN76TlTGIJpudjsw4EE5fV8CSEbjSRt4SzQ1r8euRKGc0m
1mqp5rCeUAPrDhRuEyqcqKU36CjDDnNukz1Q1tADgTJ6Q0T8yBZkDqPOIpNy6LCv
YfazFiZDox7Wx7abPGJga4mIz1OyUR6l4vD8trJHrUaWlirLZE51+Lf69tAXP+Az
lxvCyiXVTuNuj/TI8FOejWdixmMW3GczeSn7KCzbke1GbUfne1DBfxDImdJg6VbG
AhQHUqvJAWHuohS5mnJpFCQZIrfkScyXC0gVm+g/dIJTf+4jqirk4DZyHZecznt1
T5FdleRqO2ws0BMxeD3r/d/q5aUR0ryS08CLzFLP0GDcFgU6wANM87G1Q4IahZ7A
gUjQ82piy1zzYML3/4r4qPTWAk1BGCYB6yfSVlCSwwhp+U5qER7cvrTugm4/eA6g
4QkDUlGcwGryN6/VSNIoT6ymIB5SpFgOuAcF011MKl9OMUnOyshsiLz3GuTEGYim
GZmcJsFnV9rbGTQ2DjYw8fny+DB4wwOTkHjdwVGfKBII3nY4Puc8bzaShqbDj+nH
G1VIKLiZPmRVgm4cc5jC
=vMWl
-----END PGP SIGNATURE-----
Merge tag 'virtio-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
Pull minor virtio-balloon fix from Rusty Russell:
"Theoretical fix, which greatly simplifies upcoming balloon patches
which will go in via some vm tree."
* tag 'virtio-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
virtio-balloon: fix add/get API use
were reported by Fernando Guzman Lugo.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJP+dsKAAoJELLolMlTRIoMOTEQAK2WOgW4ygXDOfhwZCj4ooXG
Mk9PUQlEBfEufvKmK/Oh3xDVeP0i4+3BeWZ4Kgmas689SJc+LIbLSS0HtARcYF1+
ZTEj7Gnax3P3tD6GxwU5bf+jrirCbr5sSeDa7p5jh017zVsZ5MVYOYvWXKQtYAZg
hi6m2AqOUfkI3yOnWRbNMds+x3TTwpLC/3mT9mcFiJ/3L9SGx22SVfkRIa8tstjg
i0eJDTu9cbHHZ7wtuGQoVZDO5kHLi7dPKnFTiFat8dklaSNMzNBrs5Tl4DT0XA3s
tzb+BrnkddL2L36cC0XIyV1L/tQCdtAWLRFx5SDfpXKNQcYpqlMAteaAXdhf4Tjl
kJapLhGeTVRSED6TvYX9XMFbi4U8PBSBvHd4VMryN1jGcmslE3UYCPIoqOL/aIZP
dc9POY6JM9r/pMlzhQ6VUmTiecCVeCJyCh8x8r1wj7V1mC1nCxDskxObibHFjRPk
4xdmke0+UR3fn4/aA/PRZbiy3kpT84lK9SoPlzUKLLBMzJ3KMLl5Dug2+Y8Aeutn
gBxt9x+FA4UgmS62St97Cu7qbzWvAbpl95rFD4aIANMNxU9OkcjgWaB7aJzr8WWt
j8Z2OkT0WW4ka60sGj0YxAEDlKhxC8f1lzD5tuJgmVFEbXhNTZaw+GG/ZwIqb5bq
g7KQY8FnGDETO9Tew1QQ
=nbt8
-----END PGP SIGNATURE-----
Merge tag 'rpmsg-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg
Pull rpmsg fixes from Ohad Ben-Cohen:
"Fixing two (somewhat rare) endpoint-related race issues, both of which
were reported by Fernando Guzman Lugo."
* tag 'rpmsg-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg:
rpmsg: make sure inflight messages don't invoke just-removed callbacks
rpmsg: avoid premature deallocation of endpoints
from Shinya Kuribayashi.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJP+ZcQAAoJELLolMlTRIoMUP8P/iRgrwLmfsUN/w+/PAv8rRCP
aprvEamj+Jq6DiA63C3Fg6j03mN4ADIzjanqupttMEJekKXyW2mJh2WzDdi5qQ39
pWBhp8wBl0vIdZXOEpOc3baE82XQFTgv4iTxFmcjXCDRoaOlLJqNGpehs5h4yvRK
+QoiBfucmoa7atKkOoQxlZQyUZvkuxcSnpZSd4m39ylrk4LTQHoXJ1+6OrBKFeGV
wzHTO9kGdPdAkpW73TatBMsu2unzLAAO1ImP2ETKL/JlqG8W5NhLZwrEbtq5JKtr
nUyskUp3xdZYcevXOzZNwTp6uRcdoUUEh3OzT+xEGpoh8V5lU5NSSFOCaR4UQrm+
lyz1HePAc9nNG7MErpcLu4WSY5xPqukAk3ML07t2zRgzMp/8BUHKDT0n43HJchSO
JXKz9k0x21hE+uGfyi0euC1dnJpiZxFVIsn5m7YP/WeQdyKpO4EKOi1N7LGWALUh
s6rE7czWfpXxHfBwXpWpqIVZQsFG8KN19XAe3uEXDZVBMJwFUiJcQxBPJn7nE9Tl
wQNeD7Lj1It4AyL8zSTRVpLjwXWbzG2LrLx5VhPDgLNGGnoFAmMb+2O6oFutCHwM
T+zJH7il9IroSArMGDZaGWRc/+EDWTqEBBwm/JFAyAzWw/SIxkJhfdl6IrJ9cMtw
AvW/7lYyE3EbUfZOR4sX
=dKmT
-----END PGP SIGNATURE-----
Merge tag 'hwspinlock-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock
Pull hwspinlock fix from Ohad Ben-Cohen:
"A single hwspinlock core fix for multiple hwspinlock devices
scenarios, from Shinya Kuribayashi."
* tag 'hwspinlock-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock:
hwspinlock/core: use global ID to register hwspinlocks on multiple devices
The patches fix several issues in the AMD IOMMU driver, the NVidia SMMU
driver, and the DMA debug code. The most important fix for the AMD IOMMU
solves a problem with SR-IOV devices where virtual functions did not
work with IOMMU enabled. The NVidia SMMU patch fixes a possible sleep
while spin-lock situation (queued the small fix for v3.5, a better but
more intrusive fix is coming for v3.6). The DMA debug patches fix a
possible data corruption issue due to bool vs. u32 usage.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJP8YJLAAoJECvwRC2XARrjv74QAJKyaUoXbhqst84RU1owiFcz
A2oUC/DwqEGJD3VrixcY6Ih/2SXYuHd+Cpjd2Q8Eyc2gq66aXy7DTm/Rp+TeMxFD
M65ZZzbbrVeU2Ym7apNeKe3AzaspVRLKThirbCK5/cMuM6l0Dmh6+X/O1iqnagju
OLsd+rk8eMRPF7N2dFdG28OwI/rdWkkFj1apdo4BwavPjDP06C7pqd9/sbxtDc55
u7Yxsw456U7N/aimBKn24EeXo8NmT58W+NsNMxMqrF3sJ/gzJCotnYTR/ijZCz5w
jv1JQlhNSL+tD2F7yQMXLmhvK72UvFPrMiRK/mSgDnQUoSxxiuKDRQ+TnDardhdC
YMTj6488cnMqjmPIlkz41Wfu7knMDZVT8yUdfBy9nWEQYe6ALfLwVWwDSZzkA4SA
Dl0VyryIZMqeJbuCbPO5mR0Lh2WzBEapHSpoo30bz7E0n+F7Uw47XVSWfkiGEbfv
z41oKmqEK6xD1Kl/71gzwlADMRwkIrX+qDWuOTTRuHJVhb58Kwg3/P05hTAe6E3b
BDXROOiCO9GiNuecH0QbdBQNOEFsCdIqpqzUMaGzUOQHHP1w6Y8dMn3TZnabX5wO
0mZWH1tzBmIPkI4G5V9h3UaurqJT0ZdsAehyjBtvACBOn8TUyricbvHF/imVS4T1
zzHcTVEFLV4ocSpG3ZsY
=1jsG
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v3.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"The patches fix several issues in the AMD IOMMU driver, the NVidia
SMMU driver, and the DMA debug code.
The most important fix for the AMD IOMMU solves a problem with SR-IOV
devices where virtual functions did not work with IOMMU enabled. The
NVidia SMMU patch fixes a possible sleep while spin-lock situation
(queued the small fix for v3.5, a better but more intrusive fix is
coming for v3.6). The DMA debug patches fix a possible data
corruption issue due to bool vs u32 usage."
* tag 'iommu-fixes-v3.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: fix type bug in flush code
dma-debug: debugfs_create_bool() takes a u32 pointer
iommu/tegra: smmu: Fix unsleepable memory allocation
iommu/amd: Initialize dma_ops for hotplug and sriov devices
iommu/amd: Fix missing iommu_shutdown initialization in passthrough mode
Since ee7cd8981e 'virtio: expose added
descriptors immediately.', in virtio balloon virtqueue_get_buf might
now run concurrently with virtqueue_kick. I audited both and this
seems safe in practice but this is not guaranteed by the API.
Additionally, a spurious interrupt might in theory make
virtqueue_get_buf run in parallel with virtqueue_add_buf, which is
racy.
While we might try to protect against spurious callbacks it's
easier to fix the driver: balloon seems to be the only one
(mis)using the API like this, so let's just fix balloon.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed unused var)
Pull cgroup fixes from Tejun Heo:
"The previous cgroup pull request contained a patch to fix a race
condition during cgroup hierarchy umount. Unfortunately, while the
patch reduced the race window such that the test case I and Sasha were
using didn't trigger it anymore, it wasn't complete - Shyju and Li
could reliably trigger the race condition using a different test case.
The problem wasn't the gap between dentry deletion and release which
the previous patch tried to fix. The window was between the last
dput() of a root's child and the resulting dput() of the root. For
cgroup dentries, the deletion and release always happen synchronously.
As this releases the s_active ref, the refcnt of the root dentry,
which doesn't hold s_active, stays above zero without the
corresponding s_active. If umount was in progress, the last
deactivate_super() proceeds to destory the superblock and triggers
BUG() on the non-zero root dentry refcnt after shrinking.
This issue surfaced because cgroup dentries are now allowed to linger
after rmdir(2) since 3.5-rc1. Before, rmdir synchronously drained the
dentry refcnt and the s_active acquired by rmdir from vfs layer
protected the whole thing. After 3.5-rc1, cgroup may internally hold
and put dentry refs after rmdir finishes and the delayed dput()
doesn't have surrounding s_active ref exposing this issue.
This pull request contains two patches - one reverting the previous
incorrect fix and the other adding the surrounding s_active ref around
the delayed dput().
This is quite late in the release cycle but the change is on the safer
side and fixes the test cases reliably, so I don't think it's too
crazy."
* 'for-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix cgroup hierarchy umount race
Revert "cgroup: superblock can't be released with active dentries"
Pull security docs update from James Morris.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: Minor improvements to no_new_privs documentation
We already use them for openat() and friends, but fchdir() also wants to
be able to use O_PATH file descriptors. This should make it comparable
to the O_SEARCH of Solaris. In particular, O_PATH allows you to access
(not-quite-open) a directory you don't have read persmission to, only
execute permission.
Noticed during development of multithread support for ksh93.
Reported-by: ольга крыжановская <olga.kryzhanovska@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org # O_PATH introduced in 3.0+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
48ddbe1946 "cgroup: make css->refcnt clearing on cgroup removal
optional" allowed a css to linger after the associated cgroup is
removed. As a css holds a reference on the cgroup's dentry, it means
that cgroup dentries may linger for a while.
Destroying a superblock which has dentries with positive refcnts is a
critical bug and triggers BUG() in vfs code. As each cgroup dentry
holds an s_active reference, any lingering cgroup has both its dentry
and the superblock pinned and thus preventing premature release of
superblock.
Unfortunately, after 48ddbe1946, there's a small window while
releasing a cgroup which is directly under the root of the hierarchy.
When a cgroup directory is released, vfs layer first deletes the
corresponding dentry and then invokes dput() on the parent, which may
recurse further, so when a cgroup directly below root cgroup is
released, the cgroup is first destroyed - which releases the s_active
it was holding - and then the dentry for the root cgroup is dput().
This creates a window where the root dentry's refcnt isn't zero but
superblock's s_active is. If umount happens before or during this
window, vfs will see the root dentry with non-zero refcnt and trigger
BUG().
Before 48ddbe1946, this problem didn't exist because the last dentry
reference was guaranteed to be put synchronously from rmdir(2)
invocation which holds s_active around the whole process.
Fix it by holding an extra superblock->s_active reference across
dput() from css release, which is the dput() path added by 48ddbe1946
and the only one which doesn't hold an extra s_active ref across the
final cgroup dput().
Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <4FEEA5CB.8070809@huawei.com>
Reported-by: shyju pv <shyju.pv@huawei.com>
Tested-by: shyju pv <shyju.pv@huawei.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Li Zefan <lizefan@huawei.com>
This reverts commit fa980ca87d. The
commit was an attempt to fix a race condition where a cgroup hierarchy
may be unmounted with positive dentry reference on root cgroup. While
the commit made the race condition slightly more difficult to trigger,
the race was still there and could be reliably triggered using a
different test case.
Revert the incorrect fix. The next commit will describe the race and
fix it correctly.
Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <4FEEA5CB.8070809@huawei.com>
Reported-by: shyju pv <shyju.pv@huawei.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Commit 300bab9770 (hwspinlock/core: register a bank of hwspinlocks in a
single API call, 2011-09-06) introduced 'hwspin_lock_register_single()'
to register numerous (a bank of) hwspinlock instances in a single API,
'hwspin_lock_register()'.
At which time, 'hwspin_lock_register()' accidentally passes 'local IDs'
to 'hwspin_lock_register_single()', despite that ..._single() requires
'global IDs' to register hwspinlocks.
We have to convert into global IDs by supplying the missing 'base_id'.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Shinya Kuribayashi <shinya.kuribayashi.px@renesas.com>
[ohad: fix error path of hwspin_lock_register, too]
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Pull ARM fixes from Russell King:
"Last merge window, we had some updates from Al cleaning up the signal
restart handling. These have caused some problems on ARM, and while
Al has some fixes, we have some concerns with Al's patches but we've
been unsuccesful with discussing this.
We have got to the point where we need to do something, and we've
decided that the best solution is to revert the appropriate commits
until Al is able to reply to us.
Also included here are four patches to fix warnings that I've noticed
in my build system, and one fix for kprobes test code."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: fix warning caused by wrongly typed arm_dma_limit
ARM: fix warnings about atomic64_read
ARM: 7440/1: kprobes: only test 'sub pc, pc, #1b-2b+8-2' on ARMv6
ARM: 7441/1: perf: return -EOPNOTSUPP if requested mode exclusion is unavailable
ARM: 7443/1: Revert "new way of handling ERESTART_RESTARTBLOCK"
ARM: 7442/1: Revert "remove unused restart trampoline"
ARM: fix set_domain() macro
ARM: fix mach-versatile/pci.c warning
The documentation didn't actually mention how to enable no_new_privs.
This also adds a note about possible interactions between
no_new_privs and LSMs (i.e. why teaching systemd to set no_new_privs
is not necessarily a good idea), and it references the new docs
from include/linux/prctl.h.
Suggested-by: Rob Landley <rob@landley.net>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>