Currently, there is assumption in early MPU setup code that kernel
image is located in RAM, which is obviously not true for XIP. To run
code from ROM we need to make sure that it is covered by MPU. However,
due to we allocate regions (semi-)dynamically we can run into issue of
trimming region we are running from in case ROM spawns several MPU
regions. To help deal with that we enforce minimum alignments for start
end end of XIP address space as 1MB and 128Kb correspondingly.
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
PMSAv7 defines curious alignment requirements to the regions:
- size must be power of 2, and
- region start must be aligned to the region size
Because of that we currently adjust lowmem bounds plus we assign
only one MPU region to cover memory all these lead to significant amount of
memory could be wasted. As an example, consider 64Mb of memory at
0x70000000 - it fits alignment requirements nicely; now, imagine that
2Mb of memory is reserved for coherent DMA allocation, so now Linux is
expected to see 62Mb of memory... and here annoying thing happens -
memory gets truncated to 32Mb (we've lost 30Mb!), i.e. MPU layout
looks like:
0: base 0x70000000, size 0x2000000
This patch tries to allocate as much as possible MPU slots to minimise
amount of truncated memory. Moreover, with this patch MPU subregions
starting to get used. MPU subregions allow us reduce the number of MPU
slots used. For example given above, MPU layout looks like:
0: base 0x70000000, size 0x2000000
1: base 0x72000000, size 0x1000000
2: base 0x73000000, size 0x1000000, disable subreg 7 (0x73e00000 - 0x73ffffff)
Where without subregions we'd get:
0: base 0x70000000, size 0x2000000
1: base 0x72000000, size 0x1000000
2: base 0x73000000, size 0x800000
3: base 0x73800000, size 0x400000
4: base 0x73c00000, size 0x200000
To achieve better layout we fist try to cover specified memory as is
(maybe with help of subregions) and if we failed, we truncate memory
to fit alignment requirements (so it occupies one MPU slot) and
perform one more attempt with the reminder, and so on till we either
cover all memory or run out of MPU slots.
Tested-by: Szemző András <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
This patch makes it possible to use MPU with v7M cores.
Tested-by: Szemző András <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
The last user of CONFIG_VECTORS_BASE has gone, so kill it.
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Reported-by: Afzal Mohammed <afzal.mohd.ma@gmail.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
It seems that MPU never worked with XIP, so we just disallow such
combination.
Tested-by: Szemző András <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Currently, there are several issues with how MPU is setup:
1. We won't boot if MPU is missing
2. We won't boot if use XIP
3. Further extension of MPU setup requires asm skills
The 1st point can be relaxed, so we can continue with boot CPU even if
MPU is missed and fail boot for secondaries only. To address the 2nd
point we could create region covering CONFIG_XIP_PHYS_ADDR - _end and
that might work for the first stage of MPU enable, but due to MPU's
alignment requirement we could cover too much, IOW we need more
flexibility in how we're partitioning memory regions... and it'd be
hardly possible to archive because of the 3rd point.
This patch is trying to address 1st and 3rd issues and paves the path
for 2nd and further improvements.
The most visible change introduced with this patch is that we start
using mpu_rgn_info array (as it was supposed?), so change in MPU setup
done by boot CPU is recorded there and feed to secondaries. It
allows us to keep minimal region setup for boot CPU and do the rest in
C. Since we start programming MPU regions in C evaluation of MPU
constrains (number of regions supported and minimal region order) can
be done once, which in turn open possibility to free-up "probe"
region early.
Tested-by: Szemző András <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Currently, inline assembly for accessing to MPU's cp15 lacks volatile
keyword which opens possibility to compiler to optimise such accesses
as soon as we start using them more intensively. Rather than fixing
inline asm, lets move MPU accessors to use cp15 helpers which do the
right thing.
Tested-by: Szemző András <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Having MPU handling code in dedicated module makes it easier to
enhance/maintain it.
Tested-by: Szemző András <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
If CONFIG_DEBUG_LOCK_ALLOC=y, the kernel log is spammed with a few
hundred identical messages:
unwind: Unknown symbol address c0800300
unwind: Index not found c0800300
c0800300 is the return address from the last subroutine call (to
__memzero()) in __mmap_switched(). Apparently having this address in
the link register confuses the unwinder.
To fix this, reset the link register to zero before jumping to
start_kernel().
Fixes: 9520b1a1b5 ("ARM: head-common.S: speed up startup code")
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
This series provides the needed changes to suport the ELF_FDPIC binary
format on ARM. Both MMU and non-MMU systems are supported. This format
has many advantages over the BFLT format used on MMU-less systems, such
as being real ELF that can be parsed by standard tools, can support
shared dynamic libs, etc.
This contains important fixes to the XIP linker script, some more linker
script cleanups, .bss clearing and .data copying speedups related to the
above, and an opt-in config option for XIP kernels that allows for
compressing .data in ROM that depend on those other patches to work
properly.
- Minor improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZvXBZAAoJEGb5WYXrGLvB7agQAIAUlQCUuVIChJVs2YYEyyR5
PCkgFC/sjQdT3nAUpvZohP5sTwdSzBaxWWFv6lbPmh/1AFzukF95sfehsBn2qPI7
enuaSEmc5CUiuj76aTspS10l0G8oLS721pnTbTxicaq1ZHl+NIMkAFwRtaTdileq
4WBS+cfdqdj4aR/qWOTmJ8OXWEL7oHLFT1Xnc+uJpD/SltvWBj9gG8fJiwQsWIi7
qiB5KeUoU3XxYI4hQkvuhr81xhvM9iz9M/4fDJocdKS4aU1Aw/btnhXgU1yG1nOm
1Jwqxr3idrbdkLBajn9MK/DxfmdkYbcb93O7kw5QuHMvXzzhVjB824PAR/k2iK3B
3JSpGFfRgtCEQk5NeiQb0NIVTbc+H133ppxoDr9CTxgikvza3Utc27oDIxUkQKuZ
Cp/2TcoD0e1/+ul22QO8TQLjfpSVo34wLOO7aBEfuWtwYwPoSkfJv0BjBzHGdQUj
XQXjdItPhlNp/tEHdto62LIZzae2C+FEHYWWIRRSh505tM8V/69skOaOp3NRmFUQ
SmX6Wzuc1i6b3UjegvIJsZfrY+0eZNE9n3UufSyqhT6GKV0oOZhyvTsrIkGuJ3L5
LJm+3BApQGEVi2rHFa8dBpcibVn7kIFVVTmKTPkOpen9AnWXlsrjg41N02FxNo+J
ewhIkECzGMxLxtxHy+yp
=gwDT
-----END PGP SIGNATURE-----
Merge tag 'upstream-4.14-rc1' of git://git.infradead.org/linux-ubifs
Pull UBI updates from Richard Weinberger:
"Minor improvements"
* tag 'upstream-4.14-rc1' of git://git.infradead.org/linux-ubifs:
UBI: Fix two typos in comments
ubi: fastmap: fix spelling mistake: "invalidiate" -> "invalidate"
ubi: pr_err() strings should end with newlines
ubi: pr_err() strings should end with newlines
ubi: pr_err() strings should end with newlines
Pull UML updates from Richard Weinberger:
- minor improvements
- fixes for Debian's new gcc defaults (pie enabled by default)
- fixes for XSTATE/XSAVE to make UML work again on modern systems
* 'for-linus-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: return negative in tuntap_open_tramp()
um: remove a stray tab
um: Use relative modversions with LD_SCRIPT_DYN
um: link vmlinux with -no-pie
um: Fix CONFIG_GCOV for modules.
Fix minor typos and grammar in UML start_up help
um: defconfig: Cleanup from old Kconfig options
um: Fix FP register size for XSTATE/XSAVE
Pull networking fixes from David Miller:
1) Fix hotplug deadlock in hv_netvsc, from Stephen Hemminger.
2) Fix double-free in rmnet driver, from Dan Carpenter.
3) INET connection socket layer can double put request sockets, fix
from Eric Dumazet.
4) Don't match collect metadata-mode tunnels if the device is down,
from Haishuang Yan.
5) Do not perform TSO6/GSO on ipv6 packets with extensions headers in
be2net driver, from Suresh Reddy.
6) Fix scaling error in gen_estimator, from Eric Dumazet.
7) Fix 64-bit statistics deadlock in systemport driver, from Florian
Fainelli.
8) Fix use-after-free in sctp_sock_dump, from Xin Long.
9) Reject invalid BPF_END instructions in verifier, from Edward Cree.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
mlxsw: spectrum_router: Only handle IPv4 and IPv6 events
Documentation: link in networking docs
tcp: fix data delivery rate
bpf/verifier: reject BPF_ALU64|BPF_END
sctp: do not mark sk dumped when inet_sctp_diag_fill returns err
sctp: fix an use-after-free issue in sctp_sock_dump
netvsc: increase default receive buffer size
tcp: update skb->skb_mstamp more carefully
net: ipv4: fix l3slave check for index returned in IP_PKTINFO
net: smsc911x: Quieten netif during suspend
net: systemport: Fix 64-bit stats deadlock
net: vrf: avoid gcc-4.6 warning
qed: remove unnecessary call to memset
tg3: clean up redundant initialization of tnapi
tls: make tls_sw_free_resources static
sctp: potential read out of bounds in sctp_ulpevent_type_enabled()
MAINTAINERS: review Renesas DT bindings as well
net_sched: gen_estimator: fix scaling error in bytes/packets samples
nfp: wait for the NSP resource to appear on boot
nfp: wait for board state before talking to the NSP
...
Pull more input updates from Dmitry Torokhov:
"A second round of updates for the input subsystem:
- a new driver for PWM-controlled vibrators
- ucb1400 touchscreen driver had completely busted suspend/resume
handling
- we now handle "home" button found on some devices with Goodix
touchscreens
- assorted other fixups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: i8042 - add Gigabyte P57 to the keyboard reset table
Input: xpad - validate USB endpoint type during probe
Input: ucb1400_ts - fix suspend and resume handling
Input: edt-ft5x06 - fix access to non-existing register
Input: elantech - make arrays debounce_packet static, reduces object code size
Input: surface3_spi - make const array header static, reduces object code size
Input: goodix - add support for capacitive home button
Input: add a driver for PWM controllable vibrators
Input: adi - make array seq static, reduces object code size
Commit 5620a0d1aa ("firmware: delete in-kernel firmware") removed the
entire firmware directory. Unfortunately it thereby also removed the
support for built-in firmware.
This restores the ability to build firmware directly into the kernel by
pruning the original Makefile to the necessary minimum. The default for
EXTRA_FIRMWARE_DIR is now the standard directory /lib/firmware/.
Fixes: 5620a0d1aa ("firmware: delete in-kernel firmware")
Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Acked-by: Greg K-H <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The driver doesn't support events from address families other than IPv4
and IPv6, so ignore them. Otherwise, we risk queueing a work item before
it's initialized.
This can happen in case a VRF is configured when MROUTE_MULTIPLE_TABLES
is enabled, as the VRF driver will try to add an l3mdev rule for the
IPMR family.
Fixes: 65e65ec137 ("mlxsw: spectrum_router: Don't ignore IPv6 notifications")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Andreas Rammhold <andreas@rammhold.de>
Reported-by: Florian Klink <flokli@flokli.de>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now skb->mstamp_skb is updated later, we also need to call
tcp_rate_skb_sent() after the update is done.
Fixes: 8c72c65b42 ("tcp: update skb->skb_mstamp more carefully")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for 4.14 for MIPS; below a summary of
the non-merge commits:
CM:
- Rename mips_cm_base to mips_gcr_base
- Specify register size when generating accessors
- Use BIT/GENMASK for register fields, order & drop shifts
- Add cluster & block args to mips_cm_lock_other()
CPC:
- Use common CPS accessor generation macros
- Use BIT/GENMASK for register fields, order & drop shifts
- Introduce register modify (set/clear/change) accessors
- Use change_*, set_* & clear_* where appropriate
- Add CM/CPC 3.5 register definitions
- Use GlobalNumber macros rather than magic numbers
- Have asm/mips-cps.h include CM & CPC headers
- Cluster support for topology functions
- Detect CPUs in secondary clusters
CPS:
- Read GIC_VL_IDENT directly, not via irqchip driver
DMA:
- Consolidate coherent and non-coherent dma_alloc code
- Don't use dma_cache_sync to implement fd_cacheflush
FPU emulation / FP assist code:
- Another series of 14 commits fixing corner cases such as NaN
propgagation and other special input values.
- Zero bits 32-63 of the result for a CLASS.D instruction.
- Enhanced statics via debugfs
- Do not use bools for arithmetic. GCC 7.1 moans about this.
- Correct user fault_addr type
Generic MIPS:
- Enhancement of stack backtraces
- Cleanup from non-existing options
- Handle non word sized instructions when examining frame
- Fix detection and decoding of ADDIUSP instruction
- Fix decoding of SWSP16 instruction
- Refactor handling of stack pointer in get_frame_info
- Remove unreachable code from force_fcr31_sig()
- Convert to using %pOF instead of full_name
- Remove the R6000 support.
- Move FP code from *_switch.S to *_fpu.S
- Remove unused ST_OFF from r2300_switch.S
- Allow platform to specify multiple its.S files
- Add #includes to various files to ensure code builds reliable and
without warning..
- Remove __invalidate_kernel_vmap_range
- Remove plat_timer_setup
- Declare various variables & functions static
- Abstract CPU core & VP(E) ID access through accessor functions
- Store core & VP IDs in GlobalNumber-style variable
- Unify checks for sibling CPUs
- Add CPU cluster number accessors
- Prevent direct use of generic_defconfig
- Make CONFIG_MIPS_MT_SMP default y
- Add __ioread64_copy
- Remove unnecessary inclusions of linux/irqchip/mips-gic.h
GIC:
- Introduce asm/mips-gic.h with accessor functions
- Use new GIC accessor functions in mips-gic-timer
- Remove counter access functions from irq-mips-gic.c
- Remove gic_read_local_vp_id() from irq-mips-gic.c
- Simplify shared interrupt pending/mask reads in irq-mips-gic.c
- Simplify gic_local_irq_domain_map() in irq-mips-gic.c
- Drop gic_(re)set_mask() functions in irq-mips-gic.c
- Remove gic_set_polarity(), gic_set_trigger(), gic_set_dual_edge(),
gic_map_to_pin() and gic_map_to_vpe() from irq-mips-gic.c.
- Convert remaining shared reg access, local int mask access and
remaining local reg access to new accessors
- Move GIC_LOCAL_INT_* to asm/mips-gic.h
- Remove GIC_CPU_INT* macros from irq-mips-gic.c
- Move various definitions to the driver
- Remove gic_get_usm_range()
- Remove __gic_irq_dispatch() forward declaration
- Remove gic_init()
- Use mips_gic_present() in place of gic_present and remove
gic_present
- Move gic_get_c0_*_int() to asm/mips-gic.h
- Remove linux/irqchip/mips-gic.h
- Inline __gic_init()
- Inline gic_basic_init()
- Make pcpu_masks a per-cpu variable
- Use pcpu_masks to avoid reading GIC_SH_MASK*
- Clean up mti, reserved-cpu-vectors handling
- Use cpumask_first_and() in gic_set_affinity()
- Let the core set struct irq_common_data affinity
microMIPS:
- Fix microMIPS stack unwinding on big endian systems
MIPS-GIC:
- SYNC after enabling GIC region
NUMA:
- Remove the unused parent_node() macro
R6:
- Constify r2_decoder_tables
- Add accessor & bit definitions for GlobalNumber
SMP:
- Constify smp ops
- Allow boot_secondary SMP op to return errors
VDSO:
- Drop gic_get_usm_range() usage
- Avoid use of linux/irqchip/mips-gic.h
Platform changes:
Alchemy:
- Add devboard machine type to cpuinfo
- update cpu feature overrides
- Threaded carddetect irqs for devboards
AR7:
- allow NULL clock for clk_get_rate
BCM63xx:
- Fix ENETDMA_6345_MAXBURST_REG offset
- Allow NULL clock for clk_get_rate
CI20:
- Enable GPIO and RTC drivers in defconfig
- Add ethernet and fixed-regulator nodes to DTS
Generic platform:
- Move Boston and NI 169445 FIT image source to their own files
- Include asm/bootinfo.h for plat_fdt_relocated()
- Include asm/time.h for get_c0_*_int()
- Include asm/bootinfo.h for plat_fdt_relocated()
- Include asm/time.h for get_c0_*_int()
- Allow filtering enabled boards by requirements
- Don't explicitly disable CONFIG_USB_SUPPORT
- Bump default NR_CPUS to 16
JZ4700:
- Probe the jz4740-rtc driver from devicetree
Lantiq:
- Drop check of boot select from the spi-falcon driver.
- Drop check of boot select from the lantiq-flash MTD driver.
- Access boot cause register in the watchdog driver through regmap
- Add device tree binding documentation for the watchdog driver
- Add docs for the RCU DT bindings.
- Convert the fpi bus driver to a platform_driver
- Remove ltq_reset_cause() and ltq_boot_select(
- Switch to a proper reset driver
- Switch to a new drivers/soc GPHY driver
- Add an USB PHY driver for the Lantiq SoCs using the RCU module
- Use of_platform_default_populate instead of __dt_register_buses
- Enable MFD_SYSCON to be able to use it for the RCU MFD
- Replace ltq_boot_select() with dummy implementation.
Loongson 2F:
- Allow NULL clock for clk_get_rate
Malta:
- Use new GIC accessor functions
NI 169445:
- Add support for NI 169445 board.
- Only include in 32r2el kernels
Octeon:
- Add support for watchdog of 78XX SOCs.
- Add support for watchdog of CN68XX SOCs.
- Expose support for mips32r1, mips32r2 and mips64r1
- Enable more drivers in config file
- Add support for accessing the boot vector.
- Remove old boot vector code from watchdog driver
- Define watchdog registers for 70xx, 73xx, 78xx, F75xx.
- Make CSR functions node aware.
- Allow access to CIU3 IRQ domains.
- Misc cleanups in the watchdog driver
Omega2+:
- New board, add support and defconfig
Pistachio:
- Enable Root FS on NFS in defconfig
Ralink:
- Add Mediatek MT7628A SoC
- Allow NULL clock for clk_get_rate
- Explicitly request exclusive reset control in the pci-mt7620 PCI driver.
SEAD3:
- Only include in 32 bit kernels by default
VoCore:
- Add VoCore as a vendor t0 dt-bindings
- Add defconfig file"
* '4.14-features' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (167 commits)
MIPS: Refactor handling of stack pointer in get_frame_info
MIPS: Stacktrace: Fix microMIPS stack unwinding on big endian systems
MIPS: microMIPS: Fix decoding of swsp16 instruction
MIPS: microMIPS: Fix decoding of addiusp instruction
MIPS: microMIPS: Fix detection of addiusp instruction
MIPS: Handle non word sized instructions when examining frame
MIPS: ralink: allow NULL clock for clk_get_rate
MIPS: Loongson 2F: allow NULL clock for clk_get_rate
MIPS: BCM63XX: allow NULL clock for clk_get_rate
MIPS: AR7: allow NULL clock for clk_get_rate
MIPS: BCM63XX: fix ENETDMA_6345_MAXBURST_REG offset
mips: Save all registers when saving the frame
MIPS: Add DWARF unwinding to assembly
MIPS: Make SAVE_SOME more standard
MIPS: Fix issues in backtraces
MIPS: jz4780: DTS: Probe the jz4740-rtc driver from devicetree
MIPS: Ci20: Enable RTC driver
watchdog: octeon-wdt: Add support for 78XX SOCs.
watchdog: octeon-wdt: Add support for cn68XX SOCs.
watchdog: octeon-wdt: File cleaning.
...
Pull more i2c updates from Wolfram Sang:
"I2C has two more new drivers: Altera FPGA and STM32F7"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i2c-stm32f7: add driver
i2c: i2c-stm32f4: use generic definition of speed enum
dt-bindings: i2c-stm32: Document the STM32F7 I2C bindings
i2c: altera: Add Altera I2C Controller driver
dt-bindings: i2c: Add Altera I2C Controller
Neither ___bpf_prog_run nor the JITs accept it.
Also adds a new test case.
Fixes: 17a5267067 ("bpf: verifier (add verifier core)")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
sctp_diag would not actually dump out sk/asoc if inet_sctp_diag_fill
returns err, in which case it shouldn't mark sk dumped by setting
cb->args[3] as 1 in sctp_sock_dump().
Otherwise, it could cause some asocs to have no parent's sk dumped
in 'ss --sctp'.
So this patch is to not set cb->args[3] when inet_sctp_diag_fill()
returns err in sctp_sock_dump().
Fixes: 8f840e47f1 ("sctp: add the sctp_diag.c file")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 86fdb3448c ("sctp: ensure ep is not destroyed before doing the
dump") tried to fix an use-after-free issue by checking !sctp_sk(sk)->ep
with holding sock and sock lock.
But Paolo noticed that endpoint could be destroyed in sctp_rcv without
sock lock protection. It means the use-after-free issue still could be
triggered when sctp_rcv put and destroy ep after sctp_sock_dump checks
!ep, although it's pretty hard to reproduce.
I could reproduce it by mdelay in sctp_rcv while msleep in sctp_close
and sctp_sock_dump long time.
This patch is to add another param cb_done to sctp_for_each_transport
and dump ep->assocs with holding tsp after jumping out of transport's
traversal in it to avoid this issue.
It can also improve sctp diag dump to make it run faster, as no need
to save sk into cb->args[5] and keep calling sctp_for_each_transport
any more.
This patch is also to use int * instead of int for the pos argument
in sctp_for_each_transport, which could make postion increment only
in sctp_for_each_transport and no need to keep changing cb->args[2]
in sctp_sock_filter and sctp_sock_dump any more.
Fixes: 86fdb3448c ("sctp: ensure ep is not destroyed before doing the dump")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The default receive buffer size was reduced by recent change
to a value which was appropriate for 10G and Windows Server 2016.
But the value is too small for full performance with 40G on Azure.
Increase the default back to maximum supported by host.
Fixes: 8b5327975a ("netvsc: allow controlling send/recv buffer size")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
liujian reported a problem in TCP_USER_TIMEOUT processing with a patch
in tcp_probe_timer() :
https://www.spinics.net/lists/netdev/msg454496.html
After investigations, the root cause of the problem is that we update
skb->skb_mstamp of skbs in write queue, even if the attempt to send a
clone or copy of it failed. One reason being a routing problem.
This patch prevents this, solving liujian issue.
It also removes a potential RTT miscalculation, since
__tcp_retransmit_skb() is not OR-ing TCP_SKB_CB(skb)->sacked with
TCPCB_EVER_RETRANS if a failure happens, but skb->skb_mstamp has
been changed.
A future ACK would then lead to a very small RTT sample and min_rtt
would then be lowered to this too small value.
Tested:
# cat user_timeout.pkt
--local_ip=192.168.102.64
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+0 `ifconfig tun0 192.168.102.64/16; ip ro add 192.0.2.1 dev tun0`
+0 < S 0:0(0) win 0 <mss 1460>
+0 > S. 0:0(0) ack 1 <mss 1460>
+.1 < . 1:1(0) ack 1 win 65530
+0 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_USER_TIMEOUT, [3000], 4) = 0
+0 write(4, ..., 24) = 24
+0 > P. 1:25(24) ack 1 win 29200
+.1 < . 1:1(0) ack 25 win 65530
//change the ipaddress
+1 `ifconfig tun0 192.168.0.10/16`
+1 write(4, ..., 24) = 24
+1 write(4, ..., 24) = 24
+1 write(4, ..., 24) = 24
+1 write(4, ..., 24) = 24
+0 `ifconfig tun0 192.168.102.64/16`
+0 < . 1:2(1) ack 25 win 65530
+0 `ifconfig tun0 192.168.0.10/16`
+3 write(4, ..., 24) = -1
# ./packetdrill user_timeout.pkt
Signed-off-by: Eric Dumazet <edumazet@googl.com>
Reported-by: liujian <liujian56@huawei.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rt_iif is only set to the actual egress device for the output path. The
recent change to consider the l3slave flag when returning IP_PKTINFO
works for local traffic (the correct device index is returned), but it
broke the more typical use case of packets received from a remote host
always returning the VRF index rather than the original ingress device.
Update the fixup to consider l3slave and rt_iif actually getting set.
Fixes: 1dfa76390b ("net: ipv4: add check for l3slave for index returned in IP_PKTINFO")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the network interface is kept running during suspend, the net core
may call net_device_ops.ndo_start_xmit() while the Ethernet device is
still suspended, which may lead to a system crash.
E.g. on sh73a0/kzm9g and r8a73a4/ape6evm, the external Ethernet chip is
driven by a PM controlled clock. If the Ethernet registers are accessed
while the clock is not running, the system will crash with an imprecise
external abort.
As this is a race condition with a small time window, it is not so easy
to trigger at will. Using pm_test may increase your chances:
# echo 0 > /sys/module/printk/parameters/console_suspend
# echo platform > /sys/power/pm_test
# echo mem > /sys/power/state
To fix this, make sure the network interface is quietened during
suspend.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can enter a deadlock situation because there is no sufficient protection
when ndo_get_stats64() runs in process context to guard against RX or TX NAPI
contexts running in softirq, this can lead to the following lockdep splat and
actual deadlock was experienced as well with an iperf session in the background
and a while loop doing ifconfig + ethtool.
[ 5.780350] ================================
[ 5.784679] WARNING: inconsistent lock state
[ 5.789011] 4.13.0-rc7-02179-g32fae27c725d #70 Not tainted
[ 5.794561] --------------------------------
[ 5.798890] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 5.804971] swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
[ 5.810175] (&syncp->seq#2){+.?...}, at: [<c0768a28>] bcm_sysport_tx_reclaim+0x30/0x54
[ 5.818327] {SOFTIRQ-ON-W} state was registered at:
[ 5.823278] bcm_sysport_get_stats64+0x17c/0x258
[ 5.828053] dev_get_stats+0x38/0xac
[ 5.831776] rtnl_fill_stats+0x30/0x118
[ 5.835761] rtnl_fill_ifinfo+0x538/0xe24
[ 5.839921] rtmsg_ifinfo_build_skb+0x6c/0xd8
[ 5.844430] rtmsg_ifinfo_event.part.5+0x14/0x44
[ 5.849201] rtmsg_ifinfo+0x20/0x28
[ 5.852837] register_netdevice+0x628/0x6b8
[ 5.857171] register_netdev+0x14/0x24
[ 5.861051] bcm_sysport_probe+0x30c/0x438
[ 5.865280] platform_drv_probe+0x50/0xb0
[ 5.869418] driver_probe_device+0x2e8/0x450
[ 5.873817] __driver_attach+0x104/0x120
[ 5.877871] bus_for_each_dev+0x7c/0xc0
[ 5.881834] bus_add_driver+0x1b0/0x270
[ 5.885797] driver_register+0x78/0xf4
[ 5.889675] do_one_initcall+0x54/0x190
[ 5.893646] kernel_init_freeable+0x144/0x1d0
[ 5.898135] kernel_init+0x8/0x110
[ 5.901665] ret_from_fork+0x14/0x2c
[ 5.905363] irq event stamp: 24263
[ 5.908804] hardirqs last enabled at (24262): [<c08eecf0>] net_rx_action+0xc4/0x4e4
[ 5.916624] hardirqs last disabled at (24263): [<c0a7da00>] _raw_spin_lock_irqsave+0x1c/0x98
[ 5.925143] softirqs last enabled at (24258): [<c022a7fc>] irq_enter+0x84/0x98
[ 5.932524] softirqs last disabled at (24259): [<c022a918>] irq_exit+0x108/0x16c
[ 5.939985]
[ 5.939985] other info that might help us debug this:
[ 5.946576] Possible unsafe locking scenario:
[ 5.946576]
[ 5.952556] CPU0
[ 5.955031] ----
[ 5.957506] lock(&syncp->seq#2);
[ 5.960955] <Interrupt>
[ 5.963604] lock(&syncp->seq#2);
[ 5.967227]
[ 5.967227] *** DEADLOCK ***
[ 5.967227]
[ 5.973222] 1 lock held by swapper/0/0:
[ 5.977092] #0: (&(&ring->lock)->rlock){..-...}, at: [<c0768a18>] bcm_sysport_tx_reclaim+0x20/0x54
So just remove the u64_stats_update_begin()/end() pair in ndo_get_stats64()
since it does not appear to be useful for anything. No inconsistency was
observed with either ifconfig or ethtool, global TX counts equal the sum of
per-queue TX counts on a 32-bit architecture.
Fixes: 10377ba767 ("net: systemport: Support 64bit statistics")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When building an allmodconfig kernel with gcc-4.6, we get a rather
odd warning:
drivers/net/vrf.c: In function ‘vrf_ip6_input_dst’:
drivers/net/vrf.c:964:3: error: initialized field with side-effects overwritten [-Werror]
drivers/net/vrf.c:964:3: error: (near initialization for ‘fl6’) [-Werror]
I have no idea what this warning is even trying to say, but it does
seem like a false positive. Reordering the initialization in to match
the structure definition gets rid of the warning, and might also avoid
whatever gcc thinks is wrong here.
Fixes: 9ff7438460 ("net: vrf: Handle ipv6 multicast and link-local addresses")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
call to memset to assign 0 value immediately after allocating
memory with kzalloc is unnecesaary as kzalloc allocates the memory
filled with 0 value.
Semantic patch used to resolve this issue:
@@
expression e,e2; constant c;
statement S;
@@
e = kzalloc(e2, c);
if(e == NULL) S
- memset(e, 0, e2);
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Acked-by: Sudarsana Kalluru <sudarsana.kalluru@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many many years ago (at the kernel summit in Boston), we all came to the
agreement that the firmware/ tree should be dropped from the kernel, and
everyone use the linux-firmware package instead. For some minor reason,
David Woodhouse didn't send the pull request at that point in time, and
everyone forgot about this.
The topic came up in the hallway track at the Plumbers conference this
week, so here's a single patch that drops the whole firmware tree. The
last firmware update was back in 2013, and all distros have been using
linux-firmware instead since at least that year, if not before. The
only commits to that directory since 2013 was some kbuild fixups for
various build tool issues.
So lets finally drop this, we don't need to lug them around in the
kernel source tree anymore, especially as no one wants or uses them.
This has passed build testing with 0-day, I don't think it made it into
linux-next this week, but I figured it was good to get in before
4.14-rc1 was out.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWbwh7Q8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylo2ACgoVQKQzUZ+xUPR2ushiqRzumHxF8AoNauS1r+
w8HQCNYUV75voi5RmnjY
=pSt4
-----END PGP SIGNATURE-----
Merge tag 'firmware_removal-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull firmware removal from Greg KH:
"Many many years ago (at the kernel summit in Boston), we all came to
the agreement that the firmware/ tree should be dropped from the
kernel, and everyone use the linux-firmware package instead. For some
minor reason, David Woodhouse didn't send the pull request at that
point in time, and everyone forgot about this.
The topic came up in the hallway track at the Plumbers conference this
week, so here's a single patch that drops the whole firmware tree. The
last firmware update was back in 2013, and all distros have been using
linux-firmware instead since at least that year, if not before. The
only commits to that directory since 2013 was some kbuild fixups for
various build tool issues.
So lets finally drop this, we don't need to lug them around in the
kernel source tree anymore, especially as no one wants or uses them.
This has passed build testing with 0-day, I don't think it made it
into linux-next this week, but I figured it was good to get in before
4.14-rc1 was out"
* tag 'firmware_removal-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
firmware: delete in-kernel firmware
nios2: time: Read timer in get_cycles only if initialized
nios2: add earlycon support to 3c120 devboard DTS
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZu+wvAAoJEFWoEK+e3syCojkP/R1BwzrM3hi5elL0Z/uH7X5Z
zcRcfDsAMssdYEERnLk+8Ecw2ehN/8qi7qMD2AW5Lm+N1Tob7L4fG87ZTQQLXi9Y
uIlVvbeIwlkanN7iQXm9E9s7RMyu/7NO3JrdnKAo0SJaL6zbiYIAGdgIItvgrfUI
VVbYRBUmwkzR9qqFL1BSEZLA0CRQCVAhTXKDcmiSFKn6g8dhmlSQ0xxs0J9LVAgI
ClhHQQejX0eMFUlZ5/I3lpwZHLBA+AVBcCsRO6HqK5erLsZy697mZVvrzGI+2mPk
Rumo0Fwgwz0MjtMkzVAFBn7J6tdoIQJxlwOyg8iJ65oWzEhplvc2KgkHzxaONKkG
ISMm+JDdH7R+Nz+O3PmCio/HBrvZLxC3wSsHtqh19rfA9TiX+Y+J260GIqvksmLo
06MgdMOzdvIAuRmwGIDy12Q38gIwNIPr4NPh2m89QXtuTbUoWSM70dA4jKmCmVg0
ux4hAe4qxBRcse9U/pVx+H7z2piwZyXtbk453xa0zucNun1mh0Y+v79fY5IF07T9
v84ph7aUSBoaXLyTuDfgfCjB/cSw2xlK0W1k0rra2XTWm7YKIp38QsYvCSBdc54t
p8EtURPf5W5514B0QIfhyHNmVlSz9MIKYhigYC42AyvRESCNkzwxtxy/bEPXuUOm
t/iNPv2TdNQCQ9aMWhqx
=pf1f
-----END PGP SIGNATURE-----
Merge tag 'nios2-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2
Pull arch/nios2 update from Ley Foon Tan.
* tag 'nios2-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: time: Read timer in get_cycles only if initialized
nios2: add earlycon support to 3c120 devboard DTS
Just one fix, for the handling of alignment interrupts on dcbz instructions.
Thanks to:
Paul Mackerras, Christian Zigotzky, Michal Sojka.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZu5HyAAoJEFHr6jzI4aWA+yoP+gOKF4obMWnvSJ5YWZZWQee/
T++Uhps7AF0G2y99rBVUKzbXzweDPpBrNQW5cFlF51WinNCA4KJ4E08xoNSrnlKl
ebc6Tn2cHbpWgydSRxWDig/Dp90U95toKp14irO2YvOVWoTgYBX7gOHhkMcpxdLH
MeVe6IAatOSNNNpJK9gLrNDLZaDhTGqXf64VIaw44llf0R9avNFhoS+8F94bllm2
VSsXizP1HZxWFcxJ1W/F54zkOvOkSXxjmd9Lf0IOxERPAshZBO7MfDaXF9MaYXwL
4AU5GZc2UmWIJpwmSQqRQPpLOPFvsJkhNnXeUb/8kQAG2W+HfZy6BvHpVh1lajm7
j+tyIYe+E7Jt63tQxyxSKpGnu20po78iimciYhl77A2RksGl9Tw5sHWj8gok30TI
pQFIRLPAO38MnnD3OrsboAnBrruS0Og+lkSWan7nV5n8ybHflvimjG+MfkNx9vzL
as7yJThQ8D8JBRAyaIHG3aAUPgBDxpZwqhDU3rac0qZChOc9DH4X7YFHR1UzGH8n
J8h9e7IsKGfvm4O9j4t3/nq0Qwh0LvpAlfu1/tPdJ0KXc9BnQTVoCpkFzhA+IFV6
JY2y4F1MfW8ZWS20iPHLlYEXYcTtcTq5b47JYQx97TgSp82ShXSLwbUxXNwSraDv
umUvJle++yyRm927njki
=F9yZ
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"Just one fix, for the handling of alignment interrupts on dcbz
instructions.
Thanks to Paul Mackerras, Christian Zigotzky, Michal Sojka"
* tag 'powerpc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Fix handling of alignment interrupt on dcbz instruction
ACL patch, I realized that Orangefs ACL code was busted, not just in the
kernel module, but in the server as well. I've been working on the
code in the server mostly, but here's one kernel patch, there
will be more.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZu8QoAAoJEM9EDqnrzg2+cJIP/2A8cCLk5HZf/DU0R9AUTyZx
+vZF0xdbesUuOkOAw5phYor31bToD8XsPKY7z/OlUjXJR09V2DukCX4YRDyrOibJ
nTUxWk1JnCArUeefIkD8N/4EtuQ7n8mdhAeln4//vjzGKXB/BgmTNhXe3+8RTj/t
IIVV+T99aNth4JD9K7Uux/vtoZA7kSvIbPQHRzFRr38GORJOkjB8b3mluxwjDkS/
S2DJND/mneQSPeh7VylKGSPHTqQcv9eg83/muyEoaWcd94QT2pZx9ZEYQrZ1hKus
1Nk1xmaJXqbJ1V+9qKJT9cxiDwE3uz5TQ6JSeB91ca50DcO/Up9EIdPMF0P13J21
0mh5/OuiffVdBYYPIWOe2KdpXOw9aBQqQyNA2MZdk1hotW0o/FlxNx4qtuVeQpqo
f3U0hQBQQD86nLylw6QDu5sD8Bxxx4ihM5szuHn9YlStYUgPbBdPZHsFTk2LAmp8
71UobSnSQGaOop6pfAJXW2y7m790BwYQhK7vSozQtTLMNU7EzPItIT6oQM7pjS3M
1Kh6a5XXwH+imbiaeLRBsfNA293eDuRhz9wsZeLnCV0Tt34bp+UG0FMfP+gceDQn
4hFEPnzWVAQpCyJBq4AMCH/fitawQiLcqgPBjiMVOl6pnznd5MK5C4T9XfaNf5R6
t2JWF/PS+plOa02uq4Fu
=wR7F
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.14-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs updates from Mike Marshall:
"Some cleanups and a big bug fix for ACLs.
When I was reviewing Jan Kara's ACL patch, I realized that Orangefs
ACL code was busted, not just in the kernel module, but in the server
as well. I've been working on the code in the server mostly, but
here's one kernel patch, there will be more"
* tag 'for-linus-4.14-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: Adjust three checks for null pointers
orangefs: Use kcalloc() in orangefs_prepare_cdm_array()
orangefs: Delete error messages for a failed memory allocation in five functions
orangefs: constify xattr_handler structure
orangefs: don't call filemap_write_and_wait from fsync
orangefs: off by ones in xattr size checks
orangefs: documentation clean up
orangefs: react properly to posix_acl_update_mode's aftermath.
orangefs: Don't clear SGID when inheriting ACLs
Similar to other Gigabyte laptops, the touchpad on P57 requires a
keyboard reset to detect Elantech touchpad correctly.
BugLink: https://bugs.launchpad.net/bugs/1594214
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
When emulating a nested VM-entry from L1 to L2, several control field
validation checks are deferred to the hardware. Should one of these
validation checks fail, vcpu_vmx_run will set the vmx->fail flag. When
this happens, the L2 guest state is not loaded (even in part), and
execution should continue in L1 with the next instruction after the
VMLAUNCH/VMRESUME.
The VMCS12 is not modified (except for the VM-instruction error
field), the VMCS12 MSR save/load lists are not processed, and the CPU
state is not loaded from the VMCS12 host area. Moreover, the vmcs02
exit reason is stale, so it should not be consulted for any reason.
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On an early VMLAUNCH/VMRESUME failure (i.e. one which sets the
VM-instruction error field of the current VMCS), the launch state of
the current VMCS is not set to "launched," and the VM-exit information
fields of the current VMCS (including IDT-vectoring information and
exit reason) are stale.
On a late VMLAUNCH/VMRESUME failure (i.e. one which sets the high bit
of the exit reason field), the launch state of the current VMCS is not
set to "launched," and only two of the VM-exit information fields of
the current VMCS are modified (exit reason and exit
qualification). The remaining VM-exit information fields of the
current VMCS (including IDT-vectoring information, in particular) are
stale.
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After a successful VM-entry, RFLAGS is cleared, with the exception of
bit 1, which is always set. This is handled by load_vmcs12_host_state.
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For example, the following could occur, making us miss a wakeup:
CPU0 CPU1
kvm_vcpu_block kvm_mips_comparecount_func
[L] swait_active(&vcpu->wq)
[S] prepare_to_swait(&vcpu->wq)
[L] if (!kvm_vcpu_has_pending_timer(vcpu))
schedule() [S] queue_timer_int(vcpu)
Ensure that the swait_active() check is not hoisted over the interrupt.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Particularly because kvmppc_fast_vcpu_kick_hv() is a callback,
ensure that we properly serialize wq active checks in order to
avoid potentially missing a wakeup due to racing with the waiter
side.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a generic call and can be suceptible to races
in reading the wq task_list while another task is adding
itself to the list. Add a full barrier by using the
swq_has_sleeper() helper.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
During code inspection, the following potential race was seen:
CPU0 CPU1
kvm_async_pf_task_wait apf_task_wake_one
[L] swait_active(&n->wq)
[S] prepare_to_swait(&n.wq)
[L] if (!hlist_unhahed(&n.link))
schedule() [S] hlist_del_init(&n->link);
Properly serialize swait_active() checks such that a wakeup is
not missed.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
... as we've got the new helper now. This caller already
does the right thing, hence no changes in semantics.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>