Commit Graph

723177 Commits

Author SHA1 Message Date
Scott Mayhew c156618e15 nfs: fix a deadlock in nfs client initialization
The following deadlock can occur between a process waiting for a client
to initialize in while walking the client list during nfsv4 server trunking
detection and another process waiting for the nfs_clid_init_mutex so it
can initialize that client:

Process 1                               Process 2
---------                               ---------
spin_lock(&nn->nfs_client_lock);
list_add_tail(&CLIENTA->cl_share_link,
        &nn->nfs_client_list);
spin_unlock(&nn->nfs_client_lock);
                                        spin_lock(&nn->nfs_client_lock);
                                        list_add_tail(&CLIENTB->cl_share_link,
                                                &nn->nfs_client_list);
                                        spin_unlock(&nn->nfs_client_lock);
                                        mutex_lock(&nfs_clid_init_mutex);
                                        nfs41_walk_client_list(clp, result, cred);
                                        nfs_wait_client_init_complete(CLIENTA);
(waiting for nfs_clid_init_mutex)

Make sure nfs_match_client() only evaluates clients that have completed
initialization in order to prevent that deadlock.

This patch also fixes v4.0 trunking behavior by not marking the client
NFS_CS_READY until the clientid has been confirmed.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-12-15 14:31:49 -05:00
Haishuang Yan c05fad5713 ip_gre: fix wrong return value of erspan_rcv
If pskb_may_pull return failed, return PACKET_REJECT instead of -ENOMEM.

Fixes: 84e54fe0a5 ("gre: introduce native tunnel support for ERSPAN")
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 14:10:39 -05:00
Daniele Palmas c647c0d62c net: usb: qmi_wwan: add Telit ME910 PID 0x1101 support
This patch adds support for Telit ME910 PID 0x1101.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 13:46:10 -05:00
David S. Miller d1fca67fee Merge branch 'net-sched-Make-qdisc-offload-uapi-uniform'
Yuval Mintz says:

====================
net: sched: Make qdisc offload uapi uniform

Several qdiscs can already be offloaded to hardware, but there's an
inconsistecy in regard to the uapi through which they indicate such
an offload is taking place - indication is passed to the user via
TCA_OPTIONS where each qdisc retains private logic for setting it.

The recent addition of offloading to RED in
602f3baf22 ("net_sch: red: Add offload ability to RED qdisc") caused
the addition of yet another uapi field for this purpose -
TC_RED_OFFLOADED.

For clarity and prevention of bloat in the uapi we want to eliminate
said added uapi, replacing it with a common mechanism that can be used
to reflect offload status of the various qdiscs.

The first patch introduces TCA_HW_OFFLOAD as the generic message meant
for this purpose. The second changes the current RED implementation into
setting the internal bits necessary for passing it, and the third removes
TC_RED_OFFLOADED as its no longer needed.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 13:35:37 -05:00
Yuval Mintz 4a98795bc8 pkt_sched: Remove TC_RED_OFFLOADED from uapi
Following the previous patch, RED is now using the new uniform uapi
for indicating it's offloaded. As a result, TC_RED_OFFLOADED is no
longer utilized by kernel and can be removed [as it's still not
part of any stable release].

Fixes: 602f3baf22 ("net_sch: red: Add offload ability to RED qdisc")
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 13:35:37 -05:00
Yuval Mintz 428a68af3a net: sched: Move to new offload indication in RED
Let RED utilize the new internal flag, TCQ_F_OFFLOADED,
to mark a given qdisc as offloaded instead of using a dedicated
indication.

Also, change internal logic into looking at said flag when possible.

Fixes: 602f3baf22 ("net_sch: red: Add offload ability to RED qdisc")
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 13:35:36 -05:00
Yuval Mintz 7a4fa29106 net: sched: Add TCA_HW_OFFLOAD
Qdiscs can be offloaded to HW, but current implementation isn't uniform.
Instead, qdiscs either pass information about offload status via their
TCA_OPTIONS or omit it altogether.

Introduce a new attribute - TCA_HW_OFFLOAD that would form a uniform
uAPI for the offloading status of qdiscs.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 13:35:36 -05:00
David S. Miller 0a0606970f Merge branch 'aquantia-fixes'
Igor Russkikh says:

====================
net: aquantia: Atlantic driver 12/2017 updates

The patchset contains important hardware fix for machines with large MRRS
and couple of improvement in stats and capabilities reporting

patch v3:
 - Fixed patch #7 after Andrew's finding. NIC level stats actually
   have to be cleaned only on hw struct creation (and this is done
   in kzalloc). On each hwinit we only have to reset link state
   to make sure hw stats update will not increment nic stats during init.

patch v2:
 - split into more detailed commits

Comment from David on wrong defines case will be submitted separately later
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:43 -05:00
Igor Russkikh d4c242d4ba net: aquantia: Increment driver version
Add a suffix to distinguish kernel mainline version and aquantia releases

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
Igor Russkikh 98bc036de4 net: aquantia: Fix typo in ethtool statistics names
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
Igor Russkikh f3e2778429 net: aquantia: Update hw counters on hw init
On very first start we should read out current HW counter values
to make diff based calculations later.
This also should be done each time NIC gets down/up or wakes up
after sleep state. We reset link state explicitly to prevent diffs
from being summed this first time.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
Igor Russkikh fdb4a0830e net: aquantia: Improve link state and statistics check interval callback
Reduce timeout from 2 secs to 1 sec. If link is down,
reduce it to 500msec. This speeds up link detection.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
Igor Russkikh 45cc1c7ad4 net: aquantia: Fill in multicast counter in ndev stats from hardware
This metric comes from HW and is also diff-calculated, like other counters

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
Igor Russkikh 9f8a2203a5 net: aquantia: Fill ndev stat couters from hardware
Originally they were filled from ring sw counters.
These sometimes incorrectly calculate byte and packet amounts
when using LRO/LSO and jumboframes. Filling ndev counters from
hardware makes them precise.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
Igor Russkikh be08d839d9 net: aquantia: Extend stat counters to 64bit values
Device hardware provides only 32bit counters. Using these directly
causes byte counters to overflow soon. A separate nic level structure
with 64 bit counters is now used to collect incrementally all the stats
and report these counters to ethtool stats and ndev stats.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:41 -05:00
Igor Russkikh 1e36616151 net: aquantia: Fix hardware DMA stream overload on large MRRS
Systems with large MRRS on device (2K, 4K) with high data rates and/or
large MTU, atlantic observes DMA packet buffer overflow. On some systems
that causes PCIe transaction errors, hardware NMIs or datapath freeze.
This patch
1) Limits MRRS from device side to 2K (thats maximum our hardware supports)
2) Limit maximum size of outstanding TX DMA data read requests. This makes
hardware buffers running fine.

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:41 -05:00
Igor Russkikh e4d02ca04c net: aquantia: Fix actual speed capabilities reporting
Different hardware device Ids correspond to different maximum speed
available. Extra checks were added for devices D108 and D109 to
remove unsupported speeds from these device capabilities list.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:41 -05:00
Tony Lindgren 20a2742e57 dt-bindings: ti-sysc: Update binding for timers and capabilities
The ti-sysc binding does not yet describe the capabilities of the
interconnect target module. So to make the ti-sysc binding usable
for configuring the interconnect target module, we need to add few
more properties:

1. To detect between omap2 and omap4 timers, let's add compatibles
   for them for "ti,sysc-omap2-timer" and,sysc-omap4-timer". This
   makes it easier to pick up the already initialized system timers
   later on

2. Let's add "ti,sysc-mask" for a mask of features supported by the
   interconnect target module. This describes what we have available
   in the various SYSCONFIG registers

3. Let's add "ti,sysc-midle" and "ti,sysc-sidle" lists for the master
   and slave idle modes supported by the interconnect target module.
   These describe the values available for MIDLE and SIDLE bits in
   the SYSCONFIG registers

4. Some interconnect target modules need a short delay after reset
   before they can be accessed, let's use "ti,sysc-delay-us" for
   that

5. Let's add "ti,syss-mask" bit to describe the optional SYSSTATUS
   register bits for reset done bits

6. Let's support the two existing custom quirk properties already
   listed in Documentation/devicetree/bindings/arm/omap/omap.txt for
   "ti,no-reset-on-init" and "ti,no-idle-on-init"

7. And finally, let's add a header for the binding for the dts
   files and the driver to use

Cc: Benoît Cousson <bcousson@baylibre.com>
Cc: Dave Gerlach <d-gerlach@ti.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Nishanth Menon <nm@ti.com>
Cc: Matthijs van Duin <matthijsvanduin@gmail.com>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: Sakari Ailus <sakari.ailus@iki.fi>
Cc: Suman Anna <s-anna@ti.com>
Cc: Tero Kristo <t-kristo@ti.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2017-12-15 09:39:11 -08:00
Willem de Bruijn 35b99dffc3 sock: free skb in skb_complete_tx_timestamp on error
skb_complete_tx_timestamp must ingest the skb it is passed. Call
kfree_skb if the skb cannot be enqueued.

Fixes: b245be1f4d ("net-timestamp: no-payload only sysctl")
Fixes: 9ac25fc063 ("net: fix socket refcounting in skb_complete_tx_timestamp()")
Reported-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:30:36 -05:00
David S. Miller d9356edc44 Merge branch 's390-fixes'
Julian Wiedmann says:

====================
s390/qeth: fixes 2017-12-13

some more patches for 4.15, that fix multiple issues with IP Takeover
configuration in qeth.
Please queue them up for stable kernels as well (4.9 and newer).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:44 -05:00
Julian Wiedmann 02f510f326 s390/qeth: update takeover IPs after configuration change
Any modification to the takeover IP-ranges requires that we re-evaluate
which IP addresses are takeover-eligible. Otherwise we might do takeover
for some addresses when we no longer should, or vice-versa.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:43 -05:00
Julian Wiedmann 8a03a3692b s390/qeth: lock IP table while applying takeover changes
Modifying the flags of an IP addr object needs to be protected against
eg. concurrent removal of the same object from the IP table.

Fixes: 5f78e29cee ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:43 -05:00
Julian Wiedmann b22d73d668 s390/qeth: don't apply takeover changes to RXIP
When takeover is switched off, current code clears the 'TAKEOVER' flag on
all IPs. But the flag is also used for RXIP addresses, and those should
not be affected by the takeover mode.
Fix the behaviour by consistenly applying takover logic to NORMAL
addresses only.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:43 -05:00
Julian Wiedmann 7fbd9493f0 s390/qeth: apply takeover changes when mode is toggled
Just as for an explicit enable/disable, toggling the takeover mode also
requires that the IP addresses get updated. Otherwise all IPs that were
added to the table before the mode-toggle, get registered with the old
settings.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:42 -05:00
Will Deacon a454483137 arm64: fpsimd: Fix copying of FP state from signal frame into task struct
Commit 9de52a755c ("arm64: fpsimd: Fix failure to restore FPSIMD
state after signals") fixed an issue reported in our FPSIMD signal
restore code but inadvertently introduced another issue which tends to
manifest as random SEGVs in userspace.

The problem is that when we copy the struct fpsimd_state from the kernel
stack (populated from the signal frame) into the struct held in the
current thread_struct, we blindly copy uninitialised stack into the
"cpu" field, which means that context-switching of the FP registers is
no longer reliable.

This patch fixes the problem by copying only the user_fpsimd member of
struct fpsimd_state. We should really rework the function prototypes
to take struct user_fpsimd_state * instead, but let's just get this
fixed for now.

Cc: Dave Martin <Dave.Martin@arm.com>
Fixes: 9de52a755c ("arm64: fpsimd: Fix failure to restore FPSIMD state after signals")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-15 16:12:35 +00:00
David S. Miller 0f546ffcd0 Here are some batman-adv bugfixes:
- Initialize the fragment headers, by Sven Eckelmann
 
  - Fix a NULL check in BATMAN V, by Sven Eckelmann
 
  - Fix kernel doc for the time_setup() change, by Sven Eckelmann
 
  - Use the right lock in BATMAN IV OGM Update, by Sven Eckelmann
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlozsZQWHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoZ1uD/94DURJf4b8kgUDjgMfuKZZyafH
 c4N8SI+3+gZExtgXbb48f91a9S5T7qIAm46c7nHNJpTLEhc/ucup3c06i8VjBc2r
 Tz37jnnjrHzfbw0p10zpLkdDlU0mLtEQBpNGQxru8h2x3gs2RBfDy/yX7MzzAp4m
 maZ3ub0wPAtlRKQ4OywTStP9ZP7BXj4/puCRrn3kPIKDc1Ahwh4dS4r4euSvvR6z
 2Qkfnpb3sbzh1Sa526GNAwgEo6hpssOugQyCIZUK4YJh1OfjnS+xJHWmfBkLw7q9
 OfkuwAjkZMivrcxK/lvdMJ6WltQUyWVkKxDuA6iAtwRV4tX5W5b+9OEToZlHuPid
 11bDXDfmWTfnpYM7uTT7tRbjFpvJ9X4ftUNHHtlL+h+VRz8Uf0xcteBIi7FxHHm5
 fRqMLj5J0WU5rVf2vKTz2XIi5dXWgy/bkoPzalMc8UxWVsKthr34UZmcPRDVvCja
 bwad1H+EEelx4tvPeWTEEhdhLFvalmfgRG3i/paqr547ETllmO462kdUTOmxDnvT
 76GZVsLeHBPrnZi/tqc230EyoZijiM/l41twOZPjb0cklvOEFYR0vs/BUfmvdM/3
 xlUF+CBaVnHcY3HDn7ywVICWP6jnG+LOQ4tOL6jjsKiqbEpPHRHuQ++MnZ2uRUgs
 5e/yBVxdEjo9RoLVsg==
 =US7g
 -----END PGP SIGNATURE-----

Merge tag 'batadv-net-for-davem-20171215' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
Here are some batman-adv bugfixes:

 - Initialize the fragment headers, by Sven Eckelmann

 - Fix a NULL check in BATMAN V, by Sven Eckelmann

 - Fix kernel doc for the time_setup() change, by Sven Eckelmann

 - Use the right lock in BATMAN IV OGM Update, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:02:11 -05:00
Yuval Mintz fccff08628 mlxsw: spectrum: Disable MAC learning for ovs port
Learning is currently enabled for ports which are OVS slaves -
even though OVS doesn't need this indication.
Since we're not associating a fid with the port, HW would continuously
notify driver of learned [& aged] MACs which would be logged as errors.

Fixes: 2b94e58df5 ("mlxsw: spectrum: Allow ports to work under OVS master")
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 10:47:36 -05:00
Steven Rostedt f73c52a5bc sched/rt: Do not pull from current CPU if only one CPU to pull
Daniel Wagner reported a crash on the BeagleBone Black SoC.

This is a single CPU architecture, and does not have a functional
arch_send_call_function_single_ipi() implementation which can crash
the kernel if that is called.

As it only has one CPU, it shouldn't be called, but if the kernel is
compiled for SMP, the push/pull RT scheduling logic now calls it for
irq_work if the one CPU is overloaded, it can use that function to call
itself and crash the kernel.

Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system
only has a single CPU. But SCHED_FEAT is a constant if sched debugging
is turned off. Another fix can also be used, and this should also help
with normal SMP machines. That is, do not initiate the pull code if
there's only one RT overloaded CPU, and that CPU happens to be the
current CPU that is scheduling in a lower priority task.

Even on a system with many CPUs, if there's many RT tasks waiting to
run on a single CPU, and that CPU schedules in another RT task of lower
priority, it will initiate the PULL logic in case there's a higher
priority RT task on another CPU that is waiting to run. But if there is
no other CPU with waiting RT tasks, it will initiate the RT pull logic
on itself (as it still has RT tasks waiting to run). This is a wasted
effort.

Not only does this help with SMP code where the current CPU is the only
one with RT overloaded tasks, it should also solve the issue that
Daniel encountered, because it will prevent the PULL logic from
executing, as there's only one CPU on the system, and the check added
here will cause it to exit the RT pull code.

Reported-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: 4bdced5c9 ("sched/rt: Simplify the IPI based RT balancing logic")
Link: http://lkml.kernel.org/r/20171202130454.4cbbfe8d@vmware.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15 16:28:02 +01:00
Ingo Molnar 643e345c95 tools/headers: Synchronize kernel <-> tooling headers
Two kernel headers got modified recently, which are used by tooling as well:

 tools/include/uapi/linux/kvm.h
 arch/x86/include/asm/cpufeatures.h

None of those changes have an effect on tooling, so do a plain copy.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15 13:49:28 +01:00
Ingo Molnar 215eada73e objtool: Resync objtool's instruction decoder source code copy with the kernel's latest version
This fixes the following warning:

  warning: objtool: x86 instruction decoder differs from kernel

Note that there are cleanups queued up for v4.16 that will make this
warning more informative and will make the syncing easier as well.

Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15 13:45:37 +01:00
Randy Dunlap f5b5fab178 x86/decoder: Fix and update the opcodes map
Update x86-opcode-map.txt based on the October 2017 Intel SDM publication.
Fix INVPID to INVVPID.
Add UD0 and UD1 instruction opcodes.

Also sync the objtool and perf tooling copies of this file.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/aac062d7-c0f6-96e3-5c92-ed299e2bd3da@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15 13:45:20 +01:00
Volodymyr Babchuk ef8e08d24c tee: shm: inline tee_shm_get_id()
Now, when struct tee_shm is defined in public header,
we can inline small getter functions like this one.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:36:21 +01:00
Volodymyr Babchuk 217e0250cc tee: use reference counting for tee_context
We need to ensure that tee_context is present until last
shared buffer will be freed.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:36:18 +01:00
Volodymyr Babchuk f58e236c9d tee: optee: enable dynamic SHM support
Previous patches added various features that are needed for dynamic SHM.
Dynamic SHM allows Normal World to share any buffers with OP-TEE.
While original design suggested to use pre-allocated region (usually of
1M to 2M of size), this new approach allows to use all non-secure RAM for
command buffers, RPC allocations and TA parameters.

This patch checks capability OPTEE_SMC_SEC_CAP_DYNAMIC_SHM. If it was set
by OP-TEE, then kernel part of OP-TEE will use kernel page allocator
to allocate command buffers. Also it will set TEE_GEN_CAP_REG_MEM
capability to tell userspace that it supports shared memory registration.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:36:17 +01:00
Volodymyr Babchuk abd135ba21 tee: optee: add optee-specific shared pool implementation
This is simple pool that uses kernel page allocator. This pool can be
used in case OP-TEE supports dynamic shared memory.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:36:17 +01:00
Volodymyr Babchuk d885cc5e07 tee: optee: store OP-TEE capabilities in private data
Those capabilities will be used in subsequent patches.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:36:16 +01:00
Volodymyr Babchuk 53a107c812 tee: optee: add registered buffers handling into RPC calls
With latest changes to OP-TEE we can use any buffers as a shared memory.
Thus, it is possible for supplicant to provide part of own memory
when OP-TEE asks to allocate a shared buffer.

This patch adds support for such feature into RPC handling code.
Now when OP-TEE asks supplicant to allocate shared buffer, supplicant
can use TEE_IOC_SHM_REGISTER to provide such buffer. RPC handler is
aware of this, so it will pass list of allocated pages to OP-TEE.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
[jw: fix parenthesis alignment in free_pages_list()]
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:35:37 +01:00
Volodymyr Babchuk 64cf9d8a67 tee: optee: add registered shared parameters handling
Now, when client applications can register own shared buffers in OP-TEE,
we need to extend ABI for parameter passing to/from OP-TEE.

So, if OP-TEE core detects that parameter belongs to registered shared
memory, it will use corresponding parameter attribute.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:32:32 +01:00
Volodymyr Babchuk 06ca79179c tee: optee: add shared buffer registration functions
This change adds ops for shm_(un)register functions in tee interface.
Client application can use these functions to (un)register an own shared
buffer in OP-TEE address space. This allows zero copy data sharing between
Normal and Secure Worlds.

Please note that while those functions were added to optee code,
it does not report to userspace that those functions are available.
OP-TEE code does not set TEE_GEN_CAP_REG_MEM flag. This flag will be
enabled only after all other features of dynamic shared memory will be
implemented in subsequent patches. Of course user can ignore presence of
TEE_GEN_CAP_REG_MEM flag and try do call those functions. This is okay,
driver will register shared buffer in OP-TEE, but any attempts to use
this shared buffer will fail.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:32:32 +01:00
Volodymyr Babchuk 3bb48ba5cd tee: optee: add page list manipulation functions
These functions will be used to pass information about shared
buffers to OP-TEE. ABI between Linux and OP-TEE is defined
in optee_msg.h and optee_smc.h.

optee_msg.h defines OPTEE_MSG_ATTR_NONCONTIG attribute
for shared memory references and describes how such references
should be passed. Note that it uses 64-bit page addresses even
on 32 bit systems. This is done to support LPAE and to unify
interface.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
[jw: replacing uint64_t with u64 in optee_fill_pages_list()]
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:32:31 +01:00
Volodymyr Babchuk de5c6dfc43 tee: optee: Update protocol definitions
There were changes in REE<->OP-TEE ABI recently.
Now ABI allows us to pass non-contiguous memory buffers as list of
pages to OP-TEE. This can be achieved by using new parameter attribute
OPTEE_MSG_ATTR_NONCONTIG.

OP-TEE also is able to use all non-secure RAM for shared buffers. This
new capability is enabled with OPTEE_SMC_SEC_CAP_DYNAMIC_SHM flag.

This patch adds necessary definitions to the protocol definition files at
Linux side.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:32:31 +01:00
Volodymyr Babchuk e0c69ae8bf tee: shm: add page accessor functions
In order to register a shared buffer in TEE, we need accessor
function that return list of pages for that buffer.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:32:30 +01:00
Volodymyr Babchuk b25946ad95 tee: shm: add accessors for buffer size and page offset
These two function will be needed for shared memory registration in OP-TEE

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:32:30 +01:00
Jens Wiklander 033ddf12bc tee: add register user memory
Added new ioctl to allow users register own buffers as a shared memory.

Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
[jw: moved tee_shm_is_registered() declaration]
[jw: added space after __tee_shm_alloc() implementation]
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2017-12-15 13:32:20 +01:00
Jens Wiklander e2aca5d892 tee: flexible shared memory pool creation
Makes creation of shm pools more flexible by adding new more primitive
functions to allocate a shm pool. This makes it easier to add driver
specific shm pool management.

Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Volodymyr Babchuk <vlad.babchuk@gmail.com>
2017-12-15 12:37:29 +01:00
Andy Lutomirski 7ee18d6779 x86/power: Make restore_processor_context() sane
My previous attempt to fix a couple of bugs in __restore_processor_context():

  5b06bbcfc2 ("x86/power: Fix some ordering bugs in __restore_processor_context()")

... introduced yet another bug, breaking suspend-resume.

Rather than trying to come up with a minimal fix, let's try to clean it up
for real.  This patch fixes quite a few things:

 - The old code saved a nonsensical subset of segment registers.
   The only registers that need to be saved are those that contain
   userspace state or those that can't be trivially restored without
   percpu access working.  (On x86_32, we can restore percpu access
   by writing __KERNEL_PERCPU to %fs.  On x86_64, it's easier to
   save and restore the kernel's GSBASE.)  With this patch, we
   restore hardcoded values to the kernel state where applicable and
   explicitly restore the user state after fixing all the descriptor
   tables.

 - We used to use an unholy mix of inline asm and C helpers for
   segment register access.  Let's get rid of the inline asm.

This fixes the reported s2ram hangs and make the code all around
more logical.

Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reported-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Tested-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Zhang Rui <rui.zhang@intel.com>
Fixes: 5b06bbcfc2 ("x86/power: Fix some ordering bugs in __restore_processor_context()")
Link: http://lkml.kernel.org/r/398ee68e5c0f766425a7b746becfc810840770ff.1513286253.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15 12:21:38 +01:00
Andy Lutomirski 896c80bef4 x86/power/32: Move SYSENTER MSR restoration to fix_processor_context()
x86_64 restores system call MSRs in fix_processor_context(), and
x86_32 restored them along with segment registers.  The 64-bit
variant makes more sense, so move the 32-bit code to match the
64-bit code.

No side effects are expected to runtime behavior.

Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Zhang Rui <rui.zhang@intel.com>
Link: http://lkml.kernel.org/r/65158f8d7ee64dd6bbc6c1c83b3b34aaa854e3ae.1513286253.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15 12:18:29 +01:00
Andy Lutomirski 090edbe23f x86/power/64: Use struct desc_ptr for the IDT in struct saved_context
x86_64's saved_context nonsensically used separate idt_limit and
idt_base fields and then cast &idt_limit to struct desc_ptr *.

This was correct (with -fno-strict-aliasing), but it's confusing,
served no purpose, and required #ifdeffery. Simplify this by
using struct desc_ptr directly.

No change in functionality.

Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Zhang Rui <rui.zhang@intel.com>
Link: http://lkml.kernel.org/r/967909ce38d341b01d45eff53e278e2728a3a93a.1513286253.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-15 12:18:29 +01:00
Thomas Gleixner cef31d9af9 posix-timer: Properly check sigevent->sigev_notify
timer_create() specifies via sigevent->sigev_notify the signal delivery for
the new timer. The valid modes are SIGEV_NONE, SIGEV_SIGNAL, SIGEV_THREAD
and (SIGEV_SIGNAL | SIGEV_THREAD_ID).

The sanity check in good_sigevent() is only checking the valid combination
for the SIGEV_THREAD_ID bit, i.e. SIGEV_SIGNAL, but if SIGEV_THREAD_ID is
not set it accepts any random value.

This has no real effects on the posix timer and signal delivery code, but
it affects show_timer() which handles the output of /proc/$PID/timers. That
function uses a string array to pretty print sigev_notify. The access to
that array has no bound checks, so random sigev_notify cause access beyond
the array bounds.

Add proper checks for the valid notify modes and remove the SIGEV_THREAD_ID
masking from various code pathes as SIGEV_NONE can never be set in
combination with SIGEV_THREAD_ID.

Reported-by: Eric Biggers <ebiggers3@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: stable@vger.kernel.org
2017-12-15 11:08:40 +01:00
Thierry Reding 7f4c9176f7 iommu/tegra: Allow devices to be grouped
Implement the ->device_group() and ->of_xlate() callbacks which are used
in order to group devices. Each group can then share a single domain.

This is implemented primarily in order to achieve the same semantics on
Tegra210 and earlier as on Tegra186 where the Tegra SMMU was replaced by
an ARM SMMU. Users of the IOMMU API can now use the same code to share
domains between devices, whereas previously they used to attach each
device individually.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-12-15 10:12:32 +01:00