Commit Graph

75242 Commits

Author SHA1 Message Date
Paolo Bonzini
ca6155c0f2 Merge tag 'patchew/20200219160953.13771-1-imammedo@redhat.com' of https://github.com/patchew-project/qemu into HEAD
This series removes ad hoc RAM allocation API (memory_region_allocate_system_memory)
and consolidates it around hostmem backend. It allows to

* resolve conflicts between global -mem-prealloc and hostmem's "policy" option,
  fixing premature allocation before binding policy is applied

* simplify complicated memory allocation routines which had to deal with 2 ways
  to allocate RAM.

* reuse hostmem backends of a choice for main RAM without adding extra CLI
  options to duplicate hostmem features.  A recent case was -mem-shared, to
  enable vhost-user on targets that don't support hostmem backends [1] (ex: s390)

* move RAM allocation from individual boards into generic machine code and
  provide them with prepared MemoryRegion.

* clean up deprecated NUMA features which were tied to the old API (see patches)
  - "numa: remove deprecated -mem-path fallback to anonymous RAM"
  - (POSTPONED, waiting on libvirt side) "forbid '-numa node,mem' for 5.0 and newer machine types"
  - (POSTPONED) "numa: remove deprecated implicit RAM distribution between nodes"

Introduce a new machine.memory-backend property and wrapper code that aliases
global -mem-path and -mem-alloc into automatically created hostmem backend
properties (provided memory-backend was not set explicitly given by user).
A bulk of trivial patches then follow to incrementally convert individual
boards to using machine.memory-backend provided MemoryRegion.

Board conversion typically involves:

* providing MachineClass::default_ram_size and MachineClass::default_ram_id
  so generic code could create default backend if user didn't explicitly provide
  memory-backend or -m options

* dropping memory_region_allocate_system_memory() call

* using convenience MachineState::ram MemoryRegion, which points to MemoryRegion
   allocated by ram-memdev

On top of that for some boards:

* missing ram_size checks are added (typically it were boards with fixed ram size)

* ram_size fixups are replaced by checks and hard errors, forcing user to
  provide correct "-m" values instead of ignoring it and continuing running.

After all boards are converted, the old API is removed and memory allocation
routines are cleaned up.
2020-02-25 09:19:00 +01:00
Sunil Muthuswamy
c220cdec48 WHPX: Assigning maintainer for Windows Hypervisor Platform
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Message-Id: <SN4PR2101MB0880E245954826FD91C9D67DC0110@SN4PR2101MB0880.namprd21.prod.outlook.com>
Reviewed-by: Justin Terry (VM) <juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-25 09:18:01 +01:00
Philippe Mathieu-Daudé
88cd34ee9e accel/kvm: Check ioctl(KVM_SET_USER_MEMORY_REGION) return value
kvm_vm_ioctl() can fail, check its return value, and log an error
when it failed. This fixes Coverity CID 1412229:

  Unchecked return value (CHECKED_RETURN)

  check_return: Calling kvm_vm_ioctl without checking return value

Reported-by: Coverity (CID 1412229)
Fixes: 235e8982ad ("support using KVM_MEM_READONLY flag for regions")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200221163336.2362-1-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-25 09:18:01 +01:00
Paolo Bonzini
93c3593ad0 target/i386: check for empty register in FXAM
The fxam instruction returns the wrong result after fdecstp or after
an underflow.  Check fptags to handle this.

Reported-by: <chengang@emindsoft.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-25 09:18:01 +01:00
Julia Suvorova
cce8944cc9 qdev-monitor: Forbid repeated device_del
Device unplug can be done asynchronously. Thus, sending the second
device_del before the previous unplug is complete may lead to
unexpected results. On PCIe devices, this cancels the hot-unplug
process.

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200220165556.39388-1-jusual@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-25 09:18:01 +01:00
bauerchen
037fb5eb39 mem-prealloc: optimize large guest startup
[desc]:
    Large memory VM starts slowly when using -mem-prealloc, and
    there are some areas to optimize in current method;

    1、mmap will be used to alloc threads stack during create page
    clearing threads, and it will attempt mm->mmap_sem for write
    lock, but clearing threads have hold read lock, this competition
    will cause threads createion very slow;

    2、methods of calcuating pages for per threads is not well;if we use
    64 threads to split 160 hugepage,63 threads clear 2page,1 thread
    clear 34 page,so the entire speed is very slow;

    to solve the first problem,we add a mutex in thread function,and
    start all threads when all threads finished createion;
    and the second problem, we spread remainder to other threads,in
    situation that 160 hugepage and 64 threads, there are 32 threads
    clear 3 pages,and 32 threads clear 2 pages.

[test]:
    320G 84c VM start time can be reduced to 10s
    680G 84c VM start time can be reduced to 18s

Signed-off-by: bauerchen <bauerchen@tencent.com>
Reviewed-by: Pan Rui <ruippan@tencent.com>
Reviewed-by: Ivan Ren <ivanren@tencent.com>
[Simplify computation of the number of pages per thread. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-25 09:18:01 +01:00
Stefan Hajnoczi
920d557e5a memory: batch allocate ioeventfds[] in address_space_update_ioeventfds()
Reallocing the ioeventfds[] array each time an element is added is very
expensive as the number of ioeventfds increases.  Batch allocate instead
to amortize the cost of realloc.

This patch reduces Linux guest boot times from 362s to 140s when there
are 2 virtio-blk devices with 1 virtqueue and 99 virtio-blk devices with
32 virtqueues.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200218182226.913977-1-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-25 09:18:01 +01:00
Peter Maydell
c1e667d259 Pull request
This pull request contains a virtio-blk/scsi performance optimization, event
 loop scalability improvements, and a qtest-based device fuzzing framework.  I
 am including the fuzzing patches because I have reviewed them and Thomas Huth
 is currently away on leave.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAl5Q6z0ACgkQnKSrs4Gr
 c8hWHwf6A8f4+CY+wvW1k0GkXlo6Z8QT3Lms0pMlbCcM+eTgRTz4HE2FK7lDFM8b
 xUfjy78pImXJyquGeLyzcFDMrJNJkHfz1BncYxDHs/rweWNM8R9Wy08my3YLRAnK
 MIM0unoWzXY8QKx+f61IEDGz1ZMVHTMpV99GhacMdx3xkyOTQk5TW1HYUUGYyiL6
 gE3eVlo6a9SR06fznmBIuV3Thbqu/jI4SVk3n+gb6+HU5L9T+X6y9UsungVAd68M
 oByV/RNu6+HL+YPt39XmIaUA2iOz4aq+7pw3r7w9BMrTr7AK522yoGkYXFpipCJA
 spLoz+YS1Lpmezl+CuJgIak+wTJqmA==
 =OnmP
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Pull request

This pull request contains a virtio-blk/scsi performance optimization, event
loop scalability improvements, and a qtest-based device fuzzing framework.  I
am including the fuzzing patches because I have reviewed them and Thomas Huth
is currently away on leave.

# gpg: Signature made Sat 22 Feb 2020 08:50:05 GMT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request: (31 commits)
  fuzz: add documentation to docs/devel/
  fuzz: add virtio-scsi fuzz target
  fuzz: add virtio-net fuzz target
  fuzz: add i440fx fuzz targets
  fuzz: add configure flag --enable-fuzzing
  fuzz: add target/fuzz makefile rules
  fuzz: add support for qos-assisted fuzz targets
  fuzz: support for fork-based fuzzing.
  main: keep rcu_atfork callback enabled for qtest
  exec: keep ram block across fork when using qtest
  fuzz: add fuzzer skeleton
  libqos: move useful qos-test funcs to qos_external
  libqos: split qos-test and libqos makefile vars
  libqos: rename i2c_send and i2c_recv
  qtest: add in-process incoming command handler
  libqtest: make bufwrite rely on the TransportOps
  libqtest: add a layer of abstraction to send/recv
  qtest: add qtest_server_send abstraction
  fuzz: add FUZZ_TARGET module type
  module: check module wasn't already initialized
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-24 11:38:54 +00:00
Alexander Bulekov
e5c59355ae fuzz: add documentation to docs/devel/
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-23-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
472a07a6e2 fuzz: add virtio-scsi fuzz target
The virtio-scsi fuzz target sets up and fuzzes the available virtio-scsi
queues. After an element is placed on a queue, the fuzzer can select
whether to perform a kick, or continue adding elements.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-22-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
b1db8c6316 fuzz: add virtio-net fuzz target
The virtio-net fuzz target feeds inputs to all three virtio-net
virtqueues, and uses forking to avoid leaking state between fuzz runs.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-21-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
04f713242d fuzz: add i440fx fuzz targets
These three targets should simply fuzz reads/writes to a couple ioports,
but they mostly serve as examples of different ways to write targets.
They demonstrate using qtest and qos for fuzzing, as well as using
rebooting and forking to reset state, or not resetting it at all.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-20-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
adc28027ff fuzz: add configure flag --enable-fuzzing
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-19-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
c621dc3e01 fuzz: add target/fuzz makefile rules
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200220041118.23264-18-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
275ab39d86 fuzz: add support for qos-assisted fuzz targets
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-17-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
cb06fdad05 fuzz: support for fork-based fuzzing.
fork() is a simple way to ensure that state does not leak in between
fuzzing runs. Unfortunately, the fuzzer mutation engine relies on
bitmaps which contain coverage information for each fuzzing run, and
these bitmaps should be copied from the child to the parent(where the
mutation occurs). These bitmaps are created through compile-time
instrumentation and they are not shared with fork()-ed processes, by
default. To address this, we create a shared memory region, adjust its
size and map it _over_ the counter region. Furthermore, libfuzzer
doesn't generally expose the globals that specify the location of the
counters/coverage bitmap. As a workaround, we rely on a custom linker
script which forces all of the bitmaps we care about to be placed in a
contiguous region, which is easy to locate and mmap over.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-16-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
d6919e4cb6 main: keep rcu_atfork callback enabled for qtest
The qtest-based fuzzer makes use of forking to reset-state between
tests. Keep the callback enabled, so the call_rcu thread gets created
within the child process.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200220041118.23264-15-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
a028edeaa6 exec: keep ram block across fork when using qtest
Ram blocks were marked MADV_DONTFORK breaking fuzzing-tests which
execute each test-input in a forked process.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-14-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
5f6fd09a97 fuzz: add fuzzer skeleton
tests/fuzz/fuzz.c serves as the entry point for the virtual-device
fuzzer. Namely, libfuzzer invokes the LLVMFuzzerInitialize and
LLVMFuzzerTestOneInput functions, both of which are defined in this
file. This change adds a "FuzzTarget" struct, along with the
fuzz_add_target function, which should be used to define new fuzz
targets.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-13-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
f62a0bff6a libqos: move useful qos-test funcs to qos_external
The moved functions are not specific to qos-test and might be useful
elsewhere. For example the virtual-device fuzzer makes use of them for
qos-assisted fuzz-targets.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-12-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
92ecf9be90 libqos: split qos-test and libqos makefile vars
Most qos-related objects were specified in the qos-test-obj-y variable.
qos-test-obj-y also included qos-test.o which defines a main().
This made it difficult to repurpose qos-test-obj-y to link anything
beside tests/qos-test against libqos. This change separates objects that
are libqos-specific and ones that are qos-test specific into different
variables.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200220041118.23264-11-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
39397a9a76 libqos: rename i2c_send and i2c_recv
The names i2c_send and i2c_recv collide with functions defined in
hw/i2c/core.c. This causes an error when linking against libqos and
softmmu simultaneously (for example when using qtest inproc). Rename the
libqos functions to avoid this.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-id: 20200220041118.23264-10-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
0bd9aef89b qtest: add in-process incoming command handler
The handler allows a qtest client to send commands to the server by
directly calling a function, rather than using a file/CharBackend

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-9-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:48 +00:00
Alexander Bulekov
ca5d464151 libqtest: make bufwrite rely on the TransportOps
When using qtest "in-process" communication, qtest_sendf directly calls
a function in the server (qtest.c). Previously, bufwrite used
socket_send, which bypasses the TransportOps enabling the call into
qtest.c. This change replaces the socket_send calls with ops->send,
maintaining the benefits of the direct socket_send call, while adding
support for in-process qtest calls.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-8-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Alexander Bulekov
075334810b libqtest: add a layer of abstraction to send/recv
This makes it simple to swap the transport functions for qtest commands
to and from the qtest client. For example, now it is possible to
directly pass qtest commands to a server handler that exists within the
same process, without the standard way of writing to a file descriptor.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-7-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Alexander Bulekov
e731d083e3 qtest: add qtest_server_send abstraction
qtest_server_send is a function pointer specifying the handler used to
transmit data to the qtest client. In the standard configuration, this
calls the CharBackend handler, but now it is possible for other types of
handlers, e.g direct-function calls if the qtest client and server
exist within the same process (inproc)

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-id: 20200220041118.23264-6-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Alexander Bulekov
e785e50a5e fuzz: add FUZZ_TARGET module type
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-5-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Alexander Bulekov
46a07579eb module: check module wasn't already initialized
The virtual-device fuzzer must initialize QOM, prior to running
vl:qemu_init, so that it can use the qos_graph to identify the arguments
required to initialize a guest for libqos-assisted fuzzing. This change
prevents errors when vl:qemu_init tries to (re)initialize the previously
initialized QOM module.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200220041118.23264-4-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Alexander Bulekov
7b73386222 softmmu: split off vl.c:main() into main.c
A program might rely on functions implemented in vl.c, but implement its
own main(). By placing main into a separate source file, there are no
complaints about duplicate main()s when linking against vl.o. For
example, the virtual-device fuzzer uses a main() provided by libfuzzer,
and needs to perform some initialization before running the softmmu
initialization. Now, main simply calls three vl.c functions which
handle the guest initialization, main loop and cleanup.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-3-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Alexander Bulekov
bac068e064 softmmu: move vl.c to softmmu/
Move vl.c to a separate directory, similar to linux-user/
Update the chechpatch and get_maintainer scripts, since they relied on
/vl.c for top_of_tree checks.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200220041118.23264-2-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi
7391d34c3c aio-posix: make AioHandler dispatch O(1) with epoll
File descriptor monitoring is O(1) with epoll(7), but
aio_dispatch_handlers() still scans all AioHandlers instead of
dispatching just those that are ready.  This makes aio_poll() O(n) with
respect to the total number of registered handlers.

Add a local ready_list to aio_poll() so that each nested aio_poll()
builds a list of handlers ready to be dispatched.  Since file descriptor
polling is level-triggered, nested aio_poll() calls also see fds that
were ready in the parent but not yet dispatched.  This guarantees that
nested aio_poll() invocations will dispatch all fds, even those that
became ready before the nested invocation.

Since only handlers ready to be dispatched are placed onto the
ready_list, the new aio_dispatch_ready_handlers() function provides O(1)
dispatch.

Note that AioContext polling is still O(n) and currently cannot be fully
disabled.  This still needs to be fixed before aio_poll() is fully O(1).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-6-stefanha@redhat.com
[Fix compilation error on macOS where there is no epoll(87).  The
aio_epoll() prototype was out of date and aio_add_ready_list() needed to
be moved outside the ifdef.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi
4749079ce0 aio-posix: make AioHandler deletion O(1)
It is not necessary to scan all AioHandlers for deletion.  Keep a list
of deleted handlers instead of scanning the full list of all handlers.

The AioHandler->deleted field can be dropped.  Let's check if the
handler has been inserted into the deleted list instead.  Add a new
QLIST_IS_INSERTED() API for this check.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi
195ed8cb36 qemu/queue.h: add QLIST_SAFE_REMOVE()
QLIST_REMOVE() assumes the element is in a list.  It also leaves the
element's linked list pointers dangling.

Introduce a safe version of QLIST_REMOVE() and convert open-coded
instances of this pattern.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi
ca8c6b2275 aio-posix: don't pass ns timeout to epoll_wait()
Don't pass the nanosecond timeout into epoll_wait(), which expects
milliseconds.

The epoll_wait() timeout value does not matter if qemu_poll_ns()
determined that the poll fd is ready, but passing a value in the wrong
units is still ugly.  Pass a 0 timeout to epoll_wait() instead.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi
ff29ed3a33 aio-posix: fix use after leaving scope in aio_poll()
epoll_handler is a stack variable and must not be accessed after it goes
out of scope:

      if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
          AioHandler epoll_handler;
          ...
          add_pollfd(&epoll_handler);
          ret = aio_epoll(ctx, pollfds, npfd, timeout);
      } ...

  ...

  /* if we have any readable fds, dispatch event */
  if (ret > 0) {
      for (i = 0; i < npfd; i++) {
          nodes[i]->pfd.revents = pollfds[i].revents;
      }
  }

nodes[0] is &epoll_handler, which has already gone out of scope.

There is no need to use pollfds[] for epoll.  We don't need an
AioHandler for the epoll fd.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi
8c6b0356b5 util/async: make bh_aio_poll() O(1)
The ctx->first_bh list contains all created BHs, including those that
are not scheduled.  The list is iterated by the event loop and therefore
has O(n) time complexity with respected to the number of created BHs.

Rewrite BHs so that only scheduled or deleted BHs are enqueued.
Only BHs that actually require action will be iterated.

One semantic change is required: qemu_bh_delete() enqueues the BH and
therefore invokes aio_notify().  The
tests/test-aio.c:test_source_bh_delete_from_cb() test case assumed that
g_main_context_iteration(NULL, false) returns false after
qemu_bh_delete() but it now returns true for one iteration.  Fix up the
test case.

This patch makes aio_compute_timeout() and aio_bh_poll() drop from a CPU
profile reported by perf-top(1).  Previously they combined to 9% CPU
utilization when AioContext polling is commented out and the guest has 2
virtio-blk,num-queues=1 and 99 virtio-blk,num-queues=32 devices.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200221093951.1414693-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Paolo Bonzini
8c3570e339 rcu_queue: add QSLIST functions
QSLIST is the only family of lists for which we do not have RCU-friendly accessors,
add them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200220103828.24525-1-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi
f25c0b5479 aio-posix: avoid reacquiring rcu_read_lock() when polling
The first rcu_read_lock/unlock() is expensive.  Nested calls are cheap.

This optimization increases IOPS from 73k to 162k with a Linux guest
that has 2 virtio-blk,num-queues=1 and 99 virtio-blk,num-queues=32
devices.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200218182708.914552-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Denis Plotnikov
c9b7d9ec21 virtio: increase virtqueue size for virtio-scsi and virtio-blk
The goal is to reduce the amount of requests issued by a guest on
1M reads/writes. This rises the performance up to 4% on that kind of
disk access pattern.

The maximum chunk size to be used for the guest disk accessing is
limited with seg_max parameter, which represents the max amount of
pices in the scatter-geather list in one guest disk request.

Since seg_max is virqueue_size dependent, increasing the virtqueue
size increases seg_max, which, in turn, increases the maximum size
of data to be read/write from a guest disk.

More details in the original problem statment:
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html

Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-id: 20200214074648.958-1-dplotnikov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Peter Maydell
88e2b97aa3 virtiofs pull 20200221
Mostly minor cleanups.
 Miroslav's fixes a make install corner case.
 Philippe's set includes an error corner case fix.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAl5P2WMACgkQBRYzHrxb
 /ecIuw/9HmZTEL+kDHt84LINVwJmr0bZUK4wEa0K41yCKsTdtHhZOg2p5L1+HcS/
 t0jZa8RisqiFLPuJoBfFSDSX9V/RSjz7Ay2nIP3hJLhssPbpU+YM6pep2RmmNz3I
 mJi+l8AFuyd+CBQvecVvMLv8+RybOyc6+g4vEws//d61yZ6nzovSeeMiZdthfJhq
 Qs6fFmTKqRrziDTS0+CuoOhJ8ErC1R7tanTzN4Ke2nDUw2i53bco7dLtFYkZ0YL4
 mOPb5txOZt2uTTMkaw9Np3+ahZ4/xoCW9igwQy9bGNwY7FnqUlobLBcpK80M9aGA
 XTX+YruK/t19+4EwQe0CVAPwNI24kijq604SWCyGOA90KQz4IO8Dtj0gSWCfFBcn
 4ffj6/OgZM7IeXD/IomfPTzyyGxVyyP54aT+DiMzBhAm9Tk14kwK0HdsmSt4PgIG
 pBgFCMPQmaOp4LJTY/RHls0lYU/LlTzYlyqLtYMGosZZcO1Zze980+EPXf8JKBER
 svTcnAOFTK6Yht6EskyQx9bgsGNIMmL69pNpuArYKjv+uXKQhuRmlkg3MJcVsIUE
 bCVNLIqyZqk3POnCkFiKMK4QZH0qwMf8QBs6j3M6I0nyVzLgk67t/BN55z1Ga6Jn
 UrgPXqBYBd0NnmE4WLNmzTDlIrlITYi3/hNYOWU7E5Eh3nylyoc=
 =1ANS
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20200221' into staging

virtiofs pull 20200221

Mostly minor cleanups.
Miroslav's fixes a make install corner case.
Philippe's set includes an error corner case fix.

# gpg: Signature made Fri 21 Feb 2020 13:21:39 GMT
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert-gitlab/tags/pull-virtiofs-20200221:
  docs: Fix virtiofsd.1 location
  virtiofsd: Remove fuse.h and struct fuse_module
  tools/virtiofsd/fuse_lowlevel: Fix fuse_out_header::error value
  tools/virtiofsd/passthrough_ll: Remove unneeded variable assignment
  tools/virtiofsd/passthrough_ll: Remove unneeded variable assignment
  virtiofsd: Help message fix for 'seconds'

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 17:00:23 +00:00
Peter Maydell
9ac5df20f5 target-arm queue:
* aspeed/scu: Implement chip ID register
  * hw/misc/iotkit-secctl: Fix writing to 'PPC Interrupt Clear' register
  * mainstone: Make providing flash images non-mandatory
  * z2: Make providing flash images non-mandatory
  * Fix failures to flush SVE high bits after AdvSIMD INS/ZIP/UZP/TRN/TBL/TBX/EXT
  * Minor performance improvement: spend less time recalculating hflags values
  * Code cleanup to isar_feature function tests
  * Implement ARMv8.1-PMU and ARMv8.4-PMU extensions
  * Bugfix: correct handling of PMCR_EL0.LC bit
  * Bugfix: correct definition of PMCRDP
  * Correctly implement ACTLR2, HACTLR2
  * allwinner: Wire up USB ports
  * Vectorize emulation of USHL, SSHL, PMUL*
  * xilinx_spips: Correct the number of dummy cycles for the FAST_READ_4 cmd
  * sh4: Fix PCI ISA IO memory subregion
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAl5QAqEZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3r05D/9TqS2xjqDT6e61W7ETnnCw
 hOXg5grQcI0ppStUWb7eW9SdfaftquIrxaB4kYXX00daMn6BOM7YzSLtSEuEeswZ
 hbSUGDuMFtUtOfyArR+m3HxoIJDAa6u9t4Di7sVR/DKPuKwCCerR7oAf8fCfWNEB
 3sP3+AnwozvXCkRBmHf5q7EflVNP7mYx+fWPnGxNapZn0W94KqDafQFEy0NejbiZ
 NfpcrdGSY/Fn83znuHNMLDrhYqnWbVudgvBtJ3j/uGvzMAnzrS26CHLhVdJDTvF3
 HzuZR8su9vawizbXeeqqC3D6M75xPyUjUg8XSI9cTs1Il31uTL/elkugkKWG8G99
 w/4Q77IPxe+wtC9eCtys2qQqhqC7z4jfa/WgAhghuvx/jPaOQK/p5SChUqh6CPdx
 DHxCHF52ADlhZ6zflLMf9nhh8e42PJ2+8xTh0WAMDVwIsuB9FVMIU3FY9mwGPCvN
 xBbmPJxS5QoqF8UXsdtA89YIV8r8yF2uZeKpgXQltXFmBzcgsETI6wtjWgl3uQuB
 dKnDAwVlMQubuPrZb5KDTY+dDDk9SQBMHbDffSDsootXaSNBnAtanJAdyXRox8KG
 S4OsjrGPMNn83SSwBODIPjtzLEbQyIcND5XkdN2oU96xstvJeGsyZJ56WHmbWhmC
 tFTjkDGbaYolJy+IBvLCPQ==
 =JP9N
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20200221-1' into staging

target-arm queue:
 * aspeed/scu: Implement chip ID register
 * hw/misc/iotkit-secctl: Fix writing to 'PPC Interrupt Clear' register
 * mainstone: Make providing flash images non-mandatory
 * z2: Make providing flash images non-mandatory
 * Fix failures to flush SVE high bits after AdvSIMD INS/ZIP/UZP/TRN/TBL/TBX/EXT
 * Minor performance improvement: spend less time recalculating hflags values
 * Code cleanup to isar_feature function tests
 * Implement ARMv8.1-PMU and ARMv8.4-PMU extensions
 * Bugfix: correct handling of PMCR_EL0.LC bit
 * Bugfix: correct definition of PMCRDP
 * Correctly implement ACTLR2, HACTLR2
 * allwinner: Wire up USB ports
 * Vectorize emulation of USHL, SSHL, PMUL*
 * xilinx_spips: Correct the number of dummy cycles for the FAST_READ_4 cmd
 * sh4: Fix PCI ISA IO memory subregion

# gpg: Signature made Fri 21 Feb 2020 16:17:37 GMT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20200221-1: (46 commits)
  target/arm: Set MVFR0.FPSP for ARMv5 cpus
  target/arm: Use isar_feature_aa32_simd_r32 more places
  target/arm: Rename isar_feature_aa32_simd_r32
  sh4: Fix PCI ISA IO memory subregion
  xilinx_spips: Correct the number of dummy cycles for the FAST_READ_4 cmd
  target/arm: Convert PMULL.8 to gvec
  target/arm: Convert PMULL.64 to gvec
  target/arm: Convert PMUL.8 to gvec
  target/arm: Vectorize USHL and SSHL
  arm: allwinner: Wire up USB ports
  hcd-ehci: Introduce "companion-enable" sysbus property
  hw: usb: hcd-ohci: Move OHCISysBusState and TYPE_SYSBUS_OHCI to include file
  target/arm: Correctly implement ACTLR2, HACTLR2
  target/arm: Use FIELD_EX32 for testing 32-bit fields
  target/arm: Use isar_feature function for testing AA32HPD feature
  target/arm: Test correct register in aa32_pan and aa32_ats1e1 checks
  target/arm: Correct handling of PMCR_EL0.LC bit
  target/arm: Correct definition of PMCRDP
  target/arm: Provide ARMv8.4-PMU in '-cpu max'
  target/arm: Implement ARMv8.4-PMU extension
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:18:38 +00:00
Richard Henderson
9eb4f58918 target/arm: Set MVFR0.FPSP for ARMv5 cpus
We are going to convert FEATURE tests to ISAR tests,
so FPSP needs to be set for these cpus, like we have
already for FPDP.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214181547.21408-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:07:03 +00:00
Richard Henderson
a6627f5fc6 target/arm: Use isar_feature_aa32_simd_r32 more places
Many uses of ARM_FEATURE_VFP3 are testing for the number of simd
registers implemented.  Use the proper test vs MVFR0.SIMDReg.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214181547.21408-4-richard.henderson@linaro.org
[PMM: fix typo in commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:07:03 +00:00
Richard Henderson
0e13ba7889 target/arm: Rename isar_feature_aa32_simd_r32
The old name, isar_feature_aa32_fp_d32, does not reflect
the MVFR0 field name, SIMDReg.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200214181547.21408-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: wrapped one long line]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:07:03 +00:00
Guenter Roeck
47d2d36cd8 sh4: Fix PCI ISA IO memory subregion
Booting the r2d machine from flash fails because flash is not discovered.
Looking at the flattened memory tree, we see the following.

FlatView #1
 AS "memory", root: system
 AS "cpu-memory-0", root: system
 AS "sh_pci_host", root: bus master container
 Root memory region: system
  0000000000000000-000000000000ffff (prio 0, i/o): io
  0000000000010000-0000000000ffffff (prio 0, i/o): r2d.flash @0000000000010000

The overlapping memory region is sh_pci.isa, ie the ISA I/O region bridge.
This region is initially assigned to address 0xfe240000, but overwritten
with a write into the PCIIOBR register. This write is expected to adjust
the PCI memory window, but not to change the region's base adddress.

Peter Maydell provided the following detailed explanation.

"Section 22.3.7 and in particular figure 22.3 (of "SSH7751R user's manual:
hardware") are clear about how this is supposed to work: there is a window
at 0xfe240000 in the system register space for PCI I/O space. When the CPU
makes an access into that area, the PCI controller calculates the PCI
address to use by combining bits 0..17 of the system address with the
bits 31..18 value that the guest has put into the PCIIOBR. That is, writing
to the PCIIOBR changes which section of the IO address space is visible in
the 0xfe240000 window. Instead what QEMU's implementation does is move the
window to whatever value the guest writes to the PCIIOBR register -- so if
the guest writes 0 we put the window at 0 in system address space."

Fix the problem by calling memory_region_set_alias_offset() instead of
removing and re-adding the PCI ISA subregion on writes into PCIIOBR.
At the same time, in sh_pci_device_realize(), don't set iobr since
it is overwritten later anyway. Instead, pass the base address to
memory_region_add_subregion() directly.

Many thanks to Peter Maydell for the detailed problem analysis, and for
providing suggestions on how to fix the problem.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20200218201050.15273-1-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:07:02 +00:00
Francisco Iglesias
33e2c4d8d3 xilinx_spips: Correct the number of dummy cycles for the FAST_READ_4 cmd
Correct the number of dummy cycles required by the FAST_READ_4 command (to
be eight, one dummy byte).

Fixes: ef06ca3946 ("xilinx_spips: Add support for RX discard and RX drain")
Suggested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20200218113350.6090-1-frasse.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:07:02 +00:00
Richard Henderson
e7e96fc5ec target/arm: Convert PMULL.8 to gvec
We still need two different helpers, since NEON and SVE2 get the
inputs from different locations within the source vector.  However,
we can convert both to the same internal form for computation.

The sve2 helper is not used yet, but adding it with this patch
helps illustrate why the neon changes are helpful.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:07:02 +00:00
Richard Henderson
b9ed510e46 target/arm: Convert PMULL.64 to gvec
The gvec form will be needed for implementing SVE2.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:07:02 +00:00
Richard Henderson
a21bb78e58 target/arm: Convert PMUL.8 to gvec
The gvec form will be needed for implementing SVE2.

Extend the implementation to operate on uint64_t instead of uint32_t.
Use a counted inner loop instead of terminating when op1 goes to zero,
looking toward the required implementation for ARMv8.4-DIT.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:07:02 +00:00
Richard Henderson
87b74e8b6e target/arm: Vectorize USHL and SSHL
These instructions shift left or right depending on the sign
of the input, and 7 bits are significant to the shift.  This
requires several masks and selects in addition to the actual
shifts to form the complete answer.

That said, the operation is still a small improvement even for
two 64-bit elements -- 13 vector operations instead of 2 * 7
integer operations.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:07:02 +00:00