Commit Graph

5205 Commits

Author SHA1 Message Date
Richard Henderson 1ee73216f4 log: Add locking to large logging blocks
Reuse the existing locking provided by stdio to keep in_asm, cpu,
op, op_opt, op_ind, and out_asm as contiguous blocks.

While it isn't possible to interleave e.g. in_asm or op_opt logs
because of the TB lock protecting all code generation, it is
possible to interleave cpu logs, or to interleave a cpu dump with
an out_asm dump.

For mingw32, we appear to have no viable solution for this.  The locking
functions are not properly exported from the system runtime library.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-11-01 10:29:03 -06:00
Paolo Bonzini 28017e010d tests: send error_report to test log
Implement error_vprintf to send the output of error_report to
the test log.  This silences test-vmstate.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477326663-67817-3-git-send-email-pbonzini@redhat.com>
2016-11-01 16:06:57 +01:00
Paolo Bonzini 397d30e940 qemu-error: remove dependency of stubs on monitor
Leave the implementation of error_vprintf and error_vprintf_unless_qmp
(the latter now trivially wrapped by error_printf_unless_qmp) to
libqemustub.a and monitor.c.  This has two advantages: it lets us
remove the monitor_printf and monitor_vprintf stubs, and it lets
tests provide a different implementation of the functions that uses
g_test_message.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477326663-67817-2-git-send-email-pbonzini@redhat.com>
2016-11-01 16:06:57 +01:00
John Snow d899636810 blockjobs: fix documentation
(Trivial)

Fix wrong function names in documentation.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-8-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01 08:04:56 -04:00
John Snow c87621ea68 blockjobs: split interface into public/private, Part 1
To make it a little more obvious which functions are intended to be
public interface and which are intended to be for use only by jobs
themselves, split the interface into "public" and "private" files.

Convert blockjobs (e.g. block/backup) to using the private interface.
Leave blockdev and others on the public interface.

There are remaining uses of private state by qemu-img, and several
cases in blockdev.c and block/io.c where we grab job->blk for the
purposes of acquiring an AIOContext.

These will be corrected in future patches.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-7-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01 08:04:56 -04:00
John Snow 0df4ba5863 Blockjobs: Internalize user_pause logic
BlockJobs will begin hiding their state in preparation for some
refactorings anyway, so let's internalize the user_pause mechanism
instead of leaving it to callers to correctly manage.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-6-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01 07:55:57 -04:00
John Snow 8254b6d953 blockjob: centralize QMP event emissions
There's no reason to leave this to blockdev; we can do it in blockjobs
directly and get rid of an extra callback for most users.

All non-internal events, even those created outside of QMP, will
consistently emit events.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-5-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01 07:55:57 -04:00
John Snow 47970dfb0a Replication/Blockjobs: Create replication jobs as internal
Bubble up the internal interface to commit and backup jobs, then switch
replication tasks over to using this methodology.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-4-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01 07:55:57 -04:00
John Snow f81e0b4532 blockjobs: Allow creating internal jobs
Add the ability to create jobs without an ID.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-3-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01 07:55:57 -04:00
John Snow 559b935f8c blockjobs: hide internal jobs from management API
If jobs are not created directly by the user, do not allow them to be
seen by the user/management utility. At the moment, 'internal' jobs are
those that do not have an ID. As of this patch it is impossible to
create such jobs.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1477584421-1399-2-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01 07:55:57 -04:00
Peter Maydell e80b4b8fb6 VFIO updates 2016-10-31
- Replace skip_dump with ram_device to denote device memory and mark
    as non-direct to avoid memcpy to MMIO - fixes RTL (Alex Williamson)
  - Skip zero-length sparse mmaps - avoids unnecessary warning
    (Alex Williamson)
  - Clear BARs on reset so guest doesn't assume programming on return
    from S3 (Ido Yariv)
  - Enable sub-page MMIO mmaps - performance improvement for devices
    with smaller BARs, iff both host and guest map them to full,
    aligned pages (Yongji Xie)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJYF37XAAoJECObm247sIsi9okP/jT/UBqR1G7RVuxQ8AZPPAsU
 mBClGw5lC2lQ70M/t9HNxMMpceHSmAIC4doauOhVNGn7yl3MgHywhEmuxvdQQBAV
 WQYkrZsAIyNhg4I0/92PybsppccEgXgGjz7tW+56udgPhU4ChSsbUwrt8uxZ6/M5
 R/rIGBe/46QVKCAPes3PvOLq19LErUnN0uSasP0QxacD0aFnO9vRSlT3Ake6mnqv
 u+Z1p8d9DM5LYkZPV0wcDWBlosda+cWFH+RhEp1UH4d+2hpW4+WB6bMG6SneguAV
 9P6Dl7z8dJUZauFXw+/ctYDHLOKmul6wb7fLR8n09kqLsgxveH3xEw3tILEDBMvn
 W9xBc1Rp5luH7vZio8ZUYvRO0+/MGEyzQwUPcOiw/VOWl0w8IYyA2UVpHQZk5Esi
 r+DsrkxdonrhqXuB4vrJg7TdlbBEh2cAciy2zrSsYADB2ine/op7O+68+kqwsrlP
 tQOz+wIEi+72G7S6jdnVUQAYu+01Fae55K8gR2OPwGQO5SWgliYY7AZbE3l6eMZ7
 UtgG8YfJpJbZ5wQnshkF5NlNO9HwUS3bp+YgaSdF+NiZC+lz1nKpsqEx/JXRST7V
 A9hvK5so5mZ69EmEz7ruijBIblF3nte+Pfrm+FTjwqMUklvbwsElJGKf/fI6f+kl
 xYyUWkiYOoZXmSkjCanm
 =ZMwj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20161031.0' into staging

VFIO updates 2016-10-31

 - Replace skip_dump with ram_device to denote device memory and mark
   as non-direct to avoid memcpy to MMIO - fixes RTL (Alex Williamson)
 - Skip zero-length sparse mmaps - avoids unnecessary warning
   (Alex Williamson)
 - Clear BARs on reset so guest doesn't assume programming on return
   from S3 (Ido Yariv)
 - Enable sub-page MMIO mmaps - performance improvement for devices
   with smaller BARs, iff both host and guest map them to full,
   aligned pages (Yongji Xie)

# gpg: Signature made Mon 31 Oct 2016 17:26:47 GMT
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20161031.0:
  vfio: Add support for mmapping sub-page MMIO BARs
  vfio/pci: fix out-of-sync BAR information on reset
  vfio: Handle zero-length sparse mmap ranges
  memory: Don't use memcpy for ram_device regions
  memory: Replace skip_dump flag with "ram_device"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 18:19:06 +00:00
Peter Maydell 8ff7fd8a29 Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJYF2zfAAoJEH8JsnLIjy/WOsAP/1TLKU8ZZ/TLpONfUD6SYcdt
 rPxhuxlxRl7k1xP/Y4oYa+Nl6Q9emgIcTA3V3hqKNEmaqxQWEM56Q+l2cFhaqqVm
 CO/bNJ0vGDogAG0ahgPCs9XKx7IPByb/iQiE+FNL68ZPZAWIaEHYgoiCjukr6e8B
 Q/OXN+WQLM4SIxnfLCYKSCycdGsaSTSx1T/8IjQb0DbpMFtDHOPkXcf/7AsiNJdP
 wdcx0uDkFk3GK2hN4A/ODFQQpEoi6ehm/RaaiDqGEX93IWnFlXRpq4h+dih79jaI
 Xuf9vt+uBKqTW5EiwY/lUmalf+zxj6wHNkvNze9paiacAvuT4N4Vq4+niDvRAbeI
 gkeC2GNRdO0iD+iBMHKQQ0JelBn0y/B+txzurcbZd71d0GF3r6jC17BEwnM39veR
 iIARK/dZEvygYRQknTG/EkbYHBdpWNmjKeVBr/08L7v7r9pN15huNx6Cmr+klz08
 ibstYxyCdmhvAf+UFELDZh3MF+65dpv8+Sa2OYS1SeE7jO9IhkF1SSyaGtSglvtO
 AN9xrJnaOBlLdVLycLpuH+1bcpXIYwMaoHeVsDmeBI5N9QPYlD73Y0ns7vENLR59
 2kUtyhOORdaE+y69hjuRmNYIN8svszzDk6or95aGZB3r4yv7uu2w+sGyy2dR+pdZ
 m+80c3xshudX9rUyHlJb
 =Hglo
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Mon 31 Oct 2016 16:10:07 GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (29 commits)
  qapi: allow blockdev-add for NFS
  block/nfs: Introduce runtime_opts in NFS
  block: Mention replication in BlockdevDriver enum docs
  qemu-iotests: test 'offset' and 'size' options in raw driver
  raw_bsd: add offset and size options
  qemu-iotests: Test the 'base-node' parameter of 'block-stream'
  block: Add 'base-node' parameter to the 'block-stream' command
  qemu-iotests: Test streaming to a Quorum child
  qemu-iotests: Add iotests.supports_quorum()
  qemu-iotests: Test block-stream and block-commit in parallel
  qemu-iotests: Test overlapping stream and commit operations
  qemu-iotests: Test block-stream operations in parallel
  qemu-iotests: Test streaming to an intermediate layer
  docs: Document how to stream to an intermediate layer
  block: Add QMP support for streaming to an intermediate layer
  block: Support streaming to an intermediate layer
  block: Block all intermediate nodes in commit_active_start()
  block: Block all nodes involved in the block-commit operation
  block: Check blockers in all nodes involved in a block-commit job
  block: Use block_job_add_bdrv() in backup_start()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 17:29:04 +00:00
Alex Williamson 4a2e242bbb memory: Don't use memcpy for ram_device regions
With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors.  If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap.  Finally, if we have any quirks for the device (ie. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion.  When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.

This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU.  If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap.  Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems.  I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces.  It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.

To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.

With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access.  The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities.  If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access.  A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access weidth, unaware of the underlying
device.  Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).

Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 09:53:03 -06:00
Alex Williamson 21e00fa55f memory: Replace skip_dump flag with "ram_device"
Setting skip_dump on a MemoryRegion allows us to modify one specific
code path, but the restriction we're trying to address encompasses
more than that.  If we have a RAM MemoryRegion backed by a physical
device, it not only restricts our ability to dump that region, but
also affects how we should manipulate it.  Here we recognize that
MemoryRegions do not change to sometimes allow dumps and other times
not, so we replace setting the skip_dump flag with a new initializer
so that we know exactly the type of region to which we're applying
this behavior.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 09:53:03 -06:00
Alberto Garcia 23d402d42b block: Add block_job_add_bdrv()
When a block job is created on a certain BlockDriverState, operations
are blocked there while the job exists. However, some block jobs may
involve additional BDSs, which must be blocked separately when the job
is created and unblocked manually afterwards.

This patch adds block_job_add_bdrv(), that simplifies this process by
keeping a list of BDSs that are involved in the specified block job.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia c0778f6693 block: Add bdrv_drain_all_{begin,end}()
bdrv_drain_all() doesn't allow the caller to do anything after all
pending requests have been completed but before block jobs are
resumed.

This patch splits bdrv_drain_all() into _begin() and _end() for that
purpose. It also adds aio_{disable,enable}_external() calls to disable
external clients in the meantime.

An important restriction of this split is that no new block jobs or
BlockDriverStates can be created between the bdrv_drain_all_begin()
and bdrv_drain_all_end() calls. This is not a concern now because
we'll only be using this in bdrv_reopen_multiple(), but it must be
dealt with if we ever have other uses cases in the future.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:51:14 +01:00
Ashijeet Acharya 89cadc9dc0 util/qemu-sockets: Make inet_connect_saddr() public
Make inet_connect_saddr() in util/qemu-sockets.c public in order to be
able to use it with InetSocketAddress sockets outside of
util/qemu-sockets.c independently.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:49:13 +01:00
Peter Maydell 6bc56d317f Base patches for MTTCG enablement.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJYF07FFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 ppoIAI4AxWocso5WIUH6uEHjOAxw9ZNhZ92nF8VtcbvGtN/eh8Qk4jfRX+W/Jl0q
 D13Rm3m8ynNHqh8YFs+O6i/WSgxHGxKwb75mNr36HDnYnMFluTvRQkvYJUXRyRuL
 CVtNgy8+q8FbbWo+NiJ5I7gfk2Si4BQfZN0uCLqGuCwqvvA/spN13xUcpeBXEKhL
 TeDGZBT/atDnT2bRcve8E8g5/0RKjTL9EB0jwfJjHocT5bs+toPe6js9VnZDRNWN
 ZldcONgEHj3zAj9j7hTkVWFTGPSCx/tt6y6JeORq1oxk0mCCswEk0U9A3hLzLjc/
 94XHsLaEoZ7HNAKtkLc07NYhkQM=
 =+6Sj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream-mttcg' into staging

Base patches for MTTCG enablement.

# gpg: Signature made Mon 31 Oct 2016 14:01:41 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream-mttcg:
  tcg: move locking for tb_invalidate_phys_page_range up
  *_run_on_cpu: introduce run_on_cpu_data type
  cpus: re-factor out handle_icount_deadline
  tcg: cpus rm tcg_exec_all()
  tcg: move tcg_exec_all and helpers above thread fn
  target-arm/arm-powerctl: wake up sleeping CPUs
  tcg: protect translation related stuff with tb_lock.
  translate-all: Add assert_(memory|tb)_lock annotations
  linux-user/elfload: ensure mmap_lock() held while setting up
  tcg: comment on which functions have to be called with tb_lock held
  cpu-exec: include cpu_index in CPU_LOG_EXEC messages
  translate-all: add DEBUG_LOCKING asserts
  translate_all: DEBUG_FLUSH -> DEBUG_TB_FLUSH
  cpus: make all_vcpus_paused() return bool

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 15:29:12 +00:00
Paolo Bonzini 14e6fe12a7 *_run_on_cpu: introduce run_on_cpu_data type
This changes the *_run_on_cpu APIs (and helpers) to pass data in a
run_on_cpu_data type instead of a plain void *. This is because we
sometimes want to pass a target address (target_ulong) and this fails on
32 bit hosts emulating 64 bit guests.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161027151030.20863-24-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 15:00:25 +01:00
Peter Maydell eab9e9629c Migration bits from the COLO project
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQItBAABCAAXBQJYFc37EBxhbWl0QGtlcm5lbC5vcmcACgkQ6wtN/GV+9nDdqQ/9
 H+di+q/zF6aOEuXRCN0ud9R81fZ7nLve3e9EbKCNZXnxN3M3+zQj0At+e/SBEoTc
 rRwqJmNlRX3TWViWsAYmddgtopZ9R9DWwW/VsKjS2Ng230YShrA/o20hu2dJkwl3
 CN4vAObzc/gxM59NWUlMnTXOG+Z9fI1NlEf0vZ2484a59KPVIE7W1zZccT1F8MNq
 sfa3RlOGBchNO2Rfzrr6cFGGH7UTfWiftPs7EDOfN/YBcpld6V4DfPWdPP8r2DSX
 gG7D3DJuuqPxZCjl0Nm1OjaIunYfVrRpMPMeNyo1+kTVbvvhDPFjah/MSWu8XJ2c
 N5lSqikIAWfMbQVW9gDpQ0495eRQTWA7VIlCbwN6mqNxyBvQMA+licFq6UFFrCMp
 quC3gO+daz3fKvUhi23TpebbqKLHA5OZA5ZvkjGDgkKPMCJyQRBZHLb81t5xsulZ
 cXuzAOeRcXK9aEJvcrDgWwxzi3PN8zg74RF1ZV8gxM4DkHKohlsnbgshWqGkFh3M
 +S5tEPqVlOlDO9juf6rlwNnVbWhFDFEGMKjI9XMTWwVWTREbqyP86fP66h2C4qc6
 34yAHi5G2i43dxzHgpHC0MpU0XenO0EYdu+8Tcx35LSBSfkOeD7pU1DeZQgbI51m
 ZQSnLDJqv2HVdoZT7vaIjwuhtEl84xqelFdVg+cY7Gw=
 =sur7
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.8' into staging

Migration bits from the COLO project

# gpg: Signature made Sun 30 Oct 2016 10:39:55 GMT
# gpg:                using RSA key 0xEB0B4DFC657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"
# Primary key fingerprint: 48CA 3722 5FE7 F4A8 B337  2735 1E9A 3B5F 8540 83B6
#      Subkey fingerprint: CC63 D332 AB8F 4617 4529  6534 EB0B 4DFC 657E F670

* remotes/amit-migration/tags/migration-for-2.8:
  MAINTAINERS: Add maintainer for COLO framework related files
  configure: Support enable/disable COLO feature
  docs: Add documentation for COLO feature
  COLO: Implement failover work for secondary VM
  COLO: Implement the process of failover for primary VM
  COLO: Introduce state to record failover process
  COLO: Add 'x-colo-lost-heartbeat' command to trigger failover
  COLO: Synchronize PVM's state to SVM periodically
  COLO: Add checkpoint-delay parameter for migrate-set-parameters
  COLO: Load VMState into QIOChannelBuffer before restore it
  COLO: Send PVM state to secondary side when do checkpoint
  COLO: Add a new RunState RUN_STATE_COLO
  COLO: Introduce checkpointing protocol
  COLO: Establish a new communicating path for COLO
  migration: Switch to COLO process after finishing loadvm
  migration: Enter into COLO mode after migration if COLO is enabled
  COLO: migrate COLO related info to secondary node
  migration: Introduce capability 'x-colo' to migration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 13:06:38 +00:00
Peter Maydell 5ff06787d4 Xen 2016/10/28
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYE/VuAAoJEIlPj0hw4a6Q/qAP+gMhunO/OJRSwIlYSOt+fVKW
 LE17QUfdVe204IuWY3h9svTvJXj/pOsE1HtIrGAUwsJxMqMBHeZnKhVZvEbvL2Iy
 sKSxzQkkPa6qVn/+Nxr9ZxULJZPXWnl61FLbElelff4w46lTaBM3gBaWDekFEO64
 RbMvsMAUmav1x88KPvqY71Crbx5wbPhNmFqhbJNaQmm3zIQDK1TzGESv882mQKy2
 rKNapBUXq8XnUNN+lIHhnzU9kUjhZxu7uet3GHMVICeAYu3b9jkgomv2OcV/sfRg
 3o/NoXp4I7ZY3F0fkbtJOIFx0m+YlWnQhkBGsQoXJW+4lUdQR9ypMY4OdzjRa80e
 w9GrDt1//LOYrTpB0ZBkW0MIfnUK4TCqtL/aEQtRY9fdRFvcVpCjnqrYw+u9boZ1
 hVypTYmAbk/ece6aJ/dngDQVtGC9qMGlHtBqSBRajFxenvFdY+DK6/FhITpNmobU
 YPWTSwS6WPw/venfvrTMfCQudGW3Jg8iBzRbGPS+GYfYlTHFoO0lKFGWWeTuRFIw
 /4owDhMJr5hRMRWZxCAu+Z8Ymj1MFuK7zDKjvT9LgWOZORh+rz6Tfn9+oxSt/D9I
 1VKb/T9N9wBS3kreZ+Uz5+aQBoQSN5AQjay2ECZ16u9i63EL89CqXyyrJLlKlEGB
 cP3GzM9DlAtCQR943bys
 =Ao+X
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20161028-tag' into staging

Xen 2016/10/28

# gpg: Signature made Sat 29 Oct 2016 02:03:42 BST
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <sstabellini@kernel.org>"
# gpg:                 aka "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* remotes/sstabellini/tags/xen-20161028-tag:
  xen: Rename xen_be_del_xendev
  xen: Rename xen_be_find_xendev
  xen: Rename xen_be_evtchn_event
  xen: Rename xen_be_send_notify
  xen: Rename xen_be_unbind_evtchn
  xen: Rename xen_be_printf to xen_pv_printf
  xen: Move xenstore cleanup and mkdir functions
  xen: Prepare xendev qtail to be shared with frontends
  xen: Move evtchn functions to xen_pvdev.c
  xen: Move xenstore_update to xen_pvdev.c
  xen: Create a new file xen_pvdev.c
  xen: Fix coding style warnings
  xen: Fix coding style errors

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 12:35:39 +00:00
Peter Maydell 277d44f5a6 trivial patches for 2016-10-28
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJYE2wfAAoJEHAbT2saaT5ZGYUH/3QWJ4OFWbqGo1YYN5AIAheF
 v1bQGTh1HGbLk46ajhUvzB0bMHb1FC1KoOruU2wFYuKK/J5zQ+4X9EmaC/fD7hyx
 nGTcPWAyxKOlqOq3In9ro+xWQNzEhfoypKCQQVC4Y3quzub48wAro8fuFSNXLyBq
 ERvAsjgj0TrLEHoWtJl2bPYiqSd6KAHZAKPFW3Jw8MmsBcTLmnF2PVW3LBfdcHe7
 6vlhqX7lPzVlHRaUsaxRkFxYd2YGisbe3bPRDw2fTxrtOYyEkopQq7xi2Q6Yq5N0
 z0yM2oJ7o1QtUOXYa7KBf03WZ7e119HimaUkGLg+0LVhQNbeG3hd3gNwApXa5og=
 =tYml
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging

trivial patches for 2016-10-28

# gpg: Signature made Fri 28 Oct 2016 16:17:51 BST
# gpg:                using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* remotes/mjt/tags/trivial-patches-fetch: (23 commits)
  Fix build for less common build directories names
  clean-up: removed duplicate #includes
  scripts/clean-includes: added duplicate #include check
  monitor: deprecate 'default' option
  qemu-ga: Remove stray 'q' in documentation
  Makefile: Fix help text for target 'installer'
  s390: avoid always-true comparison in s390_pci_generate_fid()
  migration: Remove unneeded NULL check from migrate_fd_error()
  scripts/hxtool: fix undefined behavour of echo
  qemu-options.hx: set: fix copy-paste error
  usb: Change *_exitfn return type from int to void
  MAINTAINERS: qemu-trivial information
  colo-compare: remove unused struct CompareChardevProps and 'props' variable
  milkymist-pfpu: fix potential integer overflow
  hw/block/nvme: Simplify if-statements a little bit
  target-lm32: rewrite gen_compare()
  lm32: milkymist-tmu2: fix integer overflow
  target-lm32: disable asm logging via LOG_DIS()
  target-lm32: swap operand of wcsr in LOG_DIS()
  target-lm32: fix LOG_DIS operand order
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 11:58:30 +00:00
Peter Maydell 5273a45e75 -----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
 
 iQEcBAABCAAGBQJYE2ULAAoJEMo1YkxqkXHGvvAH/iDPIAiwBXbndL3KhQTneSHn
 ctd4I3VK1/VVTIBRJIetqETiWiAm/WoRhI9kBc/NrQxBFx3ko+fpSYFS2t6lJYnV
 EX0vjTKjFhr05tOTQDH/SQtHdU5x/x2M8SsxqrCcTyLm5VDfdPeBlMBfSNMj/L2K
 bwinANVEwr6LOM0h8weQ0SvOCa5MLII2p5ufGwKQmhUY5tgZvFlyPa+quDVisKoE
 7CpLwWHmUQSNxUXSaru90osUJyk90wCcYxPpJN3YO1MHvpH4kG8DpZ8bnFqLAoNw
 zkRdqIrlfntD+mKDqRU1y0GXxu9I4VK1UDcQyRFoSdMi2oHR+L018sQEjCYTAXo=
 =n+CF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/famz/tags/for-upstream' into staging

# gpg: Signature made Fri 28 Oct 2016 15:47:39 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/for-upstream:
  aio: convert from RFifoLock to QemuRecMutex
  qemu-thread: introduce QemuRecMutex
  iothread: release AioContext around aio_poll
  block: only call aio_poll on the current thread's AioContext
  qemu-img: call aio_context_acquire/release around block job
  qemu-io: acquire AioContext
  block: prepare bdrv_reopen_multiple to release AioContext
  replication: pass BlockDriverState to reopen_backing_file
  iothread: detach all block devices before stopping them
  aio: introduce qemu_get_current_aio_context
  sheepdog: use BDRV_POLL_WHILE
  nfs: use BDRV_POLL_WHILE
  nfs: move nfs_set_events out of the while loops
  block: introduce BDRV_POLL_WHILE
  qed: Implement .bdrv_drain
  block: change drain to look only at one child at a time
  block: add BDS field to count in-flight requests
  mirror: use bdrv_drained_begin/bdrv_drained_end
  blockjob: introduce .drain callback for jobs
  replication: interrupt failover if the main device is closed

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 10:10:16 +00:00
Paolo Bonzini 7d7500d998 tcg: comment on which functions have to be called with tb_lock held
softmmu requires more functions to be thread-safe, because translation
blocks can be invalidated from e.g. notdirty callbacks.  Probably the
same holds for user-mode emulation, it's just that no one has ever
tried to produce a coherent locking there.

This patch will guide the introduction of more tb_lock and tb_unlock
calls for system emulation.

Note that after this patch some (most) of the mentioned functions are
still called outside tb_lock/tb_unlock.  The next one will rectify this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-7-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:51:16 +01:00
Alex Bennée 301e40ed80 translate-all: add DEBUG_LOCKING asserts
This adds asserts to check the locking on the various translation
engines structures. There are two sets of structures that are protected
by locks.

The first the l1map and PageDesc structures used to track which
translation blocks are associated with which physical addresses. In
user-mode this is covered by the mmap_lock.

The second case are TB context related structures which are protected by
tb_lock which is also user-mode only.

Currently the asserts do nothing in SoftMMU mode but this will change
for MTTCG.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-4-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:24:45 +01:00
Gonglei 5551e3a88e virtio-crypto: introduce virtio_crypto.h
Introduce the virtio_crypto.h which follows
virtio-crypto specification.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:22 +02:00
Gonglei 9e4f86a84e cryptodev: add symmetric algorithm operation stuff
This patch adds session operation and crypto operation
stuff in the cryptodev backend, including function
pointers and corresponding structures.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:22 +02:00
Gonglei d0ee7a135f cryptodev: introduce cryptodev backend interface
cryptodev backend interface is used to realize the active work for
virtual crypto device.

This patch only add the framework, doesn't include specific operations.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:22 +02:00
Paolo Bonzini fa283a4a8b virtio: inline virtio_queue_set_host_notifier_fd_handler
Of the three possible parameter combinations for
virtio_queue_set_host_notifier_fd_handler:

- assign=true/set_handler=true is only called from
  virtio_device_start_ioeventfd

- assign=false/set_handler=false is called from
  set_host_notifier_internal but it only does something when
  reached from virtio_device_stop_ioeventfd_impl; otherwise
  there is no EventNotifier set on qemu_get_aio_context().

- assign=true/set_handler=false is called from
  set_host_notifier_internal, but it is not doing anything:
  with the new start_ioeventfd and stop_ioeventfd methods,
  there is never an EventNotifier set on qemu_get_aio_context()
  at this point.  This is enforced by the assertion in
  virtio_bus_set_host_notifier.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:21 +02:00
Paolo Bonzini ed08a2a0ba virtio: use virtio_bus_set_host_notifier to start/stop ioeventfd
ioeventfd_disabled was the only reason for the default
implementation of virtio_device_start_ioeventfd not to use
virtio_bus_set_host_notifier.  This is now fixed, and the sole entry
point to set up ioeventfd can be virtio_bus_set_host_notifier.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:21 +02:00
Paolo Bonzini e616c2f390 virtio: remove ioeventfd_disabled altogether
Now that there is not anymore a switch from the generic ioeventfd handler
to the dataplane handler, virtio_bus_set_host_notifier(assign=true) is
always called with !bus->ioeventfd_started, hence virtio_bus_stop_ioeventfd
does nothing in this case.  Move the invocation to vhost.c, which is the
only place that needs it.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:20 +02:00
Paolo Bonzini 6019f3b966 virtio: remove set_handler argument from set_host_notifier_internal
Make virtio_device_start_ioeventfd_impl use the same logic as
dataplane to set up the host notifier.  This removes the need
for the set_handler argument in set_host_notifier_internal.

This is a first step towards using virtio_bus_set_host_notifier
as the sole entry point to set up ioeventfds.  At least now
the functions have the same interface, but they still differ
in that virtio_bus_set_host_notifier sets ioeventfd_disabled.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:20 +02:00
Paolo Bonzini f1ac6a5522 Revert "virtio: Introduce virtio_add_queue_aio"
This reverts commit 872dd82c83.
virtio_add_queue_aio is unused.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:20 +02:00
Paolo Bonzini ad07cd69ec virtio-scsi: always use dataplane path if ioeventfd is active
Override start_ioeventfd and stop_ioeventfd to start/stop the
whole dataplane logic.  This has some positive side effects:

- no need anymore for virtio_add_queue_aio (i.e. a revert of
  commit 1c627137c1)

- no need anymore to switch from generic ioeventfd handlers to
  dataplane

It detects some errors better:

    $ qemu-system-x86_64 -object iothread,id=io \
          -device virtio-scsi-pci,ioeventfd=off,iothread=io
    qemu-system-x86_64: -device virtio-scsi-pci,ioeventfd=off,iothread=io:
    ioeventfd is required for iothread

while previously it would have started just fine.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:32 +02:00
Paolo Bonzini 8e93cef14e virtio: introduce virtio_device_ioeventfd_enabled
This will be used to forbid iothread configuration when the
proxy does not allow using ioeventfd.  To simplify the implementation,
change the direction of the ioeventfd_disabled callback too.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:32 +02:00
Paolo Bonzini ff4c07df67 virtio: add start_ioeventfd and stop_ioeventfd to VirtioDeviceClass
Allow customization of the start and stop of ioeventfd.  This will
allow direct start of dataplane without passing through the default
ioeventfd handlers, which in turn allows using the dataplane logic
instead of virtio_add_queue_aio.  It will also enable some code
simplification, because the sole entry point to ioeventfd setup
will be virtio_bus_set_host_notifier.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:32 +02:00
Paolo Bonzini b13d396227 virtio: move ioeventfd_started flag to VirtioBusState
This simplifies the code and removes the ioeventfd_started
and ioeventfd_set_started callback.  The only difference is
in how virtio-ccw handles an error---it doesn't disable
ioeventfd forever anymore.  It was the only backend to do
so, and if desired this behavior should be implemented in

virtio-bus.c.

Instead of ioeventfd_started, the ioeventfd_assign callback now
determines whether the virtio bus supports host notifiers.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:32 +02:00
Paolo Bonzini 4ddcc2d5cb virtio: move ioeventfd_disabled flag to VirtioBusState
This simplifies the code and removes the ioeventfd_set_disabled
callback.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:32 +02:00
Dr. David Alan Gilbert ea43e25987 virtio/migration: Add VMStateDescription to VirtioDeviceClass
Provide a vmsd pointer for VirtIO devices to use instead of the
load/save methods.

We'll eventually kill off the load/save methods.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:31 +02:00
zhanghailiang b3f7f0c5e6 COLO: Implement the process of failover for primary VM
For primary side, if COLO gets failover request from users.
To be exact, gets 'x_colo_lost_heartbeat' command.
COLO thread will exit the loop while the failover BH does the
cleanup work and resumes VM.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang aef060850b COLO: Introduce state to record failover process
When handling failover, COLO processes differently according to
the different stage of failover process, here we introduce a global
atomic variable to record the status of failover.

We add four failover status to indicate the different stage of failover process.
You should use the helpers to get and set the value.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang d89e666e06 COLO: Add 'x-colo-lost-heartbeat' command to trigger failover
We leave users to choose whatever heartbeat solution they want,
if the heartbeat is lost, or other errors they detect, they can use
experimental command 'x_colo_lost_heartbeat' to tell COLO to do failover,
COLO will do operations accordingly.

For example, if the command is sent to the Primary side,
the Primary side will exit COLO mode, does cleanup work,
and then, PVM will take over the service work. If sent to the Secondary side,
the Secondary side will run failover work, then takes over PVM's service work.

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang 25d0c16f62 migration: Switch to COLO process after finishing loadvm
Switch from normal migration loadvm process into COLO checkpoint process if
COLO mode is enabled.

We add three new members to struct MigrationIncomingState,
'have_colo_incoming_thread' and 'colo_incoming_thread' record the COLO
related thread for secondary VM, 'migration_incoming_co' records the
original migration incoming coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang 0b827d5e72 migration: Enter into COLO mode after migration if COLO is enabled
Add a new migration state: MIGRATION_STATUS_COLO. Migration source side
enters this state after the first live migration successfully finished
if COLO is enabled by command 'migrate_set_capability x-colo on'.

We reuse migration thread, so the process of checkpointing will be handled
in migration thread.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang 5821ebf93b COLO: migrate COLO related info to secondary node
We can determine whether or not VM in destination should go into COLO mode
by referring to the info that was migrated.

We skip this section if COLO is not enabled (i.e.
migrate_set_capability colo off), so that, It doesn't break compatibility
with migration no matter whether users configure the --enable-colo/disable-colo
on the source/destination side or not;

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang 35a6ed4f71 migration: Introduce capability 'x-colo' to migration
We add helper function colo_supported() to indicate whether
colo is supported or not, with which we use to control whether or not
showing 'x-colo' string to users, they can use qmp command
'query-migrate-capabilities' or hmp command 'info migrate_capabilities'
to learn if colo is supported.

The default value for COLO (COarse-Grain LOck Stepping) is disabled.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
Emil Condrea 71981364b6 xen: Rename xen_be_del_xendev
Prepare xen_be_del_xendev to be shared with frontends:
 * xen_be_del_xendev -> xen_pv_del_xendev

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:54:49 -07:00
Emil Condrea fa0253d066 xen: Rename xen_be_find_xendev
Prepare xen_be_find_xendev to be shared with frontends:
 * xen_be_find_xendev -> xen_pv_find_xendev

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:54:39 -07:00
Emil Condrea 49442d9621 xen: Rename xen_be_evtchn_event
Prepare xen_be_evtchn_event to be shared with frontends:
 * xen_be_evtchn_event -> xen_pv_evtchn_event

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:54:31 -07:00
Emil Condrea ba18fa2a8c xen: Rename xen_be_send_notify
Prepare xen_be_send_notify to be shared with frontends:
 * xen_be_send_notify -> xen_pv_send_notify

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:54:21 -07:00
Emil Condrea 65807f4b6c xen: Rename xen_be_unbind_evtchn
Prepare xen_be_unbind_evtchn to be shared with frontends:
 * xen_be_unbind_evtchn -> xen_pv_unbind_evtchn

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:54:11 -07:00
Emil Condrea 96c77dba6f xen: Rename xen_be_printf to xen_pv_printf
Prepare xen_be_printf to be used by both backend and frontends:
 * xen_be_printf -> xen_pv_printf

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:53:50 -07:00
Emil Condrea 148512e062 xen: Prepare xendev qtail to be shared with frontends
* move xendevs qtail to xen_pvdev.c
 * change xen_be_get_xendev to use a new function: xen_pv_insert_xendev

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:53:25 -07:00
Emil Condrea 31c17aa5c3 xen: Move evtchn functions to xen_pvdev.c
The name of the functions moved:
 * xen_be_evtchn_event
 * xen_be_unbind_evtchn
 * xen_be_send_notify

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:53:16 -07:00
Emil Condrea 046db9bec5 xen: Move xenstore_update to xen_pvdev.c
* xenstore_update -> xen_pvdev.c

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:53:08 -07:00
Emil Condrea f0021dba62 xen: Create a new file xen_pvdev.c
The purpose of the new file is to store generic functions shared by frontend
and backends such as xenstore operations, xendevs.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:52:48 -07:00
Emil Condrea b9730c5b4e xen: Fix coding style warnings
Fixes:
 * WARNING: line over 80 characters

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2016-10-28 17:52:39 -07:00
Peter Maydell 66a77ea676 ppc patch queue 2016-10-28
This pull request supersedes and extends the one from 2016-10-26
 (which had a build bug).
 
 Highlights:
   * SLOF (pseries guest firmware) update
   * Enable a number of extra testcases on ppc / pseries
   * Added the 'powernv' machine type
     - Almost enough to be minimally usable
     - But still missing necessary interrupt controller updates
   * Cleanup and consolidation of NVRAM handling on several platforms
     with related firmware
   * Substantial cleanup to device tree construction
   * Some more POWER9 instruction emulation
   * Cleanup to handling of pseries option vectors and CAS reboot
     handling (host/guest feature negotiation mechanism)
   * Significant cleanups to handling of PCI devices in test cases
   * New hotplug event infrastructure
   * Memory hot unplug support for pseries
   * Several bug fixes
 
 The NVRAM cleanup affects some Sun sparc platforms as well as ppc
 ones, but have been tested by the sparc maintainer (Mark Cave-Ayland).
 
 The test additions also include substantial general changes to the
 test framework that aren't strictly ppc related.  They don't seem to
 break tests on other platforms, they're for the benefit of enabling
 tests on ppc and there isn't a specific maintainer for them, so
 they're included in this tree.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYEqvPAAoJEGw4ysog2bOSQiwP/jOr5bxwZmGDrOwAIzqhagib
 lo0+W1E6OCttbBhAp1inWZP6RnexAZ7orb+Q7DHQFbtYukUbPYwB/VzmWhaws6XV
 jxK/lVB3A+XlRIEKUUc8bWWGRN+QBMnQIcUhNlKuC4AVKMC1aZY9ZLT6LvilV6X7
 QtxAlBPmI2od2kyDHt/ibG9FkROFMi9ybbQG+D7Pu32NlTPgF06R6NPKtpkjEpUU
 dRYAUB+VTB4eofjzyVqsL+QB7uX5g0V9aPmYWBaXqjTG61ivHMJJ7zHta+GdckJM
 fk3S68ftPmf4EE+uL1Ff2fy+2Sxjh4QeNwxl1rppzrgvW8VkNOX6969idxSERUb8
 I2/RVM7F1mJ7+4fNIjenAru8qu3O981lU9+t7R5mmTcEsSk28FOkOv6Io+6JnGA2
 32qgOXwihsUDaH2pDagZ+ySaOqjWMD9WGQTfQgFMthGkcs6heG7ByvFrcpcacl5a
 kbMl7cj+zkgusLuQHx0dp669R7Ch7bxSigQC11iMCpAmFhXl8qJ37ACPJn8NlzOq
 NJdZwXOp9plYZ70a2CgKVXB6j+jhxOeJHg2v08jfF0rUILovBwH0WOrl1m0fb2gB
 1u+ua2FED+5rtwpGJ7pL/oE20H11QDfHNDvqEpagvHAHSSu5nqGxd/falYRYE59C
 wdMXPqJYQqkSYuA6XkgO
 =3Z5V
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.8-20161028' into staging

ppc patch queue 2016-10-28

This pull request supersedes and extends the one from 2016-10-26
(which had a build bug).

Highlights:
  * SLOF (pseries guest firmware) update
  * Enable a number of extra testcases on ppc / pseries
  * Added the 'powernv' machine type
    - Almost enough to be minimally usable
    - But still missing necessary interrupt controller updates
  * Cleanup and consolidation of NVRAM handling on several platforms
    with related firmware
  * Substantial cleanup to device tree construction
  * Some more POWER9 instruction emulation
  * Cleanup to handling of pseries option vectors and CAS reboot
    handling (host/guest feature negotiation mechanism)
  * Significant cleanups to handling of PCI devices in test cases
  * New hotplug event infrastructure
  * Memory hot unplug support for pseries
  * Several bug fixes

The NVRAM cleanup affects some Sun sparc platforms as well as ppc
ones, but have been tested by the sparc maintainer (Mark Cave-Ayland).

The test additions also include substantial general changes to the
test framework that aren't strictly ppc related.  They don't seem to
break tests on other platforms, they're for the benefit of enabling
tests on ppc and there isn't a specific maintainer for them, so
they're included in this tree.

# gpg: Signature made Fri 28 Oct 2016 02:37:19 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.8-20161028: (73 commits)
  ppc: allow certain HV interrupts to be delivered to guests
  spapr: Memory hot-unplug support
  spapr: use count+index for memory hotplug
  spapr: Add DRC count indexed hotplug identifier type
  spapr: add hotplug interrupt machine options
  spapr_events: add support for dedicated hotplug event source
  spapr: update spapr hotplug documentation
  target-ppc: Add xvcmpnesp, xvcmpnedp instructions
  target-ppc: add xscmp[eq,gt,ge,ne]dp instructions
  tests: Add pseries machine to the prom-env-test, too
  spapr_nvram: Pre-initialize the NVRAM to support the -prom-env parameter
  libqos: Change PCI accessors to take opaque BAR handle
  tests: Don't assume structure of PCI IO base in ahci-test
  tests: Use qpci_mem{read,write} in ivshmem-test
  libqos: Add 64-bit PCI IO accessors
  tests: Clean up IO handling in ide-test
  libqos: Implement mmio accessors in terms of mem{read,write}
  libqos: Add streaming accessors for PCI MMIO
  tests: Adjust tco-test to use qpci_legacy_iomap()
  libqos: Better handling of PCI legacy IO
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-28 16:31:59 +01:00
Anand J 814bb12a56 clean-up: removed duplicate #includes
Some files contain multiple #includes of the same header file.
Removed most of those unnecessary duplicate entries using
scripts/clean-includes.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Anand J <anand.indukala@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-10-28 18:17:24 +03:00
Marc-André Lureau bdbcb547cf monitor: deprecate 'default' option
This option does nothing since commit 06ac27f.  Deprecate it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-10-28 18:17:23 +03:00
Peter Maydell 01b601f061 Merge qio 2016/10/27 v1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYEfjrAAoJEL6G67QVEE/fdU4P/i7yBJo436OpkdgeWS8AWuFr
 ptZ+Fj/weGka5GU9E3KQu36kbSgrtfcgwTHphCMXnZ0YCeKQDuM57f7LNiN6qheB
 nqgJvJioLbUvLTQvCHOISM7bWOnYvASBmYtLJFtUcP/jhdOy61KaADnJ+7MbliNv
 yJSW2RN+s/y9nUb+dxEpIXXUVMRa6BX+wHW3O44c1oLn6/Pe20aJeHTyDx3qiBhD
 8RYXUgRZopH2bouBSzXgMQTbn/QMD/dC81WQlHKlt4swffyei2D/1pciOcuc0SXz
 +SZdkTre5JB5Kd6DU8zQ6PrrIt1nPmLSptSyhQvNxm+uWNWHnFcW1s2aYmf/ikjl
 4boW37ayJx09mns8yv7TerzEPbL5qJvVX8Dsnb6telkvrS9hy9S1xuIB5xHbt6/h
 vwFmCdwaZoGpDDaoXRL+9k9TOI9BbEMKX33nAPDqvEXLMIf+og4fmweTKcY4XTRL
 /Fdg1H71v8Ayv+r5TJOKwFg3PNNjnvqkbk1psS+aaW7dup43iaYGIKWy+VFaCufk
 hPXLOtR5lUsYC2qm+nkjPIgoP7D8oZx4AGkCHbYsqzi+l1lynZH3rBIs8ggLr72o
 FFk4g0sNYe1ccAa89jFEgWIQbS0N6ckUXCv12g3eyF/UIC1F35/mGGugSRnTXuc2
 a/WsvgU7pGBrtqXcg7lF
 =gsxL
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-2016-10-27-1' into staging

Merge qio 2016/10/27 v1

# gpg: Signature made Thu 27 Oct 2016 13:54:03 BST
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/pull-qio-2016-10-27-1:
  main: set names for main loop sources created
  vnc: set name for all I/O channels created
  migration: set name for all I/O channels created
  char: set name for all I/O channels created
  nbd: set name for all I/O channels created
  io: add ability to set a name for IO channels
  io: Add a QIOChannelSocket cleanup test
  io: set LISTEN flag explicitly for listen sockets
  io: Introduce a qio_channel_set_feature() helper
  io: Use qio_channel_has_feature() where applicable
  io: Fix double shift usages on QIOChannel features

Conflicts:
	qemu-char.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-28 15:30:55 +01:00
Paolo Bonzini 3fe7122337 aio: convert from RFifoLock to QemuRecMutex
It is simpler and a bit faster, and QEMU does not need the contention
callbacks (and thus the fairness) anymore.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-21-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28 21:50:18 +08:00
Paolo Bonzini feadec6384 qemu-thread: introduce QemuRecMutex
GRecMutex is new in glib 2.32, so we cannot use it.  Introduce
a recursive mutex in qemu-thread instead, which will be used
instead of RFifoLock.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-20-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28 21:50:18 +08:00
Paolo Bonzini 65c1b5b622 iothread: release AioContext around aio_poll
This is the first step towards having fine-grained critical sections in
dataplane threads, which will resolve lock ordering problems between
address_space_* functions (which need the BQL when doing MMIO, even
after we complete RCU-based dispatch) and the AioContext.

Because AioContext does not use contention callbacks anymore, the
unit test has to be changed.

Previously applied as a0710f7995 and
then reverted.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-19-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28 21:50:18 +08:00
Paolo Bonzini c9d1a56174 block: only call aio_poll on the current thread's AioContext
aio_poll is not thread safe; for example bdrv_drain can hang if
the last in-flight I/O operation is completed in the I/O thread after
the main thread has checked bs->in_flight.

The bug remains latent as long as all of it is called within
aio_context_acquire/aio_context_release, but this will change soon.

To fix this, if bdrv_drain is called from outside the I/O thread,
signal the main AioContext through a dummy bottom half.  The event
loop then only runs in the I/O thread.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-18-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28 21:50:18 +08:00
Paolo Bonzini 720150f318 block: prepare bdrv_reopen_multiple to release AioContext
After the next patch bdrv_drain_all will have to be called without holding any
AioContext.  Prepare to do this by adding an AioContext argument to
bdrv_reopen_multiple.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-15-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28 21:50:18 +08:00
Paolo Bonzini e437016511 aio: introduce qemu_get_current_aio_context
This will be used by BDRV_POLL_WHILE (and thus by bdrv_drain)
to choose how to wait for I/O completion.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-12-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28 21:50:18 +08:00
Paolo Bonzini 88b062c203 block: introduce BDRV_POLL_WHILE
We want the BDS event loop to run exclusively in the iothread that
owns the BDS's AioContext.  This macro will provide the synchronization
between the two event loops; for now it just wraps the common idiom
of a while loop around aio_poll.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1477565348-5458-8-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28 21:50:18 +08:00
Paolo Bonzini 9972354856 block: add BDS field to count in-flight requests
Unlike tracked_requests, this field also counts throttled requests,
and remains non-zero if an AIO operation needs a BH to be "really"
completed.

With this change, it is no longer necessary to have a dummy
BdrvTrackedRequest for requests that are never serialising, and
it is no longer necessary to poll the AioContext once after
bdrv_requests_pending(bs) returns false.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1477565348-5458-5-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28 21:50:18 +08:00
Paolo Bonzini bae8196d9f blockjob: introduce .drain callback for jobs
This is required to decouple block jobs from running in an
AioContext.  With multiqueue block devices, a BlockDriverState
does not really belong to a single AioContext.

The solution is to first wait until all I/O operations are
complete; then loop in the main thread for the block job to
complete entirely.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1477565348-5458-3-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28 21:50:18 +08:00
Peter Maydell fd209e4a77 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYEm6NAAoJEH3vgQaq/DkO+30P/0n5EYgIZQugvUXJvpu0xx7V
 EDqtgrirRTtKaoApqiY36U3WbWK+1pPdMcdJ2z4bI9+VyYkKxlEisgEeGy2E7S33
 qD1kV+L1NabBV8Ee577wxAZz3xl4MBh1pzCIXsySA2VRNg/W6L8hj4rTmjap1U9p
 ZtbLkmwMpSwTkJxWPG1W+k0klk1tYxmcwsWcCSCuSOTXm/0gBpWdy5gBRuQXVi0l
 DQFlKS6BDlRiCvR4Qix6n0v8VTQfbRMGS40M6tpr3/QH/HvoKhxfTS/g8P72Bk20
 DPNsKF9DBfTY3KCtjcSrPTREaMqFw8VXn5XSw1uE30ALZNHru9PpVS3hbLfGmltB
 HAVANMbqROFvkQghtGWD7f34Oks/bxzLKxEXPAs9stwvthV46KyJsMHuiSbuzJhv
 tOUq0MadEquuVvgDqoRYKrwyYrjsRRZ4z5kDDnOr2iGZK+Mrhq7jBuYuKcvHyQi0
 apd27X4wwQTx/9tavC+ujeuVxAWBlSSP1EVGSiIenlq21cHLowuZdqrt2swAYkCs
 VlUyOzdCO/62SJGcrnrRCj3sKWbPTySnmDZQKrHve4rBzcL28IHCRxIfzbXRBkQI
 kGigceOwIyNW/bnp6rSYoBFKpz1NF2VScr/t5JzknsC8gT/tA0wPDBoIeL/kPVHm
 T/qOTHLDY/fHUNwXOkTe
 =tz25
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging

# gpg: Signature made Thu 27 Oct 2016 22:15:57 BST
# gpg:                using RSA key 0x7DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/ide-pull-request:
  qemu-iotests: Test creating floppy drives
  fdc: Move qdev properties to FloppyDrive
  fdc: Add a floppy drive qdev
  fdc: Add a floppy qbus
  macio: switch over to new byte-aligned DMA helpers
  dma-helpers: explicitly pass alignment into DMA helpers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-28 14:29:50 +01:00
Bharata B Rao afdbd40356 spapr: Add DRC count indexed hotplug identifier type
Add support for DRC count indexed hotplug ID type which is primarily
needed for memory hot unplug. This type allows for specifying the
number of DRs that should be plugged/unplugged starting from a given
DRC index.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
* updated rtas_event_log_v6_hp to reflect count/index field ordering
  used in PAPR hotplug ACR
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 11:17:35 +11:00
Michael Roth ffbb1705a3 spapr_events: add support for dedicated hotplug event source
Hotplug events were previously delivered using an EPOW interrupt
and were queued by linux guests into a circular buffer. For traditional
EPOW events like shutdown/resets, this isn't an issue, but for hotplug
events there are cases where this buffer can be exhausted, resulting
in the loss of hotplug events, resets, etc.

Newer-style hotplug event are delivered using a dedicated event source.
We enable this in supported guests by adding standard an additional
event source in the guest device-tree via /event-sources, and, if
the guest advertises support for the newer-style hotplug events,
using the corresponding interrupt to signal the available of
hotplug/unplug events.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 11:17:35 +11:00
Michael Roth 417ece33fc spapr: improve ibm,architecture-vec-5 property handling
ibm,architecture-vec-5 is supposed to encode all option vector 5 bits
negotiated between platform/guest. Currently we hardcode this property
in the boot-time device tree to advertise a single negotiated
capability, "Form 1" NUMA Affinity, regardless of whether or not CAS
has been invoked or that capability has actually been negotiated.

Improve this by generating ibm,architecture-vec-5 based on the full
set of option vector 5 capabilities negotiated via CAS.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:26 +11:00
Michael Roth 6787d27b04 spapr: add option vector handling in CAS-generated resets
In some cases, ibm,client-architecture-support calls can fail. This
could happen in the current code for situations where the modified
device tree segment exceeds the buffer size provided by the guest
via the call parameters. In these cases, QEMU will reset, allowing
an opportunity to regenerate the device tree from scratch via
boot-time handling. There are potentially other scenarios as well,
not currently reachable in the current code, but possible in theory,
such as cases where device-tree properties or nodes need to be removed.

We currently don't handle either of these properly for option vector
capabilities however. Instead of carrying the negotiated capability
beyond the reset and creating the boot-time device tree accordingly,
we start from scratch, generating the same boot-time device tree as we
did prior to the CAS-generated and the same device tree updates as we
did before. This could (in theory) cause us to get stuck in a reset
loop. This hasn't been observed, but depending on the extensiveness
of CAS-induced device tree updates in the future, could eventually
become an issue.

Address this by pulling capability-related device tree
updates resulting from CAS calls into a common routine,
spapr_dt_cas_updates(), and adding an sPAPROptionVector*
parameter that allows us to test for newly-negotiated capabilities.
We invoke it as follows:

1) When ibm,client-architecture-support gets called, we
   call spapr_dt_cas_updates() with the set of capabilities
   added since the previous call to ibm,client-architecture-support.
   For the initial boot, or a system reset generated by something
   other than the CAS call itself, this set will consist of *all*
   options supported both the platform and the guest. For calls
   to ibm,client-architecture-support immediately after a CAS-induced
   reset, we call spapr_dt_cas_updates() with only the set
   of capabilities added since the previous call, since the other
   capabilities will have already been addressed by the boot-time
   device-tree this time around. In the unlikely event that
   capabilities are *removed* since the previous CAS, we will
   generate a CAS-induced reset. In the unlikely event that we
   cannot fit the device-tree updates into the buffer provided
   by the guest, well generate a CAS-induced reset.

2) When a CAS update results in the need to reset the machine and
   include the updates in the boot-time device tree, we call the
   spapr_dt_cas_updates() using the full set of negotiated
   capabilities as part of the reset path. At initial boot, or after
   a reset generated by something other than the CAS call itself,
   this set will be empty, resulting in what should be the same
   boot-time device-tree as we generated prior to this patch. For
   CAS-induced reset, this routine will be called with the full set of
   capabilities negotiated by the platform/guest in the previous
   CAS call, which should result in CAS updates from previous call
   being accounted for in the initial boot-time device tree.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Changed an int -> bool conversion to be more explicit]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:26 +11:00
Michael Roth facdb8b63b spapr_hcall: use spapr_ovec_* interfaces for CAS options
Currently we access individual bytes of an option vector via
ldub_phys() to test for the presence of a particular capability
within that byte. Currently this is only done for the "dynamic
reconfiguration memory" capability bit. If that bit is present,
we pass a boolean value to spapr_h_cas_compose_response()
to generate a modified device tree segment with the additional
properties required to enable this functionality.

As more capability bits are added, will would need to modify the
code to add additional option vector accesses and extend the
param list for spapr_h_cas_compose_response() to include similar
boolean values for these parameters.

Avoid this by switching to spapr_ovec_* helpers so we can do all
the parsing in one shot and then test for these additional bits
within spapr_h_cas_compose_response() directly.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:26 +11:00
Michael Roth b20b7b7add spapr_ovec: initial implementation of option vector helpers
PAPR guests advertise their capabilities to the platform by passing
an ibm,architecture-vec structure via an
ibm,client-architecture-support hcall as described by LoPAPR v11,
B.6.2.3. during early boot.

Using this information, the platform enables the capabilities it
supports, then encodes a subset of those enabled capabilities (the
5th option vector of the ibm,architecture-vec structure passed to
ibm,client-architecture-support) into the guest device tree via
"/chosen/ibm,architecture-vec-5".

The logical format of these these option vectors is a bit-vector,
where individual bits are addressed/documented based on the byte-wise
offset from the beginning of the bit-vector, followed by the bit-wise
index starting from the byte-wise offset. Thus the bits of each of
these bytes are stored in reverse order. Additionally, the first
byte of each option vector is encodes the length of the option vector,
so byte offsets begin at 1, and bit offset at 0.

This is not very intuitive for the purposes of mapping these bits to
a particular documented capability, so this patch introduces a set
of abstractions that encapsulate the work of parsing/encoding these
options vectors and testing for individual capabilities.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[dwg: Tweaked double-include protection to not trigger a checkpatch
 false positive]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:26 +11:00
David Gibson 398a0bd5ae pseries: Remove spapr_create_fdt_skel()
For historical reasons construction of the guest device tree in spapr is
divided between spapr_create_fdt_skel() which is called at init time, and
spapr_build_fdt() which runs at reset time.  Over time, more and more
things have needed to be moved to reset time.

Previous cleanups mean the only things left in spapr_create_fdt_skel() are
the properties of the root node itself.  Finish consolidating these two
parts of device tree construction, by moving this to the start of
spapr_build_fdt(), and removing spapr_create_fdt_skel() entirely.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson bf5a6696ba pseries: Consolidate construction of /vdevice device tree node
Construction of the /vdevice node (and its children) is divided between
spapr_create_fdt_skel() (at init time), which creates the base node, and
spapr_populate_vdevice() (at reset time) which creates the nodes for each
individual virtual device.

This consolidates both into a single function called from
spapr_build_fdt().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson ffb1e275a6 pseries: Move /event-sources construction to spapr_build_fdt()
The /event-sources device tree node is built from spapr_create_fdt_skel().
As part of consolidating device tree construction to reset time, this moves
it to spapr_build_fdt().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson 3f5dabceba pseries: Consolidate construction of /rtas device tree node
For historical reasons construction of the /rtas node in the device
tree (amongst others) is split into several places.  In particular
it's split between spapr_create_fdt_skel(), spapr_build_fdt() and
spapr_rtas_device_tree_setup().

In fact, as well as adding the actual RTAS tokens to the device tree,
spapr_rtas_device_tree_setup() just adds the ibm,lrdr-capacity
property, which despite going in the /rtas node, doesn't have a lot to
do with RTAS.

This patch consolidates the code constructing /rtas together into a new
spapr_dt_rtas() function.  spapr_rtas_device_tree_setup() is renamed to
spapr_dt_rtas_tokens() and now only adds the token properties.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson 7c866c6a60 pseries: Consolidate construction of /chosen device tree node
For historical reasons, building the /chosen node in the guest device tree
is split across several places and includes both parts which write the DT
sequentially and others which use random access functions.

This patch consolidates construction of the node into one place, using
random access functions throughout.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson 9b9a19080a pseries: Move construction of /interrupt-controller fdt node
Currently the device tree node for the XICS interrupt controller is in
spapr_create_fdt_skel().  As part of consolidating device tree construction
to reset time, this moves it to a function called from spapr_build_fdt().

In addition we move the actual code into hw/intc/xics_spapr.c with the
rest of the PAPR specific interrupt controller code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson 2cac78c12a pseries: Consolidate RTAS loading
At each system reset, the pseries machine needs to load RTAS (the runtime
portion of the guest firmware) into the VM.  This means copying
the actual RTAS code into guest memory, and also updating the device
tree so that the guest OS and boot firmware can locate it.

For historical reasons the copy and update to the device tree were in
different parts of the code.  This cleanup brings them both together in
an spapr_load_rtas() function.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson a19f7fb045 pseries: Make spapr_create_fdt_skel() get information from machine state
Currently spapr_create_fdt_skel() takes a bunch of individual parameters
for various things it will put in the device tree.  Some of these can
already be taken directly from sPAPRMachineState.  This patch alters it so
that all of them can be taken from there, which will allow this code to
be moved away from its current caller in future.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:25 +11:00
David Gibson cae172ab6d pseries: Remove rtas_addr and fdt_addr fields from machinestate
These values are used only within ppc_spapr_reset(), so just change them
to local variables.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:25 +11:00
Cédric Le Goater 3495b6b610 ppc/pnv: add a ISA bus
As Qemu only supports a single instance of the ISA bus, we use the LPC
controller of chip 0 to create one and plug in a couple of useful
devices, like an UART and RTC. An IPMI BT device, which is also an ISA
device, can be defined on the command line to connect an external BMC.
That is for later.

The PowerNV machine now has a console. Skiboot should load a kernel
and jump into it but execution will stop quite early because we lack a
model for the native XICS controller for the moment :

    [    0.000000] NR_IRQS:512 nr_irqs:512 16
    [    0.000000] XICS: Cannot find a Presentation Controller !
    [    0.000000] ------------[ cut here ]------------
    [    0.000000] WARNING: at arch/powerpc/platforms/powernv/setup.c:81
    ...
    [    0.000000] NIP [c00000000079d65c] pnv_init_IRQ+0x30/0x44

You can still do a few things under xmon.

Based on previous work from :
      Benjamin Herrenschmidt <benh@kernel.crashing.org>

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Trivial fix for a change in the serial_hds_isa_init() interface]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Benjamin Herrenschmidt a3980bf517 ppc/pnv: add a LPC controller
The LPC (Low Pin Count) interface on a POWER8 is made accessible to
the system through the ADU (XSCOM interface). This interface is part
of set of units connected together via a local OPB (On-Chip Peripheral
Bus) which act as a bridge between the ADU and the off chip LPC
endpoints, like external flash modules.

The most important units of this OPB are :
 - OPB Master: contains the ADU slave logic, a set of internal
   registers and the logic to control the OPB.
 - LPCHC (LPC HOST Controller): which implements a OPB Slave, a set of
   internal registers and the LPC HOST Controller to control the LPC
   interface.

Four address spaces are provided to the ADU :
 - LPC Bus Firmware Memory
 - LPC Bus Memory
 - LPC Bus I/O (ISA bus)
 - and the registers for the OPB Master and the LPC Host Controller

On POWER8, an intermediate hop is necessary to reach the OPB, through
a unit called the ECCB. OPB commands are simply mangled in ECCB write
commands.

On POWER9, the OPB master address space can be accessed via MMIO. The
logic is same but the code will be simpler as the XSCOM and ECCB hops
are not necessary anymore.

This version of the LPC controller model doesn't yet implement support
for the SerIRQ deserializer present in the Naples version of the chip
though some preliminary work is there.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.7
      - ported on latest PowerNV patchset
      - changed the XSCOM interface to fit new model
      - QOMified the model
      - moved the ISA hunks in another patch
      - removed printf logging
      - added a couple of UNIMP logging
      - rewrote commit log ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater 24ece07250 ppc/pnv: add XSCOM handlers to PnvCore
Now that we are using real HW ids for the cores in PowerNV chips, we
can route the XSCOM accesses to them. We just need to attach a
specific XSCOM memory region to each core in the appropriate window
for the core number.

To start with, let's install the DTS (Digital Thermal Sensor) handlers
which should return 38°C for each core.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater 967b75230b ppc/pnv: add XSCOM infrastructure
On a real POWER8 system, the Pervasive Interconnect Bus (PIB) serves
as a backbone to connect different units of the system. The host
firmware connects to the PIB through a bridge unit, the
Alter-Display-Unit (ADU), which gives him access to all the chiplets
on the PCB network (Pervasive Connect Bus), the PIB acting as the root
of this network.

XSCOM (serial communication) is the interface to the sideband bus
provided by the POWER8 pervasive unit to read and write to chiplets
resources. This is needed by the host firmware, OPAL and to a lesser
extent, Linux. This is among others how the PCI Host bridges get
configured at boot or how the LPC bus is accessed.

To represent the ADU of a real system, we introduce a specific
AddressSpace to dispatch XSCOM accesses to the targeted chiplets. The
translation of an XSCOM address into a PCB register address is
slightly different between the P9 and the P8. This is handled before
the dispatch using a 8byte alignment for all.

To customize the device tree, a QOM InterfaceClass, PnvXScomInterface,
is provided with a populate() handler. The chip populates the device
tree by simply looping on its children. Therefore, each model needing
custom nodes should not forget to declare itself as a child at
instantiation time.

Based on previous work done by :
      Benjamin Herrenschmidt <benh@kernel.crashing.org>

Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Added cpu parameter to xscom_complete()]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater d2fd9612ee ppc/pnv: add a PnvCore object
This is largy inspired by sPAPRCPUCore with some simplification, no
hotplug for instance. A set of PnvCore objects is added to the PnvChip
and the device tree is populated looping on these cores.

Real HW cpu ids are now generated depending on the chip cpu model, the
chip id and a core mask. The id is propagated to the CPU object, using
properties, to set the SPR_PIR (Processor Identification Register)

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater 631adaff31 ppc/pnv: add a PIR handler to PnvChip
The Processor Identification Register (PIR) is a register that holds a
processor identifier which is used for bus transactions (XSCOM) and
for processor differentiation in multiprocessor systems. It also used
in the interrupt vector entries (IVE) to identify the thread serving
the interrupts.

P9 and P8 have some differences in the CPU PIR encoding.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater 397a79e757 ppc/pnv: add a core mask to PnvChip
This will be used to build real HW ids for the cores and enforce some
limits on the available cores per chip.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater e997040e3f ppc/pnv: add a PnvChip object
This is is an abstraction of a POWER8 chip which is a set of cores
plus other 'units', like the pervasive unit, the interrupt controller,
the memory controller, the on-chip microcontroller, etc. The whole can
be seen as a socket. It depends on a cpu model and its characteristics:
max cores and specific inits are defined in a PnvChipClass.

We start with an near empty PnvChip with only a few cpu constants
which we will grow in the subsequent patches with the controllers
required to run the system.

The Chip CFAM (Common FRU Access Module) ID gives the model of the
chip and its version number. It is generally the first thing firmwares
fetch, available at XSCOM PCB address 0xf000f, to start initialization.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Benjamin Herrenschmidt 9e933f4a62 ppc/pnv: add skeleton PowerNV platform
The goal is to emulate a PowerNV system at the level of the skiboot
firmware, which loads the OS and provides some runtime services. Power
Systems have a lower firmware (HostBoot) that does low level system
initialization, like DRAM training. This is beyond the scope of what
qemu will address in a PowerNV guest.

No devices yet, not even an interrupt controller. Just to get started,
some RAM to load the skiboot firmware, the kernel and initrd. The
device tree is fully created in the machine reset op.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.7
      - replaced fprintf by error_report
      - used a common definition of _FDT macro
      - removed VMStateDescription as migration is not yet supported
      - added IBM Copyright statements
      - reworked kernel_filename handling
      - merged PnvSystem and sPowerNVMachineState
      - removed PHANDLE_XICP
      - added ppc_create_page_sizes_prop helper
      - removed nmi support
      - removed kvm support
      - updated powernv machine to version 2.8
      - removed chips and cpus, They will be provided in another patches
      - added a machine reset routine to initialize the device tree (also)
      - french has a squelette and english a skeleton.
      - improved commit log.
      - reworked prototypes parameters
      - added a check on the ram size (thanks to Michael Ellerman)
      - fixed chip-id cell
      - changed MAX_CPUS to 2048
      - simplified memory node creation to one node only
      - removed machine version
      - rewrote the device tree creation with the fdt "rw" routines
      - s/sPowerNVMachineState/PnvMachineState/
      - etc.]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:24 +11:00
David Gibson e763da2344 pseries: Remove unused callbacks from sPAPR VIO bus state
The original QOMification of the spapr VIO devices in 3954d33 "spapr:
convert to QEMU Object Model (v2)" moved some callbacks from the
VIOsPAPRBus structure to the VIOsPAPRDeviceClass.  Except, that it
forgot to actually remove them from the VIOsPAPRBus structure (which
still exists, though it doesn't fulfill quite the same function as it
did pre-QOM).

This patch removes those now unused callback fields.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-10-28 09:36:58 +11:00
Cédric Le Goater e3403258a2 ppc/xics: change the icp_ routines API to use an 'ICPState *' argument
The routines :

	void icp_set_cppr(ICPState *icp, uint8_t cppr);
	void icp_set_mfrr(ICPState *icp, uint8_t mfrr);
	void icp_eoi(ICPState *icp, uint32_t xirr);

now use one 'ICPState *icp' argument instead of a 'XICSState *' and a
server arguments. The backlink on XICSState* is used whenever needed.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Cédric Le Goater d49c603b37 ppc/xics: add a XICSState backlink in ICPState
The link will be used to change the API of the icp_* routines which
are still using an XICSState as an argument.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Cédric Le Goater 2bb0d10aeb ppc/xics: add a xics_set_nr_servers common routine
xics_spapr and xics_kvm nearly define the same 'set_nr_servers'
handler. Only the type of the ICP differs. So let's make a common one
to remove some duplicated code.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Thomas Huth c6363bae17 nvram: Rename openbios_firmware_abi.h into sun_nvram.h
The header now only contains inline functions related to the
Sun NVRAM, so the a name like sun_nvram.h seems to be more
appropriate now.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Thomas Huth ad723fe5a0 nvram: Move the remaining CHRP NVRAM related code to chrp_nvram.[ch]
Everything that is related to CHRP NVRAM should rather reside in
chrp_nvram.c / chrp_nvram.h instead of openbios_firmware_abi.h.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Thomas Huth 55d9950aaa nvram: Introduce helper functions for CHRP "system" and "free space" partitions
The "system partition" and "free space" partition layouts are
defined by the CHRP and LoPAPR specification, and used by
OpenBIOS and SLOF. We can re-use this code for other machines
that use OpenBIOS and SLOF, too. So let's make this code independent
from the MAC NVRAM environment and put it into two proper helper
functions.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Mark Cave-Ayland 99868af3d0 dma-helpers: explicitly pass alignment into DMA helpers
The hard-coded default alignment is BDRV_SECTOR_SIZE, however this is not
necessarily the case for all platforms. Use this as the default alignment for
all current callers.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-id: 1476445266-27503-2-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: John Snow <jsnow@redhat.com>
2016-10-27 16:29:13 -04:00
Kevin Wolf cbc14ac9c3 block: Remove bdrv_aio_ioctl()
It is unused now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-27 19:05:23 +02:00
Kevin Wolf 16a389dc9e block: Introduce .bdrv_co_ioctl() driver callback
This allows drivers to implement ioctls in a coroutine-based way.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-27 19:05:23 +02:00
Kevin Wolf 61b2450414 block: Remove bdrv_ioctl()
It is unused now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-27 19:05:23 +02:00
Kevin Wolf 48af776a5b block: Use blk_co_ioctl() for all BB level ioctls
All read/write functions already have a single coroutine-based function
on the BlockBackend level through which all requests go (no matter what
API style the external caller used) and which passes the requests down
to the block node level.

This patch exports a bdrv_co_ioctl() function and uses it to extend this
mode of operation to ioctls.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-27 19:05:22 +02:00
Kevin Wolf 7381e95cc2 block: Remove bdrv_aio_pdiscard()
It is unused now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-27 19:05:22 +02:00
Daniel P. Berrange e93a68e102 char: set name for all I/O channels created
Ensure that all I/O channels created for character devices
are given names to distinguish their respective roles.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-27 09:13:10 +02:00
Daniel P. Berrange 20f4aa265e io: add ability to set a name for IO channels
The GSource object has ability to have a name, which is useful
when debugging performance problems with the mainloop event
callbacks that take too long. By associating a name with a
QIOChannel object, we can then set the name on any GSource
associated with the channel.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-27 09:13:10 +02:00
Felipe Franciosi d8d3c7cc67 io: Introduce a qio_channel_set_feature() helper
Testing QIOChannel feature support can be done with a helper called
qio_channel_has_feature(). Setting feature support, however, was
done manually with a logical OR. This patch introduces a new helper
called qio_channel_set_feature() and makes use of it where applicable.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-26 18:19:53 +02:00
Felipe Franciosi 8fbf661212 io: Fix double shift usages on QIOChannel features
When QIOChannels were introduced in 666a3af9, the feature bits were
already defined shifted. However, when using them, the code was shifting
them again. The incorrect use was consistent until 74b6ce43, where
QIO_CHANNEL_FEATURE_LISTEN was defined shifted but tested unshifted.

This patch changes the definition to be unshifted and fixes the
incorrect usage introduced on 74b6ce43.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-26 18:19:53 +02:00
Richard Henderson 7ebee43ee3 tcg: Add atomic128 helpers
Force the use of cmpxchg16b on x86_64.

Wikipedia suggests that only very old AMD64 (circa 2004) did not have
this instruction.  Further, it's required by Windows 8 so no new cpus
will ever omit it.

If we truely care about these, then we could check this at startup time
and then avoid executing paths that use it.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson fdbc2b5722 tcg: Add EXCP_ATOMIC
When we cannot emulate an atomic operation within a parallel
context, this exception allows us to stop the world and try
again in a serial context.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson 1edaeee095 int128: Add int128_make128
Allows Int128 to be used more generally, rather than having to
begin with 64-bit inputs and accumulate.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson 0846beb366 int128: Use __int128 if available
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson 258dfaaad0 exec: Avoid direct references to Int128 parts
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson 84bca3927b atomics: Add __nocheck atomic operations
While the check against sizeof(void *) is appropriate for
normal usage within qemu, there are places in which we want
wider operaions and have checked for their existance.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:28:57 -07:00
Emilio G. Cota 83d0c719f8 atomics: add atomic_op_fetch variants
This paves the way for upcoming work.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-9-git-send-email-cota@braap.org>
2016-10-26 08:28:57 -07:00
Emilio G. Cota 61696ddbdc atomics: add atomic_xor
This paves the way for upcoming work.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-8-git-send-email-cota@braap.org>
2016-10-26 08:28:56 -07:00
Richard Henderson d1a9f2d12f atomics: Add parameters to macros
Making these functional rather than object macros will
prevent later problems with complex macro expansion.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:28:46 -07:00
Daniel P. Berrange 603476c25c qdict: implement a qdict_crumple method for un-flattening a dict
The qdict_flatten() method will take a dict whose elements are
further nested dicts/lists and flatten them by concatenating
keys.

The qdict_crumple() method aims to do the reverse, taking a flat
qdict, and turning it into a set of nested dicts/lists. It will
apply nesting based on the key name, with a '.' indicating a
new level in the hierarchy. If the keys in the nested structure
are all numeric, it will create a list, otherwise it will create
a dict.

If the keys are a mixture of numeric and non-numeric, or the
numeric keys are not in strictly ascending order, an error will
be reported.

As an example, a flat dict containing

 {
   'foo.0.bar': 'one',
   'foo.0.wizz': '1',
   'foo.1.bar': 'two',
   'foo.1.wizz': '2'
 }

will get turned into a dict with one element 'foo' whose
value is a list. The list elements will each in turn be
dicts.

 {
   'foo': [
     { 'bar': 'one', 'wizz': '1' },
     { 'bar': 'two', 'wizz': '2' }
   ],
 }

If the key is intended to contain a literal '.', then it must
be escaped as '..'. ie a flat dict

  {
     'foo..bar': 'wizz',
     'bar.foo..bar': 'eek',
     'bar.hello': 'world'
  }

Will end up as

  {
     'foo.bar': 'wizz',
     'bar': {
        'foo.bar': 'eek',
        'hello': 'world'
     }
  }

The intent of this function is that it allows a set of QemuOpts
to be turned into a nested data structure that mirrors the nesting
used when the same object is defined over QMP.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1475246744-29302-3-git-send-email-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Parameter recursive dropped along with its tests; whitespace style
touched up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-10-25 17:56:14 +02:00
Daniel P. Berrange 7d5e199ade qapi: rename QmpOutputVisitor to QObjectOutputVisitor
The QmpOutputVisitor has no direct dependency on QMP. It is
valid to use it anywhere that one wants a QObject. Rename it
to better reflect its functionality as a generic QAPI
to QObject converter.

The commit before previous renamed the files, this one renames C
identifiers.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1475246744-29302-6-git-send-email-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Split into file rename and identifier rename]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-10-25 16:25:54 +02:00
Daniel P. Berrange 09e68369a8 qapi: rename QmpInputVisitor to QObjectInputVisitor
The QmpInputVisitor has no direct dependency on QMP. It is
valid to use it anywhere that one has a QObject. Rename it
to better reflect its functionality as a generic QObject
to QAPI converter.

The previous commit renamed the files, this one renames C identifiers.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1475246744-29302-5-git-send-email-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Straightforwardly rebased, split into file and identifier rename]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-10-25 16:25:54 +02:00
Daniel P. Berrange b3db211f3c qapi: rename *qmp-*-visitor* to *qobject-*-visitor*
The QMP visitors have no direct dependency on QMP. It is
valid to use them anywhere that one has a QObject. Rename them
to better reflect their functionality as a generic QObject
to QAPI converter.

This is the first of three parts: rename the files.  The next two
parts will rename C identifiers.  The split is necessary to make git
rename detection work.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Split into file and identifier rename, two comments touched up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-10-25 16:25:48 +02:00
Peter Maydell c43e853afe x86 and CPU queue, 2016-10-24
x2APIC support to APIC code, cpu_exec_init() refactor on all
 architectures, and other x86 changes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYDmYyAAoJECgHk2+YTcWmoSUP/2ga+b9YmPuyL7XC+12pff0I
 Z8gdjUzbMUNcCI0JMZCTGUJbs3BapLcnsA7ypmt88s9kG02WeDMhNx1BfYiAFgLU
 kPLQlXAM7awEdGagd3sTCiFojSUZ7GxYHjd5fuhPoOAXvXM8im6zJl18ZcsnStjO
 /J8JGoGDHq1XJlz+RIjnGamojJWCiO/+iiD+rFmVSic8zjHPDYq14sIk/QJX+DaF
 azLiOI6DAlX3kyrN5ZshhIRQ3COzzUMUSDF/ZaYHjudUco5MBnwj/oLQniTq+ZUd
 hCu7dr5TpLxI7q1yltyd0UIl/+aZGbE8tEvoXAtc735iK4m2CTckT7ql6x3xI+Ir
 PmpPgIswHqfCiCXm8imLj6ZI47kRA1x4x4AudLaNVKP7jO82485sS9HWpOadYsaU
 jvek2SqfqvH+vce4FzwlLEcXGDb73MT/XkIUvd7SfPIbs9umgdZc03U4SHfAWr0i
 lAIRs4Ym0AAS2WSE4E09wvdUUr9oxaQBMhw3JAiNmg7hLfyINTP+D/IhtlAVXXEA
 F9D7fky5lDwfKvIwPxPJbDD5bCBV9AmxhiahIhv3epu4Kg4orf1inkrx0IZWSbB0
 7+JZ7j8asuizfibkeZAN9rxVwmz32makJNsnjzZHlnaPxTvIDzvRkNceBnhC5vKq
 3yfxgl4agXmMjveraAtt
 =T2kg
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

x86 and CPU queue, 2016-10-24

x2APIC support to APIC code, cpu_exec_init() refactor on all
architectures, and other x86 changes.

# gpg: Signature made Mon 24 Oct 2016 20:51:14 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  exec: call cpu_exec_exit() from a CPU unrealize common function
  exec: move cpu_exec_init() calls to realize functions
  exec: split cpu_exec_init()
  pc: q35: Bump max_cpus to 288
  pc: Require IRQ remapping and EIM if there could be x2APIC CPUs
  pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs
  Increase MAX_CPUMASK_BITS from 255 to 288
  pc: Clarify FW_CFG_MAX_CPUS usage comment
  pc: kvm_apic: Pass APIC ID depending on xAPIC/x2APIC mode
  pc: apic_common: Reset APIC ID to initial ID when switching into x2APIC mode
  pc: apic_common: Restore APIC ID to initial ID on reset
  pc: apic_common: Extend APIC ID property to 32bit
  pc: Leave max apic_id_limit only in legacy cpu hotplug code
  acpi: cphp: Force switch to modern cpu hotplug if APIC ID > 254
  pc: acpi: x2APIC support for SRAT table
  pc: acpi: x2APIC support for MADT table and _MAT method

Conflicts:
	target-arm/cpu.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-25 10:25:27 +01:00
Laurent Vivier 7bbc124e7e exec: call cpu_exec_exit() from a CPU unrealize common function
As cpu_exec_exit() mirrors the cpu_exec_realizefn(),
rename it as cpu_exec_unrealizefn().

Create and register a cpu_common_unrealizefn() function for
the CPU device class and call cpu_exec_unrealizefn() from
this function.

Remove cpu_exec_exit() from cpu_common_finalize()
(which mirrors init, not realize), and as x86_cpu_unrealizefn()
and ppc_cpu_unrealizefn() overwrite the device class unrealize function,
add a call to a parent_unrealize pointer.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:16 -02:00
Laurent Vivier ce5b1bbf62 exec: move cpu_exec_init() calls to realize functions
Modify all CPUs to call it from XXX_cpu_realizefn() function.

Remove all the cannot_destroy_with_object_finalize_yet as
unsafe references have been moved to cpu_exec_realizefn().
(tested with QOM command provided by commit 4c315c27)

for arm:

Setting of cpu->mp_affinity is moved from arm_cpu_initfn()
to arm_cpu_realizefn() as setting of cpu_index is now done
in cpu_exec_realizefn(). To avoid to overwrite an user defined
value, we set it to an invalid value by default, and update
it in realize function only if the value is still invalid.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:16 -02:00
Laurent Vivier 39e329e341 exec: split cpu_exec_init()
Put in cpu_exec_initfn() what initializes the CPU,
and leave in cpu_exec_init() what adds it to the environment.

As cpu_exec_initfn() is called by all XX_cpu_initfn(), call it
directly in cpu_common_initfn().
cpu_exec_init() is now a realize function, it will be renamed
to cpu_exec_realizefn() and moved to the XX_cpu_realizefn()
function in a following patch.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:16 -02:00
Igor Mammedov 080ac219cc pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs
Currently firmware uses 1 byte at 0x5F offset in RTC CMOS
to get number of CPUs present at boot. However 1 byte is
not enough to handle more than 255 CPUs.  So add a new
fw_cfg file that would allow QEMU to tell it.
For compat reasons add file only for machine types that
support more than 255 CPUs.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:15 -02:00
Igor Mammedov 079019f2e3 Increase MAX_CPUMASK_BITS from 255 to 288
so that it would be possible to increase maxcpus limit
for x86 target. Keep spapr/virt_arm at limit they used
to have 255.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:15 -02:00
Igor Mammedov 33d7a28829 pc: apic_common: Extend APIC ID property to 32bit
ACPI ID is 32 bit wide on CPUs with x2APIC support.
Extend 'id' property to support it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:15 -02:00
Igor Mammedov 5eff33a2a1 pc: acpi: x2APIC support for SRAT table
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:14 -02:00
Igor Mammedov e2c9593945 pc: acpi: x2APIC support for MADT table and _MAT method
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:14 -02:00
Peter Maydell fe4c04071f target-arm queue:
* support variable (runtime-determined) page sizes, for a
    nearly-20% speedup of TCG for ARMv7 and v8 CPUs with 4K pages
  * ptimer: add tests, support more flexible behaviour around
    what happens on the "zero" tick, use ptimer for a9gtimer
  * virt: ACPI: Add IORT Structure definition
  * i2c: Fix SMBus read transactions to avoid double events
  * timer: stm32f2xx_timer: add check for prescaler value
  * QOMify musicpal, pxa2xx_gpio, strongarm, pl110
  * target-arm: Implement new HLT trap for semihosting
  * i2c: Add asserts for second smbus i2c_start_transfer()
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJYDkPBAAoJEDwlJe0UNgzeUgYP/1coSoKGgxsKkLp8KRZSinoP
 k8J/GQCvPi7Ha1ntonPQ2vSrs11EdGlNaQWQo8Iq1KVnWx0K+fJJfhtqNXnzk7gN
 6xOhR+c3vZhCCoPkTJgY5yMF7LsMwOgYLmuk5hzxGHb7IRteHCaUCG5mCQBOOK39
 kIuXu0UlYo/cPYcXLZkHpRIQAHWWwXvfthAWHoYvdVFMmBUnneFZgPlFOYCAG7X6
 L9sEbpH5ZG2ttUXTYGtrBw1WfkVrKY9rwG2xZqu3yqQfGMDQFwHP1Q4pB7Z+SpoE
 H6mI7DBUnvHjcscg/H6/LUmMfJ3pL8qS7NtGz1AP9ArU2/Zk+MO7YJEcRBiYwtXY
 Z/LbYTyU7Ellrd5t2hAe1zxDiGh3TcRRCVuFZdyRhisyloKbq+/WmAcsTkuO5NRd
 p1hHBoPvwYEhEHXcoeq5uOGtwxrr4dl736wQ4vEhNyzbCfpcytEePjbxNdE44iVM
 VTcmsI0S6aAZxDv8a07dPn4BlFVXt/nzuhXFFDnealjVqOD5Pe4aG9oFTgHsJkdR
 bt1Z3gmjH6jCmUTTaMVqY696u8NoRxJdk7OMBLMHYmKuLzGJ70pqy2kinYOZXqxW
 yLziS++qTlZwzHqVxDNsnLiRoRn2Jo1kb/hoBIkvJ+TgoxqfDQjGBvVQLHHjbu/Z
 FKOCN6lnBNL41R4JigR7
 =nDCf
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20161024' into staging

target-arm queue:
 * support variable (runtime-determined) page sizes, for a
   nearly-20% speedup of TCG for ARMv7 and v8 CPUs with 4K pages
 * ptimer: add tests, support more flexible behaviour around
   what happens on the "zero" tick, use ptimer for a9gtimer
 * virt: ACPI: Add IORT Structure definition
 * i2c: Fix SMBus read transactions to avoid double events
 * timer: stm32f2xx_timer: add check for prescaler value
 * QOMify musicpal, pxa2xx_gpio, strongarm, pl110
 * target-arm: Implement new HLT trap for semihosting
 * i2c: Add asserts for second smbus i2c_start_transfer()

# gpg: Signature made Mon 24 Oct 2016 18:24:17 BST
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20161024: (32 commits)
  i2c: Add asserts for second smbus i2c_start_transfer()
  target-arm: Implement new HLT trap for semihosting
  hw/display: QOM'ify pl110.c
  hw/arm: QOM'ify strongarm.c
  hw/arm: QOM'ify pxa2xx_gpio.c
  hw/arm: QOM'ify musicpal.c
  timer: stm32f2xx_timer: add check for prescaler value
  i2c: Fix SMBus read transactions to avoid double events
  timer: a9gtimer: remove loop to auto-increment comparator
  ARM: Virt: ACPI: Build an IORT table with RC and ITS nodes
  ACPI: Add IORT Structure definition
  tests: Add tests for the ARM MPTimer
  arm_mptimer: Convert to use ptimer
  tests: ptimer: Replace 10000 with 1
  tests: ptimer: Change the copyright comment
  tests: ptimer: Add tests for "no counter round down" policy
  hw/ptimer: Add "no counter round down" policy
  tests: ptimer: Add tests for "no immediate reload" policy
  hw/ptimer: Add "no immediate reload" policy
  tests: ptimer: Add tests for "no immediate trigger" policy
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 19:37:34 +01:00
Fam Zheng 6d3f4049ba block: More operations for meta dirty bitmap
Callers can create an iterator of meta bitmap with
bdrv_dirty_meta_iter_new(), then use the bdrv_dirty_iter_* operations on
it. Meta iterators are also counted by bitmap->active_iterators.

Also add a couple of functions to retrieve granularity and count.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-11-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-10-24 17:56:07 +02:00
Vladimir Sementsov-Ogievskiy 882c36f590 block: BdrvDirtyBitmap serialization interface
Several functions to provide necessary access to BdrvDirtyBitmap for
block-migration.c

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[Add the "finish" parameters. - Fam]
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-9-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-10-24 17:56:07 +02:00
Vladimir Sementsov-Ogievskiy 8258888e22 hbitmap: serialization
Functions to serialize / deserialize(restore) HBitmap. HBitmap should be
saved to linear sequence of bits independently of endianness and bitmap
array element (unsigned long) size. Therefore Little Endian is chosen.

These functions are appropriate for dirty bitmap migration, restoring
the bitmap in several steps is available. To save performance, every
step writes only the last level of the bitmap. All other levels are
restored by hbitmap_deserialize_finish() as a last step of restoring.
So, HBitmap is inconsistent while restoring.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[Fix left shift operand to 1UL; add "finish" parameter. - Fam]
Signed-off-by: Fam Zheng <famz@redhat.com>

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-8-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-10-24 17:56:07 +02:00
Fam Zheng 15891fac7d block: Add two dirty bitmap getters
For dirty bitmap users to get the size and the name of a
BdrvDirtyBitmap.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-6-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-10-24 17:56:07 +02:00
Fam Zheng fb933437de block: Support meta dirty bitmap
The added group of operations enables tracking of the changed bits in
the dirty bitmap.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-5-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-10-24 17:56:07 +02:00
Fam Zheng 07ac4cdb57 HBitmap: Introduce "meta" bitmap to track bit changes
Upon each bit toggle, the corresponding bit in the meta bitmap will be
set.

Signed-off-by: Fam Zheng <famz@redhat.com>
[Amended text inline. --js]
Reviewed-by: Max Reitz <mreitz@redhat.com>

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-3-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-10-24 17:56:07 +02:00
Fam Zheng dc162c8e4f block: Hide HBitmap in block dirty bitmap interface
HBitmap is an implementation detail of block dirty bitmap that should be hidden
from users. Introduce a BdrvDirtyBitmapIter to encapsulate the underlying
HBitmapIter.

A small difference in the interface is, before, an HBitmapIter is initialized
in place, now the new BdrvDirtyBitmapIter must be dynamically allocated because
the structure definition is in block/dirty-bitmap.c.

Two current users are converted too.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-2-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-10-24 17:56:07 +02:00
Prem Mallappa 16fc326a55 ACPI: Add IORT Structure definition
ACPI Spec 6.0 introduces IO Remapping Table Structure. This patch
introduces the definitions required to describe the IO relationship
between the PCIe root complex and the ITS.

This conforms to:
"IO Remapping Table System Software on ARM Platforms",
Document number: ARM DEN 0049B, October 2015.

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1476707466-14300-2-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 16:26:54 +01:00
Dmitry Osipenko 226fb5aaff arm_mptimer: Convert to use ptimer
Current ARM MPTimer implementation uses QEMUTimer for the actual timer,
this implementation isn't complete and mostly tries to duplicate of what
generic ptimer is already doing fine.

Conversion to ptimer brings the following benefits and fixes:
	- Simple timer pausing implementation
	- Fixes counter value preservation after stopping the timer
	- Properly handles prescaler != 0 / counter = 0 / load = 0 cases
	- Code simplification and reduction

Bump VMSD to version 3, since VMState is changed and is not compatible
with the previous implementation.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 37f378c33bb5a28d5cd71167a6bd5bff5e59cbc3.1475421224.git.digetx@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 16:26:53 +01:00
Dmitry Osipenko 5580ea4576 hw/ptimer: Add "no counter round down" policy
For most of the timers counter starts to decrement after first period
expires. Due to rounding down performed by the ptimer_get_count, it returns
counter - 1 for the running timer, so that for the ptimer user it looks
like counter gets decremented immediately after running the timer. Add "no
counter round down" policy that provides correct behaviour for those timers.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Message-id: ef39622d0ebfdc32a0877e59ffdf6910dc3db688.1475421224.git.digetx@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 16:26:52 +01:00
Dmitry Osipenko 3f6e6a13c1 hw/ptimer: Add "no immediate reload" policy
Immediate counter re-load on setting (or on starting to run with)
counter = 0 is a wrong behaviour for some of the timers. Add "no
immediate reload" policy that provides correct behaviour for such timers.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Message-id: bf9385cd2550ca451d564fa46007688cee3f3d9d.1475421224.git.digetx@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 16:26:52 +01:00
Dmitry Osipenko 22471b8a0f hw/ptimer: Add "no immediate trigger" policy
Performing trigger on setting (or starting to run with) counter = 0 could
be a wrong behaviour for some of the timers, provide "no immediate trigger"
policy to maintain correct behaviour for such timers.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Message-id: 72c0319cf2ec599f22397b7da280c06c34dc40dd.1475421224.git.digetx@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 16:26:51 +01:00
Dmitry Osipenko ef0a9984aa hw/ptimer: Add "continuous trigger" policy
Currently, periodic timer that has load = delta = 0 performs trigger
on timer reload and stops, printing a "period zero" error message.
Introduce new policy that makes periodic timer to continuously trigger
with a period interval in case of load = 0.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Message-id: 632b23dd11055d9bd5e338d66b38fac0bd51462e.1475421224.git.digetx@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 16:26:51 +01:00
Dmitry Osipenko 2b5c0322b7 hw/ptimer: Add "wraparound after one period" policy
Currently, periodic counter wraps around immediately once counter reaches
"0", this is wrong behaviour for some of the timers, resulting in one period
being lost. Add new ptimer policy that provides correct behaviour for such
timers, so that counter stays with "0" for a one period before wrapping
around.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Message-id: f22a670cf1f4be298b31640cb5f4be1df0f20ab6.1475421224.git.digetx@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 16:26:50 +01:00
Peter Maydell 20bccb82ff cpu: Support a target CPU having a variable page size
Support target CPUs having a page size which isn't knownn
at compile time. To use this, the CPU implementation should:
 * define TARGET_PAGE_BITS_VARY
 * not define TARGET_PAGE_BITS
 * define TARGET_PAGE_BITS_MIN to the smallest value it
   might possibly want for TARGET_PAGE_BITS
 * call set_preferred_target_page_bits() in its realize
   function to indicate the actual preferred target page
   size for the CPU (and report any error from it)

In CONFIG_USER_ONLY, the CPU implementation should continue
to define TARGET_PAGE_BITS appropriately for the guest
OS page size.

Machines which want to take advantage of having the page
size something larger than TARGET_PAGE_BITS_MIN must
set the MachineClass minimum_page_bits field to a value
which they guarantee will be no greater than the preferred
page size for any CPU they create.

Note that changing the target page size by setting
minimum_page_bits is a migration compatibility break
for that machine.

For debugging purposes, attempts to use TARGET_PAGE_SIZE
before it has been finally confirmed will assert.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-24 16:26:49 +01:00
Marc-André Lureau 82878dac6f char: remove explicit_be_open from CharDriverState
It's only used in qmp_chardev_add(), so use a create() argument instead.

Also switched to typedef functions for CharDriverParse/CharDriverCreate.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022100951.19562-7-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:11 +02:00
Marc-André Lureau 3aef23d7d8 char: replace avail_connections
No need to count the users of a CharDriverState, it can rely on the fact
of whether there is a CharBackend associated or if there is enough space
in the muxer.

Simplify and fold chr_mux_new_fe() in qemu_chr_fe_init() since there is
a single user now. Also switch from fprintf to raising error instead.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022100951.19562-5-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:10 +02:00
Marc-André Lureau 58fa54947e char: remove unused qemu_chr_fe_event
I introduced this function in d61b0c9a2f, but it isn't
used. Furthermore, it was incomplete, as it would need to translate QEMU
chr events to Spice port events.

(presumably it was used in the follow-up NBD-spice series that was not
completed: http://lists.gnu.org/archive/html/qemu-devel/2013-11/msg02024.html)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022100951.19562-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:10 +02:00
Marc-André Lureau 8c260cb13c char: use an enum for CHR_EVENT
This may help to catch unhandled cases, and avoid having to maintain
numbering.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022100951.19562-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:10 +02:00
Marc-André Lureau 8cd35662af char: remove unused CHR_EVENT_FOCUS
Usage has long been removed, since commit f220174de8.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022100951.19562-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:10 +02:00
Marc-André Lureau 830896afe3 char: move fe_open in CharBackend
The fe_open state belongs to front end.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022100951.19562-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:10 +02:00
Marc-André Lureau 39ab61c6d0 char: remove explicit_fe_open, use a set_handlers argument
No need to keep explicit_fe_open around if it affects only a
qemu_chr_fe_set_handlers(). Use an additional argument instead.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-24-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:10 +02:00
Marc-André Lureau 72ac876248 char: rename chr_close/chr_free
The function is used to free the backend opaque pointer, let's name it
accordingly.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-23-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:10 +02:00
Marc-André Lureau a4afa548fc char: move front end handlers in CharBackend
Since the hanlders are associated with a CharBackend, rather than the
CharDriverState, it is more appropriate to store in CharBackend. This
avoids the handler copy dance in qemu_chr_fe_set_handlers() then
mux_chr_update_read_handler(), by storing the CharBackend pointer
directly.

Also a mux CharDriver should go through mux->backends[focused], since
chr->be will stay NULL. Before that, it was possible to call
chr->handler by mistake with surprising results, for ex through
qemu_chr_be_can_write(), which would result in calling the last set
handler front end, not the one with focus.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-22-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:10 +02:00
Marc-André Lureau fa394ed625 char: make some qemu_chr_fe skip if no driver
In most cases, front ends do not care about the side effect of
CharBackend, so we can simply skip the checks and call the qemu_chr_fe
functions even without associated CharDriver.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-20-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:21 +02:00
Marc-André Lureau c39860e6dc char: replace qemu_chr_claim/release with qemu_chr_fe_init/deinit
Now that all front end use qemu_chr_fe_init(), we can move chardev
claiming in init(), and add a function deinit() to release the chardev
and cleanup handlers.

The qemu_chr_fe_claim_no_fail() for property are gone, since the
property will raise an error instead. In other cases, where there is
already an error path, an error is raised instead. Finally, other cases
are handled by &error_abort in qemu_chr_fe_init().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-19-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:21 +02:00
Marc-André Lureau 386f07d1fc char: fold qemu_chr_set_handlers in qemu_chr_fe_set_handlers
qemu_chr_add_handlers*() have been removed in previous change, so the
common qemu_chr_set_handlers() is no longer needed.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-17-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:21 +02:00
Marc-André Lureau 5345fdb446 char: use qemu_chr_fe* functions with CharBackend argument
This also switches from qemu_chr_add_handlers() to
qemu_chr_fe_set_handlers(). Note that qemu_chr_fe_set_handlers() now
takes the focus when fe_open (qemu_chr_add_handlers() did take the
focus)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-16-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:21 +02:00
Marc-André Lureau 7fa47e2a80 char: rename some frontend functions
qemu_chr_accept_input() and qemu_chr_disconnect() are only used by
frontend, so use qemu_chr_fe prefix.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-14-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:20 +02:00
Marc-André Lureau becdfa00cf char: replace PROP_CHR with CharBackend
Store the property in a CharBackend instead of CharDriverState*.  This
also replace systematically chr by chr.chr to access the
CharDriverState*. The following patches will replace it with calls to
qemu_chr_fe CharBackend functions.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:20 +02:00
Marc-André Lureau 94a40fc560 char: introduce CharBackend
This new structure is meant to keep the details associated with a char
driver usage. On initialization, it gets a tag from the mux backend.
It can change its handlers thanks to qemu_chr_fe_set_handlers().

This structure is introduced so that all frontend will be moved to hold
and use a CharBackend. This will allow to better track char usage and
allocation, and help prevent some memory leaks or corruption.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-10-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:20 +02:00
Marc-André Lureau 6dfa8298fa mux: split mux_chr_update_read_handler()
Make qemu_chr_add_handlers_full() aware of mux handling. This allows
introduction of a tag associated with the fe handlers and a
qemu_chr_set_handlers() function to set the handler for a particular
tag. That will allow to get rid of qemu_chr_add_handlers*() in later
changes, in favor of qemu_chr_fe_set_handler().

To this end, chr_update_read_handler callback is enhanced with a tag
argument, and mux_chr_update_read_handler() is splitted in new
functions: mux_chr_new_handler_tag(), mux_chr_set_handlers(),
mux_set_focus().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-9-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:20 +02:00
Marc-André Lureau b4948be93e char: remove init callback
The CharDriverState.init() callback is no longer set since commit
a61ae7f88c and thus unused. The only user, the malta FGPA display has
been converted to use an event "opened" callback instead.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-7-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:20 +02:00
Marc-André Lureau 4496dc49ec sun4uv: fix serial initialization regression
Since commit b6607a1a20, serial_hds_isa_init() was introduced to
factor out serial_isa_init() loops. However, sun4uv shouldn't start from
0 when there is a mm serial on 0 already. Add a "from" argument to
serial_hds_isa_init().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-5-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:20 +02:00
Marc-André Lureau f0b454ebf8 char.h: misc doc fix
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161011152012.3228-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:19 +02:00
Paolo Bonzini 9a54635dcb memory: add a per-AddressSpace list of listeners
This speeds up MEMORY_LISTENER_CALL noticeably.  Right now,
with many PCI devices you have N regions added to M AddressSpaces
(M = # PCI devices with bus-master enabled) and each call looks
up the whole listener list, with at least M listeners in it.
Because most of the regions in N are BARs, which are also roughly
proportional to M, the whole thing is O(M^3).  This changes it
to O(M^2), which is the best we can do without rewriting the
whole thing.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:19 +02:00
Paolo Bonzini d45fa784cd memory: eliminate global MemoryListeners
There is none, so just drop the code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:19 +02:00
Paolo Bonzini 803cf26a9e atomic: base mb_read/mb_set on load-acquire and store-release
This introduces load-acquire and store-release operations in QEMU.
For now, just use them as an implementation detail of atomic_mb_read
and atomic_mb_set.

Since docs/atomics.txt documents that atomic_mb_read only synchronizes
with an atomic_mb_set of the same variable, we can use the new implementation
everywhere instead of seq-cst loads and stores.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:15 +02:00
Paolo Bonzini f1ee86963b atomic: introduce smp_mb_acquire and smp_mb_release
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 11:30:55 +02:00
Peter Maydell da158a86c4 Merge qcrypto 2016/10/20 v1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYCLFxAAoJEL6G67QVEE/fgxAQAJbVAzBWTjWylGpZ7SivFytb
 Inr9JI4izKs1TS4BjzgFm3qS9GLFyEQz2J4OQJ3GQi9dBauCkgoaqxziEaR69FHT
 8lPFml2RoCx4RjFHgkO+mYt+8THuGc0YGZ2zLMp3X1ed5ip56WhWv5EqBj775Mjp
 xvHCyiqBhU8H6QgX9uvn9pJ1m+o5suJPO6DudRJJMDqTaDem8Hu+/aBi710K22C6
 Vk5MylHbNUye6Rhsh3l1IDJOjcHiL7SIz04js8jZzoTnU76ar2U7o2Nx0huGWNBs
 ZXoK61nfgnxMq5cq/2N9mvzBhv8CywXV2wr23nGx5pcS0gbN4VpLko9xjGGWzlPH
 T/y7UoX1SGGp4CqEttbC8qiE3GKFjxf/lManam3CNgDdP0/YGuQ1G8Ocz5czPNw/
 CHr0lTuqHVDCagc6lr2laRamgsdQk3hQaVX4kl2C+ObS+4n1Nu8oFZhD9Hym1Tuf
 fFV5jMucqMYj6QlRlwaDRMhP/XJF/bCSoTu4vf5/MQU1Kd/OF0r3iV9GcHRr0eFs
 aYJs3DS/AHwejThGvq11/2V6WuVCiLkyQz79p9bPCTFDfmCEwVo9rF0NJOf9PMPf
 3hNM1GvTnZhm5z8hzxKd5Uke5yEF+c7i/ix1tRWKI1DASTiCcIDEGLIzga3IoYZU
 HPZxL7VEhFeXfmSc9sTP
 =cZy7
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/berrange/tags/pull-qcrypto-2016-10-20-1' into staging

Merge qcrypto 2016/10/20 v1

# gpg: Signature made Thu 20 Oct 2016 12:58:41 BST
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/pull-qcrypto-2016-10-20-1:
  crypto: fix initialization of gcrypt threading
  crypto: fix initialization of crypto in tests
  qtest: fix make check complaint in crypto module
  crypto: add mode check in qcrypto_cipher_new() for cipher-builtin
  crypto: add CTR mode support
  crypto: extend mode as a parameter in qcrypto_cipher_supports()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-20 14:46:19 +01:00
Gonglei 3c28292f39 crypto: add CTR mode support
Introduce CTR mode support for the cipher APIs.
CTR mode uses a counter rather than a traditional IV.
The counter has additional properties, including a nonce
and initial counter block. We reuse the ctx->iv as
the counter for conveniences.

Both libgcrypt and nettle are support CTR mode, the
cipher-builtin doesn't support yet.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-19 10:09:24 +01:00
Gonglei f844836ddc crypto: extend mode as a parameter in qcrypto_cipher_supports()
It can't guarantee all cipher modes are supported
if one cipher algorithm is supported by a backend.
Let's extend qcrypto_cipher_supports() to take both
the algorithm and mode as parameters.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-19 10:09:24 +01:00
Peter Maydell 1b0d3845b4 VFIO updates 2016-10-17
- Convert to realize & improve error reporting (Eric Auger)
  - RTL quirk bug fix (Thorsten Kohfeldt)
  - Skip duplicate pre/post reset (Cao jin)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJYBSm0AAoJECObm247sIsi6/UP/jXTErBSjQ+m+Mfgc/GIEBhH
 GFYpO6gI5mgkOlscgmzlga/WyV2TuhnL+zSFmuMiGix4PyTOhkFrW9gzc5dWf5qe
 pIxPjrRC4uHX77Kyxsi623RAmltxYe+SPfh4v/3ZpYZQFXjFX6aDMyTWVedrGeXX
 AXW8L4L7ngLqoW77Ak6QKHh/ehzuCdLbNe/e+FwEsly4dnajhEzUkxqqY6P2Dufe
 lfkmtOTnu/mTjEY3VTrSmcdhHsYBs11Rk4SZbSHBg2B12Ftf96bFpH5E0KXvPY49
 ywoM36nbOBkhibZXjI7lDp1lkGSNVO+nlPP7060QubiEMltCzwQK/XdZeL76NGy0
 HK7llJivX6f6kMI349HFsDXOu8gNa9RsuV88Z6EQQoRUdbiipP++QtbSS6kufQgL
 6Xxu/ElrxeAmoM0uhUMYnskNL00ZrIGNQco0tXpo7vQZ59AcPbJYQvQi36HBOW5B
 4e5I+X7+ULugUKP0vpQwJ/9pN6NFq8nJuo4YtLRbOj0DaE+tpBdxj7J06QsfdgcP
 i/itvfulzCDERYn2kAlC0yE6AJ3vqeufsWUaCzPfCan/xHeOxleLTvGEytRNRkhK
 g831TPLQbv2BdFeEE4DEpc6haizHhsKPMMEn4mQb8ztQM99hyn4WYmzyHBYyqIMl
 nWBI+FzGt45eTR/snau0
 =VVuQ
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20161017.0' into staging

VFIO updates 2016-10-17

 - Convert to realize & improve error reporting (Eric Auger)
 - RTL quirk bug fix (Thorsten Kohfeldt)
 - Skip duplicate pre/post reset (Cao jin)

# gpg: Signature made Mon 17 Oct 2016 20:42:44 BST
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20161017.0:
  vfio: fix duplicate function call
  vfio/pci: Fix vfio_rtl8168_quirk_data_read address offset
  vfio/pci: Handle host oversight
  vfio/pci: Remove vfio_populate_device returned value
  vfio/pci: Remove vfio_msix_early_setup returned value
  vfio/pci: Conversion to realize
  vfio/platform: Pass an error object to vfio_base_device_init
  vfio/platform: fix a wrong returned value in vfio_populate_device
  vfio/platform: Pass an error object to vfio_populate_device
  vfio: Pass an error object to vfio_get_device
  vfio: Pass an error object to vfio_get_group
  vfio: Pass an Error object to vfio_connect_container
  vfio/pci: Pass an error object to vfio_pci_igd_opregion_init
  vfio/pci: Pass an error object to vfio_add_capabilities
  vfio/pci: Pass an error object to vfio_intx_enable
  vfio/pci: Pass an error object to vfio_msix_early_setup
  vfio/pci: Pass an error object to vfio_populate_device
  vfio/pci: Pass an error object to vfio_populate_vga
  vfio/pci: Use local error object in vfio_initfn

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-18 11:40:27 +01:00
Peter Maydell e8ddc2eae5 x86 queue, 2016-10-17
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYBQ+LAAoJECgHk2+YTcWmaF0QAISVeb39vJyyXNhxXsy1Y5iM
 WSYA8Dym0TWCXTd7Fq7Ck4VS9ZC7hAREKNSBs2hgVPutecL56iB/IjWrB0AyFAMK
 u5y4H1pI6l9TiH+6GcDWwQjthM/0v2pEHzQ2udLWBSpKJGjDPTSQIafZUgrW2uu0
 J/Drxg17FJ6KixqCg3FemPBXucbuU1PSW2qEWIgVElwj843j3d/Wc5l1wNb24irN
 jnOcvJd9WQsuT2fUDXezrCRVQle92tHR1cHtu5bZvC1aMFbGuHfSA4pm7pXw3l5N
 8H0fhrCoj6JGKRY/pzHGmLgwMTWJL4qASxr6sEKkMAyu59DdjQ0+U8EhOwoAHYhp
 gSrNgpwPKRr2OKrSUJXil7w1cQ+hsokgEo44SDEgsV4k9Rgbz8VVVct+LwOxwfwW
 l9sC9L5ONheFODfB3rgVFiyAbspYxzwOvGZ29VoeMyb4CS1CUBrsvka8DledFi+m
 By26W6IMtXBa4NZoYqp49zHqUZ5Wu62I32LCaWDKscUQfaEJKrf1DtFQ9FlWhy5F
 4NeSzTo4eAp3WPRDscbvXIyEJfYqzf7gs8KQA9QD+aTceDOIPiZeMz6oMokFukE8
 Lt1fWzzppFJ6ZyPLO1YI/T91fOskl45r8b3T242fovOKGAlujkunpRcTAYaybN5C
 qUv5Qrq6ZujcRecupLfE
 =KM/h
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

x86 queue, 2016-10-17

# gpg: Signature made Mon 17 Oct 2016 18:51:07 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request: (21 commits)
  target-i386: Don't use cpu->migratable when filtering features
  target-i386: Return runnability information on query-cpu-definitions
  target-i386: x86_cpu_load_features() function
  target-i386: Unset cannot_destroy_with_object_finalize_yet
  target-i386/kvm: cache the return value of kvm_enable_x2apic()
  intel_iommu: reject broken EIM
  intel_iommu: add OnOffAuto intr_eim as "eim" property
  intel_iommu: redo configuraton check in realize
  intel_iommu: pass whole remapped addresses to apic
  apic: add send_msi() to APICCommonClass
  apic: add global apic_get_class()
  target-i386: Move warning code outside x86_cpu_filter_features()
  qmp: Add runnability information to query-cpu-definitions
  target-i386: xsave: Add FP and SSE bits to x86_ext_save_areas
  target-i386: Register properties for feature aliases manually
  target-i386: Remove underscores from feat_names arrays
  target-i386: Make plus_features/minus_features QOM-based
  target-i386: Register aliases for feature names with underscores
  target-i386: Disable VME by default with TCG
  target-i386: List CPU models using subclass list
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-18 09:29:44 +01:00
Andrew Jones 2231f69b4e hw/arm/virt: no ITS on older machine types
We should avoid exposing new hardware (through DT and ACPI) on older
machine types. This patch keeps 2.7 and older from changing, despite
the introduction of ITS support for 2.8.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1476117341-32690-3-git-send-email-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-17 19:22:17 +01:00
Cédric Le Goater 6dc52326cc aspeed: add support for the AST2500 SoC SMC controllers
The SMC controllers on the Aspeed AST2500 SoC are very similar to the
ones found on the AST2400. The differences are on the number of
supported flash modules and their default mappings in the SoC address
space.

The Aspeed AST2500 has one SPI controller for the BMC firmware and two
for the host firmware. All controllers have now the same set of
registers compatible with the AST2400 FMC controller and the legacy
'SMC' controller is fully gone.

We keep the FMC object to act as the BMC SPI controller and add a new
SPI controller for the host. We also have to introduce new type names
to handle the differences in the flash modules memory mappping.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1474977462-28032-5-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-17 19:22:16 +01:00
Cédric Le Goater dbcabeeb54 aspeed: extend the number of host SPI controllers
The AST2500 SoC has two. Let's prepare ground for the next changes
which will add the required definitions for the second host SPI
controller.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1474977462-28032-4-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-17 19:22:16 +01:00
Cédric Le Goater dcb834447f aspeed: move the flash module mapping address under the controller definition
This will ease the definition of the new controllers for the AST2500
SoC and also ease the support of the segment registers, which provide
a way to reconfigure the mapping window of each slave.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1474977462-28032-3-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-17 19:22:16 +01:00
Cédric Le Goater 0e5803dfbc aspeed: rename the smc object to fmc
The Aspeed SoC has three different types of SMC (Static Memory
Controller) controllers: the SMC (legacy), the FMC (the new one) and
the SPI for the host PNOR. The FMC and the SPI models are now
converging on the AST2500 SoC and the SMC, which was still available
on the AST2400 SoC, was removed.

The Aspeed SoC does not provide support for the legacy SMC
controller. So, let's rename the 'smc' object to 'fmc' to clarify its
nature.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1474977462-28032-2-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-17 19:22:16 +01:00
Radim Krčmář fb506e701e intel_iommu: reject broken EIM
Cluster x2APIC cannot work without KVM's x2apic API when the maximal
APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we
forbid other APICs and also the old KVM case with less than 9, to
simplify the code.

There is no point in enabling EIM in forbidden APICs, so we keep it
enabled only for the KVM APIC;  unconditionally, because making the
option depend on KVM version would be a maintanance burden.

Old QEMUs would enable eim whenever intremap was on, which would trick
guests into thinking that they can enable cluster x2APIC even if any
interrupt destination would get clamped to 8 bits.
Depending on your configuration, QEMU could notice that the destination
LAPIC is not present and report it with a very non-obvious:

  KVM: injection failed, MSI lost (Operation not permitted)

Or the guest could say something about unexpected interrupts, because
clamping leads to aliasing so interrupts were being delivered to
incorrect VCPUs.

KVM_X2APIC_API is the feature that allows us to enable EIM for KVM.

QEMU 2.7 allowed EIM whenever interrupt remapping was enabled.  In order
to keep backward compatibility, we again allow guests to misbehave in
non-obvious ways, and make it the default for old machine types.

A user can enable the buggy mode it with "x-buggy-eim=on".

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17 15:44:49 -02:00
Radim Krčmář e6b6af0560 intel_iommu: add OnOffAuto intr_eim as "eim" property
The default (auto) emulates the current behavior.
A user can now control EIM like
  -device intel-iommu,intremap=on,eim=off

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17 15:44:49 -02:00
Radim Krčmář 267ee35715 apic: add send_msi() to APICCommonClass
The MMIO based interface to APIC doesn't work well with MSIs that have
upper address bits set (remapped x2APIC MSIs).  A specialized interface
is a quick and dirty way to avoid the shortcoming.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17 15:44:49 -02:00
Radim Krčmář 2f114315dc apic: add global apic_get_class()
Every configuration has only up to one APIC class and we'll be extending
the class with a function that can be called without an instanced
object, so a direct access to the class is convenient.

This patch will break compilation if some code uses apic_get_class()
with CONFIG_USER_ONLY.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17 15:44:49 -02:00
Eric Auger 59f7d6743c vfio: Pass an error object to vfio_get_device
Pass an error object to prepare for migration to VFIO-PCI realize.

In vfio platform vfio_base_device_init we currently just report the
error. Subsequent patches will propagate the error up to the realize
function.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-10-17 10:58:00 -06:00
Eric Auger 1b808d5be0 vfio: Pass an error object to vfio_get_group
Pass an error object to prepare for migration to VFIO-PCI realize.

For the time being let's just simply report the error in
vfio platform's vfio_base_device_init(). A subsequent patch will
duly propagate the error up to vfio_platform_realize.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-10-17 10:57:59 -06:00
Eric Auger 426ec9049e vfio/pci: Use local error object in vfio_initfn
To prepare for migration to realize, let's use a local error
object in vfio_initfn. Also let's use the same error prefix for all
error messages.

On top of the 1-1 conversion, we start using a common error prefix for
all error messages. We also introduce a similar warning prefix which will
be used later on.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-10-17 10:57:56 -06:00
Peter Maydell 7bf59dfec4 ppc patch queue 2016-10-17
Highlights:
     * Significant rework of how PCI IO windows are placed for the
       pseries machine type
     * A number of extra tests added for ppc
     * Other tests clean up / fixed
     * Some cleanups to the XICS interrupt controller in preparation
       for the 'powernv' machine type
 
 A number of the test changes aren't strictly in ppc related code, but
 are included via my tree because they're primarily focused on
 improving test coverage for ppc.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYBDqhAAoJEGw4ysog2bOSVPAP/RlWYnOCTDiKuSCXz7+joHl9
 lY3K+x2r8DbFFmqxk82h+uObBG/dVQ7kcF+o3SD49dMUqoi/iD8rS5UFrDtArFKP
 vPh4h7KpVbQHyMWoHTjo1Zw94Sr2xEqRelQOrZjRA+a794kg6MKs3EIekHo9QdZl
 33qU0aE02LNOmguDwsTqTavrs/19qjcUM+fAyfOMEMiaQHNxdwrQjvldlIeELPka
 Dz/iOS5Lxibq2txmZWMxiMAuBe6lmJQWlcUUy+xXylQ5OE7CQctCfm2hsWWoMpGo
 PnhY+UC+9Ctqvb4TyTzqjllEmEfwK219091+hW7epWIUXGuaWde1RCvwRUr7pFGD
 DN4/75M82aNrpG0ydjzLpqbOwj/h+YvQARdKTTS2/4iJCrDPd5O2rR4Yjkrt1lcq
 jrSSnqsCeYET+4JY4G4h3ZZsPRcLnnpcNrUcF9AwkRCe1ybr4npK8FGSd4AGWAOR
 J7/mZqs7Gne4DjbjIzwfntP8ak5AASPqJKEmwjAO7M8zD0/xm6Ovqbmo8Xte9Vxx
 ge4nBcAhoJJ8y1hiZNLOyy1d87WUiD84MiN/BuSK7UbeBVsfHvYWAcEsVGXFzaP0
 hXwLHxddmflU7gy6hTbQ/f9SQiaobphCaP3uM4fLIOzAn64EIELCDvRwlr6NakRW
 CbJWqMsNz/WblA/ZSlA3
 =tnCe
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.8-20161017' into staging

ppc patch queue 2016-10-17

Highlights:
    * Significant rework of how PCI IO windows are placed for the
      pseries machine type
    * A number of extra tests added for ppc
    * Other tests clean up / fixed
    * Some cleanups to the XICS interrupt controller in preparation
      for the 'powernv' machine type

A number of the test changes aren't strictly in ppc related code, but
are included via my tree because they're primarily focused on
improving test coverage for ppc.

# gpg: Signature made Mon 17 Oct 2016 03:42:41 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.8-20161017:
  spapr: Improved placement of PCI host bridges in guest memory map
  spapr_pci: Add a 64-bit MMIO window
  spapr: Adjust placement of PCI host bridge to allow > 1TiB RAM
  spapr_pci: Delegate placement of PCI host bridges to machine type
  libqos: Limit spapr-pci to 32-bit MMIO for now
  libqos: Correct error in PCI hole sizing for spapr
  libqos: Isolate knowledge of spapr memory map to qpci_init_spapr()
  ppc/xics: Split ICS into ics-base and ics class
  ppc/xics: Make the ICSState a list
  spapr: fix inheritance chain for default machine options
  target-ppc: implement vexts[bh]2w and vexts[bhw]2d
  tests/boot-sector: Increase time-out to 90 seconds
  tests/boot-sector: Use mkstemp() to create a unique file name
  tests/boot-sector: Use minimum length for the Forth boot script
  qtest: ask endianness of the target in qtest_init()
  tests: minor cleanups in usb-hcd-uhci-test

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-17 12:59:54 +01:00
David Gibson 357d1e3bc7 spapr: Improved placement of PCI host bridges in guest memory map
Currently, the MMIO space for accessing PCI on pseries guests begins at
1 TiB in guest address space.  Each PCI host bridge (PHB) has a 64 GiB
chunk of address space in which it places its outbound PIO and 32-bit and
64-bit MMIO windows.

This scheme as several problems:
  - It limits guest RAM to 1 TiB (though we have a limited fix for this
    now)
  - It limits the total MMIO window to 64 GiB.  This is not always enough
    for some of the large nVidia GPGPU cards
  - Putting all the windows into a single 64 GiB area means that naturally
    aligning things within there will waste more address space.
In addition there was a miscalculation in some of the defaults, which meant
that the MMIO windows for each PHB actually slightly overran the 64 GiB
region for that PHB.  We got away without nasty consequences because
the overrun fit within an unused area at the beginning of the next PHB's
region, but it's not pretty.

This patch implements a new scheme which addresses those problems, and is
also closer to what bare metal hardware and pHyp guests generally use.

Because some guest versions (including most current distro kernels) can't
access PCI MMIO above 64 TiB, we put all the PCI windows between 32 TiB and
64 TiB.  This is broken into 1 TiB chunks.  The first 1 TiB contains the
PIO (64 kiB) and 32-bit MMIO (2 GiB) windows for all of the PHBs.  Each
subsequent TiB chunk contains a naturally aligned 64-bit MMIO window for
one PHB each.

This reduces the number of allowed PHBs (without full manual configuration
of all the windows) from 256 to 31, but this should still be plenty in
practice.

We also change some of the default window sizes for manually configured
PHBs to saner values.

Finally we adjust some tests and libqos so that it correctly uses the new
default locations.  Ideally it would parse the device tree given to the
guest, but that's a more complex problem for another time.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16 12:04:15 +11:00
David Gibson daa2369903 spapr_pci: Add a 64-bit MMIO window
On real hardware, and under pHyp, the PCI host bridges on Power machines
typically advertise two outbound MMIO windows from the guest's physical
memory space to PCI memory space:
  - A 32-bit window which maps onto 2GiB..4GiB in the PCI address space
  - A 64-bit window which maps onto a large region somewhere high in PCI
    address space (traditionally this used an identity mapping from guest
    physical address to PCI address, but that's not always the case)

The qemu implementation in spapr-pci-host-bridge, however, only supports a
single outbound MMIO window, however.  At least some Linux versions expect
the two windows however, so we arranged this window to map onto the PCI
memory space from 2 GiB..~64 GiB, then advertised it as two contiguous
windows, the "32-bit" window from 2G..4G and the "64-bit" window from
4G..~64G.

This approach means, however, that the 64G window is not naturally aligned.
In turn this limits the size of the largest BAR we can map (which does have
to be naturally aligned) to roughly half of the total window.  With some
large nVidia GPGPU cards which have huge memory BARs, this is starting to
be a problem.

This patch adds true support for separate 32-bit and 64-bit outbound MMIO
windows to the spapr-pci-host-bridge implementation, each of which can
be independently configured.  The 32-bit window always maps to 2G.. in PCI
space, but the PCI address of the 64-bit window can be configured (it
defaults to the same as the guest physical address).

So as not to break possible existing configurations, as long as a 64-bit
window is not specified, a large single window can be specified.  This
will appear the same way to the guest as the old approach, although it's
now implemented by two contiguous memory regions rather than a single one.

For now, this only adds the possibility of 64-bit windows.  The default
configuration still uses the legacy mode.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16 12:03:09 +11:00
David Gibson 6737d9ad79 spapr_pci: Delegate placement of PCI host bridges to machine type
The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB)
for a PAPR guest.  Unlike on x86, it's routine on Power (both bare metal
and PAPR guests) to have numerous independent PHBs, each controlling a
separate PCI domain.

There are two ways of configuring the spapr-pci-host-bridge device: first
it can be done fully manually, specifying the locations and sizes of all
the IO windows.  This gives the most control, but is very awkward with 6
mandatory parameters.  Alternatively just an "index" can be specified
which essentially selects from an array of predefined PHB locations.
The PHB at index 0 is automatically created as the default PHB.

The current set of default locations causes some problems for guests with
large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia
GPGPU cards via VFIO).  Obviously, for migration we can only change the
locations on a new machine type, however.

This is awkward, because the placement is currently decided within the
spapr-pci-host-bridge code, so it breaks abstraction to look inside the
machine type version.

So, this patch delegates the "default mode" PHB placement from the
spapr-pci-host-bridge device back to the machine type via a public method
in sPAPRMachineClass.  It's still a bit ugly, but it's about the best we
can do.

For now, this just changes where the calculation is done.  It doesn't
change the actual location of the host bridges, or any other behaviour.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16 12:03:09 +11:00
Benjamin Herrenschmidt d4d7a59a7a ppc/xics: Split ICS into ics-base and ics class
The existing implementation remains same and ics-base is introduced. The
type name "ics" is retained, and all the related functions renamed as
ics_simple_*

This will allow different implementations for the source controllers
such as the MSI support of PHB3 on Power8 which uses in-memory state
tables for example.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[ clg: added ICS_BASE_GET_CLASS and related fixes, based on :
       http://patchwork.ozlabs.org/patch/646010/ ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-14 16:31:02 +11:00
Benjamin Herrenschmidt cc706a5305 ppc/xics: Make the ICSState a list
Instead of an array of fixed sized blocks, use a list, as we will need
to have sources with variable number of interrupts. SPAPR only uses
a single entry. Native will create more. If performance becomes an
issue we can add some hashed lookup but for now this will do fine.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[ move the initialization of list to xics_common_initfn,
  restore xirr_owner after migration and move restoring to
  icp_post_load]
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[ clg: removed the icp_post_load() changes from nikunj patchset v3:
       http://patchwork.ozlabs.org/patch/646008/ ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-14 16:31:02 +11:00
Ashijeet Acharya 2ff3025797 migrate: move max-bandwidth and downtime-limit to migrate_set_parameter
Mark the old commands 'migrate_set_speed' and 'migrate_set_downtime' as
deprecated.
Move max-bandwidth and downtime-limit into migrate-set-parameters for
setting maximum migration speed and expected downtime limit parameters
respectively.
Change downtime units to milliseconds (only for new-command) and set
its upper bound limit to 2000 seconds.
Update the query part in both hmp and qmp qemu control interfaces.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Dr. David Alan Gilbert 863e9621c5 RAMBlocks: Store page size
Store the page size in each RAMBlock, we need it later.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Marc-André Lureau 692d88b408 Revert "char: use a fixed idx for child muxed chr"
That commit mis-used mux char: the frontend are multiplexed, not the
backend. Fix the regression preventing "c-a c" to switch the focus. The
following patches will fix the crash (when leaving or removing frontend)
by tracking frontends with handler tags.

This reverts commit 949055a254.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-13 13:56:31 +01:00