Commit Graph

63 Commits

Author SHA1 Message Date
Richard Henderson
97e03465f7 accel/tcg: Move qemu_ram_addr_from_host_nofail to physmem.c
The base qemu_ram_addr_from_host function is already in
softmmu/physmem.c; move the nofail version to be adjacent.

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-09-06 08:04:26 +01:00
Thomas Huth
90de559a66 softmmu/physmem: Remove the ifdef __linux__ around the pagesize functions
Now that host_memory_backend_pagesize() is not depending on the hugetlb
memory path handling anymore, we can also remove the #ifdef and the
TOCTTOU comment from the calling functions - the code should now work
equally well on all host architectures.

Message-Id: <20220810125720.3849835-3-thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-08-26 13:34:10 +02:00
Richard Henderson
418ade7849 softmmu: Always initialize xlat in address_space_translate_for_iotlb
The bug is an uninitialized memory read, along the translate_fail
path, which results in garbage being read from iotlb_to_section,
which can lead to a crash in io_readx/io_writex.

The bug may be fixed by writing any value with zero
in ~TARGET_PAGE_MASK, so that the call to iotlb_to_section using
the xlat'ed address returns io_mem_unassigned, as desired by the
translate_fail path.

It is most useful to record the original physical page address,
which will eventually be logged by memory_region_access_valid
when the access is rejected by unassigned_mem_accepts.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1065
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220621153829.366423-1-richard.henderson@linaro.org>
2022-06-21 09:26:16 -07:00
Jagannathan Raman
3123f93d6b vfio-user: handle PCI BAR accesses
Determine the BARs used by the PCI device and register handlers to
manage the access to the same.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 3373e10b5be5f42846f0632d4382466e1698c505.1655151679.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-06-15 16:43:42 +01:00
Marc-André Lureau
ec5f7ca857 include: move target page bits declaration to page-vary.h
Since the implementation unit is page-vary.c.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220323155743.1585078-24-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 14:31:43 +02:00
Marc-André Lureau
8e3b0cbb72 Replace qemu_real_host_page variables with inlined functions
Replace the global variables with inlined helper functions. getpagesize() is very
likely annotated with a "const" function attribute (at least with glibc), and thus
optimization should apply even better.

This avoids the need for a constructor initialization too.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 10:50:38 +02:00
Philippe Mathieu-Daudé
3ab6fdc91b softmmu/physmem: Introduce MemTxAttrs::memory field and MEMTX_ACCESS_ERROR
Add the 'memory' bit to the memory attributes to restrict bus
controller accesses to memories.

Introduce flatview_access_allowed() to check bus permission
before running any bus transaction.

Have read/write accessors return MEMTX_ACCESS_ERROR if an access is
restricted.

There is no change for the default case where 'memory' is not set.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211215182421.418374-4-philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[thuth: Replaced MEMTX_BUS_ERROR with MEMTX_ACCESS_ERROR, remove "inline"]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-03-21 10:10:58 +01:00
Philippe Mathieu-Daudé
58e74682ba softmmu/physmem: Simplify flatview_write and address_space_access_valid
Remove unuseful local 'result' variables.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211215182421.418374-3-philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-03-21 10:03:30 +01:00
Peter Maydell
9740b907a5 target-arm queue:
* cleanups of qemu_oom_check() and qemu_memalign()
  * target/arm/translate-neon: UNDEF if VLD1/VST1 stride bits are non-zero
  * target/arm/translate-neon: Simplify align field check for VLD3
  * GICv3 ITS: add more trace events
  * GICv3 ITS: implement 8-byte accesses properly
  * GICv3: fix minor issues with some trace/log messages
  * ui/cocoa: Use the standard about panel
  * target/arm: Provide cpu property for controling FEAT_LPA2
  * hw/arm/virt: Disable LPA2 for -machine virt-6.2
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmImNs4ZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3q87D/0cMQeF00uVRNqftrQg2SDI
 txJIG2QYUOPMCDfGWlGTfXv2TUc5y3XwA77C9vTcJcIWJlZ30DUa95DNYqA0BbOH
 TEOzRuZME64wA/JndHadz7oh+xb3HYn+6aSr63LeQCI3/h1eXVHknnEcyF1danOb
 YNB1T308THTEwJHQuKHYksIasgVwcjOf8FvMRYFozVkAKEx1SlabpFXST+aVNyx4
 ASsC2PTiJYAqwnYrTX8lWOYKMiKfkNrQcTd6x7rkoDw1pV7ZDMw2/69tpkhdJ5Fa
 lwxhwZ3+40x49eFGAhfuZWZmGLd4c+76u64pmWW429uk1JhaoXgErJM3xfHbI1er
 d7XSQYkMhDrY5SFuoE5XYwOuxanPtn3f7luM236Uzgf4ZR6qTrf6x+R1xLPZVYa9
 fWbjvR3g5sltTOzyc+9UsBq1OPCbRUbmhJtJDvojj5sWmNvgOwZnSkTu5kMAqvFP
 T2cQIi6phRBo3oMN/fhEZi3g828JjYEA9QlpWZ74JOyiXjYUq9VVNpoe/dtAv4Yy
 wZ+XhVNIK82/4Mxjr9SEeYeNzYrsEEvFAUqe9Bil2CpuIMV5ONEzs+UfQ/gyk4eq
 QnGPiojCrpf6PPAfci0Y6b4RzO+loMFpLjCpurngB4g4cBdmThKip0sVZdTZAI9Y
 lnusB8MR1sESoqYdPZsAfQ==
 =ix0J
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20220307' into staging

target-arm queue:
 * cleanups of qemu_oom_check() and qemu_memalign()
 * target/arm/translate-neon: UNDEF if VLD1/VST1 stride bits are non-zero
 * target/arm/translate-neon: Simplify align field check for VLD3
 * GICv3 ITS: add more trace events
 * GICv3 ITS: implement 8-byte accesses properly
 * GICv3: fix minor issues with some trace/log messages
 * ui/cocoa: Use the standard about panel
 * target/arm: Provide cpu property for controling FEAT_LPA2
 * hw/arm/virt: Disable LPA2 for -machine virt-6.2

# gpg: Signature made Mon 07 Mar 2022 16:46:06 GMT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20220307:
  hw/arm/virt: Disable LPA2 for -machine virt-6.2
  target/arm: Provide cpu property for controling FEAT_LPA2
  ui/cocoa: Use the standard about panel
  hw/intc/arm_gicv3_cpuif: Fix register names in ICV_HPPIR read trace event
  hw/intc/arm_gicv3: Fix missing spaces in error log messages
  hw/intc/arm_gicv3: Specify valid and impl in MemoryRegionOps
  hw/intc/arm_gicv3_its: Add trace events for table reads and writes
  hw/intc/arm_gicv3_its: Add trace events for commands
  target/arm/translate-neon: Simplify align field check for VLD3
  target/arm/translate-neon: UNDEF if VLD1/VST1 stride bits are non-zero
  osdep: Move memalign-related functions to their own header
  util: Put qemu_vfree() in memalign.c
  util: Use meson checks for valloc() and memalign() presence
  util: Share qemu_try_memalign() implementation between POSIX and Windows
  meson.build: Don't misdetect posix_memalign() on Windows
  util: Return valid allocation for qemu_try_memalign() with zero size
  util: Unify implementations of qemu_memalign()
  util: Make qemu_oom_check() a static function

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-03-08 15:26:10 +00:00
Peter Maydell
5df022cf2e osdep: Move memalign-related functions to their own header
Move the various memalign-related functions out of osdep.h and into
their own header, which we include only where they are used.
While we're doing this, add some brief documentation comments.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org
2022-03-07 13:16:49 +00:00
Philippe Mathieu-Daudé
464868a343 softmmu/physmem: Remove unnecessary include
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220207075426.81934-14-f4bug@amsat.org>
2022-03-06 13:15:42 +01:00
Philippe Mathieu-Daudé
73842ef04a exec: Make cpu_memory_rw_debug() target agnostic
cpu_memory_rw_debug() is declared in "exec/cpu-all.h" which
contains target-specific declarations. To be able to use it
from target agnostic source, move the declaration to the
generic "exec/cpu-common.h" header.

Replace the target-specific 'target_ulong' type by 'vaddr'
which better reflects the argument type, and is target agnostic.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220207075426.81934-5-f4bug@amsat.org>
2022-03-06 13:15:42 +01:00
Peter Maydell
b85ea5fa2f include: Move qemu_madvise() and related #defines to new qemu/madvise.h
The function qemu_madvise() and the QEMU_MADV_* constants associated
with it are used in only 10 files.  Move them out of osdep.h to a new
qemu/madvise.h header that is included where it is needed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
2022-02-21 13:30:20 +00:00
Philippe Mathieu-Daudé
75f01c68b5 exec/memory: Extract address_space_set() from dma_memory_set()
dma_memory_set() does a DMA barrier, set the address space with
a constant value. The constant value filling code is not specific
to DMA and can be used for AddressSpace. Extract it as a new
helper: address_space_set().

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[lv: rebase]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220115203725.3834712-2-laurent@vivier.eu>
2022-01-20 09:09:37 +01:00
Alex Bennée
aff0e204cb accel/tcg: suppress IRQ check for special TBs
When we set cpu->cflags_next_tb it is because we want to carefully
control the execution of the next TB. Currently there is a race that
causes the second stage of watchpoint handling to get ignored if an
IRQ is processed before we finish executing the instruction that
triggers the watchpoint. Use the new CF_NOIRQ facility to avoid the
race.

We also suppress IRQs when handling precise self modifying code to
avoid unnecessary bouncing.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/245
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211129140932.4115115-3-alex.bennee@linaro.org>
2021-11-29 15:12:37 +00:00
Daniel P. Berrangé
ca411b7c8a qapi: introduce x-query-ramblock QMP command
This is a counterpart to the HMP "info ramblock" command. It is being
added with an "x-" prefix because this QMP command is intended as an
adhoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.
The existing HMP command is rewritten to call the QMP command.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-11-02 15:55:14 +00:00
Pavel Dovgalyuk
efd629fb21 softmmu: fix for "after access" watchpoints
Watchpoints that should fire after the memory access
break an execution of the current block, try to
translate current instruction into the separate block,
which then causes debug interrupt.
But cpu_interrupt can't be called in such block when
icount is enabled, because interrupts muse be allowed
explicitly.
This patch sets CF_LAST_IO flag for retranslated block,
allowing interrupt request for the last instruction.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <163542169727.2127597.8141772572696627329.stgit@pasha-ThinkPad-X280>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-28 20:55:07 -07:00
Pavel Dovgalyuk
1ab0ba8ab5 softmmu: remove useless condition in watchpoint check
cpu_check_watchpoint function checks cpu->watchpoint_hit at the entry.
But then it also does the same in the middle of the function,
while this field can't change.
That is why this patch removes this useless condition.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <163542169094.2127597.8801843697434113110.stgit@pasha-ThinkPad-X280>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-28 20:55:07 -07:00
Pavel Dovgalyuk
9f660c077b softmmu: fix watchpoint processing in icount mode
Watchpoint processing code restores vCPU state twice:
in tb_check_watchpoint and in cpu_loop_exit_restore/cpu_restore_state.
Normally it does not affect anything, but in icount mode instruction
counter is incremented twice and becomes incorrect.
This patch eliminates unneeded CPU state restore.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <163542168516.2127597.8781375223437124644.stgit@pasha-ThinkPad-X280>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-28 20:55:07 -07:00
Greg Kurz
f18d403f15 softmmu/physmem.c: Fix typo in comment
Fix the comment to match what the code is doing, as explained in
the changelog of commit 86cf9e1546
that introduced the change:

    Commit 9458a9a1df added synchronization
    of vCPU and migration operations through calling run_on_cpu operation.
    However, in replay mode this synchronization is unneeded, because
    I/O and vCPU threads are already synchronized.
    This patch disables such synchronization for record/replay mode.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <163429018454.1146856.3429437540871060739.stgit@bahia.huguette>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-10-23 18:50:19 +02:00
Peter Xu
142518bda5 memory: Name all the memory listeners
Provide a name field for all the memory listeners.  It can be used to identify
which memory listener is which.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210817013553.30584-2-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 15:30:24 +02:00
Sean Christopherson
56918a126a memory: Add RAM_PROTECTED flag to skip IOMMU mappings
Add a new RAMBlock flag to denote "protected" memory, i.e. memory that
looks and acts like RAM but is inaccessible via normal mechanisms,
including DMA.  Use the flag to skip protected memory regions when
mapping RAM for DMA in VFIO.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 14:50:19 +02:00
Peter Maydell
8efdb7ba1b softmmu/physmem.c: Check return value from realpath()
The realpath() function can return NULL on error, so we need to check
for it to avoid crashing when we try to strstr() into it.
This can happen if we run out of memory, or if /sys/ is not mounted,
among other situations.

Fixes: Coverity 1459913, 1460474
Fixes: ce317be98d ("exec: fetch the alignment of Linux devdax pmem character device nodes")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
Message-id: 20210812151525.31456-1-peter.maydell@linaro.org
2021-08-26 17:02:00 +01:00
Peter Maydell
8f1bdb0ea1 softmmu/physmem.c: Remove unneeded NULL check in qemu_ram_alloc_from_fd()
In the alignment check added to qemu_ram_alloc_from_fd() in commit
ce317be98d, the condition includes a check that 'mr' is not
NULL.  This check is unnecessary because we can assume that the
caller always passes us a valid MemoryRegion, and indeed later in the
function we assume mr is not NULL when we pass it to file_ram_alloc()
as new_block->mr.  Remove it.

Fixes: Coverity 1459867
Fixes: ce317be98d ("exec: fetch the alignment of Linux devdax pmem character device nodes")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
Message-id: 20210812150624.29139-1-peter.maydell@linaro.org
2021-08-26 17:02:00 +01:00
David Hildenbrand
1c4c685936 softmmu/physmem: fix wrong assertion in qemu_ram_alloc_internal()
When adding RAM_NORESERVE, we forgot to remove the old assertion when
adding the updated one, most probably when reworking the patches or
rebasing. We can easily crash QEMU by adding
  -object memory-backend-ram,id=mem0,size=500G,reserve=off
to the QEMU cmdline:
  qemu-system-x86_64: ../softmmu/physmem.c:2146: qemu_ram_alloc_internal:
  Assertion `(ram_flags & ~(RAM_SHARED | RAM_RESIZEABLE | RAM_PREALLOC))
  == 0' failed.

Fix it by removing the old assertion.

Fixes: 8dbe22c686 ("memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap()")
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-id: 20210805092350.31195-1-david@redhat.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-08-17 16:51:39 +01:00
David Hildenbrand
7e6d32ebf7 softmmu/physmem: Extend ram_block_discard_(require|disable) by two discard types
We want to separate the two cases whereby we discard ram
- uncoordinated: e.g., virito-balloon
- coordinated: e.g., virtio-mem coordinated via the RamDiscardManager

Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-12-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-08 15:54:45 -04:00
David Hildenbrand
98da491dff softmmu/physmem: Don't use atomic operations in ram_block_discard_(disable|require)
We have users in migration context that don't hold the BQL (when
finishing migration). To prepare for further changes, use a dedicated mutex
instead of atomic operations. Keep using qatomic_read ("READ_ONCE") for the
functions that only extract the current state (e.g., used by
virtio-balloon), locking isn't necessary.

While at it, split up the counter into two variables to make it easier
to understand.

Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-11-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-08 15:54:45 -04:00
David Hildenbrand
d94e0bc9ef util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under Linux
Let's support RAM_NORESERVE via MAP_NORESERVE on Linux. The flag has no
effect on most shared mappings - except for hugetlbfs and anonymous memory.

Linux man page:
  "MAP_NORESERVE: Do not reserve swap space for this mapping. When swap
  space is reserved, one has the guarantee that it is possible to modify
  the mapping. When swap space is not reserved one might get SIGSEGV
  upon a write if no physical memory is available. See also the discussion
  of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before
  2.6, this flag had effect only for private writable mappings."

Note that the "guarantee" part is wrong with memory overcommit in Linux.

Also, in Linux hugetlbfs is treated differently - we configure reservation
of huge pages from the pool, not reservation of swap space (huge pages
cannot be swapped).

The rough behavior is [1]:
a) !Hugetlbfs:

  1) Without MAP_NORESERVE *or* with memory overcommit under Linux
     disabled ("/proc/sys/vm/overcommit_memory == 2"), the following
     accounting/reservation happens:
      For a file backed map
       SHARED or READ-only - 0 cost (the file is the map not swap)
       PRIVATE WRITABLE - size of mapping per instance

      For an anonymous or /dev/zero map
       SHARED   - size of mapping
       PRIVATE READ-only - 0 cost (but of little use)
       PRIVATE WRITABLE - size of mapping per instance

  2) With MAP_NORESERVE, no accounting/reservation happens.

b) Hugetlbfs:

  1) Without MAP_NORESERVE, huge pages are reserved.

  2) With MAP_NORESERVE, no huge pages are reserved.

Note: With "/proc/sys/vm/overcommit_memory == 0", we were already able
to configure it for !hugetlbfs globally; this toggle now allows
configuring it more fine-grained, not for the whole system.

The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.

[1] https://www.kernel.org/doc/Documentation/vm/overcommit-accounting

Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-10-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:38 +02:00
David Hildenbrand
8dbe22c686 memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap()
Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The
new flag has the following semantics:

"
RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge
pages if applicable) is skipped: will bail out if not supported. When not
set, the OS will do the reservation, if supported for the memory type.
"

Allow passing it into:
- memory_region_init_ram_nomigrate()
- memory_region_init_resizeable_ram()
- memory_region_init_ram_from_file()

... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag.
Bail out if the flag is not supported, which is the case right now for
both, POSIX and win32. We will add Linux support next and allow specifying
RAM_NORESERVE via memory backends.

The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-9-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:38 +02:00
David Hildenbrand
b444f5c079 util/mmap-alloc: Pass flags instead of separate bools to qemu_ram_mmap()
Let's pass flags instead of bools to prepare for passing other flags and
update the documentation of qemu_ram_mmap(). Introduce new QEMU_MAP_
flags that abstract the mmap() PROT_ and MAP_ flag handling and simplify
it.

We expose only flags that are currently supported by qemu_ram_mmap().
Maybe, we'll see qemu_mmap() in the future as well that can implement these
flags.

Note: We don't use MAP_ flags as some flags (e.g., MAP_SYNC) are only
defined for some systems and we want to always be able to identify
these flags reliably inside qemu_ram_mmap() -- for example, to properly
warn when some future flags are not available or effective on a system.
Also, this way we can simplify PROT_ handling as well.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-8-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:38 +02:00
David Hildenbrand
ebef62d0e5 softmmu/memory: Pass ram_flags to qemu_ram_alloc() and qemu_ram_alloc_internal()
Let's pass ram_flags to qemu_ram_alloc() and qemu_ram_alloc_internal(),
preparing for passing additional flags.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-7-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:38 +02:00
David Hildenbrand
dbb92eea38 softmmu/physmem: Fix qemu_ram_remap() to handle shared anonymous memory
RAM_SHARED now also properly indicates shared anonymous memory. Let's check
that flag for anonymous memory as well, to restore the proper mapping.

Fixes: 06329ccecf ("mem: add share parameter to memory-backend-ram")
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210406080126.24010-4-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:37 +02:00
David Hildenbrand
cdfa56c551 softmmu/physmem: Fix ram_block_discard_range() to handle shared anonymous memory
We can create shared anonymous memory via
    "-object memory-backend-ram,share=on,..."
which is, for example, required by PVRDMA for mremap() to work.

Shared anonymous memory is weird, though. Instead of MADV_DONTNEED, we
have to use MADV_REMOVE: MADV_DONTNEED will only remove / zap all
relevant page table entries of the current process, the backend storage
will not get removed, resulting in no reduced memory consumption and
a repopulation of previous content on next access.

Shared anonymous memory is internally really just shmem, but without a
fd exposed. As we cannot use fallocate() without the fd to discard the
backing storage, MADV_REMOVE gets the same job done without a fd as
documented in "man 2 madvise". Removing backing storage implicitly
invalidates all page table entries with relevant mappings - an additional
MADV_DONTNEED is not required.

Fixes: 06329ccecf ("mem: add share parameter to memory-backend-ram")
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210406080126.24010-3-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:37 +02:00
David Hildenbrand
7ce18ca025 softmmu/physmem: Mark shared anonymous memory RAM_SHARED
Let's drop the "shared" parameter from ram_block_add() and properly
store it in the flags of the ram block instead, such that
qemu_ram_is_shared() properly succeeds on all ram blocks that were mapped
MAP_SHARED.

We'll use this information next to fix some cases with shared anonymous
memory.

Reviewed-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210406080126.24010-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 17:17:09 +02:00
Pavel Dovgalyuk
57dcb643d7 replay: fix watchpoint processing for reverse debugging
This patch enables reverse debugging with watchpoints.
Reverse continue scans the execution to find the breakpoints
and watchpoints that should fire. It uses helper function
replay_breakpoint() for that. But this function needs to access
icount, which can't be correct in the middle of TB.
Therefore, in case of watchpoint, we have to retranslate the block
to allow this access.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <162072430303.827403.7379783546934958566.stgit@pasha-ThinkPad-X280>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-05-26 15:33:59 -07:00
David Hildenbrand
dcdc460767 exec: Relax range check in ram_block_discard_range()
We want to make use of ram_block_discard_range() in the RAM block resize
callback when growing a RAM block, *before* used_length is changed.
Let's relax the check. As RAM blocks always mmap the whole max_length area,
we cannot corrupt unrelated data.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-6-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
David Hildenbrand
c7c0e72408 migration/ram: Handle RAM block resizes during precopy
Resizing while migrating is dangerous and does not work as expected.
The whole migration code works on the usable_length of ram blocks and does
not expect this to change at random points in time.

In the case of precopy, the ram block size must not change on the source,
after syncing the RAM block list in ram_save_setup(), so as long as the
guest is still running on the source.

Resizing can be trigger *after* (but not during) a reset in
ACPI code by the guest
- hw/arm/virt-acpi-build.c:acpi_ram_update()
- hw/i386/acpi-build.c:acpi_ram_update()

Use the ram block notifier to get notified about resizes. Let's simply
cancel migration and indicate the reason. We'll continue running on the
source. No harm done.

Update the documentation. Postcopy will be handled separately.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-5-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Manual merge
2021-05-13 18:21:13 +01:00
David Hildenbrand
8f44304c76 numa: Teach ram block notifiers about resizeable ram blocks
Ram block notifiers are currently not aware of resizes. To properly
handle resizes during migration, we want to teach ram block notifiers about
resizeable ram.

Introduce the basic infrastructure but keep using max_size in the
existing notifiers. Supply the max_size when adding and removing ram
blocks. Also, notify on resizes.

Acked-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: xen-devel@lists.xenproject.org
Cc: haxm-team@intel.com
Cc: Paul Durrant <paul@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Wenchao Wang <wenchao.wang@intel.com>
Cc: Colin Xu <colin.xu@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-3-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
David Hildenbrand
082851a3af util: vfio-helpers: Factor out and fix processing of existing ram blocks
Factor it out into common code when a new notifier is registered, just
as done with the memory region notifier. This keeps logic about how to
process existing ram blocks at a central place.

Just like when adding a new ram block, we have to register the max_length.
Ram blocks are only "fake resized". All memory (max_length) is mapped.

Print the warning from inside qemu_vfio_ram_block_added().

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-2-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
Thomas Huth
ee86213aa3 Do not include exec/address-spaces.h if it's not really necessary
Stop including exec/address-spaces.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-5-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:51 +02:00
Thomas Huth
2068cabd3f Do not include cpu.h if it's not really necessary
Stop including cpu.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-4-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:51 +02:00
Thomas Huth
4c386f8064 Do not include sysemu/sysemu.h if it's not really necessary
Stop including sysemu/sysemu.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-2-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:50 +02:00
Peter Maydell
56b89f4558 * add --enable/--disable-libgio to configure (Denis)
* small fixes (Pavel, myself)
 * fuzzing update (Alexander)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmBQ+U4UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNAuAf8DO6soVd8Mtr+a/acTzkoquNfoZPZ
 Xyfi8kvkSfhcPnUObuTfqalzOiP2Gqlddqvtzkh86CGNriaGFc2Wutd708/84GDe
 fh4NmA9aYieo4sn/3PpZOjoqwO4FtV7yAHijRkgA9aYJnG6ijDByup6FCHqTX42z
 jKrHa0ldk41Klj9Z03/yJmIcXTACg1/2fRn2h4W6MVRpbWw4CCwdftA5Id+x0lmh
 JrKsRrdokt4kZG2nIXLJF/eI9QRQMVh1fB5kY9YiG8kHEjMC85IN+YFuDbD8nonp
 PN1DMsTz3Kl/BgnDMeio945TeaqhW3o8jRwd4Ys9K0hRGNrKdPGaiTS6lw==
 =RPSp
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* add --enable/--disable-libgio to configure (Denis)
* small fixes (Pavel, myself)
* fuzzing update (Alexander)

# gpg: Signature made Tue 16 Mar 2021 18:30:38 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream:
  qemu-timer: allow freeing a NULL timer
  hw/i8254: fix vmstate load
  scsi: fix sense code for EREMOTEIO
  Revert "accel: kvm: Add aligment assert for kvm_log_clear_one_slot"
  configure: add option to explicitly enable/disable libgio
  fuzz: move some DMA hooks
  fuzz: configure a sparse-mem device, by default
  memory: add a sparse memory device for fuzzing
  fuzz: add a am53c974 generic-fuzzer config
  fuzz: add instructions for building reproducers
  fuzz: add a script to build reproducers
  fuzz: don't leave orphan llvm-symbolizers around
  fuzz: fix the pro100 generic-fuzzer config
  MAINTAINERS: Cover fuzzer reproducer tests within 'Device Fuzzing'
  tests/qtest: Only run fuzz-virtio-scsi when virtio-scsi is available
  tests/qtest: Only run fuzz-megasas-test if megasas device is available

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-17 22:18:54 +00:00
Alexander Bulekov
7cac7fea70 fuzz: move some DMA hooks
For the sparse-mem device, we want the fuzzer to populate entire DMA
reads from sparse-mem, rather than hooking into the individual MMIO
memory_region_dispatch_read operations. Otherwise, the fuzzer will treat
each sequential read separately (and populate it with a separate
pattern). Work around this by rearranging some DMA hooks. Since the
fuzzer has it's own logic to skip accidentally writing to MMIO regions,
we can call the DMA cb, outside the flatview_translate loop.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-16 14:30:30 -04:00
David Hildenbrand
25459eb762 exec: Get rid of phys_mem_set_alloc()
As the last user is gone, we can get rid of phys_mem_set_alloc() and
simplify.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210303130916.22553-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-03-15 11:01:23 +01:00
Alex Bennée
c0ae396a81 accel/tcg: move CF_CLUSTER calculation to curr_cflags
There is nothing special about this compile flag that doesn't mean we
can't just compute it with curr_cflags() which we should be using when
building a new set.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210224165811.11567-3-alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-03-06 11:50:50 -08:00
Jagannathan Raman
44a4ff31c0 memory: alloc RAM from file at offset
Allow RAM MemoryRegion to be created from an offset in a file, instead
of allocating at offset of 0 by default. This is needed to synchronize
RAM between QEMU & remote process.

Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 609996697ad8617e3b01df38accc5c208c24d74e.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-02-09 20:53:56 +00:00
Alexander Bulekov
fc1c8344e6 fuzz: ignore address_space_map is_write flag
We passed an is_write flag to the fuzz_dma_read_cb function to
differentiate between the mapped DMA regions that need to be populated
with fuzzed data, and those that don't. We simply passed through the
address_space_map is_write parameter. The goal was to cut down on
unnecessarily populating mapped DMA regions, when they are not read
from.

Unfortunately, nothing precludes code from reading from regions mapped
with is_write=true. For example, see:
https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg04729.html

This patch removes the is_write parameter to fuzz_dma_read_cb. As a
result, we will fill all mapped DMA regions with fuzzed data, ignoring
the specified transfer direction.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20210120060255.558535-1-alxndr@bu.edu>
2021-02-08 14:43:54 +01:00
Claudio Fontana
7827168471 cpu: tcg_ops: move to tcg-cpu-ops.h, keep a pointer in CPUClass
we cannot in principle make the TCG Operations field definitions
conditional on CONFIG_TCG in code that is included by both common_ss
and specific_ss modules.

Therefore, what we can do safely to restrict the TCG fields to TCG-only
builds, is to move all tcg cpu operations into a separate header file,
which is only included by TCG, target-specific code.

This leaves just a NULL pointer in the cpu.h for the non-TCG builds.

This also tidies up the code in all targets a bit, having all TCG cpu
operations neatly contained by a dedicated data struct.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210204163931.7358-16-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-02-05 10:24:15 -10:00
Claudio Fontana
c73bdb35a9 cpu: move debug_check_watchpoint to tcg_ops
commit 568496c0c0 ("cpu: Add callback to check architectural") and
commit 3826121d92 ("target-arm: Implement checking of fired")
introduced an ARM-specific hack for cpu_check_watchpoint.

Make debug_check_watchpoint optional, and move it to tcg_ops.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210204163931.7358-15-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-02-05 10:24:14 -10:00