Commit Graph

103 Commits

Author SHA1 Message Date
Markus Armbruster 7a309cc95b qom: Change object_get_canonical_path_component() not to malloc
object_get_canonical_path_component() returns a malloced copy of a
property name on success, null on failure.

19 of its 25 callers immediately free the returned copy.

Change object_get_canonical_path_component() to return the property
name directly.  Since modifying the name would be wrong, adjust the
return type to const char *.

Drop the free from the 19 callers become simpler, add the g_strdup()
to the other six.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200714160202.3121879-4-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
2020-07-21 16:23:43 +02:00
Roman Kagan c56ee92fcb block: consolidate blocksize properties consistency checks
Several block device properties related to blocksize configuration must
be in certain relationship WRT each other: physical block must be no
smaller than logical block; min_io_size, opt_io_size, and
discard_granularity must be a multiple of a logical block.

To ensure these requirements are met, add corresponding consistency
checks to blkconf_blocksizes, adjusting its signature to communicate
possible error to the caller.  Also remove the now redundant consistency
checks from the specific devices.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20200528225516.1676602-3-rvkagan@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 1c0c2163aa hw/block/nvme: verify msix_init_exclusive_bar() return value
Pass an Error to msix_init_exclusive_bar() and check it.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Message-Id: <20200609190333.59390-23-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 6a25a4b42e hw/block/nvme: add msix_qsize parameter
Decouple the requested maximum number of ioqpairs (param max_ioqpairs)
from the number of MSI-X interrupt vectors by introducing a new
msix_qsize parameter and initialize MSI-X with that. This allows
emulating a device that has fewer vectors than I/O queue pairs and also
allows more than 2048 queue pairs. To keep the device behaving as
previously, use a msix_qsize default of 65 (default max_ioqpairs + 1).

This decoupling was actually suggested by Maxim some time ago in a
slightly different context, so adding a Suggested-by.

Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Message-Id: <20200609190333.59390-22-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Philippe Mathieu-Daudé fbf2e5375e hw/block/nvme: Verify msix_vector_use() returned value
msix_vector_use() returns -EINVAL on error. Assert it won't.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200609190333.59390-21-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 945cb8f4c2 hw/block/nvme: factor out controller identify setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-20-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 0c35ad46b6 hw/block/nvme: do cmb/pmr init as part of pci init
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200609190333.59390-19-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 37712e00b1 hw/block/nvme: factor out pmr setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200609190333.59390-18-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 51ec094d40 hw/block/nvme: factor out cmb setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-17-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen c3f5526d22 hw/block/nvme: factor out pci setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-16-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen d634d74229 hw/block/nvme: factor out namespace setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky  <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-15-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 3adee1c2d3 hw/block/nvme: add namespace helpers
Introduce some small helpers to make the next patches easier on the eye.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-14-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 90f4511543 hw/block/nvme: factor out block backend setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-13-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen a17f50188b hw/block/nvme: factor out device state setup
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-12-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 54000c66f0 hw/block/nvme: factor out property/constraint checks
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-11-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen e1731e816a hw/block/nvme: remove redundant cmbloc/cmbsz members
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-10-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen dce22c8646 hw/block/nvme: add max_ioqpairs device parameter
The num_queues device paramater has a slightly confusing meaning because
it accounts for the admin queue pair which is not really optional.
Secondly, it is really a maximum value of queues allowed.

Add a new max_ioqpairs parameter that only accounts for I/O queue pairs,
but keep num_queues for compatibility.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-9-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen ca247d3509 hw/block/nvme: fix pin-based interrupt behavior
First, since the device only supports MSI-X or pin-based interrupt, if
MSI-X is not enabled, it should not accept interrupt vectors different
from 0 when creating completion queues.

Secondly, the irq_status NvmeCtrl member is meant to be compared to the
INTMS register, so it should only be 32 bits wide. And it is really only
useful when used with multi-message MSI.

Third, since we do not force a 1-to-1 correspondence between cqid and
interrupt vector, the irq_status register should not have bits set
according to cqid, but according to the associated interrupt vector.

Fix these issues, but keep irq_status available so we can easily support
multi-message MSI down the line.

Fixes: 5e9aa92eb1 ("hw/block: Fix pin-based interrupt behaviour of NVMe")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-8-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen b4529c5c3a hw/block/nvme: refactor nvme_addr_read
Pull the controller memory buffer check to its own function. The check
will be used on its own in later patches.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-7-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 3e829fd438 hw/block/nvme: use constants in identify
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-6-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 1065abfbf1 hw/block/nvme: move device parameters to separate struct
Move device configuration parameters to separate struct to make it
explicit what is configurable and what is set internally.

Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200609190333.59390-5-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 4920786ee6 hw/block/nvme: remove superfluous breaks
These break statements was left over when commit 3036a626e9 ("nvme:
add Get/Set Feature Timestamp support") was merged.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-4-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:40 +02:00
Klaus Jensen 6f4ee2e9aa hw/block/nvme: rename trace events to pci_nvme
Change the prefix of all nvme device related trace events to 'pci_nvme'
to not clash with trace events from the nvme block driver.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-3-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:39 +02:00
Klaus Jensen f7e8c23f39 hw/block/nvme: fix pci doorbell size calculation
The size of the BAR is 0x1000 (main registers) + 8 bytes for each
queue. Currently, the size of the BAR is calculated like so:

    n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);

Since the 'num_queues' parameter already accounts for the admin queue,
this should in any case not need to be incremented by one. Also, the
size should be initialized to (0x1000).

    n->reg_size = pow2ceil(0x1000 + 2 * n->num_queues * 4);

This, with the default value of num_queues (64), we will set aside room
for 1 admin queue and 63 I/O queues (4 bytes per doorbell, 2 doorbells
per queue).

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20200609190333.59390-2-its@irrelevant.dk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17 14:53:39 +02:00
Philippe Mathieu-Daudé bc2a2364b8 hw/block: Let the NVMe emulated device be target-agnostic
Now than the non-target specific memory_region_msync() function
is available, use it to make this device target-agnostic.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200508062456.23344-4-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-06-05 09:54:48 +01:00
Markus Armbruster 40c2281cc3 Drop more @errp parameters after previous commit
Several functions can't fail anymore: ich9_pm_add_properties(),
device_add_bootindex_property(), ppc_compat_add_property(),
spapr_caps_add_properties(), PropertyInfo.create().  Drop their @errp
parameter.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-16-armbru@redhat.com>
2020-05-15 07:08:14 +02:00
Andrzej Jakowski 6cf9413229 nvme: introduce PMR support from NVMe 1.4 spec
This patch introduces support for PMR that has been defined as part of NVMe 1.4
spec. User can now specify a pmrdev option that should point to HostMemoryBackend.
pmrdev memory region will subsequently be exposed as PCI BAR 2 in emulated NVMe
device. Guest OS can perform mmio read and writes to the PMR region that will stay
persistent across system reboot.

Signed-off-by: Andrzej Jakowski <andrzej.jakowski@linux.intel.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200330164656.9348-1-andrzej.jakowski@linux.intel.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Marc-André Lureau 4f67d30b5e qdev: set properties with device_class_set_props()
The following patch will need to handle properties registration during
class_init time. Let's use a device_class_set_props() setter.

spatch --macro-file scripts/cocci-macro-file.h  --sp-file
./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place
--dir .

@@
typedef DeviceClass;
DeviceClass *d;
expression val;
@@
- d->props = val
+ device_class_set_props(d, val)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 20:59:15 +01:00
Markus Armbruster a27bd6c779 Include hw/qdev-properties.h less
In my "build everything" tree, changing hw/qdev-properties.h triggers
a recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

Many places including hw/qdev-properties.h (directly or via hw/qdev.h)
actually need only hw/qdev-core.h.  Include hw/qdev-core.h there
instead.

hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h
and hw/qdev-properties.h, which in turn includes hw/qdev-core.h.
Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h.

While there, delete a few superfluous inclusions of hw/qdev-core.h.

Touching hw/qdev-properties.h now recompiles some 1200 objects.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster 650d103d3e Include hw/hw.h exactly where needed
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).

The previous commits have left only the declaration of hw_error() in
hw/hw.h.  This permits dropping most of its inclusions.  Touching it
now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster d645427057 Include migration/vmstate.h less
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

hw/hw.h supposedly includes it for convenience.  Several other headers
include it just to get VMStateDescription.  The previous commit made
that unnecessary.

Include migration/vmstate.h only where it's still needed.  Touching it
now recompiles only some 1600 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Klaus Birkelund Jensen 1cc354ac98 nvme: do not advertise support for unsupported arbitration mechanism
The device mistakenly reports that the Weighted Round Robin with Urgent
Priority Class arbitration mechanism is supported.

It is not.

Signed-off-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
Message-id: 20190606092530.14206-1-klaus@birkelund.eu
Acked-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-24 15:53:01 +02:00
Markus Armbruster 0b8fa32f55 Include qemu/module.h where needed, drop it from qemu-common.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-4-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c
hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c;
ui/cocoa.m fixed up]
2019-06-12 13:18:33 +02:00
Kenneth Heitke 3036a626e9 nvme: add Get/Set Feature Timestamp support
Signed-off-by: Kenneth Heitke <kenneth.heitke@intel.com>
Reviewed-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04 15:22:09 +02:00
Klaus Birkelund Jensen 25349e8250 nvme: fix copy direction in DMA reads going to CMB
`nvme_dma_read_prp` erronously used `qemu_iovec_*to*_buf` instead of
`qemu_iovec_*from*_buf` when the request involved the controller memory
buffer.

Signed-off-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
Reviewed-by: Kenneth Heitke <kenneth.heitke@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-05-20 17:08:56 +02:00
Keith Busch 9d6459d21a nvme: fix write zeroes offset and count
The implementation used blocks units rather than the expected bytes.

Fixes: c03e7ef12a ("nvme: Implement Write Zeroes")
Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Li Qiang a3d25ddd6f nvme: use pci_dev directly in nvme_realize
There is no need to make another reference.

Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190120055558.32984-4-liq3ea@163.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-01-31 00:38:19 +01:00
Li Qiang 2410e133ec nvme: ensure the num_queues is not zero
When it is zero, it causes segv.
Using following command:

"-drive file=//home/test/test1.img,if=none,id=id0
-device nvme,drive=id0,serial=test,num_queues=0"
causes following Backtrack:

Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe9735700 (LWP 30952)]
0x0000555555a7a77c in nvme_start_ctrl (n=0x5555577473f0) at hw/block/nvme.c:825
825	    if (unlikely(n->cq[0])) {
(gdb) bt
0  0x0000555555a7a77c in nvme_start_ctrl (n=0x5555577473f0)
    at hw/block/nvme.c:825
1  0x0000555555a7af7f in nvme_write_bar (n=0x5555577473f0, offset=20,
    data=4587521, size=4) at hw/block/nvme.c:969
2  0x0000555555a7b81a in nvme_mmio_write (opaque=0x5555577473f0, addr=20,
    data=4587521, size=4) at hw/block/nvme.c:1163
3  0x0000555555869236 in memory_region_write_accessor (mr=0x555557747cd0,
    addr=20, value=0x7fffe97320f8, size=4, shift=0, mask=4294967295, attrs=...)
    at /home/test/qemu1/qemu/memory.c:502
4  0x0000555555869446 in access_with_adjusted_size (addr=20,
    value=0x7fffe97320f8, size=4, access_size_min=2, access_size_max=8,
    access_fn=0x55555586914d <memory_region_write_accessor>,
    mr=0x555557747cd0, attrs=...) at /home/test/qemu1/qemu/memory.c:568
5  0x000055555586c479 in memory_region_dispatch_write (mr=0x555557747cd0,
    addr=20, data=4587521, size=4, attrs=...)
    at /home/test/qemu1/qemu/memory.c:1499
6  0x00005555558030af in flatview_write_continue (fv=0x7fffe0061130,
    addr=4273930260, attrs=..., buf=0x7ffff7ff0028 "\001", len=4, addr1=20,
    l=4, mr=0x555557747cd0) at /home/test/qemu1/qemu/exec.c:3234
7  0x00005555558031f9 in flatview_write (fv=0x7fffe0061130, addr=4273930260,
    attrs=..., buf=0x7ffff7ff0028 "\001", len=4)
    at /home/test/qemu1/qemu/exec.c:3273
8  0x00005555558034ff in address_space_write (
---Type <return> to continue, or q <return> to quit---
    as=0x555556758480 <address_space_memory>, addr=4273930260, attrs=...,
    buf=0x7ffff7ff0028 "\001", len=4) at /home/test/qemu1/qemu/exec.c:3363
9  0x0000555555803550 in address_space_rw (
    as=0x555556758480 <address_space_memory>, addr=4273930260, attrs=...,
    buf=0x7ffff7ff0028 "\001", len=4, is_write=true)
    at /home/test/qemu1/qemu/exec.c:3374
10 0x00005555558884a1 in kvm_cpu_exec (cpu=0x555556920e40)
    at /home/test/qemu1/qemu/accel/kvm/kvm-all.c:2031
11 0x000055555584cd9d in qemu_kvm_cpu_thread_fn (arg=0x555556920e40)
    at /home/test/qemu1/qemu/cpus.c:1281
12 0x0000555555dbaf6d in qemu_thread_start (args=0x5555569438a0)
    at util/qemu-thread-posix.c:502
13 0x00007ffff5dc86db in start_thread (arg=0x7fffe9735700)
    at pthread_create.c:463
14 0x00007ffff5af188f in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190120055558.32984-3-liq3ea@163.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-01-31 00:38:19 +01:00
Li Qiang 08db59e188 nvme: use TYPE_NVME instead of constant string
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190120055558.32984-2-liq3ea@163.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-01-31 00:38:19 +01:00
Keith Busch 6da021815e nvme: Fix spurious interrupts
The code had asserted an interrupt every time it was requested to check
for new completion queue entries.This can result in spurious interrupts
seen by the guest OS.

Fix this by asserting an interrupt only if there are un-acknowledged
completion queue entries available.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-27 12:59:00 +01:00
Logan Gunthorpe ad3a7e4555 nvme: fix bug with PCI IRQ pins on teardown
When the submission and completion queues are being torn down
the IRQ will be asserted for the completion queue when the
submsission queue is deleted. Then when the completion queue
is deleted it stays asserted. Thus, on systems that do
not use MSI, no further interrupts can be triggered on the host.

Linux sees this as a long delay when unbinding the nvme device.
Eventually the interrupt timeout occurs and it continues.

To fix this we ensure we deassert the IRQ for a CQ when it is
deleted.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22 19:37:31 +01:00
Paolo Bonzini 71a86ddece nvme: fix CMB endianness confusion
The CMB is marked as DEVICE_LITTLE_ENDIAN, so the data must be
read/written as if it was little-endian output (in the case of
big endian, we get two swaps, one in the memory core and one
in nvme.c).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22 19:37:31 +01:00
Kevin Wolf 2067d39e5e Revert "nvme: fix oob access issue(CVE-2018-16847)"
This reverts commit 5e3c0220d7.
We have a better fix commited for this now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22 16:43:52 +01:00
Paolo Bonzini 87ad860c62 nvme: fix out-of-bounds access to the CMB
Because the CMB BAR has a min_access_size of 2, if you read the last
byte it will try to memcpy *2* bytes from n->cmbuf, causing an off-by-one
error.  This is CVE-2018-16847.

Another way to fix this might be to register the CMB as a RAM memory
region, which would also be more efficient.  However, that might be a
change for big-endian machines; I didn't think this through and I don't
know how real hardware works.  Add a basic testcase for the CMB in case
somebody does this change later on.

Cc: Keith Busch <keith.busch@intel.com>
Cc: qemu-block@nongnu.org
Reported-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Tested-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22 16:43:52 +01:00
Igor Druzhinin 6bf7463615 nvme: call blk_drain in NVMe reset code to avoid lockups
When blk_flush called in NVMe reset path S/C queues are already freed
which means that re-entering AIO handling loop having some IO requests
unfinished will lockup or crash as their SG structures being potentially
reused. Call blk_drain before freeing the queues to avoid this nasty
scenario.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22 16:43:52 +01:00
Li Qiang 5e3c0220d7 nvme: fix oob access issue(CVE-2018-16847)
Currently, the nvme_cmb_ops mr doesn't check the addr and size.
This can lead an oob access issue. This is triggerable in the guest.
Add check to avoid this issue.

Fixes CVE-2018-16847.

Reported-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-19 12:51:16 +01:00
Li Qiang a883d6a0bc nvme: free cmbuf in nvme_exit
This avoid a memory leak in unhotplug nvme device.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-12 17:46:57 +01:00
Li Qiang 20faf0f5f8 nvme: don't unref ctrl_mem when device unrealized
Currently, when hotplug/unhotplug nvme device, it will cause an
assert in object.c. Following is the backtrack:

ERROR:qom/object.c:981:object_unref: assertion failed: (obj->ref > 0)

Thread 2 "qemu-system-x86" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffcbd32700 (LWP 18844)]
0x00007fffdb9e4fff in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
/lib/x86_64-linux-gnu/libglib-2.0.so.0
/lib/x86_64-linux-gnu/libglib-2.0.so.0
qom/object.c:981
/home/liqiang02/qemu-upstream/qemu/memory.c:1732
/home/liqiang02/qemu-upstream/qemu/memory.c:285
util/qemu-thread-posix.c:504
/lib/x86_64-linux-gnu/libpthread.so.0

This is caused by memory_region_unref in nvme_exit.

Remove it to make the PCIdevice refcount correct.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-12 17:46:57 +01:00
Kevin Wolf 572023f7b2 block: Remove deprecated -drive option serial
This reinstates commit b008326744,
which was temporarily reverted for the 3.0 release so that libvirt gets
some extra time to update their command lines.

The -drive option serial was deprecated in QEMU 2.10. It's time to
remove it.

Tests need to be updated to set the serial number with -global instead
of using the -drive option.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-08-15 12:50:39 +02:00
Cornelia Huck 44e8b4689c Revert "block: Remove deprecated -drive option serial"
This reverts commit b008326744.

Hold off removing this for one more QEMU release (current libvirt
release still uses it.)

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-07-10 14:36:11 +02:00