|
|
|
HXCOMM Use DEFHEADING() to define headings in both help text and rST.
|
|
|
|
HXCOMM Text between SRST and ERST is copied to the rST version and
|
|
|
|
HXCOMM discarded from C version.
|
|
|
|
HXCOMM DEF(option, HAS_ARG/0, opt_enum, opt_help, arch_mask) is used to
|
|
|
|
HXCOMM construct option structures, enums and help message for specified
|
|
|
|
HXCOMM architectures.
|
|
|
|
HXCOMM HXCOMM can be used for comments, discarded from both rST and C.
|
|
|
|
|
|
|
|
DEFHEADING(Standard options:)
|
|
|
|
|
|
|
|
DEF("help", 0, QEMU_OPTION_h,
|
|
|
|
"-h or -help display this help and exit\n", QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-h``
|
|
|
|
Display help and exit
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("version", 0, QEMU_OPTION_version,
|
|
|
|
"-version display version information and exit\n", QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-version``
|
|
|
|
Display version information and exit
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
|
|
|
|
"-machine [type=]name[,prop[=value][,...]]\n"
|
|
|
|
" selects emulated machine ('-machine help' for list)\n"
|
|
|
|
" property accel=accel1[:accel2[:...]] selects accelerator\n"
|
|
|
|
" supported accelerators are kvm, xen, hax, hvf, nvmm, whpx or tcg (default: tcg)\n"
|
|
|
|
" vmport=on|off|auto controls emulation of vmport (default: auto)\n"
|
|
|
|
" dump-guest-core=on|off include guest memory in a core dump (default=on)\n"
|
|
|
|
" mem-merge=on|off controls memory merge support (default: on)\n"
|
|
|
|
" aes-key-wrap=on|off controls support for AES key wrapping (default=on)\n"
|
|
|
|
" dea-key-wrap=on|off controls support for DEA key wrapping (default=on)\n"
|
|
|
|
" suppress-vmdesc=on|off disables self-describing migration (default=off)\n"
|
|
|
|
" nvdimm=on|off controls NVDIMM support (default=off)\n"
|
|
|
|
" memory-encryption=@var{} memory encryption object to use (default=none)\n"
|
|
|
|
" hmat=on|off controls ACPI HMAT support (default=off)\n"
|
|
|
|
" memory-backend='backend-id' specifies explicitly provided backend for main RAM (default=none)\n",
|
|
|
|
QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-machine [type=]name[,prop=value[,...]]``
|
|
|
|
Select the emulated machine by name. Use ``-machine help`` to list
|
|
|
|
available machines.
|
|
|
|
|
|
|
|
For architectures which aim to support live migration compatibility
|
|
|
|
across releases, each release will introduce a new versioned machine
|
|
|
|
type. For example, the 2.8.0 release introduced machine types
|
|
|
|
"pc-i440fx-2.8" and "pc-q35-2.8" for the x86\_64/i686 architectures.
|
|
|
|
|
|
|
|
To allow live migration of guests from QEMU version 2.8.0, to QEMU
|
|
|
|
version 2.9.0, the 2.9.0 version must support the "pc-i440fx-2.8"
|
|
|
|
and "pc-q35-2.8" machines too. To allow users live migrating VMs to
|
|
|
|
skip multiple intermediate releases when upgrading, new releases of
|
|
|
|
QEMU will support machine types from many previous versions.
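For example, a guest that is expected to be live migrated to hosts running newer QEMU releases can be pinned to one of the versioned types named above (shown only as a sketch; any versioned type supported by both source and destination works):

::

-machine pc-i440fx-2.8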
|
|
|
|
|
|
|
|
Supported machine properties are:
|
|
|
|
|
|
|
|
``accel=accels1[:accels2[:...]]``
|
|
|
|
This is used to enable an accelerator. Depending on the target
|
|
|
|
architecture, kvm, xen, hax, hvf, nvmm, whpx or tcg can be available.
|
|
|
|
By default, tcg is used. If there is more than one accelerator
|
|
|
|
specified, the next one is used if the previous one fails to
|
|
|
|
initialize.
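For example, to try KVM first and fall back to TCG if KVM is not available, the accelerators can be given as a colon-separated list:

::

-machine q35,accel=kvm:tcg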
|
|
|
|
|
|
|
|
``vmport=on|off|auto``
|
|
|
|
Enables emulation of the VMware I/O port, for vmmouse etc. ``auto``
selects the value based on the accelerator: for accel=xen the default
is off, otherwise the default is on.
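For example, VMware port emulation can be switched off explicitly on a PC machine:

::

-machine pc,vmport=off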
|
|
|
|
|
|
|
|
``dump-guest-core=on|off``
|
|
|
|
Include guest memory in a core dump. The default is on.
|
|
|
|
|
|
|
|
``mem-merge=on|off``
|
|
|
|
Enables or disables memory merge support. This feature, when
|
|
|
|
supported by the host, de-duplicates identical memory pages
|
|
|
|
among VM instances (enabled by default).
|
|
|
|
|
|
|
|
``aes-key-wrap=on|off``
|
|
|
|
Enables or disables AES key wrapping support on s390-ccw hosts.
|
|
|
|
This feature controls whether AES wrapping keys will be created
|
|
|
|
to allow execution of AES cryptographic functions. The default
|
|
|
|
is on.
|
|
|
|
|
|
|
|
``dea-key-wrap=on|off``
|
|
|
|
Enables or disables DEA key wrapping support on s390-ccw hosts.
|
|
|
|
This feature controls whether DEA wrapping keys will be created
|
|
|
|
to allow execution of DEA cryptographic functions. The default
|
|
|
|
is on.
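For example, both wrapping-key facilities can be disabled explicitly on an s390x guest (a sketch assuming the ``s390-ccw-virtio`` machine type):

::

-machine s390-ccw-virtio,aes-key-wrap=off,dea-key-wrap=off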
|
|
|
|
|
|
|
|
``nvdimm=on|off``
|
|
|
|
Enables or disables NVDIMM support. The default is off.
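For example, a guest with NVDIMM support enabled and one emulated NVDIMM backed by a file could be started as follows (the backing path and sizes are illustrative only):

::

-machine pc,nvdimm=on -m 8G,maxmem=100G,slots=100 \
-object memory-backend-file,id=mem1,share=on,mem-path=/tmp/nvdimm1,size=10G \
-device nvdimm,memdev=mem1,id=nv1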
|
|
|
|
|
|
|
|
``memory-encryption=``
|
|
|
|
Memory encryption object to use. The default is none.
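For example, on an AMD SEV capable host the property takes the id of a previously defined encryption object (a sketch; the ``sev-guest`` object and its ``cbitpos``/``reduced-phys-bits`` values are host dependent):

::

-object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1 \
-machine q35,memory-encryption=sev0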
|
|
|
|
|
|
|
|
``hmat=on|off``
|
|
|
|
Enables or disables ACPI Heterogeneous Memory Attribute Table
|
|
|
|
(HMAT) support. The default is off.
|
|
|
|
|
|
|
|
``memory-backend='id'``
|
|
|
|
An alternative to legacy ``-mem-path`` and ``mem-prealloc`` options.
|
|
|
|
Allows a memory backend to be used as main RAM.
|
|
|
|
|
|
|
|
For example:
|
|
|
|
::
|
|
|
|
|
|
|
|
-object memory-backend-file,id=pc.ram,size=512M,mem-path=/hugetlbfs,prealloc=on,share=on
|
|
|
|
-machine memory-backend=pc.ram
|
|
|
|
-m 512M
|
|
|
|
|
|
|
|
Migration compatibility note:
|
|
|
|
|
|
|
|
* as the backend id one shall use the value of 'default-ram-id', advertised by
the machine type (available via the ``query-machines`` QMP command), if migration
to/from old QEMU (<5.0) is expected.
|
|
|
|
* for machine types 4.0 and older, the user shall
use the ``x-use-canonical-path-for-ramblock-id=off`` backend option
if migration to/from old QEMU (<5.0) is expected.
|
|
|
|
|
|
|
|
For example:
|
|
|
|
::
|
|
|
|
|
|
|
|
-object memory-backend-ram,id=pc.ram,size=512M,x-use-canonical-path-for-ramblock-id=off
|
|
|
|
-machine memory-backend=pc.ram
|
|
|
|
-m 512M
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("M", HAS_ARG, QEMU_OPTION_M,
|
|
|
|
" sgx-epc.0.memdev=memid,sgx-epc.0.node=numaid\n",
|
|
|
|
QEMU_ARCH_ALL)
|
|
|
|
|
|
|
|
SRST
|
|
|
|
``sgx-epc.0.memdev=memid,sgx-epc.0.node=numaid``
|
|
|
|
Define an SGX EPC section.
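For example, two EPC sections backed by ``memory-backend-epc`` objects can be assigned to NUMA node 0 (the sizes are illustrative):

::

-object memory-backend-epc,id=mem1,size=64M,prealloc=on \
-object memory-backend-epc,id=mem2,size=28M \
-M sgx-epc.0.memdev=mem1,sgx-epc.0.node=0,sgx-epc.1.memdev=mem2,sgx-epc.1.node=0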
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("cpu", HAS_ARG, QEMU_OPTION_cpu,
|
|
|
|
"-cpu cpu select CPU ('-cpu help' for list)\n", QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-cpu model``
|
|
|
|
Select CPU model (``-cpu help`` for list and additional feature
|
|
|
|
selection)
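For example, with a hardware accelerator the host CPU can be passed through, or the richest model the accelerator supports can be requested (available model names depend on the target architecture, see ``-cpu help``):

::

-cpu host
-cpu max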
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("accel", HAS_ARG, QEMU_OPTION_accel,
|
|
|
|
"-accel [accel=]accelerator[,prop[=value][,...]]\n"
|
|
|
|
" select accelerator (kvm, xen, hax, hvf, nvmm, whpx or tcg; use 'help' for a list)\n"
|
|
|
|
" igd-passthru=on|off (enable Xen integrated Intel graphics passthrough, default=off)\n"
|
|
|
|
" kernel-irqchip=on|off|split controls accelerated irqchip support (default=on)\n"
|
|
|
|
" kvm-shadow-mem=size of KVM shadow MMU in bytes\n"
|
|
|
|
" split-wx=on|off (enable TCG split w^x mapping)\n"
|
|
|
|
" tb-size=n (TCG translation block cache size)\n"
|
|
|
|
" dirty-ring-size=n (KVM dirty ring GFN count, default 0)\n"
|
|
|
|
" thread=single|multi (enable multi-threaded TCG)\n", QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-accel name[,prop=value[,...]]``
|
|
|
|
This is used to enable an accelerator. Depending on the target
|
|
|
|
architecture, kvm, xen, hax, hvf, nvmm, whpx or tcg can be available. By
|
|
|
|
default, tcg is used. If there is more than one accelerator
|
|
|
|
specified, the next one is used if the previous one fails to
|
|
|
|
initialize.
|
|
|
|
|
|
|
|
``igd-passthru=on|off``
|
|
|
|
When Xen is in use, this option controls whether Intel
|
|
|
|
integrated graphics devices can be passed through to the guest
|
|
|
|
(default=off)
|
|
|
|
|
|
|
|
``kernel-irqchip=on|off|split``
|
|
|
|
Controls KVM in-kernel irqchip support. The default is full
|
|
|
|
acceleration of the interrupt controllers. On x86, split irqchip
|
|
|
|
reduces the kernel attack surface, at a performance cost for
|
|
|
|
non-MSI interrupts. Disabling the in-kernel irqchip completely
|
|
|
|
is not recommended except for debugging purposes.
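For example, to keep KVM acceleration on x86 but move the IOAPIC/PIC emulation into userspace while leaving the local APIC in the kernel, the split irqchip can be requested:

::

-accel kvm,kernel-irqchip=split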
|
|
|
|
|
|
|
|
``kvm-shadow-mem=size``
|
|
|
|
Defines the size of the KVM shadow MMU.
|
|
|
|
|
|
|
|
``split-wx=on|off``
|
|
|
|
Controls the use of split w^x mapping for the TCG code generation
|
|
|
|
buffer. Some operating systems require this to be enabled, and in
|
|
|
|
such a case this will default on. On other operating systems, this
|
|
|
|
will default off, but one may enable this for testing or debugging.
|
|
|
|
|
|
|
|
``tb-size=n``
|
|
|
|
Controls the size (in MiB) of the TCG translation block cache.
|
|
|
|
|
|
|
|
``thread=single|multi``
|
|
|
|
Controls the number of TCG threads. When the TCG is multi-threaded,
there will be one thread per vCPU, therefore taking advantage of
|
|
|
|
additional host cores. The default is to enable multi-threading
|
|
|
|
where both the back-end and front-ends support it and no
|
|
|
|
incompatible TCG features have been enabled (e.g.
|
|
|
|
icount/replay).
|
|
|
|
|
|
|
|
``dirty-ring-size=n``
|
|
|
|
When the KVM accelerator is used, it controls the size of the per-vCPU
|
|
|
|
dirty page ring buffer (number of entries for each vCPU). It should
|
|
|
|
be a power of two, and it should be 1024 or bigger (but
still less than the maximum value that the kernel supports). 4096
could be a good initial value if you are unsure which value to pick.
Set this value to 0 to disable the feature. By default, this feature
is disabled (dirty-ring-size=0); when the dirty ring is disabled, KVM
instead records dirty pages in a bitmap.
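For example, to enable the KVM dirty ring with 4096 entries per vCPU:

::

-accel kvm,dirty-ring-size=4096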
|
|
|
|
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("smp", HAS_ARG, QEMU_OPTION_smp,
|
|
|
|
"-smp [[cpus=]n][,maxcpus=maxcpus][,sockets=sockets][,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
|
|
|
|
" set the number of initial CPUs to 'n' [default=1]\n"
|
|
|
|
" maxcpus= maximum number of total CPUs, including\n"
|
|
|
|
" offline CPUs for hotplug, etc\n"
|
|
|
|
" sockets= number of sockets on the machine board\n"
|
|
|
|
" dies= number of dies in one socket\n"
|
|
|
|
" clusters= number of clusters in one die\n"
|
|
|
|
" cores= number of cores in one cluster\n"
|
|
|
|
" threads= number of threads in one core\n"
|
|
|
|
"Note: Different machines may have different subsets of the CPU topology\n"
|
|
|
|
" parameters supported, so the actual meaning of the supported parameters\n"
|
|
|
|
" will vary accordingly. For example, for a machine type that supports a\n"
|
|
|
|
" three-level CPU hierarchy of sockets/cores/threads, the parameters will\n"
|
|
|
|
" sequentially mean as below:\n"
|
|
|
|
" sockets means the number of sockets on the machine board\n"
|
|
|
|
" cores means the number of cores in one socket\n"
|
|
|
|
" threads means the number of threads in one core\n"
|
|
|
|
" For a particular machine type board, an expected CPU topology hierarchy\n"
|
|
|
|
" can be defined through the supported sub-option. Unsupported parameters\n"
|
|
|
|
" can also be provided in addition to the sub-option, but their values\n"
|
|
|
|
" must be set as 1 in the purpose of correct parsing.\n",
|
|
|
|
QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-smp [[cpus=]n][,maxcpus=maxcpus][,sockets=sockets][,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]``
|
|
|
|
Simulate an SMP system with '\ ``n``\ ' CPUs initially present on
|
|
|
|
the machine type board. On boards supporting CPU hotplug, the optional
|
|
|
|
'\ ``maxcpus``\ ' parameter can be set to enable further CPUs to be
|
|
|
|
added at runtime. When both parameters are omitted, the maximum number
|
|
|
|
of CPUs will be calculated from the provided topology members and the
|
|
|
|
initial CPU count will match the maximum number. When only one of them
|
|
|
|
is given then the omitted one will be set to its counterpart's value.
|
|
|
|
Both parameters may be specified, but the maximum number of CPUs must
|
|
|
|
be equal to or greater than the initial CPU count. The product of the
CPU topology hierarchy must be equal to the maximum number of CPUs.
|
|
|
|
Both parameters are subject to an upper limit that is determined by
|
|
|
|
the specific machine type chosen.
|
|
|
|
|
|
|
|
To control reporting of CPU topology information, values of the topology
|
|
|
|
parameters can be specified. Machines may only support a subset of the
|
|
|
|
parameters, and different machines may support different subsets,
depending on the capabilities of the corresponding CPU targets. So
|
|
|
|
for a particular machine type board, an expected topology hierarchy can
|
|
|
|
be defined through the supported sub-option. Unsupported parameters can
|
|
|
|
also be provided in addition to the sub-option, but their values must be
set to 1 so that the topology is parsed correctly.
|
|
|
|
|
|
|
|
Either the initial CPU count, or at least one of the topology parameters
|
|
|
|
must be specified. The specified parameters must be greater than zero,
|
|
|
|
explicit configuration like "cpus=0" is not allowed. Values for any
|
|
|
|
omitted parameters will be computed from those which are given.
|
|
|
|
|
|
|
|
For example, the following sub-option defines a CPU topology hierarchy
|
|
|
|
(2 sockets totally on the machine, 2 cores per socket, 2 threads per
|
|
|
|
core) for a machine that only supports sockets/cores/threads.
|
|
|
|
Some members of the option can be omitted but their values will be
|
|
|
|
automatically computed:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
-smp 8,sockets=2,cores=2,threads=2,maxcpus=8
|
|
|
|
|
|
|
|
The following sub-option defines a CPU topology hierarchy (2 sockets
|
|
|
|
totally on the machine, 2 dies per socket, 2 cores per die, 2 threads
|
|
|
|
per core) for PC machines which support sockets/dies/cores/threads.
|
|
|
|
Some members of the option can be omitted but their values will be
|
|
|
|
automatically computed:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
-smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16
|
|
|
|
|
|
|
|
The following sub-option defines a CPU topology hierarchy (2 sockets
|
|
|
|
totally on the machine, 2 clusters per socket, 2 cores per cluster,
|
|
|
|
2 threads per core) for ARM virt machines which support sockets/clusters
|
|
|
|
/cores/threads. Some members of the option can be omitted but their values
|
|
|
|
will be automatically computed:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
-smp 16,sockets=2,clusters=2,cores=2,threads=2,maxcpus=16
|
|
|
|
|
|
|
|
Historically preference was given to the coarsest topology parameters
|
|
|
|
when computing missing values (i.e. sockets preferred over cores, which
were preferred over threads); however, this behaviour is considered
|
|
|
|
liable to change. Prior to 6.2 the preference was sockets over cores
|
|
|
|
over threads. Since 6.2 the preference is cores over sockets over threads.
|
|
|
|
|
|
|
|
For example, the following option defines a machine board with 2 sockets
|
|
|
|
of 1 core before 6.2 and 1 socket of 2 cores after 6.2:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
-smp 2
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("numa", HAS_ARG, QEMU_OPTION_numa,
|
|
|
|
"-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
|
|
|
|
"-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
|
|
|
|
"-numa dist,src=source,dst=destination,val=distance\n"
|
|
|
|
"-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n"
|
|
|
|
"-numa hmat-lb,initiator=node,target=node,hierarchy=memory|first-level|second-level|third-level,data-type=access-latency|read-latency|write-latency[,latency=lat][,bandwidth=bw]\n"
|
|
|
|
"-numa hmat-cache,node-id=node,size=size,level=level[,associativity=none|direct|complex][,policy=none|write-back|write-through][,line=size]\n",
|
|
|
|
QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=initiator]``
|
|
|
|
\
|
|
|
|
``-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=initiator]``
|
|
|
|
\
|
|
|
|
``-numa dist,src=source,dst=destination,val=distance``
|
|
|
|
\
|
|
|
|
``-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]``
|
|
|
|
\
|
|
|
|
``-numa hmat-lb,initiator=node,target=node,hierarchy=hierarchy,data-type=tpye[,latency=lat][,bandwidth=bw]``
|
|
|
|
\
|
|
|
|
``-numa hmat-cache,node-id=node,size=size,level=level[,associativity=str][,policy=str][,line=size]``
|
|
|
|
Define a NUMA node and assign RAM and VCPUs to it. Set the NUMA
|
|
|
|
distance from a source node to a destination node. Set the ACPI
|
|
|
|
Heterogeneous Memory Attributes for the given nodes.
|
|
|
|
|
|
|
|
Legacy VCPU assignment uses '\ ``cpus``\ ' option where firstcpu and
|
|
|
|
lastcpu are CPU indexes. Each '\ ``cpus``\ ' option represents a
|
|
|
|
contiguous range of CPU indexes (or a single VCPU if lastcpu is
|
|
|
|
omitted). A non-contiguous set of VCPUs can be represented by
|
|
|
|
providing multiple '\ ``cpus``\ ' options. If '\ ``cpus``\ ' is
|
|
|
|
omitted on all nodes, VCPUs are automatically split between them.
|
|
|
|
|
|
|
|
For example, the following option assigns VCPUs 0, 1, 2 and 5 to a
|
|
|
|
NUMA node:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
-numa node,cpus=0-2,cpus=5
|
|
|
|
|
|
|
|
The '\ ``cpu``\ ' option is a newer alternative to the '\ ``cpus``\ '
option which uses '\ ``socket-id|core-id|thread-id``\ ' properties to
assign CPU objects to a node using the topology layout properties of
the CPU. The set of properties is machine specific, and depends on the
machine type and '\ ``smp``\ ' options used. It can be queried with
the '\ ``hotpluggable-cpus``\ ' monitor command. The '\ ``node-id``\ '
property specifies the node to which the CPU object will be assigned;
the node must be declared with the '\ ``node``\ ' option before it is
used with the '\ ``cpu``\ ' option.
|
|
|
|
|
|
|
|
For example:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
-M pc \
|
|
|
|
-smp 1,sockets=2,maxcpus=2 \
|
|
|
|
-numa node,nodeid=0 -numa node,nodeid=1 \
|
|
|
|
-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
|
|
|
|
|
|
|
|
Legacy '\ ``mem``\ ' assigns a given RAM amount to a node (not supported
|
|
|
|
for 5.1 and newer machine types). '\ ``memdev``\ ' assigns RAM from
|
|
|
|
a given memory backend device to a node. If '\ ``mem``\ ' and
|
|
|
|
'\ ``memdev``\ ' are omitted in all nodes, RAM is split equally between them.
|
|
|
|
|
|
|
|
|
|
|
|
'\ ``mem``\ ' and '\ ``memdev``\ ' are mutually exclusive.
|
|
|
|
Furthermore, if one node uses '\ ``memdev``\ ', all of them have to
|
|
|
|
use it.
|
|
|
|
|
|
|
|
'\ ``initiator``\ ' is an additional option that points to an
|
|
|
|
initiator NUMA node that has best performance (the lowest latency or
|
|
|
|
largest bandwidth) to this NUMA node. Note that this option can be
|
|
|
|
set only when the machine property 'hmat' is set to 'on'.
|
|
|
|
|
|
|
|
The following example creates a machine with 2 NUMA nodes: node 0 has
a CPU, node 1 has only memory, and its initiator is node 0. Note that
because node 0 has a CPU, by default the initiator of node 0 is itself
and must be itself.
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
-machine hmat=on \
|
|
|
|
-m 2G,slots=2,maxmem=4G \
|
|
|
|
-object memory-backend-ram,size=1G,id=m0 \
|
|
|
|
-object memory-backend-ram,size=1G,id=m1 \
|
|
|
|
-numa node,nodeid=0,memdev=m0 \
|
|
|
|
-numa node,nodeid=1,memdev=m1,initiator=0 \
|
|
|
|
-smp 2,sockets=2,maxcpus=2 \
|
|
|
|
-numa cpu,node-id=0,socket-id=0 \
|
|
|
|
-numa cpu,node-id=0,socket-id=1
|
|
|
|
|
|
|
|
source and destination are NUMA node IDs. distance is the NUMA
|
|
|
|
distance from source to destination. The distance from a node to
|
|
|
|
itself is always 10. If any pair of nodes is given a distance, then
|
|
|
|
all pairs must be given distances. When distances are given in only
one direction for each pair of nodes, the distances in the opposite
directions are assumed to be the same. If, however, an
|
|
|
|
asymmetrical pair of distances is given for even one node pair, then
|
|
|
|
all node pairs must be provided distance values for both directions,
|
|
|
|
even when they are symmetrical. When a node is unreachable from
|
|
|
|
another node, set the pair's distance to 255.
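For example, the following sketch defines two nodes and declares a distance of 20 between them; the distance in the reverse direction is assumed to be the same:

::

-smp 2 \
-numa node,nodeid=0,cpus=0 -numa node,nodeid=1,cpus=1 \
-numa dist,src=0,dst=1,val=20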
|
|
|
|
|
|
|
|
Note that the ``-numa`` option doesn't allocate any of the specified
|
|
|
|
resources, it just assigns existing resources to NUMA nodes. This
|
|
|
|
means that one still has to use the ``-m``, ``-smp`` options to
|
|
|
|
allocate RAM and VCPUs respectively.
|
|
|
|
|
|
|
|
Use '\ ``hmat-lb``\ ' to set System Locality Latency and Bandwidth
|
|
|
|
Information between initiator and target NUMA nodes in ACPI
|
|
|
|
Heterogeneous Attribute Memory Table (HMAT). Initiator NUMA node can
|
|
|
|
create memory requests, usually it has one or more processors.
|
|
|
|
Target NUMA node contains addressable memory.
|
|
|
|
|
|
|
|
In the '\ ``hmat-lb``\ ' option, the node values are NUMA node IDs. hierarchy is
|
|
|
|
the memory hierarchy of the target NUMA node: if hierarchy is
|
|
|
|
'memory', the structure represents the memory performance; if
|
|
|
|
hierarchy is 'first-level\|second-level\|third-level', this
|
|
|
|
structure represents aggregated performance of memory side caches
|
|
|
|
for each domain. The 'data-type' value is the type of data represented by
|
|
|
|
this structure instance: if 'hierarchy' is 'memory', 'data-type' is
|
|
|
|
'access\|read\|write' latency or 'access\|read\|write' bandwidth of
|
|
|
|
the target memory; if 'hierarchy' is
|
|
|
|
'first-level\|second-level\|third-level', 'data-type' is
|
|
|
|
'access\|read\|write' hit latency or 'access\|read\|write' hit
|
|
|
|
bandwidth of the target memory side cache.
|
|
|
|
|
|
|
|
lat is the latency value in nanoseconds. bw is the bandwidth value;
the possible values and units are NUM[M\|G\|T], meaning that the
bandwidth value is NUM bytes per second (or MB/s, GB/s or TB/s
depending on the suffix used). Note that if a latency or bandwidth
value is 0, the corresponding latency or bandwidth information is not
provided.
|
|
|
|
|
|
|
|
In the '\ ``hmat-cache``\ ' option, node-id is the ID of the NUMA node
the memory belongs to. size is the size of the memory side cache in
bytes. level is the cache level described in this structure; note that
cache level 0 should not be used with the '\ ``hmat-cache``\ ' option.
associativity is the cache associativity, the possible values being
'none/direct(direct-mapped)/complex(complex cache indexing)'. policy
is the write policy. line is the cache line size in bytes.
|
|
|
|
|
|
|
|
For example, the following options describe 2 NUMA nodes. Node 0 has
2 CPUs and RAM, node 1 has only RAM. The processors in node 0 access
memory in node 0 with an access latency of 5 nanoseconds and an access
bandwidth of 200 MB/s; the processors in node 0 access memory in node 1
with an access latency of 10 nanoseconds and an access bandwidth of
100 MB/s. As for the memory side cache information, NUMA nodes 0 and 1
both have one level of memory cache with a size of 10KB, a write-back
policy and a cache line size of 8 bytes:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
-machine hmat=on \
|
|
|
|
-m 2G \
|
|
|
|
-object memory-backend-ram,size=1G,id=m0 \
|
|
|
|
-object memory-backend-ram,size=1G,id=m1 \
|
|
|
|
-smp 2,sockets=2,maxcpus=2 \
|
|
|
|
-numa node,nodeid=0,memdev=m0 \
|
|
|
|
-numa node,nodeid=1,memdev=m1,initiator=0 \
|
|
|
|
-numa cpu,node-id=0,socket-id=0 \
|
|
|
|
-numa cpu,node-id=0,socket-id=1 \
|
|
|
|
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=5 \
|
|
|
|
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=200M \
|
|
|
|
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=10 \
|
|
|
|
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=100M \
|
|
|
|
-numa hmat-cache,node-id=0,size=10K,level=1,associativity=direct,policy=write-back,line=8 \
|
|
|
|
-numa hmat-cache,node-id=1,size=10K,level=1,associativity=direct,policy=write-back,line=8
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("cxl-fixed-memory-window", HAS_ARG, QEMU_OPTION_cxl_fixed_memory_window,
|
|
|
|
"-cxl-fixed-memory-window targets.0=firsttarget,targets.1=secondtarget,size=size[,interleave-granularity=granularity]\n",
|
|
|
|
QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-cxl-fixed-memory-window targets.0=firsttarget,targets.1=secondtarget,size=size[,interleave-granularity=granularity]``
|
|
|
|
Define a CXL Fixed Memory Window (CFMW).
|
|
|
|
|
|
|
|
Described in the CXL 2.0 ECN: CEDT CFMWS & QTG _DSM.
|
|
|
|
|
|
|
|
They are regions of Host Physical Addresses (HPA) on a system which
|
|
|
|
may be interleaved across one or more CXL host bridges. The system
|
|
|
|
software will assign particular devices into these windows and
|
|
|
|
configure the downstream Host-managed Device Memory (HDM) decoders
|
|
|
|
in root ports, switch ports and devices appropriately to meet the
|
|
|
|
interleave requirements before enabling the memory devices.
|
|
|
|
|
|
|
|
``targets.X=firsttarget`` provides the mapping to CXL host bridges
|
|
|
|
which may be identified by the id provided in the -device entry.
|
|
|
|
Multiple entries are needed to specify all the targets when
|
|
|
|
the fixed memory window represents interleaved memory. X is the
|
|
|
|
target index from 0.
|
|
|
|
|
|
|
|
``size=size`` sets the size of the CFMW. This must be a multiple of
|
|
|
|
256MiB. The region will be aligned to 256MiB but the location is
|
|
|
|
platform and configuration dependent.
|
|
|
|
|
|
|
|
``interleave-granularity=granularity`` sets the granularity of
|
|
|
|
interleave. The default is 256KiB. Only 256KiB, 512KiB, 1024KiB, 2048KiB,
4096KiB, 8192KiB and 16384KiB granularities are supported.
|
|
|
|
|
|
|
|
Example:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
-cxl-fixed-memory-window targets.0=cxl.0,targets.1=cxl.1,size=128G,interleave-granularity=512k
|
|
|
|
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("add-fd", HAS_ARG, QEMU_OPTION_add_fd,
|
|
|
|
"-add-fd fd=fd,set=set[,opaque=opaque]\n"
|
|
|
|
" Add 'fd' to fd 'set'\n", QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-add-fd fd=fd,set=set[,opaque=opaque]``
|
|
|
|
Add a file descriptor to an fd set. Valid options are:
|
|
|
|
|
|
|
|
``fd=fd``
|
|
|
|
This option defines the file descriptor of which a duplicate is
|
|
|
|
added to the fd set. The file descriptor cannot be stdin, stdout, or
|
|
|
|
stderr.
|
|
|
|
|
|
|
|
``set=set``
|
|
|
|
This option defines the ID of the fd set to add the file
|
|
|
|
descriptor to.
|
|
|
|
|
|
|
|
``opaque=opaque``
|
|
|
|
This option defines a free-form string that can be used to
|
|
|
|
describe fd.
|
|
|
|
|
|
|
|
You can open an image using pre-opened file descriptors from an fd
|
|
|
|
set:
|
|
|
|
|
|
|
|
.. parsed-literal::
|
|
|
|
|
|
|
|
|qemu_system| \\
|
|
|
|
-add-fd fd=3,set=2,opaque="rdwr:/path/to/file" \\
|
|
|
|
-add-fd fd=4,set=2,opaque="rdonly:/path/to/file" \\
|
|
|
|
-drive file=/dev/fdset/2,index=0,media=disk
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("set", HAS_ARG, QEMU_OPTION_set,
|
|
|
|
"-set group.id.arg=value\n"
|
|
|
|
" set <arg> parameter for item <id> of type <group>\n"
|
|
|
|
" i.e. -set drive.$id.file=/path/to/image\n", QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-set group.id.arg=value``
|
|
|
|
Set parameter arg for item id of type group
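For example, assuming a drive with id ``disk0`` has been defined, its image file could be supplied through ``-set`` (a sketch; the path is illustrative):

::

-drive if=none,id=disk0 \
-set drive.disk0.file=/path/to/image.qcow2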
|
|
|
|
ERST
|
|
|
|
|
|
|
|
DEF("global", HAS_ARG, QEMU_OPTION_global,
|
|
|
|
"-global driver.property=value\n"
|
|
|
|
"-global driver=driver,property=property,value=value\n"
|
|
|
|
" set a global default for a driver property\n",
|
|
|
|
QEMU_ARCH_ALL)
|
|
|
|
SRST
|
|
|
|
``-global driver.prop=value``
|
|
|
|
\
|
|