hw/arm/virt: Support CPU cluster on ARM virt machine

ARM64 machines such as the Kunpeng family of server chips have a
level of hardware topology in which a group of CPU cores shares
an L3 cache tag or an L2 cache. For example, Kunpeng 920 typically
has 6 or 8 clusters in each NUMA node (which also spans one CPU
die), and each cluster has 4 CPU cores. All clusters share the
L3 cache data, but the CPU cores in each cluster share a local
L3 tag.

When a guest kernel with Cluster-Aware Scheduling runs on a host
that has physical clusters, we can give the guest a vCPU topology
with a cluster level and pin the vCPUs to match it. The guest then
gains scheduling performance from the cache affinity of the CPU
clusters.

So let's enable support for the new "clusters" parameter of -smp
on ARM virt machines. After this patch, we can define a 4-level
CPU hierarchy like: cpus=*,maxcpus=*,sockets=*,clusters=*,
cores=*,threads=*.
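
As a rough illustration of the 4-level hierarchy, the sketch below multiplies the four levels together and compares the result with maxcpus. The struct and function names are hypothetical and are not QEMU's own types. The rule that the product should equal maxcpus is an assumption drawn from the documented example (2 * 2 * 2 * 2 = 16).

```c
#include <stdbool.h>

/* Hypothetical container for the -smp values; not a QEMU type. */
struct smp_topology {
    unsigned sockets;   /* sockets on the machine */
    unsigned clusters;  /* clusters per socket */
    unsigned cores;     /* cores per cluster */
    unsigned threads;   /* threads per core */
};

/* Total number of vCPUs described by the 4-level hierarchy. */
unsigned topology_total_cpus(const struct smp_topology *t)
{
    return t->sockets * t->clusters * t->cores * t->threads;
}

/* True when the hierarchy accounts for exactly maxcpus vCPUs. */
bool topology_matches_maxcpus(const struct smp_topology *t, unsigned maxcpus)
{
    return topology_total_cpus(t) == maxcpus;
}
```

With sockets=2,clusters=2,cores=2,threads=2 the helper yields 16, which is the maxcpus value used in the documentation example.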

Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 20220107083232.16256-2-wangyanan55@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Yanan Wang 2022-01-07 16:32:27 +08:00 committed by Peter Maydell
parent 6d81f4887f
commit d55c316f91
2 changed files with 11 additions and 0 deletions

@@ -2718,6 +2718,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug_request = virt_machine_device_unplug_request_cb;
     hc->unplug = virt_machine_device_unplug_cb;
     mc->nvdimm_supported = true;
+    mc->smp_props.clusters_supported = true;
     mc->auto_enable_numa_with_memhp = true;
     mc->auto_enable_numa_with_memdev = true;
     mc->default_ram_id = "mach-virt.ram";
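
The flag set in this hunk is what lets a machine type accept the clusters parameter. A minimal sketch of how such a flag might gate the option is shown below. The struct, function name, and error string are hypothetical stand-ins, and the exact check performed by QEMU's generic SMP parsing code is assumed here rather than quoted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-machine SMP properties. */
struct smp_props {
    bool clusters_supported;
};

/*
 * Returns NULL when the requested cluster count is acceptable,
 * otherwise a static error message. A count of 0 or 1 is treated
 * as "no explicit cluster level", so it is always accepted.
 */
const char *check_clusters(const struct smp_props *props, unsigned clusters)
{
    if (!props->clusters_supported && clusters > 1) {
        return "clusters not supported by this machine";
    }
    return NULL;
}
```

Under this assumption, a machine type that leaves the flag unset would reject clusters=2, while the ARM virt machine would accept it after this patch.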

@@ -277,6 +277,16 @@ SRST
 
         -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16
 
+    The following sub-option defines a CPU topology hierarchy (2 sockets
+    totally on the machine, 2 clusters per socket, 2 cores per cluster,
+    2 threads per core) for ARM virt machines which support sockets/clusters
+    /cores/threads. Some members of the option can be omitted but their values
+    will be automatically computed:
+
+    ::
+
+        -smp 16,sockets=2,clusters=2,cores=2,threads=2,maxcpus=16
+
     Historically preference was given to the coarsest topology parameters
     when computing missing values (ie sockets preferred over cores, which
     were preferred over threads), however, this behaviour is considered