qemu-e2k/hw
Yanan Wang 864c3b5c32 hw/core/machine: Introduce CPU cluster topology support
The new Cluster-Aware Scheduling support has landed in Linux 5.16,
and has been shown to benefit scheduling performance (e.g. load
balancing and the wake_affine strategy) on both x86_64 and AArch64.

So as of Linux 5.16 there is a four-level arch-neutral CPU topology
definition, shown below, and a new scheduler level for clusters.
struct cpu_topology {
    int thread_id;
    int core_id;
    int cluster_id;
    int package_id;
    int llc_id;
    cpumask_t thread_sibling;
    cpumask_t core_sibling;
    cpumask_t cluster_sibling;
    cpumask_t llc_sibling;
};

A cluster generally means a group of CPU cores that share an L2 cache
or other mid-level resources, and it is these shared resources that
are used to improve the scheduler's behavior. In terms of size, a
cluster sits between a CPU die and a CPU core. For example, on some
ARM64 Kunpeng servers, there are 6 clusters in each NUMA node and
4 CPU cores in each cluster. The 4 CPU cores share a separate L2
cache and an L3 cache tag, which brings a cache affinity advantage.
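
As a rough illustration of the Kunpeng layout described above (6
clusters per NUMA node, 4 cores per cluster), the sketch below maps a
node-local core index to its cluster. The contiguous core numbering
and the names cluster_of_core() and same_cluster() are assumptions
made for this example; they are not taken from the kernel or QEMU.

```c
#include <assert.h>
#include <stdbool.h>

/* Layout constants from the example above (per NUMA node). */
enum { CLUSTERS_PER_NODE = 6, CORES_PER_CLUSTER = 4 };

/*
 * Hypothetical helper: map a node-local core index to its cluster,
 * assuming cores are numbered contiguously within each cluster.
 */
static int cluster_of_core(int core)
{
    assert(core >= 0 && core < CLUSTERS_PER_NODE * CORES_PER_CLUSTER);
    return core / CORES_PER_CLUSTER;
}

/* Two cores are cluster siblings when they map to the same cluster. */
static bool same_cluster(int a, int b)
{
    return cluster_of_core(a) == cluster_of_core(b);
}
```

Under this numbering, cores 4 to 7 form one cluster and would share
the L2 cache, while cores 3 and 4 fall into different clusters.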

In virtualization, on hosts that have pClusters (physical clusters),
if we design a vCPU topology with a cluster level for the guest
kernel and use dedicated vCPU pinning, a cluster-aware guest kernel
can also make use of the cache affinity of CPU clusters to gain
similar scheduling performance.

This patch adds infrastructure for CPU cluster level topology
configuration and parsing, so that users can specify the cluster
parameter if their machine supports it.
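
To make the idea of a cluster level in the topology configuration
concrete, here is a minimal sketch of a consistency check in which
the product of all levels must equal the maximum CPU count. The type
struct smp_topo and the helper names are hypothetical and are not
QEMU's own. As a simplification, every omitted (zero) level is
treated as 1, which may differ from how the real parser fills in
missing values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for -smp style parameters; not QEMU's type. */
struct smp_topo {
    unsigned sockets, dies, clusters, cores, threads, maxcpus;
};

/* Simplifying assumption: an omitted (zero) level counts as 1. */
static unsigned level_or_one(unsigned n)
{
    return n ? n : 1;
}

/* Number of CPUs implied by the hierarchy, clusters included. */
static unsigned topo_product(const struct smp_topo *t)
{
    return level_or_one(t->sockets) * level_or_one(t->dies) *
           level_or_one(t->clusters) * level_or_one(t->cores) *
           level_or_one(t->threads);
}

/* The topology is consistent when the hierarchy matches maxcpus. */
static bool topo_is_consistent(const struct smp_topo *t)
{
    assert(t);
    return topo_product(t) == t->maxcpus;
}
```

For instance, 2 sockets, 2 clusters, 2 cores and 2 threads with dies
omitted describe 16 CPUs, so the check passes only for maxcpus = 16.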

Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Message-Id: <20211228092221.21068-3-wangyanan55@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[PMD: Added '(since 7.0)' to @clusters in qapi/machine.json]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-12-31 13:42:39 +01:00
..
9pfs 9pfs: use P9Array in v9fs_walk() 2021-10-27 14:45:22 +02:00
acpi failover: fix unplug pending detection 2021-11-28 17:03:52 -05:00
adc hw/adc: Add basic Aspeed ADC model 2021-10-12 08:20:08 +02:00
alpha
arm dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
audio pci: Let ld*_pci_dma() propagate MemTxResult 2021-12-31 01:05:27 +01:00
avr hw/avr: Realize AVRCPU qdev object using qdev_realize() 2021-12-17 10:43:24 +01:00
block virtio-blk: Fix clean up of host notifiers for single MR transaction. 2021-12-06 14:21:14 +00:00
char Fix STM32F2XX USART data register readout 2021-12-15 10:11:34 +00:00
core hw/core/machine: Introduce CPU cluster topology support 2021-12-31 13:42:39 +01:00
cpu
cris
display dma: Let dma_memory_map() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
dma dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
gpio hw: aspeed_gpio: Fix GPIO array indexing 2021-10-12 08:20:08 +02:00
hppa
hyperv dma: Let dma_memory_map() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
i2c aspeed/i2c: QOMify AspeedI2CBus 2021-10-12 08:20:08 +02:00
i386 dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
ide dma: Let dma_buf_read() take MemTxAttrs argument 2021-12-31 01:05:27 +01:00
input hw/input/lasips2: Fix typos in function names 2021-10-31 21:05:40 +01:00
intc dma: Let ld*_dma() propagate MemTxResult 2021-12-31 01:05:27 +01:00
ipack qbus: Rename qbus_create_inplace() to qbus_init() 2021-09-30 13:42:10 +01:00
ipmi
isa vt82c686: Add a method to VIA_ISA to raise ISA interrupts 2021-10-18 00:41:36 +02:00
m68k m68k pull request 20211109 2021-11-09 13:16:56 +01:00
mem hw/mem/pc-dimm: Restrict NUMA-specific code to NUMA machines 2021-11-11 03:13:05 -05:00
microblaze hw/microblaze: Replace drive_get_next() by drive_get() 2021-12-15 08:38:16 +01:00
mips hw/mips/boston: Fix load_elf() error detection 2021-12-06 11:57:36 +01:00
misc dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
net pci: Let ld*_pci_dma() propagate MemTxResult 2021-12-31 01:05:27 +01:00
nios2
nubus qbus: Rename qbus_create_inplace() to qbus_init() 2021-09-30 13:42:10 +01:00
nvme dma: Let dma_buf_read() take MemTxAttrs argument 2021-12-31 01:05:27 +01:00
nvram dma: Let st*_dma() take MemTxAttrs argument 2021-12-31 01:05:27 +01:00
openrisc
pci Fix bad overflow check in hw/pci/pcie.c 2021-11-29 08:49:36 -05:00
pci-bridge qdev: Make DeviceState.id independent of QemuOpts 2021-10-15 16:06:35 +02:00
pci-host dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
pcmcia
ppc ppc/pnv: Use QOM hierarchy to scan PEC PHB4 devices 2021-12-17 17:57:19 +01:00
rdma qapi: introduce x-query-rdma QMP command 2021-11-02 15:55:14 +00:00
remote hw/remote/proxy: Categorize Wireless devices as 'Network' ones 2021-10-04 09:47:26 +02:00
riscv hw/riscv: Use load address rather than entry point for fw_dynamic next_addr 2021-12-20 14:53:31 +10:00
rtc hw/rtc/pl031: Send RTC_CHANGE QMP event 2021-11-15 18:53:00 +00:00
rx
s390x s390x/pci: add supported DT information to clp response 2021-12-17 09:12:37 +01:00
scsi pci: Let ld*_pci_dma() propagate MemTxResult 2021-12-31 01:05:27 +01:00
sd dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
sensor
sh4 hw/intc/sh_intc: Inline and drop sh_intc_source() function 2021-10-30 18:39:37 +02:00
smbios
sparc
sparc64 hw: Replace trivial drive_get_next() by drive_get() 2021-12-15 08:38:16 +01:00
ssi aspeed/smc: Use a container for the flash mmio address space 2021-10-22 09:52:17 +02:00
timer hw/timer/sh_timer: Remove use of hw_error 2021-10-30 18:39:37 +02:00
tpm tpm: mark correct memory region range dirty when clearing RAM 2021-10-02 08:43:21 +02:00
tricore
usb pci: Let ld*_pci_dma() take MemTxAttrs argument 2021-12-31 01:05:27 +01:00
vfio vfio: Fix memory leak of hostwin 2021-11-17 11:25:55 -07:00
virtio dma: Let dma_memory_map() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
watchdog watchdog: remove select_watchdog_action 2021-11-02 15:57:27 +01:00
xen pci: Export pci_for_each_device_under_bus*() 2021-11-01 19:36:11 -04:00
xenpv
xtensa
Kconfig
meson.build