qemu-e2k/hw
Laurent Vivier 3bfe57165b numa: equally distribute memory on nodes
When there is not enough memory to give every node the minimum
allowed amount, all the memory is put on the last node.

This is because every node except the last is assigned
(ram_size / nb_numa_nodes) & ~((1 << mc->numa_mem_align_shift) - 1),
which is 0 in this case, and the last node receives the remainder.
This happens most readily with pseries, where the memory must be
aligned to 256MB.
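
As an illustration of the rounding described above, here is a minimal
standalone sketch. The helper name legacy_assign_ram() and its plain
array interface are assumptions made for this example, not the QEMU
code itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only: every node but the last
 * gets size / nb_nodes rounded down to the alignment, and the last
 * node gets whatever is left.
 */
static void legacy_assign_ram(uint64_t *node_mem, int nb_nodes,
                              uint64_t size, unsigned align_shift)
{
    uint64_t mask = ~((UINT64_C(1) << align_shift) - 1);
    uint64_t used = 0;
    int i;

    assert(nb_nodes > 0);
    for (i = 0; i < nb_nodes - 1; i++) {
        /* With 1G over 6 nodes and 256MB alignment this rounds to 0. */
        node_mem[i] = (size / nb_nodes) & mask;
        used += node_mem[i];
    }
    /* The remainder, here the whole 1G, lands on the last node. */
    node_mem[i] = size - used;
}
```

Called with 6 nodes, a size of 1G and an alignment shift of 28 (256MB),
this gives 0 to nodes 0 to 4 and 1024MB to node 5.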

To avoid this problem, this patch uses an error diffusion algorithm [1]
to distribute the memory evenly across the nodes.
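
The idea can be sketched as follows: the part of each node's share
that is lost to alignment is carried over to the next node instead of
being dropped. This is a minimal standalone sketch under stated
assumptions; the helper name diffuse_assign_ram() and its plain array
interface are hypothetical and only illustrate the technique.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only: error diffusion over the
 * per-node share. "carry" holds the rounding error from the previous
 * nodes and is added to the next node's share before alignment.
 */
static void diffuse_assign_ram(uint64_t *node_mem, int nb_nodes,
                               uint64_t size, unsigned align_shift)
{
    uint64_t mask = ~((UINT64_C(1) << align_shift) - 1);
    uint64_t share = size / nb_nodes;   /* ideal, unaligned share */
    uint64_t carry = 0;                 /* error propagated so far */
    uint64_t used = 0;
    int i;

    assert(nb_nodes > 0);
    for (i = 0; i < nb_nodes - 1; i++) {
        uint64_t want = share + carry;

        node_mem[i] = want & mask;      /* round down to alignment */
        carry = want - node_mem[i];     /* pass the loss to the next node */
        used += node_mem[i];
    }
    /* The last node takes the remainder so the total always equals size. */
    node_mem[i] = size - used;
}
```

Called with 6 nodes, a size of 1G and an alignment shift of 28 (256MB),
this yields 0, 256, 0, 256, 256 and 256MB for nodes 0 to 5.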

We introduce a numa_auto_assign_ram() function in MachineClass
to keep compatibility between machine type versions.
The legacy function is used with pseries-2.9, pc-q35-2.9 and
pc-i440fx-2.9 (and earlier); the new one is used with all others.

Example:

qemu-system-ppc64 -S -nographic  -nodefaults -monitor stdio -m 1G -smp 8 \
                  -numa node -numa node -numa node \
                  -numa node -numa node -numa node

Before:

(qemu) info numa
6 nodes
node 0 cpus: 0 6
node 0 size: 0 MB
node 1 cpus: 1 7
node 1 size: 0 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 0 MB
node 4 cpus: 4
node 4 size: 0 MB
node 5 cpus: 5
node 5 size: 1024 MB

After:

(qemu) info numa
6 nodes
node 0 cpus: 0 6
node 0 size: 0 MB
node 1 cpus: 1 7
node 1 size: 256 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 256 MB
node 4 cpus: 4
node 4 size: 256 MB
node 5 cpus: 5
node 5 size: 256 MB

[1] https://en.wikipedia.org/wiki/Error_diffusion

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170502162955.1610-2-lvivier@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: s/ram_size/size/ at numa_default_auto_assign_ram()]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:47 -03:00
..
9pfs Xen 2017/04/21 + fix 2017-04-26 10:22:31 +01:00
acpi numa: Allow setting NUMA distance for different NUMA nodes 2017-05-11 16:08:37 -03:00
adc
alpha
arm -----BEGIN PGP SIGNATURE----- 2017-04-25 14:14:17 +01:00
audio audio: Use ARRAY_SIZE from qemu/osdep.h 2017-05-04 09:16:05 +02:00
block qobject: Use simpler QDict/QList scalar insertion macros 2017-05-09 09:13:51 +02:00
bt
char s390x/3270: Mark non-migratable and enable the device 2017-05-04 10:34:37 +02:00
core numa: equally distribute memory on nodes 2017-05-11 16:08:47 -03:00
cpu
cris
display cg3: add explicit ram_addr_t cast to scanline page variable 2017-05-05 09:49:00 +01:00
dma
gpio
i2c
i386 numa: equally distribute memory on nodes 2017-05-11 16:08:47 -03:00
ide
input input: Add trace event for empty keyboard queue 2017-05-03 14:20:12 +02:00
intc ppc/pnv: add a PnvICPState object 2017-04-26 12:00:42 +10:00
ipack
ipmi ipmi: introduce an ipmi_bmc_gen_event() API 2017-04-26 12:41:55 +10:00
isa
lm32
m68k
mem
microblaze
mips
misc
moxie
net
nios2
nvram
openrisc target/openrisc: Support non-busy idle state using PMR SPR 2017-05-04 09:39:14 +09:00
pci pci: Reduce scope of error injection 2017-05-08 20:32:14 +02:00
pci-bridge
pci-host hw/i386: Build-time assertion on pc/q35 reset register being identical. 2017-05-03 12:29:40 +02:00
pcmcia
ppc numa: equally distribute memory on nodes 2017-05-11 16:08:47 -03:00
s390x Basic support for using channel-attached 3270 'green-screen' 2017-05-05 16:56:38 +01:00
scsi vhost-scsi: create a vhost-scsi-common abstraction 2017-05-05 12:10:00 +02:00
sd
sh4
smbios
sparc
sparc64
ssi
timer
tpm
tricore
unicore32
usb qobject: Use simpler QDict/QList scalar insertion macros 2017-05-09 09:13:51 +02:00
vfio vfio/pci: Fix incorrect error message 2017-05-03 14:52:35 -06:00
virtio
watchdog
xen xen: use a better chardev type check 2017-05-04 15:34:41 +04:00
xenpv
xtensa
Makefile.objs