qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Gerd Hoffmann	bb1599b64c	opengl: add egl-headless display Add egl-headless user interface. It doesn't provide a real user interface, it only provides opengl support using drm render nodes. It will copy back the bits rendered by the guest using virgl back to a DisplaySurface and kick the usual display update code paths, so spice and vnc and screendump can pick it up. Use it this way: qemu -display egl-headless -vnc $display qemu -display egl-headless -spice gl=off,$args Note that you should prefer native spice opengl support (-spice gl=on) if possible because that delivers better performance. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20170505104101.30589-7-kraxel@redhat.com	2017-05-12 12:02:48 +02:00
Gerd Hoffmann	bc8c946f72	egl: explicitly ask for core context Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20170505104101.30589-6-kraxel@redhat.com	2017-05-12 12:02:48 +02:00
Gerd Hoffmann	151c8e608e	egl-helpers: add missing error check Code didn't check for qemu_egl_init_dpy_mesa() failures, add it. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170505104101.30589-5-kraxel@redhat.com	2017-05-12 12:02:48 +02:00
Gerd Hoffmann	e1913dbb58	egl-helpers: fix display init for x11 When running on gtk we need X11 platform not mesa platform. Create separate functions for mesa and x11 so we can keep the egl #ifdef mess local to egl-helpers.c Fixes: `0ea1523fb6` Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20170505104101.30589-4-kraxel@redhat.com	2017-05-12 12:02:48 +02:00
Gerd Hoffmann	9f728c7940	egl-helpers: drop support for gles and debug logging Leftover from the early opengl days. Unused now, so delete the dead code. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20170505104101.30589-3-kraxel@redhat.com	2017-05-12 12:02:48 +02:00
Gerd Hoffmann	c19f4fbce1	virtio-gpu: move virtio_gpu_gl_block Move to virtio-gpu-3d.c where all the other virgl code lives too. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170505104101.30589-2-kraxel@redhat.com	2017-05-12 12:02:48 +02:00
Dr. David Alan Gilbert	08b277ac46	migration/i386: Remove support for pre-0.12 formats Remove support for versions of the CPU state prior to 11 which is the version used in qemu 0.12 - you'd be pretty lucky if you got a migration stream to work from anything that old anyway. This doesn't affect the machine type definition in any way. My main reason for doing this is the hack for sysenter_esp/eip that uses .get/.put's in state versions less than 7 (that's prior to somewhere before 0.10). Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170405190024.27581-4-dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:51 -03:00
Dr. David Alan Gilbert	ab808276f8	vmstatification: i386 FPReg Convert the fpreg save/restore to use VMSTATE_ macros rather than .get/.put. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170405190024.27581-3-dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Dr. David Alan Gilbert	46baa9007f	migration/i386: Remove old non-softfloat 64bit FP support Long long ago, we used to support storing the x86 FP registers in a 64bit format. Then `c31da136a0` in v0.14-rc0 removed the last support for writing that in the migration format. Even before that, it was only used if you had softfloat disabled (i.e. !USE_X86LDOUBLE) so in practice use of it in even earlier qemu is unlikely for most users. Kill it off, it's complicated, and possibly broken. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170405190024.27581-2-dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	2941020a47	tests: check -numa node,cpu=props_list usecase Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1494415802-227633-19-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	419fcdec3c	numa: add '-numa cpu,...' option for property based node mapping legacy cpu to node mapping is using cpu index values to map VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]' option. However cpu index is internal concept and QEMU users have to guess /reimplement qemu's logic/ to map it to a concrete cpu socket/core/thread to make sane CPUs placement across numa nodes. This patch allows to map cpu objects to numa nodes using the same properties as used for cpus with -device/device_add (socket-id/core-id/thread-id/node-id). At present valid properties/values to address CPUs could be fetched using hotpluggable-cpus monitor/qmp command, it will require user to start qemu twice when creating domain to fetch possible CPUs for a machine type/-smp layout first and then the second time with numa explicit mapping for actual usage. The first step results could be saved and reused to set/change mapping later as far as machine type/-smp stays the same. Proposed impl. supports exact and wildcard matching to simplify CLI and allow to set mapping for a specific cpu or group of cpu objects specified by matched properties. For example: # exact mapping x86 -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n # exact mapping SPAPR -numa cpu,node-id=x,core-id=y # wildcard mapping, all cpu objects that match socket-id=y # are mapped to node-id=x -numa cpu,node-id=x,socket-id=y Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1494415802-227633-18-git-send-email-imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	1171ae9a5b	numa: remove node_cpu bitmaps as they are no longer used Postfactum "CPU(s) present in multiple NUMA nodes" check was the last user of node_cpu bitmaps, but it's not need as machine_set_cpu_numa_node() does the similar check at the time mapping is set for cpus (i.e. when -numa cpus= is parsed) and ensures that cpu can be mapped only to one node. Remove duplicate check based on node_cpu bitmaps and since the last user is gone remove node_cpu as well, which completes internal transition from legacy bitmap based mapping storage to possible_cpus storage. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-17-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	ec78f8114b	numa: use possible_cpus for not mapped CPUs check and remove corresponding part in numa.c that uses node_cpu bitmaps. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-16-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	482dfe9a9e	machine: call machine init from wrapper add machine_run_board_init() wrapper that calls machine init for now but in follow up patches it will be used to run generic machine code that should run before machine init. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1494415802-227633-15-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	3b8a8557f7	numa: remove no longer need numa_post_machine_init() CPUState::numa_node is still in use but now it's set by board when it creates CPU objects. So there isn't any need to set it again after all CPU's are created, since it's been already set. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-14-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	6accfb7823	tests: numa: add case for QMP command query-cpus Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1494415802-227633-13-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:50 -03:00
Igor Mammedov	afed5a5a70	QMP: include CpuInstanceProperties into query_cpus output output if board supports CpuInstanceProperties, report them for each CPU thread listed. Main motivation for this is to provide these properties introspection via QMP interface for using in test cases to verify numa node to cpu mapping, which includes not only boards that support cpu hotplug and have this info in query-hotpluggable-cpus (pc/spapr) but also for boards that don't not support hotpluggable-cpus but support numa mapping (virt-arm). Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1494415802-227633-12-git-send-email-imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	4ccf5826f9	virt-arm: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu() Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-11-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	722387e78d	spapr: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu() it's safe to remove thread node_id != core node_id error branch as machine_set_cpu_numa_node() also does mismatch check and is called even before any CPU is created. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1494415802-227633-10-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	ea2650724c	pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu() Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-9-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	af9b20e8d2	numa: do default mapping based on possible_cpus instead of node_cpu bitmaps Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-8-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	7c88e65d9e	numa: mirror cpu to node mapping in MachineState::possible_cpus Introduce machine_set_cpu_numa_node() helper that stores node mapping for CPU in MachineState::possible_cpus. CPU and node it belongs to is specified by 'props' argument. Patch doesn't remove old way of storing mapping in numa_info[X].node_cpu as removing it at the same time makes patch rather big. Instead it just mirrors mapping in possible_cpus and follow up per target patches will switch to possible_cpus and numa_info[X].node_cpu will be removed once there isn't any users left. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-7-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	64c2a8f6d3	numa: add check that board supports cpu_index to node mapping Default node mapping initialization already checks that board supports cpu_index to node mapping and refuses to start if it's not supported. Do the same for explicitly provided mapping "-numa node,cpus=..." Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1494415802-227633-6-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	bd4c1bfe3e	virt-arm: add node-id property to CPU it will allow switching from cpu_index to property based numa mapping in follow up patches. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-5-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	93b2a8cb0b	pc: add node-id property to CPU it will allow switching from cpu_index to property based numa mapping in follow up patches. PS: patch changes default value of CPUState::numa_node from 0 to CPU_UNSET_NUMA_NODE_ID. The only place for x86 that would affected is monitor's 'infor numa' command which uses that field. However legacy 0 value is still preserved by pc_cpu_pre_plug() in this patch if user/numa.c hasn't set it explicitly, so there is no change in behavior. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1494415802-227633-4-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
Igor Mammedov	0b8497f08c	spapr: add node-id property to sPAPR core it will allow switching from cpu_index to core based numa mapping in follow up patches. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1494415802-227633-3-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
Igor Mammedov	ea089eebbd	numa: move source of default CPUs to NUMA node mapping into boards Originally CPU threads were by default assigned in round-robin fashion. However it was causing issues in guest since CPU threads from the same socket/core could be placed on different NUMA nodes. Commit `fb43b73b` (pc: fix default VCPU to NUMA node mapping) fixed it by grouping threads within a socket on the same node introducing cpu_index_to_socket_id() callback and commit `20bb648d` (spapr: Fix default NUMA node allocation for threads) reused callback to fix similar issues for SPAPR machine even though socket doesn't make much sense there. As result QEMU ended up having 3 default distribution rules used by 3 targets /virt-arm, spapr, pc/. In effort of moving NUMA mapping for CPUs into possible_cpus, generalize default mapping in numa.c by making boards decide on default mapping and let them explicitly tell generic numa code to which node a CPU thread belongs to by replacing cpu_index_to_socket_id() with @cpu_index_to_instance_props() which provides default node_id assigned by board to specified cpu_index. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1494415802-227633-2-git-send-email-imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
Igor Mammedov	d9c34f9c6c	hw/arm/virt: explicitly allocate cpu_index for cpus Currently cpu_index is implicitly auto assigned during cpu.realize() time cpu_exec_realizefn()->cpu_list_add(). It happens to match index in possible_cpus so take control over it and make board initialize cpu_index to possible_cpus index explicitly. It will at least document that board is in control of it and when '-device cpu' support comes it will keep cpu_index stable regardless of order cpus are created so it won't break migration. Within this series it will be used for internal conversion from storing cpu_index based NUMA node bitmaps to property based mapping with possible_cpus, And will allow map cpu_index to a CPU entry in possible_cpus array. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1493816238-33120-5-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
Igor Mammedov	17d3d0e2d9	hw/arm/virt: use machine->possible_cpus for storing possible topology info for now precalculate and store mp_afinity in possible_cpus as ARM cpus don't have socket/core/thread-id properties yet. In follow patches possible_cpus will be used for storing and setting NUMA node mapping and replace legacy bitmap based numa_info[node_id].node_cpu/numa_get_node_for_cpu() For the lack of better idea, this patch cannibalizes possible_cpus.cpus[x].props.thread_id so that *_cpu_index_to_props() callback could return addressable by props CPU which will be used by machine_set_cpu_numa_node() in follow up patches to assign a CPU to node. But cannibalizing is fine for now as that thread_id isn't exposed to users (no hotpluggable_cpus callback support for ARM yet) and it will be used only internally until 'device_add cpu' is supported where we can decide on which properties to use. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1493816238-33120-4-git-send-email-imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
Igor Mammedov	46de5913b6	hw/arm/virt: extract mp-affinity calculation in separate function Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1493816238-33120-3-git-send-email-imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
Igor Mammedov	63baf8bf01	tests: add CPUs to numa node mapping test Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1493816238-33120-2-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
He Chen	fda4096fca	tests: acpi: extend cphp and memhp testcase with numa distance check Signed-off-by: He Chen <he.chen@linux.intel.com> Message-Id: <1493803036-4048-1-git-send-email-he.chen@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> [ehabkost: regenerated tests/acpi-tst-data, included SLIT table] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
Laurent Vivier	3bfe57165b	numa: equally distribute memory on nodes When there are more nodes than available memory to put the minimum allowed memory by node, all the memory is put on the last node. This is because we put (ram_size / nb_numa_nodes) & ~((1 << mc->numa_mem_align_shift) - 1); on each node, and in this case the value is 0. This is particularly true with pseries, as the memory must be aligned to 256MB. To avoid this problem, this patch uses an error diffusion algorithm [1] to distribute equally the memory on nodes. We introduce numa_auto_assign_ram() function in MachineClass to keep compatibility between machine type versions. The legacy function is used with pseries-2.9, pc-q35-2.9 and pc-i440fx-2.9 (and previous), the new one with all others. Example: qemu-system-ppc64 -S -nographic -nodefaults -monitor stdio -m 1G -smp 8 \ -numa node -numa node -numa node \ -numa node -numa node -numa node Before: (qemu) info numa 6 nodes node 0 cpus: 0 6 node 0 size: 0 MB node 1 cpus: 1 7 node 1 size: 0 MB node 2 cpus: 2 node 2 size: 0 MB node 3 cpus: 3 node 3 size: 0 MB node 4 cpus: 4 node 4 size: 0 MB node 5 cpus: 5 node 5 size: 1024 MB After: (qemu) info numa 6 nodes node 0 cpus: 0 6 node 0 size: 0 MB node 1 cpus: 1 7 node 1 size: 256 MB node 2 cpus: 2 node 2 size: 0 MB node 3 cpus: 3 node 3 size: 256 MB node 4 cpus: 4 node 4 size: 256 MB node 5 cpus: 5 node 5 size: 256 MB [1] https://en.wikipedia.org/wiki/Error_diffusion Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170502162955.1610-2-lvivier@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> [ehabkost: s/ram_size/size/ at numa_default_auto_assign_ram()] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:47 -03:00
He Chen	0f203430dd	numa: Allow setting NUMA distance for different NUMA nodes This patch is going to add SLIT table support in QEMU, and provides additional option `dist` for command `-numa` to allow user set vNUMA distance by QEMU command. With this patch, when a user wants to create a guest that contains several vNUMA nodes and also wants to set distance among those nodes, the QEMU command would like: ``` -numa node,nodeid=0,cpus=0 \ -numa node,nodeid=1,cpus=1 \ -numa node,nodeid=2,cpus=2 \ -numa node,nodeid=3,cpus=3 \ -numa dist,src=0,dst=1,val=21 \ -numa dist,src=0,dst=2,val=31 \ -numa dist,src=0,dst=3,val=41 \ -numa dist,src=1,dst=2,val=21 \ -numa dist,src=1,dst=3,val=31 \ -numa dist,src=2,dst=3,val=21 \ ``` Signed-off-by: He Chen <he.chen@linux.intel.com> Message-Id: <1493260558-20728-1-git-send-email-he.chen@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:37 -03:00
Laurent Vivier	ecc1f5adee	maintainers: Add myself as linux-user reviewer I volunteer to review linux-user patches. Adding myself will help to not miss some of them. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Acked-by: Riku Voipio <riku.voipio@linaro.org> Message-id: 20170510153950.29343-1-laurent@vivier.eu Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-05-11 13:31:11 -04:00
Daniel P. Berrange	4ed3d478c6	i386: rewrite way CPUID index is validated Change the nested if statements into a flat format, to make it clearer what validation / capping is being performed on different CPUID index values. NB this changes behaviour when "index > env->cpuid_xlevel2". This won't have any guest-visible effect because no there is no CPUID[0xC0000001] feature supported by TCG, and KVM code will never call cpu_x86_cpuid() with such an index value. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <20170509132736.10071-2-berrange@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 10:54:04 -03:00
Kevin Wolf	d541e201bd	Block patches for the block queue. -----BEGIN PGP SIGNATURE----- iQFGBAABCAAwFiEEkb62CjDbPohX0Rgp9AfbAGHVz0AFAlkUWPkSHG1yZWl0ekBy ZWRoYXQuY29tAAoJEPQH2wBh1c9AQuYIAKu7UHbUc3rL4FOErA2kor/Lp55ApCfi WbBaeh06jNPXQ5/S7jie30WAlc0DT+wwfWlFTl7gnYsNmXI3yrBkbEbnZMA2uSEz qz+MEEx81v3BS7MDM5sKjoJVEt76zCS3f7MbrLYLrkdHg3AGo2tfMtIlABFM+I6T lkfxYqosR0pvN8bSPAyoAQvVKYefLdi+O0poG6WruCMcF58dmZn8GzwVfnncGbqz vsd2wcm983S+a7PgGT2VBNXfyBqZ0tHqn/gl5Bc5leZTEhV9DQSh8Z0DKBh+o5Cv 8iyrmklpGdr7sI+YHHkB1zecQxI854HaB/4+Hv8KudHXI9hVmqXyXaQ= =JcaJ -----END PGP SIGNATURE----- Merge remote-tracking branch 'mreitz/tags/pull-block-2017-05-11' into queue-block Block patches for the block queue. # gpg: Signature made Thu May 11 14:28:41 2017 CEST # gpg: using RSA key 0xF407DB0061D5CF40 # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * mreitz/tags/pull-block-2017-05-11: (22 commits) MAINTAINERS: Add qemu-progress to the block layer qcow2: Discard/zero clusters by byte count qcow2: Assert that cluster operations are aligned qcow2: Optimize write zero of unaligned tail cluster iotests: Add test 179 to cover write zeroes with unmap iotests: Improve _filter_qemu_img_map qcow2: Optimize zero_single_l2() to minimize L2 churn qcow2: Make distinction between zero cluster types obvious qcow2: Name typedef for cluster type qcow2: Correctly report status of preallocated zero clusters block: Update comments on BDRV_BLOCK_* meanings qcow2: Use consistent switch indentation qcow2: Nicer variable names in qcow2_update_snapshot_refcount() tests: Add coverage for recent block geometry fixes blkdebug: Add ability to override unmap geometries blkdebug: Simplify override logic blkdebug: Add pass-through write_zero and discard support blkdebug: Refactor error injection blkdebug: Sanity check block layer guarantees qemu-io: Switch 'map' output to byte-based reporting ... Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-05-11 14:34:56 +02:00
Max Reitz	8dd30c86dd	MAINTAINERS: Add qemu-progress to the block layer util/qemu-progress.c is currently unmaintained. The only user of its functionality is qemu-img, so it effectively is part of the block layer. Suggested-by: Fam Zheng <famz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170428165517.30341-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:07 +02:00
Eric Blake	d2cb36af2b	qcow2: Discard/zero clusters by byte count Passing a byte offset, but sector count, when we ultimately want to operate on cluster granularity, is madness. Clean up the external interfaces to take both offset and count as bytes, while still keeping the assertion added previously that the caller must align the values to a cluster. Then rename things to make sure backports don't get confused by changed units: instead of qcow2_discard_clusters() and qcow2_zero_clusters(), we now have qcow2_cluster_discard() and qcow2_cluster_zeroize(). The internal functions still operate on clusters at a time, and return an int for number of cleared clusters; but on an image with 2M clusters, a single L2 table holds 256k entries that each represent a 2M cluster, totalling well over INT_MAX bytes if we ever had a request for that many bytes at once. All our callers currently limit themselves to 32-bit bytes (and therefore fewer clusters), but by making this function 64-bit clean, we have one less place to clean up if we later improve the block layer to support 64-bit bytes through all operations (with the block layer auto-fragmenting on behalf of more-limited drivers), rather than the current state where some interfaces are artificially limited to INT_MAX at a time. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-13-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:07 +02:00
Eric Blake	f10ee139ad	qcow2: Assert that cluster operations are aligned We already audited (in commit `0c1bd469`) that qcow2_discard_clusters() is only passed cluster-aligned start values; but we can further tighten the assertion that the only unaligned end value is at EOF. Recent commits have taken advantage of an unaligned tail cluster, for both discard and write zeroes. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-12-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:07 +02:00
Eric Blake	fbaa6bb3d3	qcow2: Optimize write zero of unaligned tail cluster We've already improved discards to operate efficiently on the tail of an unaligned qcow2 image; it's time to make a similar improvement to write zeroes. The special case is only valid at the tail cluster of a file, where we must recognize that any sectors beyond the image end would implicitly read as zero, and therefore should not penalize our logic for widening a partial cluster into writing the whole cluster as zero. However, note that for now, the special case of end-of-file is only recognized if there is no backing file, or if the backing file has the same length; that's because when the backing file is shorter than the active layer, we don't have code in place to recognize that reads of a sector unallocated at the top and beyond the backing end-of-file are implicitly zero. It's not much of a real loss, because most people don't use images that aren't cluster-aligned, or where the active layer is a different size than the backing layer (especially where the difference falls within a single cluster). Update test 154 to cover the new scenarios, using two images of intentionally differing length. While at it, fix the test to gracefully skip when run as ./check -qcow2 -o compat=0.10 154 since the older format lacks zero clusters already required earlier in the test. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-11-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:07 +02:00
Eric Blake	e249d51952	iotests: Add test 179 to cover write zeroes with unmap No tests were covering write zeroes with unmap. Additionally, I needed to prove that my previous patches for correct status reporting and write zeroes optimizations actually had an impact. The test works for cluster_size between 8k and 2M (for smaller sizes, it fails because our allocation patterns are not contiguous with small clusters - in part, the largest consecutive allocation we tend to get is often bounded by the size covered by one L2 table). Note that testing for zero clusters is tricky: 'qemu-io map' reports whether data comes from the current layer of the image (useful for sniffing out which regions of the file have QCOW_OFLAG_ZERO) - but doesn't show which clusters have mappings; while 'qemu-img map' sees "zero":true for both unallocated and zero clusters for any qcow2 with no backing layer (so less useful at detecting true zero clusters), but reliably shows mappings. So we have to rely on both queries side-by-side at each point of the test. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-10-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:07 +02:00
Eric Blake	d9ca2214bd	iotests: Improve _filter_qemu_img_map Although _filter_qemu_img_map documents that it scrubs offsets, it was only doing so for human mode. Of the existing tests using the filter (97, 122, 150, 154, 176), two of them are affected, but it does not hurt the validity of the tests to not require particular mappings (another test, 66, uses offsets but intentionally does not pass through _filter_qemu_img_map, because it checks that offsets are unchanged before and after an operation). Another justification for this patch is that it will allow a future patch to utilize 'qemu-img map --output=json' to check the status of preallocated zero clusters without regards to the mapping (since the qcow2 mapping can be very sensitive to the chosen cluster size, when preallocation is not in use). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-9-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:07 +02:00
Eric Blake	06cc5e2b2d	qcow2: Optimize zero_single_l2() to minimize L2 churn Similar to discard_single_l2(), we should try to avoid dirtying the L2 cache when the cluster we are changing already has the right characteristics. Note that by the time we get to zero_single_l2(), BDRV_REQ_MAY_UNMAP is a requirement to unallocate a cluster (this is because the block layer clears that flag if discard.* flags during open requested that we never punch holes - see the conversation around commit `170f4b2e`, https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07306.html). Therefore, this patch can only reuse a zero cluster as-is if either unmapping is not requested, or if the zero cluster was not associated with an allocation. Technically, there are some cases where an unallocated cluster already reads as all zeroes (namely, when there is no backing file [easy: check bs->backing], or when the backing file also reads as zeroes [harder: we can't check bdrv_get_block_status since we are already holding the lock]), where the guest would not immediately see a difference if we left that cluster unallocated. But if the user did not request unmapping, leaving an unallocated cluster is wrong; and even if the user DID request unmapping, keeping a cluster unallocated risks a subtle semantic change of guest-visible contents if a backing file is later added, and it is not worth auditing whether all internal uses such as mirror properly avoid an unmap request. Thus, this patch is intentionally limited to just clusters that are already marked as zero. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-8-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:07 +02:00
Eric Blake	fdfab37dfe	qcow2: Make distinction between zero cluster types obvious Treat plain zero clusters differently from allocated ones, so that we can simplify the logic of checking whether an offset is present. Do this by splitting QCOW2_CLUSTER_ZERO into two new enums, QCOW2_CLUSTER_ZERO_PLAIN and QCOW2_CLUSTER_ZERO_ALLOC. I tried to arrange the enum so that we could use 'ret <= QCOW2_CLUSTER_ZERO_PLAIN' for all unallocated types, and 'ret >= QCOW2_CLUSTER_ZERO_ALLOC' for allocated types, although I didn't actually end up taking advantage of the layout. In many cases, this leads to simpler code, by properly combining cases (sometimes, both zero types pair together, other times, plain zero is more like unallocated while allocated zero is more like normal). Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 20170507000552.20847-7-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:07 +02:00
Eric Blake	3ef9521893	qcow2: Name typedef for cluster type Although it doesn't add all that much type safety (this is C, after all), it does add a bit of legibility to use the name QCow2ClusterType instead of a plain int. In particular, qcow2_get_cluster_offset() has an overloaded return type; a QCow2ClusterType on success, and -errno on failure; keeping the cluster type in a separate variable makes it slightly easier for the next patch to make further computations based on the type. Suggested-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 20170507000552.20847-6-eblake@redhat.com [mreitz: Use the new type in two more places (one of them pulled from the next patch)] Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:06 +02:00
Eric Blake	4341df8a83	qcow2: Correctly report status of preallocated zero clusters We were throwing away the preallocation information associated with zero clusters. But we should be matching the well-defined semantics in bdrv_get_block_status(), where (BDRV_BLOCK_ZERO \| BDRV_BLOCK_OFFSET_VALID) informs the user which offset is reserved, while still reminding the user that reading from that offset is likely to read garbage. count_contiguous_clusters_by_type() is now used only for unallocated cluster runs, hence it gets renamed and tightened. Making this change lets us see which portions of an image are zero but preallocated, when using qemu-img map --output=json. The --output=human side intentionally ignores all zero clusters, whether or not they are preallocated. The fact that there is no change to qemu-iotests './check -qcow2' merely means that we aren't yet testing this aspect of qemu-img; a later patch will add a test. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-5-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:06 +02:00
Eric Blake	4c41cb4955	block: Update comments on BDRV_BLOCK_* meanings We had some conflicting documentation: a nice 8-way table that described all possible combinations of DATA, ZERO, and OFFSET_VALID, contrasted with text that implied that OFFSET_VALID always meant raw data could be read directly. Furthermore, the text refers a lot to bs->file, even though the interface was updated back in `67a0fd2a` to let the driver pass back a specific BDS (not necessarily bs->file). As the 8-way table is the intended semantics, simplify the rest of the text to get rid of the confusion. ALLOCATED is always set by the block layer for convenience (drivers do not have to worry about it). RAW is used only internally, but by more than the raw driver. Document these additional items on the driver callback. Suggested-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-4-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:06 +02:00
Eric Blake	bbd995d830	qcow2: Use consistent switch indentation Fix a couple of inconsistent indentations, before an upcoming patch further tweaks the switch statements. (best viewed with 'git diff -b'). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-3-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:06 +02:00
Eric Blake	b32cbae111	qcow2: Nicer variable names in qcow2_update_snapshot_refcount() In order to keep checkpatch happy when the next patch changes indentation, we first have to shorten some long lines. The easiest approach is to use a new variable in place of 'offset & L2E_OFFSET_MASK', except that 'offset' is the best name for that variable. Change '[old_]offset' to '[old_]entry' to make room. While touching things, also fix checkpatch warnings about unusual 'for' statements. Suggested by Max Reitz <mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 20170507000552.20847-2-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:06 +02:00

... 2 3 4 5 6 ...

53395 Commits All Branches Search

53395 Commits

All Branches