PPC TCG supports SMT CPU configurations for non-hypervisor state, so
permit POWER8-10 pseries machines to enable SMT.
This requires PIR and TIR be set, because that's how sibling thread
matching is done by TCG.
spapr's nested-HV capability does not currently coexist with SMT, so
that combination is prohibited (interestingly somewhat analogous to
LPAR-per-core mode on real hardware which also does not support KVM).
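As an illustrative usage example (CPU model and topology are of course up
to the user), an SMT4 guest under TCG could be started with something like:
qemu-system-ppc64 -machine pseries -accel tcg -cpu power9 -smp 8,cores=2,threads=4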
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: Also test smp_threads when checking for POWER8 CPU and above ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The behaviour of the Address Translation Mode on Interrupt resource is
not consistently supported by all CPU versions or all KVM versions: KVM
HV does not support mode 2, and does not support mode 3 on POWER7 or
early POWER9 processors. KVM PR only supports mode 0. TCG supports all
modes (0, 2, 3) on CPUs with support for the corresponding LPCR[AIL] mode.
This leads to inconsistencies in guest behaviour and could cause problems
migrating guests.
This was not noticeable for Linux guests for a long time because the
kernel only uses modes 0 and 3, and it used to consider AIL-3 to be
advisory in that it would always keep the AIL-0 vectors around, so it
did not matter whether or not interrupts were delivered according to
the AIL mode. Recent Linux guests depend on AIL mode 3 working as
specified in order to support the SCV facility interrupt. If AIL-3 cannot
be provided, then H_SET_MODE must return an error to Linux so it can
disable the SCV facility (failure to do so can lead to userspace being
able to crash the guest kernel).
Add the ail-mode-3 capability to specify that AIL-3 is supported. AIL-0
is implied as the baseline, and AIL-2 is no longer supported by spapr.
AIL-2 is not known to be used by any software, but support in TCG could
be restored with an ail-mode-2 capability quite easily if a regression
is reported.
Modify the H_SET_MODE Address Translation Mode on Interrupt resource
handler to check capabilities and correctly return error if not
supported.
KVM has a cap to advertise support for AIL-3.
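The shape of the check in the handler is roughly the following (a
simplified sketch, not the actual QEMU code; the variable name and the
choice of hcall return code are illustrative):
    /* H_SET_MODE, Address Translation Mode on Interrupt resource (sketch) */
    switch (mflags) {
    case 0:
        break;                             /* AIL-0: always allowed   */
    case 3:
        if (spapr_get_cap(spapr, SPAPR_CAP_AIL_MODE_3)) {
            break;
        }
        return H_P2;                       /* lets Linux disable SCV  */
    default:                               /* AIL-2 no longer handled */
        return H_P2;
    }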
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230515160216.394612-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Inspired by Julia Lawall's fixing of Linux
kernel comments, I looked at qemu, although I did it manually.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Message-Id: <20220614104045.85728-2-dgilbert@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220228175004.8862-6-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220228175004.8862-5-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
And get rid of the 'out' label since it's now unused.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220228175004.8862-4-danielhb413@gmail.com>
[ clg: Fixed typo in commit log ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This implements the Nested KVM HV hcall API for spapr under TCG.
The L2 is switched in when the H_ENTER_NESTED hcall is made, and the
L1 is switched back in (returning from the hcall) when an HV exception
is sent to the vhyp. Register state is copied in and out according to
the nested KVM HV hcall API specification.
The hdecr timer is started when the L2 is switched in, and it provides
the HDEC / 0x980 return to L1.
The MMU re-uses the bare metal radix 2-level page table walker by
using the get_pate method to point the MMU to the nested partition
table entry. MMU faults due to partition scope errors raise HV
exceptions and accordingly are routed back to the L1.
The MMU does not tag translations for the L1 (direct) vs L2 (nested)
guests, so the TLB is flushed on any L1<->L2 transition (hcall entry
and exit).
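At a very high level the entry/exit flow is something like the following
(purely illustrative pseudo-code; apart from tlb_flush() the helper names
are hypothetical, not the actual spapr functions):
    /* H_ENTER_NESTED (sketch) */
    save_l1_register_state(cpu);          /* stash L1 GPRs/SPRs            */
    load_l2_state_from_regs_struct(cpu);  /* per the nested HV hcall ABI   */
    tlb_flush(CPU(cpu));                  /* no L1/L2 tagging in the TLB   */
    start_hdecr_timer(cpu);               /* expiry delivers 0x980 to L1   */
    /* ... L2 runs until an HV interrupt is routed to the vhyp ... */
    save_l2_state_to_regs_struct(cpu);
    restore_l1_register_state(cpu);       /* H_ENTER_NESTED returns in L1  */
    tlb_flush(CPU(cpu));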
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: checkpatch fixes ]
Message-Id: <20220216102545.1808018-10-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
If the KVM_CAP_RPT_INVALIDATE KVM capability is enabled, then
- indicate the availability of the H_RPT_INVALIDATE hcall to the guest via
the ibm,hypertas-functions property.
- enable the hcall.
Both the above are done only if the new sPAPR machine capability
cap-rpt-invalidate is set.
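For example, the capability could be enabled on the command line with
something like:
-machine pseries,cap-rpt-invalidate=on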
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20210706112440.1449562-3-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The function ppc_hash64_filter_pagesizes(), previously declared in
mmu-hash64.h and implemented in mmu-hash64.c, has been turned into a
static function in hw/ppc/spapr_caps.c, as it is only used in that file.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Message-Id: <20210506163941.106984-3-lucas.araujo@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There are 23 files that include "sysemu/qtest.h" but do not use any
qtest functions.
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210226081414.205946-1-kuhn.chenqun@huawei.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As recommended in "qapi/error.h", return true on success and false on
failure. This allows reducing error propagation overhead in the callers.
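A minimal illustration of the convention with a made-up function (not
actual spapr code):
    #include "qapi/error.h"
    static bool frobnicate(int x, Error **errp)
    {
        if (x < 0) {
            error_setg(errp, "x must not be negative");
            return false;
        }
        /* ... do the actual work ... */
        return true;
    }
    /* caller: no local Error * and no error_propagate() needed */
    if (!frobnicate(x, errp)) {
        return false;
    }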
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20200914123505.612812-14-groug@kaod.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Nested KVM HV only works if the kernel is using the radix MMU mode, ie.
the CPU is POWER9 and it is not running in some pre-power9 compat mode.
Otherwise, the KVM HV module fails to load in the guest with -ENODEV.
It might be painful for a user to discover this late that nested cannot
work with their setup. Erroring out at machine init instead seems to be
the best we can do.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <159491948127.188975.9621435875869177751.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We have a dedicated error API for hints. Use it instead of embedding
the hint in the error message, as recommended in the "qapi/error.h"
header file.
While here, have cap_fwnmi_apply(), which already uses
error_append_hint(), call ERRP_GUARD() as well.
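The pattern, sketched with a made-up capability (ERRP_GUARD() and
error_append_hint() are the real "qapi/error.h" API; everything else
here is illustrative):
    static void cap_example_apply(SpaprMachineState *spapr, uint8_t val,
                                  Error **errp)
    {
        ERRP_GUARD();
        if (val && !example_supported(spapr)) {     /* hypothetical check */
            error_setg(errp, "Requested capability level not supported");
            error_append_hint(errp,
                              "Try appending -machine cap-example=off\n");
        }
    }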
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <159594297421.8262.14314530897345809924.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away. Convert
if (!foo(..., &err)) {
...
error_propagate(errp, err);
...
return ...
}
to
if (!foo(..., errp)) {
...
...
return ...
}
where nothing else needs @err. Coccinelle script:
@rule1 forall@
identifier fun, err, errp, lbl;
expression list args, args2;
binary operator op;
constant c1, c2;
symbol false;
@@
if (
(
- fun(args, &err, args2)
+ fun(args, errp, args2)
|
- !fun(args, &err, args2)
+ !fun(args, errp, args2)
|
- fun(args, &err, args2) op c1
+ fun(args, errp, args2) op c1
)
)
{
... when != err
when != lbl:
when strict
- error_propagate(errp, err);
... when != err
(
return;
|
return c2;
|
return false;
)
}
@rule2 forall@
identifier fun, err, errp, lbl;
expression list args, args2;
expression var;
binary operator op;
constant c1, c2;
symbol false;
@@
- var = fun(args, &err, args2);
+ var = fun(args, errp, args2);
... when != err
if (
(
var
|
!var
|
var op c1
)
)
{
... when != err
when != lbl:
when strict
- error_propagate(errp, err);
... when != err
(
return;
|
return c2;
|
return false;
|
return var;
)
}
@depends on rule1 || rule2@
identifier err;
@@
- Error *err = NULL;
... when != err
Not exactly elegant, I'm afraid.
The "when != lbl:" is necessary to avoid transforming
if (fun(args, &err)) {
goto out
}
...
out:
error_propagate(errp, err);
even though other paths to label out still need the error_propagate().
For an actual example, see sclp_realize().
Without the "when strict", Coccinelle transforms vfio_msix_setup(),
incorrectly. I don't know what exactly "when strict" does, only that
it helps here.
The match of return is narrower than what I want, but I can't figure
out how to express "return where the operand doesn't use @err". For
an example where it's too narrow, see vfio_intx_enable().
Silently fails to convert hw/arm/armsse.c, because Coccinelle gets
confused by ARMSSE being used both as typedef and function-like macro
there. Converted manually.
Line breaks tidied up manually. One nested declaration of @local_err
deleted manually. Preexisting unwanted blank line dropped in
hw/riscv/sifive_e.c.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-35-armbru@redhat.com>
The previous commit enables conversion of
visit_foo(..., &err);
if (err) {
...
}
to
if (!visit_foo(..., errp)) {
...
}
for visitor functions that now return true / false on success / error.
Coccinelle script:
@@
identifier fun =~ "check_list|input_type_enum|lv_start_struct|lv_type_bool|lv_type_int64|lv_type_str|lv_type_uint64|output_type_enum|parse_type_bool|parse_type_int64|parse_type_null|parse_type_number|parse_type_size|parse_type_str|parse_type_uint64|print_type_bool|print_type_int64|print_type_null|print_type_number|print_type_size|print_type_str|print_type_uint64|qapi_clone_start_alternate|qapi_clone_start_list|qapi_clone_start_struct|qapi_clone_type_bool|qapi_clone_type_int64|qapi_clone_type_null|qapi_clone_type_number|qapi_clone_type_str|qapi_clone_type_uint64|qapi_dealloc_start_list|qapi_dealloc_start_struct|qapi_dealloc_type_anything|qapi_dealloc_type_bool|qapi_dealloc_type_int64|qapi_dealloc_type_null|qapi_dealloc_type_number|qapi_dealloc_type_str|qapi_dealloc_type_uint64|qobject_input_check_list|qobject_input_check_struct|qobject_input_start_alternate|qobject_input_start_list|qobject_input_start_struct|qobject_input_type_any|qobject_input_type_bool|qobject_input_type_bool_keyval|qobject_input_type_int64|qobject_input_type_int64_keyval|qobject_input_type_null|qobject_input_type_number|qobject_input_type_number_keyval|qobject_input_type_size_keyval|qobject_input_type_str|qobject_input_type_str_keyval|qobject_input_type_uint64|qobject_input_type_uint64_keyval|qobject_output_start_list|qobject_output_start_struct|qobject_output_type_any|qobject_output_type_bool|qobject_output_type_int64|qobject_output_type_null|qobject_output_type_number|qobject_output_type_str|qobject_output_type_uint64|start_list|visit_check_list|visit_check_struct|visit_start_alternate|visit_start_list|visit_start_struct|visit_type_.*";
expression list args;
typedef Error;
Error *err;
@@
- fun(args, &err);
- if (err)
+ if (!fun(args, &err))
{
...
}
A few line breaks tidied up manually.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-19-armbru@redhat.com>
We obviously only want to print a warning in these cases, but this is done
in a rather convoluted manner. Just use warn_report() instead.
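i.e. something along the lines of (warn_report() lives in
"qemu/error-report.h"; the message text here is made up):
    warn_report("cap-foo=%d not supported by TCG, continuing anyway", val);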
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159188281098.70166.18387926536399257573.stgit@bahia.lan>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Several functions can't fail anymore: ich9_pm_add_properties(),
device_add_bootindex_property(), ppc_compat_add_property(),
spapr_caps_add_properties(), PropertyInfo.create(). Drop their @errp
parameter.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-16-armbru@redhat.com>
The only way object_property_add() can fail is when a property with
the same name already exists. Since our property names are all
hardcoded, failure is a programming error, and the appropriate way to
handle it is passing &error_abort.
Same for its variants, except for object_property_add_child(), which
additionally fails when the child already has a parent. Parentage is
also under program control, so this is a programming error, too.
We have a bit over 500 callers. Almost half of them pass
&error_abort, slightly fewer ignore errors, one test case handles
errors, and the remaining few callers pass them to their own callers.
The previous few commits demonstrated once again that ignoring
programming errors is a bad idea.
Of the few ones that pass on errors, several violate the Error API.
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL. Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call. ich9_pm_add_properties(), sparc32_ledma_realize(),
sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize()
are wrong that way.
When the one appropriate choice of argument is &error_abort, letting
users pick the argument is a bad idea.
Drop parameter @errp and assert the preconditions instead.
There's one exception to "duplicate property name is a programming
error": the way object_property_add() implements the magic (and
undocumented) "automatic arrayification". Don't drop @errp there.
Instead, rename object_property_add() to object_property_try_add(),
and add the obvious wrapper object_property_add().
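The shape of the split, shown with generic types rather than the real
QOM prototypes (which carry several accessor and opaque arguments
omitted here):
    #include "qapi/error.h"
    typedef struct Thing Thing;
    typedef struct Container Container;
    /* fallible variant: reports a duplicate name via @errp */
    Thing *thing_try_add(Container *c, const char *name, Error **errp);
    /* infallible wrapper: a duplicate name is a programming error */
    static Thing *thing_add(Container *c, const char *name)
    {
        return thing_try_add(c, name, &error_abort);
    }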
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-15-armbru@redhat.com>
[Two semantic rebase conflicts resolved]
object_property_set_description() and
object_class_property_set_description() fail only when property @name
is not found.
There are 85 calls of object_property_set_description() and
object_class_property_set_description(). None of them can fail:
* 84 immediately follow the creation of the property.
* The one in spapr_rng_instance_init() refers to a property created in
spapr_rng_class_init(), from spapr_rng_properties[].
Every one of them still gets to decide what to pass for @errp.
51 calls pass &error_abort, 32 calls pass NULL, one receives the error
and propagates it to &error_abort, and one propagates it to
&error_fatal. I'm actually surprised none of them violates the Error
API.
What are we gaining by letting callers handle the "property not found"
error? It would make calls where the property is not known to exist
simpler: you don't have to guard the call with a check. We haven't found such a
use in 5+ years. Until we do, let's make life a bit simpler and drop
the @errp parameter.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-8-armbru@redhat.com>
[One semantic rebase conflict resolved]
The KVM FWNMI capability should be enabled with the "ibm,nmi-register"
rtas call. Although MCEs from KVM will be delivered as architected
interrupts to the guest before "ibm,nmi-register" is called, KVM has
different behaviour depending on whether the guest has enabled FWNMI
(it attempts to do more recovery on behalf of a non-FWNMI guest).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20200325142906.221248-2-npiggin@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There should no longer be a reason to prevent TCG providing FWNMI.
System Reset interrupts are generated to the guest with nmi monitor
command and H_SIGNAL_SYS_RESET. Machine Checks cannot be injected
currently, but this could be implemented with the mce monitor command
similarly to i386.
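For example, a System Reset can be injected from the HMP monitor with:
(qemu) nmi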
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20200316142613.121089-6-npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
[dwg: Re-enable FWNMI in qtests, since that now works]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The option is called "FWNMI", and it involves more than just machine
checks; machine checks can also be delivered without the FWNMI option.
Rename various things to reflect that.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20200316142613.121089-3-npiggin@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch adds support in QEMU to handle "ibm,nmi-register"
and "ibm,nmi-interlock" RTAS calls.
The machine check notification address is saved when the
OS issues "ibm,nmi-register" RTAS call.
This patch also handles the case where multiple processors
experience a machine check at or about the same time by
handling the "ibm,nmi-interlock" call. In such cases, as per
PAPR, subsequent processors serialize waiting for the first
processor to issue the "ibm,nmi-interlock" call. The second
processor that also received a machine check error waits
until the first processor is done reading the error log.
The first processor issues "ibm,nmi-interlock" call
when the error log is consumed.
Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
[Register fwnmi RTAS calls in core_rtas_register_types()
where other RTAS calls are registered]
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Message-Id: <20200130184423.20519-6-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce fwnmi as an spapr capability and add a helper function
which tries to enable it, to be used by a following patch
of the series. This patch by itself does not change the existing
behavior.
Signed-off-by: Aravinda Prasad <arawinda.p@gmail.com>
[eliminate cap_ppc_fwnmi, add fwnmi cap to migration state
and rephrase the commit message]
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20200130184423.20519-3-ganeshgr@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For POWER9 DD2.2 cpus, the best current Spectre v2 indirect branch
mitigation is "count cache disabled", which is configured with:
-machine cap-ibs=fixed-ccd
However, this option isn't available on DD2.3 CPUs with KVM, because they
don't have the count cache disabled.
For POWER9 DD2.3 cpus, it is "count cache flush with assist", configured
with:
-machine cap-ibs=workaround,cap-ccf-assist=on
However this option isn't available on DD2.2 CPUs with KVM, because they
don't have the special CCF assist instruction this relies on.
On current machine types, we default to "count cache flush w/o assist",
that is:
-machine cap-ibs=workaround,cap-ccf-assist=off
This runs, with mitigation on both DD2.2 and DD2.3 host cpus, but has a
fairly significant performance impact.
It turns out we can do better. The special instruction that CCF assist
uses to trigger a count cache flush is a no-op on earlier CPUs, rather than
trapping or causing other badness. It doesn't, of itself, implement the
mitigation, but *if* we have count-cache-disabled, then the count cache
flush is unnecessary, and so using the count cache flush mitigation is
harmless.
Therefore for the new pseries-5.0 machine type, enable cap-ccf-assist by
default. Along with that, suppress throwing an error if cap-ccf-assist
is selected but KVM doesn't support it, as long as KVM *is* giving us
count-cache-disabled. To allow TCG to work out of the box, even though it
doesn't implement the ccf flush assist, downgrade the error in that case to
a warning. This matches several Spectre mitigations where we allow TCG
to operate for debugging, since we don't really make guarantees about TCG
security properties anyway.
While we're there, make the TCG warning for this case match that for other
mitigations.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Free the capability name string after setting
the capability.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Message-Id: <156335156198.82682.8756968724044750843.stgit@lep8c.aus.stglabs.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It's not immediately obvious how cap-X=Y settings need to be applied
on the command line, so, for spapr capability error messages, this
has been clarified to:
appending -machine cap-X=Y
The wrong value messages have been left as is, as the user has found
the right location.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Black <daniel@linux.ibm.com>
Message-Id: <20190812071044.30806-1-daniel@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
hw/hw.h supposedly includes it for convenience. Several other headers
include it just to get VMStateDescription. The previous commit made
that unnecessary.
Include migration/vmstate.h only where it's still needed. Touching it
now recompiles only some 1600 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Other accelerators have their own headers: sysemu/hax.h, sysemu/hvf.h,
sysemu/kvm.h, sysemu/whpx.h. Only tcg_enabled() & friends sit in
qemu-common.h. This necessitates inclusion of qemu-common.h into
headers, which is against the rules spelled out in qemu-common.h's
file comment.
Move tcg_enabled() & friends into their own header sysemu/tcg.h, and
adjust #include directives.
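Users of the helper then simply do, for instance:
    #include "sysemu/tcg.h"
    if (tcg_enabled()) {
        /* TCG-specific setup */
    }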
Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-2-armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[Rebased with conflicts resolved automatically, except for
accel/tcg/tcg-all.c]
Commit 0b8c89be7f7b added the hpt_maxpagesize capability to the migration
stream. This is okay for new machine types but it breaks backward migration
to older QEMUs, which don't expect the extra subsection.
Add a compatibility boolean flag to the sPAPR machine class and use it to
skip migration of the capability for machine types 4.0 and older. This
fixes migration to an older QEMU. Note that the destination will emit a
warning:
qemu-system-ppc64: warning: cap-hpt-max-page-size lower level (16) in incoming stream than on destination (24)
This is expected and harmless though. It is okay to migrate from a lower
HPT maximum page size (64k) to a greater one (16M).
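The mechanics are the usual optional-subsection dance keyed on a class
flag (a sketch with made-up identifiers, not the actual code):
    static bool cap_hpt_maxpagesize_migration_needed(void *opaque)
    {
        SpaprMachineState *spapr = opaque;
        SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
        /* hypothetical flag set by the 4.0-and-older class init functions */
        return !smc->pre_4_1_migration;
    }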
Fixes: 0b8c89be7f7b "spapr: Add forgotten capability to migration stream"
Based-on: <20190522074016.10521-3-clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155853262675.1158324.17301777846476373459.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr machine capabilities are supposed to be sent in the migration stream
so that we can sanity check the source and destination have compatible
configuration. Unfortunately, when we added the hpt-max-page-size
capability, we forgot to add it to the migration state. This means that we
can generate spurious warnings when both ends are configured for large
pages, or potentially fail to warn if the source is configured for huge
pages, but the destination is not.
Fixes: 2309832afd "spapr: Maximum (HPT) pagesize property"
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Rename qemu_getrampagesize() to qemu_minrampagesize(). While at it,
properly rename find_max_supported_pagesize() to
find_min_backend_pagesize().
s390x is actually interested in the maximum ram pagesize, so
introduce and use qemu_maxrampagesize().
Add a TODO, indicating that looking at any mapped memory backends is not
100% correct in some cases.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190417113143.5551-3-david@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers looking like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr_caps cap-cfpc, cap-sbbc and cap-ibs are used to control the
availability of certain mitigations to the guest. These haven't been
implemented under TCG; it is unlikely they ever will be, and it is unclear
whether they even need to be.
As such, make failure to apply these capabilities under TCG non-fatal.
Instead we print a warning message to the user but still allow the guest
to continue.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301044609.9626-2-sjitindarsingh@gmail.com>
[dwg: Small style fix]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce a new spapr_cap SPAPR_CAP_CCF_ASSIST to be used to indicate
the requirement for a hw-assisted version of the count cache flush
workaround.
The count cache flush workaround is a software workaround which can be
used to flush the count cache on context switch. Some revisions of
hardware may have a hardware accelerated flush, in which case the
software flush can be shortened. This cap is used to set the
availability of such hardware acceleration for the count cache flush
routine.
The availability of such hardware acceleration is indicated by the
H_CPU_CHAR_BCCTR_FLUSH_ASSIST flag being set in the characteristics
returned from the KVM_PPC_GET_CPU_CHAR ioctl.
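The detection amounts to a flag test on the data returned by the ioctl,
roughly (sketch; error handling omitted):
    struct kvm_ppc_cpu_char c;
    if (kvm_vm_ioctl(kvm_state, KVM_PPC_GET_CPU_CHAR, &c) == 0 &&
        (c.character & c.character_mask & H_CPU_CHAR_BCCTR_FLUSH_ASSIST)) {
        /* hw-assisted count cache flush available: allow cap-ccf-assist=on */
    }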
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr_cap SPAPR_CAP_IBS is used to indicate the level of capability
for mitigations for indirect branch speculation. Currently the available
values are broken (default), fixed-ibs (fixed by serialising indirect
branches) and fixed-ccd (fixed by disabling the count cache).
Introduce a new value for this capability denoted workaround, meaning that
software can work around the issue by flushing the count cache on
context switch. This option is available if the hypervisor sets the
H_CPU_BEHAV_FLUSH_COUNT_CACHE flag in the cpu behaviours returned from
the KVM_PPC_GET_CPU_CHAR ioctl.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Enable the large decrementer by default for the pseries-4.0 machine type.
It is disabled again by default_caps_with_cpu() for pre-POWER9 cpus
since they don't support the large decrementer.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301024317.22137-4-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Implement support to allow KVM guests to take advantage of the large
decrementer introduced on POWER9 cpus.
To determine if the host can support the requested large decrementer
size, we check it matches that specified in the ibm,dec-bits device-tree
property. We also need to enable it in KVM by setting the LPCR_LD bit in
the LPCR. Note that to do this we need to try to set the bit and then read
it back to check whether the host allowed us to set it. If so, we can use
it; if we were unable to set it, the host cannot support it and we must not
use the large decrementer.
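In outline, the set-then-read-back check looks something like this (a
sketch, not the exact QEMU code; error checking omitted):
    uint64_t lpcr;
    kvm_get_one_reg(cs, KVM_REG_PPC_LPCR_64, &lpcr);
    lpcr |= LPCR_LD;
    kvm_set_one_reg(cs, KVM_REG_PPC_LPCR_64, &lpcr);
    kvm_get_one_reg(cs, KVM_REG_PPC_LPCR_64, &lpcr);
    if (!(lpcr & LPCR_LD)) {
        /* the host KVM cannot provide the large decrementer */
    }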
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-3-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Prior to POWER9 the decrementer was a 32-bit register which decremented
with each tick of the timebase. From POWER9 onwards the decrementer can
be set to operate in a mode called large decrementer where it acts as a
n-bit decrementing register which is visible as a 64-bit register, that
is the value of the decrementer is sign extended to 64 bits (where n is
implementation dependant).
The mode in which the decrementer operates is controlled by the LPCR_LD
bit in the logical paritition control register (LPCR).
>From POWER9 onwards the HDEC (hypervisor decrementer) was enlarged to
h-bits, also sign extended to 64 bits (where h is implementation
dependant). Note this isn't configurable and is always enabled.
On POWER9 the large decrementer and hdec are both 56 bits, as
represented by the lrg_decr_bits cpu class property. Since they are the
same size we only add one property for now, which could be extended in
the case they ever differ in the future.
We also add the lrg_decr_bits property for POWER5+/7/8 since it is used
to determine the size of the hdec, which is only generated on the
POWER5+ processor and later. On these processors it is 32 bits.
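For example, presenting an n-bit decrementer value as the architected
sign-extended 64-bit quantity can be done with the existing bitops
helper (illustrative):
    #include "qemu/bitops.h"
    /* nr_bits is 56 on POWER9 */
    int64_t decr = sextract64(raw_value, 0, nr_bits);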
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add spapr_cap SPAPR_CAP_LARGE_DECREMENTER to be used to control the
availability of the large decrementer for a guest.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301024317.22137-1-sjitindarsingh@gmail.com>
[dwg: Trivial style fix]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add the spapr cap SPAPR_CAP_NESTED_KVM_HV to be used to control the
availability of nested kvm-hv to the level 1 (L1) guest.
Assuming a hypervisor with support enabled, an L1 guest can be allowed to
use the kvm-hv module (and thus run its own kvm-hv guests) by setting:
-machine pseries,cap-nested-hv=true
or disabled with:
-machine pseries,cap-nested-hv=false
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It is currently not possible to run a pseries-2.12 or older machine
with HV KVM. QEMU prints the following and exits right away.
qemu-system-ppc64: KVM doesn't support for base page shift 34
The "hpt-max-page-size" capability was recently added to spapr to hide
host configuration details from HPT mode guests. Its default value for
newer machine types is 64k.
For backwards compatibility, pseries-2.12 and older machine types need
a different value. This is handled as usual in a class init function.
The default value is 16G, ie, all page sizes supported by POWER7 and
newer CPUs, but HV KVM requires guest pages to be hpa contiguous as
well as gpa contiguous. The default value is the page size used to
back the guest RAM in this case.
Unfortunately kvmppc_hpt_needs_host_contiguous_pages()->kvm_enabled() is
called way before KVM init and returns false, even if the user requested
KVM. We thus end up selecting 16G, which isn't supported by HV KVM. The
default value must be set during machine init, because we can safely
assume that KVM is initialized at this point.
We fix this by moving the logic to default_caps_with_cpu(). Since the
user cannot pass cap-hpt-max-page-size=0, we set the default to 0 in
the pseries-2.12 class init function and use that as a flag to do the
real work.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
KVM HV has some limitations (deriving from the hardware) that mean not all
host-cpu supported pagesizes may be usable in the guest. At present this
means that KVM guests and TCG guests may see different available page sizes
even if they notionally have the same vcpu model. This is confusing and
also prevents migration between TCG and KVM.
This patch makes the environment consistent by always allowing the same set
of pagesizes. Since we can't remove the KVM limitations, we do this by
always applying the same limitations it has, even to TCG guests.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
The way we used to handle KVM allowable guest pagesizes for PAPR guests
required some convoluted checking of memory attached to the guest.
The allowable pagesizes advertised to the guest cpus depended on the memory
which was attached at boot, but then we needed to ensure that any memory
later hotplugged didn't change which pagesizes were allowed.
Now that we have an explicit machine option to control the allowable
maximum pagesize we can simplify this. We just check all memory backends
against that declared pagesize. We check base and cold-plugged memory at
reset time, and hotplugged memory at pre_plug() time.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
that every page that the guest puts in the pagetables must be truly
physically contiguous, not just GPA-contiguous. In effect this means that
an HPT guest can't use any pagesizes greater than the host page size used
to back its memory.
At present we handle this by changing what we advertise to the guest based
on the backing pagesizes. This is pretty bad, because it means the guest
sees a different environment depending on what should be host configuration
details.
As a start on fixing this, we add a new capability parameter to the
pseries machine type which gives the maximum allowed pagesizes for an
HPT guest. For now we just create and validate the parameter without
making it do anything.
For backwards compatibility, on older machine types we set it to the max
available page size for the host. For the 3.0 machine type, we fix it to
16, the intention being to only allow HPT pagesizes up to 64kiB by default
in future.
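(The capability value is a page shift: 16 allows pages up to 2^16 = 64kiB,
24 up to 16MiB, and 34 up to 16GiB.)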
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
spapr capabilities have an apply hook to actually activate (or deactivate)
the feature in the system at reset time. However, a number of capabilities
affect the setup of cpus, and need to be applied to each of them -
including hotplugged cpus for extra complication. To make this simpler,
add an optional cpu_apply hook that is called from spapr_cpu_reset().
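Concretely, the capability description grows a second, optional hook next
to the machine-wide one; the signatures are roughly as follows (from
memory and using the later CamelCase type names, so treat this as
illustrative rather than authoritative):
    /* existing machine-wide hook, run at machine reset */
    void (*apply)(SpaprMachineState *spapr, uint8_t val, Error **errp);
    /* new optional per-cpu hook, run from spapr_cpu_reset(), so it also
     * covers hotplugged cpus */
    void (*cpu_apply)(SpaprMachineState *spapr, PowerPCCPU *cpu,
                      uint8_t val, Error **errp);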
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Previously, the effective values of the various spapr capability flags
were only determined at machine reset time. That was a lazy way of making
sure it was after cpu initialization so it could use the cpu object to
inform the defaults.
But we've now improved the compat checking code so that we don't need to
instantiate the cpus to use it. That lets us move the resolution of the
capability defaults much earlier.
This is going to be necessary for some future capabilities.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
ppc_check_compat() is used in a number of places to check if a cpu object
supports a certain compatibility mode, subject to various constraints.
It takes a PowerPCCPU *, however it really only depends on the cpu's class.
We have upcoming cases where it would be useful to make compatibility
checks before we fully instantiate the cpu objects.
ppc_type_check_compat() will now make an equivalent check, but based on a
CPU's QOM typename instead of an instantiated CPU object.
We make use of the new interface in several places in spapr, where we're
essentially making a global check, rather than one specific to a particular
cpu. This avoids some ugly uses of first_cpu to grab a "representative"
instance.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
In default_caps_with_cpu() we set spapr_cap_cfpc to broken for POWER8
processors and before.
Since we no longer require a private L1D cache on POWER8 for this cap to
be set to workaround, change this to default to broken for POWER7
processors and before.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The sxxm (speculative execution exploit mitigation) machine type is a
variant of the 2.12 machine type with workarounds for speculative
execution vulnerabilities enabled by default.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>