The revision 0x300 generic error data entry is different
from the old version, but currently iterating through the
GHES estatus blocks does not take into account this difference.
This will lead to failure to get the right data entry if GHES
has revision 0x300 error data entry.
Update the GHES estatus iteration macro to properly increment using
acpi_hest_get_next(), and correct the iteration termination condition
because the status block data length only includes error data
length.
Convert the CPER estatus checking and printing iteration logic
to use same macro.
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently external aborts are unsupported by the guest abort
handling. Add handling for SEAs so that the host kernel reports
SEAs which occur in the guest kernel.
When an SEA occurs in the guest kernel, the guest exits and is
routed to kvm_handle_guest_abort(). Prior to this patch, a print
message of an unsupported FSC would be printed and nothing else
would happen. With this patch, the code gets routed to the APEI
handling of SEAs in the host kernel to report the SEA information.
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
ARM APEI extension proposal added SEA (Synchronous External Abort)
notification type for ARMv8.
Add a new GHES error source handling function for SEA. If an error
source's notification type is SEA, then this function can be registered
into the SEA exception handler. That way GHES will parse and report
SEA exceptions when they occur.
An SEA can interrupt code that had interrupts masked and is treated as
an NMI. To aid this the page of address space for mapping APEI buffers
while in_nmi() is always reserved, and ghes_ioremap_pfn_nmi() is
changed to use the helper methods to find the prot_t to map with in
the same way as ghes_ioremap_pfn_irq().
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ACPI 6.1 spec adds a new revision of the generic error data
entry structure. Add support to handle the new structure as well
as properly verify and iterate through the generic data entries.
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
A RAS (Reliability, Availability, Serviceability) controller
may be a separate processor running in parallel with OS
execution, and may generate error records for consumption by
the OS. If the RAS controller produces multiple error records,
then they may be overwritten before the OS has consumed them.
The Generic Hardware Error Source (GHES) v2 structure
introduces the capability for the OS to acknowledge the
consumption of the error record generated by the RAS
controller. A RAS controller supporting GHESv2 shall wait for
the acknowledgment before writing a new error record, thus
eliminating the race condition.
Add support for parsing of GHESv2 sub-tables as well.
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The following commit has changed ACPICA table header definitions:
Commit: 88f074f487
Subject: ACPI, CPER: Update cper info
While such definitions are currently maintained in ACPICA. As the
modifications applying to the table definitions affect other OSPMs'
drivers, it is very difficult for ACPICA to initiate a process to
complete the merge. Thus this commit finally only leaves us divergences.
Revert such naming modifications to reduce the source code differecnes
between Linux and ACPICA upstream. No functional changes.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: Bob Moore <robert.moore@intel.com>
Cc: Chen, Gong <gong.chen@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We have a lot of confusing names of functions and data structures in
amongs the the error reporting code. In particular the "apei" prefix
has been applied to many objects that are not part of APEI. Since we
will be using these routines for extended error log reporting it will
be clearer if we fix up the names first.
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
In order to allow reporting errors via EDAC, add hooks for:
1) register an EDAC driver;
2) unregister an EDAC driver;
3) report errors via EDAC.
As the EDAC driver will need to access the ghes structure, adds it
as one of the parameters for ghes_do_proc.
Acked-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
As a ghes_edac driver will need to access ghes structures, in order
to properly handle the errors, move those structures to a separate
header file. No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>