Skip zero-ing in aer_alloc_rpc() since it is allocated by kzalloc().
The closing comment marker "*/" is recommended for kernel-doc comments.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I noticed that when I inject a fatal error to an endpoint via
aer-inject, aer_root_reset() is called as reset_link for a
downstream port at upstream of the endpoint:
pcieport 0000:00:06.0: AER: Uncorrected (Fatal) error received: id=5401
:
pcieport 0000:52:02.0: Root Port link has been reset
It externally appears to be working, but internally issues some
accesses to PCI_ERR_ROOT_COMMAND/STATUS registers that is for
root port so not available on downstream port.
This patch introduces default_downstream_reset_link that is
a version of aer_root_reset() with no accesses to root port's
register. It is used for downstream ports that has no reset_link
function its specific.
This patch also updates related description in pcieaer-howto.txt.
Some minor fixes are included.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The pcie->port of port service device points the port associated
the service with. The find_aer_service iterates over children of
given port udev.
So it is clear that the pcie->port of port service of given port
udev must always point the udev.
Therefore we can know the type of udev without checking its children.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Make it clear that we only interest in 2 *_RCV bits.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Current get_e_source() returns pointer to an element of array.
However since it also progress consume counter, it is possible
that the element is overwritten by newly produced data before
the element is really consumed.
This patch changes get_e_source() to copy contents of the element
to address pointed by its caller. Once copied the element in
array can be consumed.
And relocate this function to more innocuous place.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Divide tricky for-loop into readable if-blocks.
The logic to set multi_error_valid (to force walking pci bus
hierarchy to find 2nd~ error devices) is changed too, to check
MULTI_{,_UN}COR_RCV bit individually and to force walk only when
it is required.
And rework setting e_info->severity for uncorrectable, not to use
magic numbers.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Stop iteration if we cannot register any more.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Take core part of find_device_iter() to make a new function
is_error_source() that checks given device has report an error
or not.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Return bool to indicate that the source device is found or not.
This allows us to skip calling aer_process_err_devices() if we can.
And move dev_printk for debug into this function.
v2: return bool instead of int
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These functions are only called from init/remove path of aerdrv,
so move them from aerdrv_core.c to aerdrv.c, to make them static.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This cleanup solves some minor naming issues by removing unuseful
function aer_delete_rootport() and by renaming disable_root_aer()
to aer_disable_rootport().
- Inconsistent location of alloc & free:
The struct rpc is allocated in aer_alloc_rpc() at aerdrv.c
while it is implicitly freed in aer_delete_rootport() at
aerdrv_core.c.
- Inconsistent function name:
It makes a bit confusion that aer_delete_rootport() is seemed
to be paired with aer_enable_rootport(), i.e. there is neither
"add" against "delete" nor "disable" against "enable".
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
While testing completion timeouts I found that hardware was not recovering.
It looks like the hot reset was never being propagated to the endpoint
devices on the bus due to the fact that we were clearing the bit too
quickly.
The documentation I have states that we should be transmitting hot reset
TS1s for 2ms. To achieve this I have added a 2ms delay from the time we
set the secondary bus reset bit to the time we clear it. In addition I
changed the define used for the secondary bus reset bit to match the
register define that was being used.
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Set power.async_suspend for all PCI devices and PCIe port services,
so that they can be suspended and resumed in parallel with other
devices they don't depend on in a known way (i.e. devices which are
not their parents or children).
This only affects the "regular" suspend and resume stages, which
means in particular that the restoration of the PCI devices' standard
configuration registers during resume will still be carried out
synchronously (at the "early" resume stage).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Use pci_pcie_cap() instead of pci_find_capability() to get PCIe
capability offset. This reduces redundant search in PCI configuration
space.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pci_is_pcie() instead of looking at obsolete is_pcie field in
struct pci_dev.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Apparently, some machines may have problems with PCI run-time power
management if MSIs are used for the native PCIe PME signaling. In
particular, on the MSI Wind U-100 PCIe PME interrupts are not
generated by a PCIe root port after a resume from suspend to RAM, if
the system wake-up was triggered by a PME from the device attached to
this port. [It doesn't help to free the interrupt on suspend and
request it back on resume, even if that is done along with disabling
the MSI and re-enabling it, respectively.] However, if INTx
interrupts are used for this purpose on the same machine, everything
works just fine.
For this reason, add a kernel command line switch allowing one to
request that MSIs be not used for the native PCIe PME signaling,
introduce a DMI table allowing us to blacklist machines that need
this switch to be set by default and put the MSI Wind U-100 into this
table.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCIe native PME detection mechanism is based on interrupts generated
by root ports or event collectors every time a PCIe device sends a
PME message upstream.
Once a PME message has been sent by an endpoint device and received
by its root port (or event collector in the case of root complex
integrated endpoints), the Requester ID from the message header is
registered in the root port's Root Status register. At the same
time, the PME Status bit of the Root Status register is set to
indicate that there's a PME to handle. If PCIe PME interrupt is
enabled for the root port, it generates an interrupt once the PME
Status has been set. After receiving the interrupt, the kernel can
identify the PCIe device that generated the PME using the Requester
ID from the root port's Root Status register. [For details, see PCI
Express Base Specification, Rev. 2.0.]
Implement a driver for the PCIe PME root port service working in
accordance with the above description.
Based on a patch from Shaohua Li <shaohua.li@intel.com>.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The aer_inject module hangs in aer_inject() when checking the device's
error masks. The hang is due to a recursive use of the aer_inject lock.
The aer_inject() routine grabs the lock while processing the error and then
calls pci_read_config_dword to read the masks. The pci_read_config_dword
routine is earlier overridden by pci_read_aer, which among other things,
grabs the aer_inject lock.
Fixed by moving the pci_read_config_dword calls to read the masks to before
the lock is taken.
Acked-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The Correcteable/Uncorrectable Error Mask Registers are used by PCIe AER
driver which will controls the reporting of individual errors to PCIe RC
via PCIe error messages.
If hardware masks special error reporting to RC, the aer_inject driver
should not inject aer error.
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Youquan, Song <youquan.song@intel.com>
Acked-by: Ying, Huang <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Changing occurrences of variants of PCI-X and PCIe to the PCI-SIG
terms listed in the "Trademark and Logo Usage Guidelines".
http://www.pcisig.com/developers/procedures/logos/Trademark_and_Logo_Usage_Guidelines_updated_112206.pdf
Patch is limited to drivers/pci/ and changes concern non-comment parts or
anything that might be visible to the user.
Signed-off-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This problem happened when removing PCIe root port using PCI logical
hotplug operation.
The immediate cause of this problem is that the pointer to invalid
data structure is passed to pcie_update_aspm_capable() by
pcie_aspm_exit_link_state(). When pcie_aspm_exit_link_state() received
a pointer to root port link, it unconfigures the root port link and
frees its data structure at first. At this point, there are not links
to configure under the root port and the data structure for root port
link is already freed. So pcie_aspm_exit_link_state() must not call
pcie_update_aspm_capable() and pcie_config_aspm_path().
This patch fixes the problem by changing pcie_aspm_exit_link_state()
not to call pcie_update_aspm_capable() and pcie_config_aspm_path() if
the specified link is root port link.
------------[ cut here ]------------
kernel BUG at drivers/pci/pcie/aspm.c:606!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:40/0000:40:13.0/remove
CPU 1
Modules linked in: shpchp
Pid: 9345, comm: sysfsd Not tainted 2.6.32-rc5 #98 ProLiant DL785 G6
RIP: 0010:[<ffffffff811df69b>] [<ffffffff811df69b>] pcie_update_aspm_capable+0x15/0xbe
RSP: 0018:ffff88082a2f5ca0 EFLAGS: 00010202
RAX: 0000000000000e77 RBX: ffff88182cc3e000 RCX: ffff88082a33d006
RDX: 0000000000000001 RSI: ffffffff811dff4a RDI: ffff88182cc3e000
RBP: ffff88082a2f5cc0 R08: ffff88182cc3e000 R09: 0000000000000000
R10: ffff88182fc00180 R11: ffff88182fc00198 R12: ffff88182cc3e000
R13: 0000000000000000 R14: ffff88182cc3e000 R15: ffff88082a2f5e20
FS: 00007f259a64b6f0(0000) GS:ffff880864600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007feb53f73da0 CR3: 000000102cc94000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sysfsd (pid: 9345, threadinfo ffff88082a2f4000, task ffff88082a33cf00)
Stack:
ffff88182cc3e000 ffff88182cc3e000 0000000000000000 ffff88082a33cf00
<0> ffff88082a2f5cf0 ffffffff811dff52 ffff88082a2f5cf0 ffff88082c525168
<0> ffff88402c9fd2f8 ffff88402c9fd2f8 ffff88082a2f5d20 ffffffff811d7db2
Call Trace:
[<ffffffff811dff52>] pcie_aspm_exit_link_state+0xf5/0x11e
[<ffffffff811d7db2>] pci_stop_bus_device+0x76/0x7e
[<ffffffff811d7d67>] pci_stop_bus_device+0x2b/0x7e
[<ffffffff811d7e4f>] pci_remove_bus_device+0x15/0xb9
[<ffffffff811dcb8c>] remove_callback+0x29/0x3a
[<ffffffff81135aeb>] sysfs_schedule_callback_work+0x15/0x6d
[<ffffffff81072790>] worker_thread+0x19d/0x298
[<ffffffff8107273b>] ? worker_thread+0x148/0x298
[<ffffffff81135ad6>] ? sysfs_schedule_callback_work+0x0/0x6d
[<ffffffff810765c0>] ? autoremove_wake_function+0x0/0x38
[<ffffffff810725f3>] ? worker_thread+0x0/0x298
[<ffffffff8107629e>] kthread+0x7d/0x85
[<ffffffff8102eafa>] child_rip+0xa/0x20
[<ffffffff8102e4bc>] ? restore_args+0x0/0x30
[<ffffffff81076221>] ? kthread+0x0/0x85
[<ffffffff8102eaf0>] ? child_rip+0x0/0x20
Code: 89 e5 8a 50 48 31 c0 c0 ea 03 83 e2 07 e8 b2 de fe ff c9 48 98 c3 55 48 89 e5 41 56 49 89 fe 41 55 41 54 53 48 83 7f 10 00 74 04 <0f> 0b eb fe 48 8b 05 da 7d 63 00 4c 8d 60 e8 4c 89 e1 eb 24 4c
RIP [<ffffffff811df69b>] pcie_update_aspm_capable+0x15/0xbe
RSP <ffff88082a2f5ca0>
---[ end trace 6ae0f65bdeab8555 ]---
Reported-by: Alex Chiang <achiang@hp.com>
Tested-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The pci_cleanup_aer_correct_error_status() function has been
#if 0'd out since 2.6.25. Time to remove the dead code.
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The current implementation of pci_cleanup_aer_uncorrect_error_status
only clears either fatal or non-fatal error status bits depending
on the state of the I/O channel. This implementation will then often
leave some bits set after PCI error recovery completes. The uncleared bit
settings will then be falsely reported the next time an AER interrupt is
generated for that hierarchy. An easy way to illustrate this issue is to
use the aer-inject module to simultaneously inject both an uncorrectable
non-fatal and uncorrectable fatal error. One of the errors will not be
cleared.
This patch resolves this issue by unconditionally clearing all bits in
the AER uncorrectable status register. All settings and corrective action
strategies are saved and determined before
pci_cleanup_aer_uncorrect_error_status is called, so this change should not
affect errory handling functionality.
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Remove 'port_type' field in struct pcie_port_data(), because we can
get port type information from struct pci_dev. With this change, this
patch also does followings:
- Remove struct pcie_port_data because it no longer has any field.
- Remove portdrv private definitions about port type (PCIE_RC_PORT,
PCIE_SW_UPSTREAM_PORT and PCIE_SW_DOWNSTREAM_PORT), and use generic
definitions instead.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add missing service irqs cleanup in the error code path of
pcie_port_device_register().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Call pci_enable_device() before initializing service irqs, because
legacy interrupt is initialized in pci_enable_device() on some
architectures.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch cleans up the service irqs initialization as follows:
- Remove 'irq_mode' field in pcie_port_data and related definitions,
which is not needed because we can get the same information from
'is_msix', 'is_msi' and 'pin' fields in struct pci_dev.
- Change the name of 'vectors' argument of assign_interrupt_mode() to
'irqs' because it holds irq numbers actually. People might confuse
it with CPU vector or MSI/MSI-X vector.
- Change function name assign_interrupt_mode() to init_service_irqs()
becasuse we no longer have 'irq_mode' data structure, and new name
is more straightforward (IMO).
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
No reason to check PME capability outside get_port_device_capability().
Do it in get_port_device_capability().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCIe port type is already stored in 'pcie_type' field of struct
pci_dev. So we don't need to get it from pci configuration space.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the current port bus driver implementation, pcie_device allocation,
initialization and registration are done in separated functions. Doing
those in one function make the code simple and easier to read.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We don't need pcie_port_device_probe() because we can get pci
device/port type using pci_is_pcie() and 'pcie_type' fields in struct
pci_dev. Remove pcie_port_device_probe().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Changes for PCIe AER driver to use pci_is_pcie() instead of checking
pci_dev->is_pcie.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Change for PCIe ASPM driver to use pci_is_pcie() instead of checking
pci_dev->is_pcie.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pci_pcie_cap() instead of pci_find_capability() to get PCIe capability
offset in PCIe ASPM driver. This avoids unnecessary search in PCI
configuration space.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pci_pcie_cap() instead of pci_find_capability() to get PCIe capability
offset in PCI Express Port Bus driver. This avoids unnecessary serarch
in PCI configuration space.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pcie_cap() instead of pci_find_capability() to get PCIe capability
offset in PCIe AER driver. This avoids unnecessary search in PCI
configuration space.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Replaced some error return values in aer_inject. Use -ENODEV when we
can't find a device and -ENOTTY when the device does not support PCIe AER.
Acked-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add support for PCI domains (segments) to aer_inject.
Acked-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Feedback from Hidetoshi Seto and Kenji Kaneshige incorporated. This
correctly handles PCI-X bridges, PCIe root ports and endpoints, and
prints debug messages when invalid/reserved types are found in the
HEST. PCI devices not in domain/segment 0 are not represented in
HEST, thus will be ignored.
Today, the PCIe Advanced Error Reporting (AER) driver attaches itself
to every PCIe root port for which BIOS reports it should, via ACPI
_OSC.
However, _OSC alone is insufficient for newer BIOSes. Part of ACPI
4.0 is the new APEI (ACPI Platform Error Interfaces) which is a way
for OS and BIOS to handshake over which errors for which components
each will handle. One table in ACPI 4.0 is the Hardware Error Source
Table (HEST), where BIOS can define that errors for certain PCIe
devices (or all devices), should be handled by BIOS ("Firmware First
mode"), rather than be handled by the OS.
Dell PowerEdge 11G server BIOS defines Firmware First mode in HEST, so
that it may manage such errors, log them to the System Event Log, and
possibly take other actions. The aer driver should honor this, and
not attach itself to devices noted as such.
Furthermore, Kenji Kaneshige reminded us to disallow changing the AER
registers when respecting Firmware First mode. Platform firmware is
expected to manage these, and if changes to them are allowed, it could
break that firmware's behavior.
The HEST parsing code may be replaced in the future by a more
feature-rich implementation. This patch provides the minimum needed
to prevent breakage until that implementation is available.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: Prevent AER driver from being loaded on non-root port PCIE devices
PCI: get larger bridge ranges when space is available
PCI: pci.c: fix kernel-doc notation
PCI quirk: TI XIO200a erroneously reports support for fast b2b transfers
PCI PM: Read device power state from register after updating it
PCI: remove pci_assign_resource_fixed()
PCI: PCIe portdrv: remove "-driver" from driver name
After m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
A bug was seen on boards using a PLX 8518 switch device which advertises
AER on each of it's transparent bridges. The AER driver was loaded for
each bridge and this driver tried to access the AER source ID register
whenever an interrupt occured on the shared PCI INTX lines. The source
ID register does not exist on non root port PCIE device's which
advertise AER and trying to access this register causes a unsupported
request error on the bridge. Thus, when the next interrupt occurs,
another error is found and the non existent source ID register is
accessed again, and so it goes on.
The result is a spammed dmesg with unsupported request PCI express
errors on the bridge device that the AER driver is loaded against.
Reported-by: Malcolm Crossley <malcolm.crossley2@gefanuc.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Malcolm Crossley <malcolm.crossley2@gefanuc.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
No need to include "-driver" in the driver name.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
CC: Tom Long Nguyen <tom.l.nguyen@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When booting with pci=nomsi aer causes lost interrupts and
lockdep inversions.
So check if MSIs are not disabled before initializing the aer
driver.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The definition of the ASPM support field in the Link Capabilities
Register had been changed by the "ASPM optionality ECN" as follows:
<Before>
00b Reserved
01b L0s Supported
10b Reserved
11b L0s and L1 Supported
<After>
00b No ASPM Support
01b L0s Supported
10b L1 Supported
11b L0s and L1 Supported
Current linux ASPM driver doesn't enable ASPM if the support field is
00b or 10b. So there is no impact about 00b. But current linux ASPM
driver doesn't enable L1 if the support field is 10b. With this patch,
10b (L1 support) is handled properly.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
After commit c82f63e411
(PCI: check saved state before restore) pcie_portdrv_slot_reset()
may not work correctly if dev->error_state is equal to
pci_channel_io_frozen, because dev->state_saved need not be set at
that time. Fix this issue by setting dev->state_saved before
pci_restore_state() is called in pcie_portdrv_slot_reset().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
There is a very old quirk for the intel E7502 E7320 and E7525 memory
controller hubs that disables usage of msi interrupts on pcie hotplug
bridges of those devices, and disables changing the affinity of irqs.
Today all we have to do to disable msi on a specific device is to set
dev->no_msi, which is much more straightforward than the previous
logic.
The re-running of this fixup after pci hotplug happens below these
devices is totally bogus. All of the state we change is pure software
state and we don't change the hardware at all. Which means hotplug on
the lower devices doesn't have a chance to change this state. So we
can safely remove the special case from the pciehp driver and the pcie
portdriver.
I suspect the special case was someone's expermental debug code that
slipped in. Certainly it isn't mentioned in commit
6fb8880a61510295aece04a542767161f624dffe aka BKrev:
41966101LJ_ogfOU0m2aE6teZfQnuQ where the code first appears.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Multiple bits might be set in the Uncorrectable Error Status
register. But aer_print_error_source() only report a error of
the lowest bit set in the error status register.
So print strings for all bits unmasked and set.
And check First Error Pointer to mark the error occured first.
This FEP is not valid when the corresponing bit of the Uncorrectable
Error Status register is not set, or unimplemented or undefined.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
ERR_{,UN}CORRECTABLE_ERROR_MASK are set of error bits which linux know,
set of PCI_ERR_COR_* and PCI_ERR_UNC_* defined in linux/pci_regs.h.
This masks make aerdrv not to report errors of unknown bit, while aerdrv
have ability to report such undefined errors as "Unknown Error Bit %2d".
OTOH aerdrv_errprint does not have any check of setting in mask register.
So it could report masked wrong error by finding bit in status without
knowing that the bit is masked in the mask register.
This patch changes aerdrv to use mask state in mask register propely
instead of defined/hardcoded ERR_{,UN}CORRECTABLE_ERROR_MASK.
This change prevents aerdrv from reporting masked error, and also enable
reporting unknown errors.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The static buffer errmsg_buff[] is used only for building error
message in fixed format, and is protected by a spinlock.
This patch removes this buffer and the spinlock.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The flag AER_MULTI_ERROR_VALID_FLAG in info->flag does mean that the
root port receives multiple error messages. Error messages can be
posted from different devices, so it does not mean that each reported
device has multiple errors.
If there are multiple error devices and the root port has valid error
source ID, it would be nice to report which device is the error source
reported first.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In case of multiple errors, struct aer_err_info would be reused among
all reported devices. So the info->status should be initialized before
recycled. Otherwise error of one device might be reported as the error
of another device. Also info->flags has similar problem on reporting
TLP header.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Definitions of MASK macros in aerdrv_errprint.c are tricky and unsafe.
For example, AER_AGENT_TRANSMITTER_MASK(_sev, _stat) does work like:
static inline func(int _sev, int _stat)
{
if (_sev == AER_CORRECTABLE)
return (_stat & (PCI_ERR_COR_REP_ROLL|PCI_ERR_COR_REP_TIMER));
else
return (_stat & PCI_ERR_COR_REP_ROLL);
}
In case of else path here, for uncorrectable errors, testing bits in
_stat by PCI_ERR_COR_* does not make sense because _stat should have only
PCI_ERR_UNC_* bits originated in uncorrectable error status register.
But at this time this is safe because uncorrectable error using bit
position same to PCI_ERR_COR_REP_ROLL(= bit position 8) is not defined.
Likewise, AER_AGENT_COMPLETER_MASK is always PCI_ERR_UNC_COMP_ABORT but
it works because bit 15 of correctable error status is not defined.
It means that these MASK macros will turn to be wrong once if new error
is defined. (In fact, bit 15 of correctable is now defined in PCIe 2.1)
This patch changes these MASK macros to be more strict, not to return
PCI_ERR_COR_* bits for uncorrectable error status and vise versa.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The L0s state can be managed separately for each direction (upstream
direction and downstream direction) of the link. But in the current
implementation, those are mixed up. With this patch, L0s for each
direction are managed separately.
To maintain three states (upstream direction L0s, downstream L0s and
L1), 'aspm_support', 'aspm_enabled', 'aspm_capable', 'aspm_disable'
and 'aspm_default' fields in struct pcie_link_state are changed to
3-bit from 2-bit. The 'latency' field is separated to two 'latency_up'
and 'latency_dw' fields to maintain exit latencies for each direction
of the link. For L0, 'latency_up.l0' and 'latency_dw.l0' are used to
configure upstream direction L0s and downstream direction L0s
respectively. For L1, larger value of 'latency_up.l1' and
'latency_dw.l1' is considered as L1 exit latency.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the current implementation, ASPM L0s/L1 is disabled for all links
in the hierarchy if one of the link doesn't meet latency requirement.
But we can partially enable ASPM L0s/L1 on sub-tree in the hierarchy.
This patch allows partial L0s/L1 enablement in the hierarchy. And it
also reduce the calculation cost of ASPM configuration very much.
In the previous implementation, all links were enabled with the same
state. With this patch, enabled state for each link is determined
simply as follows (the 'requested' is from policy_to_aspm_state()).
enabled = requested & (link->aspm_capable & link->aspm_disable)
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce 'aspm_capable' field to maintain the capable ASPM setting of
the link. By the 'aspm_capable', we don't need to recheck latency
every time ASPM policy is changed.
Each bit in 'aspm_capable' is associated to ASPM state (L0S/L1). The
bit is set if the associated ASPM state is supported by the link and
it satisfies the latency requirement (i.e. exit latency < endpoint
acceptable latency). The 'aspm_capable' is updated when
- an endpoint device is added (boot time or hot-plug time)
- an endpoint device is removed (hot-unplug time)
- PCI power state is changed.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce 'aspm_disable' flag to manage disabled ASPM state more
robust way.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix possible NULL dereference in pcie_aspm_exit_link_state(). This
patch also cleanup some code.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Remove the following check in __pcie_aspm_config_link() because it
nerver be true.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We must not clear bits in 'aspm_enabled' using 'aspm_support', or
'aspm_enabled' and 'aspm_default' might be different from the actual
state. In addtion, 'aspm_default' should be intialized even if
'aspm_support' is 0.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
By having a pointer to the root port link, we can remove loops in
get_root_port_link() to search the root port link.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We don't need the 'has_switch' field in the struct pcie_link_state.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cleanup for calc_L0S_latency() and calc_L1_latency().
- Separate exit latency and acceptable latency calculation.
- Some minor cleanups.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the current ASPM implementation, callers of pcie_set_clock_pm() check
Clock PM capability of the link or current Clock PM state of the link.
This check should be done in pcie_set_clock_pm() itself.
This patch moves those checks into pcie_set_clock_pm(). It also
introduces pcie_set_clkpm_nocheck() that is equivalent to old
pcie_set_clock_pm(), for the caller who wants to change Clocl PM state
regardless of the Clock PM capability or current Clock PM state. In
addition, this patch changes the function name from
pcie_set_clock_pm() to pcie_set_clkpm() for consistency.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Clean up ASPM initialization by refactoring some functionality, renaming
functions, and moving things around.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the current ASPM implementation, there are many functions that
take a pointer to struct pci_dev corresponding to the upstream component
of the link as a parameter. But, since those functions handle PCI
express link state, a pointer to struct pcie_link_state is more
suitable than a pointer to struct pci_dev. Changing a parameter to a
pointer to struct pcie_link_state makes ASPM code much simpler and
easier to read. This patch also contains some minor cleanups. This patch
doesn't have any functional change.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cleanup for some fields in pcie_link_state.
- Add comments.
- make "downstream_has_switch" field 1-bit.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The "clk_pm_capable", "clk_pm_enable" and "bios_clk_state" fields in
the struct pcie_link_state only take 1-bit value. So those fields
don't need to be defined as unsigned int. This patch makes those
fields 1-bit, and cleans up some related code.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Clean up latency related data structures for ASPM.
- Introduce struct acpi_latency for exit latency and acceptable
latency management. With this change, struct endpoint_state is no
longer needed.
- We don't need to hold both upstream latency and downstream latency
in the current implementation.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The "support_state", "enabled_state" and "bios_aspm_state" fields in
the struct pcie_link_state take 2-bit value. So those fields don't
need to be defined as unsigned int. This patch makes those fields
2-bit, and cleans up some related code.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix a typo in struct pcie_link_state.
The "sibiling" field in the struct pcie_link_state should be
"sibling".
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Debugging PCIE AER code can be very difficult because it is hard
to trigger various real hardware errors. This patch provide a
software based error injection tool, which can fake various PCIE
errors with a user space helper tool named "aer-inject". Which
can be gotten from:
http://www.kernel.org/pub/linux/kernel/people/yhuang/
The patch fakes AER error by faking some PCIE AER related
registers and an AER interrupt for specified the PCIE device.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When a root port receives the same errors more than once before the
kernel process them, the Multiple Error Messages Received flags are set
by hardware. Because the root port could only save one kind of
correctable error source id and another uncorrectable error source id at
the same time, the second message sender id is lost if the 2 messages
are sent from 2 different devices. This patch makes the kernel search
all devices under the root port when multiple messages are received.
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When the bus id part of error source id is equal to 0 or nosourceid=1,
make the kernel probe the AER status registers of all devices under the
root port to find the initial error reporter.
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Based on PCI Express AER specs, a root port might receive multiple
TLP errors while it could only save a correctable error source id
and an uncorrectable error source id at the same time. In addition,
some root port hardware might be unable to provide a correct source
id, i.e., the source id, or the bus id part of the source id provided
by root port might be equal to 0.
The patchset implements the support in kernel by searching the device
tree under the root port.
Patch 1 changes parameter cb of function pci_walk_bus to return a value.
When cb return non-zero, pci_walk_bus stops more searching on the
device tree.
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>