Commit Graph

1670 Commits

Author SHA1 Message Date
Trent Piepho 83dbf66f04 PCI Hotplug: restore fakephp interface with complete reimplementation
A complete re-implementation of fakephp is necessary if it is to
present its former interface (pre-2.6.27, when it broke). The
reason is that PCI hotplug drivers call pci_hp_register(), which
enforces the rule that only one /sys/bus/pci/slots/ file may be
created per physical slot.

The change breaks the old fakephp's assumption that it could
create a file per function. So we re-implement fakephp to avoid
using the standard PCI hotplug API so that we can restore the old
fakephp user interface.

It puts entries in /sys/bus/pci/slots with the names of all PCI
devices/functions, exactly symmetrical to what is shown in
/sys/bus/pci/devices. Each slots/ entry has a "power" attribute,
which works the same way as the fakephp driver's power attribute
has worked.

There are a few improvements over old fakephp, which couldn't handle
PCI devices being added or removed via a means outside of
fakephp's knowledge.  If a device was added another way, old fakephp
didn't notice and didn't create the fake slot for it.  If a
device was removed another way, old fakephp didn't delete the fake
slot for it (and accessing the stale slot caused an oops).

The new implementation overcomes these limitations. As a
consequence, removing a bridge with other devices behind it now
works as well, which is something else old fakephp couldn't do
previously.

This duplicates a tiny bit of the code in the PCI core that does
this same function.  Re-using that code ends up being more
complex than duplicating it, and it makes code in the PCI core
more ugly just to support this legacy fakephp interface
compatibility layer.

Reviewed-by: James Cameron <qz@hp.com>
Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:59:25 -07:00
Alex Chiang 738a6396c2 PCI: Introduce /sys/bus/pci/devices/.../rescan
This interface allows the user to force a rescan of the device's
parent bus and all subordinate buses, and rediscover devices removed
earlier from this part of the device tree.

Cc: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:59:07 -07:00
Alex Chiang 77c27c7b49 PCI: Introduce /sys/bus/pci/devices/.../remove
This patch adds an attribute named "remove" to a PCI device's sysfs
directory.  Writing a non-zero value to this attribute will remove the PCI
device and any children of it.

Trent Piepho wrote the original implementation and documentation.

Thanks to Vegard Nossum for testing under kmemcheck and finding locking
issues with the sysfs interface.

Cc: Trent Piepho <xyzzy@speakeasy.org>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:58:48 -07:00
Alex Chiang 705b1aaa82 PCI: Introduce /sys/bus/pci/rescan
This interface allows the user to force a rescan of all PCI buses
in system, and rediscover devices that have been removed earlier.

pci_bus_attrs implementation from Trent Piepho.

Thanks to Vegard Nossum for discovering locking issues with the
sysfs interface.

Cc: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:57:58 -07:00
Alex Chiang 3ed4fd96b3 PCI: Introduce pci_rescan_bus()
This API is used by the PCI core to rescan a bus and rediscover
newly added devices.

Over time, it is expected that the various PCI hotplug drivers
will migrate to this interface and away from the old
pci_do_scan_bus() interface.

Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:57:44 -07:00
Alex Chiang 9dd90cafa7 PCI: do not enable bridges more than once
In preparation for PCI core hotplug, we need to ensure that we do
not attempt to re-enable bridges that have already been enabled.

Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:57:36 -07:00
Alex Chiang b73e97d95c PCI: do not initialize bridges more than once
In preparation for PCI core hotplug, we need to ensure that we do
not attempt to re-initialize bridges that have already been initialized.

We only need to worry about non-root buses, since we will not allow
root bus removal.

Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:57:32 -07:00
Alex Chiang 74710ded8e PCI: always scan child buses
While scanning bridges, we stop our scan if we encounter a bus
that we've seen before, to work around some buggy chipsets. This
is a good idea, but prevents us from fully scanning the PCI bus
at a future time (to find newly hot-added devices, for example).

Change the logic so that we skip _re-adding_ an existing bus
that we've seen before, but also allow the scan to descend to
all child buses.

Now that we're potentially scanning our child buses again, we
also need to be sure not to attempt re-initializing their BARs
so we avoid that.

This patch lays the groundwork to allow the user to issue a
rescan of the PCI bus at any time.

Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:57:21 -07:00
Trent Piepho 1b69dfc649 PCI: pci_scan_slot() returns newly found devices
pci_scan_slot() has been rewritten to be less complex and will now
return the number of *new* devices found.

Existing callers need not worry because they already assume that
they can't call pci_scan_slot() on an already-scanned slot.

Thus, there is no semantic change for existing callers: returning
newly found devices (this patch) is exactly equal to returning all
found devices (before this patch).

This patch adds some more groundwork to allow us to rescan the
PCI bus during runtime to discover newly added devices.

Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:57:05 -07:00
Trent Piepho 90bdb3117f PCI: don't scan existing devices
pci_scan_single_device is supposed to add newly discovered
devices to pci_bus->devices, but doesn't check to see if the
device has already been added. This can cause problems if we ever
want to use this interface to rescan the PCI bus.

If the device is already added, just return it.

Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:56:45 -07:00
Yu Zhao 74bb1bcc7d PCI: handle SR-IOV Virtual Function Migration
Add or remove a Virtual Function after receiving a Migrate In or Out
Request.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:28 -07:00
Yu Zhao dd7cc44d0b PCI: add SR-IOV API for Physical Function driver
Add or remove the Virtual Function when the SR-IOV is enabled or
disabled by the device driver. This can happen anytime rather than
only at the device probe stage.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:26 -07:00
Yu Zhao 480b93b783 PCI: centralize device setup code
Move the device setup stuff into pci_setup_device() which will be used
to setup the Virtual Function later.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:25 -07:00
Yu Zhao a28724b0fb PCI: reserve bus range for SR-IOV device
Reserve the bus number range used by the Virtual Function when
pcibios_assign_all_busses() returns true.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:24 -07:00
Yu Zhao 8c5cdb6adc PCI: restore saved SR-IOV state
Restore the volatile registers in the SR-IOV capability after the
D3->D0 transition.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:24 -07:00
Yu Zhao d1b054da8f PCI: initialize and release SR-IOV capability
If a device has the SR-IOV capability, initialize it (set the ARI
Capable Hierarchy in the lowest numbered PF if necessary; calculate
the System Page Size for the VF MMIO, probe the VF Offset, Stride
and BARs). A lock for the VF bus allocation is also initialized if
a PF is the lowest numbered PF.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:22 -07:00
David O'Shea 8293b0f629 PCI: Compaq Evo D510 SMBus quirk using USB instead of VGA
On the Compaq Evo D510 SFF/CMT, a PCI quirk activated the SMBus device
based on detection of the on-board VGA controller, but the on-board
VGA is disabled if an AGP card is inserted, so look for one of the USB
controllers instead.

Signed-off-by: David O'Shea <dcoshea@hotmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:19 -07:00
Dave Airlie 217f45de3d PCI: expose boot VGA device via sysfs.
X really would like to know which VGA device was considered the boot
device by the system. The x86 PCI fixups have support for discovering
this but we provide no way to expose it to userspace.

This adds a sysfs file per VGA class device which has the value 0 for
non the boot device or unknown, and 1 if the VGA device is the boot
device.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:17 -07:00
Yinghai Lu dfadd9edff PCI/x86: detect host bridge config space size w/o using quirks
Many host bridges support a 4k config space, so check them directy
instead of using quirks to add them.

We only need to do this extra check for host bridges at this point,
because only host bridges are known to have extended address space
without also having a PCI-X/PCI-E caps.  Other devices with this
property could be done with quirks (if there are any).

As a bonus, we can remove the quirks for AMD host bridges with family
10h and 11h since they're not needed any more.

With this patch, we can get correct pci cfg size of new Intel CPUs/IOHs
with host bridges.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:17 -07:00
Alex Chiang 745be2e700 PCIe: portdrv: call pci_disable_device during remove
The PCIe port driver calls pci_enable_device when registering
ports, but never calls pci_disable_device during removal.

Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:16 -07:00
Alex Chiang 9efb5fe1b8 PCI: PCIe portdrv: eliminate double kfree in remove path
Commit 55633af3 (PCIe portdrv: Use driver data to simplify code)
added a kfree of the driver private data in pcie_port_device_remove
but forgot to remove the old kfree from pcie_portdrv_remove.

Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:16 -07:00
Geert Uytterhoeven 6a3b3e2680 PCI: Use kzalloc() in pci_create_bus()
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:15 -07:00
Yuji Shimada 32a9a682be PCI: allow assignment of memory resources with a specified alignment
This patch allows memory resources to be assigned with a specified
alignment at boot-time or run-time. The patch is useful when we use PCI
pass-through, because page-aligned memory resources are required to
securely share PCI resources with guest drivers.

If you want to assign the resource at boot time, please set
"pci=resource_alignment=" boot parameter.

This is format of "pci=resource_alignment=" boot parameter:

        [<order of align>@][<domain>:]<bus>:<slot>.<func>[; ...]
                Specifies alignment and device to reassign
                aligned memory resources.
                If <order of align> is not specified, PAGE_SIZE is
                used as alignment.
                PCI-PCI bridge can be specified, if resource
                windows need to be expanded.

This is example:

        pci=resource_alignment=20@07:00.0;18@0f:00.0;00:1d.7

If you want to assign the resource at run-time, please set
"/sys/bus/pci/resource_alignment" file, and hot-remove the device and
hot-add the device.  For this purpose, fakephp or PCI hotplug interfaces
can be used.

The format of "/sys/bus/pci/resource_alignment" file is the same with
boot parameter. You can use "," instead of ";".

For example:

        # cd /sys/bus/pci
        # echo -n 20@12:00.0 > resource_alignment
        # echo 1 > devices/0000:12:00.0/remove
        # echo 1 > rescan

Reviewed-by: Alex Chiang <achiang@hp.com>
Reviewed-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:15 -07:00
Matthew Wilcox 1c8d7b0a56 PCI MSI: Add support for multiple MSI
Add the new API pci_enable_msi_block() to allow drivers to
request multiple MSI and reimplement pci_enable_msi in terms of
pci_enable_msi_block.  Ensure that the architecture back ends don't
have to know about multiple MSI.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:14 -07:00
Matthew Wilcox f2440d9acb PCI MSI: Refactor interrupt masking code
Since most of the callers already know whether they have an MSI or
an MSI-X capability, split msi_set_mask_bits() into msi_mask_irq()
and msix_mask_irq().  The only callers which don't (mask_msi_irq()
and unmask_msi_irq()) can share code in msi_set_mask_bit().  This then
becomes the only caller of msix_flush_writes(), so we can inline it.
The flushing read can be to any address that belongs to the device,
so we can eliminate the calculation too.

We can also get rid of maskbits_mask from struct msi_desc and simply
recalculate it on the rare occasion that we need it.  The single-bit
'masked' element is replaced by a copy of the 32-bit 'masked' register,
so this patch does not affect the size of msi_desc.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:13 -07:00
Matthew Wilcox 264d9caaa1 PCI MSI: Use mask_pos instead of mask_base when appropriate
MSI interrupts have a mask_pos where MSI-X have a mask_base.  Use a
transparent union to get rid of some ugly casts.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:13 -07:00
Matthew Wilcox 379f5327a8 PCI MSI: msi_desc->dev is always initialised
By passing the pci_dev into alloc_msi_entry() we can be sure that
the ->dev entry is always assigned and so we don't need to check it.
Also, we used kzalloc() so we don't need to initialise ->irq to 0.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:12 -07:00
Matthew Wilcox 24d2755339 PCI MSI: Replace 'type' with 'is_msix'
By changing from a 5-bit field to a 1-bit field, we free up some bits
that can be used by a later patch.  Also rearrange the fields for better
packing on 64-bit platforms (reducing the size of msi_desc from 72 bytes
to 64 bytes).

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:12 -07:00
Chris Wright 0994375e96 PCI: add remove_id sysfs entry
This adds a remove_id sysfs entry to allow users of new_id to later
remove the added dynid.  One use case is management tools that want to
dynamically bind/unbind devices to pci-stub driver while devices are
assigned to KVM guests.  Rather than having to track which driver was
originally bound to the driver, a mangement tool can simply:

Guest uses device

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:11 -07:00
Kenji Kaneshige c74d724462 PCI: fix wrong assumption in pci_common_swizzle
Current pci_common_swizzle() seems to have a assumption that
pci_bus->self is NULL on the pci root bus. But it might not be true on
some platforms. Because of this wrong assumption, pci_common_swizzle()
might cause endless loop. We must check pci_bus->parent instead.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:05 -07:00
Kenji Kaneshige c2a3072e01 PCI: fix wrong assumption in pci_get_interrupt_pin
Current pci_get_interrupt_pin() seems to have an assumption that
pci_bus->self is NULL on the root pci bus. But it might not be true on
some platforms. Because of this wrong assumption, current
pci_get_interrupt_pin() might cause endless loop. We must check
pci_bus->parent instead.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:04 -07:00
Kenji Kaneshige f92d4e29d7 PCI: fix wrong assumption in pci_read_bridge_bases
Current pci_read_bridge_bases() has an assumption that pci_bus->self
is NULL on the pci root bus (It checks pci_bus->self to see if the pci
bus is root bus). But is might not true on some platforms. We must
check pci_bus->parent instead.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:04 -07:00
Kenji Kaneshige 151ab36a2e PCI: fix wrong assumption in pci_find_upstream_pcie_bridge
Current pci_find_upstream_pcie_bridge() has a wrong assumption that
pci_bus->self is NULL on the root pci bus. But it might not true on
some platforms. Because of this wrong assumption, current
pci_find_upstream_pcie_bridge() might cause endless loop. We must
check pci_bus->parent instead.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:03 -07:00
Kenji Kaneshige d391f00f0e PCI hotplug: fix wrong assumption in acpi_get_hp_hw_control_from_firmware
Current acpi_get_hp_hw_control_from_firmware() has a assumption that
pci_bus->self is NULL on a PCI root bus. But it might not be true on
some platforms. Because of this wrong assumption, current
acpi_get_hp_hw_control_from_firmware() might cause endless loop. We
must check pci_bus->parent instead.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:02 -07:00
Kenji Kaneshige 267efd7eec PCI hotplug: fix wrong assumption in acpi_get_hp_params_from_firmware
Current acpi_get_hp_params_from_firmware() has a assumption that
pci_bus->self is NULL on the root pci bus. But it might not true on
some platforms. Because of this wrong assumption, current
acpi_get_hp_params_from_firmware() might cause endless loop. We must
check pci_bus->parent instead.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:00 -07:00
Rafael J. Wysocki 3a3c244c9a PCI: PCIe portdrv: Implement pm object
Implement pm object for the PCI Express port driver in order to use
the new power management framework and reduce the code size.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:47:49 -07:00
Eric W. Biederman ae40582e99 PCI: pcie_portdriver: fix pcie_port_device_remove
pcie_port_device_remove currently calls the remove method of port
drivers twice.  Ouch!

We are calling device_for_each_child multiple times for no apparent
reason.

So make it simple. Place put_device and device_unregister into
remove_iter, and throw out the rest.  Only call device_for_each_child
once.

The code is simpler and actually works!

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:47:33 -07:00
Ivan Kokshaysky 10a0ef39fb PCI/alpha: pci sysfs resources
This closes http://bugzilla.kernel.org/show_bug.cgi?id=10893
which is a showstopper for X development on alpha.

The generic HAVE_PCI_MMAP code (drivers/pci-sysfs.c) is not
very useful since we have to deal with three different types
of MMIO address spaces: sparse and dense mappings for old
ev4/ev5 machines and "normal" 1:1 MMIO space (bwx) for ev56 and
later.
Also "write combine" mappings are meaningless on alpha - roughly
speaking, alpha does write combining, IO reordering and other
optimizations by default, unless user splits IO accesses
with memory barriers.

I think the cleanest way to deal with resource files on alpha
is to convert the default no-op pci_create_resource_files() and
pci_remove_resource_files() for !HAVE_PCI_MMAP case into __weak
functions and override them with alpha specific ones.

Another alpha hook is needed for "legacy_" resource files
to handle sparse addressing (pci_adjust_legacy_attr).

With the "standard" resourceN files on ev56/ev6 libpciaccess
works "out of the box". Handling of resourceN_sparse/resourceN_dense
files on older machines obviously requires some userland work.

Sparse/dense stuff has been tested on sx164 (pca56/pyxis, normally
uses bwx IO) with the kernel hacked into "cia compatible" mode.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:36 -07:00
Andrew Morton ea7415512a PCI: constify pci_bus_assign_resources()
drivers/pci/hotplug/fakephp.c: In function 'pci_rescan_bus':
drivers/pci/hotplug/fakephp.c:271: warning: passing argument 1 of 'pci_bus_assign_resources' discards qualifiers from pointer target type

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:35 -07:00
akpm@linux-foundation.org c48f1670f4 PCI: constify pci_bus_add_devices()
drivers/pci/hotplug/fakephp.c:283: warning: passing argument 1 of 'pci_bus_add_devices' discards qualifiers from pointer target type

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:35 -07:00
Michael Ellerman b5fbf53324 PCI/MSI: Allow arch code to return the number of MSI-X available
There is code in msix_capability_init() which, when the requested number
of MSI-X couldn't be allocated, calculates how many MSI-X /could/ be
allocated and returns that to the driver. That allows the driver to then
make a second request, with a number of MSIs that should succeed.

The current code requires the arch code to setup as many msi_descs as it
can, and then return to the generic code. On some platforms the arch
code may already know how many MSI-X it can allocate, before it sets up
any of the msi_descs.

So change the logic such that if the arch code returns a positive error
code, that is taken to be the number of MSI-X that could be allocated.
If the error code is negative we still calculate the number available
using the old method.

Because it's a little subtle, make sure the error return code from
arch_setup_msi_irq() is always negative. That way only implementations
of arch_setup_msi_irqs() need to be careful about returning a positive
error code.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:34 -07:00
Roel Kluin 35e1801ea6 PCI hotplug: shpchp: fix bus number check to avoid false positive
With for (busnr = 0; busnr <= end; busnr++) { ... } busnr reaches end + 1
after the loop.  So fix the "no busses available" check to look for just
busnr > end rather than >=.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:33 -07:00
Kenji Kaneshige 9f5404d8ea PCI/ACPI: rename pci_osc_control_set()
- Rename pci_osc_control_set() to acpi_pci_osc_control_set() according
  to the other API names in drivers/acpi/pci_root.c.

- Move _OSC related definitions to include/linux/acpi.h because _OSC
  related API is implemented in drivers/acpi/pci_root.c now.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Tested-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:33 -07:00
Kenji Kaneshige 63f10f0f6d PCI/ACPI: move _OSC code to pci_root.c
Move PCI _OSC management code from drivers/pci/pci-acpi.c to
drivers/acpi/pci_root.c. The benefits are

- We no longer need struct osc_data and its management code (contents
  are moved to struct acpi_pci_root). This simplify the code, and we
  no longer care about kmalloc() failure.

- We can make pci_acpi_osc_support() be a static function, which is
  called only from drivers/acpi/pci_root.c.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Tested-by: Andrew Patterson <andrew.patterson@hp.com>
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:32 -07:00
Sheng Yang 5fe5db05f6 PCI: Speed up device reset function
For all devices need to do function level reset, currently we need wait for
at least 200ms, which can be too long if we have lots of devices...

The patch checked pending bit before msleep() to skip some unnecessary
sleeping interval.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:31 -07:00
Jiri Slaby 4c9c16867e PCI quirk: don't mark one netmos as class other
Let it stay as serial, since it doesn't have subdevice in the form of 0x00PS.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:31 -07:00
Alex Chiang 6279504141 PCI: enhance physical slot debug information
Convert usages of pr_debug to dev_dbg and add physical slot name.

Note that we use dev_dbg on the struct pci_bus and still manually
print out the PCI slot number (instead of calling dev_dbg on a
pci_dev) because a struct pci_bus with empty physical slots will
not have any pci_devs.

Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:30 -07:00
Kenji Kaneshige 6a82e21823 PCI: pciehp: make cmd_busy flag one bit
The cmd_busy field in struct controller takes only two values 0 or
1. So it should be one bit.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:30 -07:00
Kenji Kaneshige 99f0169c17 PCI: pciehp: enable software notification on empty slots
Current pciehp disables software notification of adapter presence
changed event and MRL changed event when slot is turned off. Because
of this, there is no way to detect those events on empty slots in the
current pciehp implementation.

According to the past discussion(*), this behavior was introduced to
prevent endless loop that could happen if pcie_isr() runs after power
fault is detected on a certain platform whose stickey power-fault bit
remains on till the slot is powered on again.

(*) http://sourceforge.net/mailarchive/message.php?msg_id=20051130135409.A14918%40unix-os.sc.intel.com

I think this endless loop can be avoided using one bit flag that
indicates power fault had been detected, instead of disabling software
notification of adapter present changed event and MRL changed event.

With this patch, we can enable software notification mechanism of
presence changed and MRL changed event on the empty slots again.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:29 -07:00
Kenji Kaneshige 81b840cd27 PCI: pciehp: fix possible endless loop in pcie_isr
Fix possible endless loop in pcie_isr.

Currently, pcie_isr() (interrupt service routine of pciehp) can end up in an
endless loop if the Slot Status register is set again immediately after being
cleared. According to the past discussion (see below URL) this case can happen
if the power fault detected bit is set during handling.

http://sourceforge.net/mailarchive/message.php?msg_id=20051130135409.A14918%40unix-os.sc.intel.com

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:28 -07:00
Julia Lawall 0b3e7388e3 PCI: introduce missing kfree
Error handling code following a kmalloc should free the allocated data.
Since the subsequent code that could provoke an error does not use the
allocated data, the allocation is just moved below it.

The semantic match that finds the problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r exists@
local idexpression x;
statement S;
expression E;
identifier f,l;
position p1,p2;
expression *ptr != NULL;
@@

(
if ((x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...)) == NULL) S
|
x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...);
...
if (x == NULL) S
)
<... when != x
     when != if (...) { <+...x...+> }
x->f = E
...>
(
 return \(0\|<+...x...+>\|ptr\);
|
 return@p2 ...;
)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@

print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:28 -07:00
Frank Seidel 1c35b8e538 PCI: add missing KERN_* constants to printks
According to kerneljanitors todo list all printk calls (beginning
a new line) should have an according KERN_* constant.
Those are the missing pieces here for the pci subsystem.

Signed-off-by: Frank Seidel <frank@f-seidel.de>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:27 -07:00
Yu Zhao 2b56313448 PCI: check if a bus is added when removing it
When removing a bus, 'is_added' should be checked to make sure the
bus has been successfully added by pci_bus_add_child() who will sets
'is_added'.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:26 -07:00
Michael Ellerman 11df1f0551 PCI/MSI: Use #ifdefs instead of weak functions
Weak functions aren't all they're cracked up to be. They lead to
incorrect binaries with some toolchains, they require us to have empty
functions we otherwise wouldn't, and the unused code is not elided
(as of gcc 4.3.2 anyway).

So replace the weak MSI arch hooks with the #define foo foo idiom. We no
longer need empty versions of arch_setup/teardown_msi_irq().

This is less source (by 1 line!), and results in smaller binaries too:

   text	   data	    bss	    dec	    hex	filename
9354300	1693916	 678424	11726640 b2ef30	build/powerpc/vmlinux-before
9354052	1693852	 678424	11726328 b2edf8	build/powerpc/vmlinux-after

Also smaller on x86_64 and arm (iop13xx).

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:26 -07:00
Rafael J. Wysocki b43d451385 PCI/PCIe portdrv: Fix allocation of interrupts
If MSI-X interrupt mode is used by the PCI Express port driver, too
many vectors are allocated and it is not ensured that the right
vectors will be used for the right services.  Namely, the PCI Express
specification states that both PCI Express native PME and PCI Express
hotplug will always use the same MSI or MSI-X message for signalling
interrupts, which implies that the same vector will be used by both
of them.  Also, the VC service does not use interrupts at all.
Moreover, is not clear which of the vectors allocated by
pci_enable_msix() in the current code will be used for PME and
hotplug and which of them will be used for AER if all of these
services are configured.

For these reasons, rework the allocation of interrupts for PCI
Express ports so that if MSI-X are enabled, the right vectors will be
used for the right purposes.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:25 -07:00
Rafael J. Wysocki a52e2e3513 PCI/MSI: Introduce pci_msix_table_size()
Introduce new function pci_msix_table_size() returning the size of
the MSI-X table of given PCI device or 0 if the device doesn't
support MSI-X.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:25 -07:00
Kay Sievers a447b77282 PCI: struct device - replace bus_id with dev_name(), dev_set_name()
More dev_set_name conversion.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:24 -07:00
Rafael J. Wysocki 22106368c9 PCI: PCIe portdrv: Remove struct pcie_port_service_id
The PCI Express port driver uses 'struct pcie_port_service_id' for
matching port service devices and drivers, but this structure
contains fields that duplicate information from the port device
itself (vendor, device, subvendor, subdevice) and fields that are not
used by any existing port service driver (class, class_mask,
drvier_data).  Also, both existing port service drivers (AER and
PCIe HP) don't even use the vendor and device fields for device
matching.  Therefore 'struct pcie_port_service_id' can be removed
altogether and the only useful members of it (port_type, service) can
be introduced directly into the port service device and port service
driver structures.  That simplifies the code quite a bit and reduces
its size.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:23 -07:00
Rafael J. Wysocki 0516c8bcd2 PCI: PCIe portdrv: Simplily probe callback of service drivers
The second argument of the ->probe() callback in
struct pcie_port_service_driver is unnecessary and never used.
Remove it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:23 -07:00
Rafael J. Wysocki 87d2e2ecf6 PCI: PCIe portdrv: Remove unnecessary function
The function pcie_portdrv_save_config() in portdrv_pci.c is not
necessary.  Remove it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:22 -07:00
Rafael J. Wysocki f118c0c3cf PCI: PCIe portdrv: Do not enable port device before setting up interrupts
The PCI Express port driver calls pci_enable_device() before setting
up interrupts, which is wrong, because if there is an interrupt pin
configured for the port, pci_enable_device() will likely set up an
interrupt link for it.  However, this shouldn't be done if either
MSI or MSI-X interrupt mode is chosen for the port.

The solution is to call pci_enable_device() after setting up
interrupts, because in that case the interrupt link won't be set up
if MSI or MSI-X are enabled.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:22 -07:00
Rafael J. Wysocki 90e9cd50f7 PCI: PCIe portdrv: Aviod using service devices with wrong interrupts
The PCI Express port driver should not attempt to register service
devices that require the ability to generate interrupts if generating
interrupts is not possible.  Namely, if the port has no interrupt pin
configured and we cannot set up MSI or MSI-X for it, there is no way
it can generate interrupts and in such a case the port services that
rely on interrupts (PME, PCIe HP, AER) should not be enabled for it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:21 -07:00
Rafael J. Wysocki 1bf83e558c PCI: PCIe portdrv: Use driver data to simplify code
PCI Express port driver extension, as defined by struct
pcie_port_device_ext in portdrv.h, is allocated and initialized, but
never used (it also is never freed).  Extend it to hold the PCI Express
port type as well as the port interrupt mode, change its name and use it
to simplify the code in portdrv_core.c .

Additionally, remove the redundant interrupt_mode member of struct
pcie_device defined in include/linux/pcieport_if.h .

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:20 -07:00
Harvey Harrison e496b617b4 PCI: __FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:20 -07:00
Ingo Molnar 04dfcfcb54 Merge branch 'linus' into core/iommu 2009-03-18 10:37:43 +01:00
Suresh Siddha fa4b57cc04 x86, dmar: use atomic allocations for QI and Intr-remapping init
Impact: invalid use of GFP_KERNEL in interrupt context

Queued invalidation and interrupt-remapping will get initialized with
interrupts disabled (while enabling interrupt-remapping). So use
GFP_ATOMIC instead of GFP_KERNEL for memory alloacations.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 16:49:30 -07:00
Suresh Siddha 2e93456f5c x86, intr-remapping: fix free_irte() to clear all the IRTE entries
Impact: fix interrupt table entry leak

Fix the typo which was not clearing all the interrupt remapping table
entries corresponding to an irq.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:42:00 -07:00
Suresh Siddha 1531a6a6b8 x86, dmar: start with sane state while enabling dma and interrupt-remapping
Impact: cleanup/sanitization

Start from a sane state while enabling dma and interrupt-remapping, by
clearing the previous recorded faults and disabling previously
enabled queued invalidation and interrupt-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:39:58 -07:00
Suresh Siddha eba67e5da6 x86, dmar: routines for disabling queued invalidation and intr remapping
Impact: new interfaces (not yet used)

Routines for disabling queued invalidation and interrupt remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:39:20 -07:00
Suresh Siddha 9d783ba042 x86, x2apic: enable fault handling for intr-remapping
Impact: interface augmentation (not yet used)

Enable fault handling flow for intr-remapping aswell. Fault handling
code now shared by both dma-remapping and intr-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:38:59 -07:00
Suresh Siddha 0ac2491f57 x86, dmar: move page fault handling code to dmar.c
Impact: code movement

Move page fault handling code to dmar.c
This will be shared both by DMA-remapping and Intr-remapping code.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:37:06 -07:00
Suresh Siddha 4c5502b1c5 x86, x2apic: fix lock ordering during IRQ migration
Impact: fix potential deadlock on x2apic

fix "hard-safe -> hard-unsafe lock order detected" with irq_2_ir_lock

On x2apic enabled system:
   [ INFO: hard-safe -> hard-unsafe lock order detected ]
   2.6.27-03151-g4480f15b #1
   ------------------------------------------------------
   swapper/1 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
    (irq_2_ir_lock){--..}, at: [<ffffffff8038ebc0>] get_irte+0x2f/0x95

   and this task is already holding:
    (&irq_desc_lock_class){+...}, at: [<ffffffff802649ed>] setup_irq+0x67/0x281
   which would create a new lock dependency:
    (&irq_desc_lock_class){+...} -> (irq_2_ir_lock){--..}

   but this new dependency connects a hard-irq-safe lock:
    (&irq_desc_lock_class){+...}
   ... which became hard-irq-safe at:
     [<ffffffffffffffff>] 0xffffffffffffffff

   to a hard-irq-unsafe lock:
    (irq_2_ir_lock){--..}
   ... which became hard-irq-unsafe at:
   ...  [<ffffffff802547b5>] __lock_acquire+0x571/0x706
     [<ffffffff8025499f>] lock_acquire+0x55/0x71
     [<ffffffff8062f2c4>] _spin_lock+0x2c/0x38
     [<ffffffff8038ee50>] alloc_irte+0x8a/0x14b
     [<ffffffff8021f733>] setup_IO_APIC_irq+0x119/0x30e
     [<ffffffff8090860e>] setup_IO_APIC+0x146/0x6e5
     [<ffffffff809058fc>] native_smp_prepare_cpus+0x24e/0x2e9
     [<ffffffff808f982c>] kernel_init+0x5a/0x176
     [<ffffffff8020c289>] child_rip+0xa/0x11
     [<ffffffffffffffff>] 0xffffffffffffffff

Fix this theoretical lock order issue by using spin_lock_irqsave() instead of
spin_lock()

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:36:40 -07:00
Ingo Molnar edb35028e4 Merge branches 'irq/genirq' and 'linus' into irq/core 2009-03-16 09:20:13 +01:00
Ingo Molnar 0634023562 Merge branch 'x86/core' into x86/kconfig 2009-03-13 17:08:30 +01:00
Ingo Molnar 238a5b4bff Merge branch 'cpus4096' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-x86 into cpus4096 2009-03-13 05:54:55 +01:00
Ingo Molnar 17d85bc756 Merge commit 'v2.6.29-rc8' into cpus4096 2009-03-13 05:54:43 +01:00
Rusty Russell a70f730282 cpumask: replace node_to_cpumask with cpumask_of_node.
Impact: cleanup

node_to_cpumask (and the blecherous node_to_cpumask_ptr which
contained a declaration) are replaced now everyone implements
cpumask_of_node.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-13 14:49:46 +10:30
Alex Chiang d899871936 PCIe: portdrv: call pci_disable_device during remove
The PCIe port driver calls pci_enable_device() during probe but
never calls pci_disable_device() during remove.

Cc: stable@kernel.org
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2009-03-12 15:42:35 -04:00
Prakash Punnoor 6a958d5b28 pci: Fix typo in message while disabling HT MSI mapping
"Enabling" should read "Disabling"

Signed-off-by: Prakash Punnoor <prakash@punnoor.de>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2009-03-12 15:42:29 -04:00
Prakash Punnoor 7726c3308a pci: don't disable too many HT MSI mapping
Prakash's system needs MSI disabled on some bridges, but not all.
This seems to be the minimal fix for 2.6.29, but should be replaced
during 2.6.30.

Signed-off-by: Prakash Punnoor <prakash@punnoor.de>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2009-03-12 15:41:57 -04:00
Michael Ellerman 3f3b902ed8 powerpc/pseries: The RPA PCI hotplug driver depends on EEH
The RPA PCI hotplug driver calls EEH routines, so should depend on
EEH. Also PPC_PSERIES implies PPC64, so remove that.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2009-03-12 15:10:02 -04:00
Alex Chiang cb4cb4ac73 PCIe: AER: during disable, check subordinate before walking
Commit 47a8b0cc (Enable PCIe AER only after checking firmware
support) wants to walk the PCI bus in the remove path to disable
AER, and calls pci_walk_bus for downstream bridges.

Unfortunately, in the remove path, we remove devices and bridges
in a depth-first manner, starting with the furthest downstream
bridge and working our way backwards.

The furthest downstream bridges will not have a dev->subordinate,
and we hit a NULL deref in pci_walk_bus.

Check for dev->subordinate first before attempting to walk the
PCI hierarchy below us.

Acked-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2009-03-12 15:09:51 -04:00
Alexander Duyck 649426efcf PCI: Add PCI quirk to disable L0s ASPM state for 82575 and 82598
This patch is intended to disable L0s ASPM link state for 82598 (ixgbe)
parts due to the fact that it is possible to corrupt TX data when coming
back out of L0s on some systems.  The workaround had been added for 82575
(igb) previously, but did not use the ASPM api.  This quirk uses the ASPM
api to prevent the ASPM subsystem from re-enabling the L0s state.

Instead of adding the fix in igb to the ixgbe driver as well it was
decided to move it into a pci quirk.  It is necessary to move the fix out
of the driver and into a pci quirk in order to prevent the issue from
occuring prior to driver load to handle the possibility of the device being
passed to a VM via direct assignment.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2009-03-12 15:09:41 -04:00
Ingo Molnar 7df4edb07c Merge branch 'linus' into core/iommu 2009-03-05 12:47:28 +01:00
Ingo Molnar 55f2b78995 Merge branch 'x86/urgent' into x86/pat 2009-03-01 12:47:58 +01:00
Linus Torvalds 4bdc1b9650 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: AMD 813x B2 devices do not need boot interrupt quirk
  PCI: Enable PCIe AER only after checking firmware support
  PCI: pciehp: Handle interrupts that happen during initialization.
  PCI: don't enable too many HT MSI mappings
  PCI: add some sysfs ABI docs
  PCI quirk: enable MSI on 8132
2009-02-26 14:43:42 -08:00
Stefan Assmann bbe194433b PCI: AMD 813x B2 devices do not need boot interrupt quirk
Turns out that the new AMD 813x devices do not need the
quirk_disable_amd_813x_boot_interrupt quirk to be run on them.  If it
is, no interrupts are seen on the PCI-X adapter.

From: Stefan Assmann <sassmann@novell.com>
Reported-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Tested-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
2009-02-26 14:08:09 -08:00
Linus Torvalds 60042600c5 Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  intel-iommu: fix endless "Unknown DMAR structure type" loop
  VT-d: handle Invalidation Queue Error to avoid system hang
  intel-iommu: fix build error with INTR_REMAP=y and DMAR=n
2009-02-25 09:31:21 -08:00
Andrew Patterson 1f9f13c8d5 PCI: Enable PCIe AER only after checking firmware support
The PCIe port driver currently sets the PCIe AER error reporting bits for
any root or switch port without first checking to see if firmware will grant
control. This patch moves setting these bits to the AER service driver
aer_enable_port routine.  The bits are then set for the root port and any
downstream switch ports after the check for firmware support (aer_osc_setup)
is made. The patch also unsets the bits in a similar fashion when the AER
service driver is unloaded.

Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
2009-02-24 09:47:46 -08:00
Eric W. Biederman dbc7e1e567 PCI: pciehp: Handle interrupts that happen during initialization.
Move the enabling of interrupts after all of the data structures
are setup so that we can safely run the interrupt handler as
soon as it is registered.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
2009-02-24 09:36:56 -08:00
Yinghai Lu 1dec6b054d PCI: don't enable too many HT MSI mappings
Prakash reported that his c51-mcp51 ondie sound card doesn't work with
MSI.  But if he hacks out the HT-MSI quirk, MSI works fine.

So this patch reworks the nv_msi_ht_cap_quirk().  It will now only
enable ht_msi on own its root device, avoiding enabling it on devices
following that root dev.

Reported-by: Prakash Punnoor <prakash@punnoor.de>
Tested-by: Prakash Punnoor <prakash@punnoor.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
2009-02-24 09:34:31 -08:00
Ingo Molnar fc6fc7f1b1 Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic conflict resolution:
	arch/x86/kernel/setup.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-22 20:05:19 +01:00
Yinghai Lu e0ae4f5503 PCI quirk: enable MSI on 8132
David reported that LSI SAS doesn't work with MSI.  It turns out that
his BIOS doesn't enable it, but the HT MSI 8132 does support HT MSI.
Add quirk to enable it

Cc: stable@kernel.org
Reported-by: David Lang <david@lang.hm>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-18 10:59:46 -08:00
Linus Torvalds 8ce9a75a30 Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  iommu: fix Intel IOMMU write-buffer flushing
  futex: fix reference leak

Trivial conflicts fixed manually in drivers/pci/intel-iommu.c
2009-02-17 14:26:35 -08:00
Linus Torvalds c951aa62d5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: Documentation: fix minor PCIe HOWTO thinko
  PCI: fix missing kernel-doc and typos
  PCI: fix struct pci_platform_pm_ops kernel-doc
  PCI: fix rom.c kernel-doc warning
  PCI/MSI: fix msi_mask() shift fix
2009-02-17 14:23:35 -08:00
David Woodhouse ca77fde8e6 Fix Intel IOMMU write-buffer flushing
This is the cause of the DMA faults and disk corruption that people have
been seeing. Some chipsets neglect to report the RWBF "capability" --
the flag which says that we need to flush the chipset write-buffer when
changing the DMA page tables, to ensure that the change is visible to
the IOMMU.

Override that bit on the affected chipsets, and everything is happy
again.

Thanks to Chris and Bhavesh and others for helping to debug.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Tested-by: Chris Wright <chrisw@sous-sol.org>
Reviewed-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-17 14:02:57 -08:00
David Woodhouse 9af88143b2 iommu: fix Intel IOMMU write-buffer flushing
This is the cause of the DMA faults and disk corruption that people have
been seeing. Some chipsets neglect to report the RWBF "capability" --
the flag which says that we need to flush the chipset write-buffer when
changing the DMA page tables, to ensure that the change is visible to
the IOMMU.

Override that bit on the affected chipsets, and everything is happy
again.

Thanks to Chris and Bhavesh and others for helping to debug.

Should resolve:

  https://bugzilla.redhat.com/show_bug.cgi?id=479996
  http://bugzilla.kernel.org/show_bug.cgi?id=12578

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Tested-and-acked-by: Chris Wright <chrisw@sous-sol.org>
Reviewed-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-14 22:47:09 +01:00
Tony Battersby 084eb960e8 intel-iommu: fix endless "Unknown DMAR structure type" loop
I have a SuperMicro C2SBX motherboard with BIOS revision 1.0b.  With vt-d
enabled in the BIOS, Linux gets into an endless loop printing
"DMAR:Unknown DMAR structure type" when booting.  Here is the DMAR ACPI
table:

DMAR @ 0x7fe86dec
  0000: 44 4d 41 52 98 00 00 00 01 6f 49 6e 74 65 6c 20  DMAR.....oIntel
  0010: 4f 45 4d 44 4d 41 52 20 00 00 04 06 4c 4f 48 52  OEMDMAR ....LOHR
  0020: 01 00 00 00 23 00 00 00 00 00 00 00 00 00 00 00  ....#...........
  0030: 01 00 58 00 00 00 00 00 00 a0 e8 7f 00 00 00 00  ..X.............
  0040: ff ff ef 7f 00 00 00 00 01 08 00 00 00 00 1d 00  ................
  0050: 01 08 00 00 00 00 1d 01 01 08 00 00 00 00 1d 02  ................
  0060: 01 08 00 00 00 00 1d 07 01 08 00 00 00 00 1a 00  ................
  0070: 01 08 00 00 00 00 1a 01 01 08 00 00 00 00 1a 02  ................
  0080: 01 08 00 00 00 00 1a 07 01 08 00 00 00 00 1a 07  ................
  0090: c0 00 68 00 04 10 66 60                          ..h...f`

Here are the messages printed by the kernel:

DMAR:Host address width 36
DMAR:RMRR base: 0x000000007fe8a000 end: 0x000000007fefffff
DMAR:Unknown DMAR structure type
DMAR:Unknown DMAR structure type
DMAR:Unknown DMAR structure type
...

Although I not very familiar with ACPI, to me it looks like struct
acpi_dmar_header::length == 0x0058 is incorrect, causing
parse_dmar_table() to look at an invalid offset on the next loop.  This
offset happens to have struct acpi_dmar_header::length == 0x0000, which
prevents the loop from ever terminating.  This patch checks for this
condition and bails out instead of looping forever.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-02-14 08:33:34 +00:00
Randy Dunlap f5ddcac435 PCI: fix missing kernel-doc and typos
Fix pci kernel-doc parameter missing notation, correct
function name, and fix typo:

Warning(linux-2.6.28-git10//drivers/pci/pci.c:1511): No description found for parameter 'exclusive'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-13 12:03:08 -08:00
Randy Dunlap b33bfdef24 PCI: fix struct pci_platform_pm_ops kernel-doc
Fix struct pci_platform_pm_ops kernel-doc notation.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-13 12:02:47 -08:00
Randy Dunlap 4cc59c721c PCI: fix rom.c kernel-doc warning
Fix PCI kernel-doc warning:

Warning(linux-2.6.29-rc4-git1/drivers/pci/rom.c:67): No description found for parameter 'pdev'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-13 12:01:56 -08:00
Matthew Wilcox 0b49ec37a2 PCI/MSI: fix msi_mask() shift fix
Hidetoshi Seto points out that commit
bffac3c593 has wrong values in the array.
Rather than correct the array, we can just use a bounds check and
perform the calculation specified in the comment.  As a bonus, this will
not run off the end of the array if the device specifies an illegal
value in the MSI capability.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-13 11:59:03 -08:00
Ingo Molnar 8f8573ae9f Merge branches 'irq/genirq', 'irq/sparseirq' and 'irq/urgent' into irq/core 2009-02-13 11:57:18 +01:00
Ingo Molnar a56cdcb662 Merge branches 'x86/acpi', 'x86/asm', 'x86/cpudetect', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/doc', 'x86/header-fixes', 'x86/headers' and 'x86/minor-fixes' into x86/core 2009-02-13 09:46:36 +01:00
Ingo Molnar f8a6b2b9ce Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/kernel/acpi/boot.c
	arch/x86/mm/fault.c
2009-02-13 09:44:22 +01:00
Linus Torvalds 9ce04f9238 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ptrace, x86: fix the usage of ptrace_fork()
  i8327: fix outb() parameter order
  x86: fix math_emu register frame access
  x86: math_emu info cleanup
  x86: include correct %gs in a.out core dump
  x86, vmi: put a missing paravirt_release_pmd in pgd_dtor
  x86: find nr_irqs_gsi with mp_ioapic_routing
  x86: add clflush before monitor for Intel 7400 series
  x86: disable intel_iommu support by default
  x86: don't apply __supported_pte_mask to non-present ptes
  x86: fix grammar in user-visible BIOS warning
  x86/Kconfig.cpu: make Kconfig help readable in the console
  x86, 64-bit: print DMI info in the oops trace
2009-02-11 08:23:22 -08:00
Yinghai Lu 8e1568f350 pci, x86, acpi: fix early_ioremap() leak
Pawel reported:
------------[ cut here ]------------
WARNING: at arch/x86/mm/ioremap.c:616 check_early_ioremap_leak+0x52/0x67()
Hardware name:
Debug warning: early ioremap leak of 1 areas detected.
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.29-rc4-tip #2
...

Reported-by: Pawel Dziekonski <dzieko@gmail.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 14:20:10 +01:00
Yu Zhao 704126ad81 VT-d: handle Invalidation Queue Error to avoid system hang
When hardware detects any error with a descriptor from the invalidation
queue, it stops fetching new descriptors from the queue until software
clears the Invalidation Queue Error bit in the Fault Status register.
Following fix handles the IQE so the kernel won't be trapped in an
infinite loop.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-02-09 11:03:17 +00:00
Joerg Roedel 43f7392ba9 intel-iommu: fix build error with INTR_REMAP=y and DMAR=n
This fix should be safe since iommu->agaw is only used in intel-iommu.c.
And this file is only compiled with DMAR=y.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-02-09 10:00:53 +00:00
Ingo Molnar 9d45cf9e36 Merge branch 'x86/urgent' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic merge:
	arch/x86/kernel/irqinit_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:30:01 +01:00
Kyle McMartin 0cd5c3c80a x86: disable intel_iommu support by default
Due to recurring issues with DMAR support on certain platforms.
There's a number of filesystem corruption incidents reported:

  https://bugzilla.redhat.com/show_bug.cgi?id=479996
  http://bugzilla.kernel.org/show_bug.cgi?id=12578

Provide a Kconfig option to change whether it is enabled by
default.

If disabled, it can still be reenabled by passing intel_iommu=on to the
kernel. Keep the .config option off by default.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-By: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 16:48:38 +01:00
Rafael J. Wysocki 5294e25671 PCI PM: make the PM core more careful with drivers using the new PM framework
Currently, the PM core always attempts to manage devices with drivers
that use the new PM framework.  In particular, it attempts to disable
the devices (which is unnecessary), to save their state (which may be
undesirable if the driver has done that already) and to put them into
low power states (again, this may be undesirable if the driver has
already put the device into a low power state).  That need not be
the right thing to do, so make the core be more careful in this
respect.

Generally, there are the following categories of devices to consider:
* bridge devices without drivers
* non-bridge devices without drivers
* bridge devices with drivers
* non-bridge devices with drivers
and each of them should be handled differently.

For bridge devices without drivers the PCI PM core will save their
state on suspend and restore it (early) during resume, after putting
them into D0 if necessary.  It will not attempt to do anything else
to these devices.

For non-bridge devices without drivers the PCI PM core will disable
them and save their state on suspend.  During resume, it will put
them into D0, if necessary, restore their state (early) and reenable
them.

For bridge devices with drivers the PCI PM core will only save
their state on suspend if the driver hasn't done that already.
Still, the core will restore their state (early) during resume,
after putting them into D0, if necessary.

For non-bridge devices with drivers the PCI PM core will only save
their state on suspend if the driver hasn't done that already.  Also,
if the state of the device hasn't been saved by the driver, the core
will attempt to put the device into a low power state.  During
resume the core will restore the state of the device (early), after
putting it into D0, if necessary.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04 17:22:35 -08:00
Rafael J. Wysocki 49c968111a PCI PM: Read power state from device after trying to change it on resume
pci_restore_standard_config() unconditionally changes current_state
to PCI_D0 after attempting to change the device's power state, but
it should rather read the actual current power state from the
device.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04 17:22:28 -08:00
Rafael J. Wysocki cbbc2f6b0d PCI PM: Do not disable and enable bridges during suspend-resume
It is a mistake to disable and enable PCI bridges and PCI Express
ports during suspend-resume, at least at the time when it is
currently done.  Disabling them may lead to problems with accessing
devices behind them and they should be automatically enabled when
their standard config spaces are restored.  Fix this by not attempting
to disable bridges during suspend and enable them during resume.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04 17:21:26 -08:00
Rafael J. Wysocki 27be54a65c PCI: PCIe portdrv: Simplify suspend and resume
Simplify suspend and resume of the PCI Express port driver.  It no
longer needs to save and restore the standard configuration space of the
device; this is now done by the PCI PM core layer.

This patch is reported to fix the regression tracked as
http://bugzilla.kernel.org/show_bug.cgi?id=12598

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04 17:21:19 -08:00
Rafael J. Wysocki 99dadce875 PCI PM: Fix saving of device state in pci_legacy_suspend
Make pci_legacy_suspend() save the state of the device if it is
in PCI_UNKNOWN after its suspend callback has run and warn only if
the power state of the device has been changed by its suspend
callback.

Also, use WARN_ONCE(), which is more useful, in pci_legacy_suspend(),
so that the name of the offending function is printed.

Additionally, remove the unnecessary line of code setting
pci_dev->state_saved.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04 17:21:08 -08:00
Rafael J. Wysocki 144a76bc88 PCI PM: Check if the state has been saved before trying to restore it
Check if the standard configuration registers of a PCI device have
been saved during suspend before trying to restore them during
resume.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-By: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04 17:20:39 -08:00
Rafael J. Wysocki ddb7c9d29f PCI PM: Fix handling of devices without drivers
Suspend to RAM is reported to break on some machines as a result of
attempting to put one of driverless PCI devices into a low power
state.  Avoid that by not attepmting to power manage driverless
devices during suspend.

Fix up pci_pm_poweroff() after a previous incomplete fix for the same
thing during hibernation.

This patch is reported to fix the regression from 2.6.28 tracked as
http://bugzilla.kernel.org/show_bug.cgi?id=12605

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04 17:20:09 -08:00
Timothy S. Nelson 97c44836cd PCI: return error on failure to read PCI ROMs
This patch makes the ROM reading code return an error to user space if
the size of the ROM read is equal to 0.

The patch also emits a warnings if the contents of the ROM are invalid,
and documents the effects of the "enable" file on ROM reading.

Signed-off-by: Timothy S. Nelson <wayland@wayland.id.au>
Acked-by: Alex Villacis-Lasso <a_villacis@palosanto.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04 16:58:41 -08:00
Alex Chiang 3419c75e15 PCI: properly clean up ASPM link state on device remove
We only want to disable ASPM when the last function is removed from
the parent's device list. We determine this by checking to see if
the parent's device list is completely empty.

Unfortunately, we never hit that code because the parent is considered
an upstream port, and never had an ASPM link_state associated with it.

The early check for !link_state causes us to return early, we never
discover that our device list is empty, and thus we never remove the
downstream ports' link_state nodes.

Instead of checking to see if the parent's device list is empty, we can
check to see if we are the last device on the list, and if so, then we
know that we can clean up properly.

Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04 16:58:40 -08:00
FUJITA Tomonori d7ab5c46ae intel-iommu: make dma mapping functions static
The dma ops unification enables X86 and IA64 to share intel_dma_ops so
we can make dma mapping functions static. This also remove unused
intel_map_single().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:39:29 +01:00
FUJITA Tomonori dfb805e831 IA64: fix VT-d dma_mapping_error
dma_mapping_error is used to see if dma_map_single and dma_map_page
succeed. IA64 VT-d dma_mapping_error always says that dma_map_single
is successful even though it could fail. Note that X86 VT-d works
properly in this regard.

This patch fixes IA64 VT-d dma_mapping_error by adding VT-d's own
dma_mapping_error() that works for both X86_64 and IA64. VT-d uses
zero as an error dma address so VT-d's dma_mapping_error returns 1 if
a passed dma address is zero (as x86's VT-d dma_mapping_error does
now).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:39:29 +01:00
Matthew Garrett 71a082efc9 PCI hotplug: Change link order of pciehp & acpiphp
Some hardware exposes PCIE slots in such a way that they can be claimed
by either the acpiphp or pciehp driver. pciehp is the preferred driver
if the firmware allows the OS to claim control via the _OSC method so
should be loaded first - if it fails to bind (either due to a missing
_OSC method or the firmware refusing to hand off control) then we can
fall back to acpiphp or a vendor-specific driver.

This patch simply changes the link order to ensure that pciehp will be
initialised before acpiphp if both are statically built into the kernel.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-27 15:35:51 -08:00
Darrick J. Wong bf4162bcf8 PCI hotplug: fakephp: Allocate PCI resources before adding the device
For PCI devices, pci_bus_assign_resources() must be called to set up the
pci_device->resource array before pci_bus_add_devices() can be called, else
attempts to load drivers results in BAR collision errors where there are none.
This is not done in fakephp, so devices can be "unplugged" but scanning the
parent bus won't bring the devices back due to resource unallocation.  Move the
pci_bus_add_device-calling logic into pci_rescan_bus and preface it with a call
to pci_bus_assign_resources so that we only have to (re)allocate resources once
per bus where a new device is found.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-27 10:53:24 -08:00
Matthew Wilcox bffac3c593 PCI MSI: Fix undefined shift by 32
Add an msi_mask() function which returns the correct bitmask for the
number of MSI interrupts you have.  This fixes an undefined bug in
msi_capability_init().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-27 09:53:25 -08:00
Rafael J. Wysocki 476e7faefc PCI PM: Do not wait for buses in B2 or B3 during resume
pci_restore_standard_config() adds extra delay for PCI buses in
low power states (B2 or B3), but this is only correct for buses in
B2, because the buses in B3 are reset when they are put back into
B0.  Thus we should wait for such buses to settle after the reset,
but it's not a good idea to wait that long (1.1 s) with interrupts
off.

On the other hand, we have never waited for buses in B2 and B3
during resume and it seems reasonable to go back to this well
tested behaviour.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-27 09:47:10 -08:00
Rafael J. Wysocki 48f67f54a5 PCI PM: Power up devices before restoring their state
Devices that have MSI-X enabled before suspend to RAM or hibernation
and that are in a low power state during resume will not be handled
correctly by pci_restore_standard_config().  Namely, it first calls
pci_restore_state() which calls pci_restore_msi_state(), which in turn
executes __pci_restore_msix_state() that accesses the device's memory
space to restore the contents of the MSI-X table.  However, if the
device is in a low power state at this point, it's memory space is
not accessible.

The easiest way to fix this potential problem is to make
pci_restore_standard_config() call pci_restore_state() after
it has put the device into the full power state, D0.  Fortunately,
all of this is done with interrupts off, so the change of ordering
should not cause any trouble.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-27 09:47:02 -08:00
Rafael J. Wysocki 545ffd58ad PCI PM: Fix hibernation breakage on EeePC 701
Hibernation breaks on EeePC 701 as a result of attempting to put one
of its (driverless) devices into a low power state.  Avoid that by
not attepmting to power manage driverless devices during hibernation.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-27 09:46:27 -08:00
Rafael J. Wysocki 418e4da33f PCI PM: Fix suspend error paths and testing facility breakage
If one of device drivers refuses to suspend by returning error code
from its ->suspend() callback, the devices that have already been
suspended are resumed by executing their drivers' ->resume()
callbacks.  Some of these callbacks expect the device's
configuration space to be restored if the device has been put into
D3 before they are called.  Unfortunately, this mechanism has been
broken by recent changes moving the restoration of config spaces
of some devices (most importantly, USB controllers and HDA Intel)
into the resume callbacks executed with interrupts off.  Obviously,
these callbacks are not invoked in the suspend error path and, as a
result, the system cannot be successfully brought back into the
working state in case of a suspend error.  The same thing happens
in the hibernation error path right before putting the system into
S4.

Similarly, the suspend testing facility associated with the
/sys/power/pm_test file is broken, because it uses the very same
mechanism that is used in the suspend and hibernation error paths.

Fix the breakage by making the PCI core restore the configuration
spaces of PCI devices that haven't been restored already before
pci_pm_resume() is called for those devices by the PM core.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-27 09:45:46 -08:00
Ingo Molnar 3ddeb51d9c Merge branch 'linus' into core/percpu
Conflicts:
	arch/x86/kernel/setup_percpu.c
2009-01-27 12:01:51 +01:00
Linus Torvalds 4a4565921a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI hotplug: fix lock imbalance in pciehp
  PCI PM: Restore standard config registers of all devices early
  PCI/MSI: bugfix/utilize for msi_capability_init()
2009-01-26 10:13:36 -08:00
Ingo Molnar 198030782c Merge branch 'x86/mm' into core/percpu
Conflicts:
	arch/x86/mm/fault.c
2009-01-21 10:39:51 +01:00
Jiri Slaby c2fdd36b55 PCI hotplug: fix lock imbalance in pciehp
set_lock_status omits mutex_unlock in fail path. Add the omitted
unlock.

As a result a lockup caused by this can be triggered from userspace
by writing 1 to /sys/bus/pci/slots/.../lock often enough.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-19 10:55:54 -08:00
Rafael J. Wysocki aa8c6c9374 PCI PM: Restore standard config registers of all devices early
There is a problem in our handling of suspend-resume of PCI devices that
many of them have their standard config registers restored with
interrupts enabled and they are put into the full power state with
interrupts enabled as well.  This may lead to the following scenario:
  * an interrupt vector is shared between two or more devices
  * one device is resumed earlier and generates an interrupt
  * the interrupt handler of another device tries to handle it and
    attempts to access the device the config space of which hasn't been
    restored yet and/or which still is in a low power state
  * the system crashes as a result

To prevent this from happening we should restore the standard
configuration registers of all devices with interrupts disabled and we
should put them into the D0 power state right after that.
Unfortunately, this cannot be done using the existing
pci_set_power_state(), because it can sleep.  Also, to do it we have to
make sure that the config spaces of all devices were actually saved
during suspend.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-16 12:57:58 -08:00
Hidetoshi Seto 0db29af1e7 PCI/MSI: bugfix/utilize for msi_capability_init()
This patch fix a following bug and does a cleanup.

bug:
	commit 5993760f7f
	had a wrong change (since is_64 is boolean[0|1]):

-               pci_write_config_dword(dev,
-                       msi_mask_bits_reg(pos, is_64bit_address(control)),
-                       maskbits);
+               pci_write_config_dword(dev, entry->msi_attrib.is_64, maskbits);

utilize:
	Unify separated if (entry->msi_attrib.maskbit) statements.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: "Jike Song" <albcamus@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-16 12:35:25 -08:00
James Bottomley d45e085548 ACPI PCI hotplug: harden against panic regression
ACPI hotplug panic with current git head
http://lkml.org/lkml/2009/1/10/136

Rather than reverting the entire commit that causes the crash:
e8c331e963
"PCI hotplug: introduce functions for ACPI slot detection"

simply harden against it while the changes to
the hotplug code on this particularl machine are understood.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-16 15:20:00 -05:00
Linus Torvalds 50246dd41c Revert "PCI PM: Register power state of devices during initialization"
This reverts commit 98e6e286d7, as Yinghai
Lu reports that it breaks kexec with at least the e1000 and e1000e
drivers.  The reason is that the shutdown sequence puts the hardware
into D3 sleep, and the commit causes us to claim that it then is in D0
(running) state just because we don't understand the PM capabilities.

Which then later makes "pci_set_power_state()" not do anything, and the
device never wakes up properly and just returns 0xff to everything.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: From: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Jesse Barnes <jesse.barnes@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-16 08:14:51 -08:00
Ingo Molnar af2519fb22 Merge branch 'linus' into core/iommu
Conflicts:
	arch/ia64/include/asm/dma-mapping.h
	arch/ia64/include/asm/machvec.h
	arch/ia64/include/asm/machvec_sn2.h
2009-01-16 10:09:10 +01:00
Ingo Molnar 7f268f4352 Merge branches 'cpus4096', 'x86/cleanups' and 'x86/urgent' into x86/percpu 2009-01-15 13:18:57 +01:00
Heiko Carstens c4ea37c26a [CVE-2009-0029] System call wrappers part 26
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14 14:15:29 +01:00
Dirk Hohndel 288e4877f9 Prevent oops at boot with VT-d
With some broken BIOSs when VT-d is enabled, the data structures are
filled incorrectly. This can cause a NULL pointer dereference in very
early boot.

Signed-off-by: Dirk Hohndel <hohndel@linux.intel.com>
Acked-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-13 08:03:46 -08:00
Yinghai Lu d7e51e6689 sparseirq: make some func to be used with genirq
Impact: clean up sparseirq fallout on random.c

Ingo suggested to change some ifdef from SPARSE_IRQ to GENERIC_HARDIRQS
so we could some #ifdef later if all arch support genirq

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-11 04:46:26 +01:00
Ingo Molnar 0811a433c6 Merge branch 'linus' into core/iommu 2009-01-11 00:51:06 +01:00
Ingo Molnar 1de8cd3cb9 Merge branch 'linus' into x86/cleanups 2009-01-10 23:56:42 +01:00
Linus Torvalds 4e9b1c184c Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  [IA64] fix typo in cpumask_of_pcibus()
  x86: fix x86_32 builds for summit and es7000 arch's
  cpumask: use work_on_cpu in acpi-cpufreq.c for read_measured_perf_ctrs
  cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
  cpumask: use cpumask_var_t in acpi-cpufreq.c
  cpumask: use work_on_cpu in acpi/cstate.c
  cpumask: convert struct cpufreq_policy to cpumask_var_t
  cpumask: replace CPUMASK_ALLOC etc with cpumask_var_t
  x86: cleanup remaining cpumask_t ops in smpboot code
  cpumask: update pci_bus_show_cpuaffinity to use new cpumask API
  cpumask: update local_cpus_show to use new cpumask API
  ia64: cpumask fix for is_affinity_mask_valid()
2009-01-10 06:12:18 -08:00
Len Brown b2576e1d44 Merge branch 'linus' into release 2009-01-09 03:39:43 -05:00
Jaswinder Singh Rajput 6d652ea1d0 x86: smp.h move boot_cpu_id declartion to cpu.h
Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-07 21:48:26 +01:00
Rafael J. Wysocki f6dc1e5e3d PCI PM: Put PM callbacks in the order of execution
Put PM callbacks in drivers/pci/pci-driver.c in the order in which
they are executed which makes it much easier to follow the code.

No functional changes should result from this.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:19:43 -08:00
Rafael J. Wysocki d67e37d793 PCI PM: Run default PM callbacks for all devices using new framework
It should be quite clear that it generally makes sense to execute
the default PM callbacks (ie. the callbacks used for handling
suspend, hibernation and resume of PCI devices without drivers) for
all devices.  Of course, the drivers that provide legacy PCI PM
support (ie. the ->suspend, ->suspend_late, ->resume_early
or ->resume hooks in the pci_driver structure), carry out these
operations too, so we can't do it for devices with such drivers.
Still, we can make the default PM callbacks run for devices with
drivers using the new framework (ie. implement the pm object), since
there are no such drivers at the moment.

This also simplifies the code and makes it smaller.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:19:39 -08:00
Rafael J. Wysocki 98e6e286d7 PCI PM: Register power state of devices during initialization
Use the observation that the power state of a PCI device can be
loaded into its pci_dev structure as soon as pci_pm_init() is run for
it and make that happen.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:18:04 -08:00