Commit Graph

66 Commits

Author SHA1 Message Date
Ross Zwisler
02cb489be7 ACPI: NUMA: Fix typo in the full name of SRAT
To save someone the time of searching the ACPI spec for
"Static Resource Affinity Table".

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24 22:27:44 +02:00
Boris Ostrovsky
aec03f89e9 ACPI/NUMA: Do not map pxm to node when NUMA is turned off
acpi_map_pxm_to_node() unconditially maps nodes even when NUMA is turned
off. So acpi_get_node() might return a node > 0, which is fatal when NUMA
is disabled as the rest of the kernel assumes that only node 0 exists.

Expose numa_off to the acpi code and return NUMA_NO_NODE when it's set.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: linux-ia64@vger.kernel.org
Cc: catalin.marinas@arm.com
Cc: rjw@rjwysocki.net
Cc: will.deacon@arm.com
Cc: linux-acpi@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1481602709-18260-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-15 11:32:32 +01:00
Hanjun Guo
4bac6fa73d ACPI / NUMA: Enable ACPI based NUMA on ARM64
Add function needed for cpu to node mapping, and enable ACPI based
NUMA for ARM64 in Kconfig

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
[david.daney@cavium.com added ACPI_NUMA default to y for ARM64]
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-22 01:36:59 +02:00
David Daney
e0af261a43 ACPI / NUMA: Improve SRAT error detection and add messages
Loosely based on code from Robert Richter and Hanjun Guo.

Improve out of range node detection as well as allow for Larger SRAT
entities.

Add printing of nice messages.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-30 14:27:08 +02:00
Hanjun Guo
3770442e79 ACPI / NUMA: Move acpi_numa_memory_affinity_init() to drivers/acpi/numa.c
acpi_numa_memory_affinity_init() will be reused by arm64.  Move it to
drivers/acpi/numa.c to facilitate reuse.

No code change.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-30 14:27:08 +02:00
David Daney
e84025e274 ACPI / NUMA: move bad_srat() and srat_disabled() to drivers/acpi/numa.c
bad_srat() and srat_disabled() are shared by x86 and follow-on arm64
patches.  Move them to drivers/acpi/numa.c in preparation for arm64
support.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
[david.daney@cavium.com moved definitions to drivers/acpi/numa.c]
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-30 14:27:08 +02:00
Hanjun Guo
6525afdf53 ACPI / NUMA: move acpi_numa_slit_init() to drivers/acpi/numa.c
Identical implementations of acpi_numa_slit_init() are used by both
x86 and follow-on arm64 support.  Move it to drivers/acpi/numa.c, and
guard with CONFIG_X86 || CONFIG_ARM64 because ia64 has its own
architecture specific implementation.

No code change.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-30 14:27:07 +02:00
Robert Richter
312521d054 ACPI / NUMA: Move acpi_numa_arch_fixup() to ia64 only
Since acpi_numa_arch_fixup() is only used in arch ia64, move it there
to make a generic interface easier. This avoids empty function stubs
or some complex kconfig options for x86 and arm64.

Signed-off-by: Robert Richter <rrichter@cavium.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-30 14:27:07 +02:00
Hanjun Guo
258cb74ba5 ACPI / NUMA: remove duplicate NULL check
The argument "header" for acpi_table_print_srat_entry()
is always checked before the function is called, it's
duplicate to check it again, remove it.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-30 14:27:07 +02:00
Hanjun Guo
3dda448189 ACPI / NUMA: Replace ACPI_DEBUG_PRINT() with pr_debug()
ACPI_DEBUG_PRINT is a bit fragile in acpi/numa.c, the first thing
is that component ACPI_NUMA(0x80000000) is not described in the
Documentation/acpi/debug.txt, and even not defined in the struct
acpi_dlayer acpi_debug_layers which we can not dynamically enable/disable
it with /sys/modules/acpi/parameters/debug_layer. another thing
is that ACPI_DEBUG_OUTPUT is controlled by ACPICA which not coordinate
well with ACPI drivers.

Replace ACPI_DEBUG_PRINT() with pr_debug() in this patch as pr_debug
will do the same thing for debug purpose and it can make the code much
cleaner, also remove the related code which not needed anymore if
ACPI_DEBUG_PRINT() is gone.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-30 14:27:07 +02:00
Hanjun Guo
ac906a6d56 ACPI / NUMA: Use pr_fmt() instead of printk
Just do some cleanups to replace printk with pr_fmt().

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-30 14:27:07 +02:00
Lukasz Anaczkowski
702b07fcc9 ACPI / SRAT: fix SRAT parsing order with both LAPIC and X2APIC present
SRAT maps APIC ID to proximity domains ids (PXM). Mapping from PXM to
NUMA node ids is based on order of entries in SRAT table.
SRAT table has just LAPIC entires or mix of LAPIC and X2APIC entries.
As long as there are only LAPIC entires, mapping from proximity domain
id to NUMA node id is as assumed by BIOS. However, once APIC entries are
mixed, X2APIC entries would be first mapped which causes unexpected NUMA
node mapping.

To fix that, change parsing to check each entry against both LAPIC and
X2APIC so mapping is in the SRAT/PXM order.

This is supplemental change to the fix made by commit d81056b527
(Handle apic/x2apic entries in MADT in correct order) and using the
mechanism introduced by 9b3fedd (ACPI / tables: Add acpi_subtable_proc
to ACPI table parsers).

Fixes: d81056b527 (Handle apic/x2apic entries in MADT in correct order)
Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
[ rjw : Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-21 22:13:03 +02:00
Jarkko Nikula
4c62dbbce9 ACPI: Remove FSF mailing addresses
There is no need to carry potentially outdated Free Software Foundation
mailing address in file headers since the COPYING file includes it.

Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-08 02:27:32 +02:00
Toshi Kani
99759869fa acpi: Add acpi_map_pxm_to_online_node()
The kernel initializes CPU & memory's NUMA topology from ACPI
SRAT table.  Some other ACPI tables, such as NFIT and DMAR, also
contain proximity IDs for their device's NUMA topology.  This
information can be used to improve performance of these devices.

This patch introduces acpi_map_pxm_to_online_node(), which is
similar to acpi_map_pxm_to_node(), but always returns an online
node.  When the mapped node from a given proximity ID is offline,
it looks up the node distance table and returns the nearest
online node.

ACPI device drivers, which are called after the NUMA initialization
has completed in the kernel, can call this interface to obtain their
device NUMA topology from ACPI tables.  Such drivers do not have to
deal with offline nodes.  A node may be offline when a device
proximity ID is unique, SRAT memory entry does not exist, or NUMA is
disabled, ex. "numa=off" on x86.

This patch also moves the pxm range check from acpi_get_node() to
acpi_map_pxm_to_node().

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-26 11:23:38 -04:00
Hanjun Guo
2fad93083e ACPI / table: remove duplicate NULL check for the handler of acpi_table_parse()
In acpi_table_parse(), pointer of the table to pass to handler() is
checked before handler() called, so remove all the duplicate NULL
check in the handler function.

CC: Tony Luck <tony.luck@intel.com>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-06 01:34:47 +01:00
Bjorn Helgaas
beffbe54fa ACPI / numa: Use __weak, not the gcc-specific version
Use "__weak" instead of the gcc-specific "__attribute__ ((weak))".

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-03 10:39:43 -07:00
Bjorn Helgaas
d79ed248d9 ACPI / numa: Make __acpi_map_pxm_to_node(), acpi_get_pxm() static
__acpi_map_pxm_to_node() and acpi_get_pxm() are only used within
drivers/acpi/numa.c.  This makes them static and removes their
declarations.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-03 10:39:38 -07:00
Bjorn Helgaas
962fe9c91c ACPI / numa: Simplify acpi_get_node() style
Simplify control flow by removing local variable initialization and
returning a constant as soon as possible.  No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-03 10:39:32 -07:00
Bjorn Helgaas
486c79b500 ACPI / numa: Fix acpi_get_node() prototype
acpi_get_node() takes an acpi_handle, not an "acpi_handle *".  This
fixes the prototype and the definitions.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-03 10:39:27 -07:00
Lv Zheng
8b48463f89 ACPI: Clean up inclusions of ACPI header files
Replace direct inclusions of <acpi/acpi.h>, <acpi/acpi_bus.h> and
<acpi/acpi_drivers.h>, which are incorrect, with <linux/acpi.h>
inclusions and remove some inclusions of those files that aren't
necessary.

First of all, <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h>
should not be included directly from any files that are built for
CONFIG_ACPI unset, because that generally leads to build warnings about
undefined symbols in !CONFIG_ACPI builds.  For CONFIG_ACPI set,
<linux/acpi.h> includes those files and for CONFIG_ACPI unset it
provides stub ACPI symbols to be used in that case.

Second, there are ordering dependencies between those files that always
have to be met.  Namely, it is required that <acpi/acpi_bus.h> be included
prior to <acpi/acpi_drivers.h> so that the acpi_pci_root declarations the
latter depends on are always there.  And <acpi/acpi.h> which provides
basic ACPICA type declarations should always be included prior to any other
ACPI headers in CONFIG_ACPI builds.  That also is taken care of including
<linux/acpi.h> as appropriate.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci stuff)
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> (Xen stuff)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-07 01:03:14 +01:00
Jianguo Wu
1bb25df0fd ACPI / mm: use NUMA_NO_NODE
Use more appropriate NUMA_NO_NODE instead of -1

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-24 01:40:38 +02:00
Hanjun Guo
40e318563c ACPI / numa: Fix __init attribute location in slit_valid()
__init belongs after the return type on functions, not before it.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-13 12:35:42 +02:00
Yinghai Lu
20e6926dcb x86, ACPI, mm: Revert movablemem_map support
Tim found:

  WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x80()
  Hardware name: S2600CP
  sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
  smpboot: Booting Node   1, Processors  #1
  Modules linked in:
  Pid: 0, comm: swapper/1 Not tainted 3.9.0-0-generic #1
  Call Trace:
    set_cpu_sibling_map+0x279/0x449
    start_secondary+0x11d/0x1e5

Don Morris reproduced on a HP z620 workstation, and bisected it to
commit e8d1955258 ("acpi, memory-hotplug: parse SRAT before memblock
is ready")

It turns out movable_map has some problems, and it breaks several things

1. numa_init is called several times, NOT just for srat. so those
	nodes_clear(numa_nodes_parsed)
	memset(&numa_meminfo, 0, sizeof(numa_meminfo))
   can not be just removed.  Need to consider sequence is: numaq, srat, amd, dummy.
   and make fall back path working.

2. simply split acpi_numa_init to early_parse_srat.
   a. that early_parse_srat is NOT called for ia64, so you break ia64.
   b.  for (i = 0; i < MAX_LOCAL_APIC; i++)
	     set_apicid_to_node(i, NUMA_NO_NODE)
     still left in numa_init. So it will just clear result from early_parse_srat.
     it should be moved before that....
   c.  it breaks ACPI_TABLE_OVERIDE...as the acpi table scan is moved
       early before override from INITRD is settled.

3. that patch TITLE is total misleading, there is NO x86 in the title,
   but it changes critical x86 code. It caused x86 guys did not
   pay attention to find the problem early. Those patches really should
   be routed via tip/x86/mm.

4. after that commit, following range can not use movable ram:
  a. real_mode code.... well..funny, legacy Node0 [0,1M) could be hot-removed?
  b. initrd... it will be freed after booting, so it could be on movable...
  c. crashkernel for kdump...: looks like we can not put kdump kernel above 4G
	anymore.
  d. init_mem_mapping: can not put page table high anymore.
  e. initmem_init: vmemmap can not be high local node anymore. That is
     not good.

If node is hotplugable, the mem related range like page table and
vmemmap could be on the that node without problem and should be on that
node.

We have workaround patch that could fix some problems, but some can not
be fixed.

So just remove that offending commit and related ones including:

 f7210e6c4a ("mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to
    protect movablecore_map in memblock_overlaps_region().")

 01a178a94e ("acpi, memory-hotplug: support getting hotplug info from
    SRAT")

 27168d38fa ("acpi, memory-hotplug: extend movablemem_map ranges to
    the end of node")

 e8d1955258 ("acpi, memory-hotplug: parse SRAT before memblock is
    ready")

 fb06bc8e5f ("page_alloc: bootmem limit with movablecore_map")

 42f47e27e7 ("page_alloc: make movablemem_map have higher priority")

 6981ec3114 ("page_alloc: introduce zone_movable_limit[] to keep
    movable limit for nodes")

 34b71f1e04 ("page_alloc: add movable_memmap kernel parameter")

 4d59a75125 ("x86: get pg_data_t's memory from other node")

Later we should have patches that will make sure kernel put page table
and vmemmap on local node ram instead of push them down to node0.  Also
need to find way to put other kernel used ram to local node ram.

Reported-by: Tim Gardner <tim.gardner@canonical.com>
Reported-by: Don Morris <don.morris@hp.com>
Bisected-by: Don Morris <don.morris@hp.com>
Tested-by: Don Morris <don.morris@hp.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-02 09:34:39 -08:00
Tang Chen
e8d1955258 acpi, memory-hotplug: parse SRAT before memblock is ready
On linux, the pages used by kernel could not be migrated.  As a result,
if a memory range is used by kernel, it cannot be hot-removed.  So if we
want to hot-remove memory, we should prevent kernel from using it.

The way now used to prevent this is specify a memory range by
movablemem_map boot option and set it as ZONE_MOVABLE.

But when the system is booting, memblock will allocate memory, and
reserve the memory for kernel.  And before we parse SRAT, and know the
node memory ranges, memblock is working.  And it may allocate memory in
ranges to be set as ZONE_MOVABLE.  This memory can be used by kernel,
and never be freed.

So, let's parse SRAT before memblock is called first.  And it is early
enough.

The first call of memblock_find_in_range_node() is in:

  setup_arch()
    |-->setup_real_mode()

so, this patch add a function early_parse_srat() to parse SRAT, and call
it before setup_real_mode() is called.

NOTE:

1) early_parse_srat() is called before numa_init(), and has initialized
   numa_meminfo.  So DO NOT clear numa_nodes_parsed in numa_init() and DO
   NOT zero numa_meminfo in numa_init(), otherwise we will lose memory
   numa info.

2) I don't know why using count of memory affinities parsed from SRAT
   as a return value in original acpi_numa_init().  So I add a static
   variable srat_mem_cnt to remember this count and use it as the return
   value of the new acpi_numa_init()

[mhocko@suse.cz: parse SRAT before memblock is ready fix]
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:14 -08:00
Rafael J. Wysocki
a68d35323b Merge branch 'acpi-assorted'
* acpi-assorted:
  ACPI: Add DMI entry for Sony VGN-FW41E_H
  ACPI: fix obsolete comment in custom_method.c
  ACPI / thermal: Use mode to enable/disable kernel thermal processing
  ACPI thermal: remove unnecessary newline from exception message
  ACPI sysfs: remove unnecessary newline from exception
  ACPI video: remove unnecessary newline from error messages
  ACPI: SRAT: report non-volatile memory in debug
  ACPI: Rework acpi_get_child() to be more efficient
2013-02-15 13:58:43 +01:00
Davidlohr Bueso
208f6cc9e5 ACPI: SRAT: report non-volatile memory in debug
Just as with the other memory affinity flags, report
non-volatile memory with ACPI debug.

Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-01-26 00:34:21 +01:00
Lv Zheng
b43e1065ca ACPICA: Cleanup table handler naming conflicts.
This is a cosmetic patch only. Comparison of the resulting binary showed
only line number differences.

This patch does not affect the generation of the Linux binary.
This patch decreases 44 lines of 20121114 divergence.diff.

There are naming conflicts between Linux and ACPICA on table handlers. This
patch cleans up this conflicts to reduce the source code diff between Linux
and ACPICA.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-01-11 13:10:16 +01:00
Thomas Renninger
095adbb644 ACPI: Only count valid srat memory structures
Otherwise you could run into:
WARN_ON in numa_register_memblks(), because node_possible_map is zero

References: https://bugzilla.novell.com/show_bug.cgi?id=757888

On this machine (ProLiant ML570 G3) the SRAT table contains:
  - No processor affinities
  - One memory affinity structure (which is set disabled)

CC: Per Jessen <per@opensuse.org>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-08-03 00:15:53 -04:00
Thomas Renninger
f3946fb6e5 ACPI: Untangle a return statement for better readability
No functional change.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-08-03 00:15:42 -04:00
Kurt Garloff
8df0eb7c9d ACPI: Store SRAT table revision
In SRAT v1, we had 8bit proximity domain (PXM) fields; SRAT v2 provides
32bits for these. The new fields were reserved before.
According to the ACPI spec, the OS must disregrard reserved fields.
In order to know whether or not, we must know what version the SRAT
table has.

This patch stores the SRAT table revision for later consumption
by arch specific __init functions.

Signed-off-by: Kurt Garloff <kurt@garloff.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 04:19:04 -05:00
Tejun Heo
940fed2e79 x86-64, NUMA: Unify {acpi|amd}_{numa_init|scan_nodes}() arguments and return values
The functions used during NUMA initialization - *_numa_init() and
*_scan_nodes() - have different arguments and return values.  Unify
them such that they all take no argument and return 0 on success and
-errno on failure.  This is in preparation for further NUMA init
cleanups.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Shaohui Zheng <shaohui.zheng@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@linux.intel.com>
2011-02-16 12:13:06 +01:00
Tony Luck
9378b63ccb x86, ia64, acpi: Clean up x86-ism in drivers/acpi/numa.c
As pointed out by Linus CONFIG_X86 in drivers/acpi/numa.c is
ugly.

Builds and boots on ia64 (both normally and with maxcpus=8 to limit
the number of cpus).

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Len Brown <len.brown@intel.com>
LKML-Reference: <4D2D6B5D.4080208@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-12 12:15:09 +01:00
Yinghai Lu
d3bd058826 x86, acpi: Parse all SRAT cpu entries even above the cpu number limitation
Recent Intel new system have different order in MADT, aka will list all thread0
at first, then all thread1.
But SRAT table still old order, it will list cpus in one socket all together.

If the user have compiled limited NR_CPUS or boot with nr_cpus=, could have missed
to put some cpus apic id to node mapping into apicid_to_node[].

for example for 4 sockets system with 64 cpus with nr_cpus=32 will get crash...

[    9.106288] Total of 32 processors activated (136190.88 BogoMIPS).
[    9.235021] divide error: 0000 [#1] SMP
[    9.235315] last sysfs file:
[    9.235481] CPU 1
[    9.235592] Modules linked in:
[    9.245398]
[    9.245478] Pid: 2, comm: kthreadd Not tainted 2.6.37-rc1-tip-yh-01782-ge92ef79-dirty #274      /Sun Fire x4800
[    9.265415] RIP: 0010:[<ffffffff81075a8f>]  [<ffffffff81075a8f>] select_task_rq_fair+0x4f0/0x623
...
[    9.645938] RIP  [<ffffffff81075a8f>] select_task_rq_fair+0x4f0/0x623
[    9.665356]  RSP <ffff88103f8d1c40>
[    9.665568] ---[ end trace 2296156d35fdfc87 ]---

So let just parse all cpu entries in SRAT.

Also add apicid checking with MAX_LOCAL_APIC, in case We could out of boundaries of
apicid_to_node[].

it fixes following bug too.
https://bugzilla.kernel.org/show_bug.cgi?id=22662

-v2: expand to 32bit according to hpa
   need to add MAX_LOCAL_APIC for 32bit

Reported-and-Tested-by: Wu Fengguang <fengguang.wu@intel.com>
Reported-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Tested-by: Myron Stowe <myron.stowe@hp.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4D0AD486.9020704@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-12-23 13:16:18 -08:00
Andi Kleen
cfa806f059 gcc-4.6: ACPI: fix unused but set variables in ACPI
Some minor improvements in error handling, but overall it was mostly dead
code.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-08-15 00:53:08 -04:00
David Rientjes
0f9b75ef37 ACPI: NUMA: map pxms to low node ids
pxms are mapped to low node ids to maintain generic kernel use of
functions such as pxm_to_node() that are used to determine device
affinity.  Otherwise, there is no pxm-to-node and node-to-pxm matching
rule for x86_64 users of NUMA emulation where a single pxm may be bound
to multiple NUMA nodes.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-04-04 00:50:01 -04:00
Yinghai Lu
2b633e3fac smp: Use nr_cpus= to set nr_cpu_ids early
On x86, before prefill_possible_map(), nr_cpu_ids will be NR_CPUS aka
CONFIG_NR_CPUS.

Add nr_cpus= to set nr_cpu_ids. so we can simulate cpus <=8 are installed on
normal config.

-v2: accordging to Christoph, acpi_numa_init should use nr_cpu_ids in stead of
     NR_CPUS.
-v3: add doc in kernel-parameters.txt according to Andrew.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-34-git-send-email-yinghai@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Tony Luck <tony.luck@intel.com>
2010-02-17 17:30:22 -08:00
Len Brown
aa96ce0af8 Merge branch 'misc-2.6.33' into release 2009-12-16 14:22:32 -05:00
David Rientjes
b552a8c56d ACPI: remove NID_INVAL
NUMA_NO_NODE has been exported globally and thus it can replace NID_INVAL
in the acpi code.

Also removes the unused acpi_unmap_pxm_to_node() function.

[akpm@linux-foundation.org: coding-style fixes]
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 01:53:19 -05:00
David Rientjes
8716273cae x86: Export srat physical topology
This is the counterpart to "x86: export k8 physical topology" for
SRAT. It is not as invasive because the acpi code already seperates
node setup into detection and registration steps, with the
exception of registering e820 active regions in
acpi_numa_memory_affinity_init().  This is now moved to
acpi_scan_nodes() if NUMA emulation is disabled or deferred.

acpi_numa_init() now returns a value which specifies whether an
underlying SRAT was located.  If so, that topology can be used by
the emulation code to interleave emulated nodes over physical nodes
or to register the nodes for ACPI.

acpi_get_nodes() may now be used to export the srat physical
topology of the machine for NUMA emulation.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Ankita Garg <ankita@in.ibm.com>
Cc: Len Brown <len.brown@intel.com>
LKML-Reference: <alpine.DEB.1.00.0909251518580.14754@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 22:56:46 +02:00
Len Brown
a192a9580b ACPI: Move definition of PREFIX from acpi_bus.h to internal..h
Linux/ACPI core files using internal.h all PREFIX "ACPI: ",
however, not all ACPI drivers use/want it -- and they
should not have to #undef PREFIX to define their own.

Add GPL commment to internal.h while we are there.

This does not change any actual console output,
asside from a whitespace fix.

Signed-off-by: Len Brown <len.brown@intel.com>
2009-08-28 19:57:27 -04:00
Suresh Siddha
7237d3de78 x86, ACPI: add support for x2apic ACPI extensions
All logical processors with APIC ID values of 255 and greater will have their
APIC reported through Processor X2APIC structure (type-9 entry type) and all
logical processors with APIC ID less than 255 will have their APIC reported
through legacy Processor Local APIC (type-0 entry type) only. This is the
same case even for NMI structure reporting.
    
The Processor X2APIC Affinity structure provides the association between the
X2APIC ID of a logical processor and the proximity domain to which the logical
processor belongs.
    
For OSPM, Procssor IDs outside the 0-254 range are to be declared as Device()
objects in the ACPI namespace.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-03 20:08:12 -04:00
Cyrill Gorcunov
27ce341983 acpi: check for pxm_to_node_map overflow
It is hardly (if ever) possible but in case of broken _PXM entry we could
reach out of pxm_to_node_map array bounds in acpi_map_pxm_to_node() call.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-03-16 00:35:30 -04:00
Lin Ming
ea7e96e0f2 ACPI: remove private acpica headers from driver files
External driver files should not include any private acpica headers.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-12-31 01:15:22 -05:00
Matthew Wilcox
27663c5855 ACPI: Change acpi_evaluate_integer to support 64-bit on 32-bit kernels
As of version 2.0, ACPI can return 64-bit integers.  The current
acpi_evaluate_integer only supports 64-bit integers on 64-bit platforms.
Change the argument to take a pointer to an acpi_integer so we support
64-bit integers on all platforms.

lenb: replaced use of "acpi_integer" with "unsigned long long"
lenb: fixed bug in acpi_thermal_trips_update()

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-11 02:47:33 -04:00
Bob Moore
19d0cfe9dd ACPICA: Update DMAR and SRAT table definitions
Synchronized tables with current specifications.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Fenghua Yu
39b8931b5c ACPI: handle invalid ACPI SLIT table
This is a SLIT sanity checking patch.  It moves slit_valid() function to
generic ACPI code and does sanity checking for both x86 and ia64.  It sets up
node_distance with LOCAL_DISTANCE and REMOTE_DISTANCE when hitting invalid
SLIT table on ia64.  It also cleans up unused variable localities in
acpi_parse_slit() on x86.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-06-11 19:13:46 -04:00
Adrian Bunk
e5685b9d35 ACPI: misc cleanups
This patch contains the following possible cleanups:
    - make the following needlessly global code static:
      - drivers/acpi/bay.c:dev_attr_eject
      - drivers/acpi/bay.c:dev_attr_present
      - drivers/acpi/dock.c:dev_attr_docked
      - drivers/acpi/dock.c:dev_attr_flags
      - drivers/acpi/dock.c:dev_attr_uid
      - drivers/acpi/dock.c:dev_attr_undock
      - drivers/acpi/pci_bind.c:acpi_pci_unbind()
      - drivers/acpi/pci_link.c:acpi_link_lock
      - drivers/acpi/sbs.c:acpi_sbs_callback()
      - drivers/acpi/sbshc.c:acpi_smbus_transaction()
      - drivers/acpi/sleep/main.c:acpi_sleep_prepare()
    - #if 0 the following unused global functions:
      - drivers/acpi/numa.c:acpi_unmap_pxm_to_node()
    - remove the following unused EXPORT_SYMBOL's:
      - acpi_register_gsi
      - acpi_unregister_gsi
      - acpi_strict
      - acpi_bus_receive_event
      - register_acpi_bus_type
      - unregister_acpi_bus_type
      - acpi_os_printf
      - acpi_os_sleep
      - acpi_os_stall
      - acpi_os_read_pci_configuration
      - acpi_os_create_semaphore
      - acpi_os_delete_semaphore
      - acpi_os_wait_semaphore
      - acpi_os_signal_semaphore
      - acpi_os_signal
      - acpi_pci_irq_enable
      - acpi_get_pxm

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-07 03:33:23 -05:00
Jan Beulich
ffada8913e ACPI: fix modpost warnings
for sn2_defconfig:

WARNING: vmlinux.o(.text+0x4b8601): Section mismatch: reference to .init.data:node_to_pxm_map (between '__acpi_map_pxm_to_node' and 'acpi_get_pxm')
WARNING: vmlinux.o(.text+0x4b8741): Section mismatch: reference to .init.data:pxm_to_node_map (between 'acpi_map_pxm_to_node' and 'acpi_get_node')

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-12-13 17:50:09 -05:00
David Rientjes
3484d79813 x86_64: fake pxm-to-node mapping for fake numa
For NUMA emulation, our SLIT should represent the true NUMA topology of the
system but our proximity domain to node ID mapping needs to reflect the
emulated state.

When NUMA emulation has successfully setup fake nodes on the system, a new
function, acpi_fake_nodes() is called.  This function determines the proximity
domain (_PXM) for each true node found on the system.  It then finds which
emulated nodes have been allocated on this true node as determined by its
starting address.  The node ID to PXM mapping is changed so that each fake
node ID points to the PXM of the true node that it is located on.

If the machine failed to register a SLIT, then we assume there is no special
requirement for emulated node affinity so we use the default LOCAL_DISTANCE,
which is newly exported to this code, as our measurement if the emulated nodes
appear in the same PXM.  Otherwise, we use REMOTE_DISTANCE.

PXM_INVAL and NID_INVAL are also exported to the ACPI header file so that we
can compare node_to_pxm() results in generic code (in this case, the SRAT
code).

Cc: Len Brown <lenb@kernel.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:10 -07:00
David Rientjes
ae2c6dcf90 x86_64: various cleanups in NUMA scan node
In acpi_scan_nodes(), we immediately return -1 if acpi_numa <= 0, meaning
we haven't detected any underlying ACPI topology or we have explicitly
disabled its use from the command-line with numa=noacpi.

acpi_table_print_srat_entry() and acpi_table_parse_srat() are only
referenced within drivers/acpi/numa.c, so we can mark them as static and
remove their prototypes from the header file.

Likewise, pxm_to_node_map[] and node_to_pxm_map[] are only used within
drivers/acpi/numa.c, so we mark them as static and remove their externs
from the header file.

The automatic 'result' variable is unused in acpi_numa_init(), so it's
removed.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:08 -07:00