linux

Author	SHA1	Message	Date
Andy Whitcroft	b9ada4281c	x86: reinstate numa remap for SPARSEMEM on x86 NUMA systems Recent kernels have been panic'ing trying to allocate memory early in boot, in __alloc_pages: BUG: unable to handle kernel paging request at 00001568 IP: [<c10407b6>] __alloc_pages+0x33/0x2cc pdpt = 00000000013a5001 pde = 0000000000000000 Oops: 0000 [#1] SMP Modules linked in: Pid: 1, comm: swapper Not tainted (2.6.25 #78) EIP: 0060:[<c10407b6>] EFLAGS: 00010246 CPU: 0 EIP is at __alloc_pages+0x33/0x2cc EAX: 00001564 EBX: 000412d0 ECX: 00001564 EDX: 000005c3 ESI: f78012a0 EDI: 00000001 EBP: 00001564 ESP: f7871e50 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process swapper (pid: 1, ti=f7870000 task=f786f670 task.ti=f7870000) Stack: 00000000 f786f670 00000010 00000000 0000b700 000412d0 f78012a0 00000001 00000000 c105b64d 00000000 000412d0 f78012a0 f7803120 00000000 c105c1c5 00000010 f7803144 000412d0 00000001 f7803130 f7803120 f78012a0 00000001 Call Trace: [<c105b64d>] kmem_getpages+0x94/0x129 [<c105c1c5>] cache_grow+0x8f/0x123 [<c105c689>] ____cache_alloc_node+0xb9/0xe4 [<c105c999>] kmem_cache_alloc_node+0x92/0xd2 [<c1018929>] build_sched_domains+0x536/0x70d [<c100b63c>] do_flush_tlb_all+0x0/0x3f [<c100b63c>] do_flush_tlb_all+0x0/0x3f [<c10572d6>] interleave_nodes+0x23/0x5a [<c105c44f>] alternate_node_alloc+0x43/0x5b [<c1018b47>] arch_init_sched_domains+0x46/0x51 [<c136e85e>] kernel_init+0x0/0x82 [<c137ac19>] sched_init_smp+0x10/0xbb [<c136e8a1>] kernel_init+0x43/0x82 [<c10035cf>] kernel_thread_helper+0x7/0x10 Debugging this showed that the NODE_DATA() for nodes other than node 0 were all NULL. Tracing this back showed that the NODE_DATA() pointers were being initialised to each nodes remap space. However under SPARSEMEM remap is disabled which leads to the pgdat's being placed incorrectly at kernel virtual address 0. Leading to the panic when attempting to allocate memory from these nodes. Numa remap was disabled in the commit below. This occured while fixing problems triggered when attempting to boot x86_32 NUMA SPARSEMEM kernels on non-numa hardware. x86: make NUMA work on 32-bit commit `1b000a5dbe` The real problem is believed to be related to other alignment issues in the regions blocked out from the bootmem allocator for small memory systems, and has been fixed separately. Therefore re-enable remap for SPARSMEM, which fixes pgdat allocation issues. Testing confirms that SPARSMEM NUMA kernels will boot correctly with this part of the change reverted. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-20 14:10:48 +02:00
Avi Kivity	31f4d870b0	x86: fix crash on cpu hotplug on pat-incapable machines pat_disable() is __init, which means it goes away after booting is complete. Unfortunately it is used by the hotplug code if the machine is not pat-capable, causing a crash. Fix by marking pat_disable() as __cpuinit. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-17 22:57:20 +02:00
Pranith Kumar	afc8534380	x86: arch/x86/mm/pat.c - fix warning fix this warning: arch/x86/mm/pat.c: In function `phys_mem_access_prot_allowed': arch/x86/mm/pat.c:558: warning: long long unsigned int format, long unsigned int arg (arg 6) arch/x86/mm/pat.c: In function `map_devmem': arch/x86/mm/pat.c:580: warning: long long unsigned int format, long unsigned int arg (arg 6) Signed-off-by: D Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-13 19:39:30 +02:00
Hugh Dickins	61165d7a03	x86: fix app crashes after SMP resume After resume on a 2cpu laptop, kernel builds collapse with a sed hang, sh or make segfault (often on 20295564), real-time signal to cc1 etc. Several hurdles to jump, but a manually-assisted bisect led to -rc1's `d2bcbad5f3` x86: do not zap_low_mappings in __smp_prepare_cpus. Though the low mappings were removed at bootup, they were left behind (with Global flags helping to keep them in TLB) after resume or cpu online, causing the crashes seen. Reinstate zap_low_mappings (with local __flush_tlb_all) for each cpu_up on x86_32. This used to be serialized by smp_commenced_mask: that's now gone, but a low_mappings flag will do. No need for native_smp_cpus_done to repeat the zap: let mem_init zap BSP's low mappings just like on UP. (In passing, fix error code from native_cpu_up: do_boot_cpu returns a variety of diagnostic values, Dprintk what it says but convert to -EIO. And save_pg_dir separately before zap_low_mappings: doesn't matter now, but zapping twice in succession wiped out resume's swsusp_pg_dir.) That worked well on the duo and one quad, but wouldn't boot 3rd or 4th cpu on P4 Xeon, oopsing just after unlock_ipi_call_lock. The TLB flush IPI now being sent reveals a long-standing bug: the booting cpu has its APIC readied in smp_callin at the top of start_secondary, but isn't put into the cpu_online_map until just before that unlock_ipi_call_lock. So native_smp_call_function_mask to online cpus would send_IPI_allbutself, including the cpu just coming up, though it has been excluded from the count to wait for: by the time it handles the IPI, the call data on native_smp_call_function_mask's stack may well have been overwritten. So fall back to send_IPI_mask while cpu_online_map does not match cpu_callout_map: perhaps there's a better APICological fix to be made at the start_secondary end, but I wouldn't know that. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-13 19:36:12 +02:00
Thomas Gleixner	8d4a430085	x86: cleanup PAT cpu validation Move the scattered checks for PAT support to a single function. Its moved to addon_cpuid_features.c as this file is shared between 32 and 64 bit. Remove the manipulation of the PAT feature bit and just disable PAT in the PAT layer, based on the PAT bit provided by the CPU and the current CPU version/model white list. Change the boot CPU check so it works on Voyager somewhere in the future as well :) Also panic, when a secondary has PAT disabled but the primary one has alrady switched to PAT. We have no way to undo that. The white list is kept for now to ensure that we can rely on known to work CPU types and concentrate on the software induced problems instead of fighthing CPU erratas and subtle wreckage caused by not yet verified CPUs. Once the PAT code has stabilized enough, we can remove the white list and open the can of worms. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-08 15:43:51 +02:00
Hugh Dickins	aeed5fce37	x86: fix PAE pmd_bad bootup warning Fix warning from pmd_bad() at bootup on a HIGHMEM64G HIGHPTE x86_32. That came from `9fc34113f6` x86: debug pmd_bad(); but we understand now that the typecasting was wrong for PAE in the previous version: pagetable pages above 4GB looked bad and stopped Arjan from booting. And revert that `cded932b75` x86: fix pmd_bad and pud_bad to support huge pages. It was the wrong way round: we shouldn't weaken every pmd_bad and pud_bad check to let huge pages slip through - in part they check that we _don't_ have a huge page where it's not expected. Put the x86 pmd_bad() and pud_bad() definitions back to what they have long been: they can be improved (x86_32 should use PTE_MASK, to stop PAE thinking junk in the upper word is good; and x86_64 should follow x86_32's stricter comparison, to stop thinking any subset of required bits is good); but that should be a later patch. Fix Hans' good observation that follow_page() will never find pmd_huge() because that would have already failed the pmd_bad test: test pmd_huge in between the pmd_none and pmd_bad tests. Tighten x86's pmd_huge() check? No, once it's a hugepage entry, it can get quite far from a good pmd: for example, PROT_NONE leaves it with only ACCESSED of the KERN_PGTABLE bits. However... though follow_page() contains this and another test for huge pages, so it's nice to keep it working on them, where does it actually get called on a huge page? get_user_pages() checks is_vm_hugetlb_page(vma) to to call alternative hugetlb processing, as does unmap_vmas() and others. Signed-off-by: Hugh Dickins <hugh@veritas.com> Earlier-version-tested-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeff Chua <jeff.chua.linux@gmail.com> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-06 13:08:58 -07:00
Thomas Gleixner	48b83d2425	x86: undo visws/numaq build changes arch/x86/pci/Makefile_32 has a nasty detail. VISWS and NUMAQ build override the generic pci-y rules. This needs a proper cleanup, but that needs more thoughts. Undo commit `895d30935e` x86: numaq fix do not override the existing pci-y rule when adding visws or numaq rules. There is also a stupid init function ordering problem vs. acpi.o Add comments to the Makefile to avoid tripping over this again. Remove the srat stub code in discontig_32.c to allow a proper NUMAQ build. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-04 20:04:45 +02:00
Andres Salomon	cb8ab687c3	x86: ioremap ram check fix `bdd3cee2e4` (x86: ioremap(), extend check to all RAM pages) breaks OLPC's ioremap call. The ioremap that OLPC uses is: romsig = ioremap(0xffffffc0, 16); The commit that breaks it is basically: - for (pfn = phys_addr >> PAGE_SHIFT; pfn < max_pfn_mapped && - (pfn << PAGE_SHIFT) < last_addr; pfn++) { + for (pfn = phys_addr >> PAGE_SHIFT; + (pfn << PAGE_SHIFT) < last_addr; pfn++) { + Previously, the 'pfn < max_pfn_mapped' check would've caused us to not enter the loop. Removing that check means we loop infinitely. The reason for that is because pfn is 0xfffff, and last_addr is 0xffffffcf. The remaining check that is used to exit the loop is not sufficient; when pfn<<PAGE_SHIFT is 0xfffff000, that is less than 0xffffffcf; when we increment pfn and it overflows (pfn == 0x100000), pfn<<PAGE_SHIFT ends up being 0. That, of course, is less than last_addr. In effect, pfn<<PAGE_SHIFT is never lower than last_addr. The simple fix for this is to limit the last_addr check to the PAGE_MASK; a patch is below. Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-30 23:15:35 +02:00
Suresh Siddha	de33c442ed	x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range() Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC. Once all the X drivers move to ioremap_wc(), we can go back to strong UC semantics for ioremap() and ioremap_nocache(). To avoid attribute aliasing issues, pci_mmap_page_range() will also use UC_MINUS for default non write-combining mapping request. Next steps: a) change all the video drivers using ioremap() or ioremap_nocache() and adding WC MTTR using mttr_add() to ioremap_wc() b) for strict usage, we can go back to strong uc semantics for ioremap() and ioremap_nocache() after some grace period for completing step-a. c) user level X server needs to use the appropriate method for setting up WC mapping (like using resourceX_wc sysfs file instead of adding MTRR for WC and using /dev/mem or resourceX under /sys) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-30 23:15:35 +02:00
Ingo Molnar	2544a873ab	revert: "x86: ioremap(), extend check to all RAM pages" Vegard Nossum reported a large (150 seconds) boot delay during bootup, and bisected it to "x86: ioremap(), extend check to all RAM pages" (commit `bdd3cee2e4`). Revert this commit for now. Bisected-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-30 23:15:34 +02:00
Adrian Bunk	b9e017e04b	x86: unexport kmap_atomic_to_page This patch removes the no longer used export of kmap_atomic_to_page. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-30 23:15:34 +02:00
Linus Torvalds	5f78e4d339	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-bigbox-pci * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-bigbox-pci: x86: add pci=check_enable_amd_mmconf and dmi check x86: work around io allocation overlap of HT links acpi: get boot_cpu_id as early for k8_scan_nodes x86_64: don't need set default res if only have one root bus x86: double check the multi root bus with fam10h mmconf x86: multi pci root bus with different io resource range, on 64-bit x86: use bus conf in NB conf fun1 to get bus range on, on 64-bit x86: get mp_bus_to_node early x86 pci: remove checking type for mmconfig probe x86: remove unneeded check in mmconf reject driver core: try parent numa_node at first before using default x86: seperate mmconf for fam10h out from setup_64.c x86: if acpi=off, force setting the mmconf for fam10h x86_64: check MSR to get MMCONFIG for AMD Family 10h x86_64: check and enable MMCONFIG for AMD Family 10h x86_64: set cfg_size for AMD Family 10h in case MMCONFIG x86: mmconf enable mcfg early x86: clear pci_mmcfg_virt when mmcfg get rejected x86: validate against acpi motherboard resources Fixed up fairly trivial conflicts in arch/x86/pci/{init.c,pci.h} due to OLPC support manually.	2008-04-29 08:26:51 -07:00
Christoph Lameter	2301696932	vmallocinfo: add caller information Add caller information so that /proc/vmallocinfo shows where the allocation request for a slice of vmalloc memory originated. Results in output like this: 0xffffc20000000000-0xffffc20000801000 8392704 alloc_large_system_hash+0x127/0x246 pages=2048 vmalloc vpages 0xffffc20000801000-0xffffc20000806000 20480 alloc_large_system_hash+0x127/0x246 pages=4 vmalloc 0xffffc20000806000-0xffffc20000c07000 4198400 alloc_large_system_hash+0x127/0x246 pages=1024 vmalloc vpages 0xffffc20000c07000-0xffffc20000c0a000 12288 alloc_large_system_hash+0x127/0x246 pages=2 vmalloc 0xffffc20000c0a000-0xffffc20000c0c000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c0c000-0xffffc20000c0f000 12288 acpi_os_map_memory+0x13/0x1c phys=cff64000 ioremap 0xffffc20000c10000-0xffffc20000c15000 20480 acpi_os_map_memory+0x13/0x1c phys=cff65000 ioremap 0xffffc20000c16000-0xffffc20000c18000 8192 acpi_os_map_memory+0x13/0x1c phys=cff69000 ioremap 0xffffc20000c18000-0xffffc20000c1a000 8192 acpi_os_map_memory+0x13/0x1c phys=fed1f000 ioremap 0xffffc20000c1a000-0xffffc20000c1c000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c1c000-0xffffc20000c1e000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c1e000-0xffffc20000c20000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c20000-0xffffc20000c22000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c22000-0xffffc20000c24000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c24000-0xffffc20000c26000 8192 acpi_os_map_memory+0x13/0x1c phys=e0081000 ioremap 0xffffc20000c26000-0xffffc20000c28000 8192 acpi_os_map_memory+0x13/0x1c phys=e0080000 ioremap 0xffffc20000c28000-0xffffc20000c2d000 20480 alloc_large_system_hash+0x127/0x246 pages=4 vmalloc 0xffffc20000c2d000-0xffffc20000c31000 16384 tcp_init+0xd5/0x31c pages=3 vmalloc 0xffffc20000c31000-0xffffc20000c34000 12288 alloc_large_system_hash+0x127/0x246 pages=2 vmalloc 0xffffc20000c34000-0xffffc20000c36000 8192 init_vdso_vars+0xde/0x1f1 0xffffc20000c36000-0xffffc20000c38000 8192 pci_iomap+0x8a/0xb4 phys=d8e00000 ioremap 0xffffc20000c38000-0xffffc20000c3a000 8192 usb_hcd_pci_probe+0x139/0x295 [usbcore] phys=d8e00000 ioremap 0xffffc20000c3a000-0xffffc20000c3e000 16384 sys_swapon+0x509/0xa15 pages=3 vmalloc 0xffffc20000c40000-0xffffc20000c61000 135168 e1000_probe+0x1c4/0xa32 phys=d8a20000 ioremap 0xffffc20000c61000-0xffffc20000c6a000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20000c6a000-0xffffc20000c73000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20000c73000-0xffffc20000c7c000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20000c7c000-0xffffc20000c7f000 12288 e1000e_setup_tx_resources+0x29/0xbe pages=2 vmalloc 0xffffc20000c80000-0xffffc20001481000 8392704 pci_mmcfg_arch_init+0x90/0x118 phys=e0000000 ioremap 0xffffc20001481000-0xffffc20001682000 2101248 alloc_large_system_hash+0x127/0x246 pages=512 vmalloc 0xffffc20001682000-0xffffc20001e83000 8392704 alloc_large_system_hash+0x127/0x246 pages=2048 vmalloc vpages 0xffffc20001e83000-0xffffc20002204000 3674112 alloc_large_system_hash+0x127/0x246 pages=896 vmalloc vpages 0xffffc20002204000-0xffffc2000220d000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc2000220d000-0xffffc20002216000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20002216000-0xffffc2000221f000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc2000221f000-0xffffc20002228000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20002228000-0xffffc20002231000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20002231000-0xffffc20002234000 12288 e1000e_setup_rx_resources+0x35/0x122 pages=2 vmalloc 0xffffc20002240000-0xffffc20002261000 135168 e1000_probe+0x1c4/0xa32 phys=d8a60000 ioremap 0xffffc20002261000-0xffffc2000270c000 4894720 sys_swapon+0x509/0xa15 pages=1194 vmalloc vpages 0xffffffffa0000000-0xffffffffa0022000 139264 module_alloc+0x4f/0x55 pages=33 vmalloc 0xffffffffa0022000-0xffffffffa0029000 28672 module_alloc+0x4f/0x55 pages=6 vmalloc 0xffffffffa002b000-0xffffffffa0034000 36864 module_alloc+0x4f/0x55 pages=8 vmalloc 0xffffffffa0034000-0xffffffffa003d000 36864 module_alloc+0x4f/0x55 pages=8 vmalloc 0xffffffffa003d000-0xffffffffa0049000 49152 module_alloc+0x4f/0x55 pages=11 vmalloc 0xffffffffa0049000-0xffffffffa0050000 28672 module_alloc+0x4f/0x55 pages=6 vmalloc [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:21 -07:00
Jeremy Fitzhardinge	180c06efce	hotplug-memory: make online_page() common All architectures use an effectively identical definition of online_page(), so just make it common code. x86-64, ia64, powerpc and sh are actually identical; x86-32 is slightly different. x86-32's differences arise because it puts its hotplug pages in the highmem zone. We can handle this in the generic code by inspecting the page to see if its in highmem, and update the totalhigh_pages count appropriately. This leaves init_32.c:free_new_highpage with a single caller, so I folded it into add_one_highpage_init. I also removed an incorrect comment referring to the NUMA case; any NUMA details have already been dealt with by the time online_page() is called. [akpm@linux-foundation.org: fix indenting] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Reviewed-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Tested-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Christoph Lameter <clameter@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:17 -07:00
Ingo Molnar	f022bfd582	x86: PAT fix Adrian Bunk noticed the following Coverity report: > Commit `e7f260a276` > (x86: PAT use reserve free memtype in mmap of /dev/mem) > added the following gem to arch/x86/mm/pat.c: > > <-- snip --> > > ... > int phys_mem_access_prot_allowed(struct file file, unsigned long pfn, > unsigned long size, pgprot_t vma_prot) > { > u64 offset = ((u64) pfn) << PAGE_SHIFT; > unsigned long flags = _PAGE_CACHE_UC_MINUS; > unsigned long ret_flags; > ... > ... (nothing that touches ret_flags) > ... > if (flags != _PAGE_CACHE_UC_MINUS) { > retval = reserve_memtype(offset, offset + size, flags, NULL); > } else { > retval = reserve_memtype(offset, offset + size, -1, &ret_flags); > } > > if (retval < 0) > return 0; > > flags = ret_flags; > > if (pfn <= max_pfn_mapped && > ioremap_change_attr((unsigned long)__va(offset), size, flags) < 0) { > free_memtype(offset, offset + size); > printk(KERN_INFO > "%s:%d /dev/mem ioremap_change_attr failed %s for %Lx-%Lx\n", > current->comm, current->pid, > cattr_name(flags), > offset, offset + size); > return 0; > } > > vma_prot = __pgprot((pgprot_val(vma_prot) & ~_PAGE_CACHE_MASK) \| > flags); > return 1; > } > > <-- snip --> > > If (flags != _PAGE_CACHE_UC_MINUS) we pass garbage from the stack to > ioremap_change_attr() and/or __pgprot(). > > Spotted by the Coverity checker. the fix simplifies the code as we get rid of the 'ret_flags' complication. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:15:16 -07:00
Linus Torvalds	86cf02f8ea	x86 PAT: tone down debugging messages some more Ingo already fixed one of these at my request (in "x86 PAT: tone down debugging messages", commit `1ebcc654f0`), but there was another one he missed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-27 11:59:30 -07:00
Yinghai Lu	cbf9bd603a	acpi: get boot_cpu_id as early for k8_scan_nodes [mingo@elte.hu: split from "x86_64: get boot_cpu_id as early for k8_scan_nodes] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 23:41:04 +02:00
Yinghai Lu	c2b91e2eec	x86_64/mm: check and print vmemmap allocation continuous On big systems with lots of memory, don't print out too much during bootup, and make it easy to find if it is continuous. on 256G 8 sockets system will get [ffffe20000000000-ffffe20002bfffff] PMD -> [ffff810001400000-ffff810003ffffff] on node 0 [ffffe2001c700000-ffffe2001c7fffff] potential offnode page_structs [ffffe20002c00000-ffffe2001c7fffff] PMD -> [ffff81000c000000-ffff8100255fffff] on node 0 [ffffe20038700000-ffffe200387fffff] potential offnode page_structs [ffffe2001c800000-ffffe200387fffff] PMD -> [ffff810820200000-ffff81083c1fffff] on node 1 [ffffe20040000000-ffffe2007fffffff] PUD ->ffff811027a00000 on node 2 [ffffe20038800000-ffffe2003fffffff] PMD -> [ffff811020200000-ffff8110279fffff] on node 2 [ffffe20054700000-ffffe200547fffff] potential offnode page_structs [ffffe20040000000-ffffe200547fffff] PMD -> [ffff811027c00000-ffff81103c3fffff] on node 2 [ffffe20070700000-ffffe200707fffff] potential offnode page_structs [ffffe20054800000-ffffe200707fffff] PMD -> [ffff811820200000-ffff81183c1fffff] on node 3 [ffffe20080000000-ffffe200bfffffff] PUD ->ffff81202fa00000 on node 4 [ffffe20070800000-ffffe2007fffffff] PMD -> [ffff812020200000-ffff81202f9fffff] on node 4 [ffffe2008c700000-ffffe2008c7fffff] potential offnode page_structs [ffffe20080000000-ffffe2008c7fffff] PMD -> [ffff81202fc00000-ffff81203c3fffff] on node 4 [ffffe200a8700000-ffffe200a87fffff] potential offnode page_structs [ffffe2008c800000-ffffe200a87fffff] PMD -> [ffff812820200000-ffff81283c1fffff] on node 5 [ffffe200c0000000-ffffe200ffffffff] PUD ->ffff813037a00000 on node 6 [ffffe200a8800000-ffffe200bfffffff] PMD -> [ffff813020200000-ffff8130379fffff] on node 6 [ffffe200c4700000-ffffe200c47fffff] potential offnode page_structs [ffffe200c0000000-ffffe200c47fffff] PMD -> [ffff813037c00000-ffff81303c3fffff] on node 6 [ffffe200c4800000-ffffe200e07fffff] PMD -> [ffff813820200000-ffff81383c1fffff] on node 7 instead of a very long print out... Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 22:51:09 +02:00
Yinghai Lu	1a27fc0a42	x86_64: fix setup_node_bootmem to support big mem excluding with memmap typical case: four sockets system, every node has 4g ram, and we are using: memmap=10g$4g to mask out memory on node1 and node2 when numa is enabled, early_node_mem is used to get node_data and node_bootmap. if it can not get memory from the same node with find_e820_area(), it will use alloc_bootmem to get buff from previous nodes. so check it and print out some info about it. need to move early_res_to_bootmem into every setup_node_bootmem. and it takes range that node has. otherwise alloc_bootmem could return addr that reserved early. depends on "mm: make reserve_bootmem can crossed the nodes". Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 22:51:08 +02:00
Yinghai Lu	8b3cd09ed2	x86_64: make reserve_bootmem_generic() use new reserve_bootmem() "mm: make reserve_bootmem can crossed the nodes" provides new reserve_bootmem(), let reserve_bootmem_generic() use that. reserve_bootmem_generic() is used to reserve initramdisk, so this way we can make sure even when bootloader or kexec load ranges cross the node memory boundaries, reserve_bootmem still works. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 22:51:08 +02:00
Venki Pallipadi	0124cecfc8	x86, PAT: disable /dev/mem mmap RAM with PAT disable /dev/mem mmap of RAM with PAT. It makes things safer and eliminates aliasing. A future improvement would be to avoid the range_is_allowed duplication. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 21:28:29 +02:00
Linus Torvalds	4a27214d7b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes: x86 PAT: decouple from nonpromisc devmem x86 PAT: tone down debugging messages	2008-04-26 09:50:58 -07:00
Dmitri Vorobiev	f7f17a67c5	x86: remove NexGen support It is claimed that NexGen CPUs were never shipped: http://lkml.org/lkml/2008/4/20/179 Also, the kernel support for these chips has been broken for a long time, the code intended to support NexGen thereby being essentially dead. As an outcome of the discussion that can be found using the URL above, this patch removes the NexGen support altogether. The changes in this patch survived a defconfig build for i386, a couple of successful randconfig builds, as well as a runtime test, which consisted in booting a 32-bit x86 box up to the shell prompt. Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 17:35:47 +02:00
Ingo Molnar	1ebcc654f0	x86 PAT: tone down debugging messages Linus reported these excessive debug printouts: > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0380000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 turn that into a pr_debug(). Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 16:01:26 +02:00
Linus Torvalds	bf16ae2509	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-pat * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-pat: generic: add ioremap_wc() interface wrapper /dev/mem: make promisc the default pat: cleanups x86: PAT use reserve free memtype in mmap of /dev/mem x86: PAT phys_mem_access_prot_allowed for dev/mem mmap x86: PAT avoid aliasing in /dev/mem read/write devmem: add range_is_allowed() check to mmap of /dev/mem x86: introduce /dev/mem restrictions with a config option	2008-04-25 12:48:08 -07:00
Linus Torvalds	4b7227ca32	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-xen-next * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-xen-next: (52 commits) xen: add balloon driver xen: allow compilation with non-flat memory xen: fold xen_sysexit into xen_iret xen: allow set_pte_at on init_mm to be lockless xen: disable preemption during tlb flush xen pvfb: Para-virtual framebuffer, keyboard and pointer driver xen: Add compatibility aliases for frontend drivers xen: Module autoprobing support for frontend drivers xen blkfront: Delay wait for block devices until after the disk is added xen/blkfront: use bdget_disk xen: Make xen-blkfront write its protocol ABI to xenstore xen: import arch generic part of xencomm xen: make grant table arch portable xen: replace callers of alloc_vm_area()/free_vm_area() with xen_ prefixed one xen: make include/xen/page.h portable moving those definitions under asm dir xen: add resend_irq_on_evtchn() definition into events.c Xen: make events.c portable for ia64/xen support xen: move events.c to drivers/xen for IA64/Xen support xen: move features.c from arch/x86/xen/features.c to drivers/xen xen: add missing definitions in include/xen/interface/vcpu.h which ia64/xen needs ...	2008-04-25 12:32:10 -07:00
Ingo Molnar	70c9f590ff	x86: remove set_fixmap() warning set_fixmap()+clear_fixmap() is safe. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-25 19:54:07 +02:00
Ingo Molnar	82a355f5a2	x86: make __set_fixmap() non-init Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-25 19:54:07 +02:00
Jeremy Fitzhardinge	85958b465c	x86: unify pgd ctor/dtor All pagetables need fundamentally the same setup and destruction, so just use the same code for everything. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	68db065c84	x86: unify KERNEL_PGD_PTRS Make KERNEL_PGD_PTRS common, as previously it was only being defined for 32-bit. There are a couple of follow-on changes from this: - KERNEL_PGD_PTRS was being defined in terms of USER_PGD_PTRS. The definition of USER_PGD_PTRS doesn't really make much sense on x86-64, since it can have two different user address-space configurations. I renamed USER_PGD_PTRS to KERNEL_PGD_BOUNDARY, which is meaningful for all of 32/32, 32/64 and 64/64 process configurations. - USER_PTRS_PER_PGD was also defined and was being used for similar purposes. Converting its users to KERNEL_PGD_BOUNDARY left it completely unused, and so I removed it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Andi Kleen <ak@suse.de> Cc: Zach Amsden <zach@vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	c20311e165	x86/pgtable.h: demacro ptep_clear_flush_young Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	f9fbf1a36a	x86/pgtable.h: demacro ptep_test_and_clear_young Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	ee5aa8d3ba	x86/pgtable.h: demacro ptep_set_access_flags Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	2761fa0920	x86: add pud_alloc for 4-level pagetables Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	6944a9c894	x86: rename paravirt_alloc_pt etc after the pagetable structure Rename (alloc\|release)_(pt\|pd) to pte/pmd to explicitly match the name of the appropriate pagetable level structure. [ x86.git merge work by Mark McLoughlin <markmc@redhat.com> ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	394158559d	x86: move all the pgd_list handling to one place Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	5a5f8f4224	x86: move pgalloc pud and pgd operations into common place Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Jeremy Fitzhardinge	170fdff705	x86: move pmd functions into common asm/pgalloc.h Common definitions for 3-level pagetable functions. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Jeremy Fitzhardinge	397f687ab7	x86: move pte functions into common asm/pgalloc.h Common definitions for 2-level pagetable functions. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Jeremy Fitzhardinge	1d262d3a49	x86: put paravirt stubs into common asm/pgalloc.h Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Ingo Molnar	1ec1fe73df	x86: xen unify x86 add common mm pgtable c fix Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Jeremy Fitzhardinge	4f76cd3822	x86: add common mm/pgtable.c Add a common arch/x86/mm/pgtable.c file for common pagetable functions. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Jeremy Fitzhardinge	79bf6d66ab	x86: convert pgalloc_64.h from macros to inlines Convert asm-x86/pgalloc_64.h from macros into functions (#include hell prevents __*_free_tlb from being inline, but they're probably a bit big to inline anyway). Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:30 +02:00
Ingo Molnar	28eb559b5b	pat: cleanups Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:40:47 +02:00
venkatesh.pallipadi@intel.com	e7f260a276	x86: PAT use reserve free memtype in mmap of /dev/mem Use reserve_memtype and free_memtype wrappers for /dev/mem mmaps. The memtype is slightly complicated here, given that we have to support existing X mappings. We fallback on UC_MINUS for that. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:40:47 +02:00
venkatesh.pallipadi@intel.com	f0970c13b6	x86: PAT phys_mem_access_prot_allowed for dev/mem mmap Introduce phys_mem_access_prot_allowed(), which checks whether the mapping is possible, without any conflicts and returns success or failure based on that. phys_mem_access_prot() by itself does not allow failure case. This ability to return error is needed for PAT where we may have aliasing conflicts. x86 setup __HAVE_PHYS_MEM_ACCESS_PROT and move x86 specific code out of /dev/mem into arch specific area. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:40:47 +02:00
venkatesh.pallipadi@intel.com	e045fb2a98	x86: PAT avoid aliasing in /dev/mem read/write Add xlate and unxlate around /dev/mem read/write. This sets up the mapping that can be used for /dev/mem read and write without aliasing worries. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:40:47 +02:00
Arjan van de Ven	ae531c26c5	x86: introduce /dev/mem restrictions with a config option This patch introduces a restriction on /dev/mem: Only non-memory can be read or written unless the newly introduced config option is set. The X server needs access to /dev/mem for the PCI space, but it doesn't need access to memory; both the file permissions and SELinux permissions of /dev/mem just make X effectively super-super powerful. With the exception of the BIOS area, there's just no valid app that uses /dev/mem on actual memory. Other popular users of /dev/mem are rootkits and the like. (note: mmap access of memory via /dev/mem was already not allowed since a really long time) People who want to use /dev/mem for kernel debugging can enable the config option. The restrictions of this patch have been in the Fedora and RHEL kernels for at least 4 years without any problems. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:40:47 +02:00
Ingo Molnar	a4928cffe6	"make namespacecheck" fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-24 23:15:44 +02:00
Linus Torvalds	ec965350bb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel: (62 commits) sched: build fix sched: better rt-group documentation sched: features fix sched: /debug/sched_features sched: add SCHED_FEAT_DEADLINE sched: debug: show a weight tree sched: fair: weight calculations sched: fair-group: de-couple load-balancing from the rb-trees sched: fair-group scheduling vs latency sched: rt-group: optimize dequeue_rt_stack sched: debug: add some debug code to handle the full hierarchy sched: fair-group: SMP-nice for group scheduling sched, cpuset: customize sched domains, core sched, cpuset: customize sched domains, docs sched: prepatory code movement sched: rt: multi level group constraints sched: task_group hierarchy sched: fix the task_group hierarchy for UID grouping sched: allow the group scheduler to have multiple levels sched: mix tasks and groups ...	2008-04-21 15:40:24 -07:00

1 2 3 4 5 ...

438 Commits