This provides a __flush_anon_page() that handles both the aliasing and
non-aliasing cases. This fixes up some crashes with heavy
get_user_pages() users.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This plugs in a BUG_ON() in kmap_coherent() for PG_dcache_dirty pages
to catch when things go horribly wrong. Copied from the MIPS
implementation.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The TLB miss fast-path presently calls in to update_mmu_cache() to
set up the entry, and does so with a NULL vma. Check for vma validity
in the __update_tlb() ptrace checks.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This splits out a separate __update_cache()/__update_tlb() for
update_mmu_cache() to wrap in to. This lets us share the common
__update_cache() bits while keeping special __update_tlb() handling
broken out.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that the SH-4 page clear/copy ops are generic, they can be used for
all platforms with CONFIG_MMU=y. SH-5 remains the odd one out, but it too
will gradually be converted over to using this interface.
SH-3 platforms which do not contain aliases will see no impact from this
change, while aliasing SH-3 platforms will get the same interface as
SH-4.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This wires up clear_user_highpage() on SH-4 and subsequently converts the
SH7705 32kB cache mode over to using it. Now that the SH-4 implementation
handles all of the dcache purging directly in the aliasing case, there is
no need to do this in the default clear_page() implementation.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This inverts the delayed dcache flush a bit to be more in line with other
platforms. At the same time this also gives us the ability to do some
more optimizations and cleanup. Now that the update_mmu_cache() callsite
only tests for the bit, the implementation can gradually be split out and
made generic, rather than relying on special implementations for each of
the peculiar CPU types.
SH7705 in 32kB mode and SH-4 still need slightly different handling, but
this is something that can remain isolated in the varying page copy/clear
routines. On top of that, SH-X3 is dcache coherent, so there is no need
to bother with any of these tests in the PTEAEX version of
update_mmu_cache(), so we kill that off too.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The last commit changed the behaviour on kernel faults when we were
doing something other than syncing the page tables. vmalloc_sync_one()
needs to return NULL if the page tables are up to date, because the
reason for the fault was not a missing/inconsitent page table entry. By
returning NULL if the page tables are sync'd we signal to the calling
function that further work must be done to resolve this fault.
Also, remove the superfluous __va() around the first argument to
vmalloc_sync_one(). The value of pgd_k is already a virtual address and
using it wth __va() causes a NULL dereference.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT
This will make hardirq.h inclusion cheaper for every PREEMPT=n config
(which includes allmodconfig/allyesconfig, BTW)
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This rewrites the vmalloc fault handling as per x86, which subsequently
allows for easy future tie-in for vmalloc_sync_all().
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Like the UP case, use lmb as the foundation of memory resource
management on NUMA.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds page fault instrumentation for the software performance
counters. Follows the x86 and powerpc changes.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
set_pte_phys() presently uses the global flush_tlb_one(), which locks on
SMP trying to do the IPI. As we have not even initialized the other CPUs
at this point, switch to the local_ variant so the flush happens on the
boot CPU.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This allows the callers to now pass down the full set of FAULT_FLAG_xyz
flags to handle_mm_fault(). All callers have been (mechanically)
converted to the new calling convention, there's almost certainly room
for architectures to clean up their code and then add FAULT_FLAG_RETRY
when that support is added.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arch/sh has a couple of stray markers without any users introduced
in commit 3d58695edb. Remove them in
preparation of removing the markers in favour of the TRACE_EVENT
macro (and also because we don't keep dead code around).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This kills off after_bootmem and switches to using slab_is_available()
instead. Presently the only place this is used is by the sh64 ioremap,
and there's not much point in keeping the reference around otherwise.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Several platforms want to be able to do large physically contiguous
allocations (primarily nommu and video codecs on SH-Mobile), provide a
MAX_ORDER override for those cases.
Tested-by: Conrad Parker <conrad@metadecks.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Consolidate these in a single place in the Kconfig menus. At the same
time, disable their interactivity and set them according to the board
config defaults.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently this is special-cased for early initialization. While there are
situations where these static early initializations are still necessary,
with minor changes it is possible to use this for the regular ioremap
implementation as well. This allows us to kill off the special-casing for
the remap completely and to start tidying up all of the SH-5
special-casing in drivers.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently shm_align_mask is only looked at for the bottom up case, but we
still want this for proper colouring constraints in the topdown case.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Not all PCI channels have non-translatable memory windows, this is a
special property of the on-chip PCIC with its 0xfd00... mapping, handle
this explicitly.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch changes the code to use __is_pci_memory() instead of
is_pci_memaddr(). __is_pci_memory() loops through all the pci
channels on the system to match memory windows.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This prevents the DMA API debugging from running out of entries right
away on boot. Defines 4096 entries by default, which while a bit on the
heavy side, ought to leave enough breathing room for some time.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Forcing direct-mapped worked on certain older 2-way set associative
parts, but was always error prone on 4-way parts. As these are the
norm these days, there is not much point in continuing to support this
mode. Most of the folks that used direct-mapped mode generally just
wanted writethrough caching in the first place..
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
While harmless, PTEA has different semantics on these parts, and is only
used in extended TLB mode. Kill off the legacy support.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds support for extended ASIDs (up to 16-bits) on newer SH-X3 cores
that implement the PTAEX register and respective functionality. Presently
only the 65nm SH7786 (90nm only supports legacy 8-bit ASIDs).
The main change is in how the PTE is written out when loading the entry
in to the TLB, as well as in how the TLB entry is selectively flushed.
While SH-X2 extended mode splits out the memory-mapped U and I-TLB data
arrays for extra bits, extended ASID mode splits out the address arrays.
While we don't use the memory-mapped data array access, the address
array accesses are necessary for selective TLB flushes, so these are
implemented newly and replace the generic SH-4 implementation.
With this, TLB flushes in switch_mm() are almost non-existent on newer
parts.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements preliminary suspend/resume support for the PMB.
Signed-off-by: Francesco Virlinzi <francesco.virlinzi@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This provides a method for supporting fixed PMB mappings inherited from
the bootloader, as an alternative to the dynamic PMB mapping currently
used by the kernel. In the future these methods will be combined.
P1/P2 area is handled like a regular 29-bit physical address, and local
bus device are assigned P3 area addresses.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Fix iounmap() of pass-through P4 addresses. Without this patch
iounmap() on the sh7780 rtc area results in a warning message.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
If the NULL test is necessary, then the dereference should be moved below
the NULL test.
The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/).
// <smpl>
@disable is_null@
identifier f;
expression E;
identifier fld;
statement S;
@@
+ if (E == NULL) S
f(...,E->fld,...);
- if (E == NULL) S
@@
identifier f;
expression E;
identifier fld;
statement S;
@@
+ if (!E) S
f(...,E->fld,...);
- if (!E) S
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Show node to memory section relationship with symlinks in sysfs
Add /sys/devices/system/node/nodeX/memoryY symlinks for all
the memory sections located on nodeX. For example:
/sys/devices/system/node/node1/memory135 -> ../../memory/memory135
indicates that memory section 135 resides on node1.
Also revises documentation to cover this change as well as updating
Documentation/ABI/testing/sysfs-devices-memory to include descriptions
of memory hotremove files 'phys_device', 'phys_index', and 'state'
that were previously not described there.
In addition to it always being a good policy to provide users with
the maximum possible amount of physical location information for
resources that can be hot-added and/or hot-removed, the following
are some (but likely not all) of the user benefits provided by
this change.
Immediate:
- Provides information needed to determine the specific node
on which a defective DIMM is located. This will reduce system
downtime when the node or defective DIMM is swapped out.
- Prevents unintended onlining of a memory section that was
previously offlined due to a defective DIMM. This could happen
during node hot-add when the user or node hot-add assist script
onlines _all_ offlined sections due to user or script inability
to identify the specific memory sections located on the hot-added
node. The consequences of reintroducing the defective memory
could be ugly.
- Provides information needed to vary the amount and distribution
of memory on specific nodes for testing or debugging purposes.
Future:
- Will provide information needed to identify the memory
sections that need to be offlined prior to physical removal
of a specific node.
Symlink creation during boot was tested on 2-node x86_64, 2-node
ppc64, and 2-node ia64 systems. Symlink creation during physical
memory hot-add tested on a 2-node x86_64 system.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Split pages returned by dma_alloc_coherent() and make sure
we free them one by one.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This migrates from the old bitrotted kgdb stub implementation and moves
to the generic stub. In the process support for SH-2/SH-2A is also added,
which the old stub never provided.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This converts the sh64 /proc/asids entry to debugfs and enables it for
all SH parts that have debugfs enabled.
On MMU systems this can be used to determine which processes are using
which ASIDs which in turn can be used for finer grained cache tag
analysis.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds a pass-through case when ioremapping P4 addresses.
Addresses passed to ioremap() should be physical addresses, so the
best option is usually to convert the virtual address to a physical
address before calling ioremap. This will give you a virtual address
in P2 which matches the physical address and this works well for
most internal hardware blocks on the SuperH architecture.
However, some hardware blocks must be accessed through P4. Converting
the P4 address to a physical and then back to a P2 does not work. One
example of this is the sh7722 TMU block, it must be accessed through P4.
Without this patch P4 addresses will be mapped using PTEs which
requires the page allocator to be up and running.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
With the PMB enabled, only P1SEG and up are covered by the PMB mappings,
meaning that situations where out-of-bounds physical addresses are read
from will lead to TLB reset after the PMB miss, allowing for use cases
like dd if=/dev/mem to reset the TLB.
Fix this up to make sure the reference is between __MEMORY_START (phys)
and __pa(high_memory). This is coherent across all variants of sh/sh64
with and without MMU, though the PMB bug itself is only applicable to
SH-4A parts.
Reported-by: Hideo Saito <saito@densan.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
There was a race in the kmap_coherent() implementation. While we
guarded against preemption, there was nothing preventing eviction of
the pre-faulted fixmap entry from the UTLB. Under certain workloads
this would result in the fixmap entries used for cache colouring being
evicted from the UTLB in the midst of a copy_page().
In addition to pre-faulting, we also make sure to preserve the PTEs
in the kernel page table and introduce a cached PTE for kmap_coherent()
usage. This follows a similar change on MIPS ("[MIPS] Fix aliasing bug
in copy_to_user_page / copy_from_user_page").
Reported-by: Hideo Saito <saito@densan.co.jp>
Reported-by: CHIKAMA Masaki <masaki.chikama@gmail.com>
Tested-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Use the generic remove_memory() provided by mm/memory_hotplug.c instead.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Remove left overs from the generic declared coherent rework.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
debugfs_create_file() returns NULL if an error occurs, returns -ENODEV
when debugfs is not enabled in the kernel.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements a few trace points across events that are deemed
interesting. This implements a number of trace points:
- The page fault handler / TLB miss
- IPC calls
- Kernel thread creation
The original LTTng patch had the slow-path instrumented, which
fails to account for the vast majority of events. In general
placing this in the fast-path is not a huge performance hit, as
we don't take page faults for kernel addresses.
The other bits of interest are some of the other trap handlers, as
well as the syscall entry/exit (which is better off being handled
through the tracehook API). Most of the other trap handlers are corner
cases where alternate means of notification exist, so there is little
value in placing extra trace points in these locations.
Based on top of the points provided both by the LTTng instrumentation
patch as well as the patch shipping in the ST-Linux tree, albeit in a
stripped down form.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This follows the powerpc commit f6a616800e
'[POWERPC] Fix kernel stack allocation alignment'.
SH has traditionally forced the thread order to be relative to the page
size, so there were never any situations where the same bug was
triggered by slub. Regardless, the usage of > 8kB stacks for the larger
page sizes is overkill, so we switch to using slab allocations there,
as per the powerpc change.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This uses jump_to_uncached() which is now given the noinline attribute
due to the special section mapping. Kill off the inline attribute to
fix up compilation failure.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Because alloc_bootmem functions return the allocated memory always
zeroed, an additional call of memset on allocated memory is
unnecessary.
Signed-off-by: Marek Skuczynski <M.Skuczynski@adbglobal.com>
Signed-off-by: Carl Shaw <carl.shaw@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This fixes a problem in the code which copies the vmalloc portion of the
kernel's page table into the current user space page table. The addition
of the four level page table code breaks on folded page tables, because
the pud level is always present (although folded). This updates the code
to use the same style of updates for the pud as is used for the pgd
level.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
statically initialise the cached_to_uncached offset, so that we can use
it immediatly.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch kills a section mismatch for platform_resource_setup_memory().
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Allow user to pass parameters on kernel command line to override
default size for physically contiguous memory buffers. The default
VPU buffer size is too small for VGA harware encoding, but instead
of just bumping up the number we allow the user to override the
default size using the command line. Supports SuperH Mobile hardware
blocks such as VEU, VPU and CEU.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently we oops in mm/hugetlb.c:1325, which is the order == 0 test in
hugetlb_add_hstate() called at initialization time. So, disable 64kB
huge pages when we're using a 64kB PAGE_SIZE. On most parts this will
force the default to be 1MB huge pages.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (28 commits)
mm/hugetlb.c must #include <asm/io.h>
video: Fix up hp6xx driver build regressions.
sh: defconfig updates.
sh: Kill off stray mach-rsk7203 reference.
serial: sh-sci: Fix up SH7760/SH7780/SH7785 early printk regression.
sh: Move out individual boards without mach groups.
sh: Make sure AT_SYSINFO_EHDR is exposed to userspace in asm/auxvec.h.
sh: Allow SH-3 and SH-5 to use common headers.
sh: Provide common CPU headers, prune the SH-2 and SH-2A directories.
sh/maple: clean maple bus code
sh: More header path fixups for mach dir refactoring.
sh: Move out the solution engine headers to arch/sh/include/mach-se/
sh: I2C fix for AP325RXA and Migo-R
sh: Shuffle the board directories in to mach groups.
sh: dma-sh: Fix up dreamcast dma.h mach path.
sh: Switch KBUILD_DEFCONFIG to shx3_defconfig.
sh: Add ARCH_DEFCONFIG entries for sh and sh64.
sh: Fix compile error of Solution Engine
sh: Proper __put_user_asm() size mismatch fix.
sh: Stub in a dummy ENTRY_OFFSET for uImage offset calculation.
...
This follows the sparc changes a439fe51a1.
Most of the moving about was done with Sam's directions at:
http://marc.info/?l=linux-sh&m=121724823706062&w=2
with subsequent hacking and fixups entirely my fault.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
fix the problem that cannot boot using uImage when PAGE_SIZE is
8kbyte or 64kbyte.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Commit 1ea0704e0d
(mm: add a ptep_modify_prot transaction abstraction)
triggered on sh build errors like the following:
<-- snip -->
...
CC arch/sh/mm/pg-sh4.o
cc1: warnings being treated as errors
include2/asm/pgtable.h:139: error: 'ptep_get_and_clear' declared inline after being called
include2/asm/pgtable.h:139: error: previous declaration of 'ptep_get_and_clear' was here
make[2]: *** [arch/sh/mm/pg-sh4.o] Error 1
<-- snip -->
Since there's no good reason for marking these global functions as
"inline" this patch therefore removes the inline's.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds physically contiguous memory chunks to the UIO devices.
The same strategy can be used in the future for the CEU as well.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Remove inline from ptep_get_and_clean() to match with header file prototype.
Makes linux-next build.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The current kernel behaviour is to reenable interrupts unconditionally
when taking a page fault. This patch changes this to only enable them
if interrupts were previously enabled.
It also fixes a problem seen with this fix in place: the kernel previously
flushed the vsyscall page when handling a signal, which is not only
unncessary, but caused a possible sleep with interrupts disabled.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add implementation of flush_icache_range() suitable for signal handler
and kprobes. Remove flush_cache_sigtramp() and change signal.c to use
flush_icache_range().
Signed-off-by: Chris Smith <chris.smith@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16kB is a useful size on nommu, while 64kB still tends to be too big to
be useful. Newer MMUs are likely to support this as well, so plug it
in in anticipation of those, too.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
PAGE_SIZE doesn't need to be fixed at 4096 on nommu, so stub in a !MMU
case for the various PAGE_SIZE Kconfig options.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
When using single_open(), single_release() should be used instead
of seq_release(), otherwise there is a memory leak.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- pages in slab, printed by show_free_areas()
- free swap pages, printed by show_swap_cache_info()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kmem cache passed to constructor is only needed for constructors that are
themselves multiplexeres. Nobody uses this "feature", nor does anybody uses
passed kmem cache in non-trivial way, so pass only pointer to object.
Non-trivial places are:
arch/powerpc/mm/init_64.c
arch/powerpc/mm/hugetlbpage.c
This is flag day, yes.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Matt Mackall <mpm@selenic.com>
[akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
[akpm@linux-foundation.org: fix mm/slab.c]
[akpm@linux-foundation.org: fix ubifs]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Almost all users of this field need a PFN instead of a physical address,
so replace node_boot_start with node_min_pfn.
[Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()]
Signed-off-by: Johannes Weiner <hannes@saeureba.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Straight forward extensions for huge pages located in the PUD instead of
PMDs.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The goal of this patchset is to support multiple hugetlb page sizes. This
is achieved by introducing a new struct hstate structure, which
encapsulates the important hugetlb state and constants (eg. huge page
size, number of huge pages currently allocated, etc).
The hstate structure is then passed around the code which requires these
fields, they will do the right thing regardless of the exact hstate they
are operating on.
This patch adds the hstate structure, with a single global instance of it
(default_hstate), and does the basic work of converting hugetlb to use the
hstate.
Future patches will add more hstate structures to allow for different
hugetlbfs mounts to have different page sizes.
[akpm@linux-foundation.org: coding-style fixes]
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are a lot of places that define either a single bootmem descriptor or an
array of them. Use only one central array with MAX_NUMNODES items instead.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add physical memory resources such as System RAM, Kernel code/data/bss
and reserved crash dump area to /proc/iomem. Same strategy as on x86.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
All architectures use an effectively identical definition of online_page(), so
just make it common code. x86-64, ia64, powerpc and sh are actually
identical; x86-32 is slightly different.
x86-32's differences arise because it puts its hotplug pages in the highmem
zone. We can handle this in the generic code by inspecting the page to see if
its in highmem, and update the totalhigh_pages count appropriately. This
leaves init_32.c:free_new_highpage with a single caller, so I folded it into
add_one_highpage_init.
I also removed an incorrect comment referring to the NUMA case; any NUMA
details have already been dealt with by the time online_page() is called.
[akpm@linux-foundation.org: fix indenting]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com>
Tested-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Christoph Lameter <clameter@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There's still work that needs to be done here, and this should not be
enabled by default on existing boards.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch fixes the following compile error:
<-- snip -->
...
CC arch/sh/mm/pg-sh7705.o
/home/bunk/linux/kernel-2.6/git/linux-2.6/arch/sh/mm/pg-sh7705.c: In function 'ptep_get_and_clear':
/home/bunk/linux/kernel-2.6/git/linux-2.6/arch/sh/mm/pg-sh7705.c:130: error: implicit declaration of function 'mapping_writably_mapped'
make[2]: *** [arch/sh/mm/pg-sh7705.o] Error 1
<-- snip -->
Signed-off-by: Adrian Bunk <adrian.bunk@movial.fi>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This acts as a reversion of 1c6b2ca5e0 in
the case of UP SH-4, where we still have the risk of a multiple hit
between the slow and fast paths. As seen on SH7780.
Signed-off-by: Hideo Saito <saito@densan.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Previously this took an explicit range, update this to use the same
behaviour as the rest of the SH parts where we simply flush out a line
from the start address.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The idea is that we want to get rid of the in/out/readb/writeb callbacks from
the machvec and replace that with simple inline read and write operations to
memory. Fast and simple for most hardware devices (think pci).
Some devices require special treatment though - like 16-bit only CF devices -
so we need to have some method to hook in callbacks.
This patch makes it possible to add a per-device trap generating filter. This
way we can get maximum performance of sane hardware - which doesn't need this
filter - and crappy hardware works but gets punished by a performance hit.
V2 changes things around a bit and replaces io access callbacks with a
simple minimum_bus_width value. In the future we can add stride as well.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch fixes the recently introduced declared coherent memory support.
Without this fix a cached memory area is returned by dma_alloc_coherent() -
unless dma_declare_coherent_memory() has setup a separate area.
This patch makes sure an uncached memory area is returned. With this patch
it is now possible to ping through an rtl8139 interface on r2d-plus.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patchset adds a flags variable to reserve_bootmem() and uses the
BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions
between crashkernel area and already used memory.
This patch:
Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE.
If that flag is set, the function returns with -EBUSY if the memory already
has been reserved in the past. This is to avoid conflicts.
Because that code runs before SMP initialisation, there's no race condition
inside reserve_bootmem_core().
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix powerpc build]
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: <linux-arch@vger.kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds declared coherent memory support to the sh architecture. All
functions are based on the x86 implementation. Header files are adjusted to
use the new functions instead of the former consistent_alloc() code.
This version includes the few changes what were included in the fix patch
together with modifications based on feedback from Paul.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This saves us from having to use kmalloc() for the fixmap entries,
which is needed early for the uncached fixmap.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently most of the 29-bit physical parts do P1/P2 segmentation
with a 1:1 cached/uncached mapping, jumping between the two to
control the caching behaviour. This provides the basic infrastructure
to maintain this behaviour on 32-bit physical parts that don't map
P1/P2 at all, using a shiny new linker section and corresponding
fixmap entry.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
With the refactored update_mmu_cache() introduced in older kernels,
there's no longer any need to take the page_table_lock in this path,
so simply drop it completely.
Without this, performance degradation is seen on SMP on heavily
threaded workloads that don't use the split ptlock, and ultimately
we have no reason to contend for the lock in the first place.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The __do_page_fault() fast-path contains a UTLB flush in order to
force an ITLB reload, this isn't needed in practice as the ITLB is
auto-reloaded from the UTLB anyways, which is already displaced by
the manual 'ldtlb' in the update_mmu_cache() path.
This provides a measurable speed up in the TLB miss fast-path.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that copy_to_user_page()/copy_from_user_page() are wired up, we
can drop the old __copy_xxx() implementations. Now that the page
colouring scheme has changed via kmap_coherent(), we can avoid the
flush in these specific helpers.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This moves copy_{to,from}_user_page() out-of-line on SH-4 and
converts for the kmap_coherent() API. Based on the MIPS
implementation.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
With the kmap_coherent() API in place, this is trivial to implement,
and lets us avoid the cache flush in certain cases.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The ST40 stuff in-tree hasn't built for some time, and hasn't been
updated for over 3 years. ST maintains their own out-of-tree changes
and rebases occasionally, and that's ultimately where all of the ST40
users go anyways.
In order for the ST40 code to be brought up to date most of the stuff
removed in this changeset would have to be rewritten anyways, so there's
very little benefit in keeping the remnants around either.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
movca.l is restricted to SH-4 and up only, though compilers that
are unable to support ISA tuning (especially older versions of
binutils) will happily compile in the bogus opcode on older parts.
Conditionalize it to fix SH-3 regressions noted by Kristoffer.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
is_init() is an ambiguous name for the pid==1 check. Split it into
is_global_init() and is_container_init().
A cgroup init has it's tsk->pid == 1.
A global init also has it's tsk->pid == 1 and it's active pid namespace
is the init_pid_ns. But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.
Changelog:
2.6.22-rc4-mm2-pidns1:
- Use 'init_pid_ns.child_reaper' to determine if a given task is the
global init (/sbin/init) process. This would improve performance
and remove dependence on the task_pid().
2.6.21-mm2-pidns2:
- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
ppc,avr32}/traps.c for the _exception() call to is_global_init().
This way, we kill only the cgroup if the cgroup's init has a
bug rather than force a kernel panic.
[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: Pavel Emelianov <xemul@openvz.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Herbert Poetzel <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dma_cache_(wback|inv|wback_inv) were the earliest attempt on a generalized
cache managment API for I/O purposes. Originally it was basically the raw
MIPS low level cache API exported to the entire world. The API has
suffered from a lack of documentation, was not very widely used unlike it's
more modern brothers and can easily be replaced by dma_cache_sync. So
remove it rsp. turn the surviving bits back into an arch private API, as
discussed on linux-arch.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Slab constructors currently have a flags parameter that is never used. And
the order of the arguments is opposite to other slab functions. The object
pointer is placed before the kmem_cache pointer.
Convert
ctor(void *object, struct kmem_cache *s, unsigned long flags)
to
ctor(struct kmem_cache *s, void *object)
throughout the kernel
[akpm@linux-foundation.org: coupla fixes]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now, arch dependent code around CONFIG_MEMORY_HOTREMOVE is a mess.
This patch cleans up them. This is against 2.6.23-rc6-mm1.
- fix compile failure on ia64/ CONFIG_MEMORY_HOTPLUG && !CONFIG_MEMORY_HOTREMOVE case.
- For !CONFIG_MEMORY_HOTREMOVE, add generic no-op remove_memory(),
which returns -EINVAL.
- removed remove_pages() only used in powerpc.
- removed no-op remove_memory() in i386, sh, sparc64, x86_64.
- only powerpc returns -ENOSYS at memory hot remove(no-op). changes it
to return -EINVAL.
Note:
Currently, only ia64 supports CONFIG_MEMORY_HOTREMOVE. I welcome other
archs if there are requirements and testers.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have had complaints where a threaded application is left in a bad state
after one of it's threads is killed when we hit a VM: out_of_memory
condition.
Killing just one of the process threads can leave the application in a bad
state, whereas killing the entire process group would allow for the
application to restart, or be otherwise handled, and makes it very obvious
that something has gone wrong.
This change allows the entire process group to be taken down, rather
than just the one thread.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This implements a fast-path for small (less than 12 bytes) copies,
with the existing path treated as the slow-path and left as the default
behaviour for all other copy sizes.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
When using URAM in NUMA mode another active region is needed.
Bump this up so we don't trigger the region truncation in
add_active_range().
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
4kB pages are unstable on extended mode TLB, it's recommended
that TLB compat mode be used when using a 4kB PAGE_SIZE. Set
the default for extended mode to 8kB.
This should have negligible impact, as other than the extra swap
cache entry bits, there's no reason to use the extended mode TLB
with 4kB pages.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>