Yanmin reported that both tbench and hackbench were significantly
hurt by trying to keep tasks local on these domains, esp on small
cache machines.
So disable it in order to promote spreading outside of the cache
domains.
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1255083400.8802.15.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit 9bcbdd9c58.
The real bug producing LatencyTop latencies has been fixed in:
f5dc375: sched: Update the clock of runqueue select_task_rq() selected
And the commit being reverted here triggers local timer processing
from every device IRQ. If device IRQs come in at a high frequency,
this could cause a performance regression.
The commit being reverted here purely 'fixed' the reported latency
as a side effect, because CPUs were being moved out of idle more
often.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Frans Pop <elendil@planet.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <20091008064041.67219b13@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'sparc-perf-events-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
mm, perf_event: Make vmalloc_user() align base kernel virtual address to SHMLBA
perf_event: Provide vmalloc() based mmap() backing
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, timers: Check for pending timers after (device) interrupts
NOHZ: update idle state also when NOHZ is inactive
Now that range timers and deferred timers are common, I found a
problem with these using the "perf timechart" tool. Frans Pop also
reported high scheduler latencies via LatencyTop, when using
iwlagn.
It turns out that on x86, these two 'opportunistic' timers only get
checked when another "real" timer happens. These opportunistic
timers have the objective to save power by hitchhiking on other
wakeups, as to avoid CPU wakeups by themselves as much as possible.
The change in this patch runs this check not only at timer
interrupts, but at all (device) interrupts. The effect is that:
1) the deferred timers/range timers get delayed less
2) the range timers cause less wakeups by themselves because
the percentage of hitchhiking on existing wakeup events goes up.
I've verified the working of the patch using "perf timechart", the
original exposed bug is gone with this patch. Frans also reported
success - the latencies are now down in the expected ~10 msec
range.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Tested-by: Frans Pop <elendil@planet.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091008064041.67219b13@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
agp: parisc-agp.c - use correct page_mask function
parisc: Fix linker script breakage.
parisc: convert to asm-generic/hardirq.h
parisc: Make THREAD_SIZE available to assembly files and linker scripts.
parisc: correct use of SHF_ALLOC
parisc: rename parisc's vmalloc_start to parisc_vmalloc_start
parisc: add me to Maintainers
parisc: includecheck fix: signal.c
parisc: HAVE_ARCH_TRACEHOOK
parisc: add skeleton syscall.h
parisc: stop using task->ptrace for {single,block}step flags
parisc: split syscall_trace into two halves
parisc: add missing TI_TASK macro in syscall.S
parisc: tracehook_signal_handler
parisc: tracehook_report_syscall
I was using Coccinelle with the mutex_unlock semantic patch, and it
unconvered this problem. It appears to be a valid missing unlock issue.
This change should correct it by moving the unlock below the label.
This patch is against the mainline kernel.
Cc: Julia Lawall <julia@diku.dk>
Cc: Hiroshi DOYU <Hiroshi.DOYU@nokia.com>
Signed-off-by: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
the original flush operation is to flush the function address which is
copied from.
But we do not change the function code and it is not necessary to flush it.
Signed-off-by: janboe <janboe.ye@gmail.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Some architectures such as Sparc, ARM and MIPS (basically
everything with flush_dcache_page()) need to deal with dcache
aliases by carefully placing pages in both kernel and user maps.
These architectures typically have to use vmalloc_user() for this.
However, on other architectures, vmalloc() is not needed and has
the downsides of being more restricted and slower than regular
allocations.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: David Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1254830228.21044.272.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Lock DPLL5 at 120MHz at boot. The USBHOST 120MHz f-clock and
USBTLL f-clock are the only users of this DPLL, and 120MHz is
is the only recommended rate for these clocks.
With this patch, the 60 MHz ULPI clock is generated correctly.
Tested on an OMAP3430 SDP.
Signed-off-by: Rajendra Nayak <rnayak@ti.com>
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Commit cd92204924 added
support for omap850. However, the patch accidentally
removed the wrong ifdef:
# define cpu_is_omap730() 1
# endif
#endif
+#else
+# if defined(CONFIG_ARCH_OMAP850)
+# undef cpu_is_omap850
+# define cpu_is_omap850() 1
+# endif
+#endif
...
void omap2_check_revision(void);
#endif /* defined(CONFIG_ARCH_OMAP2) || defined(CONFIG_ARCH_OMAP3) */
-
-#endif
Instead of removing removing the #endif at the end of the file,
the #endif before #else should have been removed.
But we cannot have multiple #else statements as pointed out by
Alistair Buxton <a.j.buxton@gmail.com>. So the fix is to:
- remove the non-multi-omap special handling, as we need to
detect between omap730 and omap850 anyways.
- add the missing #endif back to the end of the file
Reported-by: Sanjeev Premi <premi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
* 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: add support for change_pte mmu notifiers
KVM: MMU: add SPTE_HOST_WRITEABLE flag to the shadow ptes
KVM: MMU: dont hold pagecount reference for mapped sptes pages
KVM: Prevent overflow in KVM_GET_SUPPORTED_CPUID
KVM: VMX: flush TLB with INVEPT on cpu migration
KVM: fix LAPIC timer period overflow
KVM: s390: fix memsize >= 4G
KVM: SVM: Handle tsc in svm_get_msr/svm_set_msr correctly
KVM: SVM: Fix tsc offset adjustment when running nested
* 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Clear sticky FSR register after saving it to func parametr
microblaze: UMS is used only for MMU kernel
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc: using HZ needs an include of linux/param.h
sparc32: convert to asm-generic/hardirq.h
sparc64: Cache per-cpu %pcr register value in perf code.
sparc64: Fix comment typo in perf_event.c
sparc64: Minor coding style fixups in perf code.
sparc64: Add a basic conflict engine in preparation for multi-counter support.
sparc64: Increase vmalloc size to fix percpu regressions.
sparc64: Add initial perf event conflict resolution and checks.
sparc: Niagara1 perf event support.
sparc: Add Niagara2 HW cache event support.
sparc: Support all ultra3 and ultra4 derivatives.
sparc: Support HW cache events.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68knommu: fix rename of pt_regs offset defines breakage
m68knommu: remove duplicated #include
m68knommu: show KiB rather than pages in "Freeing initrd memory:" message
The 'pwrdm_for_each()' function walks powerdomains with a spinlock
locked, so the the callbacks cannot do anything which may sleep.
This patch introduces a 'pwrdm_for_each_nolock()' helper which does
the same, but without the spinlock locked. This fixes the following
lockdep warning:
[ 0.000000] WARNING: at kernel/lockdep.c:2460 lockdep_trace_alloc+0xac/0xec()
[ 0.000000] Modules linked in:
(unwind_backtrace+0x0/0xdc) from [<c0045464>] (warn_slowpath_common+0x48/0x60)
(warn_slowpath_common+0x48/0x60) from [<c0067dd4>] (lockdep_trace_alloc+0xac/0xec)
(lockdep_trace_alloc+0xac/0xec) from [<c009da14>] (kmem_cache_alloc+0x1c/0xd0)
(kmem_cache_alloc+0x1c/0xd0) from [<c00b21d8>] (d_alloc+0x1c/0x1a4)
(d_alloc+0x1c/0x1a4) from [<c00a887c>] (__lookup_hash+0xd8/0x118)
(__lookup_hash+0xd8/0x118) from [<c00a9f20>] (lookup_one_len+0x84/0x94)
(lookup_one_len+0x84/0x94) from [<c010d12c>] (debugfs_create_file+0x8c/0x20c)
(debugfs_create_file+0x8c/0x20c) from [<c010d320>] (debugfs_create_dir+0x1c/0x20)
(debugfs_create_dir+0x1c/0x20) from [<c000e8cc>] (pwrdms_setup+0x60/0x90)
(pwrdms_setup+0x60/0x90) from [<c002e010>] (pwrdm_for_each+0x30/0x80)
(pwrdm_for_each+0x30/0x80) from [<c000e79c>] (pm_dbg_init+0x7c/0x14c)
(pm_dbg_init+0x7c/0x14c) from [<c00232b4>] (do_one_initcall+0x5c/0x1b8)
(do_one_initcall+0x5c/0x1b8) from [<c00083f8>] (kernel_init+0x90/0x10c)
(kernel_init+0x90/0x10c) from [<c00242c4>] (kernel_thread_exit+0x0/0x8)
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Currently, only GPIOs in the wakeup domain (GPIOs in bank 0) are
enabled as wakups. This patch also enables GPIOs in the PER
powerdomain (banks 2-6) to be used as possible wakeup sources.
In addition, this patch ensures that all GPIO wakeups can wakeup
the MPU using the PM_MPUGRPSEL_<pwrdm> registers.
NOTE: this doesn't enable the individual GPIOs as wakeups, this simply
enables the per-bank wakeups at the powerdomain level.
This problem was discovered by Mike Chan when preventing the CORE
powerdomain from going into retention/off. When CORE was allowed to
hit retention, GPIO wakeups via IO pad were working fine, but when
CORE remained on, GPIO module-level wakeups were not working properly.
To test, prevent CORE from going inactive/retention/off, thus
preventing the IO chain from being armed:
# echo 3 > /debug/pm_debug/core_pwrdm/suspend
This ensures that GPIO wakeups happen via module-level wakeups and
not via IO pad.
Tested on 3430SDP using the touchscreen GPIO (gpio 2, in WKUP)
Tested on Zoom2 using the QUART interrup GPIO (gpio 102, in PER)
Also, c.f. OMAP PM wiki for troubleshooting GPIO wakeup issues:
http://elinux.org/OMAP_Power_Management
Reported-by: Mike Chan <mikechan@google.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
USBHOST module has 2 fclocks (for HOST1 and HOST2), only one iclock
and only a single bit in the WKST register to indicate a wakeup event.
Because of the single WKST bit, we cannot know whether a wakeup event
was on HOST1 or HOST2, so enable both fclocks before clearing the
wakeup event to ensure both hosts can properly clear the event.
Signed-off-by: Vikram Pandita <vikram.pandita@ti.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Clearing wakeup sources is now only done when the PRM indicates a
wakeup source interrupt. Since we don't handle any other types of
PRCM interrupts right now, warn if we get any other type of PRCM
interrupt. Either code needs to be added to the PRCM interrupt
handler to react to these, or these other interrupts should be masked
off at init.
Updated after Jon Hunter's PRCM IRQ rework by Kevin Hilman.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
PM_WKST register contents should be ANDed with the contents of the
MPUGRPSEL registers. Otherwise the MPU PRCM interrupt handler could
wind up clearing wakeup events meant for the IVA PRCM interrupt
handler. A future revision to this code should be to read a cached
version of MPUGRPSEL from the powerdomain code, since PRM reads are
relatively slow.
Updated after Jon Hunter's PRCM IRQ change by Kevin Hilman
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
There are two scenarios where a race condition could result in a hang
in the prcm_interrupt handler. These are:
1). Waiting for PRM_IRQSTATUS_MPU register to clear.
Bit 0 of the PRM_IRQSTATUS_MPU register indicates that a wake-up event
is pending for the MPU. This bit can only be cleared if the all the
wake-up events latched in the various PM_WKST_x registers have been
cleared. If a wake-up event occurred during the processing of the prcm
interrupt handler, after the corresponding PM_WKST_x register was
checked but before the PRM_IRQSTATUS_MPU was cleared, then the CPU
would be stuck forever waiting for bit 0 in PRM_IRQSTATUS_MPU to be
cleared.
2). Waiting for the PM_WKST_x register to clear.
Some power domains have more than one wake-up source. The PM_WKST_x
registers indicate the source of a wake-up event and need to be cleared
after a wake-up event occurs. When the PM_WKST_x registers are read and
before they are cleared, it is possible that another wake-up event
could occur causing another bit to be set in one of the PM_WKST_x
registers. If this did occur after reading a PM_WKST_x register then
the CPU would miss this event and get stuck forever in a loop waiting
for that PM_WKST_x register to clear.
This patch address the above race conditions that would result in a
hang.
Signed-off-by: Jon Hunter <jon-hunter@ti.com>
Reviewed-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Previous patch d63678d607d0e37ec7abe5ceb545d7e8aab956a4 clear
it for noMMU kernel. This one do it for MMU.
Correct noMMU version
Signed-off-by: Michal Simek <monstr@monstr.eu>
this is needed for kvm if it want ksm to directly map pages into its
shadow page tables.
[marcelo: cast pfn assignment to u64]
Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
this flag notify that the host physical page we are pointing to from
the spte is write protected, and therefore we cant change its access
to be write unless we run get_user_pages(write = 1).
(this is needed for change_pte support in kvm)
Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
When using mmu notifiers, we are allowed to remove the page count
reference tooken by get_user_pages to a specific page that is mapped
inside the shadow page tables.
This is needed so we can balance the pagecount against mapcount
checking.
(Right now kvm increase the pagecount and does not increase the
mapcount when mapping page into shadow page table entry,
so when comparing pagecount against mapcount, you have no
reliable result.)
Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The number of entries is multiplied by the entry size, which can
overflow on 32-bit hosts. Bound the entry count instead.
Reported-by: David Wagner <daw@cs.berkeley.edu>
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>
It is possible that stale EPTP-tagged mappings are used, if a
vcpu migrates to a different pcpu.
Set KVM_REQ_TLB_FLUSH in vmx_vcpu_load, when switching pcpus, which
will invalidate both VPID and EPT mappings on the next vm-entry.
Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
commit 628eb9b8a8
KVM: s390: streamline memslot handling
introduced kvm_s390_vcpu_get_memsize. This broke guests >=4G, since this
function returned an int.
This patch changes the return value to a long.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
When running nested we need to touch the l1 guests
tsc_offset. Otherwise changes will be lost or a wrong value
be read.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
When svm_vcpu_load is called while the vcpu is running in
guest mode the tsc adjustment made there is lost on the next
emulated #vmexit. This causes the tsc running backwards in
the guest. This patch fixes the issue by also adjusting the
tsc_offset in the emulated hsave area so that it will not
get lost.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This patch fixes the m32r SMP kernel after 2.6.27.
A part of the following patch breaks m32r SMP operation.
> m32r: convert to generic helpers for IPI function calls
> commit 7b7426c8a6
In the above patch, a CALL_FUNC_SINGLE_IPI was newly introduced,
but the its IPI vector number was wrong in the patch code.
The m32r SMP kernel hanged-up during boot operation, because
the CPU_BOOT_IPI was called instead of CALL_FUNC_SINGLE_IPI
(CPU_BOOT_IPI had no side effect at that time because the 2nd
core had already been started up),
as a result, csd_unlock() was not called, then a dead lock
occurred in csd_lock_wait() after the detection of Compact Flash
memory as IDE generic disk.
Signed-off-by: Toshihiro HANAWA <hanawa@ccs.tsukuba.ac.jp>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
In case CONFIG_DISCONTIGMEM is set, the memory size of system was
always determined by CONFIG_MEMORY_SIZE and was not changeable.
This patch fixes set_memory() of arch/m32r/mm/discontig.c so that
we can specify memory size by the "mem=<size>" kernel parameter.
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Define ioread* and iowrite* macros to fix the following build errors:
CC [M] drivers/uio/uio_smx.o
drivers/uio/uio_smx.c: In function 'smx_handler':
drivers/uio/uio_smx.c:31: error: implicit declaration of function 'ioread32'
drivers/uio/uio_smx.c:37: error: implicit declaration of function 'iowrite32'
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Fix pmd_bad check code of tme_handler (TLB Miss Exception handler).
The correct _KERNPG_TABLE value is not 0x263(=611) but 0x163.
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Currently, on ARMv6 and ARMv7, if an application tries to execute
code (or garbage) on non-executable page it hangs. It caused by
incorrect prefetch abort handling. Now every prefetch abort
processes as a translation fault.
To fix this we have to analyze instruction fault status register
to figure out reason why we've got the abort and process it
accordingly.
To make IFSR different from DFSR we set bit 31 which is reserved in
both IFSR and DFSR.
This patch also tries to protect from future hangs on unexpected
exceptions. An application will be killed if unexpected exception
type was received.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Instruction fault status register, IFSR, was introduced on ARMv6 to
provide status information about the last insturction fault. It
needed for proper prefetch abort handling.
Now we have three prefetch abort model:
* legacy - for CPUs before ARMv6. They doesn't provide neither
IFSR nor IFAR. We simulate IFSR with section translation fault
status for them to generalize code;
* ARMv6 - provides IFSR, but not IFAR;
* ARMv7 - provides both IFSR and IFAR.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 1522ac3ec9
("Fix virtual to physical translation macro corner cases")
breaks the end of memory check in valid_phys_addr_range().
The modified expression results in the apparent /dev/mem size
being 2 bytes smaller than what it actually is.
This patch reworks the expression to correctly check the address,
while maintaining use of a valid address to __pa().
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>