Commit Graph

1760 Commits

Author SHA1 Message Date
Richard Weinberger
a1850e9c72 um: Get rid of thread_struct->saved_task
As we have a sane show_stack() now, we can drop
the ->saved_task hack.

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-11-17 11:27:35 +01:00
Richard Weinberger
f72c22e45e um: Make stack trace reliable against kernel mode faults
As UML uses an alternative signal stack we cannot use
the current stack pointer for stack dumping if UML itself
dies by SIGSEGV. To bypass this issue we save regs taken
from mcontext in our segv handler into thread_struct and
use these regs to obtain the stack pointer in show_stack().

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-11-17 11:27:30 +01:00
Richard Weinberger
9d1ee8ce92 um: Rewrite show_stack()
Currently on UML stack traces are not very reliable and both
x86 and x86_64 have their on implementations.
This patch unifies both and adds support to outline unreliable
functions calls.

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-11-17 11:27:25 +01:00
Kirill A. Shutemov
647f884e67 um: handle pgtable_page_ctor() fail
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15 09:32:19 +09:00
Thomas Gleixner
00d1a39e69 preempt: Make PREEMPT_ACTIVE generic
No point in having this bit defined by architecture.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130917183629.090698799@linutronix.de
2013-11-13 20:21:47 +01:00
Ingo Molnar
fb10d5b7ef Merge branch 'linus' into sched/core
Resolve cherry-picking conflicts:

Conflicts:
	mm/huge_memory.c
	mm/memory.c
	mm/mprotect.c

See this upstream merge commit for more details:

  52469b4fcd Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-01 08:24:41 +01:00
Dan Carpenter
201f99f170 uml: check length in exitcode_proc_write()
We don't cap the size of buffer from the user so we could write past the
end of the array here.  Only root can write to this file.

Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-30 12:24:49 -07:00
Peter Zijlstra
a787870924 sched, arch: Create asm/preempt.h
In order to prepare to per-arch implementations of preempt_count move
the required bits into an asm-generic header and use this for all
archs.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-25 14:07:50 +02:00
Martin Schwidefsky
0244ad004a Remove GENERIC_HARDIRQ config option
After the last architecture switched to generic hard irqs the config
options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
for !CONFIG_GENERIC_HARDIRQS can be removed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-09-13 15:09:52 +02:00
Johannes Weiner
759496ba64 arch: mm: pass userspace fault flag to generic fault handler
Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occuring in
kernel context - which could be handled more gracefully - from
user-triggered faults.

Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-12 15:38:01 -07:00
Johannes Weiner
871341023c arch: mm: do not invoke OOM killer on kernel fault OOM
Kernel faults are expected to handle OOM conditions gracefully (gup,
uaccess etc.), so they should never invoke the OOM killer.  Reserve this
for faults triggered in user context when it is the only option.

Most architectures already do this, fix up the remaining few.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-12 15:38:01 -07:00
Richard Weinberger
81bab4c38a um: Add irq chip um/mask handlers
These handlers are not optional and need in our case
dummy implementions to avoid NULL pointer bugs within
the irq core code.

Reported-and-tested-by: Toralf Foester <toralf.foerster@gmx.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-09-07 10:57:19 +02:00
Richard Weinberger
250127216d um: Run UML in it's own session.
If UML is not run by a shell it can happen that UML
will kill unrelated proceses upon a fatal exit because
it issues a kill(0, ...).
To prevent such oddities we create a new session in main().

Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-09-07 10:57:07 +02:00
Richard Weinberger
91d44ff860 um: Cleanup SIGTERM handling
Richard reported that some UML processes survive if the UML
main process receives a SIGTERM.
This issue was caused by a wrongly placed signal(SIGTERM, SIG_DFL)
in init_new_thread_signals().
It disabled the UML exit handler accidently for some processes.
The correct solution is to disable the fatal handler for all
UML helper threads/processes.
Such that last_ditch_exit() does not get called multiple times
and all processes can exit due to SIGTERM.

Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-09-07 10:56:58 +02:00
Richard Weinberger
bc1d72e73b um: ubd: Introduce submit_request()
Just a clean-up patch to remove the open coded
variants and to ensure that all requests are submitted the
same way.

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-09-07 10:56:55 +02:00
Richard Weinberger
805f11a0d5 um: ubd: Add REQ_FLUSH suppport
UML's block device driver does not support write barriers,
to support this this patch adds REQ_FLUSH suppport.
Every time the block layer sends a REQ_FLUSH we fsync() now
our backing file to guarantee data consistency.

Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-09-07 10:56:49 +02:00
Richard Weinberger
f75b1b1bed um: Implement probe_kernel_read()
UML needs it's own probe_kernel_read() to handle kernel
mode faults correctly.
The implementation uses mincore() on the host side to detect
whether a page is owned by the UML kernel process.

This fixes also a possible crash when sysrq-t is used.
Starting with 3.10 sysrq-t calls probe_kernel_read() to
read details from the kernel workers. As kernel worker are
completely async pointers may turn NULL while reading them.

Cc: <stian@nixia.no>
Cc: <tj@kernel.org>
Cc: <stable@vger.kernel.org> # 3.10.x
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-09-07 10:38:34 +02:00
Linus Torvalds
2b047252d0 Fix TLB gather virtual address range invalidation corner cases
Ben Tebulin reported:

 "Since v3.7.2 on two independent machines a very specific Git
  repository fails in 9/10 cases on git-fsck due to an SHA1/memory
  failures.  This only occurs on a very specific repository and can be
  reproduced stably on two independent laptops.  Git mailing list ran
  out of ideas and for me this looks like some very exotic kernel issue"

and bisected the failure to the backport of commit 53a59fc67f ("mm:
limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").

That commit itself is not actually buggy, but what it does is to make it
much more likely to hit the partial TLB invalidation case, since it
introduces a new case in tlb_next_batch() that previously only ever
happened when running out of memory.

The real bug is that the TLB gather virtual memory range setup is subtly
buggered.  It was introduced in commit 597e1c3580 ("mm/mmu_gather:
enable tlb flush range in generic mmu_gather"), and the range handling
was already fixed at least once in commit e6c495a96c ("mm: fix the TLB
range flushed when __tlb_remove_page() runs out of slots"), but that fix
was not complete.

The problem with the TLB gather virtual address range is that it isn't
set up by the initial tlb_gather_mmu() initialization (which didn't get
the TLB range information), but it is set up ad-hoc later by the
functions that actually flush the TLB.  And so any such case that forgot
to update the TLB range entries would potentially miss TLB invalidates.

Rather than try to figure out exactly which particular ad-hoc range
setup was missing (I personally suspect it's the hugetlb case in
zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
did), this patch just gets rid of the problem at the source: make the
TLB range information available to tlb_gather_mmu(), and initialize it
when initializing all the other tlb gather fields.

This makes the patch larger, but conceptually much simpler.  And the end
result is much more understandable; even if you want to play games with
partial ranges when invalidating the TLB contents in chunks, now the
range information is always there, and anybody who doesn't want to
bother with it won't introduce subtle bugs.

Ben verified that this fixes his problem.

Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-16 08:52:46 -07:00
Richard Weinberger
9a8c135957 um: siginfo cleanup
Currently we use both struct siginfo and siginfo_t.
Let's use struct siginfo internally to avoid ongoing
compiler warning. We are allowed to do so because
struct siginfo and siginfo_t are equivalent.

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-07-19 11:31:36 +02:00
Tristan Schmelcher
7473534130 uml: Fix which_tmpdir failure when /dev/shm is a symlink, and in other edge cases
which_tmpdir did the wrong thing if /dev/shm was a symlink (e.g., to /run/shm),
if there were multiple mounts on top of each other, if the mount(s) were
obscured by a later mount, or if /dev/shm was a prefix of another mount point.
This fixes these cases. Applies to 3.9.6.

Signed-off-by: Tristan Schmelcher <tschmelcher@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-07-19 10:45:18 +02:00
Richard Weinberger
0974a9cadc um: Fix wait_stub_done() error handling
If we die within a stub handler we only way to reliable
kill the (obviously) dying uml guest process is killing
it's host twin on the host side.

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-07-19 10:45:18 +02:00
Richard Weinberger
dee20035b4 um: Mark stub pages mapping with VM_PFNMAP
Ensure that a process cannot destroy his stub pages with
using MADV_DONTNEED and friends.

Reported-by: toralf.foerster@gmx.de
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-07-19 10:45:17 +02:00
Richard Weinberger
3179ce7254 um: Fix return value of strnlen_user()
In case of an error it must not return -EFAULT.
Return 0 like all other archs do.

Reported-by: toralf.foerster@gmx.de
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-07-19 10:45:17 +02:00
Jiang Liu
715ee35653 mm/um: prepare for removing num_physpages and simplify mem_init()
Prepare for removing num_physpages and simplify mem_init().

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:37 -07:00
Jiang Liu
a214a8c68b UML: normalize global variables exported by vmlinux.lds
Normalize global variables exported by vmlinux.lds to conform usage
guidelines from include/asm-generic/sections.h.

1) Use _text to mark the start of the kernel image including the head
text, and _stext to mark the start of the .text section.
2) Export mandatory global variables __bss_stop.
3) Adjust __init_begin and __init_end to avoid acrossing .text and
   .data sections.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:34 -07:00
Jiang Liu
0c98853473 mm: concentrate modification of totalram_pages into the mm core
Concentrate code to modify totalram_pages into the mm core, so the arch
memory initialized code doesn't need to take care of it.  With these
changes applied, only following functions from mm core modify global
variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
free_all_bootmem_node(), adjust_managed_page_count().

With this patch applied, it will be much more easier for us to keep
totalram_pages and zone->managed_pages in consistence.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:33 -07:00
Jiang Liu
dbe67df4ba mm: enhance free_reserved_area() to support poisoning memory with zero
Address more review comments from last round of code review.
1) Enhance free_reserved_area() to support poisoning freed memory with
   pattern '0'. This could be used to get rid of poison_init_mem()
   on ARM64.
2) A previous patch has disabled memory poison for initmem on s390
   by mistake, so restore to the original behavior.
3) Remove redundant PAGE_ALIGN() when calling free_reserved_area().

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:32 -07:00
Jiang Liu
11199692d8 mm: change signature of free_reserved_area() to fix building warnings
Change signature of free_reserved_area() according to Russell King's
suggestion to fix following build warnings:

  arch/arm/mm/init.c: In function 'mem_init':
  arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
    free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
    ^
  In file included from include/linux/mman.h:4:0,
                   from arch/arm/mm/init.c:15:
  include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
   extern unsigned long free_reserved_area(unsigned long start, unsigned long end,

   mm/page_alloc.c: In function 'free_reserved_area':
>> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
   In file included from arch/mips/include/asm/page.h:49:0,
                    from include/linux/mmzone.h:20,
                    from include/linux/gfp.h:4,
                    from include/linux/mm.h:8,
                    from mm/page_alloc.c:18:
   arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
   mm/page_alloc.c: In function 'free_area_init_nodes':
   mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]

Also address some minor code review comments.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:32 -07:00
Linus Torvalds
2d722f6d56 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes:

   - load-calculation cleanups and improvements, by Alex Shi
   - various nohz related tidying up of statisics, by Frederic
     Weisbecker
   - factor out /proc functions to kernel/sched/proc.c, by Paul
     Gortmaker
   - simplify the RT policy scheduler, by Kirill Tkhai
   - various fixes and cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  sched/debug: Remove CONFIG_FAIR_GROUP_SCHED mask
  sched/debug: Fix formatting of /proc/<PID>/sched
  sched: Fix typo in struct sched_avg member description
  sched/fair: Fix typo describing flags in enqueue_entity
  sched/debug: Add load-tracking statistics to task
  sched: Change get_rq_runnable_load() to static and inline
  sched/tg: Remove tg.load_weight
  sched/cfs_rq: Change atomic64_t removed_load to atomic_long_t
  sched/tg: Use 'unsigned long' for load variable in task group
  sched: Change cfs_rq load avg to unsigned long
  sched: Consider runnable load average in move_tasks()
  sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task
  sched: Update cpu load after task_tick
  sched: Fix sleep time double accounting in enqueue entity
  sched: Set an initial value of runnable avg for new forked task
  sched: Move a few runnable tg variables into CONFIG_SMP
  Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking"
  sched: Don't mix use of typedef ctl_table and struct ctl_table
  sched: Remove WARN_ON(!sd) from init_sched_groups_power()
  sched: Fix memory leakage in build_sched_groups()
  ...
2013-07-02 16:17:25 -07:00
Ingo Molnar
2fd1b48788 Linux 3.10
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL
 Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO
 Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9
 9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS
 bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10
 U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ=
 =Ob6Z
 -----END PGP SIGNATURE-----

Merge tag 'v3.10' into sched/core

Merge in a recent upstream commit:

  c2853c8df5 include/linux/math64.h: add div64_ul()

because:

  72a4cf20cb sched: Change cfs_rq load avg to unsigned long

relies on it.

[ We don't rebase sched/core for this, because the handful of
  followup commits after the broken commit are not behavioral
  changes so are unlikely to be needed during bisection. ]

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-01 11:18:53 +02:00
Al Viro
40d158e618 consolidate io_remap_pfn_range definitions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:46:35 +04:00
Viresh Kumar
0a0fca9d83 sched: Rename sched.c as sched/core.c in comments and Documentation
Most of the stuff from kernel/sched.c was moved to kernel/sched/core.c long time
back and the comments/Documentation never got updated.

I figured it out when I was going through sched-domains.txt and so thought of
fixing it globally.

I haven't crossed check if the stuff that is referenced in sched/core.c by all
these files is still present and hasn't changed as that wasn't the motive behind
this patch.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/cdff76a265326ab8d71922a1db5be599f20aad45.1370329560.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:42 +02:00
Al Viro
c9036e9f3b mconsole: we'd better initialize pos before passing it to vfs_read()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-19 12:37:57 +04:00
Linus Torvalds
292088ee03 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "A couple of fixes + getting rid of __blkdev_put() return value"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  proc: Use PDE attribute setting accessor functions
  make blkdev_put() return void
  block_device_operations->release() should return void
  mtd_blktrans_ops->release() should return void
  hfs: SMP race on directory close()
2013-05-07 15:14:53 -07:00
Al Viro
db2a144bed block_device_operations->release() should return void
The value passed is 0 in all but "it can never happen" cases (and those
only in a couple of drivers) *and* it would've been lost on the way
out anyway, even if something tried to pass something meaningful.
Just don't bother.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-07 02:16:21 -04:00
Linus Torvalds
534c97b095 Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull 'full dynticks' support from Ingo Molnar:
 "This tree from Frederic Weisbecker adds a new, (exciting! :-) core
  kernel feature to the timer and scheduler subsystems: 'full dynticks',
  or CONFIG_NO_HZ_FULL=y.

  This feature extends the nohz variable-size timer tick feature from
  idle to busy CPUs (running at most one task) as well, potentially
  reducing the number of timer interrupts significantly.

  This feature got motivated by real-time folks and the -rt tree, but
  the general utility and motivation of full-dynticks runs wider than
  that:

   - HPC workloads get faster: CPUs running a single task should be able
     to utilize a maximum amount of CPU power.  A periodic timer tick at
     HZ=1000 can cause a constant overhead of up to 1.0%.  This feature
     removes that overhead - and speeds up the system by 0.5%-1.0% on
     typical distro configs even on modern systems.

   - Real-time workload latency reduction: CPUs running critical tasks
     should experience as little jitter as possible.  The last remaining
     source of kernel-related jitter was the periodic timer tick.

   - A single task executing on a CPU is a pretty common situation,
     especially with an increasing number of cores/CPUs, so this feature
     helps desktop and mobile workloads as well.

  The cost of the feature is mainly related to increased timer
  reprogramming overhead when a CPU switches its tick period, and thus
  slightly longer to-idle and from-idle latency.

  Configuration-wise a third mode of operation is added to the existing
  two NOHZ kconfig modes:

   - CONFIG_HZ_PERIODIC: [formerly !CONFIG_NO_HZ], now explicitly named
     as a config option.  This is the traditional Linux periodic tick
     design: there's a HZ tick going on all the time, regardless of
     whether a CPU is idle or not.

   - CONFIG_NO_HZ_IDLE: [formerly CONFIG_NO_HZ=y], this turns off the
     periodic tick when a CPU enters idle mode.

   - CONFIG_NO_HZ_FULL: this new mode, in addition to turning off the
     tick when a CPU is idle, also slows the tick down to 1 Hz (one
     timer interrupt per second) when only a single task is running on a
     CPU.

  The .config behavior is compatible: existing !CONFIG_NO_HZ and
  CONFIG_NO_HZ=y settings get translated to the new values, without the
  user having to configure anything.  CONFIG_NO_HZ_FULL is turned off by
  default.

  This feature is based on a lot of infrastructure work that has been
  steadily going upstream in the last 2-3 cycles: related RCU support
  and non-periodic cputime support in particular is upstream already.

  This tree adds the final pieces and activates the feature.  The pull
  request is marked RFC because:

   - it's marked 64-bit only at the moment - the 32-bit support patch is
     small but did not get ready in time.

   - it has a number of fresh commits that came in after the merge
     window.  The overwhelming majority of commits are from before the
     merge window, but still some aspects of the tree are fresh and so I
     marked it RFC.

   - it's a pretty wide-reaching feature with lots of effects - and
     while the components have been in testing for some time, the full
     combination is still not very widely used.  That it's default-off
     should reduce its regression abilities and obviously there are no
     known regressions with CONFIG_NO_HZ_FULL=y enabled either.

   - the feature is not completely idempotent: there is no 100%
     equivalent replacement for a periodic scheduler/timer tick.  In
     particular there's ongoing work to map out and reduce its effects
     on scheduler load-balancing and statistics.  This should not impact
     correctness though, there are no known regressions related to this
     feature at this point.

   - it's a pretty ambitious feature that with time will likely be
     enabled by most Linux distros, and we'd like you to make input on
     its design/implementation, if you dislike some aspect we missed.
     Without flaming us to crisp! :-)

  Future plans:

   - there's ongoing work to reduce 1Hz to 0Hz, to essentially shut off
     the periodic tick altogether when there's a single busy task on a
     CPU.  We'd first like 1 Hz to be exposed more widely before we go
     for the 0 Hz target though.

   - once we reach 0 Hz we can remove the periodic tick assumption from
     nr_running>=2 as well, by essentially interrupting busy tasks only
     as frequently as the sched_latency constraints require us to do -
     once every 4-40 msecs, depending on nr_running.

  I am personally leaning towards biting the bullet and doing this in
  v3.10, like the -rt tree this effort has been going on for too long -
  but the final word is up to you as usual.

  More technical details can be found in Documentation/timers/NO_HZ.txt"

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  sched: Keep at least 1 tick per second for active dynticks tasks
  rcu: Fix full dynticks' dependency on wide RCU nocb mode
  nohz: Protect smp_processor_id() in tick_nohz_task_switch()
  nohz_full: Add documentation.
  cputime_nsecs: use math64.h for nsec resolution conversion helpers
  nohz: Select VIRT_CPU_ACCOUNTING_GEN from full dynticks config
  nohz: Reduce overhead under high-freq idling patterns
  nohz: Remove full dynticks' superfluous dependency on RCU tree
  nohz: Fix unavailable tick_stop tracepoint in dynticks idle
  nohz: Add basic tracing
  nohz: Select wide RCU nocb for full dynticks
  nohz: Disable the tick when irq resume in full dynticks CPU
  nohz: Re-evaluate the tick for the new task after a context switch
  nohz: Prepare to stop the tick on irq exit
  nohz: Implement full dynticks kick
  nohz: Re-evaluate the tick from the scheduler IPI
  sched: New helper to prevent from stopping the tick in full dynticks
  sched: Kick full dynticks CPU that have more than one task enqueued.
  perf: New helper to prevent full dynticks CPUs from stopping tick
  perf: Kick full dynticks CPU if events rotation is needed
  ...
2013-05-05 13:23:27 -07:00
Frederic Weisbecker
c032862fba Merge commit '8700c95adb03' into timers/nohz
The full dynticks tree needs the latest RCU and sched
upstream updates in order to fix some dependencies.

Merge a common upstream merge point that has these
updates.

Conflicts:
	include/linux/perf_event.h
	kernel/rcutree.h
	kernel/rcutree_plugin.h

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2013-05-02 17:54:19 +02:00
Linus Torvalds
20b4fb4852 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS updates from Al Viro,

Misc cleanups all over the place, mainly wrt /proc interfaces (switch
create_proc_entry to proc_create(), get rid of the deprecated
create_proc_read_entry() in favor of using proc_create_data() and
seq_file etc).

7kloc removed.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
  don't bother with deferred freeing of fdtables
  proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
  proc: Make the PROC_I() and PDE() macros internal to procfs
  proc: Supply a function to remove a proc entry by PDE
  take cgroup_open() and cpuset_open() to fs/proc/base.c
  ppc: Clean up scanlog
  ppc: Clean up rtas_flash driver somewhat
  hostap: proc: Use remove_proc_subtree()
  drm: proc: Use remove_proc_subtree()
  drm: proc: Use minor->index to label things, not PDE->name
  drm: Constify drm_proc_list[]
  zoran: Don't print proc_dir_entry data in debug
  reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
  proc: Supply an accessor for getting the data from a PDE's parent
  airo: Use remove_proc_subtree()
  rtl8192u: Don't need to save device proc dir PDE
  rtl8187se: Use a dir under /proc/net/r8180/
  proc: Add proc_mkdir_data()
  proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
  proc: Move PDE_NET() to fs/proc/proc_net.c
  ...
2013-05-01 17:51:54 -07:00
Tejun Heo
a43cb95d54 dump_stack: unify debug information printed by show_regs()
show_regs() is inherently arch-dependent but it does make sense to print
generic debug information and some archs already do albeit in slightly
different forms.  This patch introduces a generic function to print debug
information from show_regs() so that different archs print out the same
information and it's much easier to modify what's printed.

show_regs_print_info() prints out the same debug info as dump_stack()
does plus task and thread_info pointers.

* Archs which didn't print debug info now do.

  alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r,
  metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc,
  um, xtensa

* Already prints debug info.  Replaced with show_regs_print_info().
  The printed information is superset of what used to be there.

  arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86

* s390 is special in that it used to print arch-specific information
  along with generic debug info.  Heiko and Martin think that the
  arch-specific extra isn't worth keeping s390 specfic implementation.
  Converted to use the generic version.

Note that now all archs print the debug info before actual register
dumps.

An example BUG() dump follows.

 kernel BUG at /work/os/work/kernel/workqueue.c:4841!
 invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
 Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
 task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
 RIP: 0010:[<ffffffff8234a07e>]  [<ffffffff8234a07e>] init_workqueues+0x4/0x6
 RSP: 0000:ffff88007c861ec8  EFLAGS: 00010246
 RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
 RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Stack:
  ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
  0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
  ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
 Call Trace:
  [<ffffffff81000312>] do_one_initcall+0x122/0x170
  [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
  [<ffffffff81c47760>] ? rest_init+0x140/0x140
  [<ffffffff81c4776e>] kernel_init+0xe/0xf0
  [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
  [<ffffffff81c47760>] ? rest_init+0x140/0x140
  ...

v2: Typo fix in x86-32.

v3: CPU number dropped from show_regs_print_info() as
    dump_stack_print_info() has been updated to print it.  s390
    specific implementation dropped as requested by s390 maintainers.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>		[tile bits]
Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30 17:04:02 -07:00
Tejun Heo
196779b9b4 dump_stack: consolidate dump_stack() implementations and unify their behaviors
Both dump_stack() and show_stack() are currently implemented by each
architecture.  show_stack(NULL, NULL) dumps the backtrace for the
current task as does dump_stack().  On some archs, dump_stack() prints
extra information - pid, utsname and so on - in addition to the
backtrace while the two are identical on other archs.

The usages in arch-independent code of the two functions indicate
show_stack(NULL, NULL) should print out bare backtrace while
dump_stack() is used for debugging purposes when something went wrong,
so it does make sense to print additional information on the task which
triggered dump_stack().

There's no reason to require archs to implement two separate but mostly
identical functions.  It leads to unnecessary subtle information.

This patch expands the dummy fallback dump_stack() implementation in
lib/dump_stack.c such that it prints out debug information (taken from
x86) and invokes show_stack(NULL, NULL) and drops arch-specific
dump_stack() implementations in all archs except blackfin.  Blackfin's
dump_stack() does something wonky that I don't understand.

Debug information can be printed separately by calling
dump_stack_print_info() so that arch-specific dump_stack()
implementation can still emit the same debug information.  This is used
in blackfin.

This patch brings the following behavior changes.

* On some archs, an extra level in backtrace for show_stack() could be
  printed.  This is because the top frame was determined in
  dump_stack() on those archs while generic dump_stack() can't do that
  reliably.  It can be compensated by inlining dump_stack() but not
  sure whether that'd be necessary.

* Most archs didn't use to print debug info on dump_stack().  They do
  now.

An example WARN dump follows.

 WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
 Hardware name: empty
 Modules linked in:
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9
  0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
  ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c
  0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
 Call Trace:
  [<ffffffff81c614dc>] dump_stack+0x19/0x1b
  [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff8234a071>] init_workqueues+0x35/0x505
  ...

v2: CPU number added to the generic debug info as requested by s390
    folks and dropped the s390 specific dump_stack().  This loses %ksp
    from the debug message which the maintainers think isn't important
    enough to keep the s390-specific dump_stack() implementation.

    dump_stack_print_info() is moved to kernel/printk.c from
    lib/dump_stack.c.  Because linkage is per objecct file,
    dump_stack_print_info() living in the same lib file as generic
    dump_stack() means that archs which implement custom dump_stack()
    - at this point, only blackfin - can't use dump_stack_print_info()
    as that will bring in the generic version of dump_stack() too.  v1
    The v1 patch broke build on blackfin due to this issue.  The build
    breakage was reported by Fengguang Wu.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	[s390 bits]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30 17:04:02 -07:00
Linus Torvalds
8700c95adb Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP/hotplug changes from Ingo Molnar:
 "This is a pretty large, multi-arch series unifying and generalizing
  the various disjunct pieces of idle routines that architectures have
  historically copied from each other and have grown in random, wildly
  inconsistent and sometimes buggy directions:

   101 files changed, 455 insertions(+), 1328 deletions(-)

  this went through a number of review and test iterations before it was
  committed, it was tested on various architectures, was exposed to
  linux-next for quite some time - nevertheless it might cause problems
  on architectures that don't read the mailing lists and don't regularly
  test linux-next.

  This cat herding excercise was motivated by the -rt kernel, and was
  brought to you by Thomas "the Whip" Gleixner."

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  idle: Remove GENERIC_IDLE_LOOP config switch
  um: Use generic idle loop
  ia64: Make sure interrupts enabled when we "safe_halt()"
  sparc: Use generic idle loop
  idle: Remove unused ARCH_HAS_DEFAULT_IDLE
  bfin: Fix typo in arch_cpu_idle()
  xtensa: Use generic idle loop
  x86: Use generic idle loop
  unicore: Use generic idle loop
  tile: Use generic idle loop
  tile: Enter idle with preemption disabled
  sh: Use generic idle loop
  score: Use generic idle loop
  s390: Use generic idle loop
  powerpc: Use generic idle loop
  parisc: Use generic idle loop
  openrisc: Use generic idle loop
  mn10300: Use generic idle loop
  mips: Use generic idle loop
  microblaze: Use generic idle loop
  ...
2013-04-30 07:50:17 -07:00
Thomas Gleixner
d0380e6c3c early_printk: consolidate random copies of identical code
The early console implementations are the same all over the place.  Move
the print function to kernel/printk and get rid of the copies.

[akpm@linux-foundation.org: arch/mips/kernel/early_printk.c needs kernel.h for va_list]
[paul.gortmaker@windriver.com: sh4: make the bios early console support depend on EARLY_PRINTK]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Richard Weinberger <richard@nod.at>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 18:28:13 -07:00
Jiang Liu
5b472b2057 mm/um: use free_highmem_page() to free highmem pages into buddy system
Use helper function free_highmem_page() to free highmem pages into
the buddy system.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:32 -07:00
Jiang Liu
e16e0b1b98 mm/um: use common help functions to free reserved pages
Use common help functions to free reserved pages.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:31 -07:00
Thomas Gleixner
d190e8195b idle: Remove GENERIC_IDLE_LOOP config switch
All archs are converted over. Remove the config switch and the
fallback code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-17 10:39:38 +02:00
Richard Weinberger
8198c1696a um: Use generic idle loop
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: user-mode-linux-devel@lists.sourceforge.net
Link: http://lkml.kernel.org/r/1366149209-24787-1-git-send-email-richard@nod.at
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-17 10:39:38 +02:00
David Howells
4554eb9065 UM: Adjust printk in create_proc_mconsole()
Adjust printk in create_proc_mconsole() to reflect it is now using
proc_create() not create_proc_mconsole().

Signed-off-by: David Howells <dhowells@redhat.com>
2013-04-09 14:16:38 -04:00
Frederic Weisbecker
3451d0243c nohz: Rename CONFIG_NO_HZ to CONFIG_NO_HZ_COMMON
We are planning to convert the dynticks Kconfig options layout
into a choice menu. The user must be able to easily pick
any of the following implementations: constant periodic tick,
idle dynticks, full dynticks.

As this implies a mutual exclusion, the two dynticks implementions
need to converge on the selection of a common Kconfig option in order
to ease the sharing of a common infrastructure.

It would thus seem pretty natural to reuse CONFIG_NO_HZ to
that end. It already implements all the idle dynticks code
and the full dynticks depends on all that code for now.
So ideally the choice menu would propose CONFIG_NO_HZ_IDLE and
CONFIG_NO_HZ_EXTENDED then both would select CONFIG_NO_HZ.

On the other hand we want to stay backward compatible: if
CONFIG_NO_HZ is set in an older config file, we want to
enable CONFIG_NO_HZ_IDLE by default.

But we can't afford both at the same time or we run into
a circular dependency:

1) CONFIG_NO_HZ_IDLE and CONFIG_NO_HZ_EXTENDED both select
   CONFIG_NO_HZ
2) If CONFIG_NO_HZ is set, we default to CONFIG_NO_HZ_IDLE

We might be able to support that from Kconfig/Kbuild but it
may not be wise to introduce such a confusing behaviour.

So to solve this, create a new CONFIG_NO_HZ_COMMON option
which gathers the common code between idle and full dynticks
(that common code for now is simply the idle dynticks code)
and select it from their referring Kconfig.

Then we'll later create CONFIG_NO_HZ_IDLE and map CONFIG_NO_HZ
to it for backward compatibility.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-04-03 13:56:03 +02:00
Greg Kroah-Hartman
23a376f98c Merge 3.9-rc3 into tty-next 2013-03-21 16:07:34 -07:00
Jiri Slaby
aa27a094e2 TTY: add tty_port_tty_hangup helper
It allows for cleaning up on a considerable amount of places. They did
port_get, hangup, kref_put. Now the only thing needed is to call
tty_port_tty_hangup which does exactly that. And they can also decide
whether to consider CLOCAL or completely ignore that.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-18 16:24:29 -07:00
Jiri Slaby
6aad04f213 TTY: add tty_port_tty_wakeup helper
It allows for cleaning up on a considerable amount of places. They did
port_get, wakeup, kref_put. Now the only thing needed is to call
tty_port_tty_wakeup which does exactly that.

One exception is ifx6x60 where tty_wakeup was open-coded. We now call
tty_wakeup properly there.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-18 16:19:45 -07:00
Richard Weinberger
2116bda6ad um: Use tty_port in SIGWINCH handler
The tty below tty_port might get destroyed by the tty layer
while we hold a reference to it.
So we have to carry tty_port around...

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-03-11 10:08:04 +01:00
Richard Weinberger
cc4f02486c um: Use tty_port_operations->destruct
As we setup the SIGWINCH handler in tty_port_operations->activate
it makes sense to tear down it in ->destruct.

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-03-11 10:08:03 +01:00
Sergei Trofimovich
72383d43b2 um: fix build failure due to mess-up of sig_info protorype
arch/um/os-Linux/signal.c:18:8: error: conflicting types for 'sig_info'
In file included from /home/slyfox/linux-2.6/arch/um/os-Linux/signal.c:12:0:
arch/um/include/shared/as-layout.h:64:15: note: previous declaration of 'sig_info' was here

Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
CC: Jeff Dike <jdike@addtoit.com>
CC: Richard Weinberger <richard@nod.at>
CC: "Martin Pärtel" <martin.partel@gmail.com>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: user-mode-linux-devel@lists.sourceforge.net
CC: user-mode-linux-user@lists.sourceforge.net
CC: linux-kernel@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-03-11 10:08:03 +01:00
Sergei Trofimovich
fdfa4c9528 um: add missing declaration of 'getrlimit()' and friends
arch/um/os-Linux/start_up.c: In function 'check_coredump_limit':
arch/um/os-Linux/start_up.c:338:16: error: storage size of 'lim' isn't known
arch/um/os-Linux/start_up.c:339:2: error: implicit declaration of function 'getrlimit' [-Werror=implicit-function-declaration]

Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
CC: Jeff Dike <jdike@addtoit.com>
CC: Richard Weinberger <richard@nod.at>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: user-mode-linux-devel@lists.sourceforge.net
CC: user-mode-linux-user@lists.sourceforge.net
CC: linux-kernel@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-03-11 10:08:03 +01:00
Paul Chavent
55ea1cfab2 net : enable tx time stamping in the vde driver.
This new version moves the skb_tx_timestamp in the main uml
driver. This should avoid the need to call this function in each
transport (vde, slirp, tuntap, ...). It also add support for ethtool
get_ts_info.

Signed-off-by: Paul Chavent <paul.chavent@onera.fr>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-03-11 10:08:02 +01:00
Linus Torvalds
9e2d59ad58 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal handling cleanups from Al Viro:
 "This is the first pile; another one will come a bit later and will
  contain SYSCALL_DEFINE-related patches.

   - a bunch of signal-related syscalls (both native and compat)
     unified.

   - a bunch of compat syscalls switched to COMPAT_SYSCALL_DEFINE
     (fixing several potential problems with missing argument
     validation, while we are at it)

   - a lot of now-pointless wrappers killed

   - a couple of architectures (cris and hexagon) forgot to save
     altstack settings into sigframe, even though they used the
     (uninitialized) values in sigreturn; fixed.

   - microblaze fixes for delivery of multiple signals arriving at once

   - saner set of helpers for signal delivery introduced, several
     architectures switched to using those."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (143 commits)
  x86: convert to ksignal
  sparc: convert to ksignal
  arm: switch to struct ksignal * passing
  alpha: pass k_sigaction and siginfo_t using ksignal pointer
  burying unused conditionals
  make do_sigaltstack() static
  arm64: switch to generic old sigaction() (compat-only)
  arm64: switch to generic compat rt_sigaction()
  arm64: switch compat to generic old sigsuspend
  arm64: switch to generic compat rt_sigqueueinfo()
  arm64: switch to generic compat rt_sigpending()
  arm64: switch to generic compat rt_sigprocmask()
  arm64: switch to generic sigaltstack
  sparc: switch to generic old sigsuspend
  sparc: COMPAT_SYSCALL_DEFINE does all sign-extension as well as SYSCALL_DEFINE
  sparc: kill sign-extending wrappers for native syscalls
  kill sparc32_open()
  sparc: switch to use of generic old sigaction
  sparc: switch sys_compat_rt_sigaction() to COMPAT_SYSCALL_DEFINE
  mips: switch to generic sys_fork() and sys_clone()
  ...
2013-02-23 18:50:11 -08:00
Linus Torvalds
21eaab6d19 tty/serial patches for 3.9-rc1
Here's the big tty/serial driver patches for 3.9-rc1.
 
 More tty port rework and fixes from Jiri here, as well as lots of
 individual serial driver updates and fixes.
 
 All of these have been in the linux-next tree for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlEmZYQACgkQMUfUDdst+ylJDgCg0B0nMevUUdM4hLvxunbbiyXM
 HUEAoIOedqriNNPvX4Bwy0hjeOEaWx0g
 =vi6x
 -----END PGP SIGNATURE-----

Merge tag 'tty-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial patches from Greg Kroah-Hartman:
 "Here's the big tty/serial driver patches for 3.9-rc1.

  More tty port rework and fixes from Jiri here, as well as lots of
  individual serial driver updates and fixes.

  All of these have been in the linux-next tree for a while."

* tag 'tty-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (140 commits)
  tty: mxser: improve error handling in mxser_probe() and mxser_module_init()
  serial: imx: fix uninitialized variable warning
  serial: tegra: assume CONFIG_OF
  TTY: do not update atime/mtime on read/write
  lguest: select CONFIG_TTY to build properly.
  ARM defconfigs: add missing inclusions of linux/platform_device.h
  fb/exynos: include platform_device.h
  ARM: sa1100/assabet: include platform_device.h directly
  serial: imx: Fix recursive locking bug
  pps: Fix build breakage from decoupling pps from tty
  tty: Remove ancient hardpps()
  pps: Additional cleanups in uart_handle_dcd_change
  pps: Move timestamp read into PPS code proper
  pps: Don't crash the machine when exiting will do
  pps: Fix a use-after free bug when unregistering a source.
  pps: Use pps_lookup_dev to reduce ldisc coupling
  pps: Add pps_lookup_dev() function
  tty: serial: uartlite: Support uartlite on big and little endian systems
  tty: serial: uartlite: Fix sparse and checkpatch warnings
  serial/arc-uart: Miscll DT related updates (Grant's review comments)
  ...

Fix up trivial conflicts, mostly just due to the TTY config option
clashing with the EXPERIMENTAL removal.
2013-02-21 13:41:04 -08:00
Linus Torvalds
06991c28f3 Driver core patches for 3.9-rc1
Here is the big driver core merge for 3.9-rc1
 
 There are two major series here, both of which touch lots of drivers all
 over the kernel, and will cause you some merge conflicts:
   - add a new function called devm_ioremap_resource() to properly be
     able to check return values.
   - remove CONFIG_EXPERIMENTAL
 
 If you need me to provide a merged tree to handle these resolutions,
 please let me know.
 
 Other than those patches, there's not much here, some minor fixes and
 updates.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlEmV0cACgkQMUfUDdst+yncCQCfbmnQZju7kzWXk6PjdFuKspT9
 weAAoMCzcAtEzzc4LXuUxxG/sXBVBCjW
 =yWAQ
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core patches from Greg Kroah-Hartman:
 "Here is the big driver core merge for 3.9-rc1

  There are two major series here, both of which touch lots of drivers
  all over the kernel, and will cause you some merge conflicts:

   - add a new function called devm_ioremap_resource() to properly be
     able to check return values.

   - remove CONFIG_EXPERIMENTAL

  Other than those patches, there's not much here, some minor fixes and
  updates"

Fix up trivial conflicts

* tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits)
  base: memory: fix soft/hard_offline_page permissions
  drivercore: Fix ordering between deferred_probe and exiting initcalls
  backlight: fix class_find_device() arguments
  TTY: mark tty_get_device call with the proper const values
  driver-core: constify data for class_find_device()
  firmware: Ignore abort check when no user-helper is used
  firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER
  firmware: Make user-mode helper optional
  firmware: Refactoring for splitting user-mode helper code
  Driver core: treat unregistered bus_types as having no devices
  watchdog: Convert to devm_ioremap_resource()
  thermal: Convert to devm_ioremap_resource()
  spi: Convert to devm_ioremap_resource()
  power: Convert to devm_ioremap_resource()
  mtd: Convert to devm_ioremap_resource()
  mmc: Convert to devm_ioremap_resource()
  mfd: Convert to devm_ioremap_resource()
  media: Convert to devm_ioremap_resource()
  iommu: Convert to devm_ioremap_resource()
  drm: Convert to devm_ioremap_resource()
  ...
2013-02-21 12:05:51 -08:00
Al Viro
15ce1f7154 x86,um: switch to generic old sigsuspend()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03 18:16:26 -05:00
Joe Millenbach
4f73bc4dd3 tty: Added a CONFIG_TTY option to allow removal of TTY
The option allows you to remove TTY and compile without errors. This
saves space on systems that won't support TTY interfaces anyway.
bloat-o-meter output is below.

The bulk of this patch consists of Kconfig changes adding "depends on
TTY" to various serial devices and similar drivers that require the TTY
layer.  Ideally, these dependencies would occur on a common intermediate
symbol such as SERIO, but most drivers "select SERIO" rather than
"depends on SERIO", and "select" does not respect dependencies.

bloat-o-meter output comparing our previous minimal to new minimal by
removing TTY.  The list is filtered to not show removed entries with awk
'$3 != "-"' as the list was very long.

add/remove: 0/226 grow/shrink: 2/14 up/down: 6/-35356 (-35350)
function                                     old     new   delta
chr_dev_init                                 166     170      +4
allow_signal                                  80      82      +2
static.__warned                              143     142      -1
disallow_signal                               63      62      -1
__set_special_pids                            95      94      -1
unregister_console                           126     121      -5
start_kernel                                 546     541      -5
register_console                             593     588      -5
copy_from_user                                45      40      -5
sys_setsid                                   128     120      -8
sys_vhangup                                   32      19     -13
do_exit                                     1543    1526     -17
bitmap_zero                                   60      40     -20
arch_local_irq_save                          137     117     -20
release_task                                 674     652     -22
static.spin_unlock_irqrestore                308     260     -48

Signed-off-by: Joe Millenbach <jmillenbach@gmail.com>
Reviewed-by: Jamey Sharp <jamey@minilop.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-18 16:15:27 -08:00
Jiri Slaby
2e124b4a39 TTY: switch tty_flip_buffer_push
Now, we start converting tty buffer functions to actually use
tty_port. This will allow us to get rid of the need of tty in many
call sites. Only tty_port will needed and hence no more
tty_port_tty_get in those paths.

Now, the one where most of tty_port_tty_get gets removed:
tty_flip_buffer_push.

IOW we also closed all the races in drivers not using tty_port_tty_get
at all yet.

Also we move tty_flip_buffer_push declaration from include/linux/tty.h
to include/linux/tty_flip.h to all others while we are changing it
anyway.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-15 22:30:15 -08:00
Jiri Slaby
92a19f9cec TTY: switch tty_insert_flip_char
Now, we start converting tty buffer functions to actually use
tty_port. This will allow us to get rid of the need of tty in many
call sites. Only tty_port will needed and hence no more
tty_port_tty_get in those paths.

tty_insert_flip_char is the next one to proceed. This one is used all
over the code, so the patch is huge.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-15 22:21:36 -08:00
Jiri Slaby
227434f898 TTY: switch tty_buffer_request_room to tty_port
Now, we start converting tty buffer functions to actually use
tty_port. This will allow us to get rid of the need of tty pointer in
many call sites. Only tty_port will be needed and hence no more
tty_port_tty_get calls in those paths.

Here we start with tty_buffer_request_room.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-15 22:17:28 -08:00
Kees Cook
75dacf5d80 arch/um: remove depends on CONFIG_EXPERIMENTAL
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.

CC: Jeff Dike <jdike@addtoit.com>
CC: Richard Weinberger <richard@nod.at>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Richard Weinberger <richard@nod.at>
2013-01-11 11:38:04 -08:00
Jiri Pirko
7826d43f2d ethtool: fix drvinfo strings set in drivers
Use strlcpy where possible to ensure the string is \0 terminated.
Use always sizeof(string) instead of 32, ETHTOOL_BUSINFO_LEN
and custom defines.
Use snprintf instead of sprint.
Remove unnecessary inits of ->fw_version
Remove unnecessary inits of drvinfo struct.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-06 21:06:31 -08:00
Jiri Pirko
646cbcdae5 um: net: use eth_hw_addr_random() to generate random mac
Also remove unused "mac" from uml_net struct.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-03 22:37:36 -08:00
Linus Torvalds
54d46ea993 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal handling cleanups from Al Viro:
 "sigaltstack infrastructure + conversion for x86, alpha and um,
  COMPAT_SYSCALL_DEFINE infrastructure.

  Note that there are several conflicts between "unify
  SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline;
  resolution is trivial - just remove definitions of SS_ONSTACK and
  SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and
  include/uapi/linux/signal.h contains the unified variant."

Fixed up conflicts as per Al.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  alpha: switch to generic sigaltstack
  new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
  generic compat_sys_sigaltstack()
  introduce generic sys_sigaltstack(), switch x86 and um to it
  new helper: compat_user_stack_pointer()
  new helper: restore_altstack()
  unify SS_ONSTACK/SS_DISABLE definitions
  new helper: current_user_stack_pointer()
  missing user_stack_pointer() instances
  Bury the conditionals from kernel_thread/kernel_execve series
  COMPAT_SYSCALL_DEFINE: infrastructure
2012-12-20 18:05:28 -08:00
Al Viro
6bf9adfc90 introduce generic sys_sigaltstack(), switch x86 and um to it
Conditional on CONFIG_GENERIC_SIGALTSTACK; architectures that do not
select it are completely unaffected

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:40 -05:00
Linus Torvalds
6a2b60b17b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace changes from Eric Biederman:
 "While small this set of changes is very significant with respect to
  containers in general and user namespaces in particular.  The user
  space interface is now complete.

  This set of changes adds support for unprivileged users to create user
  namespaces and as a user namespace root to create other namespaces.
  The tyranny of supporting suid root preventing unprivileged users from
  using cool new kernel features is broken.

  This set of changes completes the work on setns, adding support for
  the pid, user, mount namespaces.

  This set of changes includes a bunch of basic pid namespace
  cleanups/simplifications.  Of particular significance is the rework of
  the pid namespace cleanup so it no longer requires sending out
  tendrils into all kinds of unexpected cleanup paths for operation.  At
  least one case of broken error handling is fixed by this cleanup.

  The files under /proc/<pid>/ns/ have been converted from regular files
  to magic symlinks which prevents incorrect caching by the VFS,
  ensuring the files always refer to the namespace the process is
  currently using and ensuring that the ptrace_mayaccess permission
  checks are always applied.

  The files under /proc/<pid>/ns/ have been given stable inode numbers
  so it is now possible to see if different processes share the same
  namespaces.

  Through the David Miller's net tree are changes to relax many of the
  permission checks in the networking stack to allowing the user
  namespace root to usefully use the networking stack.  Similar changes
  for the mount namespace and the pid namespace are coming through my
  tree.

  Two small changes to add user namespace support were commited here adn
  in David Miller's -net tree so that I could complete the work on the
  /proc/<pid>/ns/ files in this tree.

  Work remains to make it safe to build user namespaces and 9p, afs,
  ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the
  Kconfig guard remains in place preventing that user namespaces from
  being built when any of those filesystems are enabled.

  Future design work remains to allow root users outside of the initial
  user namespace to mount more than just /proc and /sys."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits)
  proc: Usable inode numbers for the namespace file descriptors.
  proc: Fix the namespace inode permission checks.
  proc: Generalize proc inode allocation
  userns: Allow unprivilged mounts of proc and sysfs
  userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file
  procfs: Print task uids and gids in the userns that opened the proc file
  userns: Implement unshare of the user namespace
  userns: Implent proc namespace operations
  userns: Kill task_user_ns
  userns: Make create_new_namespaces take a user_ns parameter
  userns: Allow unprivileged use of setns.
  userns: Allow unprivileged users to create new namespaces
  userns: Allow setting a userns mapping to your current uid.
  userns: Allow chown and setgid preservation
  userns: Allow unprivileged users to create user namespaces.
  userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped
  userns: fix return value on mntns_install() failure
  vfs: Allow unprivileged manipulation of the mount namespace.
  vfs: Only support slave subtrees across different user namespaces
  vfs: Add a user namespace reference from struct mnt_namespace
  ...
2012-12-17 15:44:47 -08:00
Linus Torvalds
9977d9b379 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull big execve/kernel_thread/fork unification series from Al Viro:
 "All architectures are converted to new model.  Quite a bit of that
  stuff is actually shared with architecture trees; in such cases it's
  literally shared branch pulled by both, not a cherry-pick.

  A lot of ugliness and black magic is gone (-3KLoC total in this one):

   - kernel_thread()/kernel_execve()/sys_execve() redesign.

     We don't do syscalls from kernel anymore for either kernel_thread()
     or kernel_execve():

     kernel_thread() is essentially clone(2) with callback run before we
     return to userland, the callbacks either never return or do
     successful do_execve() before returning.

     kernel_execve() is a wrapper for do_execve() - it doesn't need to
     do transition to user mode anymore.

     As a result kernel_thread() and kernel_execve() are
     arch-independent now - they live in kernel/fork.c and fs/exec.c
     resp.  sys_execve() is also in fs/exec.c and it's completely
     architecture-independent.

   - daemonize() is gone, along with its parts in fs/*.c

   - struct pt_regs * is no longer passed to do_fork/copy_process/
     copy_thread/do_execve/search_binary_handler/->load_binary/do_coredump.

   - sys_fork()/sys_vfork()/sys_clone() unified; some architectures
     still need wrappers (ones with callee-saved registers not saved in
     pt_regs on syscall entry), but the main part of those suckers is in
     kernel/fork.c now."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (113 commits)
  do_coredump(): get rid of pt_regs argument
  print_fatal_signal(): get rid of pt_regs argument
  ptrace_signal(): get rid of unused arguments
  get rid of ptrace_signal_deliver() arguments
  new helper: signal_pt_regs()
  unify default ptrace_signal_deliver
  flagday: kill pt_regs argument of do_fork()
  death to idle_regs()
  don't pass regs to copy_process()
  flagday: don't pass regs to copy_thread()
  bfin: switch to generic vfork, get rid of pointless wrappers
  xtensa: switch to generic clone()
  openrisc: switch to use of generic fork and clone
  unicore32: switch to generic clone(2)
  score: switch to generic fork/vfork/clone
  c6x: sanitize copy_thread(), get rid of clone(2) wrapper, switch to generic clone()
  take sys_fork/sys_vfork/sys_clone prototypes to linux/syscalls.h
  mn10300: switch to generic fork/vfork/clone
  h8300: switch to generic fork/vfork/clone
  tile: switch to generic clone()
  ...

Conflicts:
	arch/microblaze/include/asm/Kbuild
2012-12-12 12:22:13 -08:00
Linus Torvalds
090f8ccba3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Lots of activity:

   211 files changed, 8328 insertions(+), 4116 deletions(-)

  most of it on the tooling side.

  Main changes:

   * ftrace enhancements and fixes from Steve Rostedt.

   * uprobes fixes, cleanups and preparation for the ARM port from Oleg
     Nesterov.

   * UAPI fixes, from David Howels - prepares the arch/x86 UAPI
     transition

   * Separate perf tests into multiple objects, one per test, from Jiri
     Olsa.

   * Make hardware event translations available in sysfs, from Jiri
     Olsa.

   * Fixes to /proc/pid/maps parsing, preparatory to supporting data
     maps, from Namhyung Kim

   * Implement ui_progress for GTK, from Namhyung Kim

   * Add framework for automated perf_event_attr tests, where tools with
     different command line options will be run from a 'perf test', via
     python glue, and the perf syscall will be intercepted to verify
     that the perf_event_attr fields set by the tool are those expected,
     from Jiri Olsa

   * Add a 'link' method for hists, so that we can have the leader with
     buckets for all the entries in all the hists.  This new method is
     now used in the default 'diff' output, making the sum of the
     'baseline' column be 100%, eliminating blind spots.

   * libtraceevent fixes for compiler warnings trying to make perf it
     build on some distros, like fedora 14, 32-bit, some of the warnings
     really pointed to real bugs.

   * Add a browser for 'perf script' and make it available from the
     report and annotate browsers.  It does filtering to find the
     scripts that handle events found in the perf.data file used.  From
     Feng Tang

   * perf inject changes to allow showing where a task sleeps, from
     Andrew Vagin.

   * Makefile improvements from Namhyung Kim.

   * Add --pre and --post command hooks in 'stat', from Peter Zijlstra.

   * Don't stop synthesizing threads when one vanishes, this is for the
     existing threads when we start a tool like trace.

   * Use sched:sched_stat_runtime to provide a thread summary, this
     produces the same output as the 'trace summary' subcommand of
     tglx's original "trace" tool.

   * Support interrupted syscalls in 'trace'

   * Add an event duration column and filter in 'trace'.

   * There are references to the man pages in some tools, so try to
     build Documentation when installing, warning the user if that is
     not possible, from Borislav Petkov.

   * Give user better message if precise is not supported, from David
     Ahern.

   * Try to find cross-built objdump path by using the session
     environment information in the perf.data file header, from Irina
     Tirdea, original patch and idea by Namhyung Kim.

   * Diplays more output on features check for make V=1, so that one can
     figure out what is happening by looking at gcc output, etc.  From
     Jiri Olsa.

   * Add on_exit implementation for systems without one, e.g.  Android,
     from Bernhard Rosenkraenzer.

   * Only process events for vcpus of interest, helps handling large
     number of events, from David Ahern.

   * Cross compilation fixes for Android, from Irina Tirdea.

   * Add documentation on compiling for Android, from Irina Tirdea.

   * perf diff improvements from Jiri Olsa.

   * Target (task/user/cpu/syswide) handling improvements, from Namhyung
     Kim.

   * Add support in 'trace' for tracing workload given by command line,
     from Namhyung Kim.

   * ... and much more."

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (194 commits)
  uprobes: Use percpu_rw_semaphore to fix register/unregister vs dup_mmap() race
  perf evsel: Introduce is_group_member method
  perf powerpc: Use uapi/unistd.h to fix build error
  tools: Pass the target in descend
  tools: Honour the O= flag when tool build called from a higher Makefile
  tools: Define a Makefile function to do subdir processing
  perf ui: Always compile browser setup code
  perf ui: Add ui_progress__finish()
  perf ui gtk: Implement ui_progress functions
  perf ui: Introduce generic ui_progress helper
  perf ui tui: Move progress.c under ui/tui directory
  perf tools: Add basic event modifier sanity check
  perf tools: Omit group members from perf_evlist__disable/enable
  perf tools: Ensure single disable call per event in record comand
  perf tools: Fix 'disabled' attribute config for record command
  perf tools: Fix attributes for '{}' defined event groups
  perf tools: Use sscanf for parsing /proc/pid/maps
  perf tools: Add gtk.<command> config option for launching GTK browser
  perf tools: Fix compile error on NO_NEWT=1 build
  perf hists: Initialize all of he->stat with zeroes
  ...
2012-12-11 18:14:31 -08:00
Linus Torvalds
37ea95a959 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU update from Ingo Molnar:
 "The major features of this tree are:

     1. A first version of no-callbacks CPUs.  This version prohibits
        offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y.
        Relaxing this constraint is in progress, but not yet ready
        for prime time.  These commits were posted to LKML at
        https://lkml.org/lkml/2012/10/30/724.

     2. Changes to SRCU that allows statically initialized srcu_struct
        structures.  These commits were posted to LKML at
        https://lkml.org/lkml/2012/10/30/296.

     3. Restructuring of RCU's debugfs output.  These commits were posted
        to LKML at https://lkml.org/lkml/2012/10/30/341.

     4. Additional CPU-hotplug/RCU improvements, posted to LKML at
        https://lkml.org/lkml/2012/10/30/327.
        Note that the commit eliminating __stop_machine() was judged to
        be too-high of risk, so is deferred to 3.9.

     5. Changes to RCU's idle interface, most notably a new module
        parameter that redirects normal grace-period operations to
        their expedited equivalents.  These were posted to LKML at
        https://lkml.org/lkml/2012/10/30/739.

     6. Additional diagnostics for RCU's CPU stall warning facility,
        posted to LKML at https://lkml.org/lkml/2012/10/30/315.
        The most notable change reduces the
        default RCU CPU stall-warning time from 60 seconds to 21 seconds,
        so that it once again happens sooner than the softlockup timeout.

     7. Documentation updates, which were posted to LKML at
        https://lkml.org/lkml/2012/10/30/280.
        A couple of late-breaking changes were posted at
        https://lkml.org/lkml/2012/11/16/634 and
        https://lkml.org/lkml/2012/11/16/547.

     8. Miscellaneous fixes, which were posted to LKML at
        https://lkml.org/lkml/2012/10/30/309.

     9. Finally, a fix for an lockdep-RCU splat was posted to LKML
        at https://lkml.org/lkml/2012/11/7/486."

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
  context_tracking: New context tracking susbsystem
  sched: Mark RCU reader in sched_show_task()
  rcu: Separate accounting of callbacks from callback-free CPUs
  rcu: Add callback-free CPUs
  rcu: Add documentation for the new rcuexp debugfs trace file
  rcu: Update documentation for TREE_RCU debugfs tracing
  rcu: Reduce default RCU CPU stall warning timeout
  rcu: Fix TINY_RCU rcu_is_cpu_rrupt_from_idle check
  rcu: Clarify memory-ordering properties of grace-period primitives
  rcu: Add new rcutorture module parameters to start/end test messages
  rcu: Remove list_for_each_continue_rcu()
  rcu: Fix batch-limit size problem
  rcu: Add tracing for synchronize_sched_expedited()
  rcu: Remove old debugfs interfaces and also RCU flavor name
  rcu: split 'rcuhier' to each flavor
  rcu: split 'rcugp' to each flavor
  rcu: split 'rcuboost' to each flavor
  rcu: split 'rcubarrier' to each flavor
  rcu: Fix tracing formatting
  rcu: Remove the interface "rcudata.csv"
  ...
2012-12-11 18:10:49 -08:00
Linus Torvalds
c6bd5bcc49 TTY/Serial merge for 3.8-rc1
Here's the big tty/serial tree set of changes for 3.8-rc1.
 
 Contained in here is a bunch more reworks of the tty port layer from Jiri and
 bugfixes from Alan, along with a number of other tty and serial driver updates
 by the various driver authors.
 
 Also, Jiri has been coerced^Wconvinced to be the co-maintainer of the TTY
 layer, which is much appreciated by me.
 
 All of these have been in the linux-next tree for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlDHhgwACgkQMUfUDdst+ynI6wCcC+YeBwncnoWHvwLAJOwAZpUL
 bysAn28o780/lOsTzp3P1Qcjvo69nldo
 =hN/g
 -----END PGP SIGNATURE-----

Merge tag 'tty-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull TTY/Serial merge from Greg Kroah-Hartman:
 "Here's the big tty/serial tree set of changes for 3.8-rc1.

  Contained in here is a bunch more reworks of the tty port layer from
  Jiri and bugfixes from Alan, along with a number of other tty and
  serial driver updates by the various driver authors.

  Also, Jiri has been coerced^Wconvinced to be the co-maintainer of the
  TTY layer, which is much appreciated by me.

  All of these have been in the linux-next tree for a while.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

Fixed up some trivial conflicts in the staging tree, due to the fwserial
driver having come in both ways (but fixed up a bit in the serial tree),
and the ioctl handling in the dgrp driver having been done slightly
differently (staging tree got that one right, and removed both
TIOCGSOFTCAR and TIOCSSOFTCAR).

* tag 'tty-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (146 commits)
  staging: sb105x: fix potential NULL pointer dereference in mp_chars_in_buffer()
  staging/fwserial: Remove superfluous free
  staging/fwserial: Use WARN_ONCE when port table is corrupted
  staging/fwserial: Destruct embedded tty_port on teardown
  staging/fwserial: Fix build breakage when !CONFIG_BUG
  staging: fwserial: Add TTY-over-Firewire serial driver
  drivers/tty/serial/serial_core.c: clean up HIGH_BITS_OFFSET usage
  staging: dgrp: dgrp_tty.c: Audit the return values of get/put_user()
  staging: dgrp: dgrp_tty.c: Remove the TIOCSSOFTCAR ioctl handler from dgrp driver
  serial: ifx6x60: Add modem power off function in the platform reboot process
  serial: mxs-auart: unmap the scatter list before we copy the data
  serial: mxs-auart: disable the Receive Timeout Interrupt when DMA is enabled
  serial: max310x: Setup missing "can_sleep" field for GPIO
  tty/serial: fix ifx6x60.c declaration warning
  serial: samsung: add devicetree properties for non-Exynos SoCs
  serial: samsung: fix potential soft lockup during uart write
  tty: vt: Remove redundant null check before kfree.
  tty/8250 Add check for pci_ioremap_bar failure
  tty/8250 Add support for Commtech's Fastcom Async-335 and Fastcom Async-PCIe cards
  tty/8250 Add XR17D15x devices to the exar_handle_irq override
  ...
2012-12-11 14:08:47 -08:00
Ingo Molnar
cc1b39dbf9 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
Pull ftrace updates from Steve Rostedt.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-08 15:54:35 +01:00
Ingo Molnar
630e1e0bcd Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Conflicts:
	arch/x86/kernel/ptrace.c

Pull the latest RCU tree from Paul E. McKenney:

"       The major features of this series are:

  1.	A first version of no-callbacks CPUs.  This version prohibits
  	offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y.
  	Relaxing this constraint is in progress, but not yet ready
  	for prime time.  These commits were posted to LKML at
  	https://lkml.org/lkml/2012/10/30/724, and are at branch rcu/nocb.

  2.	Changes to SRCU that allows statically initialized srcu_struct
  	structures.  These commits were posted to LKML at
  	https://lkml.org/lkml/2012/10/30/296, and are at branch rcu/srcu.

  3.	Restructuring of RCU's debugfs output.  These commits were posted
  	to LKML at https://lkml.org/lkml/2012/10/30/341, and are at
  	branch rcu/tracing.

  4.	Additional CPU-hotplug/RCU improvements, posted to LKML at
  	https://lkml.org/lkml/2012/10/30/327, and are at branch rcu/hotplug.
  	Note that the commit eliminating __stop_machine() was judged to
  	be too-high of risk, so is deferred to 3.9.

  5.	Changes to RCU's idle interface, most notably a new module
  	parameter that redirects normal grace-period operations to
  	their expedited equivalents.  These were posted to LKML at
  	https://lkml.org/lkml/2012/10/30/739, and are at branch rcu/idle.

  6.	Additional diagnostics for RCU's CPU stall warning facility,
  	posted to LKML at https://lkml.org/lkml/2012/10/30/315, and
  	are at branch rcu/stall.  The most notable change reduces the
  	default RCU CPU stall-warning time from 60 seconds to 21 seconds,
  	so that it once again happens sooner than the softlockup timeout.

  7.	Documentation updates, which were posted to LKML at
  	https://lkml.org/lkml/2012/10/30/280, and are at branch rcu/doc.
  	A couple of late-breaking changes were posted at
  	https://lkml.org/lkml/2012/11/16/634 and
  	https://lkml.org/lkml/2012/11/16/547.

  8.	Miscellaneous fixes, which were posted to LKML at
  	https://lkml.org/lkml/2012/10/30/309, along with a late-breaking
  	change posted at Fri, 16 Nov 2012 11:26:25 -0800 with message-ID
  	<20121116192625.GA447@linux.vnet.ibm.com>, but which lkml.org
  	seems to have missed.  These are at branch rcu/fixes.

  9.	Finally, a fix for an lockdep-RCU splat was posted to LKML
  	at https://lkml.org/lkml/2012/11/7/486.  This is at rcu/next. "

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-03 06:27:05 +01:00
Al Viro
afa86fc426 flagday: don't pass regs to copy_thread()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-11-28 23:43:42 -05:00
Al Viro
1d4b4b2994 x86, um: switch to generic fork/vfork/clone
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-11-28 22:13:44 -05:00
Al Viro
2b067fc9dd um: don't bother looking at regs in copy_thread() - current_pt_regs() is what we'll get
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-11-28 21:49:03 -05:00
Al Viro
8cddebd767 um: don't bother with passing sp to do_fork() for fork(2)/vfork(2)
copy_thread() on um will do the right thing when getting 0 for sp...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-11-28 21:49:03 -05:00
Al Viro
9436d5c32b um: get_safe_registers() should be done in flush_thread(), not start_thread()
... or we'll end up buggering the results of ELF_PLAT_INIT()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-11-28 21:46:03 -05:00
Eric W. Biederman
17cf22c33e pidns: Use task_active_pid_ns where appropriate
The expressions tsk->nsproxy->pid_ns and task_active_pid_ns
aka ns_of_pid(task_pid(tsk)) should have the same number of
cache line misses with the practical difference that
ns_of_pid(task_pid(tsk)) is released later in a processes life.

Furthermore by using task_active_pid_ns it becomes trivial
to write an unshare implementation for the the pid namespace.

So I have used task_active_pid_ns everywhere I can.

In fork since the pid has not yet been attached to the
process I use ns_of_pid, to achieve the same effect.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-11-19 05:59:09 -08:00
Jiri Slaby
191c5f1027 TTY: call tty_port_destroy in the rest of drivers
After commit "TTY: move tty buffers to tty_port", the tty buffers are
not freed in some drivers. This is because tty_port_destructor is not
called whenever a tty_port is freed. This was an assumption I counted
with but was unfortunately untrue. So fix the drivers to fulfil this
assumption.

To be sure, the TTY buffers (and later some stuff) are gone along with
the tty_port, we have to call tty_port_destroy at tear-down places.
This is mostly where the structure containing a tty_port is freed.
This patch does exactly that -- put tty_port_destroy at those places.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-15 17:20:58 -08:00
David Sharp
8cbd9cc625 tracing,x86: Add a TSC trace_clock
In order to promote interoperability between userspace tracers and ftrace,
add a trace_clock that reports raw TSC values which will then be recorded
in the ring buffer. Userspace tracers that also record TSCs are then on
exactly the same time base as the kernel and events can be unambiguously
interlaced.

Tested: Enabled a tracepoint and the "tsc" trace_clock and saw very large
timestamp values.

v2:
Move arch-specific bits out of generic code.
v3:
Rename "x86-tsc", cleanups
v7:
Generic arch bits in Kbuild.

Google-Bug-Id: 6980623
Link: http://lkml.kernel.org/r/1352837903-32191-1-git-send-email-dhsharp@google.com

Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Signed-off-by: David Sharp <dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-11-13 15:48:27 -05:00
Jiri Slaby
d2ffc740f3 UM: TTY: fix build errors now that tty->raw is gone
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2012-10-30 14:03:10 -07:00
Frederic Weisbecker
4d9a5d4319 rcu: Remove rcu_switch()
It's only there to call rcu_user_hooks_switch(). Let's
just call rcu_user_hooks_switch() directly, we don't need this
function in the middle.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:54:06 -07:00
Linus Torvalds
4e21fc138b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull third pile of kernel_execve() patches from Al Viro:
 "The last bits of infrastructure for kernel_thread() et.al., with
  alpha/arm/x86 use of those.  Plus sanitizing the asm glue and
  do_notify_resume() on alpha, fixing the "disabled irq while running
  task_work stuff" breakage there.

  At that point the rest of kernel_thread/kernel_execve/sys_execve work
  can be done independently for different architectures.  The only
  pending bits that do depend on having all architectures converted are
  restrictred to fs/* and kernel/* - that'll obviously have to wait for
  the next cycle.

  I thought we'd have to wait for all of them done before we start
  eliminating the longjump-style insanity in kernel_execve(), but it
  turned out there's a very simple way to do that without flagday-style
  changes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  alpha: switch to saner kernel_execve() semantics
  arm: switch to saner kernel_execve() semantics
  x86, um: convert to saner kernel_execve() semantics
  infrastructure for saner ret_from_kernel_thread semantics
  make sure that kernel_thread() callbacks call do_exit() themselves
  make sure that we always have a return path from kernel_execve()
  ppc: eeh_event should just use kthread_run()
  don't bother with kernel_thread/kernel_execve for launching linuxrc
  alpha: get rid of switch_stack argument of do_work_pending()
  alpha: don't bother passing switch_stack separately from regs
  alpha: take SIGPENDING/NOTIFY_RESUME loop into signal.c
  alpha: simplify TIF_NEED_RESCHED handling
2012-10-13 10:05:52 +09:00
Al Viro
22e2430d60 x86, um: convert to saner kernel_execve() semantics
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-12 13:35:22 -04:00
Linus Torvalds
8213a2f3ee Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull pile 2 of execve and kernel_thread unification work from Al Viro:
 "Stuff in there: kernel_thread/kernel_execve/sys_execve conversions for
  several more architectures plus assorted signal fixes and cleanups.

  There'll be more (in particular, real fixes for the alpha
  do_notify_resume() irq mess)..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (43 commits)
  alpha: don't open-code trace_report_syscall_{enter,exit}
  Uninclude linux/freezer.h
  m32r: trim masks
  avr32: trim masks
  tile: don't bother with SIGTRAP in setup_frame
  microblaze: don't bother with SIGTRAP in setup_rt_frame()
  mn10300: don't bother with SIGTRAP in setup_frame()
  frv: no need to raise SIGTRAP in setup_frame()
  x86: get rid of duplicate code in case of CONFIG_VM86
  unicore32: remove pointless test
  h8300: trim _TIF_WORK_MASK
  parisc: decide whether to go to slow path (tracesys) based on thread flags
  parisc: don't bother looping in do_signal()
  parisc: fix double restarts
  bury the rest of TIF_IRET
  sanitize tsk_is_polling()
  bury _TIF_RESTORE_SIGMASK
  unicore32: unobfuscate _TIF_WORK_MASK
  mips: NOTIFY_RESUME is not needed in TIF masks
  mips: merge the identical "return from syscall" per-ABI code
  ...

Conflicts:
	arch/arm/include/asm/thread_info.h
2012-10-12 10:49:08 +09:00
Linus Torvalds
42859eea96 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull generic execve() changes from Al Viro:
 "This introduces the generic kernel_thread() and kernel_execve()
  functions, and switches x86, arm, alpha, um and s390 over to them."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (26 commits)
  s390: convert to generic kernel_execve()
  s390: switch to generic kernel_thread()
  s390: fold kernel_thread_helper() into ret_from_fork()
  s390: fold execve_tail() into start_thread(), convert to generic sys_execve()
  um: switch to generic kernel_thread()
  x86, um/x86: switch to generic sys_execve and kernel_execve
  x86: split ret_from_fork
  alpha: introduce ret_from_kernel_execve(), switch to generic kernel_execve()
  alpha: switch to generic kernel_thread()
  alpha: switch to generic sys_execve()
  arm: get rid of execve wrapper, switch to generic execve() implementation
  arm: optimized current_pt_regs()
  arm: introduce ret_from_kernel_execve(), switch to generic kernel_execve()
  arm: split ret_from_fork, simplify kernel_thread() [based on patch by rmk]
  generic sys_execve()
  generic kernel_execve()
  new helper: current_pt_regs()
  preparation for generic kernel_thread()
  um: kill thread->forking
  um: let signal_delivered() do SIGTRAP on singlestepping into handler
  ...
2012-10-10 12:02:25 +09:00
Linus Torvalds
f59b51fe3d Merge branch 'for-linus-37rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML changes from Richard Weinberger:
 "UML receives this time only cleanups.

  The most outstanding change is the 'include "foo.h"' do 'include
  <foo.h>' conversion done by Al Viro.

  It touches many files, that's why the diffstat is rather big."

* 'for-linus-37rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  typo in UserModeLinux-HOWTO
  hppfs: fix the return value of get_inode()
  hostfs: drop vmtruncate
  um: get rid of pointless include "..." where include <...> will do
  um: move sysrq.h out of include/shared
  um/x86: merge 32 and 64 bit variants of ptrace.h
  um/x86: merge 32 and 64bit variants of checksum.h
2012-10-10 11:15:20 +09:00
Al Viro
37185b3324 um: get rid of pointless include "..." where include <...> will do
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2012-10-09 22:28:45 +02:00
Al Viro
382d95fdfa um: move sysrq.h out of include/shared
never used by userland-side objects

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2012-10-09 22:28:19 +02:00
Linus Torvalds
9e2d8656f5 Merge branch 'akpm' (Andrew's patch-bomb)
Merge patches from Andrew Morton:
 "A few misc things and very nearly all of the MM tree.  A tremendous
  amount of stuff (again), including a significant rbtree library
  rework."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (160 commits)
  sparc64: Support transparent huge pages.
  mm: thp: Use more portable PMD clearing sequenece in zap_huge_pmd().
  mm: Add and use update_mmu_cache_pmd() in transparent huge page code.
  sparc64: Document PGD and PMD layout.
  sparc64: Eliminate PTE table memory wastage.
  sparc64: Halve the size of PTE tables
  sparc64: Only support 4MB huge pages and 8KB base pages.
  memory-hotplug: suppress "Trying to free nonexistent resource <XXXXXXXXXXXXXXXX-YYYYYYYYYYYYYYYY>" warning
  mm: memcg: clean up mm_match_cgroup() signature
  mm: document PageHuge somewhat
  mm: use %pK for /proc/vmallocinfo
  mm, thp: fix mlock statistics
  mm, thp: fix mapped pages avoiding unevictable list on mlock
  memory-hotplug: update memory block's state and notify userspace
  memory-hotplug: preparation to notify memory block's state at memory hot remove
  mm: avoid section mismatch warning for memblock_type_name
  make GFP_NOTRACK definition unconditional
  cma: decrease cc.nr_migratepages after reclaiming pagelist
  CMA: migrate mlocked pages
  kpageflags: fix wrong KPF_THP on non-huge compound pages
  ...
2012-10-09 16:23:15 +09:00
Shaohua Li
45cac65b0f readahead: fault retry breaks mmap file read random detection
.fault now can retry.  The retry can break state machine of .fault.  In
filemap_fault, if page is miss, ra->mmap_miss is increased.  In the second
try, since the page is in page cache now, ra->mmap_miss is decreased.  And
these are done in one fault, so we can't detect random mmap file access.

Add a new flag to indicate .fault is tried once.  In the second try, skip
ra->mmap_miss decreasing.  The filemap_fault state machine is ok with it.

I only tested x86, didn't test other archs, but looks the change for other
archs is obvious, but who knows :)

Signed-off-by: Shaohua Li <shaohua.li@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:47 +09:00
Catalin Marinas
af1839eb4b Kconfig: clean up the long arch list for the UID16 config option
Introduce HAVE_UID16 config option and select it in corresponding
architecture Kconfig files.  UID16 now only depends on HAVE_UID16.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:13 +09:00
Arnd Bergmann
c37d6154c0 Merge branch 'disintegrate-asm-generic' of git://git.infradead.org/users/dhowells/linux-headers into asm-generic
Patches from David Howells <dhowells@redhat.com>:

This is to complete part of the UAPI disintegration for which the
preparatory patches were pulled recently.

Note that there are some fixup patches which are at the base of the
branch aimed at you, plus all arches get the asm-generic branch merged in too.

* 'disintegrate-asm-generic' of git://git.infradead.org/users/dhowells/linux-headers:
  UAPI: (Scripted) Disintegrate include/asm-generic
  UAPI: Fix conditional header installation handling (notably kvm_para.h on m68k)
  c6x: remove c6x signal.h
  UAPI: Split compound conditionals containing __KERNEL__ in Arm64
  UAPI: Fix the guards on various asm/unistd.h files

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-10-04 22:57:51 +02:00
Linus Torvalds
9b2e077c42 Prepared for main script
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIVAwUAUGsfSBOxKuMESys7AQIQug/+LyViiXFmCSlM+lCGkp64/BfUvy0QHqN4
 K/dMvbZKOQbvmgps/xj8G+6diDzeO4hz8e1I3c/SEZ3M9TTz/Ppv1slfET9uUZ4X
 aLLHKqXihsxEOslw7mgp91KTd1Nr+e41f/5hr3j5Ap1HQB4yJa2mmj3reb48VfjD
 jmXo/dID66c2ExaVO7C8yyZXWgMGTfiy27qmEnMTxW7xQPt1oYsV2Bq0PCC/zEcq
 JgnwMatDVMy9en9wuEVMNelImE+XLm1T3XpLHL2WkV2JWSai98TcvGZnNKIxpFqu
 PueHWWCs5F5bZfn4bf6QOEstRTW76NL2qFNYrBPi0Zuq8Pm53ucnnzJUY8JFPPoR
 kXYmv8K73Jb10eHFuc3X4UyzvnhmJ7y3kG3jx7WoJVkW1KPgEFNmvMHkLyHgPZOU
 nT1tZiO0QHF4zi0JWMfK+7aeEY7EKfqRSce0F3Jw91vaIlEOIqgMgVJ1Y/nMhu3s
 92mpg8JDoAcgCghok4m4Pc1qO06Fe8Iw5Qap5KMdPutp5Br2ebLL5NrwdAE8LNpR
 7826r9RTMhyVRgNJ71JMFDY1IBeLeY0bxipN8dh6VYqMiKgClUeNwv7/tIgI4YS7
 acQ+GdcsgTtg5qx3xwX5N2TSJVvdwnXdnWhAw7wN48tbzH8LvMV61Pq8Ytc7iK3M
 cAMgkbxdZRk=
 =VtEQ
 -----END PGP SIGNATURE-----

Merge tag 'uapi-prep-20121002' of git://git.infradead.org/users/dhowells/linux-headers

Pull preparatory patches for user API disintegration from David Howells:
 "The patches herein prepare for the extraction of the Userspace API
  bits from the various header files named in the Kbuild files.

  New subdirectories are created under either include/uapi/ or
  arch/x/include/uapi/ that correspond to the subdirectory containing
  that file under include/ or arch/x/include/.

  The new subdirs under the uapi/ directory are populated with Kbuild
  files that mostly do nothing at this time.  Further patches will
  disintegrate the headers in each original directory and fill in the
  Kbuild files as they do it.

  These patches also:

   (1) fix up #inclusions of "foo.h" rather than <foo.h>.

   (2) Remove some redundant #includes from the DRM code.

   (3) Make the kernel build infrastructure handle Kbuild files both in
       the old places and the new UAPI place that both specify headers
       to be exported.

   (4) Fix some kernel tools that #include kernel headers during their
       build.

  I have compile tested this with allyesconfig against x86_64,
  allmodconfig against i386 and a scattering of additional defconfigs of
  other arches.  Prepared for main script

  Signed-off-by: David Howells <dhowells@redhat.com>
  Acked-by: Arnd Bergmann <arnd@arndb.de>
  Acked-by: Thomas Gleixner <tglx@linutronix.de>
  Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
  Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  Acked-by: Dave Jones <davej@redhat.com>
  Acked-by: H. Peter Anvin <hpa@zytor.com>"

* tag 'uapi-prep-20121002' of git://git.infradead.org/users/dhowells/linux-headers:
  UAPI: Plumb the UAPI Kbuilds into the user header installation and checking
  UAPI: x86: Differentiate the generated UAPI and internal headers
  UAPI: Remove the objhdr-y export list
  UAPI: Move linux/version.h
  UAPI: Set up uapi/asm/Kbuild.asm
  UAPI: x86: Fix insn_sanity build failure after UAPI split
  UAPI: x86: Fix the test_get_len tool
  UAPI: (Scripted) Set up UAPI Kbuild files
  UAPI: Partition the header include path sets and add uapi/ header directories
  UAPI: (Scripted) Convert #include "..." to #include <path/...> in kernel system headers
  UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/
  UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/.
  UAPI: Refer to the DRM UAPI headers with <...> and from certain headers only
2012-10-03 13:45:43 -07:00
Mark Brown
e7a570ff7d asm-generic: Add default clkdev.h
Ease the deployment of clkdev by providing a default asm/clkdev.h for
use if the arch does not have an include/asm/clkdev.h.

Due to limitations in Kbuild we manually add clkdev.h to all
architectures that don't have one rather than having the header appear
by default.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-10-03 21:33:53 +02:00
Linus Torvalds
aab174f0df Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs update from Al Viro:

 - big one - consolidation of descriptor-related logics; almost all of
   that is moved to fs/file.c

   (BTW, I'm seriously tempted to rename the result to fd.c.  As it is,
   we have a situation when file_table.c is about handling of struct
   file and file.c is about handling of descriptor tables; the reasons
   are historical - file_table.c used to be about a static array of
   struct file we used to have way back).

   A lot of stray ends got cleaned up and converted to saner primitives,
   disgusting mess in android/binder.c is still disgusting, but at least
   doesn't poke so much in descriptor table guts anymore.  A bunch of
   relatively minor races got fixed in process, plus an ext4 struct file
   leak.

 - related thing - fget_light() partially unuglified; see fdget() in
   there (and yes, it generates the code as good as we used to have).

 - also related - bits of Cyrill's procfs stuff that got entangled into
   that work; _not_ all of it, just the initial move to fs/proc/fd.c and
   switch of fdinfo to seq_file.

 - Alex's fs/coredump.c spiltoff - the same story, had been easier to
   take that commit than mess with conflicts.  The rest is a separate
   pile, this was just a mechanical code movement.

 - a few misc patches all over the place.  Not all for this cycle,
   there'll be more (and quite a few currently sit in akpm's tree)."

Fix up trivial conflicts in the android binder driver, and some fairly
simple conflicts due to two different changes to the sock_alloc_file()
interface ("take descriptor handling from sock_alloc_file() to callers"
vs "net: Providing protocol type via system.sockprotoname xattr of
/proc/PID/fd entries" adding a dentry name to the socket)

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
  MAX_LFS_FILESIZE should be a loff_t
  compat: fs: Generic compat_sys_sendfile implementation
  fs: push rcu_barrier() from deactivate_locked_super() to filesystems
  btrfs: reada_extent doesn't need kref for refcount
  coredump: move core dump functionality into its own file
  coredump: prevent double-free on an error path in core dumper
  usb/gadget: fix misannotations
  fcntl: fix misannotations
  ceph: don't abuse d_delete() on failure exits
  hypfs: ->d_parent is never NULL or negative
  vfs: delete surplus inode NULL check
  switch simple cases of fget_light to fdget
  new helpers: fdget()/fdput()
  switch o2hb_region_dev_write() to fget_light()
  proc_map_files_readdir(): don't bother with grabbing files
  make get_file() return its argument
  vhost_set_vring(): turn pollstart/pollstop into bool
  switch prctl_set_mm_exe_file() to fget_light()
  switch xfs_find_handle() to fget_light()
  switch xfs_swapext() to fget_light()
  ...
2012-10-02 20:25:04 -07:00