Commit Graph

157276 Commits

Author SHA1 Message Date
David Hildenbrand 5ffb90b393 m68k/mm: use __ClearPageReserved()
The PG_reserved flag is cleared from memory that is part of the kernel
image (and therefore marked as PG_reserved).  Avoid using PG_reserved
directly.

Link: http://lkml.kernel.org/r/20190114125903.24845-6-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:18 -08:00
David Hildenbrand 795c230604 riscv/vdso: don't clear PG_reserved
The VDSO is part of the kernel image and therefore the struct pages are
marked as reserved during boot.

As we install a special mapping, the actual struct pages will never be
exposed to MM via the page tables.  We can therefore leave the pages
marked as reserved.

Link: http://lkml.kernel.org/r/20190114125903.24845-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:18 -08:00
David Hildenbrand f55b74170b powerpc/vdso: don't clear PG_reserved
The VDSO is part of the kernel image and therefore the struct pages are
marked as reserved during boot.

As we install a special mapping, the actual struct pages will never be
exposed to MM via the page tables.  We can therefore leave the pages
marked as reserved.

Link: http://lkml.kernel.org/r/20190114125903.24845-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:18 -08:00
David Hildenbrand 446d29645b s390/vdso: don't clear PG_reserved
The VDSO is part of the kernel image and therefore the struct pages are
marked as reserved during boot.

As we install a special mapping, the actual struct pages will never be
exposed to MM via the page tables.  We can therefore leave the pages
marked as reserved.

Link: http://lkml.kernel.org/r/20190114125903.24845-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:18 -08:00
Aneesh Kumar K.V 8ef5cbde6d arch/powerpc/mm/hugetlb: NestMMU workaround for hugetlb mprotect RW upgrade
NestMMU requires us to mark the pte invalid and flush the tlb when we do
a RW upgrade of pte.  We fixed a variant of this in the fault path in
bd5050e38a ("powerpc/mm/radix: Change pte relax sequence to handle
nest MMU hang").

Link: http://lkml.kernel.org/r/20190116085035.29729-6-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:18 -08:00
Aneesh Kumar K.V 5b323367ef arch/powerpc/mm: Nest MMU workaround for mprotect RW upgrade
NestMMU requires us to mark the pte invalid and flush the tlb when we do
a RW upgrade of pte.  We fixed a variant of this in the fault path in
bd5050e38a ("powerpc/mm/radix: Change pte relax sequence to handle
nest MMU hang").

Do the same for mprotect upgrades.

Hugetlb is handled in the next patch.

Link: http://lkml.kernel.org/r/20190116085035.29729-4-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:18 -08:00
Aneesh Kumar K.V 04a8645304 mm: update ptep_modify_prot_commit to take old pte value as arg
Architectures like ppc64 require to do a conditional tlb flush based on
the old and new value of pte.  Enable that by passing old pte value as
the arg.

Link: http://lkml.kernel.org/r/20190116085035.29729-3-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:18 -08:00
Aneesh Kumar K.V 0cbe3e26ab mm: update ptep_modify_prot_start/commit to take vm_area_struct as arg
Patch series "NestMMU pte upgrade workaround for mprotect", v5.

We can upgrade pte access (R -> RW transition) via mprotect.  We need to
make sure we follow the recommended pte update sequence as outlined in
commit bd5050e38a ("powerpc/mm/radix: Change pte relax sequence to
handle nest MMU hang") for such updates.  This patch series does that.

This patch (of 5):

Some architectures may want to call flush_tlb_range from these helpers.

Link: http://lkml.kernel.org/r/20190116085035.29729-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:18 -08:00
Anshuman Khandual 5480280d3f arm64/mm: enable HugeTLB migration for contiguous bit HugeTLB pages
Let arm64 subscribe to the previously added framework in which
architecture can inform whether a given huge page size is supported for
migration.  This just overrides the default function
arch_hugetlb_migration_supported() and enables migration for all
possible HugeTLB page sizes on arm64.

With this, HugeTLB migration support on arm64 now covers all possible
HugeTLB options.

          CONT PTE    PMD    CONT PMD    PUD
          --------    ---    --------    ---
  4K:        64K      2M        32M      1G
  16K:        2M     32M         1G
  64K:        2M    512M        16G

Link: http://lkml.kernel.org/r/1545121450-1663-6-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:15 -08:00
Anshuman Khandual 4a03a058d1 arm64/mm: enable HugeTLB migration
Let arm64 subscribe to generic HugeTLB page migration framework.  Right
now this only works on the following PMD and PUD level HugeTLB page
sizes with various kernel base page size combinations.

         CONT PTE    PMD    CONT PMD    PUD
         --------    ---    --------    ---
  4K:         NA     2M         NA      1G
  16K:        NA    32M         NA
  64K:        NA   512M         NA

Link: http://lkml.kernel.org/r/1545121450-1663-5-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:15 -08:00
Anshuman Khandual 98fa15f34c mm: replace all open encodings for NUMA_NO_NODE
Patch series "Replace all open encodings for NUMA_NO_NODE", v3.

All these places for replacement were found by running the following
grep patterns on the entire kernel code.  Please let me know if this
might have missed some instances.  This might also have replaced some
false positives.  I will appreciate suggestions, inputs and review.

1. git grep "nid == -1"
2. git grep "node == -1"
3. git grep "nid = -1"
4. git grep "node = -1"

This patch (of 2):

At present there are multiple places where invalid node number is
encoded as -1.  Even though implicitly understood it is always better to
have macros in there.  Replace these open encodings for an invalid node
number with the global macro NUMA_NO_NODE.  This helps remove NUMA
related assumptions like 'invalid node' from various places redirecting
them to a common definition.

Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	[ixgbe]
Acked-by: Jens Axboe <axboe@kernel.dk>			[mtip32xx]
Acked-by: Vinod Koul <vkoul@kernel.org>			[dmaengine.c]
Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
Acked-by: Doug Ledford <dledford@redhat.com>		[drivers/infiniband]
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:14 -08:00
Firoz Khan 685536921f sh: remove nargs from __SYSCALL
The __SYSCALL macro's arguments are system call number, system call
entry name and number of arguments for the system call.

Argument- nargs in __SYSCALL(nr, entry, nargs) is neither calculated nor
used anywhere.  So it would be better to keep the implementation as
__SYSCALL(nr, entry).  This unifies the implementation with some other
architectures too.

Link: http://lkml.kernel.org/r/1546443445-21075-2-git-send-email-firoz.khan@linaro.org
Signed-off-by: Firoz Khan <firoz.khan@linaro.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Simon Horman <horms+renesas@verge.net.au>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:13 -08:00
Andrey Ryabinin 7771bdbbfd kasan: remove use after scope bugs detection.
Use after scope bugs detector seems to be almost entirely useless for
the linux kernel.  It exists over two years, but I've seen only one
valid bug so far [1].  And the bug was fixed before it has been
reported.  There were some other use-after-scope reports, but they were
false-positives due to different reasons like incompatibility with
structleak plugin.

This feature significantly increases stack usage, especially with GCC <
9 version, and causes a 32K stack overflow.  It probably adds
performance penalty too.

Given all that, let's remove use-after-scope detector entirely.

While preparing this patch I've noticed that we mistakenly enable
use-after-scope detection for clang compiler regardless of
CONFIG_KASAN_EXTRA setting.  This is also fixed now.

[1] http://lkml.kernel.org/r/<20171129052106.rhgbjhhis53hkgfn@wfg-t540p.sh.intel.com>

Link: http://lkml.kernel.org/r/20190111185842.13978-1-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Will Deacon <will.deacon@arm.com>		[arm64]
Cc: Qian Cai <cai@lca.pw>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:13 -08:00
Linus Torvalds b1b988a6a0 Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull year 2038 updates from Thomas Gleixner:
 "Another round of changes to make the kernel ready for 2038. After lots
  of preparatory work this is the first set of syscalls which are 2038
  safe:

    403 clock_gettime64
    404 clock_settime64
    405 clock_adjtime64
    406 clock_getres_time64
    407 clock_nanosleep_time64
    408 timer_gettime64
    409 timer_settime64
    410 timerfd_gettime64
    411 timerfd_settime64
    412 utimensat_time64
    413 pselect6_time64
    414 ppoll_time64
    416 io_pgetevents_time64
    417 recvmmsg_time64
    418 mq_timedsend_time64
    419 mq_timedreceiv_time64
    420 semtimedop_time64
    421 rt_sigtimedwait_time64
    422 futex_time64
    423 sched_rr_get_interval_time64

  The syscall numbers are identical all over the architectures"

* 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  riscv: Use latest system call ABI
  checksyscalls: fix up mq_timedreceive and stat exceptions
  unicore32: Fix __ARCH_WANT_STAT64 definition
  asm-generic: Make time32 syscall numbers optional
  asm-generic: Drop getrlimit and setrlimit syscalls from default list
  32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
  compat ABI: use non-compat openat and open_by_handle_at variants
  y2038: add 64-bit time_t syscalls to all 32-bit architectures
  y2038: rename old time and utime syscalls
  y2038: remove struct definition redirects
  y2038: use time32 syscall names on 32-bit
  syscalls: remove obsolete __IGNORE_ macros
  y2038: syscalls: rename y2038 compat syscalls
  x86/x32: use time64 versions of sigtimedwait and recvmmsg
  timex: change syscalls to use struct __kernel_timex
  timex: use __kernel_timex internally
  sparc64: add custom adjtimex/clock_adjtime functions
  time: fix sys_timer_settime prototype
  time: Add struct __kernel_timex
  time: make adjtime compat handling available for 32 bit
  ...
2019-03-05 14:08:26 -08:00
Linus Torvalds edaed168e1 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/pti update from Thomas Gleixner:
 "Just a single change from the anti-performance departement:

   - Add a new PR_SPEC_DISABLE_NOEXEC option which allows to apply the
     speculation protections on a process without inheriting the state
     on exec.

     This remedies a situation where a Java-launcher has speculation
     protections enabled because that's the default for JVMs which
     causes the launched regular harmless processes to inherit the
     protection state which results in unintended performance
     degradation"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Add PR_SPEC_DISABLE_NOEXEC
2019-03-05 12:50:34 -08:00
Linus Torvalds d9862cfbe2 Here's the main MIPS pull request for v5.1:
- Support for the MIPSr6 MemoryMapID register & Global INValidate TLB
   (GINVT) instructions, allowing for more efficient TLB maintenance when
   running on a CPU such as the I6500 that supports these.
 
 - Enable huge page support for MIPS64r6.
 
 - Optimize post-DMA cache sync by removing that code entirely for kernel
   configurations in which we know it won't be needed.
 
 - The number of pages allocated for interrupt stacks is now calculated
   correctly, where before we would wastefully allocate too much memory
   in some configurations.
 
 - The ath79 platform migrates to devicetree.
 
 - The bcm47xx platform sees fixes for the Buffalo WHR-G54S board.
 
 - The ingenic/jz4740 platform gains support for appended devicetrees.
 
 - The cavium_octeon, lantiq, loongson32 & sgi-ip27 platforms all see
   cleanups as do various pieces of core architecture code.
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQRgLjeFAZEXQzy86/s+p5+stXUA3QUCXH3BQxUccGF1bC5idXJ0
 b25AbWlwcy5jb20ACgkQPqefrLV1AN1+4wD+Oh4JTfZN/NEOQMlrSkXxjEHqjX3u
 1Y6CiiPCs+q2UnYBANb+ic+ZH5MnvJxxmcvlYI2q3rIh4b8TDriip4KMUTUP
 =Sw9X
 -----END PGP SIGNATURE-----

Merge tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS updates from Paul Burton:

 - Support for the MIPSr6 MemoryMapID register & Global INValidate TLB
   (GINVT) instructions, allowing for more efficient TLB maintenance
   when running on a CPU such as the I6500 that supports these.

 - Enable huge page support for MIPS64r6.

 - Optimize post-DMA cache sync by removing that code entirely for
   kernel configurations in which we know it won't be needed.

 - The number of pages allocated for interrupt stacks is now calculated
   correctly, where before we would wastefully allocate too much memory
   in some configurations.

 - The ath79 platform migrates to devicetree.

 - The bcm47xx platform sees fixes for the Buffalo WHR-G54S board.

 - The ingenic/jz4740 platform gains support for appended devicetrees.

 - The cavium_octeon, lantiq, loongson32 & sgi-ip27 platforms all see
   cleanups as do various pieces of core architecture code.

* tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (66 commits)
  MIPS: lantiq: Remove separate GPHY Firmware loader
  MIPS: ingenic: Add support for appended devicetree
  MIPS: SGI-IP27: rework HUB interrupts
  MIPS: SGI-IP27: do boot CPU init later
  MIPS: SGI-IP27: do xtalk scanning later
  MIPS: SGI-IP27: use pr_info/pr_emerg and pr_cont to fix output
  MIPS: SGI-IP27: clean up bridge access and header files
  MIPS: SGI-IP27: get rid of volatile and hubreg_t
  MIPS: irq: Allocate accurate order pages for irq stack
  MIPS: dma-noncoherent: Remove bogus condition in dma_sync_phys()
  MIPS: eBPF: Remove REG_32BIT_ZERO_EX
  MIPS: eBPF: Always return sign extended 32b values
  MIPS: CM: Fix indentation
  MIPS: BCM47XX: Fix/improve Buffalo WHR-G54S support
  MIPS: OCTEON: program rx/tx-delay always from DT
  MIPS: OCTEON: delete board-specific link status
  MIPS: OCTEON: don't lie about interface type of CN3005 board
  MIPS: OCTEON: warn if deprecated link status is being used
  MIPS: OCTEON: add fixed-link nodes to in-kernel device tree
  MIPS: Delete unused flush_cache_sigtramp()
  ...
2019-03-05 11:28:25 -08:00
Linus Torvalds 8feed3efa8 Merge branch 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "The most important changes in this patch set are:

   - DMA-related cleanups for parisc with the aim to move anything not
     required by drivers out of <asm/dma-mapping.h>, by Christoph
     Hellwig

   - Switch to memblock_alloc(), by Mike Rapoport

   - Makefile cleanups by Masahiro Yamada

   - Switch to bust_spinlocks(), by Sergey Senozhatsky

   - Improved initial SMP affinity selection for IRQs

   - Added IPI- and rescheduling interrupts in /proc/interrupts output"

* 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (21 commits)
  parisc: use memblock_alloc() instead of custom get_memblock()
  parisc: Add constants for various PDC firmware calls
  parisc: Add constant for PDC_PAT_COMPLEX firmware call
  parisc: Show machine product number during boot
  parisc: Add constants for PDC_RELOCATE PDC call
  parisc: Add PDC_CRASH_PREP PDC function number
  parisc: Use F_EXTEND() macro in iosapic code
  parisc: remove the HBA_DATA macro
  parisc/lba_pci: use container_of in LBA_DEV
  parisc/dino: use container_of in DINO_DEV
  parisc: properly type the return value of parisc_walk_tree
  parisc: properly type the iommu field in struct pci_hba_data
  parisc: turn GET_IOC into an inline function
  parisc: move internal implementation details out of <asm/dma-mapping.h>
  parisc: don't include <asm/cacheflush.h> in <asm/dma-mapping.h>
  parisc: remove meaningless ccflags-y in arch/parisc/boot/Makefile
  parisc: replace oops_in_progress manipulation with bust_spinlocks()
  parisc: Improve initial IRQ to CPU assignment
  parisc: Count IPI function call interrupts
  parisc: Show rescheduling interrupts on SMP machines only
  ...
2019-03-05 11:17:23 -08:00
Linus Torvalds 3591b19511 s390 updates for the 5.1 merge window
- A copy of Arnds compat wrapper generation series
 
  - Pass information about the KVM guest to the host in form the control
    program code and the control program version code
 
  - Map IOV resources to support PCI physical functions on s390
 
  - Add vector load and store alignment hints to improve performance
 
  - Use the "jdd" constraint with gcc 9 to make jump labels working again
 
  - Remove amode workaround for old z/VM releases from the DCSS code
 
  - Add support for in-kernel performance measurements using the
    CPU measurement counter facility
 
  - Introduce a new PMU device cpum_cf_diag to capture counters and
    store thenn as event raw data.
 
  - Bug fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJcfh4QAAoJEDjwexyKj9rgXVAH/RzVbi3vznldujSNfCFTZKPu
 EmFFAZIfbhifW3szfylyOJL52pFhxjcWzY0hkFEkbs2t90sn8l1BNkDscYZtfNHC
 XvN3N9LsHyxOeyxvQuWLSio58qm+Lr1L0UrIhbMvqyAVkOLmIHvybFwi83OkMptm
 djoL8NbuNsAA2s26y2bZLNtU7FmOW5smJIlnt7H4dmK4SFylqZKS/EnUZxGDgn+7
 UrrTTOQUir0QZ8vraANsP1M0/LqPcd2YusLmj4jOdZ5Muc2Ch2AA991FofqdKShO
 /8cGlsIzwHWGgdnP/YDea5gbetvonayYduixKy3EnYpWQ9iogiBjH4G7QNxcncs=
 =v26J
 -----END PGP SIGNATURE-----

Merge tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Martin Schwidefsky:

 - A copy of Arnds compat wrapper generation series

 - Pass information about the KVM guest to the host in form the control
   program code and the control program version code

 - Map IOV resources to support PCI physical functions on s390

 - Add vector load and store alignment hints to improve performance

 - Use the "jdd" constraint with gcc 9 to make jump labels working again

 - Remove amode workaround for old z/VM releases from the DCSS code

 - Add support for in-kernel performance measurements using the CPU
   measurement counter facility

 - Introduce a new PMU device cpum_cf_diag to capture counters and store
   thenn as event raw data.

 - Bug fixes and cleanups

* tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits)
  Revert "s390/cpum_cf: Add kernel message exaplanations"
  s390/dasd: fix read device characteristic with CONFIG_VMAP_STACK=y
  s390/suspend: fix prefix register reset in swsusp_arch_resume
  s390: warn about clearing als implied facilities
  s390: allow overriding facilities via command line
  s390: clean up redundant facilities list setup
  s390/als: remove duplicated in-place implementation of stfle
  s390/cio: Use cpa range elsewhere within vfio-ccw
  s390/cio: Fix vfio-ccw handling of recursive TICs
  s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem
  s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation
  s390/cpum_cf: Add kernel message exaplanations
  s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace
  s390/cpum_cf: add ctr_stcctm() function
  s390/cpum_cf: move common functions into a separate file
  s390/cpum_cf: introduce kernel_cpumcf_avail() function
  s390/cpu_mf: replace stcctm5() with the stcctm() function
  s390/cpu_mf: add store cpu counter multiple instruction support
  s390/cpum_cf: Add minimal in-kernel interface for counter measurements
  s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts
  ...
2019-03-05 11:13:10 -08:00
Linus Torvalds 45f5532a2f m68k updates for v5.1
- VLA removal,
   - Gcc-8.x build fixes,
   - Small improvements and cleanups,
   - Defconfig updates.
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQQ9qaHoIs/1I4cXmEiKwlD9ZEnxcAUCXHk4BBUcZ2VlcnRAbGlu
 dXgtbTY4ay5vcmcACgkQisJQ/WRJ8XAF4wEAn31Bsn7R8Szn/TtilxBL0GQV3eeC
 rGtX96FrGurDSlYA/1RmgYffJsb/Rlia9ejb8/rWPsEEZn8ngZ2T2YjRHwUA
 =665T
 -----END PGP SIGNATURE-----

Merge tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k updates from Geert Uytterhoeven:

 - VLA removal

 - gcc-8.x build fixes

 - small improvements and cleanups

 - defconfig updates

* tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: Add -ffreestanding to CFLAGS
  m68k/apollo: Fix comment in Makefile
  dio: Fix buffer overflow in case of unknown board
  m68k/defconfig: Update defconfigs for v5.0-rc1
  m68k/atari: Avoid VLA use in atari_switches_setup()
  m68k: Avoid VLA use in mangle_kernel_stack()
  m68k/mac: Use '030 reset method on SE/30
  m68k/mac: Remove obsolete comment
  m68k/mac: Skip VIA port setup unless RTC is connected
  m68k/mac: Clean up unused timer definitions
  m68k/defconfig: Drop NET_VENDOR_<FOO>=n
2019-03-05 11:02:12 -08:00
Borislav Petkov eac6165570 x86: Deprecate a.out support
Linux supports ELF binaries for ~25 years now.  a.out coredumping has
bitrotten quite significantly and would need some fixing to get it into
shape again but considering how even the toolchains cannot create a.out
executables in its default configuration, let's deprecate a.out support
and remove it a couple of releases later, instead.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Richard Weinberger <richard@nod.at>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Jann Horn <jannh@google.com>
Cc: <linux-api@vger.kernel.org>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: lkml <linux-kernel@vger.kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <x86@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 10:39:38 -08:00
Linus Torvalds 08300f4402 a.out: remove core dumping support
We're (finally) phasing out a.out support for good.  As Borislav Petkov
points out, we've supported ELF binaries for about 25 years by now, and
coredumping in particular has bitrotted over the years.

None of the tool chains even support generating a.out binaries any more,
and the plan is to deprecate a.out support entirely for the kernel.  But
I want to start with just removing the core dumping code, because I can
still imagine that somebody actually might want to support a.out as a
simpler biinary format.

Particularly if you generate some random binaries on the fly, ELF is a
much more complicated format (admittedly ELF also does have a lot of
toolchain support, mitigating that complexity a lot and you really
should have moved over in the last 25 years).

So it's at least somewhat possible that somebody out there has some
workflow that still involves generating and running a.out executables.

In contrast, it's very unlikely that anybody depends on debugging any
legacy a.out core files.  But regardless, I want this phase-out to be
done in two steps, so that we can resurrect a.out support (if needed)
without having to resurrect the core file dumping that is almost
certainly not needed.

Jann Horn pointed to the <asm/a.out-core.h> file that my first trivial
cut at this had missed.

And Alan Cox points out that the a.out binary loader _could_ be done in
user space if somebody wants to, but we might keep just the loader in
the kernel if somebody really wants it, since the loader isn't that big
and has no really odd special cases like the core dumping does.

Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Jann Horn <jannh@google.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 10:00:35 -08:00
Linus Torvalds 63bdf4284c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "API:
   - Add helper for simple skcipher modes.
   - Add helper to register multiple templates.
   - Set CRYPTO_TFM_NEED_KEY when setkey fails.
   - Require neither or both of export/import in shash.
   - AEAD decryption test vectors are now generated from encryption
     ones.
   - New option CONFIG_CRYPTO_MANAGER_EXTRA_TESTS that includes random
     fuzzing.

  Algorithms:
   - Conversions to skcipher and helper for many templates.
   - Add more test vectors for nhpoly1305 and adiantum.

  Drivers:
   - Add crypto4xx prng support.
   - Add xcbc/cmac/ecb support in caam.
   - Add AES support for Exynos5433 in s5p.
   - Remove sha384/sha512 from artpec7 as hardware cannot do partial
     hash"

[ There is a merge of the Freescale SoC tree in order to pull in changes
  required by patches to the caam/qi2 driver. ]

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (174 commits)
  crypto: s5p - add AES support for Exynos5433
  dt-bindings: crypto: document Exynos5433 SlimSSS
  crypto: crypto4xx - add missing of_node_put after of_device_is_available
  crypto: cavium/zip - fix collision with generic cra_driver_name
  crypto: af_alg - use struct_size() in sock_kfree_s()
  crypto: caam - remove redundant likely/unlikely annotation
  crypto: s5p - update iv after AES-CBC op end
  crypto: x86/poly1305 - Clear key material from stack in SSE2 variant
  crypto: caam - generate hash keys in-place
  crypto: caam - fix DMA mapping xcbc key twice
  crypto: caam - fix hash context DMA unmap size
  hwrng: bcm2835 - fix probe as platform device
  crypto: s5p-sss - Use AES_BLOCK_SIZE define instead of number
  crypto: stm32 - drop pointless static qualifier in stm32_hash_remove()
  crypto: chelsio - Fixed Traffic Stall
  crypto: marvell - Remove set but not used variable 'ivsize'
  crypto: ccp - Update driver messages to remove some confusion
  crypto: adiantum - add 1536 and 4096-byte test vectors
  crypto: nhpoly1305 - add a test vector with len % 16 != 0
  crypto: arm/aes-ce - update IV after partial final CTR block
  ...
2019-03-05 09:09:55 -08:00
Linus Torvalds 6456300356 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Here we go, another merge window full of networking and #ebpf changes:

   1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP
      range without dealing with floods of ARP traffic, from Linus
      Lüssing.

   2) Throttle buffered multicast packet transmission in mt76, from
      Felix Fietkau.

   3) Support adaptive interrupt moderation in ice, from Brett Creeley.

   4) A lot of struct_size conversions, from Gustavo A. R. Silva.

   5) Add peek/push/pop commands to bpftool, as well as bash completion,
      from Stanislav Fomichev.

   6) Optimize sk_msg_clone(), from Vakul Garg.

   7) Add SO_BINDTOIFINDEX, from David Herrmann.

   8) Be more conservative with local resends due to local congestion,
      from Yuchung Cheng.

   9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata.

  10) Add health buffer support to devlink, from Eran Ben Elisha.

  11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen.

  12) Add statistics to basic packet scheduler filter, from Cong Wang.

  13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan.

  14) Lots of new IP tunneling forwarding tests, also from Nir Dotan.

  15) Add 3ad stats to bonding, from Nikolay Aleksandrov.

  16) Lots of probing improvements for bpftool, from Quentin Monnet.

  17) Various nfp drive #ebpf JIT improvements from Jakub Kicinski.

  18) Allow #ebpf programs to access gso_segs from skb shared info, from
      Eric Dumazet.

  19) Add sock_diag support for AF_XDP sockets, from Björn Töpel.

  20) Support 22260 iwlwifi devices, from Luca Coelho.

  21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov.

  22) Add JMP32 instruction class support to #ebpf, from Jiong Wang.

  23) Add spinlock support to #ebpf, from Alexei Starovoitov.

  24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson.

  25) Add device infomation API to devlink, from Jakub Kicinski.

  26) Add new timestamping socket options which are y2038 safe, from
      Deepa Dinamani.

  27) Add RX checksum offloading for various sh_eth chips, from Sergei
      Shtylyov.

  28) Flow offload infrastructure, from Pablo Neira Ayuso.

  29) Numerous cleanups, improvements, and bug fixes to the PHY layer
      and many drivers from Heiner Kallweit.

  30) Lots of changes to try and make packet scheduler classifiers run
      lockless as much as possible, from Vlad Buslov.

  31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows.

  32) Add concurrency tests to tc-tests infrastructure, from Vlad
      Buslov.

  33) Add hwmon support to aquantia, from Heiner Kallweit.

  34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet.

  And I would be remiss if I didn't thank the various major networking
  subsystem maintainers for integrating much of this work before I even
  saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso,
  Johannes Berg, Kalle Valo, and many others. Thank you!"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits)
  net/sched: avoid unused-label warning
  net: ignore sysctl_devconf_inherit_init_net without SYSCTL
  phy: mdio-mux: fix Kconfig dependencies
  net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg
  net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework
  selftest/net: Remove duplicate header
  sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
  net/mlx5e: Update tx reporter status in case channels were successfully opened
  devlink: Add support for direct reporter health state update
  devlink: Update reporter state to error even if recover aborted
  sctp: call iov_iter_revert() after sending ABORT
  team: Free BPF filter when unregistering netdev
  ip6mr: Do not call __IP6_INC_STATS() from preemptible context
  isdn: mISDN: Fix potential NULL pointer dereference of kzalloc
  net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs
  cxgb4/chtls: Prefix adapter flags with CXGB4
  net-sysfs: Switch to bitmap_zalloc()
  mellanox: Switch to bitmap_zalloc()
  bpf: add test cases for non-pointer sanitiation logic
  mlxsw: i2c: Extend initialization by querying resources data
  ...
2019-03-05 08:26:13 -08:00
Christian Brauner 3eb39f4793
signal: add pidfd_send_signal() syscall
The kill() syscall operates on process identifiers (pid). After a process
has exited its pid can be reused by another process. If a caller sends a
signal to a reused pid it will end up signaling the wrong process. This
issue has often surfaced and there has been a push to address this problem [1].

This patch uses file descriptors (fd) from proc/<pid> as stable handles on
struct pid. Even if a pid is recycled the handle will not change. The fd
can be used to send signals to the process it refers to.
Thus, the new syscall pidfd_send_signal() is introduced to solve this
problem. Instead of pids it operates on process fds (pidfd).

/* prototype and argument /*
long pidfd_send_signal(int pidfd, int sig, siginfo_t *info, unsigned int flags);

/* syscall number 424 */
The syscall number was chosen to be 424 to align with Arnd's rework in his
y2038 to minimize merge conflicts (cf. [25]).

In addition to the pidfd and signal argument it takes an additional
siginfo_t and flags argument. If the siginfo_t argument is NULL then
pidfd_send_signal() is equivalent to kill(<positive-pid>, <signal>). If it
is not NULL pidfd_send_signal() is equivalent to rt_sigqueueinfo().
The flags argument is added to allow for future extensions of this syscall.
It currently needs to be passed as 0. Failing to do so will cause EINVAL.

/* pidfd_send_signal() replaces multiple pid-based syscalls */
The pidfd_send_signal() syscall currently takes on the job of
rt_sigqueueinfo(2) and parts of the functionality of kill(2), Namely, when a
positive pid is passed to kill(2). It will however be possible to also
replace tgkill(2) and rt_tgsigqueueinfo(2) if this syscall is extended.

/* sending signals to threads (tid) and process groups (pgid) */
Specifically, the pidfd_send_signal() syscall does currently not operate on
process groups or threads. This is left for future extensions.
In order to extend the syscall to allow sending signal to threads and
process groups appropriately named flags (e.g. PIDFD_TYPE_PGID, and
PIDFD_TYPE_TID) should be added. This implies that the flags argument will
determine what is signaled and not the file descriptor itself. Put in other
words, grouping in this api is a property of the flags argument not a
property of the file descriptor (cf. [13]). Clarification for this has been
requested by Eric (cf. [19]).
When appropriate extensions through the flags argument are added then
pidfd_send_signal() can additionally replace the part of kill(2) which
operates on process groups as well as the tgkill(2) and
rt_tgsigqueueinfo(2) syscalls.
How such an extension could be implemented has been very roughly sketched
in [14], [15], and [16]. However, this should not be taken as a commitment
to a particular implementation. There might be better ways to do it.
Right now this is intentionally left out to keep this patchset as simple as
possible (cf. [4]).

/* naming */
The syscall had various names throughout iterations of this patchset:
- procfd_signal()
- procfd_send_signal()
- taskfd_send_signal()
In the last round of reviews it was pointed out that given that if the
flags argument decides the scope of the signal instead of different types
of fds it might make sense to either settle for "procfd_" or "pidfd_" as
prefix. The community was willing to accept either (cf. [17] and [18]).
Given that one developer expressed strong preference for the "pidfd_"
prefix (cf. [13]) and with other developers less opinionated about the name
we should settle for "pidfd_" to avoid further bikeshedding.

The  "_send_signal" suffix was chosen to reflect the fact that the syscall
takes on the job of multiple syscalls. It is therefore intentional that the
name is not reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the
fomer because it might imply that pidfd_send_signal() is a replacement for
kill(2), and not the latter because it is a hassle to remember the correct
spelling - especially for non-native speakers - and because it is not
descriptive enough of what the syscall actually does. The name
"pidfd_send_signal" makes it very clear that its job is to send signals.

/* zombies */
Zombies can be signaled just as any other process. No special error will be
reported since a zombie state is an unreliable state (cf. [3]). However,
this can be added as an extension through the @flags argument if the need
ever arises.

/* cross-namespace signals */
The patch currently enforces that the signaler and signalee either are in
the same pid namespace or that the signaler's pid namespace is an ancestor
of the signalee's pid namespace. This is done for the sake of simplicity
and because it is unclear to what values certain members of struct
siginfo_t would need to be set to (cf. [5], [6]).

/* compat syscalls */
It became clear that we would like to avoid adding compat syscalls
(cf. [7]).  The compat syscall handling is now done in kernel/signal.c
itself by adding __copy_siginfo_from_user_generic() which lets us avoid
compat syscalls (cf. [8]). It should be noted that the addition of
__copy_siginfo_from_user_any() is caused by a bug in the original
implementation of rt_sigqueueinfo(2) (cf. 12).
With upcoming rework for syscall handling things might improve
significantly (cf. [11]) and __copy_siginfo_from_user_any() will not gain
any additional callers.

/* testing */
This patch was tested on x64 and x86.

/* userspace usage */
An asciinema recording for the basic functionality can be found under [9].
With this patch a process can be killed via:

 #define _GNU_SOURCE
 #include <errno.h>
 #include <fcntl.h>
 #include <signal.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <sys/stat.h>
 #include <sys/syscall.h>
 #include <sys/types.h>
 #include <unistd.h>

 static inline int do_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
                                         unsigned int flags)
 {
 #ifdef __NR_pidfd_send_signal
         return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
 #else
         return -ENOSYS;
 #endif
 }

 int main(int argc, char *argv[])
 {
         int fd, ret, saved_errno, sig;

         if (argc < 3)
                 exit(EXIT_FAILURE);

         fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
         if (fd < 0) {
                 printf("%s - Failed to open \"%s\"\n", strerror(errno), argv[1]);
                 exit(EXIT_FAILURE);
         }

         sig = atoi(argv[2]);

         printf("Sending signal %d to process %s\n", sig, argv[1]);
         ret = do_pidfd_send_signal(fd, sig, NULL, 0);

         saved_errno = errno;
         close(fd);
         errno = saved_errno;

         if (ret < 0) {
                 printf("%s - Failed to send signal %d to process %s\n",
                        strerror(errno), sig, argv[1]);
                 exit(EXIT_FAILURE);
         }

         exit(EXIT_SUCCESS);
 }

/* Q&A
 * Given that it seems the same questions get asked again by people who are
 * late to the party it makes sense to add a Q&A section to the commit
 * message so it's hopefully easier to avoid duplicate threads.
 *
 * For the sake of progress please consider these arguments settled unless
 * there is a new point that desperately needs to be addressed. Please make
 * sure to check the links to the threads in this commit message whether
 * this has not already been covered.
 */
Q-01: (Florian Weimer [20], Andrew Morton [21])
      What happens when the target process has exited?
A-01: Sending the signal will fail with ESRCH (cf. [22]).

Q-02:  (Andrew Morton [21])
       Is the task_struct pinned by the fd?
A-02:  No. A reference to struct pid is kept. struct pid - as far as I
       understand - was created exactly for the reason to not require to
       pin struct task_struct (cf. [22]).

Q-03: (Andrew Morton [21])
      Does the entire procfs directory remain visible? Just one entry
      within it?
A-03: The same thing that happens right now when you hold a file descriptor
      to /proc/<pid> open (cf. [22]).

Q-04: (Andrew Morton [21])
      Does the pid remain reserved?
A-04: No. This patchset guarantees a stable handle not that pids are not
      recycled (cf. [22]).

Q-05: (Andrew Morton [21])
      Do attempts to signal that fd return errors?
A-05: See {Q,A}-01.

Q-06: (Andrew Morton [22])
      Is there a cleaner way of obtaining the fd? Another syscall perhaps.
A-06: Userspace can already trivially retrieve file descriptors from procfs
      so this is something that we will need to support anyway. Hence,
      there's no immediate need to add another syscalls just to make
      pidfd_send_signal() not dependent on the presence of procfs. However,
      adding a syscalls to get such file descriptors is planned for a
      future patchset (cf. [22]).

Q-07: (Andrew Morton [21] and others)
      This fd-for-a-process sounds like a handy thing and people may well
      think up other uses for it in the future, probably unrelated to
      signals. Are the code and the interface designed to permit such
      future applications?
A-07: Yes (cf. [22]).

Q-08: (Andrew Morton [21] and others)
      Now I think about it, why a new syscall? This thing is looking
      rather like an ioctl?
A-08: This has been extensively discussed. It was agreed that a syscall is
      preferred for a variety or reasons. Here are just a few taken from
      prior threads. Syscalls are safer than ioctl()s especially when
      signaling to fds. Processes are a core kernel concept so a syscall
      seems more appropriate. The layout of the syscall with its four
      arguments would require the addition of a custom struct for the
      ioctl() thereby causing at least the same amount or even more
      complexity for userspace than a simple syscall. The new syscall will
      replace multiple other pid-based syscalls (see description above).
      The file-descriptors-for-processes concept introduced with this
      syscall will be extended with other syscalls in the future. See also
      [22], [23] and various other threads already linked in here.

Q-09: (Florian Weimer [24])
      What happens if you use the new interface with an O_PATH descriptor?
A-09:
      pidfds opened as O_PATH fds cannot be used to send signals to a
      process (cf. [2]). Signaling processes through pidfds is the
      equivalent of writing to a file. Thus, this is not an operation that
      operates "purely at the file descriptor level" as required by the
      open(2) manpage. See also [4].

/* References */
[1]:  https://lore.kernel.org/lkml/20181029221037.87724-1-dancol@google.com/
[2]:  https://lore.kernel.org/lkml/874lbtjvtd.fsf@oldenburg2.str.redhat.com/
[3]:  https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6fhva@brauner.io/
[4]:  https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru2ku@brauner.io/
[5]:  https://lore.kernel.org/lkml/20181121213946.GA10795@mail.hallyn.com/
[6]:  https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6nv4@brauner.io/
[7]:  https://lore.kernel.org/lkml/36323361-90BD-41AF-AB5B-EE0D7BA02C21@amacapital.net/
[8]:  https://lore.kernel.org/lkml/87tvjxp8pc.fsf@xmission.com/
[9]:  https://asciinema.org/a/IQjuCHew6bnq1cr78yuMv16cy
[11]: https://lore.kernel.org/lkml/F53D6D38-3521-4C20-9034-5AF447DF62FF@amacapital.net/
[12]: https://lore.kernel.org/lkml/87zhtjn8ck.fsf@xmission.com/
[13]: https://lore.kernel.org/lkml/871s6u9z6u.fsf@xmission.com/
[14]: https://lore.kernel.org/lkml/20181206231742.xxi4ghn24z4h2qki@brauner.io/
[15]: https://lore.kernel.org/lkml/20181207003124.GA11160@mail.hallyn.com/
[16]: https://lore.kernel.org/lkml/20181207015423.4miorx43l3qhppfz@brauner.io/
[17]: https://lore.kernel.org/lkml/CAGXu5jL8PciZAXvOvCeCU3wKUEB_dU-O3q0tDw4uB_ojMvDEew@mail.gmail.com/
[18]: https://lore.kernel.org/lkml/20181206222746.GB9224@mail.hallyn.com/
[19]: https://lore.kernel.org/lkml/20181208054059.19813-1-christian@brauner.io/
[20]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[21]: https://lore.kernel.org/lkml/20181228152012.dbf0508c2508138efc5f2bbe@linux-foundation.org/
[22]: https://lore.kernel.org/lkml/20181228233725.722tdfgijxcssg76@brauner.io/
[23]: https://lwn.net/Articles/773459/
[24]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[25]: https://lore.kernel.org/lkml/CAK8P3a0ej9NcJM8wXNPbcGUyOUZYX+VLoDFdbenW3s3114oQZw@mail.gmail.com/

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Aleksa Sarai <cyphar@cyphar.com>
2019-03-05 17:03:53 +01:00
Steven Rostedt (VMware) 745cfeaac0 x86/ftrace: Fix warning and considate ftrace_jmp_replace() and ftrace_call_replace()
Arnd reported the following compiler warning:

arch/x86/kernel/ftrace.c:669:23: error: 'ftrace_jmp_replace' defined but not used [-Werror=unused-function]

The ftrace_jmp_replace() function now only has a single user and should be
simply moved by that user. But looking at the code, it shows that
ftrace_jmp_replace() is similar to ftrace_call_replace() except that instead
of using the opcode of 0xe8 it uses 0xe9. It makes more sense to consolidate
that function into one implementation that both ftrace_jmp_replace() and
ftrace_call_replace() use by passing in the op code separate.

The structure in ftrace_code_union is also modified to replace the "e8"
field with the more appropriate name "op".

Cc: stable@vger.kernel.org
Reported-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/20190304200748.1418790-1-arnd@arndb.de
Fixes: d2a68c4eff ("x86/ftrace: Do not call function graph from dynamic trampolines")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-03-05 08:45:08 -05:00
Arnd Bergmann b1ddd406cd xen: remove pre-xen3 fallback handlers
The legacy hypercall handlers were originally added with
a comment explaining that "copying the argument structures in
HYPERVISOR_event_channel_op() and HYPERVISOR_physdev_op() into the local
variable is sufficiently safe" and only made sure to not write
past the end of the argument structure, the checks in linux/string.h
disagree with that, when link-time optimizations are used:

In function 'memcpy',
    inlined from 'pirq_query_unmask' at drivers/xen/fallback.c:53:2,
    inlined from '__startup_pirq' at drivers/xen/events/events_base.c:529:2,
    inlined from 'restore_pirqs' at drivers/xen/events/events_base.c:1439:3,
    inlined from 'xen_irq_resume' at drivers/xen/events/events_base.c:1581:2:
include/linux/string.h:350:3: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter
   __read_overflow2();
   ^

Further research turned out that only Xen 3.0.2 or earlier required the
fallback at all, while all versions in use today don't need it.
As far as I can tell, it is not even possible to run a mainline kernel
on those old Xen releases, at the time when they were in use, only
a patched kernel was supported anyway.

Fixes: cf47a83fb0 ("xen/hypercall: fix hypercall fallback code for very old hypervisors")
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-03-05 12:07:32 +01:00
Jason Yan e585f51c4e powerpc: remove dead code in head_fsl_booke.S
This code is dead. Just remove it.

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-05 16:38:45 +11:00
Joel Stanley 805bf3b755 powerpc/configs: Sync skiroot defconfig
This updates the skiroot defconfig with the version from the OpenPower
firmwre build tree.

Important changes are the addition of QED and E1000E ethernet drivers.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-05 16:35:54 +11:00
Aneesh Kumar K.V 35f2806b48 powerpc/hugetlb: Don't do runtime allocation of 16G pages in LPAR configuration
We added runtime allocation of 16G pages in commit 4ae279c2c9
("powerpc/mm/hugetlb: Allow runtime allocation of 16G.") That was done
to enable 16G allocation on PowerNV and KVM config. In case of KVM
config, we mostly would have the entire guest RAM backed by 16G
hugetlb pages for this to work. PAPR do support partial backing of
guest RAM with hugepages via ibm,expected#pages node of memory node in
the device tree. This means rest of the guest RAM won't be backed by
16G contiguous pages in the host and hence a hash page table insertion
can fail in such case.

An example error message will look like

  hash-mmu: mm: Hashing failure ! EA=0x7efc00000000 access=0x8000000000000006 current=readback
  hash-mmu:     trap=0x300 vsid=0x67af789 ssize=1 base psize=14 psize 14 pte=0xc000000400000386
  readback[12260]: unhandled signal 7 at 00007efc00000000 nip 00000000100012d0 lr 000000001000127c code 2

This patch address that by preventing runtime allocation of 16G
hugepages in LPAR config. To allocate 16G hugetlb one need to kernel
command line hugepagesz=16G hugepages=<number of 16G pages>

With radix translation mode we don't run into this issue.

This change will prevent runtime allocation of 16G hugetlb pages on
kvm with hash translation mode. However, with the current upstream it
was observed that 16G hugetlbfs backed guest doesn't boot at all.

We observe boot failure with the below message:
  [131354.647546] KVM: map_vrma at 0 failed, ret=-4

That means this patch is not resulting in an observable regression.
Once we fix the boot issue with 16G hugetlb backed memory, we need to
use ibm,expected#pages memory node attribute to indicate 16G page
reservation to the guest. This will also enable partial backing of
guest RAM with 16G pages.

Fixes: 4ae279c2c9 ("powerpc/mm/hugetlb: Allow runtime allocation of 16G.")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-05 15:52:42 +11:00
Linus Torvalds dcc75ddea1 spi: Updates for v5.1
A fairly quiet release for SPI, the biggest thing is the conversion to
 use GPIO descriptors which is now 90% done but still needs some
 stragglers converting.
 
  - Support for inter-word delays.
  - Conversion of the core and most drivers to use GPIO descriptors for
    GPIO controlled chip selects.
  - New drivers for NXP FlexSPI and QuadSPI, SiFive and Spreadtrum.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAlx9WtETHGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0DfrB/92O47HJfHg1xIgOp4BLzT8lA5zQEsy
 a8ApcLVLvY6mSfbB/G7RosfoPwTc1eZP5Q2BqzQOBIO+507ao4AcASARmkTwjwkC
 eRUJ/THkGyGurs8POtnc5YJlHsT1t743QpqlUNekt+NqognlkPccgc5bNgixfuPD
 eVSwVC85SKP3gCpAjVb6FFFmlWr8AKdlgx+41h9QpMNG/85H6xgo4C4Wlajt42/E
 wNx+WXzlPyzB5Lc3IGPeF/I/Hu/Ta3hUZLTVpWi5ubM8p4SYGmMZ9l8sURPCz1pK
 UMZFpJxx8DWwkj2F/FkoLasfiRqUHIP9K7NfKF0u2xdNbhEk1GA3NAG0
 =ctKG
 -----END PGP SIGNATURE-----

Merge tag 'spi-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi updates from Mark Brown:
 "A fairly quiet release for SPI, the biggest thing is the conversion to
  use GPIO descriptors which is now 90% done but still needs some
  stragglers converting.

  Summary:

   - Support for inter-word delays

   - Conversion of the core and most drivers to use GPIO descriptors for
     GPIO controlled chip selects

   - New drivers for NXP FlexSPI and QuadSPI, SiFive and Spreadtrum"

* tag 'spi-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (104 commits)
  spi: sh-msiof: Restrict bits per word to 8/16/24/32 on R-Car Gen2/3
  spi: sifive: Remove redundant dev_err call in sifive_spi_probe()
  spi: sifive: Remove spi_master_put in sifive_spi_remove()
  spi: spi-gpio: fix SPI_CS_HIGH capability
  spi: pxa2xx: Setup maximum supported DMA transfer length
  spi: sifive: Add driver for the SiFive SPI controller
  spi: sifive: Add DT documentation for SiFive SPI controller
  spi: sprd: Add a prefix for SPI DMA channel macros
  spi: sprd: spi: sprd: Add DMA mode support
  dt-bindings: spi: Add the DMA properties for the SPI dma mode
  spi: sprd: Add the SPI irq function for the SPI DMA mode
  dt-bindings: spi: imx: Add an entry for the i.MX8QM compatible
  spi: use gpio[d]_set_value_cansleep for setting chipselect GPIO
  spi: gpio: Advertise support for SPI_CS_HIGH
  spi: sh-msiof: Replace spi_master by spi_controller
  spi: sh-hspi: Replace spi_master by spi_controller
  spi: rspi: Replace spi_master by spi_controller
  spi: atmel-quadspi: add support for sam9x60 qspi controller
  dt-bindings: spi: atmel-quadspi: QuadSPI driver for Microchip SAM9X60
  spi: atmel-quadspi: add support for named peripheral clock
  ...
2019-03-04 19:23:56 -08:00
Linus Torvalds 32c0ac3af4 regulator: Updates for v5.1
The bulk of the standout changes in this release are cleanups, with the
 core work being a combination of factoring out common code into helpers
 and the completion of the conversion of the core to use GPIO
 descriptors.
 
  - Addition of helper functions for current limits and conversion of
    drivers to use them by Axel Lin.
  - Lots and lots of cleanups from Axel Lin.
  - Conversion of the core to use GPIO descriptors rather than numbers by
    Linus Walleij.
  - New drivers for Maxim MAX77650 and ROHM BD70528.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAlx9UosTHGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0DR7B/4x2fjnrnlvDqwzr20jcWH0BTiOCN+s
 JpWNn2HxoeNNOn2w52bo0WNBvXElWj5+Lik8xljlj9DWXCgSdys0z5XEOpOmg3Eg
 +7AhyHmYBYf2igHWqw3dvL79n2MqdUY6mpbVtbf9kfJ3M5CwkwBUuXYNJLKZhLsm
 r06wIaEmCDT9mQNxl7r1tbDUpN8Rjg43W4dNFMYEUnZPpd9zhdwSnTRFxqhpF+K+
 GTli4yv3Z2IzZWGvG838/UBETffTfZv/lnuF7f/fagihGnhsp4RSp+3/PYvIV7/v
 BIvPUywSXQFiC/NqqtRBkqRKEGclv0itZzZy39hl+zp/k+Im2jzH0gkP
 =7lqn
 -----END PGP SIGNATURE-----

Merge tag 'regulator-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator updates from Mark Brown:
 "The bulk of the standout changes in this release are cleanups, with
  the core work being a combination of factoring out common code into
  helpers and the completion of the conversion of the core to use GPIO
  descriptors.

  Summary:

   - Addition of helper functions for current limits and conversion of
     drivers to use them by Axel Lin.

   - Lots and lots of cleanups from Axel Lin.

   - Conversion of the core to use GPIO descriptors rather than numbers
     by Linus Walleij.

   - New drivers for Maxim MAX77650 and ROHM BD70528"

* tag 'regulator-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (131 commits)
  regulator: mc13xxx: Constify regulator_ops variables
  regulator: palmas: Constify palmas_smps_ramp_delay array
  regulator: wm831x-dcdc: Convert to use regulator_set/get_current_limit_regmap
  regulator: pv88090: Convert to use regulator_set/get_current_limit_regmap
  regulator: pv88080: Convert to use regulator_set/get_current_limit_regmap
  regulator: pv88060: Convert to use regulator_set/get_current_limit_regmap
  regulator: max77650: Convert to use regulator_set/get_current_limit_regmap
  regulator: lp873x: Convert to use regulator_set/get_current_limit_regmap
  regulator: lp872x: Convert to use regulator_set/get_current_limit_regmap
  regulator: da9210: Convert to use regulator_set/get_current_limit_regmap
  regulator: da9055: Convert to use regulator_set/get_current_limit_regmap
  regulator: core: Add set/get_current_limit helpers for regmap users
  regulator: Fix comment for csel_reg and csel_mask
  regulator: stm32-vrefbuf: add power management support
  regulator: 88pm8607: Remove unused fields from struct pm8607_regulator_info
  regulator: 88pm8607: Simplify pm8607_list_voltage implementation
  regulator: cpcap: Constify omap4_regulators and xoom_regulators
  regulator: cpcap: Remove unused vsel_shift from struct cpcap_regulator
  dt-bindings: regulator: tps65218: rectify units of LS3
  dt-bindings: regulator: add LS2 load switch documentation
  ...
2019-03-04 19:20:52 -08:00
David S. Miller 18a4d8bf25 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-03-04 13:26:15 -08:00
Palmer Dabbelt 13fd5de065
RISC-V: Fixmap support and MM cleanups
This patchset does:
1. Moves MM related code from kernel/setup.c to mm/init.c
2. Implements compile-time fixed mappings

Using fixed mappings, we get earlyprints even without SBI calls.

For example, we can now use kernel parameter
"earlycon=uart8250,mmio,0x10000000"
to get early prints on QEMU virt machine without using SBI calls.

The patchset is tested on QEMU virt machine.

Palmer: It looks like some of the code movement here conflicted with the
patches to move hartid handling around.  As far as I can tell the only
changed code was in smp_setup_processor_id(), and I've kept the one in
smp.c.
2019-03-04 11:47:04 -08:00
Andreas Schwab f7ccc35aa3
arch: riscv: fix logic error in parse_dtb
The function early_init_dt_scan returns true if a DTB was detected.

Fixes: 8fd6e05c74 ("arch: riscv: support kernel command line forcing when no DTB passed")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com>
Tested-by: Paul Walmsley <paul.walmsley@sifive.com> # FU540 HiFive-U BBL
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2019-03-04 11:32:14 -08:00
Linus Torvalds 736706bee3 get rid of legacy 'get_ds()' function
Every in-kernel use of this function defined it to KERNEL_DS (either as
an actual define, or as an inline function).  It's an entirely
historical artifact, and long long long ago used to actually read the
segment selector valueof '%ds' on x86.

Which in the kernel is always KERNEL_DS.

Inspired by a patch from Jann Horn that just did this for a very small
subset of users (the ones in fs/), along with Al who suggested a script.
I then just took it to the logical extreme and removed all the remaining
gunk.

Roughly scripted with

   git grep -l '(get_ds())' -- :^tools/ | xargs sed -i 's/(get_ds())/(KERNEL_DS)/'
   git grep -lw 'get_ds' -- :^tools/ | xargs sed -i '/^#define get_ds()/d'

plus manual fixups to remove a few unusual usage patterns, the couple of
inline function cases and to fix up a comment that had become stale.

The 'get_ds()' function remains in an x86 kvm selftest, since in user
space it actually does something relevant.

Inspired-by: Jann Horn <jannh@google.com>
Inspired-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-04 10:50:14 -08:00
Atish Patra fbdc6193dc
RISC-V: Assign hwcap as per comman capabilities.
Currently, we set hwcap based on first valid hart from DT. This may not
be correct always as that hart might not be current booting cpu or may
have a different capability.

Set hwcap as the capabilities supported by all possible harts with "okay"
status.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2019-03-04 10:40:39 -08:00
Atish Patra 291debb38d
RISC-V: Compare cpuid with NR_CPUS before mapping.
We should never have a cpuid greater that NR_CPUS. Compare with NR_CPUS
before creating the mapping between logical and physical CPU ids. This
is also mandatory as NR_CPUS check is removed from
riscv_of_processor_hartid.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2019-03-04 10:40:39 -08:00
Atish Patra dd641e2686
RISC-V: Allow hartid-to-cpuid function to fail.
It is perfectly okay to call riscv_hartid_to_cpuid for a hartid that is
not mapped with an CPU id. It can happen if the calling functions
retrieves the hartid from DT.  However, that hartid was never brought
online by the firmware or kernel for any reasons.

No need to BUG() in the above case. A negative error return is
sufficient and the calling function should check for the return value
always.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2019-03-04 10:40:39 -08:00
Atish Patra ba15c86185
RISC-V: Remove NR_CPUs check during hartid search from DT
In non-smp configuration, hartid can be higher that NR_CPUS.
riscv_of_processor_hartid should not be compared to hartid to NR_CPUS in
that case. Moreover, this function checks all the DT properties of a
hart node. NR_CPUS comparison seems out of place.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2019-03-04 10:40:38 -08:00
Atish Patra 78d1daa364
RISC-V: Move cpuid to hartid mapping to SMP.
Currently, logical CPU id to physical hartid mapping is defined for both
smp and non-smp configurations. This is not required as we need this
only for smp configuration.  The mapping function can define directly
boot_cpu_hartid for non-smp use case.

The reverse mapping function i.e. hartid to cpuid can be called for any
valid but not booted harts. So it should return default cpu 0 only if it
is a boot hartid.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2019-03-04 10:40:38 -08:00
Atish Patra e15c6e3706
RISC-V: Do not wait indefinitely in __cpu_up
In SMP path, __cpu_up waits for other CPU to come online indefinitely.
This is wrong as other CPU might be disabled in machine mode and
possible CPU is set to the cpus present in DT.

Introduce a completion variable and waits only for a second.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2019-03-04 10:40:36 -08:00
Linus Torvalds 00c42373d3 x86-64: add warning for non-canonical user access address dereferences
This adds a warning (once) for any kernel dereference that has a user
exception handler, but accesses a non-canonical address.  It basically
is a simpler - and more limited - version of commit 9da3f2b740
("x86/fault: BUG() when uaccess helpers fault on kernel addresses") that
got reverted.

Note that unlike that original commit, this only causes a warning,
because there are real situations where we currently can do this
(notably speculative argument fetching for uprobes etc).  Also, unlike
that original commit, this _only_ triggers for #GP accesses, so the
cases of valid kernel pointers that cross into a non-mapped page aren't
affected.

The intent of this is two-fold:

 - the uprobe/tracing accesses really do need to be more careful. In
   particular, from a portability standpoint it's just wrong to think
   that "a pointer is a pointer", and use the same logic for any random
   pointer value you find on the stack. It may _work_ on x86-64, but it
   doesn't necessarily work on other architectures (where the same
   pointer value can be either a kernel pointer _or_ a user pointer, and
   you really need to be much more careful in how you try to access it)

   The warning can hopefully end up being a reminder that just any
   random pointer access won't do.

 - Kees in particular wanted a way to actually report invalid uses of
   wild pointers to user space accessors, instead of just silently
   failing them. Automated fuzzers want a way to get reports if the
   kernel ever uses invalid values that the fuzzer fed it.

   The non-canonical address range is a fair chunk of the address space,
   and with this you can teach syzkaller to feed in invalid pointer
   values and find cases where we do not properly validate user
   addresses (possibly due to bad uses of "set_fs()").

Acked-by: Kees Cook <keescook@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-04 10:08:28 -08:00
Mark Brown 14dbfb417b
Merge branch 'spi-5.1' into spi-next 2019-03-04 15:32:51 +00:00
Mark Brown 88f268a5bc
Merge branch 'regulator-5.1' into regulator-next 2019-03-04 15:32:43 +00:00
Rafael J. Wysocki dcaed592b2 Merge branch 'acpi-apei'
* acpi-apei: (29 commits)
  efi: cper: Fix possible out-of-bounds access
  ACPI: APEI: Fix possible out-of-bounds access to BERT region
  MAINTAINERS: Add James Morse to the list of APEI reviewers
  ACPI / APEI: Add support for the SDEI GHES Notification type
  firmware: arm_sdei: Add ACPI GHES registration helper
  ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications
  ACPI / APEI: Only use queued estatus entry during in_nmi_queue_one_entry()
  ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length
  ACPI / APEI: Make GHES estatus header validation more user friendly
  ACPI / APEI: Pass ghes and estatus separately to avoid a later copy
  ACPI / APEI: Let the notification helper specify the fixmap slot
  ACPI / APEI: Move locking to the notification helper
  arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
  ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI
  ACPI / APEI: Don't allow ghes_ack_error() to mask earlier errors
  ACPI / APEI: Generalise the estatus queue's notify code
  ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
  ACPI / APEI: Remove spurious GHES_TO_CLEAR check
  ...
2019-03-04 11:16:35 +01:00
Christophe Leroy 9580b71b5a powerpc/32: Clear on-stack exception marker upon exception return
Clear the on-stack STACK_FRAME_REGS_MARKER on exception exit in order
to avoid confusing stacktrace like the one below.

  Call Trace:
  [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
  [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
  [c0e9dd10] [c0895130] memchr+0x24/0x74
  [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
  [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
  [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
  --- interrupt: c0e9df00 at 0x400f330
      LR = init_stack+0x1f00/0x2000
  [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
  [c0e9df20] [c0c27e44] early_irq_init+0x38/0x108
  [c0e9df50] [c0c15434] start_kernel+0x310/0x488
  [c0e9dff0] [00003484] 0x3484

With this patch the trace becomes:

  Call Trace:
  [c0e9dca0] [c01c42c0] print_address_description+0x64/0x2bc (unreliable)
  [c0e9dcd0] [c01c46a4] kasan_report+0xfc/0x180
  [c0e9dd10] [c0895150] memchr+0x24/0x74
  [c0e9dd30] [c00a9e58] msg_print_text+0x124/0x574
  [c0e9dde0] [c00ab730] console_unlock+0x114/0x4f8
  [c0e9de40] [c00adc80] vprintk_emit+0x188/0x1c4
  [c0e9de80] [c00ae3e4] printk+0xa8/0xcc
  [c0e9df20] [c0c27e44] early_irq_init+0x38/0x108
  [c0e9df50] [c0c15434] start_kernel+0x310/0x488
  [c0e9dff0] [00003484] 0x3484

Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-04 00:37:23 +11:00
Linus Torvalds c027c7cf15 ARM: SoC fixes for v5.0
One more set of simple ARM platform fixes:
 
  - A boot regression on qualcomm msm8998
  - Gemini display controllers got turned off by accident
  - incorrect reference counting in optee
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcewMjAAoJEGCrR//JCVInKmUP/0Cf2M4eekFEoEMSkCOlZlvP
 aDiFaejGOfGXVMkBX6f5dwc4hEzGTPMBEIcsMDOa6LDFo6wd1JW+XIFYOcyT8oVr
 ELTOQeLj0Ru+c4d4r2PrNLG4jpacoWfGgDXih8QNfKn7MbcHGyu8XN/UQZGx/Vw5
 Lebrc7yrMgHpYrP2Pwj/q+6PY+DhuQbV3YEtpN8YVKFHWkqBF+81ghTpaaoOi1Uu
 +cS6lmh3bVzsyX0SO1TXWqYXd9UWWhgnkdvIHbB7XRKHW6amzSwKBvc95TE42Ycu
 8nQOc+33ZDUki0bQB75gjVU8HY2L0nNEk9Atw8o+Y4ZCCpDYpcryYwoKrNdX2GMB
 H+55ENU9zn9V/H8xOZ84+Vmu3uIQIk0gymSAyMCkNLjw/D84oxDSVMRJLeWuILlF
 LcI7+1hcT0vcwpIIQqQbocK1D0XvkNSkNhukl5eM/JhMQVLOblDW8FSK88RIkPKp
 t0yE44DJbsaz2L8ab7S/84aHP5pxybrazNXrl2N3F8Xaj+DQZ1Bk8Q3cwVH4G6Km
 DdKAEsDpbe8f1bzfmIgM/tfpPKus3ETC39wJm4bK3Yt5znllUOCOOalitUtQZVj2
 AjI8MxbM/9ePr2ymMmiQVNap2lXfaGcvNNlX/JNnJv7KiaN7mZGxQ1Vrw6EHbMbm
 fuLi8Z/mtTaucWc67GM+
 =doMO
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "One more set of simple ARM platform fixes:

   - A boot regression on qualcomm msm8998

   - Gemini display controllers got turned off by accident

   - incorrect reference counting in optee"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  tee: optee: add missing of_node_put after of_device_is_available
  arm64: dts: qcom: msm8998: Extend TZ reserved memory area
  ARM: dts: gemini: Re-enable display controller
2019-03-02 16:43:15 -08:00
David S. Miller 2369afb669 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2019-03-02

Here's one more bluetooth-next pull request for the 5.1 kernel:

 - Added support for MediaTek MT7663U and MT7668U UART devices
 - Cleanups & fixes to the hci_qca driver
 - Fixed wakeup pin behavior for QCA6174A controller

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-02 13:55:36 -08:00
David S. Miller 9eb359140c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-03-02 12:54:35 -08:00
Linus Torvalds e7c42a89e9 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "Two last minute fixes:

   - Prevent value evaluation via functions happening in the user access
     enabled region of __put_user() (put another way: make sure to
     evaluate the value to be stored in user space _before_ enabling
     user space accesses)

   - Correct the definition of a Hyper-V hypercall constant"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/hyper-v: Fix definition of HV_MAX_FLUSH_REP_COUNT
  x86/uaccess: Don't leak the AC flag into __put_user() value evaluation
2019-03-02 11:47:29 -08:00
Linus Torvalds c93d9218ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix refcount leak in act_ipt during replace, from Davide Caratti.

 2) Set task state properly in tun during blocking reads, from Timur
    Celik.

 3) Leaked reference in DSA, from Wen Yang.

 4) NULL deref in act_tunnel_key, from Vlad Buslov.

 5) cipso_v4_erro can reference the skb IPCB in inappropriate contexts
    thus referencing garbage, from Nazarov Sergey.

 6) Don't accept RTA_VIA and RTA_GATEWAY in contexts where those
    attributes make no sense.

 7) Fix hung sendto in tipc, from Tung Nguyen.

 8) Out-of-bounds access in netlabel, from Paul Moore.

 9) Grant reference leak in xen-netback, from Igor Druzhinin.

10) Fix tx stalls with lan743x, from Bryan Whitehead.

11) Fix interrupt storm with mv88e6xxx, from Hein Kallweit.

12) Memory leak in sit on device registry failure, from Mao Wenan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
  net: sit: fix memory leak in sit_init_net()
  net: dsa: mv88e6xxx: Fix statistics on mv88e6161
  geneve: correctly handle ipv6.disable module parameter
  net: dsa: mv88e6xxx: prevent interrupt storm caused by mv88e6390x_port_set_cmode
  bpf: fix sanitation rewrite in case of non-pointers
  ipv4: Add ICMPv6 support when parse route ipproto
  MIPS: eBPF: Fix icache flush end address
  lan743x: Fix TX Stall Issue
  net: phy: phylink: fix uninitialized variable in phylink_get_mac_state
  net: aquantia: regression on cpus with high cores: set mode with 8 queues
  selftests: fixes for UDP GRO
  bpf: drop refcount if bpf_map_new_fd() fails in map_create()
  net: dsa: mv88e6xxx: power serdes on/off for 10G interfaces on 6390X
  net: dsa: mv88e6xxx: Fix u64 statistics
  xen-netback: don't populate the hash cache on XenBus disconnect
  xen-netback: fix occasional leak of grant ref mappings under memory pressure
  sctp: chunk.c: correct format string for size_t in printk
  net: netem: fix skb length BUG_ON in __skb_to_sgvec
  netlabel: fix out-of-bounds memory accesses
  ipv4: Pass original device to ip_rcv_finish_core
  ...
2019-03-02 08:46:34 -08:00
Linus Torvalds fa3294c58c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull more crypto fixes from Herbert Xu:
 "This fixes a couple of issues in arm64/chacha that was introduced in
  5.0"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: arm64/chacha - fix hchacha_block_neon() for big endian
  crypto: arm64/chacha - fix chacha_4block_xor_neon() for big endian
2019-03-02 08:32:02 -08:00
Joe Lawrence 39070a96a1 powerpc: Remove export of save_stack_trace_tsk_reliable()
As tglx points out, there are no in-tree module users of
save_stack_trace_tsk_reliable() and its x86 counterpart is not
exported, so remove the powerpc symbol export.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-02 14:43:05 +11:00
Qian Cai c38ca26552 powerpc/mm: fix "section_base" set but not used
The commit 24b6d41643 ("mm: pass the vmem_altmap to vmemmap_free")
removed a line in vmemmap_free(),

altmap = to_vmem_altmap((unsigned long) section_base);

but left a variable no longer used.

arch/powerpc/mm/init_64.c: In function 'vmemmap_free':
arch/powerpc/mm/init_64.c:277:16: error: variable 'section_base' set but
not used [-Werror=unused-but-set-variable]

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-02 14:43:05 +11:00
Qian Cai 8132cf115e powerpc/mm: Fix "sz" set but not used warning
Fix compiler warning:
  arch/powerpc/mm/hugetlbpage-hash64.c: In function '__hash_page_huge':
  arch/powerpc/mm/hugetlbpage-hash64.c:29:28: warning: variable 'sz' set
  but not used [-Wunused-but-set-variable]

mpe: The last usage of sz was removed in 0895ecda79 ("powerpc/mm:
Bring hugepage PTE accessor functions back into sync with normal
accessors").

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-02 14:43:05 +11:00
Rashmica Gupta 790845e2f1 powerpc/mm: Check secondary hash page table
We were always calling base_hpte_find() with primary = true,
even when we wanted to check the secondary table.

mpe: I broke this when refactoring Rashmica's original patch.

Fixes: 1515ab9321 ("powerpc/mm: Dump hash table")
Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-02 14:43:05 +11:00
Firoz Khan 6b1200facc powerpc: remove nargs from __SYSCALL
The __SYSCALL macro's arguments are system call number,
system call entry name and number of arguments for the
system call.

Argument- nargs in __SYSCALL(nr, entry, nargs) is neither
calculated nor used anywhere. So it would be better to
keep the implementaion as  __SYSCALL(nr, entry). This will
unifies the implementation with some other architetures
too.

Signed-off-by: Firoz Khan <firoz.khan@linaro.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-02 14:43:05 +11:00
Michael Ellerman 2de04718ec Merge branch 'topic/ppc-kvm' into next
Merge another commit in the topic/ppc-kvm branch we're sharing with
kvm-ppc.
2019-03-02 14:42:28 +11:00
Paul Burton d1a2930d8a MIPS: eBPF: Fix icache flush end address
The MIPS eBPF JIT calls flush_icache_range() in order to ensure the
icache observes the code that we just wrote. Unfortunately it gets the
end address calculation wrong due to some bad pointer arithmetic.

The struct jit_ctx target field is of type pointer to u32, and as such
adding one to it will increment the address being pointed to by 4 bytes.
Therefore in order to find the address of the end of the code we simply
need to add the number of 4 byte instructions emitted, but we mistakenly
add the number of instructions multiplied by 4. This results in the call
to flush_icache_range() operating on a memory region 4x larger than
intended, which is always wasteful and can cause crashes if we overrun
into an unmapped page.

Fix this by correcting the pointer arithmetic to remove the bogus
multiplication, and use braces to remove the need for a set of brackets
whilst also making it obvious that the target field is a pointer.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: b6bd53f9c4 ("MIPS: Add missing file for eBPF JIT.")
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-02 00:04:15 +01:00
Claudiu Manoil 0c805404f0 arm64: dts: fsl: ls1028a-rdb: Add ENETC external eth ports for the LS1028A RDB board
The LS1028A RDB board features an Atheros PHY connected over
SGMII to the ENETC PF0 (or Port0).  ENETC Port1 (PF1) has no
external connection on this board, so it can be disabled for now.

Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01 11:21:32 -08:00
Claudiu Manoil 927d7f8575 arm64: dts: fsl: ls1028a: Add PCI IERC node and ENETC endpoints
The LS1028A SoC features a PCI Integrated Endpoint Root Complex
(IERC) defining several integrated PCI devices, including the ENETC
ethernet controller integrated endpoints (IEPs). The IERC implements
ECAM (Enhanced Configuration Access Mechanism) to provide access
to the PCIe config space of the IEPs. This means the the IEPs
(including ENETC) do not support the standard PCIe BARs, instead
the Enhanced Allocation (EA) capability structures in the ECAM space
are used to fix the base addresses in the system, and the PCI
subsystem uses these structures for device enumeration and discovery.
The "ranges" entries contain basic information from these EA capabily
structures required by the kernel for device enumeration.

The current patch also enables the first 2 ENETC PFs (Physiscal
Functions) and the associated VFs (Virtual Functions), 2 VFs for
each PF.  Each of these ENETC PFs has an external ethernet port
on the LS1028A SoC.

Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01 11:21:32 -08:00
Peng Fan b855b58ac1 arm64: mmu: drop paging_init comments
The comments could not reflect the code, and it is easy to get
what this function does from a straight-line reading of the code.
So let's drop the comments

Signed-off-by: Peng Fan <peng.fan@nxp.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-03-01 16:40:07 +00:00
Will Deacon 6bd288569b arm64: debug: Ensure debug handlers check triggering exception level
Debug exception handlers may be called for exceptions generated both by
user and kernel code. In many cases, this is checked explicitly, but
in other cases things either happen to work by happy accident or they
go slightly wrong. For example, executing 'brk #4' from userspace will
enter the kprobes code and be ignored, but the instruction will be
retried forever in userspace instead of delivering a SIGTRAP.

Fix this issue in the most stable-friendly fashion by simply adding
explicit checks of the triggering exception level to all of our debug
exception handlers.

Cc: <stable@vger.kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-03-01 16:23:38 +00:00
Will Deacon b9a4b9d084 arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
FAR_EL1 is UNKNOWN for all debug exceptions other than those caused by
taking a hardware watchpoint. Unfortunately, if a debug handler returns
a non-zero value, then we will propagate the UNKNOWN FAR value to
userspace via the si_addr field of the SIGTRAP siginfo_t.

Instead, let's set si_addr to take on the PC of the faulting instruction,
which we have available in the current pt_regs.

Cc: <stable@vger.kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-03-01 16:23:17 +00:00
Martin Schwidefsky c8e8ed386a s390/suspend: fix prefix register reset in swsusp_arch_resume
The reset of the prefix to zero in swsusp_arch_resume uses a 4 byte stack
slot. With CONFIG_VMAP_STACK=y this is now in the vmalloc area, this works
only with DAT enabled. Move the DAT disable in swsusp_arch_resume after
the prefix reset.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-01 16:23:27 +01:00
Arnd Bergmann c889e2a0b0 Merge branch 'milbeaut/newsoc' into arm/newsoc
Sugaya Taichi <sugaya.taichi@socionext.com> explains:

Here is the series of patches the initial support for SC2000(M10V) of
Milbeaut SoCs. "M10V" is the internal name of SC2000, so commonly used in
source code.

SC2000 is a SoC of the Milbeaut series. equipped with a DSP optimized for
computer vision. It also features advanced functionalities such as 360-degree,
real-time spherical stitching with multi cameras, image stabilization for
without mechanical gimbals, and rolling shutter correction. More detail is
below:
https://www.socionext.com/en/products/assp/milbeaut/SC2000.html

Specifications for developers are below:
 - Quad-core 32bit Cortex-A7 on ARMv7-A architecture
 - NEON support
 - DSP
 - GPU
 - MAX 3GB DDR3
 - Cortex-M0 for power control
 - NAND Flash Interface
 - SD UHS-I
 - SD UHS-II
 - SDIO
 - USB2.0 HOST / Device
 - USB3.0 HOST / Device
 - PCI express Gen2
 - Ethernet Engine
 - I2C
 - UART
 - SPI
 - PWM

Support is quite minimal for now, since it only includes timer, clock,
pictrl and serial controller drivers, so we can only boot to userspace
through initramfs. Support for the other peripherals  will come eventually.

* milbeaut/newsoc:
  ARM: multi_v7_defconfig: add ARCH_MILBEAUT and ARCH_MILBEAUT_M10V
  ARM: configs: Add Milbeaut M10V defconfig
  ARM: dts: milbeaut: Add device tree set for the Milbeaut M10V board
  clocksource/drivers/timer-milbeaut: Introduce timer for Milbeaut SoCs
  dt-bindings: timer: Add Milbeaut M10V timer description
  ARM: milbeaut: Add basic support for Milbeaut m10v SoC
  dt-bindings: Add documentation for Milbeaut SoCs
  dt-bindings: arm: Add SMP enable-method for Milbeaut
  dt-bindings: sram: milbeaut: Add binding for Milbeaut smp-sram

Link: https://lore.kernel.org/linux-arm-kernel/1551243056-10521-1-git-send-email-sugaya.taichi@socionext.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:21:04 +01:00
Catalin Marinas 3cd0ddb3de Revert "arm64: uaccess: Implement unsafe accessors"
This reverts commit 0bd3ef34d2.

There is ongoing work on objtool to identify incorrect uses of
user_access_{begin,end}. Until this is sorted, do not enable the
functionality on arm64. Also, on ARMv8.2 CPUs with hardware PAN and UAO
support, there is no obvious performance benefit to the unsafe user
accessors.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-03-01 14:19:06 +00:00
Sugaya Taichi 2781204594 ARM: multi_v7_defconfig: add ARCH_MILBEAUT and ARCH_MILBEAUT_M10V
Add and enable the Milbeaut M10V architecture. These configs select those
of the clock, timer and serial driver for M10V.

Signed-off-by: Sugaya Taichi <sugaya.taichi@socionext.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:18:54 +01:00
Sugaya Taichi 4d0eacb02b ARM: configs: Add Milbeaut M10V defconfig
This patch adds the minimal defconfig for the Milbeaut M10V.

Signed-off-by: Sugaya Taichi <sugaya.taichi@socionext.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:18:54 +01:00
Sugaya Taichi bbaad14423 ARM: dts: milbeaut: Add device tree set for the Milbeaut M10V board
Add devicetree for Milbeaut M10V SoC and M10V Evaluation board.

Signed-off-by: Sugaya Taichi <sugaya.taichi@socionext.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:18:54 +01:00
Sugaya Taichi 9fb29c734f ARM: milbeaut: Add basic support for Milbeaut m10v SoC
This adds the basic M10V SoC support under arch/arm.
Since all cores are activated in the custom bootloader before booting
linux, it is necessary to wait for the secondary-cores using cpu-enable-
method and special sram.

Signed-off-by: Sugaya Taichi <sugaya.taichi@socionext.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:18:26 +01:00
Marek Szyprowski a3238924a8 ARM: dts: exynos: Fix max voltage for buck8 regulator on Odroid XU3/XU4
The maximum voltage value for buck8 regulator on Odroid XU3/XU4 boards is
set too low. Increase it to the 2000mV as specified on the board schematic.
So far the board worked fine, because of the bug in the PMIC driver, which
used incorrect step value for that regulator. It interpreted the voltage
value set by the bootloader as 1225mV and kept it unchanged. The regulator
driver has been however fixed recently in the commit 56b5d4ea77
("regulator: s2mps11: Fix steps for buck7, buck8 and LDO35"), what results
in reading the proper buck8 value and forcing it to 1500mV on boot. This
is not enough for proper board operation and results in eMMC errors during
heavy IO traffic. Increasing maximum voltage value for buck8 restores
original driver behavior and fixes eMMC issues.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 86a2d2ac5e ("ARM: dts: Add dts file for Odroid XU3 board")
Fixes: 56b5d4ea77 ("regulator: s2mps11: Fix steps for buck7, buck8 and LDO35")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:10:30 +01:00
Linus Walleij 31b0067e8d ARM: spear3xx_defconfig: Activate PL111 DRM driver
This disables the old FBDEV driver and enables the PL111
DRM driver on the SPEAr3xx.

There are some device trees in the kernel that switches
the DT node for the PL110 to "okay" but none of these
have any display defined, so we can safely switch to this
driver before we get any users starting to define
displays. Let them do it on top of the new driver
infrastructure instead.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:09:21 +01:00
Arnd Bergmann 6089e65618 Qualcomm ARM64 Fixes for 5.0-rc8
* Fix TZ memory area size to avoid crashes during boot
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJcdiEpAAoJEFKiBbHx2RXVQLAP/i9ZU6p6TD8KVYVnxbec5pIz
 bB6/gNo0VfM4/4CZmznJ0FQcGbxGekaMSCtNJDlgfzIM/VC2nXdlPzZi/Lo8tubX
 NgP4KYVccqnyA7Ppb/7lMaITsclljofdSaGVX4Vj4o60DzrEkA8reauJ4a+X8GDF
 d0kh006/Q0aRD9rrIV9ZaRWE8mzThbuCwMDZt6iV5kUDGKK1qleJV7wkBlSYKMTg
 rB+9ObD02kZxw7FWNcf0LH/4Ans9coPkM6I0CE4MCH5ozEDg+AN7RRXTkeniXYdi
 kljx2BilgN0SyV9zuGLwYW/FRXcMOV1Rmmdw2CUU84wNnnTkcLG0O43fqR8S8nBb
 cF0Wd9zPHlXWhIKa5twgFubvluZjsaFOciThixtAELwU4wrxq9h5NUxW3XnYQVUO
 riEIPUIBl8e3wY2SHk0HUBvHUcXr8m033V0oKN5tIO69OuWRROMBCNp0Su6JraSs
 7hC2wWvo+wtqQIC3JHszSRyfA1ceF7ldYit5dRLJPycMk49fgzXF+Coel7W0ifV5
 PV62/1SQXkajeRywdamFCT6ccdRQcqVzKSUrQdD0U0+Z5hC4BjgK9TOMHSak6YEY
 iKt4PwpdE6zdenybmflUw7wcmdt2plrYHGJYLelFE3TKZMhV1x33LtNeb2FPXSZo
 u/RxwNuDNdyw9kGUk1+u
 =iA1y
 -----END PGP SIGNATURE-----

Merge tag 'qcom-fixes-for-5.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into arm/fixes

Qualcomm ARM64 Fixes for 5.0-rc8

* Fix TZ memory area size to avoid crashes during boot

* tag 'qcom-fixes-for-5.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux:
  arm64: dts: qcom: msm8998: Extend TZ reserved memory area
2019-03-01 15:08:16 +01:00
Linus Walleij 00c15bb031 ARM: nhk8815_defconfig: Add new options
This adds some new driver options to the Nomadik NHK8815 defconfig:
- Activate IIO driver
- Enable CMA for coherent graphics allocations
- Activate DRM framebuffer driver for PL111
- Activate DRM panel driver for TPO TPG110
- Activate SPI GPIO driver (talks to the display)
- Activate STMPE PWM driver (used for display backlight)
- Activate PWM backlight
- Activate STw481x driver (PMIC)

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:07:23 +01:00
Linus Walleij 2be5274609 ARM: nhk8815_defconfig: Update defconfig
This updates the NHK8815 defconfig to reflect the recent
structural changes in Kconfigs all over the kernel:

- PREEMPT option was moved around
- MODULES options were moved around
- MTD_NAND options were moved around
- INPUT_MOUSEDEV doesn't have to be explicitly unselected
  anymore (not on by default)
- DEBUG_GPIO should really not be in any default config
- MMC_BLOCK_BOUNCE is gone from Kconfig
- CRYPTO options were moved around

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:07:10 +01:00
Arnd Bergmann f1685af78c ARM: pxa: remove CONFIG_SND_PXA2XX_AC97 in pxa_defconfig
The CONFIG_SND_PXA2XX_AC97 driver is for the old AC97 bus implementation,
and conflicts with all the new-style AC97 drivers after the conversion,
so the drivers we want all get turned off.

Not disabling the symbol however does the right thing, and we get
the drivers that are selectively enabled here.

Fixes: 25540f68c8 ("ASoC: pxa: change ac97 dependencies")
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-01 15:02:26 +01:00
Nicholas Piggin bd3524feac powerpc/64s: Fix unrelocated interrupt trampoline address test
The recent commit got this test wrong, it declared the assembler
symbols the wrong way, and also used the wrong symbol name
(xxx_start rather than start_xxx, see asm/head-64.h).

Fixes: ccd477028a ("powerpc/64s: Fix HV NMI vs HV interrupt recoverability test")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-03-02 00:25:47 +11:00
Maya Nakamura c8ccf7599d PCI: hv: Refactor hv_irq_unmask() to use cpumask_to_vpset()
Remove the duplicate implementation of cpumask_to_vpset() and use the
shared implementation. Export hv_max_vp_index, which is required by
cpumask_to_vpset().

Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
2019-03-01 11:45:46 +00:00
Joerg Roedel d05e4c8600 Merge branches 'iommu/fixes', 'arm/msm', 'arm/tegra', 'arm/mediatek', 'x86/vt-d', 'x86/amd', 'hyper-v' and 'core' into next 2019-03-01 11:24:51 +01:00
Vasily Gorbik 6d85dac2ab s390: warn about clearing als implied facilities
Add a warning about removing required architecture level set facilities
via "facilities=" command line option.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-01 08:00:42 +01:00
Vasily Gorbik b5e804598d s390: allow overriding facilities via command line
Add "facilities=" command line option which allows to override
facility bits returned by stfle. The main purpose of that is debugging
aids which allows to test specific kernel behaviour depending on
specific facilities presence. It also affects CPU alternatives.

"facilities=" command line option format is comma separated list of
integer values to be additionally set or cleared (if value is starting
with "!"). Values ranges are also supported. e.g.:

facilities=!130-160,159,167-169

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-01 08:00:41 +01:00
Vasily Gorbik d8901f2b2d s390: clean up redundant facilities list setup
Facilities list in the lowcore is initially set up by verify_facilities
from als.c and later initializations are redundant, so cleaning them up.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-01 08:00:39 +01:00
Vasily Gorbik 96d3b64b52 s390/als: remove duplicated in-place implementation of stfle
Reuse __stfle call instead of in-place implementation. __stfle is using
memcpy and memset functions but they are safe to use, since mem.S is
built with -march=z900.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-01 08:00:37 +01:00
Suraj Jitindar Singh 2b57ecd020 KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()
Add KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST &
KVM_PPC_CPU_BEHAV_FLUSH_COUNT_CACHE to the characteristics returned
from the H_GET_CPU_CHARACTERISTICS H-CALL, as queried from either the
hypervisor or the device tree.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-03-01 15:11:14 +11:00
Linus Torvalds bf23aba194 A few more MIPS fixes:
- Fix 16b cmpxchg() operations which could erroneously fail if bits 15:8
   of the old value are non-zero. In practice I'm not aware of any actual
   users of 16b cmpxchg() on MIPS, but this fixes the support for it was
   was introduced in v4.13.
 
 - Provide a struct device to dma_alloc_coherent for Lantiq XWAY systems
   with a "Voice MIPS Macro Core" (VMMC) device.
 
 - Provide DMA masks for BCM63xx ethernet devices, fixing a regression
   introduced in v4.19.
 
 - Fix memblock reservation for the kernel when the system has a non-zero
   PHYS_OFFSET, correcting the memblock conversion performed in v4.20.
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQRgLjeFAZEXQzy86/s+p5+stXUA3QUCXHhqjBUccGF1bC5idXJ0
 b25AbWlwcy5jb20ACgkQPqefrLV1AN3ZaAD/SFgi3dS9bSWhDhiy83llLaWiCGPb
 i09uzo3rpWoKSwQBAIwLEfmaHz/sYdliKRlE13uaxYWzwaN+VHXIPjlzbYMB
 =cDlO
 -----END PGP SIGNATURE-----

Merge tag 'mips_fixes_5.0_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Paul Burton:
 "A few more MIPS fixes:

   - Fix 16b cmpxchg() operations which could erroneously fail if bits
     15:8 of the old value are non-zero. In practice I'm not aware of
     any actual users of 16b cmpxchg() on MIPS, but this fixes the
     support for it was was introduced in v4.13.

   - Provide a struct device to dma_alloc_coherent for Lantiq XWAY
     systems with a "Voice MIPS Macro Core" (VMMC) device.

   - Provide DMA masks for BCM63xx ethernet devices, fixing a regression
     introduced in v4.19.

   - Fix memblock reservation for the kernel when the system has a
     non-zero PHYS_OFFSET, correcting the memblock conversion performed
     in v4.20"

* tag 'mips_fixes_5.0_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: fix memory setup for platforms with PHYS_OFFSET != 0
  MIPS: BCM63XX: provide DMA masks for ethernet devices
  MIPS: lantiq: pass struct device to DMA API functions
  MIPS: fix truncation in __cmpxchg_small for short values
2019-02-28 15:33:10 -08:00
Arnd Bergmann 366e37e4da arm64: avoid clang warning about self-assignment
Building a preprocessed source file for arm64 now always produces
a warning with clang because of the page_to_virt() macro assigning
a variable to itself.

Adding a new temporary variable avoids this issue.

Fixes: 2813b9c029 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-28 18:16:00 +00:00
Anders Roxell a29c782349 arm64: Kconfig.platforms: fix warning unmet direct dependencies
When ARCH_MXC get enabled, ARM64_ERRATUM_845719 will be selected and
this warning will happen when COMPAT isn't set.

WARNING: unmet direct dependencies detected for ARM64_ERRATUM_845719
  Depends on [n]: COMPAT [=n]
  Selected by [y]:
  - ARCH_MXC [=y]

Rework to add 'if COMPAT' before ARM64_ERRATUM_845719 gets selected,
since ARM64_ERRATUM_845719 depends on COMPAT.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-28 18:06:48 +00:00
Will Deacon 2c97a9cc35 arm64: io: Hook up __io_par() for inX() ordering
Ensure that inX() provides the same ordering guarantees as readX()
by hooking up __io_par() so that it maps directly to __iormb().

Reported-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-28 17:24:27 +00:00
Will Deacon ce246c444a riscv: io: Update __io_[p]ar() macros to take an argument
The definitions of the __io_[p]ar() macros in asm-generic/io.h take the
value returned by the preceding I/O read as an argument so that
architectures can use this to create order with a subsequent delayX()
routine using a dependency.

Update the riscv barrier definitions to match, although the argument
is currently unused.

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-28 17:23:12 +00:00
Linus Torvalds 3f25a5990d Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes a compiler warning introduced by a previous fix, as well as
  two crash bugs on ARM"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: sha512/arm - fix crash bug in Thumb2 build
  crypto: sha256/arm - fix crash bug in Thumb2 build
  crypto: ccree - add missing inline qualifier
2019-02-28 09:05:18 -08:00
Zhang Lei 3e32131abc arm64: Add workaround for Fujitsu A64FX erratum 010001
On the Fujitsu-A64FX cores ver(1.0, 1.1), memory access may cause
an undefined fault (Data abort, DFSC=0b111111). This fault occurs under
a specific hardware condition when a load/store instruction performs an
address translation. Any load/store instruction, except non-fault access
including Armv8 and SVE might cause this undefined fault.

The TCR_ELx.NFD1 bit is used by the kernel when CONFIG_RANDOMIZE_BASE
is enabled to mitigate timing attacks against KASLR where the kernel
address space could be probed using the FFR and suppressed fault on
SVE loads.

Since this erratum causes spurious exceptions, which may corrupt
the exception registers, we clear the TCR_ELx.NFDx=1 bits when
booting on an affected CPU.

Signed-off-by: Zhang Lei <zhang.lei@jp.fujitsu.com>
[Generated MIDR value/mask for __cpu_setup(), removed spurious-fault handler
 and always disabled the NFDx bits on affected CPUs]
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-28 16:24:25 +00:00
Jens Axboe edafccee56 io_uring: add support for pre-mapped user IO buffers
If we have fixed user buffers, we can map them into the kernel when we
setup the io_uring. That avoids the need to do get_user_pages() for
each and every IO.

To utilize this feature, the application must call io_uring_register()
after having setup an io_uring instance, passing in
IORING_REGISTER_BUFFERS as the opcode. The argument must be a pointer to
an iovec array, and the nr_args should contain how many iovecs the
application wishes to map.

If successful, these buffers are now mapped into the kernel, eligible
for IO. To use these fixed buffers, the application must use the
IORING_OP_READ_FIXED and IORING_OP_WRITE_FIXED opcodes, and then
set sqe->index to the desired buffer index. sqe->addr..sqe->addr+seq->len
must point to somewhere inside the indexed buffer.

The application may register buffers throughout the lifetime of the
io_uring instance. It can call io_uring_register() with
IORING_UNREGISTER_BUFFERS as the opcode to unregister the current set of
buffers, and then register a new set. The application need not
unregister buffers explicitly before shutting down the io_uring
instance.

It's perfectly valid to setup a larger buffer, and then sometimes only
use parts of it for an IO. As long as the range is within the originally
mapped region, it will work just fine.

For now, buffers must not be file backed. If file backed buffers are
passed in, the registration will fail with -1/EOPNOTSUPP. This
restriction may be relaxed in the future.

RLIMIT_MEMLOCK is used to check how much memory we can pin. A somewhat
arbitrary 1G per buffer size is also imposed.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Jens Axboe 2b188cc1bb Add io_uring IO interface
The submission queue (SQ) and completion queue (CQ) rings are shared
between the application and the kernel. This eliminates the need to
copy data back and forth to submit and complete IO.

IO submissions use the io_uring_sqe data structure, and completions
are generated in the form of io_uring_cqe data structures. The SQ
ring is an index into the io_uring_sqe array, which makes it possible
to submit a batch of IOs without them being contiguous in the ring.
The CQ ring is always contiguous, as completion events are inherently
unordered, and hence any io_uring_cqe entry can point back to an
arbitrary submission.

Two new system calls are added for this:

io_uring_setup(entries, params)
	Sets up an io_uring instance for doing async IO. On success,
	returns a file descriptor that the application can mmap to
	gain access to the SQ ring, CQ ring, and io_uring_sqes.

io_uring_enter(fd, to_submit, min_complete, flags, sigset, sigsetsize)
	Initiates IO against the rings mapped to this fd, or waits for
	them to complete, or both. The behavior is controlled by the
	parameters passed in. If 'to_submit' is non-zero, then we'll
	try and submit new IO. If IORING_ENTER_GETEVENTS is set, the
	kernel will wait for 'min_complete' events, if they aren't
	already available. It's valid to set IORING_ENTER_GETEVENTS
	and 'min_complete' == 0 at the same time, this allows the
	kernel to return already completed events without waiting
	for them. This is useful only for polling, as for IRQ
	driven IO, the application can just check the CQ ring
	without entering the kernel.

With this setup, it's possible to do async IO with a single system
call. Future developments will enable polled IO with this interface,
and polled submission as well. The latter will enable an application
to do IO without doing ANY system calls at all.

For IRQ driven IO, an application only needs to enter the kernel for
completions if it wants to wait for them to occur.

Each io_uring is backed by a workqueue, to support buffered async IO
as well. We will only punt to an async context if the command would
need to wait for IO on the device side. Any data that can be accessed
directly in the page cache is done inline. This avoids the slowness
issue of usual threadpools, since cached data is accessed as quickly
as a sync interface.

Sample application: http://git.kernel.dk/cgit/fio/plain/t/io_uring.c

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Takashi Iwai 70395a96bd ASoC: More changes for v5.1
Another batch of changes for ASoC, no big core changes - it's mainly
 small fixes and improvements for individual drivers.
 
  - A big refresh and cleanup of the Samsung drivers, fixing a number of
    issues which allow the driver to be used with a wider range of
    userspaces.
  - Fixes for the Intel drivers to make them more standard so less likely
    to get bitten by core issues.
  - New driver for Cirrus Logic CS35L26.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAlx3z7ETHGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0H5QB/9jwKEwOdk6ynoFUpQwXPPkQl7CGkIh
 P8J3OMTt+U4FNOrVG2S7xgUl69ZoaLm9rS/PHVrMV5krSLqY//2CTvF068qDBBlj
 haBxgeRbe4pwLZPfFUnWvn6v1rdvNCXzDG/be9jGPJjDcm6wK44VJQWkPbqTsh6O
 ZORqvKn48D89W0DegG1B+4jvbietPkhA0+nHQXwsWZ+sfMcEV/AWWsE5FIQ7ucCC
 bundBBncUFKMMp9whuhj2W9FO62LUd8OAM7ejis3hfKk9MsQWUy6vrcN1XgRCq47
 4I0doB5o+WhsOGMTZMcuhFISCVaCDqbNqGuVbeK0sdonjc1xz0682jLo
 =9rq8
 -----END PGP SIGNATURE-----

Merge tag 'asoc-v5.1-2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next

ASoC: More changes for v5.1

Another batch of changes for ASoC, no big core changes - it's mainly
small fixes and improvements for individual drivers.

 - A big refresh and cleanup of the Samsung drivers, fixing a number of
   issues which allow the driver to be used with a wider range of
   userspaces.
 - Fixes for the Intel drivers to make them more standard so less likely
   to get bitten by core issues.
 - New driver for Cirrus Logic CS35L26.
2019-02-28 13:30:55 +01:00
Kirill A. Shutemov 6f913de323 x86/boot/compressed/64: Do not read legacy ROM on EFI system
EFI systems do not necessarily provide a legacy ROM. If the ROM is missing
the memory is not mapped at all.

Trying to dereference values in the legacy ROM area leads to a crash on
Macbook Pro.

Only look for values in the legacy ROM area for non-EFI system.

Fixes: 3548e131ec ("x86/boot/compressed/64: Find a place for 32-bit trampoline")
Reported-by: Pitam Mitra <pitamm@gmail.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Bockjoo Kim <bockjoo@phys.ufl.edu>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190219075224.35058-1-kirill.shutemov@linux.intel.com
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202351
2019-02-28 12:25:05 +01:00
Daniel Borkmann ce02ef06fc x86, retpolines: Raise limit for generating indirect calls from switch-case
From networking side, there are numerous attempts to get rid of indirect
calls in fast-path wherever feasible in order to avoid the cost of
retpolines, for example, just to name a few:

  * 283c16a2df ("indirect call wrappers: helpers to speed-up indirect calls of builtin")
  * aaa5d90b39 ("net: use indirect call wrappers at GRO network layer")
  * 028e0a4766 ("net: use indirect call wrappers at GRO transport layer")
  * 356da6d0cd ("dma-mapping: bypass indirect calls for dma-direct")
  * 09772d92cd ("bpf: avoid retpoline for lookup/update/delete calls on maps")
  * 10870dd89e ("netfilter: nf_tables: add direct calls for all builtin expressions")
  [...]

Recent work on XDP from Björn and Magnus additionally found that manually
transforming the XDP return code switch statement with more than 5 cases
into if-else combination would result in a considerable speedup in XDP
layer due to avoidance of indirect calls in CONFIG_RETPOLINE enabled
builds. On i40e driver with XDP prog attached, a 20-26% speedup has been
observed [0]. Aside from XDP, there are many other places later in the
networking stack's critical path with similar switch-case
processing. Rather than fixing every XDP-enabled driver and locations in
stack by hand, it would be good to instead raise the limit where gcc would
emit expensive indirect calls from the switch under retpolines and stick
with the default as-is in case of !retpoline configured kernels. This would
also have the advantage that for archs where this is not necessary, we let
compiler select the underlying target optimization for these constructs and
avoid potential slow-downs by if-else hand-rewrite.

In case of gcc, this setting is controlled by case-values-threshold which
has an architecture global default that selects 4 or 5 (latter if target
does not have a case insn that compares the bounds) where some arch back
ends like arm64 or s390 override it with their own target hooks, for
example, in gcc commit db7a90aa0de5 ("S/390: Disable prediction of indirect
branches") the threshold pretty much disables jump tables by limit of 20
under retpoline builds.  Comparing gcc's and clang's default code
generation on x86-64 under O2 level with retpoline build results in the
following outcome for 5 switch cases:

* gcc with -mindirect-branch=thunk-inline -mindirect-branch-register:

  # gdb -batch -ex 'disassemble dispatch' ./c-switch
  Dump of assembler code for function dispatch:
   0x0000000000400be0 <+0>:     cmp    $0x4,%edi
   0x0000000000400be3 <+3>:     ja     0x400c35 <dispatch+85>
   0x0000000000400be5 <+5>:     lea    0x915f8(%rip),%rdx        # 0x4921e4
   0x0000000000400bec <+12>:    mov    %edi,%edi
   0x0000000000400bee <+14>:    movslq (%rdx,%rdi,4),%rax
   0x0000000000400bf2 <+18>:    add    %rdx,%rax
   0x0000000000400bf5 <+21>:    callq  0x400c01 <dispatch+33>
   0x0000000000400bfa <+26>:    pause
   0x0000000000400bfc <+28>:    lfence
   0x0000000000400bff <+31>:    jmp    0x400bfa <dispatch+26>
   0x0000000000400c01 <+33>:    mov    %rax,(%rsp)
   0x0000000000400c05 <+37>:    retq
   0x0000000000400c06 <+38>:    nopw   %cs:0x0(%rax,%rax,1)
   0x0000000000400c10 <+48>:    jmpq   0x400c90 <fn_3>
   0x0000000000400c15 <+53>:    nopl   (%rax)
   0x0000000000400c18 <+56>:    jmpq   0x400c70 <fn_2>
   0x0000000000400c1d <+61>:    nopl   (%rax)
   0x0000000000400c20 <+64>:    jmpq   0x400c50 <fn_1>
   0x0000000000400c25 <+69>:    nopl   (%rax)
   0x0000000000400c28 <+72>:    jmpq   0x400c40 <fn_0>
   0x0000000000400c2d <+77>:    nopl   (%rax)
   0x0000000000400c30 <+80>:    jmpq   0x400cb0 <fn_4>
   0x0000000000400c35 <+85>:    push   %rax
   0x0000000000400c36 <+86>:    callq  0x40dd80 <abort>
  End of assembler dump.

* clang with -mretpoline emitting search tree:

  # gdb -batch -ex 'disassemble dispatch' ./c-switch
  Dump of assembler code for function dispatch:
   0x0000000000400b30 <+0>:     cmp    $0x1,%edi
   0x0000000000400b33 <+3>:     jle    0x400b44 <dispatch+20>
   0x0000000000400b35 <+5>:     cmp    $0x2,%edi
   0x0000000000400b38 <+8>:     je     0x400b4d <dispatch+29>
   0x0000000000400b3a <+10>:    cmp    $0x3,%edi
   0x0000000000400b3d <+13>:    jne    0x400b52 <dispatch+34>
   0x0000000000400b3f <+15>:    jmpq   0x400c50 <fn_3>
   0x0000000000400b44 <+20>:    test   %edi,%edi
   0x0000000000400b46 <+22>:    jne    0x400b5c <dispatch+44>
   0x0000000000400b48 <+24>:    jmpq   0x400c20 <fn_0>
   0x0000000000400b4d <+29>:    jmpq   0x400c40 <fn_2>
   0x0000000000400b52 <+34>:    cmp    $0x4,%edi
   0x0000000000400b55 <+37>:    jne    0x400b66 <dispatch+54>
   0x0000000000400b57 <+39>:    jmpq   0x400c60 <fn_4>
   0x0000000000400b5c <+44>:    cmp    $0x1,%edi
   0x0000000000400b5f <+47>:    jne    0x400b66 <dispatch+54>
   0x0000000000400b61 <+49>:    jmpq   0x400c30 <fn_1>
   0x0000000000400b66 <+54>:    push   %rax
   0x0000000000400b67 <+55>:    callq  0x40dd20 <abort>
  End of assembler dump.

  For sake of comparison, clang without -mretpoline:

  # gdb -batch -ex 'disassemble dispatch' ./c-switch
  Dump of assembler code for function dispatch:
   0x0000000000400b30 <+0>:	cmp    $0x4,%edi
   0x0000000000400b33 <+3>:	ja     0x400b57 <dispatch+39>
   0x0000000000400b35 <+5>:	mov    %edi,%eax
   0x0000000000400b37 <+7>:	jmpq   *0x492148(,%rax,8)
   0x0000000000400b3e <+14>:	jmpq   0x400bf0 <fn_0>
   0x0000000000400b43 <+19>:	jmpq   0x400c30 <fn_4>
   0x0000000000400b48 <+24>:	jmpq   0x400c10 <fn_2>
   0x0000000000400b4d <+29>:	jmpq   0x400c20 <fn_3>
   0x0000000000400b52 <+34>:	jmpq   0x400c00 <fn_1>
   0x0000000000400b57 <+39>:	push   %rax
   0x0000000000400b58 <+40>:	callq  0x40dcf0 <abort>
  End of assembler dump.

Raising the cases to a high number (e.g. 100) will still result in similar
code generation pattern with clang and gcc as above, in other words clang
generally turns off jump table emission by having an extra expansion pass
under retpoline build to turn indirectbr instructions from their IR into
switch instructions as a built-in -mno-jump-table lowering of a switch (in
this case, even if IR input already contained an indirect branch).

For gcc, adding --param=case-values-threshold=20 as in similar fashion as
s390 in order to raise the limit for x86 retpoline enabled builds results
in a small vmlinux size increase of only 0.13% (before=18,027,528
after=18,051,192). For clang this option is ignored due to i) not being
needed as mentioned and ii) not having above cmdline
parameter. Non-retpoline-enabled builds with gcc continue to use the
default case-values-threshold setting, so nothing changes here.

[0] https://lore.kernel.org/netdev/20190129095754.9390-1-bjorn.topel@gmail.com/
    and "The Path to DPDK Speeds for AF_XDP", LPC 2018, networking track:
  - http://vger.kernel.org/lpc_net2018_talks/lpc18_pres_af_xdp_perf-v3.pdf
  - http://vger.kernel.org/lpc_net2018_talks/lpc18_paper_af_xdp_perf-v2.pdf

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: netdev@vger.kernel.org
Cc: David S. Miller <davem@davemloft.net>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20190221221941.29358-1-daniel@iogearbox.net
2019-02-28 12:10:31 +01:00
Lan Tianyu 9cd05ad291 x86/hyper-v: Fix definition of HV_MAX_FLUSH_REP_COUNT
The max flush rep count of HvFlushGuestPhysicalAddressList hypercall is
equal with how many entries of union hv_gpa_page_range can be populated
into the input parameter page.

The code lacks parenthesis around PAGE_SIZE - 2 * sizeof(u64) which results
in bogus computations. Add them.

Fixes: cc4edae4b9 ("x86/hyper-v: Add HvFlushGuestAddressList hypercall support")
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: kys@microsoft.com
Cc: haiyangz@microsoft.com
Cc: sthemmin@microsoft.com
Cc: sashal@kernel.org
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: gregkh@linuxfoundation.org
Cc: devel@linuxdriverproject.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190225143114.5149-1-Tianyu.Lan@microsoft.com
2019-02-28 11:58:29 +01:00
Lan Tianyu 84fdfafab8 x86/Hyper-V: Set x2apic destination mode to physical when x2apic is available
Hyper-V doesn't provide irq remapping for IO-APIC. To enable x2apic,
set x2apic destination mode to physcial mode when x2apic is available
and Hyper-V IOMMU driver makes sure cpus assigned with IO-APIC irqs have
8-bit APIC id.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-28 11:12:16 +01:00
David Howells 23bf1b6be9 kernfs, sysfs, cgroup, intel_rdt: Support fs_context
Make kernfs support superblock creation/mount/remount with fs_context.

This requires that sysfs, cgroup and intel_rdt, which are built on kernfs,
be made to support fs_context also.

Notes:

 (1) A kernfs_fs_context struct is created to wrap fs_context and the
     kernfs mount parameters are moved in here (or are in fs_context).

 (2) kernfs_mount{,_ns}() are made into kernfs_get_tree().  The extra
     namespace tag parameter is passed in the context if desired

 (3) kernfs_free_fs_context() is provided as a destructor for the
     kernfs_fs_context struct, but for the moment it does nothing except
     get called in the right places.

 (4) sysfs doesn't wrap kernfs_fs_context since it has no parameters to
     pass, but possibly this should be done anyway in case someone wants to
     add a parameter in future.

 (5) A cgroup_fs_context struct is created to wrap kernfs_fs_context and
     the cgroup v1 and v2 mount parameters are all moved there.

 (6) cgroup1 parameter parsing error messages are now handled by invalf(),
     which allows userspace to collect them directly.

 (7) cgroup1 parameter cleanup is now done in the context destructor rather
     than in the mount/get_tree and remount functions.

Weirdies:

 (*) cgroup_do_get_tree() calls cset_cgroup_from_root() with locks held,
     but then uses the resulting pointer after dropping the locks.  I'm
     told this is okay and needs commenting.

 (*) The cgroup refcount web.  This really needs documenting.

 (*) cgroup2 only has one root?

Add a suggestion from Thomas Gleixner in which the RDT enablement code is
placed into its own function.

[folded a leak fix from Andrey Vagin]

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc: Tejun Heo <tj@kernel.org>
cc: Li Zefan <lizefan@huawei.com>
cc: Johannes Weiner <hannes@cmpxchg.org>
cc: cgroups@vger.kernel.org
cc: fenghua.yu@intel.com
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28 03:29:34 -05:00
Ingo Molnar c978b9460f perf/core improvements and fixes:
perf annotate:
 
   Wei Li:
 
   - Fix getting source line failure
 
 perf script:
 
   Andi Kleen:
 
   - Handle missing fields with -F +...
 
 perf data:
 
   Jiri Olsa:
 
   - Prep work to support per-cpu files in a directory.
 
 Intel PT:
 
   Adrian Hunter:
 
   - Improve thread_stack__no_call_return()
 
   - Hide x86 retpolines in thread stacks.
 
   - exported SQL viewer refactorings, new 'top calls' report..
 
   Alexander Shishkin:
 
   - Copy parent's address filter offsets on clone
 
   - Fix address filters for vmas with non-zero offset. Applies to
     ARM's CoreSight as well.
 
 python scripts:
 
   Tony Jones:
 
   - Python3 support for several 'perf script' python scripts.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXHRYNwAKCRCyPKLppCJ+
 J8XmAQDKY7gb3GhkX+4aE8cGffFYB2YV5mD9Bbu4AM9tuFFBJwD+KAq87FMCy7m7
 h7xyWk3UILpz6y235AVdfOmgcNDkpAQ=
 =SJCG
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-5.1-20190225' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf annotate:

  Wei Li:

  - Fix getting source line failure

perf script:

  Andi Kleen:

  - Handle missing fields with -F +...

perf data:

  Jiri Olsa:

  - Prep work to support per-cpu files in a directory.

Intel PT:

  Adrian Hunter:

  - Improve thread_stack__no_call_return()

  - Hide x86 retpolines in thread stacks.

  - exported SQL viewer refactorings, new 'top calls' report..

  Alexander Shishkin:

  - Copy parent's address filter offsets on clone

  - Fix address filters for vmas with non-zero offset. Applies to
    ARM's CoreSight as well.

python scripts:

  Tony Jones:

  - Python3 support for several 'perf script' python scripts.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 08:29:50 +01:00
Ingo Molnar 9ed8f1a6e7 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 08:27:17 +01:00
Ingo Molnar 0614621d89 Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 07:50:39 +01:00
Eric Biggers f86d17e9ef crypto: arm64/chacha - fix hchacha_block_neon() for big endian
On big endian arm64 kernels, the xchacha20-neon and xchacha12-neon
self-tests fail because hchacha_block_neon() outputs little endian words
but the C code expects native endianness.  Fix it to output the words in
native endianness (which also makes it match the arm32 version).

Fixes: cc7cf991e9 ("crypto: arm64/chacha20 - add XChaCha20 support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-28 14:37:48 +08:00
Eric Biggers 4b6d196c9c crypto: arm64/chacha - fix chacha_4block_xor_neon() for big endian
The change to encrypt a fifth ChaCha block using scalar instructions
caused the chacha20-neon, xchacha20-neon, and xchacha12-neon self-tests
to start failing on big endian arm64 kernels.  The bug is that the
keystream block produced in 32-bit scalar registers is directly XOR'd
with the data words, which are loaded and stored in native endianness.
Thus in big endian mode the data bytes end up XOR'd with the wrong
bytes.  Fix it by byte-swapping the keystream words in big endian mode.

Fixes: 2fe55987b2 ("crypto: arm64/chacha - use combined SIMD/ALU routine for more speed")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-28 14:37:48 +08:00
Tommi Hirvola 7748168c66 crypto: x86/poly1305 - Clear key material from stack in SSE2 variant
1-block SSE2 variant of poly1305 stores variables s1..s4 containing key
material on the stack. This commit adds missing zeroing of the stack
memory. Benchmarks show negligible performance hit (tested on i7-3770).

Signed-off-by: Tommi Hirvola <tommi@hirvola.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-28 14:17:59 +08:00
Thomas Bogendoerfer e0bf304e4a
MIPS: fix memory setup for platforms with PHYS_OFFSET != 0
For platforms, which use a PHYS_OFFSET != 0, symbol _end also
contains that offset. So when calling memblock_reserve() for
reserving kernel the size argument needs to be adjusted.

Fixes: bcec54bf31 ("mips: switch to NO_BOOTMEM")
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v4.20+
2019-02-27 18:49:29 -08:00
Alexey Kardashevskiy 11f5acce2f powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables
We store 2 multilevel tables in iommu_table - one for the hardware and
one with the corresponding userspace addresses. Before allocating
the tables, the iommu_table_group_ops::get_table_size() hook returns
the combined size of the two and VFIO SPAPR TCE IOMMU driver adjusts
the locked_vm counter correctly. When the table is actually allocated,
the amount of allocated memory is stored in iommu_table::it_allocated_size
and used to decrement the locked_vm counter when we release the memory
used by the table; .get_table_size() and .create_table() calculate it
independently but the result is expected to be the same.

However the allocator does not add the userspace table size to
.it_allocated_size so when we destroy the table because of VFIO PCI
unplug (i.e. VFIO container is gone but the userspace keeps running),
we decrement locked_vm by just a half of size of memory we are
releasing.

To make things worse, since we enabled on-demand allocation of
indirect levels, it_allocated_size contains only the amount of memory
actually allocated at the table creation time which can just be a
fraction. It is not a problem with incrementing locked_vm (as
get_table_size() value is used) but it is with decrementing.

As the result, we leak locked_vm and may not be able to allocate more
IOMMU tables after few iterations of hotplug/unplug.

This sets it_allocated_size in the pnv_pci_ioda2_ops::create_table()
hook to what pnv_pci_ioda2_get_table_size() returns so from now on we
have a single place which calculates the maximum memory a table can
occupy. The original meaning of it_allocated_size is somewhat lost now
though.

We do not ditch it_allocated_size whatsoever here and we do not call
get_table_size() from vfio_iommu_spapr_tce.c when decrementing
locked_vm as we may have multiple IOMMU groups per container and even
though they all are supposed to have the same get_table_size()
implementation, there is a small chance for failure or confusion.

Fixes: 090bad39b2 ("powerpc/powernv: Add indirect levels to it_userspace")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-28 11:50:02 +11:00
Thomas Gleixner cfbe271667 y2038: additional syscall ABI cleanup
This is a follow-up to the y2038 syscall patches already merged in the tip
 tree.  As the final 32-bit RISC-V syscall ABI is still being decided on,
 this is the last chance to make a few corrections to leave out interfaces
 based on 32-bit time_t along with the old off_t and rlimit types.
 
 The series achieves this in a few steps:
 
 - A couple of bug fixes for minor regressions I introduced
   in the original series
 
 - A couple of older patches from Yury Norov that I had never
   merged in the past, these fix up the openat/open_by_handle_at and
   getrlimit/setrlimit syscalls to disallow the old versions of off_t
   and rlimit.
 
 - Hiding the deprecated system calls behind an #ifdef in
   include/uapi/asm-generic/unistd.h
 
 - Change arch/riscv to drop all these ABIs.
 
 Originally, the plan was to also leave these out on C-Sky, but that now
 has a glibc port that uses the older interfaces, so we need to leave
 them in place.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcdEhGAAoJEGCrR//JCVInQuUQAN+mRFzRXAqhbpb63/vYGJei
 nmDqB+SoxzaIKAIGAVIdMGUoFxBrY1oyS4m6/a9lzQ9G4aSkr0PruZnUID+vIo2h
 rj+3FBlB/c9nvW+NG8iEtVadlRbTmoRILCWpvgIuLNd6fwvNzP3V4uu6a1QRIMx4
 aUCWQfhzv18kW1EAPIroPA1gEL2HKbhDdEuN2V0SKnsKNiWkHQeswWQFAYpLgT36
 eZ+L52lh+miEdtBxycxJ5lh3KsWO4dPImh+QHONZgeB9iS8v47K0R6ONKm4NMeQV
 5KW55pepUq1uQUdEU9KRrh2krMih2IJbOQoN2lvb2ao5UG6erHbj0N55RQym5gSC
 +TrvP3dnqfohh9hWdHDwME+5OTeOM+8SUMRnaZBJKuywzo7W1ceLpf+KZjwlk2s5
 AgEX67fKrUbtBfTgVhzlYhJLWcgSD1yt64ed5SF15c5M3JZhkK8cd50dB9pM2/YB
 o9VbijkYwb2KyCNUiV3nghgiiqcROvOIO7PK6z3XFFiRm/Gn2CgNZyZa7c4+Vgrr
 PM/DmDvCdFqYnqBOlV2ilCLigKGN0JgwzMXnbQU77d71Yg7Bco8e/yqSucSilp2d
 lEv44extu9FINWXIqvWEjRqdSq+sNgj21VSp6Zu/GaTgNCQKac2wsAZtnQgnslko
 knKwwp525fjqnJEDd1aH
 =/iFA
 -----END PGP SIGNATURE-----

Merge tag 'y2038-syscall-abi' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground into timers/2038

Pull additional syscall ABI cleanup for y2038 from Arnd Bergmann:

This is a follow-up to the y2038 syscall patches already merged in the tip
tree.  As the final 32-bit RISC-V syscall ABI is still being decided on,
this is the last chance to make a few corrections to leave out interfaces
based on 32-bit time_t along with the old off_t and rlimit types.

The series achieves this in a few steps:

- A couple of bug fixes for minor regressions I introduced
  in the original series

- A couple of older patches from Yury Norov that I had never
  merged in the past, these fix up the openat/open_by_handle_at and
  getrlimit/setrlimit syscalls to disallow the old versions of off_t
  and rlimit.

- Hiding the deprecated system calls behind an #ifdef in
  include/uapi/asm-generic/unistd.h

- Change arch/riscv to drop all these ABIs.

Originally, the plan was to also leave these out on C-Sky, but that now
has a glibc port that uses the older interfaces, so we need to leave
them in place.
2019-02-27 21:45:27 +01:00
Christophe Leroy 27da80719e powerpc/fsl: Fix the flush of branch predictor.
The commit identified below adds MC_BTB_FLUSH macro only when
CONFIG_PPC_FSL_BOOK3E is defined. This results in the following error
on some configs (seen several times with kisskb randconfig_defconfig)

arch/powerpc/kernel/exceptions-64e.S:576: Error: Unrecognized opcode: `mc_btb_flush'
make[3]: *** [scripts/Makefile.build:367: arch/powerpc/kernel/exceptions-64e.o] Error 1
make[2]: *** [scripts/Makefile.build:492: arch/powerpc/kernel] Error 2
make[1]: *** [Makefile:1043: arch/powerpc] Error 2
make: *** [Makefile:152: sub-make] Error 2

This patch adds a blank definition of MC_BTB_FLUSH for other cases.

Fixes: 10c5e83afd ("powerpc/fsl: Flush the branch predictor at each kernel entry (64bit)")
Cc: Diana Craciun <diana.craciun@nxp.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Diana Craciun <diana.craciun@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-27 22:52:38 +11:00
Jordan Niethe 7b62f9bd22 powerpc/powernv: Make opal log only readable by root
Currently the opal log is globally readable. It is kernel policy to
limit the visibility of physical addresses / kernel pointers to root.
Given this and the fact the opal log may contain this information it
would be better to limit the readability to root.

Fixes: bfc36894a4 ("powerpc/powernv: Add OPAL message log interface")
Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Stewart Smith <stewart@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-27 22:11:31 +11:00
Brian Norris 5364a0b4f4 arm64: dts: rockchip: move QCA6174A wakeup pin into its USB node
Currently, we don't coordinate BT USB activity with our handling of the
BT out-of-band wake pin, and instead just use gpio-keys. That causes
problems because we have no way of distinguishing wake activity due to a
BT device (e.g., mouse) vs. the BT controller (e.g., re-configuring wake
mask before suspend). This can cause spurious wake events just because
we, for instance, try to reconfigure the host controller's event mask
before suspending.

We can avoid these synchronization problems by handling the BT wake pin
directly in the btusb driver -- for all activity up until BT controller
suspend(), we simply listen to normal USB activity (e.g., to know the
difference between device and host activity); once we're really ready to
suspend the host controller, there should be no more host activity, and
only *then* do we unmask the GPIO interrupt.

This is already supported by btusb; we just need to describe the wake
pin in the right node.

We list 2 compatible properties, since both PID/VID pairs show up on
Scarlet devices, and they're both essentially identical QCA6174A-based
modules.

Also note that the polarity was wrong before: Qualcomm implemented WAKE
as active high, not active low. We only got away with this because
gpio-keys always reconfigured us as bi-directional edge-triggered.

Finally, we have an external pull-up and a level-shifter on this line
(we didn't notice Qualcomm's polarity in the initial design), so we
can't do pull-down. Switch to pull-none.

Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-02-27 08:50:15 +01:00
Marc Gonzalez 6e53330909 arm64: dts: qcom: msm8998: Extend TZ reserved memory area
My console locks up as soon as Linux writes to [88800000,88f00000[
AFAIU, that memory area is reserved for trustzone.

Extend TZ reserved memory range, to prevent Linux from stepping on
trustzone's toes.

Cc: stable@vger.kernel.org # 4.20+
Reviewed-by: Sibi Sankar <sibis@codeaurora.org>
Fixes: c783394956 ("arm64: dts: qcom: msm8998: Add smem related nodes")
Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
2019-02-26 23:32:11 -06:00
Paul Mackerras e74d53e30e KVM: PPC: Fix compilation when KVM is not enabled
Compiling with CONFIG_PPC_POWERNV=y and KVM disabled currently gives
an error like this:

  CC      arch/powerpc/kernel/dbell.o
In file included from arch/powerpc/kernel/dbell.c:20:0:
arch/powerpc/include/asm/kvm_ppc.h: In function ‘xics_on_xive’:
arch/powerpc/include/asm/kvm_ppc.h:625:9: error: implicit declaration of function ‘xive_enabled’ [-Werror=implicit-function-declaration]
  return xive_enabled() && cpu_has_feature(CPU_FTR_HVMODE);
         ^
cc1: all warnings being treated as errors
scripts/Makefile.build:276: recipe for target 'arch/powerpc/kernel/dbell.o' failed
make[3]: *** [arch/powerpc/kernel/dbell.o] Error 1

Fix this by making the xics_on_xive() definition conditional on the
same symbol (CONFIG_KVM_BOOK3S_64_HANDLER) that determines whether we
include <asm/xive.h> or not, since that's the header that defines
xive_enabled().

Fixes: 03f953329b ("KVM: PPC: Book3S: Allow XICS emulation to work in nested hosts using XIVE")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-02-27 09:14:44 +11:00
Julien Thierry 4caf8758b6 arm64: Rename get_thread_info()
The assembly macro get_thread_info() actually returns a task_struct and is
analogous to the current/get_current macro/function.

While it could be argued that thread_info sits at the start of
task_struct and the intention could have been to return a thread_info,
instances of loads from/stores to the address obtained from
get_thread_info() use offsets that are generated with
offsetof(struct task_struct, [...]).

Rename get_thread_info() to state it returns a task_struct.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-26 16:57:59 +00:00
Julien Grall 47224e51ab arm64: Remove documentation about TIF_USEDFPU
TIF_USEDFPU is not defined as thread flags for Arm64. So drop it from
the documentation.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-26 16:41:10 +00:00
Nathan Chancellor e7140639b1 powerpc/xmon: Fix opcode being uninitialized in print_insn_powerpc
When building with -Wsometimes-uninitialized, Clang warns:

  arch/powerpc/xmon/ppc-dis.c:157:7: warning: variable 'opcode' is used
  uninitialized whenever 'if' condition is false
  [-Wsometimes-uninitialized]
    if (cpu_has_feature(CPU_FTRS_POWER9))
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  arch/powerpc/xmon/ppc-dis.c:167:7: note: uninitialized use occurs here
    if (opcode == NULL)
        ^~~~~~
  arch/powerpc/xmon/ppc-dis.c:157:3: note: remove the 'if' if its
  condition is always true
    if (cpu_has_feature(CPU_FTRS_POWER9))
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  arch/powerpc/xmon/ppc-dis.c:132:38: note: initialize the variable
  'opcode' to silence this warning
    const struct powerpc_opcode *opcode;
                                       ^
                                        = NULL
  1 warning generated.

This warning seems to make no sense on the surface because opcode is set
to NULL right below this statement. However, there is a comma instead of
semicolon to end the dialect assignment, meaning that the opcode
assignment only happens in the if statement. Properly terminate that
line so that Clang no longer warns.

Fixes: 5b102782c7 ("powerpc/xmon: Enable disassembly files (compilation changes)")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 23:55:22 +11:00
Nicholas Piggin 75d9fc7fd9 powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C
The OPAL call wrapper gets interrupt disabling wrong. It disables
interrupts just by clearing MSR[EE], which has two problems:

- It doesn't call into the IRQ tracing subsystem, which means tracing
  across OPAL calls does not always notice IRQs have been disabled.

- It doesn't go through the IRQ soft-mask code, which causes a minor
  bug. MSR[EE] can not be restored by saving the MSR then clearing
  MSR[EE], because a racing interrupt while soft-masked could clear
  MSR[EE] between the two steps. This can cause MSR[EE] to be
  incorrectly enabled when the OPAL call returns. Fortunately that
  should only result in another masked interrupt being taken to
  disable MSR[EE] again, but it's a bit sloppy.

The existing code also saves MSR to PACA, which is not re-entrant if
there is a nested OPAL call from different MSR contexts, which can
happen these days with SRESET interrupts on bare metal.

To fix these issues, move the tracing and IRQ handling code to C, and
call into asm just for the low level call when everything is ready to
go. Save the MSR on stack rather than PACA.

Performance cost is kept to a minimum with a few optimisations:

- The endian switch upon return is combined with the MSR restore,
  which avoids an expensive context synchronizing operation for LE
  kernels. This makes up for the additional mtmsrd to enable
  interrupts with local_irq_enable().

- blr is now used to return from the opal_* functions that are called
  as C functions, to avoid link stack corruption. This requires a
  skiboot fix as well to keep the call stack balanced.

A NULL call is more costly after this, (410ns->430ns on POWER9), but
OPAL calls are generally not performance critical at this scale.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 23:55:09 +11:00
Nicholas Piggin 38555434a9 powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
Handlers for interrupts that set DAR / DSISR, set MSR[RI] before those
SPRs are read. If a d-side machine check hits in this window, DAR /
DSISR will be clobbered silently, leading to random corruption.

Fix this by having handlers save those registers before setting MSR[RI].

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 23:28:26 +11:00
Nicholas Piggin e779fc9364 powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy
A subsequent fix for data interrupts (those that set DAR / DSISR)
requires some interrupt macros to be open-coded, and also requires
the 0x300 interrupt handler to be moved out-of-line.

This patch does that without changing behaviour, which makes the later
fix a smaller change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 23:28:26 +11:00
Nicholas Piggin cbf2ba952a powerpc/64s: system reset interrupt preserve HSRRs
Code that uses HSRR registers is not required to clear MSR[RI] by
convention, however the system reset NMI itself may use HSRR
registers (e.g., to call OPAL) and clobber them.

Rather than introduce the requirement to clear RI in order to use
HSRRs, have system reset interrupt save and restore HSRRs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 23:28:25 +11:00
Nicholas Piggin ccd477028a powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
HV interrupts that use HSRR registers do not enter with MSR[RI] clear,
but their entry code is not recoverable vs NMI, due to shared use of
HSPRG1 as a scratch register to save r13.

This means that a system reset or machine check that hits in HSRR
interrupt entry can cause r13 to be silently corrupted.

Fix this by marking NMIs non-recoverable if they land in HV interrupt
ranges.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 23:28:24 +11:00
Vladimir Murzin d410a8a49e ARM: 8849/1: NOMMU: Fix encodings for PMSAv8's PRBAR4/PRLAR4
To access PRBARn, where n is referenced as a binary number:

MRC p15, 0, <Rt>, c6, c8+n[3:1], 4*n[0] ; Read PRBARn into Rt
MCR p15, 0, <Rt>, c6, c8+n[3:1], 4*n[0] ; Write Rt into PRBARn

To access PRLARn, where n is referenced as a binary number:

MRC p15, 0, <Rt>, c6, c8+n[3:1], 4*n[0]+1 ; Read PRLARn into Rt
MCR p15, 0, <Rt>, c6, c8+n[3:1], 4*n[0]+1 ; Write Rt into PRLARn

For PR{B,L}AR4, n is 4, n[0] is 0, n[3:1] is 2, while current encoding
done with n[0] set to 1 which is wrong. Use proper encoding instead.

Fixes: 046835b4aa ("ARM: 8757/1: NOMMU: Support PMSAv8 MPU")
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-02-26 11:35:56 +00:00
Vladimir Murzin 9db043d36b ARM: 8848/1: virt: Align GIC version check with arm64 counterpart
arm64 has got relaxation on GIC version check at early boot stage due
to update of the GIC architecture let's align ARM with that.

To help backports (even though the code was correct at the time of writing)
Fixes: e59941b9b3 ("ARM: 8527/1: virt: enable GICv3 system registers")
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-02-26 11:33:48 +00:00
Marek Szyprowski ca70ea43f8 ARM: 8847/1: pm: fix HYP/SVC mode mismatch when MCPM is used
MCPM does a soft reset of the CPUs and uses common cpu_resume() routine to
perform low-level platform initialization. This results in a try to install
HYP stubs for the second time for each CPU and results in false HYP/SVC
mode mismatch detection. The HYP stubs are already installed at the
beginning of the kernel initialization on the boot CPU (head.S) or in the
secondary_startup() for other CPUs. To fix this issue MCPM code should use
a cpu_resume() routine without HYP stubs installation.

This change fixes HYP/SVC mode mismatch on Samsung Exynos5422-based Odroid
XU3/XU4/HC1 boards.

Fixes: 3721924c81 ("ARM: 8081/1: MCPM: provide infrastructure to allow for MCPM loopback")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Anand Moon <linux.amoon@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-02-26 11:32:54 +00:00
Stefan Agner b7e8c9397c ARM: 8845/1: use unified assembler in c files
Use unified assembler syntax (UAL) in inline assembler. Divided
syntax is considered deprecated. This will also allow to build
the kernel using LLVM's integrated assembler.

When compiling non-Thumb2 GCC always emits a ".syntax divided"
at the beginning of the inline assembly which makes the
assembler fail. Since GCC 5 there is the -masm-syntax-unified
GCC option which make GCC assume unified syntax asm and hence
emits ".syntax unified" even in ARM mode. However, the option
is broken since GCC version 6 (see GCC PR88648 [1]). Work
around by adding ".syntax unified" as part of the inline
assembly.

[0] https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html#index-masm-syntax-unified
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88648

Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-02-26 11:26:08 +00:00
Stefan Agner e44fc38818 ARM: 8844/1: use unified assembler in assembly files
Use unified assembler syntax (UAL) in assembly files. Divided
syntax is considered deprecated. This will also allow to build
the kernel using LLVM's integrated assembler.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-02-26 11:26:07 +00:00
Stefan Agner c001899a5d ARM: 8843/1: use unified assembler in headers
Use unified assembler syntax (UAL) in headers. Divided syntax is
considered deprecated. This will also allow to build the kernel
using LLVM's integrated assembler.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-02-26 11:26:06 +00:00
Stefan Agner a216376add ARM: 8841/1: use unified assembler in macros
Use unified assembler syntax (UAL) in macros. Divided syntax is
considered deprecated. This will also allow to build the kernel
using LLVM's integrated assembler.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-02-26 11:25:43 +00:00
Sebastian Andrzej Siewior 74ffe79ae5 ARM: 8840/1: use a raw_spinlock_t in unwind
Mostly unwind is done with irqs enabled however SLUB may call it with
irqs disabled while creating a new SLUB cache.

I had system freeze while loading a module which called
kmem_cache_create() on init. That means SLUB's __slab_alloc() disabled
interrupts and then

->new_slab_objects()
 ->new_slab()
  ->setup_object()
   ->setup_object_debug()
    ->init_tracking()
     ->set_track()
      ->save_stack_trace()
       ->save_stack_trace_tsk()
        ->walk_stackframe()
         ->unwind_frame()
          ->unwind_find_idx()
           =>spin_lock_irqsave(&unwind_lock);

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-02-26 11:24:51 +00:00
Yang Shi 143c2a89e0 ARM: 8839/1: kprobe: make patch_lock a raw_spinlock_t
When running kprobe on -rt kernel, the below bug is caught:

|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:931
|in_atomic(): 1, irqs_disabled(): 128, pid: 14, name: migration/0
|Preemption disabled at:[<802f2b98>] cpu_stopper_thread+0xc0/0x140
|CPU: 0 PID: 14 Comm: migration/0 Tainted: G O 4.8.3-rt2 #1
|Hardware name: Freescale LS1021A
|[<8025a43c>] (___might_sleep)
|[<80b5b324>] (rt_spin_lock)
|[<80b5c31c>] (__patch_text_real)
|[<80b5c3ac>] (patch_text_stop_machine)
|[<802f2920>] (multi_cpu_stop)

Since patch_text_stop_machine() is called in stop_machine() which
disables IRQ, sleepable lock should be not used in this atomic context,
 so replace patch_lock to raw lock.

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-02-26 11:24:50 +00:00
Aneesh Kumar K.V 3b4d07d267 powerpc/mm/hash: Handle mmap_min_addr correctly in get_unmapped_area topdown search
When doing top-down search the low_limit is not PAGE_SIZE but rather
max(PAGE_SIZE, mmap_min_addr). This handle cases in which mmap_min_addr >
PAGE_SIZE.

Fixes: fba2369e6c ("mm: use vm_unmapped_area() on powerpc architecture")
Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 16:26:29 +11:00
Aneesh Kumar K.V 5330367fa3 powerpc/hugetlb: Handle mmap_min_addr correctly in get_unmapped_area callback
After we ALIGN up the address we need to make sure we didn't overflow
and resulted in zero address. In that case, we need to make sure that
the returned address is greater than mmap_min_addr.

This fixes selftest va_128TBswitch --run-hugetlb reporting failures when
run as non root user for

mmap(-1, MAP_HUGETLB)

The bug is that a non-root user requesting address -1 will be given address 0
which will then fail, whereas they should have been given something else that
would have succeeded.

We also avoid the first mmap(-1, MAP_HUGETLB) returning NULL address as mmap address
with this change. So we think this is not a security issue, because it only affects
whether we choose an address below mmap_min_addr, not whether we
actually allow that address to be mapped. ie. there are existing capability
checks to prevent a user mapping below mmap_min_addr and those will still be
honoured even without this fix.

Fixes: 484837601d ("powerpc/mm: Add radix support for hugetlb")
Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 16:26:28 +11:00
Tony Luck 41f035a86b x86/mce: Improve error message when kernel cannot recover, p2
In

  c7d606f560 ("x86/mce: Improve error message when kernel cannot recover")

a case was added for a machine check caused by a DATA access to poison
memory from the kernel. A case should have been added also for an
uncorrectable error during an instruction fetch in the kernel.

Add that extra case so the error message now reads:

  mce: [Hardware Error]: Machine check: Instruction fetch error in kernel

Fixes: c7d606f560 ("x86/mce: Improve error message when kernel cannot recover")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190225205940.15226-1-tony.luck@intel.com
2019-02-25 23:21:35 +01:00
Hauke Mehrtens aeb669d41f
MIPS: lantiq: Remove separate GPHY Firmware loader
The separate GPHY Firmware loader driver is not used any more, the GPHY
firmware is now loaded by the GSWIP switch driver which also makes use
of the GPHY.
Remove the old unused GPHY firmware loader driver.

The GPHY firmware is useless without an Ethernet and switch driver, it
should not harm if loading this does not work for system using an old
device tree.
I am not aware of any vendor separating the device tree from the kernel
binary, it should be ok to remove this.

The code and the functionality form this separate GPHY firmware loader
was added to the gswip driver in commit 14fceff477 ("net: dsa: Add
Lantiq / Intel DSA driver for vrx200")

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org
Cc: john@phrozen.org
Cc: netdev@vger.kernel.org
2019-02-25 14:17:10 -08:00
Borislav Petkov 2e7614c073 x86/uaccess: Remove unused __addr_ok() macro
This was caught while staring at the whole {set,get}_fs() machinery.

It's last user, the 32-bit version of strnlen_user() went away with

  5723aa993d ("x86: use the new generic strnlen_user() function")

so drop it.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: the arch/x86 maintainers <x86@kernel.org>
Cc: "Tobin C. Harding" <tobin@kernel.org>
Link: https://lkml.kernel.org/r/20190225191109.7671-1-bp@alien8.de
2019-02-25 23:13:05 +01:00
Jonas Gorski 18836b48eb
MIPS: BCM63XX: provide DMA masks for ethernet devices
The switch to the generic dma ops made dma masks mandatory, breaking
devices having them not set. In case of bcm63xx, it broke ethernet with
the following warning when trying to up the device:

[    2.633123] ------------[ cut here ]------------
[    2.637949] WARNING: CPU: 0 PID: 325 at ./include/linux/dma-mapping.h:516 bcm_enetsw_open+0x160/0xbbc
[    2.647423] Modules linked in: gpio_button_hotplug
[    2.652361] CPU: 0 PID: 325 Comm: ip Not tainted 4.19.16 #0
[    2.658080] Stack : 80520000 804cd3ec 00000000 00000000 804ccc00 87085bdc 87d3f9d4 804f9a17
[    2.666707]         8049cf18 00000145 80a942a0 00000204 80ac0000 10008400 87085b90 eb3d5ab7
[    2.675325]         00000000 00000000 80ac0000 000022b0 00000000 00000000 00000007 00000000
[    2.683954]         0000007a 80500000 0013b381 00000000 80000000 00000000 804a1664 80289878
[    2.692572]         00000009 00000204 80ac0000 00000200 00000002 00000000 00000000 80a90000
[    2.701191]         ...
[    2.703701] Call Trace:
[    2.706244] [<8001f3c8>] show_stack+0x58/0x100
[    2.710840] [<800336e4>] __warn+0xe4/0x118
[    2.715049] [<800337d4>] warn_slowpath_null+0x48/0x64
[    2.720237] [<80289878>] bcm_enetsw_open+0x160/0xbbc
[    2.725347] [<802d1d4c>] __dev_open+0xf8/0x16c
[    2.729913] [<802d20cc>] __dev_change_flags+0x100/0x1c4
[    2.735290] [<802d21b8>] dev_change_flags+0x28/0x70
[    2.740326] [<803539e0>] devinet_ioctl+0x310/0x7b0
[    2.745250] [<80355fd8>] inet_ioctl+0x1f8/0x224
[    2.749939] [<802af290>] sock_ioctl+0x30c/0x488
[    2.754632] [<80112b34>] do_vfs_ioctl+0x740/0x7dc
[    2.759459] [<80112c20>] ksys_ioctl+0x50/0x94
[    2.763955] [<800240b8>] syscall_common+0x34/0x58
[    2.768782] ---[ end trace fb1a6b14d74e28b6 ]---
[    2.773544] bcm63xx_enetsw bcm63xx_enetsw.0: cannot allocate rx ring 512

Fix this by adding appropriate DMA masks for the platform devices.

Fixes: f8c55dc6e8 ("MIPS: use generic dma noncoherent ops for simple noncoherent platforms")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: stable@vger.kernel.org # v4.19+
2019-02-25 12:56:39 -08:00
Arnd Bergmann d4c08b9776 riscv: Use latest system call ABI
We don't yet have an upstream glibc port for riscv, so there is no user
space for the existing ABI, and we can remove the definitions for 32-bit
time_t, off_t and struct resource and system calls based on them,
including the vdso.

Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-25 20:53:52 +01:00
Andy Lutomirski 2a418cf3f5 x86/uaccess: Don't leak the AC flag into __put_user() value evaluation
When calling __put_user(foo(), ptr), the __put_user() macro would call
foo() in between __uaccess_begin() and __uaccess_end().  If that code
were buggy, then those bugs would be run without SMAP protection.

Fortunately, there seem to be few instances of the problem in the
kernel. Nevertheless, __put_user() should be fixed to avoid doing this.
Therefore, evaluate __put_user()'s argument before setting AC.

This issue was noticed when an objtool hack by Peter Zijlstra complained
about genregs_get() and I compared the assembly output to the C source.

 [ bp: Massage commit message and fixed up whitespace. ]

Fixes: 11f1a4b975 ("x86: reorganize SMAP handling in user space accesses")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190225125231.845656645@infradead.org
2019-02-25 20:17:05 +01:00
Linus Torvalds 53a41cb7ed Revert "x86/fault: BUG() when uaccess helpers fault on kernel addresses"
This reverts commit 9da3f2b740.

It was well-intentioned, but wrong.  Overriding the exception tables for
instructions for random reasons is just wrong, and that is what the new
code did.

It caused problems for tracing, and it caused problems for strncpy_from_user(),
because the new checks made perfectly valid use cases break, rather than
catch things that did bad things.

Unchecked user space accesses are a problem, but that's not a reason to
add invalid checks that then people have to work around with silly flags
(in this case, that 'kernel_uaccess_faults_ok' flag, which is just an
odd way to say "this commit was wrong" and was sprinked into random
places to hide the wrongness).

The real fix to unchecked user space accesses is to get rid of the
special "let's not check __get_user() and __put_user() at all" logic.
Make __{get|put}_user() be just aliases to the regular {get|put}_user()
functions, and make it impossible to access user space without having
the proper checks in places.

The raison d'être of the special double-underscore versions used to be
that the range check was expensive, and if you did multiple user
accesses, you'd do the range check up front (like the signal frame
handling code, for example).  But SMAP (on x86) and PAN (on ARM) have
made that optimization pointless, because the _real_ expense is the "set
CPU flag to allow user space access".

Do let's not break the valid cases to catch invalid cases that shouldn't
even exist.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tobin C. Harding <tobin@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-25 09:10:51 -08:00
Sandipan Das 6324320de6 powerpc sstep: Add support for modsd, modud instructions
This adds emulation support for the following integer instructions:
  * Modulo Signed Doubleword (modsd)
  * Modulo Unsigned Doubleword (modud)

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 00:05:20 +11:00
PrasannaKumar Muralidharan 6c18007150 powerpc sstep: Add support for modsw, moduw instructions
This adds emulation support for the following integer instructions:
  * Modulo Signed Word (modsw)
  * Modulo Unsigned Word (moduw)

Signed-off-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 00:05:19 +11:00
Sandipan Das 3e751acba2 powerpc sstep: Add support for extswsli instruction
This adds emulation support for the following integer instructions:
  * Extend-Sign Word and Shift Left Immediate (extswsli[.])

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 00:05:18 +11:00
Sandipan Das 32628b5cf3 powerpc sstep: Add support for cnttzw, cnttzd instructions
This adds emulation support for the following integer instructions:
  * Count Trailing Zeros Word (cnttzw[.])
  * Count Trailing Zeros Doubleword (cnttzd[.])

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 00:05:17 +11:00
Sandipan Das a23987ef26 powerpc: sstep: Add support for darn instruction
This adds emulation support for the following integer instructions:
  * Deliver A Random Number (darn)

As suggested by Michael, this uses a raw .long for specifying the
instruction word when using inline assembly to retain compatibility
with older binutils.

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 00:05:17 +11:00
Sandipan Das 930d6288a2 powerpc: sstep: Add support for maddhd, maddhdu, maddld instructions
This adds emulation support for the following integer instructions:
  * Multiply-Add High Doubleword (maddhd)
  * Multiply-Add High Doubleword Unsigned (maddhdu)
  * Multiply-Add Low Doubleword (maddld)

As suggested by Michael, this uses a raw .long for specifying the
instruction word when using inline assembly to retain compatibility
with older binutils.

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-26 00:05:16 +11:00
Linus Walleij 014e90ca44 ARM: dts: gemini: Re-enable display controller
commit 137cd7100e
"ARM: dts: Enable Gemini flash access" contained a bug
by disabling the display controller, while the whole
idea with the patch was to enable flash access AND
the display controller, simultaneously. Fix it up.

Fixes: 137cd7100e ("ARM: dts: Enable Gemini flash access")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-25 11:16:30 +01:00
Angelo Dureghello d7e9d01ac2 m68k: add ColdFire mcf5441x eDMA platform support
This patch adds support for ColdFire eDMA platform driver.

Signed-off-by: Angelo Dureghello <angelo@sysam.it>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-02-25 11:04:05 +10:00
Rafael J. Wysocki 17162a117c Merge back earlier cpufreq material for v5.1. 2019-02-24 21:18:05 +01:00
David S. Miller 70f3522614 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Three conflicts, one of which, for marvell10g.c is non-trivial and
requires some follow-up from Heiner or someone else.

The issue is that Heiner converted the marvell10g driver over to
use the generic c45 code as much as possible.

However, in 'net' a bug fix appeared which makes sure that a new
local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0
is cleared.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:06:19 -08:00
Linus Torvalds c3619a482e Bug fixes.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJcclOnAAoJEL/70l94x66DAjIH/28XLAkaAtJIsm4nTy3sb6nC
 UC7suhEEst4zyRCzUlMdeaMkuWJitx5Bun0x9k5uYvMXqmndXcGq3wLmrRhOY2u2
 iN1myLESOn0+lubVcK/+ht2rat2AO9XqpOKPojBRs/c6MW1UAIJIPKly/ls1++Ee
 TmdIKrgqGgE5Ywx4ObvXBDOeWSKUwxqqNi7FUkWpACJckvmQoKJX0Hre5ICW6lom
 +yUBC1rR9apMLTe2QIW2kBZ9JTKCX0aErdmnLXlyZbFOOk0udzNaDy+kXSs7w9Bk
 tu8biO7xBNJrcR5e+fEipVFIdN9au0aM5pJWq9s4yeqSnDKB1FRekoJj5yvTqus=
 =1GQR
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Bug fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: MMU: record maximum physical address width in kvm_mmu_extended_role
  kvm: x86: Return LA57 feature based on hardware capability
  x86/kvm/mmu: fix switch between root and guest MMUs
  s390: vsie: Use effective CRYCBD.31 to check CRYCBD validity
2019-02-24 09:47:07 -08:00
Linus Torvalds e60b5f79bd powerpc fixes for 5.0 #6
One fix for an oops when using SRIOV, introduced by the recent changes to
 support compound IOMMU groups.
 
 Thanks to:
   Alexey Kardashevskiy.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJccT61AAoJEFHr6jzI4aWAAOkQAIVB3i5EDXhiIVUam/YsqlUk
 glEh6a3zmgt8p+zBXlpGW5UULuHC0sx7T1LDGMye+AZ9sXpkK2DzwkwJdNjBMQ8v
 xhH4e4znAhncgRZO92JkrG9Ag4VQuQVLMelhuUcLxF5ybH1+C3ZxSHrMPI7kdiG4
 8un4Og26ixDPcgylLg6tbCeeCr/IjoqZBhyKvwEUjQIY2jM/J/E7zzBEfSRtPlGW
 5jLgfJykEDp9Ta+E4+6+/UtuvbKUOX+xG3j7v7/RBMP0hu7L/naYT3nhoy25Hili
 BXfsNJrLTiQXOCfJZExvqq494Vb4dMwlF4J+45gsBBFUplmZ70g9kSmUKhLtKAtr
 /bfXRKYK9rRigyLHgTRmTbvbX4CkY6C6IgKJem68tWop6QRMazbc8Ea25eqjMESc
 neP7kpZABXJzwLDxP9TS2LjXEcVneR7eIhj7WDY3rrDL/+6YGhVfFKAE+P/Z0THO
 ahPO+EAKQirX127TJZXiL8nkJkU+R4/oKjkF6AsLi2xsLb83cEodABLUpH2xqJCn
 f8JA2gsIjZq3FE+foNpH4i+HVwV3PFFDhNBauZFXtj9hVHt4cuTk1SaIvQohfDCj
 RChHh90MT+u+q1cffeLX/WbjjuJbcxHqF1K1O4SZNN8IfIBVaAXXerbba1KOoIWB
 CG6BfAYQiJ6CBu8QhKYo
 =NBtz
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
 "One fix for an oops when using SRIOV, introduced by the recent changes
  to support compound IOMMU groups.

  Thanks to Alexey Kardashevskiy"

* tag 'powerpc-5.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powernv/sriov: Register IOMMU groups for VFs
2019-02-23 11:13:50 -08:00
Christophe Leroy d608898abc powerpc: clean stack pointers naming
Some stack pointers used to also be thread_info pointers
and were called tp. Now that they are only stack pointers,
rename them sp.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy c911d2e128 powerpc/64: Replace CURRENT_THREAD_INFO with PACA_THREAD_INFO
Now that current_thread_info is located at the beginning of 'current'
task struct, CURRENT_THREAD_INFO macro is not really needed any more.

This patch replaces it by loads of the value at PACA_THREAD_INFO(r13).

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Add PACA_THREAD_INFO rather than using PACACURRENT]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy f7354ccac8 powerpc/32: Remove CURRENT_THREAD_INFO and rename TI_CPU
Now that thread_info is similar to task_struct, its address is in r2
so CURRENT_THREAD_INFO() macro is useless. This patch removes it.

This patch also moves the 'tovirt(r2, r2)' down just before the
reactivation of MMU translation, so that we keep the physical address
of 'current' in r2 until then. It avoids a few calls to tophys().

At the same time, as the 'cpu' field is not anymore in thread_info,
TI_CPU is renamed TASK_CPU by this patch.

It also allows to get rid of a couple of
'#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE' as ACCOUNT_CPU_USER_ENTRY()
and ACCOUNT_CPU_USER_EXIT() are empty when
CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not defined.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Fix a missed conversion of TI_CPU idle_6xx.S]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 7c19c2e5f9 powerpc: 'current_set' is now a table of task_struct pointers
The table of pointers 'current_set' has been used for retrieving
the stack and current. They used to be thread_info pointers as
they were pointing to the stack and current was taken from the
'task' field of the thread_info.

Now, the pointers of 'current_set' table are now both pointers
to task_struct and pointers to thread_info.

As they are used to get current, and the stack pointer is
retrieved from current's stack field, this patch changes
their type to task_struct, and renames secondary_ti to
secondary_current.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy a7916a1de5 powerpc: regain entire stack space
thread_info is not anymore in the stack, so the entire stack
can now be used.

There is also no risk anymore of corrupting task_cpu(p) with a
stack overflow so the patch removes the test.

When doing this, an explicit test for NULL stack pointer is
needed in validate_sp() as it is not anymore implicitely covered
by the sizeof(thread_info) gap.

In the meantime, with the previous patch all pointers to the stacks
are not anymore pointers to thread_info so this patch changes them
to void*

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy ed1cd6deb0 powerpc: Activate CONFIG_THREAD_INFO_IN_TASK
This patch activates CONFIG_THREAD_INFO_IN_TASK which
moves the thread_info into task_struct.

Moving thread_info into task_struct has the following advantages:
  - It protects thread_info from corruption in the case of stack
    overflows.
  - Its address is harder to determine if stack addresses are leaked,
    making a number of attacks more difficult.

This has the following consequences:
  - thread_info is now located at the beginning of task_struct.
  - The 'cpu' field is now in task_struct, and only exists when
    CONFIG_SMP is active.
  - thread_info doesn't have anymore the 'task' field.

This patch:
  - Removes all recopy of thread_info struct when the stack changes.
  - Changes the CURRENT_THREAD_INFO() macro to point to current.
  - Selects CONFIG_THREAD_INFO_IN_TASK.
  - Modifies raw_smp_processor_id() to get ->cpu from current without
    including linux/sched.h to avoid circular inclusion and without
    including asm/asm-offsets.h to avoid symbol names duplication
    between ASM constants and C constants.
  - Modifies klp_init_thread_info() to take a task_struct pointer
    argument.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add task_stack.h to livepatch.h to fix build fails]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 7aef376679 powerpc/idle/6xx: Use r1 with CURRENT_THREAD_INFO()
Make sure CURRENT_THREAD_INFO() is used with r1 which is the virtual
address of the stack, in order to ease the switch to r2 when we enable
THREAD_INFO_IN_TASK, as we have no register having the phys address of
current.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy b72cc2e7ae powerpc: Use task_stack_page() in current_pt_regs()
Change current_pt_regs() to use task_stack_page() rather than
current_thread_info() so that it keeps working once we enable
THREAD_INFO_IN_TASK.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Split out of large patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 3733304048 powerpc: Use linux/thread_info.h in processor.h
When we enable THREAD_INFO_IN_TASK we will remove our definition of
current_thread_info(). Instead it will come from linux/thread_info.h

So switch processor.h to include the latter, so that it can continue
to find current_thread_info().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 5497c2536f powerpc: Use sizeof(struct thread_info) in INIT_SP_LIMIT
Currently INIT_SP_LIMIT uses sizeof(init_thread_info), but that symbol
won't exist when we enable THREAD_INFO_IN_TASK. So just use the sizeof
the type which is the same value but will continue to work.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 678c668a77 powerpc/64: Use task_stack_page() to initialise paca->kstack
Rather than using the thread info use task_stack_page() to initialise
paca->kstack, that way it will work with THREAD_INFO_IN_TASK.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 4e67bfd7aa powerpc: Update comments in preparation for THREAD_INFO_IN_TASK
Update a few comments that talk about current_thread_info() in
preparation for THREAD_INFO_IN_TASK.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 05b98791ec powerpc: Replace current_thread_info()->task with current
We have a few places that use current_thread_info()->task to access
current. This won't work with THREAD_INFO_IN_TASK so fix them now.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 7306e83ccf powerpc: Don't use CURRENT_THREAD_INFO to find the stack
A few places use CURRENT_THREAD_INFO, or the C version, to find the
stack. This will no longer work with THREAD_INFO_IN_TASK so change
them to find the stack in other ways.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 1e35f29c6b powerpc: call_do_[soft]irq() takes a pointer to the stack
The purpose of the pointer given to call_do_softirq() and
call_do_irq() is to point the new stack. Currently that's the same
thing as the thread_info, but won't be with THREAD_INFO_IN_TASK.

So change the parameter to void* and rename it 'sp'.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 8c1fc5abdc powerpc: Rename THREAD_INFO to TASK_STACK
This patch renames THREAD_INFO to TASK_STACK, because it is in fact
the offset of the pointer to the stack in task_struct so this pointer
will not be impacted by the move of THREAD_INFO.

Also make it available on 64-bit, as we'll need it there when we
activate THREAD_INFO_IN_TASK.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Make available on 64-bit]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 018cce33c5 powerpc: prep stack walkers for THREAD_INFO_IN_TASK
[text copied from commit 9bbd4c56b0
("arm64: prep stack walkers for THREAD_INFO_IN_TASK")]

When CONFIG_THREAD_INFO_IN_TASK is selected, task stacks may be freed
before a task is destroyed. To account for this, the stacks are
refcounted, and when manipulating the stack of another task, it is
necessary to get/put the stack to ensure it isn't freed and/or re-used
while we do so.

This patch reworks the powerpc stack walking code to account for this.
When CONFIG_THREAD_INFO_IN_TASK is not selected these perform no
refcounting, and this should only be a structural change that does not
affect behaviour.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Move try_get_task_stack() below tsk == NULL check in show_stack()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:40 +11:00
Christophe Leroy 054860897c powerpc: Only use task_struct 'cpu' field on SMP
When moving to CONFIG_THREAD_INFO_IN_TASK, the thread_info 'cpu' field
gets moved into task_struct and only defined when CONFIG_SMP is set.

This patch ensures that TI_CPU is only used when CONFIG_SMP is set and
that task_struct 'cpu' field is not used directly out of SMP code.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:39 +11:00
Christophe Leroy 92ab45c5f2 powerpc: Avoid circular header inclusion in mmu-hash.h
When activating CONFIG_THREAD_INFO_IN_TASK, linux/sched.h includes
asm/current.h. This generates a circular dependency. To avoid that,
asm/processor.h shall not be included in mmu-hash.h.

In order to do that, this patch moves into a new header called
asm/task_size_64/32.h all the TASK_SIZE related constants, which can
then be included in mmu-hash.h directly.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out all the TASK_SIZE constants not just 64-bit ones]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:39 +11:00
Christophe Leroy c8e409a33c powerpc/irq: use memblock functions returning virtual address
Since only the virtual address of allocated blocks is used,
lets use functions returning directly virtual address.

Those functions have the advantage of also zeroing the block.

Suggested-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:39 +11:00
Michael Ellerman eafd825ed7 powerpc/64: Simplify __secondary_start paca->kstack handling
In __secondary_start() we load the thread_info of the idle task of the
secondary CPU from current_set[cpu], and then convert it into a stack
pointer before storing that back to paca->kstack.

As pointed out in commit f761622e59 ("powerpc: Initialise
paca->kstack before early_setup_secondary") it's important that we
initialise paca->kstack before calling the MMU setup code, in
particular slb_initialize(), because it will bolt the SLB entry for
the kstack into the SLB.

However we have already setup paca->kstack in cpu_idle_thread_init(),
since commit 3b5750644b ("[POWERPC] Bolt in SLB entry for kernel
stack on secondary cpus") (May 2008).

It's also in cpu_idle_thread_init() that we initialise current_set[cpu]
with the thread_info pointer, so there is no issue of the timing being
different between the two.

Therefore the initialisation of paca->kstack in __setup_secondary() is
completely redundant, so remove it.

This has the added benefit of removing code that runs in real mode,
and is therefore restricted by the RMO, and so opens the way for us to
enable THREAD_INFO_IN_TASK.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:39 +11:00
Michael Ellerman e7fda7e569 powerpc/64s: Remove MSR_RI optimisation in system_call_exit()
Currently in system_call_exit() we have an optimisation where we
disable MSR_RI (recoverable interrupt) and MSR_EE (external interrupt
enable) in a single mtmsrd instruction.

Unfortunately this will no longer work with THREAD_INFO_IN_TASK,
because then the load of TI_FLAGS might fault and faulting with MSR_RI
clear is treated as an unrecoverable exception which leads to a
panic().

So change the code to only clear MSR_EE prior to loading TI_FLAGS,
leaving the clear of MSR_RI until later. We have some latitude in
where do the clear of MSR_RI. A bit of experimentation has shown that
this location gives the least slow down.

This still causes a noticeable slow down in our null_syscall
performance. On a Power9 DD2.2:

  Before        After         Delta     Delta %
  955 cycles    999 cycles    -44	-4.6%

On the plus side this does simplify the code somewhat, because we
don't have to reenable MSR_RI on the restore_math() or
syscall_exit_work() paths which was necessitated previously by the
optimisation.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 22:31:39 +11:00
Andrew Donnellan fb0b0a73b2 powerpc: Enable kcov
kcov provides kernel coverage data that's useful for fuzzing tools like
syzkaller.

Wire up kcov support on powerpc. Disable kcov instrumentation on the same
files where we currently disable gcov and UBSan instrumentation, plus some
additional exclusions which appear necessary to boot on book3e machines.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Daniel Axtens <dja@axtens.net> # e6500
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy 8f54a6f740 powerpc/kconfig: make _etext and data areas alignment configurable on 8xx
On 8xx, large pages (512kb or 8M) are used to map kernel linear
memory. Aligning to 8M reduces TLB misses as only 8M pages are used
in that case. We make 8M the default for data.

This patchs allows the user to do it via Kconfig.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy d5f17ee964 powerpc/8xx: don't disable large TLBs with CONFIG_STRICT_KERNEL_RWX
This patch implements handling of STRICT_KERNEL_RWX with
large TLBs directly in the TLB miss handlers.

To do so, etext and sinittext are aligned on 512kB boundaries
and the miss handlers use 512kB pages instead of 8Mb pages for
addresses close to the boundaries.

It sets RO PP flags for addresses under sinittext.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy 0f4a9041c7 powerpc/kconfig: make _etext and data areas alignment configurable on Book3s 32
Depending on the number of available BATs for mapping the different
kernel areas, it might be needed to increase the alignment of _etext
and/or of data areas.

This patchs allows the user to do it via Kconfig.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy 63b2bc6195 powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX
Today, STRICT_KERNEL_RWX is based on the use of regular pages
to map kernel pages.

On Book3s 32, it has three consequences:
- Using pages instead of BAT for mapping kernel linear memory severely
impacts performance.
- Exec protection is not effective because no-execute cannot be set at
page level (except on 603 which doesn't have hash tables)
- Write protection is not effective because PP bits do not provide RO
mode for kernel-only pages (except on 603 which handles it in software
via PAGE_DIRTY)

On the 603+, we have:
- Independent IBAT and DBAT allowing limitation of exec parts.
- NX bit can be set in segment registers to forbit execution on memory
mapped by pages.
- RO mode on DBATs even for kernel-only blocks.

On the 601, there is nothing much we can do other than warn the user
about it, because:
- BATs are common to instructions and data.
- BAT do not provide RO mode for kernel-only blocks.
- segment registers don't have the NX bit.

In order to use IBAT for exec protection, this patch:
- Aligns _etext to BAT block sizes (128kb)
- Set NX bit in kernel segment register (Except on vmalloc area when
CONFIG_MODULES is selected)
- Maps kernel text with IBATs.

In order to use DBAT for exec protection, this patch:
- Aligns RW DATA to BAT block sizes (4M)
- Maps kernel RO area with write prohibited DBATs
- Maps remaining memory with remaining DBATs

Here is what we get with this patch on a 832x when activating
STRICT_KERNEL_RWX:

Symbols:
c0000000 T _stext
c0680000 R __start_rodata
c0680000 R _etext
c0800000 T __init_begin
c0800000 T _sinittext

~# cat /sys/kernel/debug/block_address_translation
---[ Instruction Block Address Translation ]---
0: 0xc0000000-0xc03fffff 0x00000000 Kernel EXEC coherent
1: 0xc0400000-0xc05fffff 0x00400000 Kernel EXEC coherent
2: 0xc0600000-0xc067ffff 0x00600000 Kernel EXEC coherent
3:         -
4:         -
5:         -
6:         -
7:         -

---[ Data Block Address Translation ]---
0: 0xc0000000-0xc07fffff 0x00000000 Kernel RO coherent
1: 0xc0800000-0xc0ffffff 0x00800000 Kernel RW coherent
2: 0xc1000000-0xc1ffffff 0x01000000 Kernel RW coherent
3: 0xc2000000-0xc3ffffff 0x02000000 Kernel RW coherent
4: 0xc4000000-0xc7ffffff 0x04000000 Kernel RW coherent
5: 0xc8000000-0xcfffffff 0x08000000 Kernel RW coherent
6: 0xd0000000-0xdfffffff 0x10000000 Kernel RW coherent
7:         -

~# cat /sys/kernel/debug/segment_registers
---[ User Segments ]---
0x00000000-0x0fffffff Kern key 1 User key 1 VSID 0xa085d0
0x10000000-0x1fffffff Kern key 1 User key 1 VSID 0xa086e1
0x20000000-0x2fffffff Kern key 1 User key 1 VSID 0xa087f2
0x30000000-0x3fffffff Kern key 1 User key 1 VSID 0xa08903
0x40000000-0x4fffffff Kern key 1 User key 1 VSID 0xa08a14
0x50000000-0x5fffffff Kern key 1 User key 1 VSID 0xa08b25
0x60000000-0x6fffffff Kern key 1 User key 1 VSID 0xa08c36
0x70000000-0x7fffffff Kern key 1 User key 1 VSID 0xa08d47
0x80000000-0x8fffffff Kern key 1 User key 1 VSID 0xa08e58
0x90000000-0x9fffffff Kern key 1 User key 1 VSID 0xa08f69
0xa0000000-0xafffffff Kern key 1 User key 1 VSID 0xa0907a
0xb0000000-0xbfffffff Kern key 1 User key 1 VSID 0xa0918b

---[ Kernel Segments ]---
0xc0000000-0xcfffffff Kern key 0 User key 1 No Exec VSID 0x000ccc
0xd0000000-0xdfffffff Kern key 0 User key 1 No Exec VSID 0x000ddd
0xe0000000-0xefffffff Kern key 0 User key 1 No Exec VSID 0x000eee
0xf0000000-0xffffffff Kern key 0 User key 1 No Exec VSID 0x000fff

Aligning _etext to 128kb allows to map up to 32Mb text with 8 IBATs:
16Mb + 8Mb + 4Mb + 2Mb + 1Mb + 512kb + 256kb + 128kb (+ 128kb) = 32Mb
(A 9th IBAT is unneeded as 32Mb would need only a single 32Mb block)

Aligning data to 4M allows to map up to 512Mb data with 8 DBATs:
16Mb + 8Mb + 4Mb + 4Mb + 32Mb + 64Mb + 128Mb + 256Mb = 512Mb

Because some processors only have 4 BATs and because some targets need
DBATs for mapping other areas, the following patch will allow to
modify _etext and data alignment.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy 5e04ae85fb powerpc/mm/32s: add setibat() clearibat() and update_bats()
setibat() and clearibat() allows to manipulate IBATs independently
of DBATs.

update_bats() allows to update bats after init. This is done
with MMU off.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy 166d97d961 powerpc/kconfig: define CONFIG_DATA_SHIFT and CONFIG_ETEXT_SHIFT
CONFIG_STRICT_KERNEL_RWX requires a special alignment
for DATA for some subarches. Today it is just defined
as an #ifdef in vmlinux.lds.S

In order to get more flexibility, this patch moves the
definition of this alignment in Kconfig

On some subarches, CONFIG_STRICT_KERNEL_RWX will
require a special alignment of _etext.

This patch also adds a configuration item for it in Kconfig

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy 555f4fdb93 powerpc/kconfig: define PAGE_SHIFT inside Kconfig
This patch defined CONFIG_PPC_PAGE_SHIFT in order
to be able to use PAGE_SHIFT value inside Kconfig.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy 28ea38b9cb powerpc/mmu: add is_strict_kernel_rwx() helper
Add a helper to know whether STRICT_KERNEL_RWX is enabled.

This is based on rodata_enabled flag which is defined only
when CONFIG_STRICT_KERNEL_RWX is selected.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy 02d5d13b45 powerpc/32: add helper to write into segment registers
This patch add an helper which wraps 'mtsrin' instruction
to write into segment registers.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy df25f86390 powerpc/mm/32s: use _PAGE_EXEC in setbat()
Do not set IBAT when setbat() is called without _PAGE_EXEC

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy 160985f302 powerpc/wii: remove wii_mmu_mapin_mem2()
wii_mmu_mapin_mem2() is not used anymore, remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy d2f15e0979 powerpc/32: always populate page tables for Abatron BDI.
When CONFIG_BDI_SWITCH is set, the page tables have to be populated
allthough large TLBs are used, because the BDI switch knows nothing
about those large TLBs which are handled directly in TLB miss logic.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy 9e849f231c powerpc/mm/32s: use generic mmu_mapin_ram() for all blocks.
Now that mmu_mapin_ram() is able to handle other blocks
than the one starting at 0, the WII can use it for all
its blocks.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:32 +11:00
Christophe Leroy e4d6654ebe powerpc/mm/32s: rework mmu_mapin_ram()
This patch reworks mmu_mapin_ram() to be more generic and map as much
blocks as possible. It now supports blocks not starting at address 0.

It scans DBATs array to find free ones instead of forcing the use of
BAT2 and BAT3.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:31 +11:00
Christophe Leroy 14e609d693 powerpc/mm/32: add base address to mmu_mapin_ram()
At the time being, mmu_mapin_ram() always maps RAM from the beginning.
But some platforms like the WII have to map a second block of RAM.

This patch adds to mmu_mapin_ram() the base address of the block.
At the moment, only base address 0 is supported.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:31 +11:00
Christophe Leroy 6d183ca8ba powerpc/wii: properly disable use of BATs when requested.
'nobats' kernel parameter or some options like CONFIG_DEBUG_PAGEALLOC
deny the use of BATS for mapping memory.

This patch makes sure that the specific wii RAM mapping function
takes it into account as well.

Fixes: de32400dd2 ("wii: use both mem1 and mem2 as ram")
Cc: stable@vger.kernel.org
Reviewed-by: Jonathan Neuschafer <j.neuschaefer@gmx.net>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:31 +11:00
Christophe Leroy e4470bd6a4 powerpc/8xx: Map 32Mb of RAM at init.
At the time being, initial MMU setup allows 24 Mbytes
of DATA and 8 Mbytes of code.

Some debug setup like CONFIG_KASAN generate huge
kernels with text size over the 8M limit and data over the
24 Mbytes limit.

Here is an 8xx kernel compiled with CONFIG_KASAN_INLINE for
one of my boards:

[root@po16846vm linux-powerpc]# size -x vmlinux
   text	   data	    bss	    dec	    hex	filename
0x111019c	0x41b0d4	0x490de0	26984528	19bc050	vmlinux

This patch maps up to 32 Mbytes code based on _einittext symbol
and allows 32 Mbytes of memory instead of 24.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:31 +11:00
Christophe Leroy 665bed2386 powerpc/8xx: replace most #ifdef by IS_ENABLED() in 8xx_mmu.c
This patch replaces most #ifdef mess by IS_ENABLED() in 8xx_mmu.c
This has the advantage of allowing syntax verification at compile
time regardless of selected options.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:31 +11:00
Sandipan Das 78a8da0600 powerpc: sstep: Add tests for addc[.] instruction
This adds test cases for the addc[.] instruction.

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:31 +11:00
Sandipan Das 44dea1784b powerpc: sstep: Add tests for add[.] instruction
This adds test cases for the add[.] instruction.

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:31 +11:00
Sandipan Das 84022ac173 powerpc: sstep: Add tests for compute type instructions
This enhances the current selftest framework for validating
the in-kernel instruction emulation infrastructure by adding
support for compute type instructions i.e. integer ALU-based
instructions. Originally, this framework was limited to only
testing load and store instructions.

While most of the GPRs can be validated, support for SPRs is
limited to LR, CR and XER for now.

When writing the test cases, one must ensure that the Stack
Pointer (GPR1) or the Thread Pointer (GPR13) are not touched
by any means as these are vital non-volatile registers.

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
[mpe: Use patch_site for the code patching]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23 21:04:31 +11:00
Michael Ellerman f68e792721 Revert "powerpc/book3s32: Reorder _PAGE_XXX flags to simplify TLB handling"
This reverts commit 78ca1108b1.

It is causing boot failures with qemu mac99 in at least some
configurations.
2019-02-23 20:30:50 +11:00
Linus Torvalds 9053d2db8b ARM: SoC fixes for 5.0
Only a handful of device tree fixes, all simple enough:
 
 NVIDIA Tegra:
  - Fix a regression for booting on chromebooks
 
 TI OMAP:
  - Two fixes PHY mode on am335x reference boards
 
 Marvell mvebu:
  - A regression fix for Armada XP NAND flash controllers
  - An incorrect reset signal on the clearfog board
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJccHWeAAoJEGCrR//JCVInLQAQAIKQEEQXxPn1fnXH5oH7n/Yn
 aALv0zoRKx5ybgQvp15hkdD/H7AUt01ADopCvgOxh0EkheIOEZdtb/FCaRPzaYhC
 TrrZhqI6+w0rcMwUzpaDU+90/rxkh6oeIs95lfTkHaZV0ZdbWqwwvX/JuQcPJOh2
 tQXwfNMv4WZetvAFrJHr9L+7/CebgQaOe/Me78wq/bjKEROShaF6j6lQMTeZwX6C
 Jp75cI5gCktqg4ZDQ2NEE8O9Tng4uzIpoVlCptFc38XGKnRZMexZZlZWpMwTeQg1
 QmNmyTal6gbY5tDs3AGg3diSPFQ1nwUiMk2pWvGkkRo5hkNP80lNCouu8F99r/Ub
 QoPRcKzGyBwLj0MwpHBoO5gI1X4mgfZDpL71SdS81p8q3rpnt/W2HU+CTAgnEpAi
 aILCMmRzes3jpNHREUQc5X3dTwfHW8MBW/Bia6XnidqmUw5GoRq+98rCOKFpg2HQ
 m68yDZlOq2odATmaa4xASVfTwccm8jIyQwZVWFPLyekZ9kzMcdYmAI7lGGofJAbg
 SkRXDHsAQ21pDoUZY3C0lE49kPKoTNrt4bsihG0I3hM09moF6ryqmymgAy82mzsD
 e/ZVG2w1E0CS4vKjKq7BEof3PZ6wSdfaYOFzS7v9bXAeBSdRVE/qjzi/aBgAKD8J
 KpDRnh2huOMvPTO5bDKC
 =nXrB
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Only a handful of device tree fixes, all simple enough:

  NVIDIA Tegra:
   - Fix a regression for booting on chromebooks

  TI OMAP:
   - Two fixes PHY mode on am335x reference boards

  Marvell mvebu:
   - A regression fix for Armada XP NAND flash controllers
   - An incorrect reset signal on the clearfog board"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: tegra: Restore DT ABI on Tegra124 Chromebooks
  ARM: dts: am335x-evm: Fix PHY mode for ethernet
  ARM: dts: am335x-evmsk: Fix PHY mode for ethernet
  arm64: dts: clearfog-gt-8k: fix SGMII PHY reset signal
  ARM: dts: armada-xp: fix Armada XP boards NAND description
2019-02-22 16:48:37 -08:00
Linus Torvalds 2cc63b3900 ARC fixes for 5.0 final
- Fix memcpy to prevent prefetchw beyond end of buffer [Eugeniy]
 
  - Enable unaligned access early to prevent exceptions given newer gcc
    code gen [Eugeniy]
 
  - Tighten up uboot arg checking to prevent false negatives and also
    allow both jtag and bootloading to coexist w/o config option as
    needed by kernelCi folks [Eugeniy]
 
  - Set slab alignment to 8 for ARC to avoid the atomic64_t unalign [Alexey]
 
  - Disable regfile auto save on interrupts on HSDK platform due to a
    silicon issue  [Vineet]
 
  - Avoid HS38x boot printing crash by not reading HS48x only reg [Vineet]
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJccDKEAAoJEGnX8d3iisJeqAoQAIM753GmJMXJeeaDm8wxUkvF
 1NThcBekh2IrEEesCD8HBaCuegTXGJ8eNCkGBtgxUBisQvixRDCge1r18SXdVWRR
 lz3+VoRbiqe4vNZfXJJZQj09/gOIjL7sZQX7NIAk/YDJ4mdhID0yEULE0cKxPkp3
 w3AsCi6x7Umt9nbH06mPV8b71mT77MaNGpTYmx7cvc8FX/rXfh7C7QUgBDeU2201
 3F3tHiJqR+gBu/kwEVTOuG+wJ3sUy8Yi/Qungv6Lkk3rm4bcimBqB8MaJAqB8fPV
 H3rGTgz9eH6p7SERqdSPvO92x5vw/eh9reg0/K3gmHOI0i3gaiUNhxcZwhu2rqZC
 45JkfrRPbLj11uaUTB07BqYck/5SaHugyu6tCtA+khkCigND8RWwJRBAc25VCsJ1
 9ywIc/6eGbfSyOT1Elit6tf1/SpKap63VoXtNmfdEWvCoW4tAVvR6uhi8DcnSlJJ
 5vqYRZUom5IQ7YrAaXQ7VqAq61H7ZA6XSklQs+0w2pqL0YND9W1ryETIw3lraOCh
 3O2V7nETXjTvnEkxovbQ5C2GwIvURN4RtckdgiXCS3MG3OsGMEWDZFdr0kGffZAO
 SAXn8poO522cglIR8o4GwyE1EATQbQ3zuavDq5zuB//VNHdcgODN18zkqel2A5Wr
 AY85YAxbPx05PMRnAwRx
 =cNCN
 -----END PGP SIGNATURE-----

Merge tag 'arc-5.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
 "Fixes for ARC for 5.0, bunch of those are stable fodder anyways so
  sooner the better.

   - Fix memcpy to prevent prefetchw beyond end of buffer [Eugeniy]

   - Enable unaligned access early to prevent exceptions given newer gcc
     code gen [Eugeniy]

   - Tighten up uboot arg checking to prevent false negatives and also
     allow both jtag and bootloading to coexist w/o config option as
     needed by kernelCi folks [Eugeniy]

   - Set slab alignment to 8 for ARC to avoid the atomic64_t unalign
     [Alexey]

   - Disable regfile auto save on interrupts on HSDK platform due to a
     silicon issue [Vineet]

   - Avoid HS38x boot printing crash by not reading HS48x only reg
     [Vineet]"

* tag 'arc-5.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARCv2: don't assume core 0x54 has dual issue
  ARC: define ARCH_SLAB_MINALIGN = 8
  ARC: enable uboot support unconditionally
  ARC: U-boot: check arguments paranoidly
  ARCv2: support manual regfile save on interrupts
  ARC: uacces: remove lp_start, lp_end from clobber list
  ARC: fix actionpoints configuration detection
  ARCv2: lib: memcpy: fix doing prefetchw outside of buffer
  ARCv2: Enable unaligned access in early ASM code
2019-02-22 16:31:26 -08:00
Linus Torvalds 8456e98e18 Merge branch 'parisc-5.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
 "Fix ptrace syscall number modification which has been broken since
  kernel v4.5 and provide alternative email addresses for the remaining
  users of the retired parisc-linux.org email domain"

* 'parisc-5.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  CREDITS/MAINTAINERS: Retire parisc-linux.org email domain
  parisc: Fix ptrace syscall number modification
2019-02-22 16:12:01 -08:00