Commit Graph

616764 Commits

Author SHA1 Message Date
Linus Torvalds ac78bc714b Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Two cputime fixes - hopefully the last ones"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/cputime: Resync steal time when guest & host lose sync
  sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
2016-08-18 15:07:21 -07:00
Linus Torvalds 0dcb7b6f8f Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also start/stop filter related fixes, a perf
  event read() fix, a fix uncovered by fuzzing, and an uprobes leak fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Check return value of the perf_event_read() IPI
  perf/core: Enable mapping of the stop filters
  perf/core: Update filters only on executable mmap
  perf/core: Fix file name handling for start/stop filters
  perf/core: Fix event_function_local()
  uprobes: Fix the memcg accounting
  perf intel-pt: Fix occasional decoding errors when tracing system-wide
  tools: Sync kvm related header files for arm64 and s390
  perf probe: Release resources on error when handling exit paths
  perf probe: Check for dup and fdopen failures
  perf symbols: Fix annotation of objects with debuginfo files
  perf script: Don't disable use_callchain if input is pipe
  perf script: Show proper message when failed list scripts
  perf jitdump: Add the right header to get the major()/minor() definitions
  perf ppc64le: Fix build failure when libelf is not present
  perf tools mem: Fix -t store option for record command
  perf intel-pt: Fix ip compression
2016-08-18 15:04:53 -07:00
Linus Torvalds bd3fd451ff Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "Two lockless_dereference() related fixes"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/barriers: Suppress sparse warnings in lockless_dereference()
  Revert "drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference"
2016-08-18 13:45:48 -07:00
Linus Torvalds f28535c100 arm64 fixes:
- Avoid a literal load with the MMU off on the CPU resume path
   (potential inconsistency between cache and RAM)
 
 - Build error with CONFIG_ACPI=n fixed
 
 - Compiler warning in the arch/arm64/mm/dump.c code fixed
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXtd/pAAoJEGvWsS0AyF7xQyEQAJ3oQLnB+WZs1pIGlx4Q1VvY
 tIhuT7jF6G8gLv0h2yMn/8V5rbKYXmgvKscuzcajOnJEDb0W0zV2/nYot/vqb2Gn
 1gAaal1WlU6i1yioKNCm4MeOi0qIL4BPksf4cvGn79XrW0thvq8V7sJuScqa4fWc
 eAWW2XOKNAi+WZE/+rryLTGrY7WrrnuF38I+rodBd/hfr5drfZULgORpMdQ8AJjx
 nsjdAisx6CDPxmEQMfftVjKMlmosPzbvUORMjvhauTBW+9QjMZY/NRIbotl2mAr5
 xEYN2r+eU2sS24DYdNl8EIL96lYK6m6dHxJw1NJA8RvcD9O9cLJ8IX0Kl10ghSlP
 Ozsef8kFIVL1YKaxmd3b05w2OdM0V9sWgscE+dC/gVu3ge+CVxoMxgbXlwDd3+j+
 nFB3kzTHIYCwdEiNdEbXGwl5LPrkebASfG95P4lZCfT82EkYKXJ6zUNDg/rHJcDb
 L6t0XqFvaCcBsn/x5QwGyfDVFMPHWoXkG2eIgehrgHpIarxFKyX9FHFzJ3HWZ9d6
 YwQC7sqra4g3J1GXow5nF9pz72CgH8U4Xe4zI77xkk+fwBOjy6tINJsGS3gQrYjB
 /ljzb1jPh4W59du3X9kA+71VDw08IlEIOTHHW7r1+zl7LRmQDUIBwtfvodJS50T9
 HuTckxzIbl9ErKGzkHtv
 =Z8zz
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Avoid a literal load with the MMU off on the CPU resume path
   (potential inconsistency between cache and RAM)

 - Build error with CONFIG_ACPI=n fixed

 - Compiler warning in the arch/arm64/mm/dump.c code fixed

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Fix shift warning in arch/arm64/mm/dump.c
  arm64: kernel: avoid literal load of virtual address with MMU off
  arm64: Fix NUMA build error when !CONFIG_ACPI
2016-08-18 11:17:13 -07:00
Linus Torvalds 114e3bae37 Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Only three fixes this time:

   - Emil found an overflow problem with the memory layout sanity check.

   - Ard Biesheuvel noticed that late-allocated page tables (for EFI)
     weren't being properly constructed.

   - Guenter Roeck reported a problem found on qemu caused by the recent
     addr_limit changes"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: fix address limit restoration for undefined instructions
  ARM: 8591/1: mm: use fully constructed struct pages for EFI pgd allocations
  ARM: 8590/1: sanity_check_meminfo(): avoid overflow on vmalloc_limit
2016-08-18 11:13:20 -07:00
Linus Torvalds 395c434292 Power management updates for v4.8-rc3
- Fix a hibernate core regression resulting from uncovering a
    latent bug in its implementation of memory bitmaps by a recent
    commit (James Morse).
 
  - Use __pa() to compute a physical address in the x86-64 code
    finalizing resume from hibernation (Rafael Wysocki).
 
  - Update power management documentation related to system sleep
    states to remove outdated information from it and to add a
    description of a recently introduced hibernation debug feature
    to it (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJXtY1TAAoJEILEb/54YlRxB6YP/iv3agAMBkmwaGE1NV8cumoh
 8bkmCcm5rCu/bZzVOX8eDmLcKtwqFntY5H6p28EOBT0IFK+c9qNvsbSbXODbSui8
 FQfgP5cutSQQE3sdTb7geeqjBPPiEvpI5beeanEjePJpiZVnVapM5tuLBXLeRhYZ
 aX9Y0gWQ5bJqm9fpucN8VsjI5EknGlaNwFLGC3po3bo2pqYj+KfNy4HTNw3oByr7
 EpyoDQ584qDRre6xcM6cnxulQEz1XGvz8pvsqR99YhkBLWMcnSVezLOplrwsx71W
 GbPYHoGU7EVdayzZg5nfnI/GWpjf1z8iznvoRFB7DEuew2z4RXvUgDADljlXH1jd
 XStxTZKRo+k1++X0+mFIcZanRMsHwHsUGtzec6SzRZQCocdlKc0lPSAGBG40YQVz
 g8lFK5EXgsUlLQfVW52KHCjo5XvjwOUpgAPFyuIisOmNvMLWBb79C6oKvJbYwubg
 Raa2En8JWbjfqTxjsvGJ05LRVJmP0Z2saBQskAytRL/2dVjJGFKkeV9XznA4e8j1
 6bifUV4zmwzurUXtWdBbCIrPBVOMukvVfZPiRIWMSQWWq6dPlHK5R/g3rFBXjGtF
 IjSK0bfluUH19O1GOYZYfFFEa08dZYtG5jvqvmgULlQZXzNd4GFsY6EImVskBdOR
 Xe3v0QtkH8uK7qMXXGRa
 =GLCW
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "More hibernation-related material: one fix for a recent regression in
  the core, one small cleanup of the x86-64 resume code and a
  documentation update.

  Specifics:

   - Fix a hibernate core regression resulting from uncovering a latent
     bug in its implementation of memory bitmaps by a recent commit
     (James Morse).

   - Use __pa() to compute a physical address in the x86-64 code
     finalizing resume from hibernation (Rafael Wysocki).

   - Update power management documentation related to system sleep
     states to remove outdated information from it and to add a
     description of a recently introduced hibernation debug feature to
     it (Rafael Wysocki)"

* tag 'pm-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
  x86/power/64: Use __pa() for physical address computation
  PM / sleep: Update some system sleep documentation
2016-08-18 11:09:43 -07:00
Linus Torvalds 76dcd9392a Merge tag 'drm-fixes-for-4.8-rc3' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Pretty quiet so far:

   - a few amdgpu/radeon fixup for pcie pm changes
   - a couple of amdgpu fixes
   - some build fixes
   - printk fix"

* tag 'drm-fixes-for-4.8-rc3' of git://people.freedesktop.org/~airlied/linux:
  drm/amdgpu: Change GART offset to 64-bit
  drm/mediatek: add ARM_SMCCC dependency
  drm/mediatek: add CONFIG_OF dependency
  drm/mediatek: add COMMON_CLK dependency
  drm/amdgpu: Fix memory trashing if UVD ring test fails
  drm/amdgpu: fix vm init error path
  drm/amdkfd: print doorbell offset as a hex value
  Revert "drm/radeon: work around lack of upstream ACPI support for D3cold"
  Revert "drm/amdgpu: work around lack of upstream ACPI support for D3cold"
2016-08-18 10:58:50 -07:00
Johannes Berg 112dc0c806 locking/barriers: Suppress sparse warnings in lockless_dereference()
After Peter's commit:

  331b6d8c7a ("locking/barriers: Validate lockless_dereference() is used on a pointer type")

... we get a lot of sparse warnings (one for every rcu_dereference, and more)
since the expression here is assigning to the wrong address space.

Instead of validating that 'p' is a pointer this way, instead make
it fail compilation when it's not by using sizeof(*(p)). This will
not cause any sparse warnings (tested, likely since the address
space is irrelevant for sizeof), and will fail compilation when
'p' isn't a pointer type.

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 331b6d8c7a ("locking/barriers: Validate lockless_dereference() is used on a pointer type")
Link: http://lkml.kernel.org/r/1470909022-687-2-git-send-email-johannes@sipsolutions.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 15:36:13 +02:00
Johannes Berg f17b3ea3d2 Revert "drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference"
This reverts commit:

  fa7d81bb3c ("drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference")

As Peter explained:

  [...] lockless_dereference() is _stronger_ than READ_ONCE(), not weaker.

  [...]

  Also, clue is in the name: 'dereference', you don't actually dereference
  the pointer here, only load it.

My next patch breaks the compile without this revert, because it assumes
you want to deference and thus also need the struct type visible (which
it isn't here), so revert it.

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1470909022-687-1-git-send-email-johannes@sipsolutions.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 15:36:13 +02:00
Catalin Marinas a93a4d6232 arm64: Fix shift warning in arch/arm64/mm/dump.c
When building with 48-bit VAs and 16K page configuration, it's possible
to get the following warning when building the arm64 page table dumping
code:

arch/arm64/mm/dump.c: In function ‘walk_pud’:
arch/arm64/mm/dump.c:274:102: warning: right shift count >= width of type [-Wshift-count-overflow]

This is because pud_offset(pgd, 0) performs a shift to the right by 36
while the value 0 has the type 'int' by default, therefore 32-bit.

This patch modifies all the p*_offset() uses in arch/arm64/mm/dump.c to
use 0UL for the address argument.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-18 12:38:11 +01:00
Paolo Bonzini 2eeb321fd2 KVM/ARM Fixes for v4.8-rc3
This tag contains the following fixes on top of v4.8-rc1:
  - ITS init issues
  - ITS error handling issues
  - ITS IRQ leakage fix
  - Plug a couple of ITS race conditions
  - An erratum workaround for timers
  - Some removal of misleading use of errors and comments
  - A fix for GICv3 on 32-bit guests
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXtLicAAoJEEtpOizt6ddyEC4H/16IngntN6Gz1WPwtBBelgyj
 ZfU970uzGOyDtDOeOX1NT+gJpkDvUMhsNlngWnMrMwqqqPVKdE4XBShPiW2v53E7
 JquDTd2kKl+OO9e9XnkLw9yUcARmJFKIjHdISlg+E78t2kcNHn+XB2jrfTLKQVl8
 tk1ztDALb4LXSGYPZQ/uHTYp9U0qei+2SbbQufRcdQ3ggyxLDwPP2aO25amctzEP
 0Y42tlnNoZj7yBBp0X9BWRrHF2AZuOp+qBJnpFiQdsgLL6G1P3DcU/t9+KDjVBVr
 LYKN8jId2r5eyGGg8aKb4I3trevayToWhDw/jzarrTNAovB1cp8G5J7ozfmeS3g=
 =4PCW
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-v4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM Fixes for v4.8-rc3

This tag contains the following fixes on top of v4.8-rc1:
 - ITS init issues
 - ITS error handling issues
 - ITS IRQ leakage fix
 - Plug a couple of ITS race conditions
 - An erratum workaround for timers
 - Some removal of misleading use of errors and comments
 - A fix for GICv3 on 32-bit guests
2016-08-18 12:19:19 +02:00
Peter Feiner c95ba92afb kvm: nVMX: fix nested tsc scaling
When the host supported TSC scaling, L2 would use a TSC multiplier of
0, which causes a VM entry failure. Now L2's TSC uses the same
multiplier as L1.

Signed-off-by: Peter Feiner <pfeiner@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-08-18 12:19:09 +02:00
Radim Krčmář dccbfcf52c KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write
If vmcs12 does not intercept APIC_BASE writes, then KVM will handle the
write with vmcs02 as the current VMCS.
This will incorrectly apply modifications intended for vmcs01 to vmcs02
and L2 can use it to gain access to L0's x2APIC registers by disabling
virtualized x2APIC while using msr bitmap that assumes enabled.

Postpone execution of vmx_set_virtual_x2apic_mode until vmcs01 is the
current VMCS.  An alternative solution would temporarily make vmcs01 the
current VMCS, but it requires more care.

Fixes: 8d14695f95 ("x86, apicv: add virtual x2apic support")
Reported-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-08-18 12:19:08 +02:00
Radim Krčmář d048c09821 KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC
msr bitmap can be used to avoid a VM exit (interception) on guest MSR
accesses.  In some configurations of VMX controls, the guest can even
directly access host's x2APIC MSRs.  See SDM 29.5 VIRTUALIZING MSR-BASED
APIC ACCESSES.

L2 could read all L0's x2APIC MSRs and write TPR, EOI, and SELF_IPI.
To do so, L1 would first trick KVM to disable all possible interceptions
by enabling APICv features and then would turn those features off;
nested_vmx_merge_msr_bitmap() only disabled interceptions, so VMX would
not intercept previously enabled MSRs even though they were not safe
with the new configuration.

Correctly re-enabling interceptions is not enough as a second bug would
still allow L1+L2 to access host's MSRs: msr bitmap was shared for all
VMCSs, so L1 could trigger a race to get the desired combination of msr
bitmap and VMX controls.

This fix allocates a msr bitmap for every L1 VCPU, allows only safe
x2APIC MSRs from L1's msr bitmap, and disables msr bitmaps if they would
have to intercept everything anyway.

Fixes: 3af18d9c5f ("KVM: nVMX: Prepare for using hardware MSR bitmap")
Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Wincy Van <fanwenyi0529@gmail.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-08-18 12:19:07 +02:00
Wanpeng Li 03cbc73263 sched/cputime: Resync steal time when guest & host lose sync
Commit:

  5743021831 ("sched/cputime: Count actually elapsed irq & softirq time")

... fixed a bug but also triggered a regression:

On an i5 laptop, 4 pCPUs, 4vCPUs for one full dynticks guest, there are four
CPU hog processes(for loop) running in the guest, I hot-unplug the pCPUs
on host one by one until there is only one left, then observe CPU utilization
via 'top' in the guest, it shows:

  100% st for cpu0(housekeeping)
   75% st for other CPUs (nohz full mode)

However, w/o this commit it shows the correct 75% for all four CPUs.

When a guest is interrupted for a longer amount of time, missed clock ticks
are not redelivered later. Because of that, we should not limit the amount
of steal time accounted to the amount of time that the calling functions
think have passed.

However, the interval returned by account_other_time() is NOT rounded down
to the nearest jiffy, while the base interval in get_vtime_delta() it is
subtracted from is, so the max cputime limit is required to avoid underflow.

This patch fixes the regression by limiting the account_other_time() from
get_vtime_delta() to avoid underflow, and lets the other three call sites
(in account_other_time() and steal_account_process_time()) account however
much steal time the host told us elapsed.

Suggested-by: Rik van Riel <riel@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Link: http://lkml.kernel.org/r/1471399546-4069-1-git-send-email-wanpeng.li@hotmail.com
[ Improved the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 11:19:48 +02:00
Peter Zijlstra 173be9a14f sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
Mike reports:

 Roughly 10% of the time, ltp testcase getrusage04 fails:
 getrusage04    0  TINFO  :  Expected timers granularity is 4000 us
 getrusage04    0  TINFO  :  Using 1 as multiply factor for max [us]time increment (1000+4000us)!
 getrusage04    0  TINFO  :  utime:           0us; stime:         179us
 getrusage04    0  TINFO  :  utime:        3751us; stime:           0us
 getrusage04    1  TFAIL  :  getrusage04.c:133: stime increased > 5000us:

And tracked it down to the case where the task simply doesn't get
_any_ [us]time ticks.

Update the code to assume all rtime is utime when we lack information,
thus ensuring a task that elides the tick gets time accounted.

Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Fredrik Markstrom <fredrik.markstrom@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: stable@vger.kernel.org # 4.3+
Fixes: 9d7fb04276 ("sched/cputime: Guarantee stime + utime == rtime")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:48:46 +02:00
David Carrillo-Cisneros 71e7bc2bab perf/core: Check return value of the perf_event_read() IPI
The call to smp_call_function_single in perf_event_read() may fail if
an invalid or not online CPU index is passed. Warn user if such bug is
present and return error.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1471467307-61171-2-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:35:52 +02:00
Mathieu Poirier 99f5bc9bfa perf/core: Enable mapping of the stop filters
At this time the perf_addr_filter_needs_mmap() function will _not_
return true on a user space 'stop' filter.  But stop filters need
exactly the same kind of mapping that range and start filters get.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1468860187-318-4-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:35:51 +02:00
Mathieu Poirier 12b40a2393 perf/core: Update filters only on executable mmap
Function perf_event_mmap() is called by the MM subsystem each time
part of a binary is loaded in memory.  There can be several mapping
for a binary, many times unrelated to the code section.

Each time a section of a binary is mapped address filters are
updated, event when the map doesn't pertain to the code section.
The end result is that filters are configured based on the last map
event that was received rather than the last mapping of the code
segment.

For example if we have an executable 'main' that calls library
'libcstest.so.1.0', and that we want to collect traces on code
that is in that library.  The perf cmd line for this scenario
would be:

  perf record -e cs_etm// --filter 'filter 0x72c/0x40@/opt/lib/libcstest.so.1.0' --per-thread ./main

Resulting in binaries being mapped this way:

  root@linaro-nano:~# cat /proc/1950/maps
  00400000-00401000 r-xp 00000000 08:02 33169     /home/linaro/main
  00410000-00411000 r--p 00000000 08:02 33169     /home/linaro/main
  00411000-00412000 rw-p 00001000 08:02 33169     /home/linaro/main
  7fa2464000-7fa2474000 rw-p 00000000 00:00 0
  7fa2474000-7fa25a4000 r-xp 00000000 08:02 543   /lib/aarch64-linux-gnu/libc-2.21.so
  7fa25a4000-7fa25b3000 ---p 00130000 08:02 543   /lib/aarch64-linux-gnu/libc-2.21.so
  7fa25b3000-7fa25b7000 r--p 0012f000 08:02 543   /lib/aarch64-linux-gnu/libc-2.21.so
  7fa25b7000-7fa25b9000 rw-p 00133000 08:02 543   /lib/aarch64-linux-gnu/libc-2.21.so
  7fa25b9000-7fa25bd000 rw-p 00000000 00:00 0
  7fa25bd000-7fa25be000 r-xp 00000000 08:02 38308 /opt/lib/libcstest.so.1.0
  7fa25be000-7fa25cd000 ---p 00001000 08:02 38308 /opt/lib/libcstest.so.1.0
  7fa25cd000-7fa25ce000 r--p 00000000 08:02 38308 /opt/lib/libcstest.so.1.0
  7fa25ce000-7fa25cf000 rw-p 00001000 08:02 38308 /opt/lib/libcstest.so.1.0
  7fa25cf000-7fa25eb000 r-xp 00000000 08:02 574   /lib/aarch64-linux-gnu/ld-2.21.so
  7fa25ef000-7fa25f2000 rw-p 00000000 00:00 0
  7fa25f7000-7fa25f9000 rw-p 00000000 00:00 0
  7fa25f9000-7fa25fa000 r--p 00000000 00:00 0     [vvar]
  7fa25fa000-7fa25fb000 r-xp 00000000 00:00 0     [vdso]
  7fa25fb000-7fa25fc000 r--p 0001c000 08:02 574   /lib/aarch64-linux-gnu/ld-2.21.so
  7fa25fc000-7fa25fe000 rw-p 0001d000 08:02 574   /lib/aarch64-linux-gnu/ld-2.21.so
  7ff2ea8000-7ff2ec9000 rw-p 00000000 00:00 0     [stack]
  root@linaro-nano:~#

Before 'main()' can execute 'libcstest.so.1.0' has to be loaded in
memory.  Once that has been done perf_event_mmap() has been called
4 times, with the last map starting at address 0x7fa25ce000 and
the address filter configured to start filtering when the
IP has passed over address 0x0x7fa25ce72c (0x7fa25ce000 + 0x72c).

But that is wrong since the code segment for library 'libcstest.so.1.0'
as been mapped at 0x7fa25bd000, resulting in traces not being
collected.

This patch corrects the situation by requesting that address
filters be updated only if the mapped event is for a code
segment.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1468860187-318-3-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:35:50 +02:00
Mathieu Poirier 4059ffd09d perf/core: Fix file name handling for start/stop filters
Binary file names have to be supplied for both range and start/stop
filters but the current code only processes the filename if an
address range filter is specified.  This code adds processing of
the filename for start/stop filters.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1468860187-318-2-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:35:50 +02:00
Peter Zijlstra cca2094605 perf/core: Fix event_function_local()
Vincent reported triggering the WARN_ON_ONCE() in event_function_local().

While thinking through cases I noticed that by using event_function()
directly, we miss the inactive case usually handled by
event_function_call().

Therefore construct a blend of event_function_call() and
event_function() that handles the cases relevant to
event_function_local().

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # 4.5+
Fixes: fae3fde651 ("perf: Collapse and fix event_function_call() users")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:35:49 +02:00
Jiri Olsa 7b0501b1e7 x86/smp: Fix __max_logical_packages value setup
Frank reported kernel panic when he disabled several cores in BIOS
via following option:

  Core Disable Bitmap(Hex)   [0]

with number 0xFFE, which leaves 16 CPUs in system (out of 48).

The kernel panic below goes along with following messages:

 smpboot: Max logical packages: 2^M
 smpboot: APIC(0) Converting physical 0 to logical package 0^M
 smpboot: APIC(20) Converting physical 1 to logical package 1^M
 smpboot: APIC(40) Package 2 exceeds logical package map^M
 smpboot: CPU 8 APICId 40 disabled^M
 smpboot: APIC(60) Package 3 exceeds logical package map^M
 smpboot: CPU 12 APICId 60 disabled^M
 ...
 general protection fault: 0000 [#1] SMP^M
 Modules linked in:^M
 CPU: 15 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc5+ #1^M
 Hardware name: SGI UV300/UV300, BIOS SGI UV 300 series BIOS 05/25/2016^M
 task: ffff8801673e0000 ti: ffff8801673ac000 task.ti: ffff8801673ac000^M
 RIP: 0010:[<ffffffff81014d54>]  [<ffffffff81014d54>] uncore_change_context+0xd4/0x180^M
 ...
  [<ffffffff810158ac>] uncore_event_init_cpu+0x6c/0x70^M
  [<ffffffff81d8c91c>] intel_uncore_init+0x1c2/0x2dd^M
  [<ffffffff81d8c75a>] ? uncore_cpu_setup+0x17/0x17^M
  [<ffffffff81002190>] do_one_initcall+0x50/0x190^M
  [<ffffffff810ab193>] ? parse_args+0x293/0x480^M
  [<ffffffff81d87365>] kernel_init_freeable+0x1a5/0x249^M
  [<ffffffff81d86a35>] ? set_debug_rodata+0x12/0x12^M
  [<ffffffff816dc19e>] kernel_init+0xe/0x110^M
  [<ffffffff816e93bf>] ret_from_fork+0x1f/0x40^M
  [<ffffffff816dc190>] ? rest_init+0x80/0x80^M

The reason for the panic is wrong value of __max_logical_packages,
which lets logical_package_map uninitialized and the uncore code
relying on this map being properly initialized (maybe we should
add some safety checks there as well).

The __max_logical_packages is computed as:

  DIV_ROUND_UP(total_cpus, ncpus);
  - ncpus being number of cores

With above BIOS setup we get total_cpus == 16 which set
__max_logical_packages to 2 (ncpus is 12).

Once topology_update_package_map processes CPU with logical
pkg over 2 we display above messages and fail to initialize
the physical_to_logical_pkg map, which makes the uncore code
crash.

The fix is to remove logical_package_map bitmap completely
and keep and update the logical_packages number instead.

After we enumerate all the present CPUs, we check if the
enumerated logical packages count is within its computed
maximum from BIOS data.

If it's not the case, we set this maximum to the new enumerated
value and freeze any new addition of logical packages.

The freeze is because lot of init code like uncore/rapl/cqm
depends on having maximum logical package value set to allocate
their data, so we can't change it later on.

Prarit Bhargava tested the patch and confirms that it solves
the problem:

  From dmidecode:
          Core Count: 24
          Core Enabled: 24
          Thread Count: 48

Orig kernel boot log:

 [    0.464981] smpboot: Max logical packages: 19
 [    0.469861] smpboot: APIC(0) Converting physical 0 to logical package 0
 [    0.477261] smpboot: APIC(40) Converting physical 1 to logical package 1
 [    0.484760] smpboot: APIC(80) Converting physical 2 to logical package 2
 [    0.492258] smpboot: APIC(c0) Converting physical 3 to logical package 3

1.  nr_cpus=8, should stop enumerating in package 0:

 [    0.533664] smpboot: APIC(0) Converting physical 0 to logical package 0
 [    0.539596] smpboot: Max logical packages: 19

2.  max_cpus=8, should still enumerate all packages:

 [    0.526494] smpboot: APIC(0) Converting physical 0 to logical package 0
 [    0.532428] smpboot: APIC(40) Converting physical 1 to logical package 1
 [    0.538456] smpboot: APIC(80) Converting physical 2 to logical package 2
 [    0.544486] smpboot: APIC(c0) Converting physical 3 to logical package 3
 [    0.550524] smpboot: Max logical packages: 19

3.  nr_cpus=49 ( 2 socket + 1 core on 3rd socket), should stop enumerating in
    package 2:

 [    0.521378] smpboot: APIC(0) Converting physical 0 to logical package 0
 [    0.527314] smpboot: APIC(40) Converting physical 1 to logical package 1
 [    0.533345] smpboot: APIC(80) Converting physical 2 to logical package 2
 [    0.539368] smpboot: Max logical packages: 19

4.  maxcpus=49, should still enumerate all packages:

 [    0.525591] smpboot: APIC(0) Converting physical 0 to logical package 0
 [    0.531525] smpboot: APIC(40) Converting physical 1 to logical package 1
 [    0.537547] smpboot: APIC(80) Converting physical 2 to logical package 2
 [    0.543579] smpboot: APIC(c0) Converting physical 3 to logical package 3
 [    0.549624] smpboot: Max logical packages: 19

5.  kdump (nr_cpus=1) works as well.

Reported-by: Frank Ramsay <framsay@redhat.com>
Tested-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160815101700.GA30090@krava
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:14:48 +02:00
Borislav Petkov 88b2f63402 x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y
Similar to:

  efaad554b4 ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y")

... fix microcode loading from the initrd on AMD by adding the
randomization offset to the microcode patch container within the initrd.

Reported-and-tested-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-tip-commits@vger.kernel.org
Link: http://lkml.kernel.org/r/20160817113314.GA19221@nazgul.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:06:49 +02:00
Oleg Nesterov 6c4687cc17 uprobes: Fix the memcg accounting
__replace_page() wronlgy calls mem_cgroup_cancel_charge() in "success" path,
it should only do this if page_check_address() fails.

This means that every enable/disable leads to unbalanced mem_cgroup_uncharge()
from put_page(old_page), it is trivial to underflow the page_counter->count
and trigger OOM.

Reported-and-tested-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: stable@vger.kernel.org # 3.17+
Fixes: 00501b531c ("mm: memcontrol: rewrite charge API")
Link: http://lkml.kernel.org/r/20160817153629.GB29724@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18 10:03:26 +02:00
Dave Airlie 91d62d9f30 Merge branch 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Single 64-bit gart size fix.

* 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: Change GART offset to 64-bit
2016-08-18 12:51:27 +10:00
Rafael J. Wysocki 6c16f42a4e Merge branch 'pm-sleep'
* pm-sleep:
  PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
  x86/power/64: Use __pa() for physical address computation
  PM / sleep: Update some system sleep documentation
2016-08-18 03:27:08 +02:00
Linus Torvalds 184ca82348 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
    Fietkau.

 2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.

 3) Fix some tg3 ethtool logic bugs, and one that would cause no
    interrupts to be generated when rx-coalescing is set to 0.  From
    Satish Baddipadige and Siva Reddy Kallam.

 4) QLCNIC mailbox corruption and napi budget handling fix from Manish
    Chopra.

 5) Fix fib_trie logic when walking the trie during /proc/net/route
    output than can access a stale node pointer.  From David Forster.

 6) Several sctp_diag fixes from Phil Sutter.

 7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.

 8) Checksum fixup fixes in bpf from Daniel Borkmann.

 9) Memork leaks in nfnetlink, from Liping Zhang.

10) Use after free in rxrpc, from David Howells.

11) Use after free in new skb_array code of macvtap driver, from Jason
    Wang.

12) Calipso resource leak, from Colin Ian King.

13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.

14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.

15) Fix lockdep splats in macsec, from Sabrina Dubroca.

16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
    handling.

17) Various tc-action bug fixes, from CONG Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  net_sched: allow flushing tc police actions
  net_sched: unify the init logic for act_police
  net_sched: convert tcf_exts from list to pointer array
  net_sched: move tc offload macros to pkt_cls.h
  net_sched: fix a typo in tc_for_each_action()
  net_sched: remove an unnecessary list_del()
  net_sched: remove the leftover cleanup_a()
  mlxsw: spectrum: Allow packets to be trapped from any PG
  mlxsw: spectrum: Unmap 802.1Q FID before destroying it
  mlxsw: spectrum: Add missing rollbacks in error path
  mlxsw: reg: Fix missing op field fill-up
  mlxsw: spectrum: Trap loop-backed packets
  mlxsw: spectrum: Add missing packet traps
  mlxsw: spectrum: Mark port as active before registering it
  mlxsw: spectrum: Create PVID vPort before registering netdevice
  mlxsw: spectrum: Remove redundant errors from the code
  mlxsw: spectrum: Don't return upon error in removal path
  i40e: check for and deal with non-contiguous TCs
  ixgbe: Re-enable ability to toggle VLAN filtering
  ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
  ...
2016-08-17 17:26:58 -07:00
David S. Miller b96c22c071 Merge branch 'tc_action-fixes'
Cong Wang says:

====================
net_sched: tc action fixes and updates

This patchset fixes a few regressions caused by the previous
code refactor and more. Thanks to Jamal for catching them!

Note, patch 3/7 and 4/7 are not strictly necessary for this patchset,
I just want to carry them together.

---
v4: adjust an indention for Jamal
    add two more patches

v3: avoid list for fast path, suggested by Jamal

v2: replace flex_array with regular dynamic array
    keep tcf_action_stats_update() in act_api.h
    fix macro typos found by Amir
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:58 -04:00
Roman Mashak b5ac851885 net_sched: allow flushing tc police actions
The act_police uses its own code to walk the
action hashtable, which leads to that we could
not flush standalone tc police actions, so just
switch to tcf_generic_walker() like other actions.

(Joint work from Roman and Cong.)

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:51 -04:00
WANG Cong 0852e45523 net_sched: unify the init logic for act_police
Jamal reported a crash when we create a police action
with a specific index, this is because the init logic
is not correct, we should always create one for this
case. Just unify the logic with other tc actions.

Fixes: a03e6fe569 ("act_police: fix a crash during removal")
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:51 -04:00
WANG Cong 22dc13c837 net_sched: convert tcf_exts from list to pointer array
As pointed out by Jamal, an action could be shared by
multiple filters, so we can't use list to chain them
any more after we get rid of the original tc_action.
Instead, we could just save pointers to these actions
in tcf_exts, since they are refcount'ed, so convert
the list to an array of pointers.

The "ugly" part is the action API still accepts list
as a parameter, I just introduce a helper function to
convert the array of pointers to a list, instead of
relying on the C99 feature to iterate the array.

Fixes: a85a970af2 ("net_sched: move tc_action into tcf_common")
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:51 -04:00
WANG Cong 2734437ef3 net_sched: move tc offload macros to pkt_cls.h
struct tcf_exts belongs to filters, should not be visible
to plain tc actions.

Cc: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:51 -04:00
WANG Cong 0c23c3e705 net_sched: fix a typo in tc_for_each_action()
It is harmless because all users pass 'a' to this macro.

Fixes: 00175aec94 ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef")
Cc: Amir Vadai <amir@vadai.me>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:51 -04:00
WANG Cong 824a7e8863 net_sched: remove an unnecessary list_del()
This list_del() for tc action is not needed actually,
because we only use this list to chain bulk operations,
therefore should not be carried for latter operations.

Fixes: ec0595cc44 ("net_sched: get rid of struct tcf_common")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:51 -04:00
WANG Cong f07fed82ad net_sched: remove the leftover cleanup_a()
After refactoring tc_action into tcf_common, we no
longer need to cleanup temporary "actions" in list,
they are permanently stored in the hashtable.

Fixes: a85a970af2 ("net_sched: move tc_action into tcf_common")
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:51 -04:00
David S. Miller f4abf05f54 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2016-08-16

This series contains fixes to e1000e, igb, ixgbe and i40e.

Kshitiz Gupta provides a fix for igb to resolve the PHY delay compensation
math in several functions.

Jarod Wilson provides a fix for e1000e which had to broken up into 2
patches, first is prepares the driver for expanding the list of NICs
that have occasional ~10 hour clock jumps when being used for PTP.
Second patch actually fixes i218 silicon which has been experiencing
the clock jumps while using PTP.

Alex provides 2 patches for ixgbe now that he is back at Intel.  First
fixes setting VLNCTRL.VFE bit, which was left unchanged in earlier patches
which resulted in disabling VLAN filtering for all the VFs.  Second
corrects the support for disabling the VLAN tag filtering via the
feature bit.

Lastly, David fixes i40e which was causing a kernel panic when
non-contiguous traffic classes or traffic classes not starting with TC0,
were configured on a link partner switch.  To fix this, changed the
logic when determining the total number of TCs enabled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:20:24 -04:00
David S. Miller 647f28c727 Merge branch 'mlxsw-fixes'
Jiri Pirko says:

====================
mlxsw: IPv4 UC router fixes

Ido says:
Patches 1-3 fix a long standing problem in the driver's init sequence,
which manifests itself quite often when routing daemons try to configure
an IP address on registered netdevs that don't yet have an associated
vPort.

Patches 4-9 add missing packet traps for the router to work properly and
also fix ordering issue following the recent changes to the driver's init
sequence.

The last patch isn't related to the router, but fixes a general problem
in which under certain conditions packets aren't trapped to CPU.

v1->v2:
- Change order of patch 7
- Add patch 6 following Ilan's comment
- Add patchset name and cover letter
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:34 -04:00
Ido Schimmel 9ffcc3725f mlxsw: spectrum: Allow packets to be trapped from any PG
When packets enter the device they are classified to a priority group
(PG) buffer based on their PCP value. After their egress port and
traffic class are determined they are moved to the switch's shared
buffer and await transmission, if:

(Ingress{Port}.Usage < Thres && Ingress{Port,PG}.Usage < Thres &&
 Egress{Port}.Usage < Thres && Egress{Port,TC}.Usage < Thres)
||
(Ingress{Port}.Usage < Min || Ingress{Port,PG} < Min ||
 Egress{Port}.Usage < Min || Egress{Port,TC}.Usage < Min)

Packets scheduled to transmission through CPU port (trapped to CPU) use
traffic class 7, which has a zero maximum and minimum quotas. However,
when such packets arrive from PG 0 they are admitted to the shared
buffer as PG 0 has a non-zero minimum quota.

Allow all packets to be trapped to the CPU - regardless of the PG they
were classified to - by assigning a 10KB minimum quota for CPU port and
TC7.

Fixes: 8e8dfe9fdf ("mlxsw: spectrum: Add IEEE 802.1Qaz ETS support")
Reported-by: Tamir Winetroub <tamirw@mellanox.com>
Tested-by: Tamir Winetroub <tamirw@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:28 -04:00
Ido Schimmel 8168287b5d mlxsw: spectrum: Unmap 802.1Q FID before destroying it
Before destroying the 802.1Q FID we should first remove the VID-to-FID
mapping. This makes mlxsw_sp_fid_destroy() symmetric with regards to
mlxsw_sp_fid_create().

Fixes: 14d39461b3 ("mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:27 -04:00
Ido Schimmel 0583272d91 mlxsw: spectrum: Add missing rollbacks in error path
While going over the code I noticed we are missing two rollbacks in the
port's creation error path. Add them and adjust the place of one of them
in the port's removal sequence so that both are symmetric.

Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:27 -04:00
Jiri Pirko 0e7df1a290 mlxsw: reg: Fix missing op field fill-up
Ralue pack function needs to set op, otherwise it is 0 for add always.

Fixes: d5a1c749d2 ("mlxsw: reg: Add Router Algorithmic LPM Unicast Entry Register definition")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:27 -04:00
Ido Schimmel a94a614fa2 mlxsw: spectrum: Trap loop-backed packets
One of the conditions to generate an ICMP Redirect Message is that "the
packet is being forwarded out the same physical interface that it was
received from" (RFC 1812).

Therefore, we need to be able to trap such packets and let the kernel
decide what to do with them.

For each RIF, enable the loop-back filter, which will raise the LBERROR
trap whenever the ingress RIF equals the egress RIF.

Fixes: 99724c18fc ("mlxsw: spectrum: Introduce support for router interfaces")
Reported-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:27 -04:00
Elad Raz c20b80187a mlxsw: spectrum: Add missing packet traps
Add the following traps:

1) MTU Error: Trap packets whose size is bigger than the egress RIF's
MTU. If DF bit isn't set, traffic will continue to be routed in slow
path.

2) TTL Error: Trap packets whose TTL expired. This allows traceroute to
work properly.

3) OSPF packets.

Fixes: 7b27ce7bb9 ("mlxsw: spectrum: Add traps needed for router implementation")
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:27 -04:00
Ido Schimmel 2f25844c23 mlxsw: spectrum: Mark port as active before registering it
Commit bbf2a4757b ("mlxsw: spectrum: Initialize ports at the end of
init sequence") moved ports initialization to the end of the init
sequence, which means ports are the first to be removed during fini.

Since the FDB delayed work is still active when ports are removed it's
possible for it to process FDB notifications of inactive ports,
resulting in a warning message.

Fix that by marking ports as inactive only after unregistering them. The
NETDEV_UNREGISTER event will invoke bridge's driver port removal
sequence that will cause the FDB (and FDB notifications) to be flushed.

Fixes: bbf2a4757b ("mlxsw: spectrum: Initialize ports at the end of init sequence")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:27 -04:00
Ido Schimmel 05978481e7 mlxsw: spectrum: Create PVID vPort before registering netdevice
After registering a netdevice it's possible for user space applications
to configure an IP address on it. From the driver's perspective, this
means a router interface (RIF) should be created for the PVID vPort.

Therefore, we must create the PVID vPort before registering the
netdevice.

Fixes: 99724c18fc ("mlxsw: spectrum: Introduce support for router interfaces")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:27 -04:00
Ido Schimmel fa66d7e3fe mlxsw: spectrum: Remove redundant errors from the code
Currently, when device configuration fails we emit errors to the kernel
log despite the fact we already get these from the EMAD transaction
layer, so remove them.

In addition to being unnecessary, removing these error messages will
allow us to reuse mlxsw_sp_port_add_vid() to create the PVID vPort
before registering the netdevice.

Fixes: 99724c18fc ("mlxsw: spectrum: Introduce support for router interfaces")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:27 -04:00
Ido Schimmel 7a35583ec5 mlxsw: spectrum: Don't return upon error in removal path
When removing a VLAN filter from the device we shouldn't return upon the
first error we encounter, as otherwise we'll have resources that will
never be freed nor used.

Instead, we should keep trying to free as much resources as possible in
a best effort mode.

Remove the error message as well, since we already get these from the
EMAD transaction code.

Fixes: 99724c18fc ("mlxsw: spectrum: Introduce support for router interfaces")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:18:27 -04:00
Linus Torvalds 5ff132c07a Power Supply Fixes for 4.8 cycle
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJXsk2BAAoJENju1/PIO/qa0E4P/0rsS4Kzv+B61BJsoCIoXYOp
 p347gy6fS7QEJxSlHG430uVV74zlXQE3JTiBrjIdX2TKRA6EvD0RyM3/h+vfQU1c
 GviBJ53UwNlq1bBuJD1jkIHCvIW274FjX1h45Tytuzsi/8rLWas15oumnRFuOtyA
 UtgTbPtualYab01XgQyivv/dY4Dfb3d0aNPl3Nq77hAJMTVc9s1DLVqRMudqc0Zc
 e489mw+8l31L2pJs7CPxzNXqlS4YvMz4/mwbGCsqtEJTNHyzyrVH5VfCpMUNfGr4
 rZdqtF/nj/9c/DTIO6GvhoOVjja4MCEf3WpGCxrz2KUmhvxVTeBqsPStjE1xnC8u
 rNDDpRAwtjgpiw8fiEw+CXB5XsDMPmMSJi1KaOLz1q9O2GxHzoz2tgfjCh9V73qj
 J5pq1cQY2V6AkB+10WQ7afCnN3Zn2qDUoWXj2cvkd8yuvyfjTuC53Gd1sGpNk1dj
 f+euFLm6YYMvLv37vDCQDaxvzmud1Z60bgJy/uvbaVhcCABLUdbfwJpRf/3tWjp3
 rU3Lw43p5MPfBWHkZrzpQO33GEzO6FSrIkTr3YfVJiwAoViXgt5WEknbVde/pZDz
 Acn8UTE+6KeLrmrwA66pF++ewKjA7GgUzW50Ubju15rWujnwu5QVg/T+UVpGrB4k
 OYmwCtd5i2/OjnC+APnK
 =hK7R
 -----END PGP SIGNATURE-----

Merge tag 'for-v4.8-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel.

* tag 'for-v4.8-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power_supply: tps65217-charger: fix missing platform_set_drvdata()
  power: reset: hisi-reboot: Unmap region obtained by of_iomap
  power: reset: reboot-mode: fix build error of missing ioremap/iounmap on UM
  power: supply: max17042_battery: fix model download bug.
2016-08-17 12:10:22 -07:00
Ard Biesheuvel bc9f3d7788 arm64: kernel: avoid literal load of virtual address with MMU off
Literal loads of virtual addresses are subject to runtime relocation when
CONFIG_RELOCATABLE=y, and given that the relocation routines run with the
MMU and caches enabled, literal loads of relocated values performed with
the MMU off are not guaranteed to return the latest value unless the
memory covering the literal is cleaned to the PoC explicitly.

So defer the literal load until after the MMU has been enabled, just like
we do for primary_switch() and secondary_switch() in head.S.

Fixes: 1e48ef7fcc ("arm64: add support for building vmlinux as a relocatable PIE binary")
Cc: <stable@vger.kernel.org> # 4.6+
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-17 17:37:37 +01:00
Catalin Marinas bfe6c8a89e arm64: Fix NUMA build error when !CONFIG_ACPI
Since asm/acpi.h is only included by linux/acpi.h when CONFIG_ACPI is
enabled, disabling the latter leads to the following build error on
arm64:

arch/arm64/mm/numa.c: In function ‘arm64_numa_init’:
arch/arm64/mm/numa.c:395:24: error: ‘arm64_acpi_numa_init’ undeclared (first use in this function)
   if (!acpi_disabled && !numa_init(arm64_acpi_numa_init))

This patch include the asm/acpi.h explicitly in arch/arm64/mm/numa.c for
the arm64_acpi_numa_init() definition.

Fixes: d8b47fca8c ("arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT")
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-17 17:16:58 +01:00