Commit Graph

674318 Commits

Author SHA1 Message Date
Shaohua Li 93e06c7a64 mm: enable MADV_FREE for swapless system
Now MADV_FREE pages can be easily reclaimed even for swapless system.
We can safely enable MADV_FREE for all systems.

Link: http://lkml.kernel.org/r/155648585589300bfae1d45078e7aebb3d988b87.1487965799.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Minchan Kim eb94a87844 mm: fix lazyfree BUG_ON check in try_to_unmap_one()
If a page is swapbacked, it means it should be in swapcache in
try_to_unmap_one's path.

If a page is !swapbacked, it means it shouldn't be in swapcache in
try_to_unmap_one's path.

Check both cases at once and, if the check fails, warn and return
SWAP_FAIL.  Such a bug should never mean we shut down the kernel.
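
The combined check amounts to a single inequality test on the two flags.
The sketch below is a Python model of that idea, not the kernel code; the
Page class, check_swap_state(), and the string return values are
illustrative names assumed here.

```python
import warnings
from dataclasses import dataclass

# Hypothetical stand-ins for the kernel's return codes.
SWAP_OK = "SWAP_OK"
SWAP_FAIL = "SWAP_FAIL"


@dataclass
class Page:
    swapbacked: bool  # models PageSwapBacked(page)
    swapcache: bool   # models PageSwapCache(page)


def check_swap_state(page: Page) -> str:
    """Warn and fail, rather than crash, when the two flags disagree."""
    if page.swapbacked != page.swapcache:
        warnings.warn("inconsistent swapbacked/swapcache state")
        return SWAP_FAIL
    return SWAP_OK
```

Both inconsistent combinations (swapbacked without swapcache, and
swapcache without swapbacked) take the same warn-and-fail path.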

[minchan@kernel.org: do not use VM_WARN_ON_ONCE as if condition]
  Link: http://lkml.kernel.org/r/20170309060226.GB854@bbox
Link: http://lkml.kernel.org/r/20170307055551.GC29458@bbox
Signed-off-by: Minchan Kim <minchan@kernel.org>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shaohua Li <shli@fb.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Shaohua Li 802a3a92ad mm: reclaim MADV_FREE pages
When memory pressure is high, we free MADV_FREE pages.  If the pages are
not dirty in pte, the pages could be freed immediately.  Otherwise we
can't reclaim them.  We put the pages back to the anonymous LRU list (by
setting SwapBacked flag) and the pages will be reclaimed in normal
swapout way.
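
The decision described above can be modeled roughly as follows.  This is
a Python sketch under stated assumptions, not the kernel implementation:
the Page class, reclaim_lazyfree(), and the returned strings are
hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    anon: bool        # models PageAnon(page)
    swapbacked: bool  # models PageSwapBacked(page)
    pte_dirty: bool   # True if the page was written after MADV_FREE


def reclaim_lazyfree(page: Page) -> str:
    """Return the action taken when reclaim reaches this page."""
    is_lazyfree = page.anon and not page.swapbacked
    if not is_lazyfree:
        return "normal reclaim"
    if not page.pte_dirty:
        # Clean lazyfree page: its contents may be discarded.
        return "freed immediately"
    # Dirtied since MADV_FREE: the data is live again, so restore the
    # SwapBacked flag and let the normal swapout path handle the page.
    page.swapbacked = True
    return "put back on anonymous LRU"
```

A page that was dirtied after MADV_FREE leaves this function looking
like an ordinary anonymous page again.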

We use normal page reclaim policy.  Since MADV_FREE pages are put into
inactive file list, such pages and inactive file pages are reclaimed
according to their age.  This is expected, because we don't want to
reclaim too many MADV_FREE pages before used once pages.

Based on Minchan's original patch

[minchan@kernel.org: clean up lazyfree page handling]
  Link: http://lkml.kernel.org/r/20170303025237.GB3503@bbox
Link: http://lkml.kernel.org/r/14b8eb1d3f6bf6cc492833f183ac8c304e560484.1487965799.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Shaohua Li f7ad2a6cb9 mm: move MADV_FREE pages into LRU_INACTIVE_FILE list
madvise()'s MADV_FREE indicates that pages are 'lazyfree'.  They are still
anonymous pages, but they can be freed without pageout.  To distinguish
these from normal anonymous pages, we clear their SwapBacked flag.

MADV_FREE pages can be freed without pageout, so they are pretty much like
used-once file pages.  For such pages, we'd like to reclaim them once
there is memory pressure.  It might be unfair to always reclaim MADV_FREE
pages before used-once file pages, but we definitely want to reclaim
them before other anonymous and file pages.

To speed up MADV_FREE page reclaim, we put the pages into the
LRU_INACTIVE_FILE list.  The rationale is that the LRU_INACTIVE_FILE list
is tiny nowadays and should be full of used-once file pages.  Reclaiming
MADV_FREE pages will not interfere much with anonymous and active
file pages.  The inactive file pages and MADV_FREE pages will be
reclaimed according to their age, so we don't reclaim too many MADV_FREE
pages either.  Putting the MADV_FREE pages into the LRU_INACTIVE_FILE list
also means we can reclaim the pages without swap support.  This idea was
suggested by Johannes.
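
A rough model of why clearing SwapBacked is enough to change where a
page is queued, assuming LRU list selection keys off that flag alone.
The function name lru_list_for() is hypothetical; the list names follow
the ones used in this message.

```python
def lru_list_for(swapbacked: bool, active: bool) -> str:
    """Pick an LRU list name from the SwapBacked and active states.

    Assumption: a page without SwapBacked is treated like file cache
    for list selection, even if it is an anonymous page.
    """
    base = "ANON" if swapbacked else "FILE"
    state = "ACTIVE" if active else "INACTIVE"
    return f"LRU_{state}_{base}"
```

Under this model an inactive MADV_FREE page (SwapBacked cleared) maps to
LRU_INACTIVE_FILE, while an ordinary inactive anonymous page maps to
LRU_INACTIVE_ANON.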

This patch doesn't move MADV_FREE pages to the LRU_INACTIVE_FILE list yet,
to avoid bisect failures; the next patch will do it.

The patch is based on Minchan's original patch.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/2f87063c1e9354677b7618c647abde77b07561e5.1487965799.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Shaohua Li d44d363f65 mm: don't assume anonymous pages have SwapBacked flag
There are a few places the code assumes anonymous pages should have
SwapBacked flag set.  MADV_FREE pages are anonymous pages but we are
going to add them to LRU_INACTIVE_FILE list and clear SwapBacked flag
for them.  The assumption doesn't hold any more, so fix them.

Link: http://lkml.kernel.org/r/3945232c0df3dd6c4ef001976f35a95f18dcb407.1487965799.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Shaohua Li a128ca71fb mm: delete unnecessary TTU_* flags
Patch series "mm: fix some MADV_FREE issues", v5.

We are trying to use MADV_FREE in jemalloc.  Several issues are found.
Without solving the issues, jemalloc can't use the MADV_FREE feature.

 - Doesn't support systems without swap enabled. If swap is off,
   we can't, or can't efficiently, age anonymous pages. And since
   MADV_FREE pages are mixed with other anonymous pages, we can't
   reclaim MADV_FREE pages. In the current implementation, MADV_FREE
   falls back to MADV_DONTNEED without swap enabled. But in our
   environment, a lot of machines don't enable swap. This prevents
   our setup from using MADV_FREE.

 - Increases memory pressure. Page reclaim biases reclaim toward file
   pages over anonymous pages. This doesn't make sense for MADV_FREE pages,
   because those pages could be freed easily and refilled with a very
   slight penalty. Even if page reclaim doesn't bias file pages, there is
   still an issue, because MADV_FREE pages and other anonymous pages are
   mixed together. To reclaim a MADV_FREE page, we probably must scan a
   lot of other anonymous pages, which is inefficient. In our tests, we
   usually see OOM with MADV_FREE enabled and nothing without it.

 - Accounting. There are two accounting problems. We don't have global
   accounting, so if the system behaves abnormally, we don't know whether
   MADV_FREE is the cause. The other problem is RSS accounting.
   MADV_FREE pages are accounted as normal anon pages and reclaimed
   lazily, so an application's RSS becomes bigger. This confuses our
   workloads. We have a monitoring daemon running, and if it finds that
   an application's RSS has become abnormal, the daemon will kill the
   application even though the kernel can reclaim the memory easily.

To address the first two issues, we can either put MADV_FREE pages
into a separate LRU list (Minchan's previous patches and V1 patches), or
put them into the LRU_INACTIVE_FILE list (suggested by Johannes).  The
patchset uses the second idea.  The reason is that the LRU_INACTIVE_FILE
list is tiny nowadays and should be full of used-once file pages.  So we
can still efficiently reclaim MADV_FREE pages there without interference
with other anon and active file pages.  Putting the pages into the inactive
file list also has the advantage of allowing page reclaim to prioritize
MADV_FREE pages and used-once file pages.  MADV_FREE pages are put into
the LRU list with their SwapBacked flag cleared, so PageAnon(page) &&
!PageSwapBacked(page) indicates a MADV_FREE page.  These pages will be
directly freed without pageout if they are clean; otherwise normal swap
will reclaim them.

For the third issue, the previous post added global accounting and a
separate RSS count for MADV_FREE pages.  The problem is that we never get
accurate accounting for MADV_FREE pages.  The pages are mapped to
userspace and can be dirtied without the kernel noticing.  To get
accurate accounting, we could write-protect the pages, but then there is
extra page fault overhead, which people don't want to pay.  The jemalloc
developers have concerns about the inaccurate accounting, so this post
drops the accounting patches temporarily.  The info exported to
/proc/pid/smaps for MADV_FREE pages is kept, which is the only place we
can get accurate accounting right now.

This patch (of 6):

Johannes pointed out TTU_LZFREE is unnecessary.  It's true because we
always have the flag set if we want to do an unmap.  For cases we don't
do an unmap, the TTU_LZFREE part of code should never run.

Also, TTU_UNMAP is unnecessary.  If no other flags are set (for example,
TTU_MIGRATION), an unmap is implied.

The patch includes Johannes's cleanup and the removal of the dead
TTU_ACTION macro.

Link: http://lkml.kernel.org/r/4be3ea1bc56b26fd98a54d0a6f70bec63f6d8980.1487965799.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Geliang Tang 0a372d09cc mm/page-writeback.c: use setup_deferrable_timer
Use setup_deferrable_timer() instead of init_timer_deferrable() to
simplify the code.

Link: http://lkml.kernel.org/r/e8e3d4280a34facbc007346f31df833cec28801e.1488070291.git.geliangtang@gmail.com
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Johannes Weiner 491d79ae77 mm: remove unnecessary back-off function when retrying page reclaim
The backoff mechanism is not needed.  If we have MAX_RECLAIM_RETRIES
loops without progress, we'll OOM anyway; backing off might cut one or
two iterations off that in the rare OOM case.  If we have intermittent
success reclaiming a few pages, the backoff function gets reset also,
and so is of little help in these scenarios.

We might want a backoff function for when there IS progress, but not
enough to be satisfactory.  But this isn't that.  Remove it.

Link: http://lkml.kernel.org/r/20170228214007.5621-10-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jia He <hejianet@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Johannes Weiner 3db65812d6 Revert "mm, vmscan: account for skipped pages as a partial scan"
This reverts commit d7f05528ee.

Now that reclaimability of a node is no longer based on the ratio
between pages scanned and theoretically reclaimable pages, we can remove
accounting tricks for pages skipped due to zone constraints.

Link: http://lkml.kernel.org/r/20170228214007.5621-9-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jia He <hejianet@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Johannes Weiner c822f6223d mm: delete NR_PAGES_SCANNED and pgdat_reclaimable()
NR_PAGES_SCANNED counts number of pages scanned since the last page free
event in the allocator.  This was used primarily to measure the
reclaimability of zones and nodes, and determine when reclaim should
give up on them.  In that role, it has been replaced in the preceding
patches by a different mechanism.

Being implemented as an efficient vmstat counter, it was automatically
exported to userspace as well.  It's however unlikely that anyone
outside the kernel is using this counter in any meaningful way.

Remove the counter and the unused pgdat_reclaimable().

Link: http://lkml.kernel.org/r/20170228214007.5621-8-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jia He <hejianet@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Johannes Weiner 688035f729 mm: don't avoid high-priority reclaim on memcg limit reclaim
Commit 246e87a939 ("memcg: fix get_scan_count() for small targets")
sought to avoid high reclaim priorities for memcg by forcing it to scan
a minimum amount of pages when lru_pages >> priority yielded nothing.
This was done at a time when reclaim decisions like dirty throttling
were tied to the priority level.
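
To see why small targets yield nothing at the starting priority, consider
the shift itself.  The arithmetic below is a Python illustration only;
scan_target() and first_productive_priority() are hypothetical helper
names, and DEF_PRIORITY is taken to be 12, the kernel's default.

```python
DEF_PRIORITY = 12  # default starting reclaim priority (assumed value)


def scan_target(lru_pages: int, priority: int) -> int:
    """Number of pages to scan at a given priority: lru_pages >> priority."""
    return lru_pages >> priority


def first_productive_priority(lru_pages: int):
    """Walk priorities downward until the shift yields a nonzero target."""
    for priority in range(DEF_PRIORITY, -1, -1):
        if scan_target(lru_pages, priority) > 0:
            return priority
    return None  # an empty LRU never yields a target
```

For an LRU of 1000 pages the target is zero at priorities 12, 11, and 10,
and becomes 1 only at priority 9; without forced scanning, reclaim simply
drops to that priority.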

Nowadays, the only meaningful thing still tied to priority dropping
below DEF_PRIORITY - 2 is gating whether laptop_mode=1 is generally
allowed to write.  But that is from an era where direct reclaim was
still allowed to call ->writepage, and kswapd nowadays avoids writes
until it's scanned every clean page in the system.  Potential changes to
how quick sc->may_writepage could trigger are of little concern.

Remove the force_scan stuff, as well as the ugly multi-pass target
calculation that it necessitated.

Link: http://lkml.kernel.org/r/20170228214007.5621-7-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jia He <hejianet@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Johannes Weiner a2d7f8e461 mm: don't avoid high-priority reclaim on unreclaimable nodes
Commit 246e87a939 ("memcg: fix get_scan_count() for small targets")
sought to avoid high reclaim priorities for kswapd by forcing it to scan
a minimum amount of pages when lru_pages >> priority yielded nothing.

Commit b95a2f2d48 ("mm: vmscan: convert global reclaim to per-memcg
LRU lists"), due to switching global reclaim to a round-robin scheme
over all cgroups, had to restrict this forceful behavior to
unreclaimable zones in order to prevent massive overreclaim with many
cgroups.

The latter patch effectively neutered the behavior completely for all
but extreme memory pressure.  But in those situations we might as well
drop the reclaimers to lower priority levels.  Remove the check.

Link: http://lkml.kernel.org/r/20170228214007.5621-6-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jia He <hejianet@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:08 -07:00
Johannes Weiner 15038d0de9 mm: remove unnecessary reclaimability check from NUMA balancing target
NUMA balancing already checks the watermarks of the target node to
decide whether it's a suitable balancing target.  Whether the node is
reclaimable or not is irrelevant when we don't intend to reclaim.

Link: http://lkml.kernel.org/r/20170228214007.5621-5-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jia He <hejianet@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Johannes Weiner 047d72c30e mm: remove seemingly spurious reclaimability check from laptop_mode gating
Commit 1d82de618d ("mm, vmscan: make kswapd reclaim in terms of
nodes") allowed laptop_mode=1 to start writing not just when the
priority drops to DEF_PRIORITY - 2 but also when the node is
unreclaimable.

That appears to be a spurious change in this patch as I doubt the series
was tested with laptop_mode, and neither is that particular change
mentioned in the changelog.  Remove it, it's still recent.

Link: http://lkml.kernel.org/r/20170228214007.5621-4-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jia He <hejianet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Johannes Weiner d450abd81b mm: fix check for reclaimable pages in PF_MEMALLOC reclaim throttling
PF_MEMALLOC direct reclaimers get throttled on a node when the sum of
all free pages in each zone falls below half the min watermark.  During
the summation, we want to exclude zones that don't have reclaimable pages.
Checking the same pgdat over and over again doesn't make sense.
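
The intended summation can be sketched as below.  This is a Python model
with assumed names (Zone, pfmemalloc_watermark_ok()), not the kernel
function; the "no reserve means do not throttle" branch is likewise an
assumption made so the sketch is total.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    free_pages: int
    min_watermark: int
    reclaimable_pages: int


def pfmemalloc_watermark_ok(zones) -> bool:
    """Return True if the node has enough free pages to skip throttling."""
    reserve = 0
    free = 0
    for zone in zones:
        # Per-zone check: skip zones that have nothing to reclaim.
        if zone.reclaimable_pages == 0:
            continue
        reserve += zone.min_watermark
        free += zone.free_pages
    if reserve == 0:
        return True  # assumption: nothing to protect, so do not throttle
    return free > reserve // 2
```

With a node-wide check, a zone with many free pages but nothing
reclaimable would still be counted and could mask the shortage in the
zones that matter.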

Fixes: 599d0c954f ("mm, vmscan: move LRU lists to node")
Link: http://lkml.kernel.org/r/20170228214007.5621-3-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jia He <hejianet@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Johannes Weiner c73322d098 mm: fix 100% CPU kswapd busyloop on unreclaimable nodes
Patch series "mm: kswapd spinning on unreclaimable nodes - fixes and
cleanups".

Jia reported a scenario in which the kswapd of a node indefinitely spins
at 100% CPU usage.  We have seen similar cases at Facebook.

The kernel's current method of judging its ability to reclaim a node (or
whether to back off and sleep) is based on the amount of scanned pages
in proportion to the amount of reclaimable pages.  In Jia's and our
scenarios, there are no reclaimable pages in the node, however, and the
condition for backing off is never met.  Kswapd busyloops in an attempt
to restore the watermarks while having nothing to work with.

This series reworks the definition of an unreclaimable node based not on
scanning but on whether kswapd is able to actually reclaim pages in
MAX_RECLAIM_RETRIES (16) consecutive runs.  This is the same criterion
the page allocator uses for giving up on direct reclaim and invoking the
OOM killer.  If it cannot free any pages, kswapd will go to sleep and
leave further attempts to direct reclaim invocations, which will either
make progress and re-enable kswapd, or invoke the OOM killer.

Patch #1 fixes the immediate problem Jia reported, the remainder are
smaller fixlets, cleanups, and overall phasing out of the old method.

Patch #6 is the odd one out.  It's a nice cleanup to get_scan_count(),
and directly related to #5, but in itself not relevant to the series.

If the whole series is too ambitious for 4.11, I would consider the
first three patches fixes, the rest cleanups.

This patch (of 9):

Jia He reports a problem with kswapd spinning at 100% CPU when
requesting more hugepages than memory available in the system:

$ echo 4000 >/proc/sys/vm/nr_hugepages

top - 13:42:59 up  3:37,  1 user,  load average: 1.09, 1.03, 1.01
Tasks:   1 total,   1 running,   0 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us, 12.5 sy,  0.0 ni, 85.5 id,  2.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  31371520 total, 30915136 used,   456384 free,      320 buffers
KiB Swap:  6284224 total,   115712 used,  6168512 free.    48192 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   76 root      20   0       0      0      0 R 100.0 0.000 217:17.29 kswapd3

At that time, there are no reclaimable pages left in the node, but as
kswapd fails to restore the high watermarks it refuses to go to sleep.

Kswapd needs to back away from nodes that fail to balance.  Up until
commit 1d82de618d ("mm, vmscan: make kswapd reclaim in terms of
nodes") kswapd had such a mechanism.  It considered zones whose
theoretically reclaimable pages it had reclaimed six times over as
unreclaimable and backed away from them.  This guard was erroneously
removed as the patch changed the definition of a balanced node.

However, simply restoring this code wouldn't help in the case reported
here: there *are* no reclaimable pages that could be scanned until the
threshold is met.  Kswapd would stay awake anyway.

Introduce a new and much simpler way of backing off.  If kswapd runs
through MAX_RECLAIM_RETRIES (16) cycles without reclaiming a single
page, make it back off from the node.  This is the same number of shots
direct reclaim takes before declaring OOM.  Kswapd will go to sleep on
that node until a direct reclaimer manages to reclaim some pages, thus
proving the node reclaimable again.
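
The back-off rule can be modeled as a per-node failure counter.  The
following is a Python sketch under stated assumptions, not the kernel
code: the Node class and its method names are hypothetical, and the model
assumes that any successful reclaim resets the counter.

```python
MAX_RECLAIM_RETRIES = 16  # value given in this message


class Node:
    def __init__(self):
        self.kswapd_failures = 0

    def kswapd_should_run(self) -> bool:
        """kswapd stays off the node once it has failed too often."""
        return self.kswapd_failures < MAX_RECLAIM_RETRIES

    def note_reclaim(self, nr_reclaimed: int, from_kswapd: bool) -> None:
        """Record the outcome of one reclaim run on this node."""
        if nr_reclaimed > 0:
            # Any progress proves the node reclaimable again.
            self.kswapd_failures = 0
        elif from_kswapd:
            self.kswapd_failures += 1
```

In this model, sixteen consecutive fruitless kswapd runs put kswapd to
sleep on the node, and a single direct reclaimer that frees pages wakes
it up again.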

[hannes@cmpxchg.org: check kswapd failure against the cumulative nr_reclaimed count]
  Link: http://lkml.kernel.org/r/20170306162410.GB2090@cmpxchg.org
[shakeelb@google.com: fix condition for throttle_direct_reclaim]
  Link: http://lkml.kernel.org/r/20170314183228.20152-1-shakeelb@google.com
Link: http://lkml.kernel.org/r/20170228214007.5621-2-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Jia He <hejianet@gmail.com>
Tested-by: Jia He <hejianet@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Greg Thelen a87c75fbcc slab: avoid IPIs when creating kmem caches
Each slab kmem cache has per cpu array caches.  The array caches are
created when the kmem_cache is created, either via kmem_cache_create()
or lazily when the first object is allocated in context of a kmem
enabled memcg.  Array caches are replaced by writing to /proc/slabinfo.

Array caches are protected by holding slab_mutex or disabling
interrupts.  Array cache allocation and replacement is done by
__do_tune_cpucache() which holds slab_mutex and calls
kick_all_cpus_sync() to interrupt all remote processors which confirms
there are no references to the old array caches.

IPIs are needed when replacing array caches.  But when creating a new
array cache, there's no need to send IPIs because there cannot be any
references to the new cache.  Outside of memcg kmem accounting these
IPIs occur at boot time, so they're not a problem.  But with memcg kmem
accounting each container can create kmem caches, so the IPIs are
wasteful.

Avoid unnecessary IPIs when creating array caches.
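
The idea reduces to synchronizing remote CPUs only when an old array
cache existed.  A minimal Python model follows; install_cpu_cache(), the
dict-based cache, and the kick_all_cpus callback are illustrative
assumptions rather than the slab code itself.

```python
def install_cpu_cache(cache: dict, new_arrays: list, kick_all_cpus) -> None:
    """Install new per-cpu array caches, syncing CPUs only on replacement."""
    prev = cache.get("cpu_cache")
    cache["cpu_cache"] = new_arrays
    # A freshly created cache has no old arrays that a remote CPU could
    # still reference, so there is nothing to synchronize against.
    if prev is not None:
        kick_all_cpus()
```

Creating a cache calls this once with no previous arrays and sends no
IPIs; retuning an existing cache still pays for the synchronization.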

Test which reports the IPI count of allocating slab in 10000 memcg:

	import os

	def ipi_count():
		with open("/proc/interrupts") as f:
			for l in f:
				if 'Function call interrupts' in l:
					return int(l.split()[1])

	def echo(val, path):
		with open(path, "w") as f:
			f.write(val)

	n = 10000
	os.chdir("/mnt/cgroup/memory")
	pid = str(os.getpid())
	a = ipi_count()
	for i in range(n):
		os.mkdir(str(i))
		echo("1G\n", "%d/memory.limit_in_bytes" % i)
		echo("1G\n", "%d/memory.kmem.limit_in_bytes" % i)
		echo(pid, "%d/cgroup.procs" % i)
		open("/tmp/x", "w").close()
		os.unlink("/tmp/x")
	b = ipi_count()
	print("%d loops: %d => %d (+%d ipis)" % (n, a, b, b-a))
	echo(pid, "cgroup.procs")
	for i in range(n):
		os.rmdir(str(i))

patched:   10000 loops: 1069 => 1170 (+101 ipis)
unpatched: 10000 loops: 1192 => 48933 (+47741 ipis)

Link: http://lkml.kernel.org/r/20170416214544.109476-1-gthelen@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Geliang Tang d47736fafe fs/ocfs2/cluster: use offset_in_page() macro
Use offset_in_page() macro instead of open-coding.
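
For reference, the open-coded form being replaced is typically a mask of
the address with the low page bits.  The Python arithmetic below only
illustrates that computation; PAGE_SIZE of 4096 is an assumption, since
the real value is architecture dependent.

```python
PAGE_SIZE = 4096               # assumed; architecture dependent in the kernel
PAGE_MASK = ~(PAGE_SIZE - 1)   # clears the offset bits of an address


def offset_in_page(addr: int) -> int:
    """Byte offset of addr within its page, i.e. addr & ~PAGE_MASK."""
    return addr & ~PAGE_MASK
```

With these values, address 0x12345 lies 0x345 bytes into its page.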

Link: http://lkml.kernel.org/r/4dbc77ccaaed98b183cf4dba58a4fa325fd65048.1492758503.git.geliangtang@gmail.com
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Junxiao Bi 33496c3c3d ocfs2: o2hb: revert hb threshold to keep compatible
Configfs is the interface ocfs2-tools uses to pass configuration to the
kernel, and $configfs_dir/cluster/$clustername/heartbeat/dead_threshold is
the one used to configure the heartbeat dead threshold.  The kernel has a
default value for it, but the user can set O2CB_HEARTBEAT_THRESHOLD in
/etc/sysconfig/o2cb to override it.

Commit 45b997737a ("ocfs2/cluster: use per-attribute show and store
methods") changed heartbeat dead threshold name while ocfs2-tools did
not, so ocfs2-tools won't set this configurable and the default value is
always used.  So revert it.

Fixes: 45b997737a ("ocfs2/cluster: use per-attribute show and store methods")
Link: http://lkml.kernel.org/r/1490665245-15374-1-git-send-email-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Geliang Tang 667b8a37f3 fs/ocfs2/cluster: use setup_timer
Use setup_timer() instead of init_timer() to simplify the code.

Link: http://lkml.kernel.org/r/5e75bf07beb91e092d5aa36c36769949a480456a.1489060564.git.geliangtang@gmail.com
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Masahiro Yamada accce8e7e8 blackfin: bf609: let clk_disable() return immediately if clk is NULL
In many of clk_disable() implementations, it is a no-op for a NULL
pointer input, but this is one of the exceptions.

Making it treewide consistent will allow clock consumers to call
clk_disable() without NULL pointer check.

Link: http://lkml.kernel.org/r/1490692624-11931-4-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Steven Miao <realmz6@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Colin Ian King 672934d247 scripts/spelling.txt: add several more common spelling mistakes
Here are some of the more common spelling mistakes that I've found while
fixing up spelling mistakes in kernel error message text.  They probably
should be added to this list so we don't keep on seeing them appearing
again.

Link: http://lkml.kernel.org/r/20170421122534.5378-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Joe Perches <joe@perches.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Pankaj Gupta 6a5cd60ba8 lib/dma-debug.c: make locking work for RT
Interrupt enable/disabled with spinlock is not a valid operation for RT
as it can make executing tasks sleep from a non-sleepable context.  So
convert it to spin_lock_irq[save, restore].

Link: http://lkml.kernel.org/r/1492065666-3816-1-git-send-email-pagupta@redhat.com
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:07 -07:00
Darrick J. Wong fe0be23e68 xfs: reserve enough blocks to handle btree splits when remapping
In xfs_reflink_end_cow, we erroneously reserve only enough blocks to
handle adding 1 extent.  This is problematic if we fragment free space,
have to do CoW, and then have to perform multiple bmap btree expansions.
Furthermore, the BUI recovery routine doesn't reserve /any/ blocks to
handle btree splits, so log recovery fails after our first error causes
the filesystem to go down.

Therefore, refactor the transaction block reservation macros until we
have a macro that works for our deferred (re)mapping activities, and fix
both problems by using that macro.

With 1k blocks we can hit this fairly often in g/187 if the scratch fs
is big enough.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-05-03 13:21:40 -07:00
Linus Torvalds 1684096b1e Updates for 4.12 kernel merge window
- idr usage and locking changes
 - build fix for hns
 - ipoib debug path record file fix
 - hfi1 updates
 - core RDMA netdev addition
 - Intel VNIC driver addition
 - Enhanced accelerators for IPoIB addition
 - Debug cleanups in cxgb3/4
 - Trivial cleanups from SF Markus Elfring
 - Misc rxe fixes from Mellanox
 - Misc ipoib fixes from Mellanox
 - Lots of mlx4/mlx5 changes from Mellanox
 - Misc fixes across the RDMA subsystem
 - ODP paging fixes and improvements
 - qedr updates
 - hfi1 updates
 - OPA port info patches
 - OPA AH patches
 - OPA SA Query patches

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "More exhaustive description of primary updates in this release:

   - Lots of driver fixes and misc fixes across the board.

   - I had to base on a net-next tree because the IPoIB Accelerator
     patches needed it.

     Unfortunately, it was known to Mellanox that there would need to be
     an IPoIB accelerator patch to the net tree (which left some
     functions turned off by an #ifdef construct to avoid warnings about
     defined but unused functions), then one to the RDMA tree, then a
     fixup that went back and re-enabled the functions in the net tree
     and enabled their use in the rdma tree.

     Also, a sparse fix was sent to the net tree after I did my pull,
     and the fixup patch conflicts quite directly with that sparse fix,
     so I'm going to submit the fixup patch towards the end of the merge
     window by itself and based upon your master branch at the time.

   - Two separate rounds of hfi1 fixes, one that got dropped from last
     release because it came in just a day or two before the end of the
     merge window and then the one from this release cycle.

     Of note is that I now have a third series that just landed from
     Intel yesterday. It is not included in this pull request, but I may
     submit it by the end of the week. I'll talk to Intel about
     improving the timing of their submissions for my workflow.

   - Changes to our idr usage in the RDMA subsystem that will tie into
     our cgroup management and also into the upcoming changes for the
     RDMA kernel<->userspace API.

   - Addition of support for a netdev to be tied to an RDMA device at
     the core level

   - Addition of the VNIC driver from Intel.

     While IPoIB provides IP over InfiniBand (and *only* IP, no lower
     layer protocol headers are allowed or supported), the VNIC driver
     presents a virtual Ethernet device with support for things like
     varying Ethertypes, VLANs, priorities and other features of
     Ethernet.

     The virtual devices are centrally managed by the OPA fabric
     manager, making this (for the time being) a strictly OPA specific
     feature.

   - Improvements to the On-Demand Paging support in the RDMA subsystem.

   - Addition of three significant OPA changes.

     While we added OPA support some time ago (via the hfi1 driver), the
     RDMA subsystem has so far glossed over the areas where OPA and
     InfiniBand differ.

     With this release we are starting to add support for the OPA
     extensions into the RDMA core in the following areas: Extended port
     information for OPA is now supported, extended Address Handle
     attributes for OPA are now supported, and extended SA Queries to
     get OPA specific subnet information are now supported.

  Concise summary from the tag:
   - idr usage and locking changes
   - build fix for hns
   - ipoib debug path record file fix
   - hfi1 updates
   - core RDMA netdev addition
   - Intel VNIC driver addition
   - Enhanced accelerators for IPoIB addition
   - Debug cleanups in cxgb3/4
   - Trivial cleanups from SF Markus Elfring
   - Misc rxe fixes from Mellanox
   - Misc ipoib fixes from Mellanox
   - Lots of mlx4/mlx5 changes from Mellanox
   - Misc fixes across the RDMA subsystem
   - ODP paging fixes and improvements
   - qedr updates
   - hfi1 updates
   - OPA port info patches
   - OPA AH patches
   - OPA SA Query patches"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (191 commits)
  infiniband: avoid dereferencing uninitialized dst on error path
  IB/SA: Add OPA addr header
  IB/mlx5: Add port_xmit_wait to counter registers read
  IB/ocrdma: fix out of bounds access to local buffer
  IB/mlx4: Fix incorrect order of formal and actual parameters
  IB/mlx4: Change flush logic so it adheres to the variable name
  mlx5: Fix mlx5_ib_map_mr_sg mr length
  IB/rxe: Don't clamp residual length to mtu
  IB/SA: Add support to query OPA path records
  IB/SA: Add OPA path record type
  IB/SA: Split struct sa_path_rec based on IB and ROCE specific fields
  IB/SA: Introduce path record specific types
  IB/SA: Rename ib_sa_path_rec to sa_path_rec
  IB/CM: Add braces when using sizeof
  IB/core: Define 'opa' rdma_ah_attr type
  IB/core: Define 'ib' and 'roce' rdma_ah_attr types
  IB/core: Use rdma_ah_attr accessor functions
  IB/core: Add accessor functions for rdma_ah_attr fields
  IB/PVRDMA: Rename ib_ah_attr related functions
  IB/mthca: Rename to_ib_ah_attr to to_rdma_ah_attr
  ...
2017-05-03 12:45:55 -07:00
Linus Torvalds 16a12fa9ae Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem updates from Dmitry Torokhov:

 - a big update from Mauro converting input documentation to ReST format

 - Synaptics PS/2 is now aware of SMBus companion devices, which means
   that we can now use native RMI4 protocol to handle touchpads, instead
   of relying on legacy PS/2 mode.

 - we removed support for the BMA180 accelerometer from input devices as
   it is now handled properly by IIO

 - update to TSC2007 to correctly report pressure

 - other miscellaneous driver fixes.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (152 commits)
  Input: ar1021_i2c - use BIT to check for a bit
  Input: twl4030-pwrbutton - use input_set_capability() helper
  Input: twl4030-pwrbutton - use correct device for irq request
  Input: ar1021_i2c - enable touch mode during open
  Input: add uinput documentation
  dt-bindings: input: add bindings document for ar1021_i2c driver
  dt-bindings: input: rotary-encoder: fix typo
  Input: xen-kbdfront - add module parameter for setting resolution
  ARM: pxa/raumfeld: fix compile error in rotary controller resources
  Input: xpad - do not suggest writing to Dominic
  Input: xpad - don't use literal blocks inside footnotes
  Input: xpad - note that usb/devices is now at /sys/kernel/debug/
  Input: docs - freshen up introduction
  Input: docs - split input docs into kernel- and user-facing
  Input: docs - note that MT-A protocol is obsolete
  Input: docs - update joystick documentation a bit
  Input: docs - remove disclaimer/GPL notice
  Input: fix "Game console" heading level in joystick documentation
  Input: rotary-encoder - remove references to platform data from docs
  Input: move documentation for Amiga CD32
  ...
2017-05-03 12:38:20 -07:00
Linus Torvalds d25e436c4b spi: Updates for v4.12
There's quite a lot of small driver specific fixes and enhancements in
 this release but the main activity has been around the loopback and
 spidev test drivers which is good to see as it should hopefully help
 improve the quality of all the drivers as people start to make use of
 the new code:
 
  - Additional tests in the loopback test driver for vmalloc()
    compatibility and around delays together with fixes for existing
    tests.
  - Support for testing continuous data transfer for use in soak testing.
  - Device property support for board info platforms.
  - Support for registering empty sets of devices via board info (useful
    when writing code to enumerate hardware automatically).

Merge tag 'spi-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi updates from Mark Brown:
 "There's quite a lot of small driver specific fixes and enhancements in
  this release but the main activity has been around the loopback and
  spidev test drivers which is good to see as it should hopefully help
  improve the quality of all the drivers as people start to make use of
  the new code:

   - Additional tests in the loopback test driver for vmalloc()
     compatibility and around delays together with fixes for existing
     tests.

   - Support for testing continuous data transfer for use in soak
     testing.

   - Device property support for board info platforms.

   - Support for registering empty sets of devices via board info
     (useful when writing code to enumerate hardware automatically)"

* tag 'spi-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (52 commits)
  spi: cadence: Allow for GPIO pins to be used as chipselects
  spi-imx: Implements handling of the SPI_READY mode flag.
  spi: tegra: fix spelling mistake: "trasfer" -> "transfer"
  spi: spi-ti-qspi: Use bounce buffer if read buffer is not DMA'ble
  spi: Add can_dma like interface for spi_flash_read
  spi: dw: Disable clock after unregistering the host
  spi: double time out tolerance
  spi: atmel: add deepest PM support to SAMA5D2
  spi: atmel: factorize reusable code for SPI controller init
  spi: orion: add LSB support
  spi: pl022: don't use uninitialized variable
  spi: loopback-test: fix spelling mistake: "minimam" -> "minimum"
  spi: dynamycally allocated message initialization
  spi: spi-ti-qspi: Remove unused dma_dev variable
  spi: omap2-mcspi: poll OMAP2_MCSPI_CHSTAT_RXS for PIO transfer
  spi: spi-ti-qspi: Use dma_engine wrapper for dma memcpy call
  spi: spidev_test: add option to continuously transfer data
  spi: loopback-test: fix potential integer overflow on multiple
  spi: sun6i: update max transfer size reported
  spi: pl022: Document property values
  ...
2017-05-03 12:31:39 -07:00
Linus Torvalds a90f0e9ebb regulator: Updates for v4.12
Quite a lot going on with the regulator API for this release, much more
 in the core than in the drivers for a change:
 
  - Fixes for voltage change propagation through dumb power switches.
  - A notification when regulators are enabled.
  - A new settling time property for regulators where the time taken to
    move to a new voltage is not related to the size of the change.
  - Some reorganization of the Arizona drivers in preparation for sharing
    the code with the next generation devices they've been integrated
    with.
  - Support for newer Freescale chips in the Anatop regulator.
  - A new driver for voltage controlled regulators to cope with some
    exciting ChromeOS hardware designs.
  - Support for Rohm BD9571MWV-M and TI TPS65132.

Merge tag 'regulator-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator updates from Mark Brown:
 "Quite a lot going on with the regulator API for this release, much
  more in the core than in the drivers for a change:

   - Fixes for voltage change propagation through dumb power switches.

   - A notification when regulators are enabled.

   - A new settling time property for regulators where the time taken to
     move to a new voltage is not related to the size of the change.

   - Some reorganization of the Arizona drivers in preparation for
     sharing the code with the next generation devices they've been
     integrated with.

   - Support for newer Freescale chips in the Anatop regulator.

   - A new driver for voltage controlled regulators to cope with some
     exciting ChromeOS hardware designs.

   - Support for Rohm BD9571MWV-M and TI TPS65132"

* tag 'regulator-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (51 commits)
  regulator: Add ROHM BD9571MWV-M PMIC regulator driver
  regulator: arizona-ldo1: Factor out generic initialization
  regulator: arizona-ldo1: Make arizona_ldo1 independent of struct arizona
  regulator: arizona-ldo1: Move pdata into a separate structure
  regulator: arizona-micsupp: Factor out generic initialization
  regulator: arizona-micsupp: Make arizona_micsupp independent of struct arizona
  regulator: arizona-micsupp: Move pdata into a separate structure
  regulator: arizona: Split KConfig options for LDO1 and MICSUPP regulators
  regulator: anatop: make regulator name property required
  regulator: tps65023: Fix inverted core enable logic.
  regulator: anatop: make sure regulator name is properly defined
  regulator: core: Allow dummy regulators for supplies
  regulator: core: Only propagate voltage changes to if it can change voltages
  regulator: vctrl: Fix out of bounds array access for vctrl->vtable
  regulator: tps65132: fix platform_no_drv_owner.cocci warnings
  regulator: tps65132: Fix off-by-one for .max_register setting
  regulator: anatop: set default voltage selector for pcie
  regulator: tps65132: add device-tree binding
  regulator: tps65132: add regulator driver for TI TPS65132
  regulator: anatop: remove unneeded name field of struct anatop_regulator
  ...
2017-05-03 12:27:53 -07:00
Linus Torvalds 14b730723a Merge branch 'i2c/for-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
 "I2C has the following updates for you:

   - an immutable cross-subsystem branch fixing PMIC access on Intel
     Baytrail

   - bigger driver updates to the designware, meson, exynos5 drivers

   - new i2c_acpi_new_device() function to create devices from ACPI

   - struct i2c_driver now has a flag 'disable_i2c_core_irq_mapping' to
     allow custom IRQ mapping in case the default does not fit

   - mux subsystem centralized error messages in its core

   - new driver for ltc4306 i2c mux

   - usual set of small updates"

* 'i2c/for-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (44 commits)
  i2c: thunderx: Enable HWMON class probing
  i2c: rcar: clarify PM handling with more comments
  i2c: rcar: fix resume by always initializing registers before transfer
  i2c: tegra: fix spelling mistake: "contoller" -> "controller"
  i2c: exynos5: use core helper to get driver data
  i2c: exynos5: de-duplicate error logs on clock setup
  i2c: exynos5: simplify clock frequency handling
  i2c: exynos5: simplify timings calculation
  i2c: designware-baytrail: fix potential null pointer dereference on dev
  i2c: designware: Get selected speed mode sda-hold-time via ACPI
  [media] cx231xx: stop double error reporting
  i2c: core: Allow drivers to disable i2c-core irq mapping
  i2c: core: Add new i2c_acpi_new_device helper function
  i2c: core: Allow getting ACPI info by index
  i2c: img-scb: use setup_timer
  i2c: i2c-scmi: add a MS HID
  i2c: mux: ltc4306: LTC4306 and LTC4305 I2C multiplexer/switch
  dt-bindings: i2c: mux: ltc4306: Add dt-bindings for I2C multiplexer/switch
  i2c: mux: reg: stop double error reporting
  i2c: mux: pinctrl: stop double error reporting
  ...
2017-05-03 12:18:47 -07:00
Linus Torvalds d26f552ebb - New Drivers
    - Freescale MXS Low Resolution ADC
    - Freescale i.MX23/i.MX28 LRADC touchscreen
    - Motorola CPCAP Power Button
    - TI LMU (Lighting Management Unit)
    - Atmel SMC (Static Memory Controller)
 
  - New Device Support
    - Add support for X-Powers AXP803 to axp20x
    - Add support for Dialog Semi DA9061 to da9062-core
    - Add support for Intel Cougar Mountain to lpc_ich
    - Add support for Intel Gemini Lake to lpc_ich
 
  - New Functionality
    - Add Device Tree support; wm831x-*, axp20x, ti-lmu, da9062, sun4i-gpadc
    - Add IRQ sense support; motorola-cpcap
    - Add ACPI support; cros_ec
    - Add Reset support; altera-a10sr
    - Add ADC support; axp20x
    - Add AC Power support; axp20x
    - Add Runtime PM support; atmel-ebi, exynos-lpass
    - Add Battery Power Supply support; axp20x
    - Add Clock support; exynos-lpass, hi655x-pmic
 
  - Fix-ups
    - Implicitly specify required headers; motorola-cpcap, intel_soc_pmic_bxtwc
    - Add .remove() method; stm32-timers, exynos-lpass
    - Remove unused code; intel_soc_pmic_core, intel-lpss-acpi, ipaq-micro, atmel-smc, menelaus
    - Rename variables for clarity; axp20x
    - Convert pr_warning() to pr_warn(); db8500-prcmu, sta2x11-mfd, twl4030-power
    - Improve formatting; arizona-core, axp20x
    - Use raw_spinlock_*() variants; asic3, t7l66xb, tc6393xb
    - Simplify/refactor code; arizona-core, atmel-ebi
    - Improve error checking; intel_soc_pmic_core
 
  - Bug Fixes
    - Ensure OMAP3630/3730 boards can successfully reboot; twl4030-power
    - Correct max-register value; stm32-timers
    - Extend timeout to account for clock stretching; cros_ec_spi
    - Use correct IRQ trigger type; motorola-cpcap
    - Fix bad use of IRQ sense register; motorola-cpcap
    - Logic error "||" should be "&&"; mxs-lradc-ts

Merge tag 'mfd-next-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "New Drivers:
   - Freescale MXS Low Resolution ADC
   - Freescale i.MX23/i.MX28 LRADC touchscreen
   - Motorola CPCAP Power Button
   - TI LMU (Lighting Management Unit)
   - Atmel SMC (Static Memory Controller)

  New Device Support:
   - Add support for X-Powers AXP803 to axp20x
   - Add support for Dialog Semi DA9061 to da9062-core
   - Add support for Intel Cougar Mountain to lpc_ich
   - Add support for Intel Gemini Lake to lpc_ich

  New Functionality:
   - Add Device Tree support; wm831x-*, axp20x, ti-lmu, da9062, sun4i-gpadc
   - Add IRQ sense support; motorola-cpcap
   - Add ACPI support; cros_ec
   - Add Reset support; altera-a10sr
   - Add ADC support; axp20x
   - Add AC Power support; axp20x
   - Add Runtime PM support; atmel-ebi, exynos-lpass
   - Add Battery Power Supply support; axp20x
   - Add Clock support; exynos-lpass, hi655x-pmic

  Fix-ups:
   - Implicitly specify required headers; motorola-cpcap, intel_soc_pmic_bxtwc
   - Add .remove() method; stm32-timers, exynos-lpass
   - Remove unused code; intel_soc_pmic_core, intel-lpss-acpi, ipaq-micro, atmel-smc, menelaus
   - Rename variables for clarity; axp20x
   - Convert pr_warning() to pr_warn(); db8500-prcmu, sta2x11-mfd, twl4030-power
   - Improve formatting; arizona-core, axp20x
   - Use raw_spinlock_*() variants; asic3, t7l66xb, tc6393xb
   - Simplify/refactor code; arizona-core, atmel-ebi
   - Improve error checking; intel_soc_pmic_core

  Bug Fixes:
   - Ensure OMAP3630/3730 boards can successfully reboot; twl4030-power
   - Correct max-register value; stm32-timers
   - Extend timeout to account for clock stretching; cros_ec_spi
   - Use correct IRQ trigger type; motorola-cpcap
   - Fix bad use of IRQ sense register; motorola-cpcap
   - Logic error "||" should be "&&"; mxs-lradc-ts"

* tag 'mfd-next-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (79 commits)
  input: touchscreen: mxs-lradc: || vs && typos
  dt-bindings: Add AXP803's regulator info
  mfd: axp20x: Support AXP803 variant
  dt-bindings: Add device tree binding for X-Powers AXP803 PMIC
  dt-bindings: Make AXP20X compatible strings one per line
  mfd: intel_soc_pmic_core: Fix unchecked return value
  mfd: menelaus: Remove obsolete local_irq_disable() and local_irq_enable()
  mfd: omap-usb-tll: Configure ULPIAUTOIDLE
  mfd: omap-usb-tll: Fix inverted bit use for USB TLL mode
  mfd: palmas: Fixed spelling mistake in error message
  mfd: lpc_ich: Add support for Intel Gemini Lake SoC
  mfd: hi655x: Add the clock cell to provide WiFi and Bluetooth
  mfd: intel_soc_pmic: Fix a mess with compilation units
  mfd: exynos-lpass: Add runtime PM support
  mfd: exynos-lpass: Add missing remove() function
  mfd: exynos-lpass: Add support for clocks
  mfd: exynos-lpass: Remove pad retention control
  iio: adc: add support for X-Powers AXP20X and AXP22X PMICs ADCs
  mfd: cpcap: Fix bad use of IRQ sense register
  mfd: cpcap: Use ack_invert interrupts
  ...
2017-05-03 12:16:25 -07:00
Linus Torvalds e897f267c5 - New Drivers
    - Arctic Sand ARC2C0608 LED Backlight

Merge tag 'backlight-next-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight update from Lee Jones:
 "New Arctic Sand ARC2C0608 LED Backlight driver"

* tag 'backlight-next-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: Add support for Arctic Sand LED backlight driver chips
  dt-bindings: backlight: arcxcnn: Supply bindings for Arctic Sand backlight
2017-05-03 12:11:44 -07:00
Linus Torvalds 221656e7c4 sound updates for 4.12-rc1
It was a relatively calm development cycle, and no scary changes are
 seen on either the core or driver side.  Here are some highlights:
 
 ASoC:
 - A new API for hooking up jacks more generically and easily
 - Card longname is set based on DMI for a unique UCM profile
 - Lots of Intel driver fixes: Atom, Broxton, Skylake and newer chips
 - New drivers for Cirrus CS35L35, DIO DIO2125, Everest ES7132,
   HiSilicon hi6210, Maxim MAX98927, MT2701 systems with WM8960, Nuvoton
   NAU8824, Odroid systems, ST STM32 SAI controllers and x86 systems with
   DA7213
 
 HD-audio:
 - Many new quirks to support headset for various devices (mostly ASUS
   ones) as usual
 - Support for dual codecs on some Gigabyte mobos and Lenovo laptop
 - Improvement on PCM position reporting for Skylake and newer
 
 FireWire:
 - New drivers for MOTU and RME Fireface series
 - Updates for Digidesign Digi00x and TASCAM series
 - Support for tracepoints
 
 Others:
 - USB-audio: improved support for quirk_alias option
 - Cleanups, constification all over the place

Merge tag 'sound-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "It was a relatively calm development cycle, and no scary changes are
  seen on either the core or driver side. Here are some highlights:

  ASoC:
   - A new API for hooking up jacks more generically and easily

   - Card longname is set based on DMI for a unique UCM profile

   - Lots of Intel driver fixes: Atom, Broxton, Skylake and newer chips

   - New drivers for Cirrus CS35L35, DIO DIO2125, Everest ES7132,
     HiSilicon hi6210, Maxim MAX98927, MT2701 systems with WM8960,
     Nuvoton NAU8824, Odroid systems, ST STM32 SAI controllers and x86
     systems with DA7213

  HD-audio:
   - Many new quirks to support headset for various devices (mostly ASUS
     ones) as usual

   - Support for dual codecs on some Gigabyte mobos and Lenovo laptop

   - Improvement on PCM position reporting for Skylake and newer

  FireWire:
   - New drivers for MOTU and RME Fireface series

   - Updates for Digidesign Digi00x and TASCAM series

   - Support for tracepoints

  Others:
   - USB-audio: improved support for quirk_alias option

   - Cleanups, constification all over the place"

* tag 'sound-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (299 commits)
  ASoC: codec: wm8960: Relax bit clock computation when using PLL
  ASoC: codec: wm9860: avoid maybe-uninitialized warning
  ASoC: nau8824: leave Class D gain at chip default
  ASoC: nau8824: rename controls to match DAPM controls
  ASoC: Intel: Skylake: Return negative error code
  ASoC: Intel: Skylake: Fix unused variable warning
  ASoC: Intel: Skylake: fix uninitialized pointer use
  ASoC: sti: Fix error handling if of_clk_get() fails
  ASoC: cs4271: configure reset GPIO as output
  ASoC: dwc: Disallow building designware_pcm as a module
  ALSA: ali5451: fix spelling mistake in "ali_capture_preapre"
  ASoC: stm32: add SAI driver
  ASoC: stm32: add bindings for SAI
  ASoC: Intel: Skylake: Add loadable module support on KBL platform
  ASoC: Intel: Skylake: Modify load_lib_ipc arguments for a nowait version
  ASoC: Intel: Skylake: Register dsp_fw_ops for kabylake
  ASoC: Intel: Skylake: Modify arguments to reuse module transfer function
  ASoC: Intel: Skylake: Commonize library load
  ASoC: Intel: Skylake: Move sst common initialization to a helper function
  ASoC: nau8824: new driver
  ...
2017-05-03 11:58:59 -07:00
Linus Torvalds 2f34c1231b main drm pull request for 4.12 kernel

Merge tag 'drm-for-v4.12' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
 "This is the main drm pull request for v4.12. Apart from two fixes
  pulls, everything should have been in drm-next for at least 2 weeks.

  The biggest thing in here is AMD released the public headers for their
  upcoming VEGA GPUs. These as always are quite a sizeable chunk of
  header files. They've also added initial non-display support for those
  GPUs, though they aren't available in production yet.

  Otherwise it's pretty much normal.

  New bridge drivers:
   - megachips-stdpxxxx-ge-b850v3-fw LVDS->DP++
   - generic LVDS bridge support.

  Core:
   - Displayport link train failure reporting to userspace
   - debugfs interface cleaned up
   - subsystem TODO in kerneldoc now
   - Extended fbdev support (flipping and vblank wait)
   - drm_platform removed
   - EDP CRC support in helper
   - HF-VSDB SCDC support in EDID parser
   - Lots of code cleanups and header extraction
   - Thunderbolt external GPU awareness
   - Atomic helper improvements
   - Documentation improvements

  panel:
   - Sitronix and Samsung new panel support

  amdgpu:
   - Preliminary vega10 support
   - Multi-level page table support
   - GPU sensor support for userspace
   - PRT support for sparse buffers
   - SR-IOV improvements
   - Non-contig VRAM CPU mapping

  i915:
   - Atomic modesetting enabled by default on Gen5+
   - LSPCON improvements
   - Atomic state handling for cdclk
   - GPU reset improvements
   - In-kernel unit tests
   - Geminilake improvements and color manager support
   - Designware i2c fixes
   - vblank evasion improvements
   - Hotplug safe connector iterators
   - GVT scheduler QoS support
   - GVT Kabylake support

  nouveau:
   - Acceleration support for Pascal (GP10x).
   - Rearchitecture of code handling proprietary signed firmware
   - Fix GTX 970 with odd MMU configuration
   - GP10B support
   - GP107 acceleration support

  vmwgfx:
   - Atomic modesetting support for vmwgfx

  omapdrm:
   - Support for render nodes
   - Refactor omapdss code
   - Fix some probe ordering issues
   - Fix too dark RGB565 rendering

  sunxi:
   - prelim rework for multiple pipes.

  mali-dp:
   - Color management support
   - Plane scaling
   - Power management improvements

  imx-drm:
   - Prefetch Resolve Engine/Gasket on i.MX6QP
   - Deferred plane disabling
   - Separate alpha support

  mediatek:
   - Mediatek SoC MT2701 support

  rcar-du:
   - Gen3 HDMI support

  msm:
   - 4k support for newer chips
   - OPP bindings for gpu
   - prep work for per-process pagetables

  vc4:
   - HDMI audio support
   - fixes

  qxl:
   - minor fixes.

  dw-hdmi:
   - PHY improvements
   - CSC fixes
   - Amlogic GX SoC support"

* tag 'drm-for-v4.12' of git://people.freedesktop.org/~airlied/linux: (1778 commits)
  drm/nouveau/fb/gf100-: Fix 32 bit wraparound in new ram detection
  drm/nouveau/secboot/gm20b: fix the error return code in gm20b_secboot_tegra_read_wpr()
  drm/nouveau/kms: Increase max retries in scanout position queries.
  drm/nouveau/bios/bitP: check that table is long enough for optional pointers
  drm/nouveau/fifo/nv40: no ctxsw for pre-nv44 mpeg engine
  drm: mali-dp: use div_u64 for expensive 64-bit divisions
  drm/i915: Confirm the request is still active before adding it to the await
  drm/i915: Avoid busy-spinning on VLV_GLTC_PW_STATUS mmio
  drm/i915/selftests: Allocate inode/file dynamically
  drm/i915: Fix system hang with EI UP masked on Haswell
  drm/i915: checking for NULL instead of IS_ERR() in mock selftests
  drm/i915: Perform link quality check unconditionally during long pulse
  drm/i915: Fix use after free in lpe_audio_platdev_destroy()
  drm/i915: Use the right mapping_gfp_mask for final shmem allocation
  drm/i915: Make legacy cursor updates more unsynced
  drm/i915: Apply a cond_resched() to the saturated signaler
  drm/i915: Park the signaler before sleeping
  drm: mali-dp: Check the mclk rate and allow up/down scaling
  drm: mali-dp: Enable image enhancement when scaling
  drm: mali-dp: Add plane upscaling support
  ...
2017-05-03 11:44:24 -07:00
Linus Torvalds a3719f34fd Merge branch 'generic' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota, reiserfs, udf and ext2 updates from Jan Kara:
 "The branch contains changes to quota code so that it does not modify
  persistent flags in inode->i_flags (it was the only place in kernel
  doing that) and handle it inside filesystem's quotaon/off handlers
  instead.

  The branch also contains two UDF cleanups, a couple of reiserfs fixes
  and one fix for ext2 quota locking"

* 'generic' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  ext4: Improve comments in ext4_quota_{on|off}()
  udf: use kmap_atomic for memcpy copying
  udf: use octal for permissions
  quota: Remove dquot_quotactl_ops
  reiserfs: Remove i_attrs_to_sd_attrs()
  reiserfs: Remove useless setting of i_flags
  jfs: Remove jfs_get_inode_flags()
  ext2: Remove ext2_get_inode_flags()
  ext4: Remove ext4_get_inode_flags()
  quota: Stop setting IMMUTABLE and NOATIME flags on quota files
  jfs: Set flags on quota files directly
  ext2: Set flags on quota files directly
  reiserfs: Set flags on quota files directly
  ext4: Set flags on quota files directly
  reiserfs: Protect dquot_writeback_dquots() by s_umount semaphore
  reiserfs: Make cancel_old_flush() reliable
  ext2: Call dquot_writeback_dquots() with s_umount held
  reiserfs: avoid a -Wmaybe-uninitialized warning
2017-05-03 11:35:47 -07:00
Gerald Schaefer 1ef97fe4f8 brd: fix uninitialized use of brd->dax_dev
commit 1647b9b9 "brd: add dax_operations support" introduced the allocation
and freeing of a dax_device, but the allocated dax_device is not stored
into the brd_device, so brd_del_one() will eventually operate on an
uninitialized brd->dax_dev.

Fix this by storing the allocated dax_device to brd->dax_dev.
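
The bug pattern can be sketched in plain userspace C. The names below
(toy_brd, toy_dax, toy_brd_alloc, toy_brd_del_one) are hypothetical
stand-ins, not the driver's actual code; the sketch only shows that the
allocated object has to be stored in its owner before a later teardown
dereferences it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for dax_device and brd_device. */
struct toy_dax { int alive; };
struct toy_brd { struct toy_dax *dax_dev; };

/* Allocate the device and its dax object; returns NULL on failure. */
struct toy_brd *toy_brd_alloc(void)
{
    struct toy_brd *brd = calloc(1, sizeof(*brd));
    if (!brd)
        return NULL;

    struct toy_dax *dax = calloc(1, sizeof(*dax));
    if (!dax) {
        free(brd);
        return NULL;
    }
    dax->alive = 1;

    /* The fix: remember the allocation so teardown can find it. */
    brd->dax_dev = dax;
    return brd;
}

/* Teardown relies on brd->dax_dev having been set by the allocator. */
void toy_brd_del_one(struct toy_brd *brd)
{
    assert(brd && brd->dax_dev);
    free(brd->dax_dev);
    free(brd);
}
```

Without the assignment in toy_brd_alloc(), the teardown would see a
pointer that was never set, which is the situation the commit describes.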

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-05-03 11:30:03 -07:00
Linus Torvalds 5133cd7518 Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
 "The branch contains mainly a rework of fsnotify infrastructure fixing
  a shortcoming where we waited for the response to fanotify permission
  events with the SRCU read lock held, so the kernel stalled when the
  process consuming events was slow to respond.

  It also contains several cleanups of unnecessary indirections in
  fsnotify framework and a bugfix from Amir fixing leakage of kernel
  internal errno to userspace"

* 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (37 commits)
  fanotify: don't expose EOPENSTALE to userspace
  fsnotify: remove a stray unlock
  fsnotify: Move ->free_mark callback to fsnotify_ops
  fsnotify: Add group pointer in fsnotify_init_mark()
  fsnotify: Drop inode_mark.c
  fsnotify: Remove fsnotify_find_{inode|vfsmount}_mark()
  fsnotify: Remove fsnotify_detach_group_marks()
  fsnotify: Rename fsnotify_clear_marks_by_group_flags()
  fsnotify: Inline fsnotify_clear_{inode|vfsmount}_mark_group()
  fsnotify: Remove fsnotify_recalc_{inode|vfsmount}_mask()
  fsnotify: Remove fsnotify_set_mark_{,ignored_}mask_locked()
  fanotify: Release SRCU lock when waiting for userspace response
  fsnotify: Pass fsnotify_iter_info into handle_event handler
  fsnotify: Provide framework for dropping SRCU lock in ->handle_event
  fsnotify: Remove special handling of mark destruction on group shutdown
  fsnotify: Detach mark from object list when last reference is dropped
  fsnotify: Move queueing of mark for destruction into fsnotify_put_mark()
  inotify: Do not drop mark reference under idr_lock
  fsnotify: Free fsnotify_mark_connector when there is no mark attached
  fsnotify: Lock object list with connector lock
  ...
2017-05-03 11:05:15 -07:00
Jens Axboe 2719aa217e blk-mq: don't use sync workqueue flushing from drivers
A previous commit introduced the sync flush, which we need from
internal callers like blk_mq_quiesce_queue(). However, we also
call the stop helpers from drivers, particularly from ->queue_rq()
when we have to stop processing for a bit. We can't block from
those locations, and we don't have to guarantee that we're
fully flushed.

Fixes: 9f99373790 ("blk-mq: unify hctx delayed_run_work and run_work")
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-03 11:44:43 -06:00
Linus Torvalds 7b66f13207 Merge tag 'for-4.12/dm-post-merge-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull additional device mapper updates from Mike Snitzer:
 "Here are some changes from Christoph that needed to be rebased on top
  of changes that were already merged into the device mapper tree. In
  addition, these changes depend on the 'for-4.12/block' changes that
  you've already merged.

   - Cleanups to request-based DM and DM multipath from Christoph that
     prepare for his block core error code type checking improvements"

* tag 'for-4.12/dm-post-merge-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: introduce a new DM_MAPIO_KILL return value
  dm rq: change ->rq_end_io calling conventions
  dm mpath: merge do_end_io into multipath_end_io
2017-05-03 10:34:03 -07:00
Linus Torvalds d35a878ae1 Merge tag 'for-4.12/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - A major update for DM cache that reduces the latency for deciding
   whether blocks should migrate to/from the cache. The bio-prison-v2
   interface supports this improvement by enabling direct dispatch of
   work to workqueues rather than having to delay the actual work
   dispatch to the DM cache core. So the dm-cache policies are much more
   nimble by being able to drive IO as they see fit. One immediate
   benefit from the improved latency is a cache that should be much more
   adaptive to changing workloads.

 - Add a new DM integrity target that emulates a block device that has
   additional per-sector tags that can be used for storing integrity
   information.

 - Add a new authenticated encryption feature to the DM crypt target
   that builds on the capabilities provided by the DM integrity target.

 - Add MD interface for switching the raid4/5/6 journal mode and update
   the DM raid target to use it to enable raid4/5/6 journal write-back
   support.

 - Switch the DM verity target over to using the asynchronous hash
   crypto API (this helps work better with architectures that have
   access to off-CPU algorithm providers, which should reduce CPU
   utilization).

 - Various request-based DM and DM multipath fixes and improvements from
   Bart and Christoph.

 - A DM thinp target fix for a bio structure leak that occurs for each
   discard IFF discard passdown is enabled.

 - A fix for a possible deadlock in DM bufio and a fix to re-check the
   new buffer allocation watermark in the face of competing admin
   changes to the 'max_cache_size_bytes' tunable.

 - A couple DM core cleanups.

* tag 'for-4.12/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (50 commits)
  dm bufio: check new buffer allocation watermark every 30 seconds
  dm bufio: avoid a possible ABBA deadlock
  dm mpath: make it easier to detect unintended I/O request flushes
  dm mpath: cleanup QUEUE_IF_NO_PATH bit manipulation by introducing assign_bit()
  dm mpath: micro-optimize the hot path relative to MPATHF_QUEUE_IF_NO_PATH
  dm: introduce enum dm_queue_mode to cleanup related code
  dm mpath: verify __pg_init_all_paths locking assumptions at runtime
  dm: verify suspend_locking assumptions at runtime
  dm block manager: remove an unused argument from dm_block_manager_create()
  dm rq: check blk_mq_register_dev() return value in dm_mq_init_request_queue()
  dm mpath: delay requeuing while path initialization is in progress
  dm mpath: avoid that path removal can trigger an infinite loop
  dm mpath: split and rename activate_path() to prepare for its expanded use
  dm ioctl: prevent stack leak in dm ioctl call
  dm integrity: use previously calculated log2 of sectors_per_block
  dm integrity: use hex2bin instead of open-coded variant
  dm crypt: replace custom implementation of hex2bin()
  dm crypt: remove obsolete references to per-CPU state
  dm verity: switch to using asynchronous hash crypto API
  dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues
  ...
2017-05-03 10:31:20 -07:00
Linus Torvalds e5021876c9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Pull MD updates from Shaohua Li:

 - Add the Partial Parity Log (ppl) feature found in Intel IMSM raid
   arrays, by Artur Paszkiewicz. This feature is another way to close
   the RAID5 write hole. The Linux implementation is also available for
   normal RAID5 arrays if a specific superblock bit is set.

 - A number of md-cluster fixes and enabling md-cluster array resize from
   Guoqing Jiang

 - A bunch of patches from Ming Lei and Neil Brown to rewrite MD bio
   handling related code. Now MD doesn't directly access bio bvec,
   bi_phys_segments and uses modern bio API for bio split.

 - Improve RAID5 IO pattern to improve performance for hard disk based
   RAID5/6 from me.

 - Several patches from Song Liu to speed up raid5-cache recovery and
   allow raid5 cache feature disabling in runtime.

 - Fix a performance regression in raid1 resync from Xiao Ni.

 - Other cleanup and fixes from various people.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: (84 commits)
  md/raid10: skip spare disk as 'first' disk
  md/raid1: Use a new variable to count flighting sync requests
  md: clear WantReplacement once disk is removed
  md/raid1/10: remove unused queue
  md: handle read-only member devices better.
  md/raid10: wait up frozen array in handle_write_completed
  uapi: fix linux/raid/md_p.h userspace compilation error
  md-cluster: Fix a memleak in an error handling path
  md: support disabling of create-on-open semantics.
  md: allow creation of mdNNN arrays via md_mod/parameters/new_array
  raid5-ppl: use a single mempool for ppl_io_unit and header_page
  md/raid0: fix up bio splitting.
  md/linear: improve bio splitting.
  md/raid5: make chunk_aligned_read() split bios more cleanly.
  md/raid10: simplify handle_read_error()
  md/raid10: simplify the splitting of requests.
  md/raid1: factor out flush_bio_list()
  md/raid1: simplify handle_read_error().
  Revert "block: introduce bio_copy_data_partial"
  md/raid1: simplify alloc_behind_master_bio()
  ...
2017-05-03 10:05:38 -07:00
Linus Torvalds 46f0537b1e Merge branch 'stable-4.12' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore:
 "Fourteen audit patches for v4.12 that span the full range of fixes,
  new features, and internal cleanups.

  We have patches to move to 64-bit timestamps, convert refcounts from
  atomic_t to refcount_t, track PIDs using the pid struct instead of
  pid_t, convert our own private audit buffer cache to a standard
  kmem_cache, log kernel module names when they are unloaded, and
  normalize the NETFILTER_PKT to make the userspace folks happier.

  From a fixes perspective, the most important is likely the auditd
  connection tracking RCU fix; it was a rather brain dead bug that I'll
  take the blame for, but thankfully it didn't seem to affect many
  people (only one report).

  I think the patch subject lines and commit descriptions do a pretty
  good job of explaining the details and why the changes are important
  so I'll point you there instead of duplicating it here; as usual, if
  you have any questions you know where to find us.

  We also manage to take out more code than we put in this time, that
  always makes me happy :)"

* 'stable-4.12' of git://git.infradead.org/users/pcmoore/audit:
  audit: fix the RCU locking for the auditd_connection structure
  audit: use kmem_cache to manage the audit_buffer cache
  audit: Use timespec64 to represent audit timestamps
  audit: store the auditd PID as a pid struct instead of pid_t
  audit: kernel generated netlink traffic should have a portid of 0
  audit: combine audit_receive() and audit_receive_skb()
  audit: convert audit_watch.count from atomic_t to refcount_t
  audit: convert audit_tree.count from atomic_t to refcount_t
  audit: normalize NETFILTER_PKT
  netfilter: use consistent ipv4 network offset in xt_AUDIT
  audit: log module name on delete_module
  audit: remove unnecessary semicolon in audit_watch_handle_event()
  audit: remove unnecessary semicolon in audit_mark_handle_event()
  audit: remove unnecessary semicolon in audit_field_valid()
2017-05-03 09:21:59 -07:00
Linus Torvalds 0302e28dee Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

  IMA:
   - provide ">" and "<" operators for fowner/uid/euid rules

  KEYS:
   - add a system blacklist keyring

   - add KEYCTL_RESTRICT_KEYRING, exposes keyring link restriction
     functionality to userland via keyctl()

  LSM:
   - harden LSM API with __ro_after_init

   - add prlimit security hook, implement for SELinux

   - revive security_task_alloc hook

  TPM:
   - implement contextual TPM command 'spaces'"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (98 commits)
  tpm: Fix reference count to main device
  tpm_tis: convert to using locality callbacks
  tpm: fix handling of the TPM 2.0 event logs
  tpm_crb: remove a cruft constant
  keys: select CONFIG_CRYPTO when selecting DH / KDF
  apparmor: Make path_max parameter readonly
  apparmor: fix parameters so that the permission test is bypassed at boot
  apparmor: fix invalid reference to index variable of iterator line 836
  apparmor: use SHASH_DESC_ON_STACK
  security/apparmor/lsm.c: set debug messages
  apparmor: fix boolreturn.cocci warnings
  Smack: Use GFP_KERNEL for smk_netlbl_mls().
  smack: fix double free in smack_parse_opts_str()
  KEYS: add SP800-56A KDF support for DH
  KEYS: Keyring asymmetric key restrict method with chaining
  KEYS: Restrict asymmetric key linkage using a specific keychain
  KEYS: Add a lookup_restriction function for the asymmetric key type
  KEYS: Add KEYCTL_RESTRICT_KEYRING
  KEYS: Consistent ordering for __key_link_begin and restrict check
  KEYS: Add an optional lookup_restriction hook to key_type
  ...
2017-05-03 08:50:52 -07:00
David S. Miller f411af6822 Merge branch 'ibmvnic-Updated-reset-handler-andcode-fixes'
Nathan Fontenot says:

====================
ibmvnic: Updated reset handler and code fixes

This set of patches contains multiple code fixes and a new reset handler
for the ibmvnic driver. To implement the new reset handler, resource
initialization is moved to its own routine, a state variable is
introduced to replace the various is_* flags in the driver, and a new
routine is added to handle the assorted reasons the driver can be reset.

v4 updates:

Patch 3/11: Corrected trailing whitespace
Patch 7/11: Corrected trailing whitespace

v3 updates:

Patch 10/11: Correct patch subject line to be a description of the patch.

v2 updates:

Patch 11/11: Use __netif_subqueue_stopped() instead of
netif_subqueue_stopped() to avoid possible use of an un-initialized
skb variable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-03 11:33:06 -04:00
Nathan Fontenot 7c3e7de3f3 ibmvnic: Move queue restarting in ibmvnic_tx_complete
Restart of the subqueue should occur outside of the loop processing
any tx buffers instead of doing this in the middle of the loop.
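
A minimal userspace sketch of the idea, assuming a toy queue model: the
struct toy_txq, its fields and toy_tx_complete() are hypothetical and do
not mirror the driver's real structures. The restart decision is made
once, after every completed buffer has been released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a tx queue; not the driver's real structures. */
struct toy_txq {
    int in_flight;      /* buffers handed to hardware, not yet completed */
    int wake_threshold; /* restart once in_flight drops to this or below */
    bool stopped;
    int restarts;       /* how many times the queue was restarted */
};

/* Process up to `n` completions, then decide once whether to restart. */
void toy_tx_complete(struct toy_txq *q, int n)
{
    assert(q != NULL && n >= 0);

    for (int i = 0; i < n && q->in_flight > 0; i++)
        q->in_flight--;          /* release one completed buffer */

    /* The restart check runs after the loop, not once per buffer. */
    if (q->stopped && q->in_flight <= q->wake_threshold) {
        q->stopped = false;
        q->restarts++;
    }
}
```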

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-03 11:33:06 -04:00
Thomas Falcon 94ca305fd8 ibmvnic: Record SKB RX queue during poll
Map each RX SKB to the RX queue associated with the driver's RX SCRQ.
This should improve the RX CPU load balancing issues seen by the
performance team.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-03 11:33:05 -04:00
Nathan Fontenot ca05e31674 ibmvnic: Continue skb processing after skb completion error
There is no need to stop processing skbs if we encounter an
skb that has a receive completion error.
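
As a rough illustration, assuming a toy receive ring (struct toy_rx and
toy_rx_poll() are hypothetical, not driver code), skipping a bad entry
with `continue` lets the entries behind it still be processed:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive descriptor: rc != 0 marks a completion error. */
struct toy_rx { int rc; int len; };

/*
 * Sum the lengths of all good entries. A bad entry is counted and
 * skipped with `continue`, so the entries after it are still handled.
 */
int toy_rx_poll(const struct toy_rx *ring, size_t n, int *dropped)
{
    int bytes = 0;

    assert(ring != NULL && dropped != NULL);
    *dropped = 0;

    for (size_t i = 0; i < n; i++) {
        if (ring[i].rc != 0) {
            (*dropped)++;
            continue;
        }
        bytes += ring[i].len;
    }
    return bytes;
}
```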

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-03 11:33:05 -04:00
Nathan Fontenot 161b8a8138 ibmvnic: Check for driver reset first in ibmvnic_xmit
Move the check for the driver resetting to the first thing
in ibmvnic_xmit().

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-03 11:33:05 -04:00
Nathan Fontenot 46293b940f ibmvnic: Wait for any pending scrqs entries at driver close
When closing the ibmvnic driver we need to wait for any pending
sub crq entries to ensure they are handled.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-03 11:33:05 -04:00
Nathan Fontenot b41b83e9a7 ibmvnic: Clean up tx pools when closing
When closing the ibmvnic driver, most notably during the reset
path, the tx pools need to be cleaned to ensure there are no
hanging skbs that need to be free'ed.

The need for this was found during debugging a loss of network
traffic after handling a driver reset. The underlying cause was
some skbs in the tx pool that were never free'ed. As a
result the upper network layers never tried a re-send since they
believed the driver still had the skb.
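
The cleanup step might look like the following userspace sketch. The
pool layout, TOY_POOL_SIZE and toy_clean_tx_pool() are assumptions made
for illustration only; the real driver frees skbs rather than malloc'ed
buffers.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_POOL_SIZE 4

/* Hypothetical tx pool: a slot may hold a buffer the driver still owns. */
struct toy_tx_pool { void *slot[TOY_POOL_SIZE]; };

/* Free every buffer still held by the pool; returns how many were freed. */
int toy_clean_tx_pool(struct toy_tx_pool *pool)
{
    int freed = 0;

    assert(pool != NULL);

    for (int i = 0; i < TOY_POOL_SIZE; i++) {
        if (!pool->slot[i])
            continue;
        free(pool->slot[i]);
        pool->slot[i] = NULL;   /* avoids a double free on a later close */
        freed++;
    }
    return freed;
}
```

Clearing each slot after freeing it makes the routine safe to call again
on a later close or reset.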

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-03 11:33:04 -04:00
Nathan Fontenot e0ebe942f4 ibmvnic: Whitespace correction in release_rx_pools
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-03 11:33:04 -04:00