linux/mm
Tejun Heo c2aa723a60 writeback: implement memcg writeback domain based throttling
While cgroup writeback support now connects memcg and blkcg so that
writeback IOs are properly attributed and controlled, the IO back
pressure propagation mechanism implemented in balance_dirty_pages()
and its subroutines wasn't aware of cgroup writeback.

Processes belonging to a memcg may have access to only subset of total
memory available in the system and not factoring this into dirty
throttling rendered it completely ineffective for processes under
memcg limits and memcg ended up building a separate ad-hoc degenerate
mechanism directly into vmscan code to limit page dirtying.

The previous patches updated balance_dirty_pages() and its subroutines
so that they can deal with multiple wb_domain's (writeback domains)
and defined per-memcg wb_domain.  Processes belonging to a non-root
memcg are bound to two wb_domains, global wb_domain and memcg
wb_domain, and should be throttled according to IO pressures from both
domains.  This patch updates dirty throttling code so that it repeats
similar calculations for the two domains - the differences between the
two are few and minor - and applies the lower of the two sets of
resulting constraints.

wb_over_bg_thresh(), which controls when background writeback
terminates, is also updated to consider both global and memcg
wb_domains.  It returns true if dirty is over bg_thresh for either
domain.

This makes the dirty throttling mechanism operational for memcg
domains including writeback-bandwidth-proportional dirty page
distribution inside them but the ad-hoc memcg throttling mechanism in
vmscan is still in place.  The next patch will rip it out.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:38:13 -06:00
..
kasan mm/mempool.c: kasan: poison mempool elements 2015-04-15 16:35:20 -07:00
Kconfig mm: cma: debugfs interface 2015-04-14 16:49:00 -07:00
Kconfig.debug mm/debug_pagealloc: remove obsolete Kconfig options 2015-01-08 15:10:52 -08:00
Makefile mm: move memtest under mm 2015-04-14 16:49:06 -07:00
backing-dev.c writeback: implement memcg wb_domain 2015-06-02 08:38:13 -06:00
balloon_compaction.c mm/balloon_compaction: fix deflation when compaction is disabled 2014-10-29 16:33:15 -07:00
bootmem.c mem-hotplug: reset node managed pages when hot-adding a new pgdat 2014-11-13 16:17:06 -08:00
cleancache.c cleancache: remove limit on the number of cleancache enabled filesystems 2015-04-14 16:49:03 -07:00
cma.c mm: cma: add trace events for CMA allocations and freeings 2015-04-15 16:35:19 -07:00
cma.h mm: cma: allocation trigger 2015-04-14 16:49:00 -07:00
cma_debug.c mm/cma_debug.c: remove blank lines before DEFINE_SIMPLE_ATTRIBUTE() 2015-04-15 16:35:20 -07:00
compaction.c mm/compaction.c: fix "suitable_migration_target() unused" warning 2015-04-15 16:35:20 -07:00
debug-pagealloc.c mm/debug-pagealloc: make debug-pagealloc boottime configurable 2014-12-13 12:42:48 -08:00
debug.c mm: account pmd page tables to the process 2015-02-11 17:06:04 -08:00
dmapool.c mm/dmapool.c: fixed a brace coding style issue 2014-10-09 22:26:00 -04:00
early_ioremap.c
fadvise.c writeback: implement and use inode_congested() 2015-06-02 08:33:35 -06:00
failslab.c
filemap.c memcg: add per cgroup dirty page accounting 2015-06-02 08:33:33 -06:00
frontswap.c mm/frontswap.c: fix the condition in BUG_ON 2014-12-10 17:41:08 -08:00
gup.c mm: use READ_ONCE() for non-scalar types 2015-04-15 16:35:18 -07:00
highmem.c mm/highmem: make kmap cache coloring aware 2014-08-06 18:01:22 -07:00
huge_memory.c thp: cleanup khugepaged startup 2015-04-15 16:35:19 -07:00
hugetlb.c mm: hugetlb: cleanup using paeg_huge_active() 2015-04-15 16:35:19 -07:00
hugetlb_cgroup.c mm: page_counter: pull "-1" handling out of page_counter_memparse() 2015-02-11 17:06:02 -08:00
hwpoison-inject.c mm/hwpoison-inject.c: remove unnecessary null test before debugfs_remove_recursive 2014-08-06 18:01:19 -07:00
init-mm.c
internal.h mm: remove rest of ACCESS_ONCE() usages 2015-04-15 16:35:18 -07:00
interval_tree.c mm: replace vma->sharead.linear with vma->shared 2015-02-10 14:30:31 -08:00
kmemcheck.c mm/slab_common: move kmem_cache definition to internal header 2014-10-09 22:25:50 -04:00
kmemleak-test.c
kmemleak.c kmemleak: disable kasan instrumentation for kmemleak 2015-02-13 21:21:41 -08:00
ksm.c mm: remove rest of ACCESS_ONCE() usages 2015-04-15 16:35:18 -07:00
list_lru.c memcg: reparent list_lrus and free kmemcg_id on css offline 2015-02-12 18:54:10 -08:00
maccess.c
madvise.c writeback: separate out include/linux/backing-dev-defs.h 2015-06-02 08:33:34 -06:00
memblock.c mm/memblock.c: add debug output for memblock_add() 2015-04-15 16:35:19 -07:00
memcontrol.c writeback: implement memcg writeback domain based throttling 2015-06-02 08:38:13 -06:00
memory-failure.c mm: hugetlb: introduce page_huge_active 2015-04-15 16:35:19 -07:00
memory.c mm: new pfn_mkwrite same as page_mkwrite for VM_PFNMAP 2015-04-15 16:35:20 -07:00
memory_hotplug.c mm: hugetlb: cleanup using paeg_huge_active() 2015-04-15 16:35:19 -07:00
mempolicy.c mm, thp: really limit transparent hugepage allocation to local node 2015-04-14 16:49:03 -07:00
mempool.c mm/mempool.c: kasan: poison mempool elements 2015-04-15 16:35:20 -07:00
memtest.c memtest: use phys_addr_t for physical addresses 2015-04-14 16:49:06 -07:00
migrate.c mm/migrate: check-before-clear PageSwapCache 2015-04-15 16:35:17 -07:00
mincore.c mincore: apply page table walker on do_mincore() 2015-02-11 17:06:06 -08:00
mlock.c mm: move mm_populate()-related code to mm/gup.c 2015-04-14 16:49:00 -07:00
mm_init.c mm/mm_init.c: mark mminit_loglevel __meminitdata 2015-02-12 18:54:11 -08:00
mmap.c mm/mmap.c: use while instead of if+goto 2015-04-15 16:35:19 -07:00
mmu_context.c
mmu_notifier.c mmu_notifier: add the callback for mmu_notifier_invalidate_range() 2014-11-13 13:46:09 +11:00
mmzone.c mm: microoptimize zonelist operations 2015-02-11 17:06:02 -08:00
mprotect.c mm: numa: preserve PTE write permissions across a NUMA hinting fault 2015-03-25 16:20:31 -07:00
mremap.c mm/mremap.c: clean up goto just return ERR_PTR 2015-04-15 16:35:18 -07:00
msync.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
nobootmem.c mem-hotplug: reset node managed pages when hot-adding a new pgdat 2014-11-13 16:17:06 -08:00
nommu.c nommu: use __vfs_read() 2015-04-11 22:27:56 -04:00
oom_kill.c mm/oom_kill.c: fix typo in comment 2015-04-15 16:35:16 -07:00
page-writeback.c writeback: implement memcg writeback domain based throttling 2015-06-02 08:38:13 -06:00
page_alloc.c mm: remove rest of ACCESS_ONCE() usages 2015-04-15 16:35:18 -07:00
page_counter.c mm: page_counter: pull "-1" handling out of page_counter_memparse() 2015-02-11 17:06:02 -08:00
page_ext.c mm/page_owner: keep track of page owners 2014-12-13 12:42:48 -08:00
page_io.c suspend: simplify block I/O handling 2015-05-19 09:19:59 -06:00
page_isolation.c mm/page_alloc.c: call kernel_map_pages in unset_migrateype_isolate 2015-03-25 16:20:30 -07:00
page_owner.c mm/page_owner.c: remove unnecessary stack_trace field 2015-02-11 17:06:07 -08:00
pagewalk.c mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers 2015-03-25 16:20:30 -07:00
percpu-km.c percpu: implmeent pcpu_nr_empty_pop_pages and chunk->nr_populated 2014-09-02 14:46:05 -04:00
percpu-vm.c percpu: move region iterations out of pcpu_[de]populate_chunk() 2014-09-02 14:46:02 -04:00
percpu.c percpu: Fix trivial typos in comments 2015-03-24 13:41:54 -04:00
pgtable-generic.c mm: convert p[te|md]_mknonnuma and remaining page table manipulations 2015-02-12 18:54:08 -08:00
process_vm_access.c process_vm_access: switch to {compat_,}import_iovec() 2015-04-11 22:27:12 -04:00
quicklist.c
readahead.c writeback: implement and use inode_congested() 2015-06-02 08:33:35 -06:00
rmap.c memcg: add per cgroup dirty page accounting 2015-06-02 08:33:33 -06:00
shmem.c VFS: assorted weird filesystems: d_inode() annotations 2015-04-15 15:06:58 -04:00
slab.c mm: remove GFP_THISNODE 2015-04-14 16:49:03 -07:00
slab.h slub: make dead caches discard free slabs immediately 2015-02-12 18:54:10 -08:00
slab_common.c mm: slub: add kernel address sanitizer support for slub allocator 2015-02-13 21:21:41 -08:00
slob.c slob: make slob_alloc_node() static and remove EXPORT_SYMBOL() 2015-04-14 16:48:59 -07:00
slub.c mm: remove rest of ACCESS_ONCE() usages 2015-04-15 16:35:18 -07:00
sparse-vmemmap.c
sparse.c
swap.c mm: don't call __page_cache_release for hugetlb 2015-04-15 16:35:19 -07:00
swap_cgroup.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swap_state.c mm: remove rest of ACCESS_ONCE() usages 2015-04-15 16:35:18 -07:00
swapfile.c mm: remove rest of ACCESS_ONCE() usages 2015-04-15 16:35:18 -07:00
truncate.c memcg: add per cgroup dirty page accounting 2015-06-02 08:33:33 -06:00
util.c mm: uninline and cleanup page-mapping related helpers 2015-04-15 16:35:19 -07:00
vmacache.c mm,vmacache: count number of system-wide flushes 2014-12-13 12:42:48 -08:00
vmalloc.c mm/vmalloc: get rid of dirty bitmap inside vmap_block structure 2015-04-15 16:35:18 -07:00
vmpressure.c mm/vmpressure.c: fix race in vmpressure_work_fn() 2014-12-02 17:32:07 -08:00
vmscan.c writeback: implement and use inode_congested() 2015-06-02 08:33:35 -06:00
vmstat.c vmstat: Reduce time interval to stat update on idle cpu 2015-02-11 17:06:07 -08:00
workingset.c list_lru: add helpers to isolate items 2015-02-12 18:54:10 -08:00
zbud.c mm/zpool: add name argument to create zpool 2015-02-12 18:54:12 -08:00
zpool.c mm/zpool: add name argument to create zpool 2015-02-12 18:54:12 -08:00
zsmalloc.c zsmalloc: remove extra cond_resched() in __zs_compact 2015-04-15 16:35:22 -07:00
zswap.c mm/zpool: add name argument to create zpool 2015-02-12 18:54:12 -08:00