linux/mm
Greg Thelen f9fa1d919c kasan: drain quarantine of memcg slab objects
Per memcg slab accounting and kasan have a problem with kmem_cache
destruction.
 - kmem_cache_create() allocates a kmem_cache, which is used for
   allocations from processes running in root (top) memcg.
 - Processes running in non root memcg and allocating with either
   __GFP_ACCOUNT or from a SLAB_ACCOUNT cache use a per memcg
   kmem_cache.
 - Kasan catches use-after-free by having kfree() and kmem_cache_free()
   defer freeing of objects. Objects are placed in a quarantine.
 - kmem_cache_destroy() destroys root and non root kmem_caches. It takes
   care to drain the quarantine of objects from the root memcg's
   kmem_cache, but ignores objects associated with non root memcg. This
   causes leaks because quarantined per memcg objects refer to per memcg
   kmem cache being destroyed.

To see the problem:

 1) create a slab cache with kmem_cache_create(,,,SLAB_ACCOUNT,)
 2) from non root memcg, allocate and free a few objects from cache
 3) dispose of the cache with kmem_cache_destroy() kmem_cache_destroy()
    will trigger a "Slab cache still has objects" warning indicating
    that the per memcg kmem_cache structure was leaked.

Fix the leak by draining kasan quarantined objects allocated from non
root memcg.

Racing memcg deletion is tricky, but handled.  kmem_cache_destroy() =>
shutdown_memcg_caches() => __shutdown_memcg_cache() => shutdown_cache()
flushes per memcg quarantined objects, even if that memcg has been
rmdir'd and gone through memcg_deactivate_kmem_caches().

This leak only affects destroyed SLAB_ACCOUNT kmem caches when kasan is
enabled.  So I don't think it's worth patching stable kernels.

Link: http://lkml.kernel.org/r/1482257462-36948-1-git-send-email-gthelen@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
..
kasan kasan: drain quarantine of memcg slab objects 2017-02-24 17:46:56 -08:00
Kconfig mm: THP page cache support for ppc64 2016-12-12 18:55:08 -08:00
Kconfig.debug PM / Hibernate: allow hibernation with PAGE_POISONING_ZERO 2016-09-13 02:35:27 +02:00
Makefile mm: introduce page_vma_mapped_walk() 2017-02-24 17:46:55 -08:00
backing-dev.c mm/backing-dev.c: use rb_entry() 2017-02-22 16:41:30 -08:00
balloon_compaction.c mm: balloon: use general non-lru movable page feature 2016-07-26 16:19:19 -07:00
bootmem.c mm/bootmem.c: cosmetic improvement of code readability 2017-02-22 16:41:29 -08:00
cleancache.c
cma.c mm: cma: print allocation failure reason and bitmap status 2017-02-24 17:46:55 -08:00
cma.h
cma_debug.c mm: cma_alloc: allow to specify GFP mask 2017-02-24 17:46:55 -08:00
compaction.c mm/migration: make isolate_movable_page() return int type 2017-02-24 17:46:55 -08:00
debug.c mm, debug: print raw struct page data in __dump_page() 2016-12-12 18:55:08 -08:00
debug_page_ref.c mm/page_ref: add tracepoint to track down page reference manipulation 2016-03-17 15:09:34 -07:00
dmapool.c mm: cleanups for printing phys_addr_t and dma_addr_t 2017-02-24 17:46:56 -08:00
early_ioremap.c
fadvise.c mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED 2016-12-20 09:48:46 -08:00
failslab.c
filemap.c mm: do not access page->mapping directly on page_endio 2017-02-24 17:46:56 -08:00
frame_vector.c mm: replace get_vaddr_frames() write/force parameters with gup_flags 2016-10-19 08:11:24 -07:00
frontswap.c mm, frontswap: convert frontswap_enabled to static key 2016-07-26 16:19:19 -07:00
gup.c mm/gup: check for protnone only if it is a PTE entry 2017-02-24 17:46:56 -08:00
highmem.c mm/highmem: make nr_free_highpages() handles all highmem zones by itself 2016-05-19 19:12:14 -07:00
huge_memory.c mm/thp/autonuma: use TNF flag instead of vm fault 2017-02-24 17:46:56 -08:00
hugetlb.c mm: alloc_contig_range: allow to specify GFP mask 2017-02-24 17:46:55 -08:00
hugetlb_cgroup.c mm, hugetlb_cgroup: round limit_in_bytes down to hugepage size 2016-05-20 17:58:30 -07:00
hwpoison-inject.c
init-mm.c mm: Add a user_ns owner to mm_struct and fix ptrace permission checks 2016-11-22 11:49:48 -06:00
internal.h mm, rmap: check all VMAs that PTE-mapped THP can be part of 2017-02-24 17:46:55 -08:00
interval_tree.c
khugepaged.c mm: get rid of __GFP_OTHER_NODE 2017-01-10 18:31:55 -08:00
kmemcheck.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak-test.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak.c kmemleak: fix reference to Documentation 2016-12-12 18:55:07 -08:00
ksm.c mm/ksm: handle protnone saved writes when making page write protect 2017-02-24 17:46:56 -08:00
list_lru.c mm/list_lru.c: avoid error-path NULL pointer deref 2016-10-27 18:43:42 -07:00
maccess.c x86: remove more uaccess_32.h complexity 2016-05-22 17:21:27 -07:00
madvise.c mm: remove shmem_mapping() shmem_zero_setup() duplicates 2017-02-24 17:46:56 -08:00
memblock.c memblock: embed memblock type name within struct memblock_type 2017-02-24 17:46:54 -08:00
memcontrol.c mm: remove shmem_mapping() shmem_zero_setup() duplicates 2017-02-24 17:46:56 -08:00
memory-failure.c HWPOISON: soft offlining for non-lru movable page 2017-02-24 17:46:55 -08:00
memory.c mm/autonuma: let architecture override how the write bit should be stashed in a protnone pte. 2017-02-24 17:46:56 -08:00
memory_hotplug.c memory-hotplug: use dev_online for memhp_auto_online 2017-02-24 17:46:56 -08:00
mempolicy.c mm/mempolicy.c: do not put mempolicy before using its nodemask 2017-01-24 16:26:14 -08:00
mempool.c Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements" 2016-07-28 16:07:41 -07:00
memtest.c
migrate.c mm: convert remove_migration_pte() to use page_vma_mapped_walk() 2017-02-24 17:46:55 -08:00
mincore.c mm: remove shmem_mapping() shmem_zero_setup() duplicates 2017-02-24 17:46:56 -08:00
mlock.c thp: fix corner case of munlock() of PTE-mapped THPs 2016-11-30 16:32:52 -08:00
mm_init.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
mmap.c mm, madvise: fail with ENOMEM when splitting vma will hit max_map_count 2017-02-24 17:46:55 -08:00
mmu_context.c mm/mmu_context, sched/core: Fix mmu_context.h assumption 2016-04-28 11:44:19 +02:00
mmu_notifier.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
mmzone.c mm/mmzone.c: swap likely to unlikely as code logic is different for next_zones_zonelist() 2017-02-22 16:41:29 -08:00
mprotect.c mm/autonuma: let architecture override how the write bit should be stashed in a protnone pte. 2017-02-24 17:46:56 -08:00
mremap.c userfaultfd: non-cooperative: add event for memory unmaps 2017-02-24 17:46:55 -08:00
msync.c
nobootmem.c mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping 2016-10-11 15:06:33 -07:00
nommu.c userfaultfd: non-cooperative: add event for memory unmaps 2017-02-24 17:46:55 -08:00
oom_kill.c mm, oom: header nodemask is NULL when cpusets are disabled 2017-02-24 17:46:53 -08:00
page-writeback.c mm/page-writeback.c: place "not" inside of unlikely() statement in wb_domain_writeout_inc() 2017-02-24 17:46:56 -08:00
page_alloc.c mm/page_alloc.c: remove redundant init code for ZONE_MOVABLE 2017-02-24 17:46:56 -08:00
page_counter.c
page_ext.c mm/page_ext: support extra space allocation by page_ext user 2016-10-07 18:46:27 -07:00
page_idle.c mm: fix handling PTE-mapped THPs in page_idle_clear_pte_refs() 2017-02-24 17:46:55 -08:00
page_io.c writeback: add wbc_to_write_flags() 2016-11-02 10:24:03 -06:00
page_isolation.c mm, page_alloc: avoid page_to_pfn() when merging buddies 2017-02-22 16:41:27 -08:00
page_owner.c mm/page_owner: don't define fields on struct page_ext by hard-coding 2016-10-07 18:46:27 -07:00
page_poison.c mm: check the return value of lookup_page_ext for all call sites 2016-06-03 15:06:22 -07:00
page_vma_mapped.c mm: convert page_mapped_in_vma() to use page_vma_mapped_walk() 2017-02-24 17:46:55 -08:00
pagewalk.c mm, x86: add support for PUD-sized transparent hugepages 2017-02-24 17:46:54 -08:00
percpu-km.c mm: percpu: use pr_fmt to prefix output 2016-03-17 15:09:34 -07:00
percpu-vm.c
percpu.c Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2016-12-13 12:34:47 -08:00
pgtable-generic.c mm, x86: add support for PUD-sized transparent hugepages 2017-02-24 17:46:54 -08:00
process_vm_access.c mm: unexport __get_user_pages_unlocked() 2016-12-14 16:04:09 -08:00
quicklist.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
readahead.c mm: don't cap request size based on read-ahead setting 2016-12-12 18:55:08 -08:00
rmap.c mm: drop page_check_address{,_transhuge} 2017-02-24 17:46:55 -08:00
shmem.c mm/shmem.c: fix unlikely() test of info->seals to test only for WRITE and GROW 2017-02-24 17:46:56 -08:00
slab.c slab: introduce __kmemcg_cache_deactivate() 2017-02-22 16:41:27 -08:00
slab.h slab: remove synchronous synchronize_sched() from memcg cache deactivation path 2017-02-22 16:41:27 -08:00
slab_common.c kasan: drain quarantine of memcg slab objects 2017-02-24 17:46:56 -08:00
slob.c slab: introduce __kmemcg_cache_deactivate() 2017-02-22 16:41:27 -08:00
slub.c slub: make sysfs directories for memcg sub-caches optional 2017-02-22 16:41:27 -08:00
sparse-vmemmap.c treewide: replace obsolete _refok by __ref 2016-08-02 17:31:41 -04:00
sparse.c mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next 2017-02-22 16:41:29 -08:00
swap.c mm: vmscan: move dirty pages out of the way until they're flushed 2017-02-24 17:46:54 -08:00
swap_cgroup.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
swap_slots.c mm/swap: skip readahead only when swap slot cache is enabled 2017-02-22 16:41:30 -08:00
swap_state.c mm/swap: skip readahead only when swap slot cache is enabled 2017-02-22 16:41:30 -08:00
swapfile.c mm/swap: enable swap slots cache usage 2017-02-22 16:41:30 -08:00
truncate.c mm: remove shmem_mapping() shmem_zero_setup() duplicates 2017-02-24 17:46:56 -08:00
usercopy.c mm/usercopy: Switch to using lm_alias 2017-01-11 13:56:50 +00:00
userfaultfd.c userfaultfd: mcopy_atomic: return -ENOENT when no compatible VMA found 2017-02-24 17:46:55 -08:00
util.c userfaultfd: non-cooperative: add event for memory unmaps 2017-02-24 17:46:55 -08:00
vmacache.c mm: unrig VMA cache hit ratio 2016-10-07 18:46:27 -07:00
vmalloc.c mm: cleanups for printing phys_addr_t and dma_addr_t 2017-02-24 17:46:56 -08:00
vmpressure.c mm: vmpressure: fix sending wrong events on underflow 2017-02-24 17:46:56 -08:00
vmscan.c mm, vmscan: clear PGDAT_WRITEBACK when zone is balanced 2017-02-24 17:46:55 -08:00
vmstat.c mm, compaction: add vmstats for kcompactd work 2017-02-22 16:41:29 -08:00
workingset.c mm: remove shmem_mapping() shmem_zero_setup() duplicates 2017-02-24 17:46:56 -08:00
z3fold.c z3fold: add kref refcounting 2017-02-24 17:46:54 -08:00
zbud.c
zpool.c
zsmalloc.c mm/zsmalloc: fix comment in zsmalloc 2017-02-24 17:46:56 -08:00
zswap.c zswap: disable changing params if init fails 2017-02-03 14:13:19 -08:00