linux/block
Jan Kara 5e4c0d9741 lib/radix-tree.c: make radix_tree_node_alloc() work correctly within interrupt
With users of radix_tree_preload() run from interrupt (block/blk-ioc.c is
one such possible user), the following race can happen:

radix_tree_preload()
...
radix_tree_insert()
  radix_tree_node_alloc()
    if (rtp->nr) {
      ret = rtp->nodes[rtp->nr - 1];
<interrupt>
...
radix_tree_preload()
...
radix_tree_insert()
  radix_tree_node_alloc()
    if (rtp->nr) {
      ret = rtp->nodes[rtp->nr - 1];

And we give out one radix tree node twice.  That clearly results in radix
tree corruption with different results (usually OOPS) depending on which
two users of radix tree race.

We fix the problem by making radix_tree_node_alloc() always allocate fresh
radix tree nodes when in interrupt.  Using preloading when in interrupt
doesn't make sense since all the allocations have to be atomic anyway and
we cannot steal nodes from process-context users because some users rely
on radix_tree_insert() succeeding after radix_tree_preload().
in_interrupt() check is somewhat ugly but we cannot simply key off passed
gfp_mask as that is acquired from root_gfp_mask() and thus the same for
all preload users.

Another part of the fix is to avoid node preallocation in
radix_tree_preload() when passed gfp_mask doesn't allow waiting.  Again,
preallocation in such case doesn't make sense and when preallocation would
happen in interrupt we could possibly leak some allocated nodes.  However,
some users of radix_tree_preload() require following radix_tree_insert()
to succeed.  To avoid unexpected effects for these users,
radix_tree_preload() only warns if passed gfp mask doesn't allow waiting
and we provide a new function radix_tree_maybe_preload() for those users
which get different gfp mask from different call sites and which are
prepared to handle radix_tree_insert() failure.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:36 -07:00
..
partitions block/partitions/efi.c: consistently use pr_foo() 2013-09-11 15:59:19 -07:00
Kconfig block: support embedded device command line partition 2013-09-11 15:56:57 -07:00
Kconfig.iosched blkcg: make CONFIG_BLK_CGROUP bool 2012-03-06 21:27:21 +01:00
Makefile block: support embedded device command line partition 2013-09-11 15:56:57 -07:00
blk-cgroup.c cgroup: make css_for_each_descendant() and friends include the origin css in the iteration 2013-08-08 20:11:27 -04:00
blk-cgroup.h cgroup: make css_for_each_descendant() and friends include the origin css in the iteration 2013-08-08 20:11:27 -04:00
blk-core.c [SCSI] Return ENODATA on medium error 2013-08-23 12:54:53 -04:00
blk-exec.c Merge branch 'for-3.9/core' of git://git.kernel.dk/linux-block 2013-02-28 12:52:24 -08:00
blk-flush.c Block: blk-flush: Fixed indent code style 2013-03-22 12:22:51 -06:00
blk-integrity.c scatterlist: introduce sg_unmark_end 2013-03-20 15:43:04 +10:30
blk-ioc.c lib/radix-tree.c: make radix_tree_node_alloc() work correctly within interrupt 2013-09-11 15:59:36 -07:00
blk-iopoll.c block: delete __cpuinit usage from all block files 2013-07-14 19:36:59 -04:00
blk-lib.c block: account iowait time when waiting for completion of IO request 2013-02-15 16:45:07 +01:00
blk-map.c block: re-use existing 'reading' variable instead of checking direction again 2011-12-21 15:27:24 +01:00
blk-merge.c scatterlist: introduce sg_unmark_end 2013-03-20 15:43:04 +10:30
blk-settings.c block: discard granularity might not be power of 2 2012-12-14 20:46:04 +01:00
blk-softirq.c block: delete __cpuinit usage from all block files 2013-07-14 19:36:59 -04:00
blk-sysfs.c block/blk-sysfs.c: replace strict_strtoul() with kstrtoul() 2013-09-11 15:56:56 -07:00
blk-tag.c block: Reserve only one queue tag for sync IO if only 3 tags are available 2013-06-28 21:32:27 +02:00
blk-throttle.c cgroup: make css_for_each_descendant() and friends include the origin css in the iteration 2013-08-08 20:11:27 -04:00
blk-timeout.c block: check for timeout function in blk_rq_timed_out() 2013-07-01 17:31:23 +02:00
blk.h block,elevator: use new hashtable implementation 2013-01-11 14:43:13 +01:00
bsg-lib.c bsg: Remove unused function bsg_goose_queue() 2012-12-06 14:33:02 +01:00
bsg.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
cfq-iosched.c cgroup: pass around cgroup_subsys_state instead of cgroup in file methods 2013-08-08 20:11:24 -04:00
cmdline-parser.c block: support embedded device command line partition 2013-09-11 15:56:57 -07:00
compat_ioctl.c kernel-wide: fix missing validations on __get/__put/__copy_to/__copy_from_user() 2013-09-11 15:58:18 -07:00
deadline-iosched.c elevator: Fix a race in elevator switching 2013-07-03 13:25:24 +02:00
elevator.c elevator: Fix a race in elevator switching 2013-07-03 13:25:24 +02:00
genhd.c block: do not pass disk names as format strings 2013-07-03 16:07:25 -07:00
ioctl.c Merge branch 'for-3.7/core' of git://git.kernel.dk/linux-block 2012-10-11 09:04:23 +09:00
noop-iosched.c elevator: Fix a race in elevator switching 2013-07-03 13:25:24 +02:00
partition-generic.c Revert "loop: cleanup partitions when detaching loop device" 2013-04-08 10:12:11 +02:00
scsi_ioctl.c aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00