a347c7ad8e
It is not allowed to reinit q->tag_set_list list entry while RCU grace
period has not completed yet, otherwise the following soft lockup in
blk_mq_sched_restart() happens:
[ 1064.252652] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [fio:9270]
[ 1064.254445] task: ffff99b912e8b900 task.stack: ffffa6d54c758000
[ 1064.254613] RIP: 0010:blk_mq_sched_restart+0x96/0x150
[ 1064.256510] Call Trace:
[ 1064.256664] <IRQ>
[ 1064.256824] blk_mq_free_request+0xea/0x100
[ 1064.256987] msg_io_conf+0x59/0xd0 [ibnbd_client]
[ 1064.257175] complete_rdma_req+0xf2/0x230 [ibtrs_client]
[ 1064.257340] ? ibtrs_post_recv_empty+0x4d/0x70 [ibtrs_core]
[ 1064.257502] ibtrs_clt_rdma_done+0xd1/0x1e0 [ibtrs_client]
[ 1064.257669] ib_create_qp+0x321/0x380 [ib_core]
[ 1064.257841] ib_process_cq_direct+0xbd/0x120 [ib_core]
[ 1064.258007] irq_poll_softirq+0xb7/0xe0
[ 1064.258165] __do_softirq+0x106/0x2a2
[ 1064.258328] irq_exit+0x92/0xa0
[ 1064.258509] do_IRQ+0x4a/0xd0
[ 1064.258660] common_interrupt+0x7a/0x7a
[ 1064.258818] </IRQ>
Meanwhile another context frees other queue but with the same set of
shared tags:
[ 1288.201183] INFO: task bash:5910 blocked for more than 180 seconds.
[ 1288.201833] bash D 0 5910 5820 0x00000000
[ 1288.202016] Call Trace:
[ 1288.202315] schedule+0x32/0x80
[ 1288.202462] schedule_timeout+0x1e5/0x380
[ 1288.203838] wait_for_completion+0xb0/0x120
[ 1288.204137] __wait_rcu_gp+0x125/0x160
[ 1288.204287] synchronize_sched+0x6e/0x80
[ 1288.204770] blk_mq_free_queue+0x74/0xe0
[ 1288.204922] blk_cleanup_queue+0xc7/0x110
[ 1288.205073] ibnbd_clt_unmap_device+0x1bc/0x280 [ibnbd_client]
[ 1288.205389] ibnbd_clt_unmap_dev_store+0x169/0x1f0 [ibnbd_client]
[ 1288.205548] kernfs_fop_write+0x109/0x180
[ 1288.206328] vfs_write+0xb3/0x1a0
[ 1288.206476] SyS_write+0x52/0xc0
[ 1288.206624] do_syscall_64+0x68/0x1d0
[ 1288.206774] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
What happened is the following:
1. There are several MQ queues with shared tags.
2. One queue is about to be freed and now task is in
blk_mq_del_queue_tag_set().
3. Other CPU is in blk_mq_sched_restart() and loops over all queues in
tag list in order to find hctx to restart.
Because linked list entry was modified in blk_mq_del_queue_tag_set()
without proper waiting for a grace period, blk_mq_sched_restart()
never ends, spining in list_for_each_entry_rcu_rr(), thus soft lockup.
Fix is simple: reinit list entry after an RCU grace period elapsed.
Fixes: Fixes:
|
||
---|---|---|
.. | ||
partitions | ||
badblocks.c | ||
bfq-cgroup.c | ||
bfq-iosched.c | ||
bfq-iosched.h | ||
bfq-wf2q.c | ||
bio-integrity.c | ||
bio.c | ||
blk-cgroup.c | ||
blk-core.c | ||
blk-exec.c | ||
blk-flush.c | ||
blk-integrity.c | ||
blk-ioc.c | ||
blk-lib.c | ||
blk-map.c | ||
blk-merge.c | ||
blk-mq-cpumap.c | ||
blk-mq-debugfs.c | ||
blk-mq-debugfs.h | ||
blk-mq-pci.c | ||
blk-mq-rdma.c | ||
blk-mq-sched.c | ||
blk-mq-sched.h | ||
blk-mq-sysfs.c | ||
blk-mq-tag.c | ||
blk-mq-tag.h | ||
blk-mq-virtio.c | ||
blk-mq.c | ||
blk-mq.h | ||
blk-settings.c | ||
blk-softirq.c | ||
blk-stat.c | ||
blk-stat.h | ||
blk-sysfs.c | ||
blk-tag.c | ||
blk-throttle.c | ||
blk-timeout.c | ||
blk-wbt.c | ||
blk-wbt.h | ||
blk-zoned.c | ||
blk.h | ||
bounce.c | ||
bsg-lib.c | ||
bsg.c | ||
cfq-iosched.c | ||
cmdline-parser.c | ||
compat_ioctl.c | ||
deadline-iosched.c | ||
elevator.c | ||
genhd.c | ||
ioctl.c | ||
ioprio.c | ||
Kconfig | ||
Kconfig.iosched | ||
kyber-iosched.c | ||
Makefile | ||
mq-deadline.c | ||
noop-iosched.c | ||
opal_proto.h | ||
partition-generic.c | ||
scsi_ioctl.c | ||
sed-opal.c | ||
t10-pi.c |