linux/fs/ocfs2/dlm
Joseph Qi 70e82a12db ocfs2: fix deadlock between o2hb thread and o2net_wq
The following case may lead to a deadlock between o2net_wq and the o2hb
thread on o2hb_callback_sem.
Suppose the cluster has two nodes, N1 and N2. N2 goes down and, at the
same time, N3 tries to join the cluster, so N1 handles the node down
(N2) and the join (N3) simultaneously.
    o2hb                               o2net_wq
    ->o2hb_do_disk_heartbeat
    ->o2hb_check_slot
    ->o2hb_run_event_list
    ->o2hb_fire_callbacks
    ->down_write(&o2hb_callback_sem)
    ->o2net_hb_node_down_cb
    ->flush_workqueue(o2net_wq)
                                       ->o2net_process_message
                                       ->dlm_query_join_handler
                                       ->o2hb_check_node_heartbeating
                                       ->o2hb_fill_node_map
                                       ->down_read(&o2hb_callback_sem)

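The o2hb thread holds o2hb_callback_sem for write while flush_workqueue()
waits for the o2net_wq work item, which in turn blocks in down_read() on
the same semaphore. There is no need to take o2hb_callback_sem in
dlm_query_join_handler; o2hb_live_lock is enough to protect the live
node map.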
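
As a rough illustration of the idea, here is a small user-space model in C. It is not kernel code: the lock types are single-threaded stand-ins, and every name in it (toy_rwsem, callback_sem, live_node_map, check_node_with_sem, check_node_no_sem) is hypothetical and chosen only to mirror the trace above. It shows that a lookup which needs the callback semaphore cannot proceed while a writer holds it, whereas a lookup that relies only on the live-map lock can.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_NODES 32u

/* Toy stand-in for an rw_semaphore: only records whether a writer holds it. */
struct toy_rwsem { bool writer_held; };

static struct toy_rwsem callback_sem;   /* models o2hb_callback_sem */
static bool live_lock_held;             /* models o2hb_live_lock */
static unsigned long live_node_map;     /* bit n set => node n is heartbeating */

/* True where a real down_read() would sleep behind a writer. */
static bool toy_down_read_would_block(const struct toy_rwsem *sem)
{
    return sem->writer_held;
}

/* Read one bit of the live map under the (toy) live lock only. */
static int read_live_bit(unsigned int node)
{
    int alive;

    assert(node < TOY_MAX_NODES);
    live_lock_held = true;              /* "spin_lock" */
    alive = (int)((live_node_map >> node) & 1UL);
    live_lock_held = false;             /* "spin_unlock" */
    return alive;
}

/* Old-style path: requires the callback semaphore for read.
 * Returns -1 where the real code would block (and here, deadlock). */
static int check_node_with_sem(unsigned int node)
{
    if (toy_down_read_would_block(&callback_sem))
        return -1;
    return read_live_bit(node);
}

/* Path in the spirit of the fix: the live lock alone protects the map. */
static int check_node_no_sem(unsigned int node)
{
    return read_live_bit(node);
}
```

With node 3 marked live and callback_sem.writer_held set (the o2hb thread inside its callbacks), check_node_with_sem(3) reports that it would block, while check_node_no_sem(3) still returns the node's state.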

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: jiangyiwen <jiangyiwen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:47 -04:00
Makefile ocfs2: remove versioning information 2014-01-21 16:19:41 -08:00
dlmapi.h ocfs2/trivial: Remove trailing whitespaces 2010-01-25 19:20:51 -08:00
dlmast.c ocfs2: use list_for_each_entry() instead of list_for_each() 2013-09-11 15:56:36 -07:00
dlmcommon.h ocfs2/dlm: do not purge lockres that is queued for assert master 2014-06-23 16:47:45 -07:00
dlmconvert.c ocfs2: use list_for_each_entry() instead of list_for_each() 2013-09-11 15:56:36 -07:00
dlmconvert.h
dlmdebug.c fs/ocfs2/dlm/dlmdebug.c: use seq_open_private() not seq_open() 2014-10-09 22:25:47 -04:00
dlmdebug.h ocfs2/dlm: Cleanup dlmdebug.c 2010-12-22 18:34:44 -08:00
dlmdomain.c ocfs2: fix deadlock between o2hb thread and o2net_wq 2014-10-09 22:25:47 -04:00
dlmdomain.h
dlmlock.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
dlmmaster.c ocfs2: remove unused code in dlm_new_lockres() 2014-10-09 22:25:47 -04:00
dlmrecovery.c ocfs2/dlm: call dlm_lockres_put without resource spinlock 2014-10-09 22:25:47 -04:00
dlmthread.c ocfs2/dlm: do not purge lockres that is queued for assert master 2014-06-23 16:47:45 -07:00
dlmunlock.c ocfs2: fix deadlock when two nodes are converting same lock from PR to EX and idletimeout closes conn 2014-06-23 16:47:45 -07:00