linux/fs/ocfs2/dlm
Xue jiufei 6718cb5e0e ocfs2/dlm: fix possible convert=sion deadlock
We found there is a conversion deadlock when the owner of lockres
happened to crash before send DLM_PROXY_AST_MSG for a downconverting
lock.  The situation is as follows:

Node1                            Node2                  Node3
                           the owner of lockresA
lock_1 granted at EX mode
and call ocfs2_cluster_unlock
to decrease ex_holders.
                                                 converting lock_3 from
                                                 NL to EX
                           send DLM_PROXY_AST_MSG
                           to Node1, asking Node 1
                           to downconvert.
receiving DLM_PROXY_AST_MSG,
thread ocfs2dc send
DLM_CONVERT_LOCK_MSG
to Node2 to downconvert
lock_1(EX->NL).
                           lock_1 can be granted and
                           put it into pending_asts
                           list, return DLM_NORMAL.
                           then something happened
                           and Node2 crashed.
received DLM_NORMAL, waiting
for DLM_PROXY_AST_MSG.
                                               selected as the recovery
                                               master, receving migrate
                                               lock from Node1, queue
                                               lock_1 to the tail of
                                               converting list.

After dlm recovery, converting list in the master of lockresA(Node3)
will be: converting list head <-> lock_3(NL->EX) <->lock_1(EX<->NL).
Requested mode of lock_3 is not compatible with the granted mode of
lock_1, so it can not be granted.  and lock_1 can not downconvert
because covnerting queue is strictly FIFO.  So a deadlock is created.
We think function dlm_process_recovery_data() should queue_ast for
lock_1 or alter the order of lock_1 and lock_3, so dlm_thread can
process lock_1 first.  And if there are multiple downconverting locks,
they must convert form PR to NL, so no need to sort them.

Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:54 -07:00
..
Makefile ocfs2: remove versioning information 2014-01-21 16:19:41 -08:00
dlmapi.h
dlmast.c ocfs2: use list_for_each_entry() instead of list_for_each() 2013-09-11 15:56:36 -07:00
dlmcommon.h ocfs2: use list_for_each_entry() instead of list_for_each() 2013-09-11 15:56:36 -07:00
dlmconvert.c ocfs2: use list_for_each_entry() instead of list_for_each() 2013-09-11 15:56:36 -07:00
dlmconvert.h
dlmdebug.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
dlmdebug.h
dlmdomain.c ocfs2: fix deadlock risk when kmalloc failed in dlm_query_region_handler 2014-04-03 16:20:55 -07:00
dlmdomain.h
dlmlock.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
dlmmaster.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
dlmrecovery.c ocfs2/dlm: fix possible convert=sion deadlock 2014-06-04 16:53:54 -07:00
dlmthread.c ocfs2: use list_for_each_entry() instead of list_for_each() 2013-09-11 15:56:36 -07:00
dlmunlock.c ocfs2: use list_for_each_entry() instead of list_for_each() 2013-09-11 15:56:36 -07:00