linux/fs/ubifs
hujianyang 88785b28eb UBIFS: Remove incorrect assertion in shrink_tnc()
commit 72abc8f4b4 upstream.

I hit the same assert failed as Dolev Raviv reported in Kernel v3.10
shows like this:

[ 9641.164028] UBIFS assert failed in shrink_tnc at 131 (pid 13297)
[ 9641.234078] CPU: 1 PID: 13297 Comm: mmap.test Tainted: G           O 3.10.40 #1
[ 9641.234116] [<c0011a6c>] (unwind_backtrace+0x0/0x12c) from [<c000d0b0>] (show_stack+0x20/0x24)
[ 9641.234137] [<c000d0b0>] (show_stack+0x20/0x24) from [<c0311134>] (dump_stack+0x20/0x28)
[ 9641.234188] [<c0311134>] (dump_stack+0x20/0x28) from [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs])
[ 9641.234265] [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) from [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs])
[ 9641.234307] [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) from [<c00cdad8>] (shrink_slab+0x1d4/0x2f8)
[ 9641.234327] [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) from [<c00d03d0>] (do_try_to_free_pages+0x300/0x544)
[ 9641.234344] [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) from [<c00d0a44>] (try_to_free_pages+0x2d0/0x398)
[ 9641.234363] [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) from [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8)
[ 9641.234382] [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) from [<c00f62d8>] (new_slab+0x78/0x238)
[ 9641.234400] [<c00f62d8>] (new_slab+0x78/0x238) from [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c)
[ 9641.234419] [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) from [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188)
[ 9641.234459] [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) from [<bf227908>] (do_readpage+0x168/0x468 [ubifs])
[ 9641.234553] [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) from [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs])
[ 9641.234606] [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) from [<c00c17c0>] (filemap_fault+0x304/0x418)
[ 9641.234638] [<c00c17c0>] (filemap_fault+0x304/0x418) from [<c00de694>] (__do_fault+0xd4/0x530)
[ 9641.234665] [<c00de694>] (__do_fault+0xd4/0x530) from [<c00e10c0>] (handle_pte_fault+0x480/0xf54)
[ 9641.234690] [<c00e10c0>] (handle_pte_fault+0x480/0xf54) from [<c00e2bf8>] (handle_mm_fault+0x140/0x184)
[ 9641.234716] [<c00e2bf8>] (handle_mm_fault+0x140/0x184) from [<c0316688>] (do_page_fault+0x150/0x3ac)
[ 9641.234737] [<c0316688>] (do_page_fault+0x150/0x3ac) from [<c000842c>] (do_DataAbort+0x3c/0xa0)
[ 9641.234759] [<c000842c>] (do_DataAbort+0x3c/0xa0) from [<c0314e38>] (__dabt_usr+0x38/0x40)

After analyzing the code, I found a condition that may cause this failed
in correct operations. Thus, I think this assertion is wrong and should be
removed.

Suppose there are two clean znodes and one dirty znode in TNC. So the
per-filesystem atomic_t @clean_zn_cnt is (2). If commit start, dirty_znode
is set to COW_ZNODE in get_znodes_to_commit() in case of potentially ops
on this znode. We clear COW bit and DIRTY bit in write_index() without
@tnc_mutex locked. We don't increase @clean_zn_cnt in this place. As the
comments in write_index() shows, if another process hold @tnc_mutex and
dirty this znode after we clean it, @clean_zn_cnt would be decreased to (1).
We will increase @clean_zn_cnt to (2) with @tnc_mutex locked in
free_obsolete_znodes() to keep it right.

If shrink_tnc() performs between decrease and increase, it will release
other 2 clean znodes it holds and found @clean_zn_cnt is less than zero
(1 - 2 = -1), then hit the assertion. Because free_obsolete_znodes() will
soon correct @clean_zn_cnt and no harm to fs in this case, I think this
assertion could be removed.

2 clean zondes and 1 dirty znode, @clean_zn_cnt == 2

Thread A (commit)         Thread B (write or others)       Thread C (shrinker)
->write_index
   ->clear_bit(DIRTY_NODE)
   ->clear_bit(COW_ZNODE)

            @clean_zn_cnt == 2
                          ->mutex_locked(&tnc_mutex)
                          ->dirty_cow_znode
                              ->!ubifs_zn_cow(znode)
                              ->!test_and_set_bit(DIRTY_NODE)
                              ->atomic_dec(&clean_zn_cnt)
                          ->mutex_unlocked(&tnc_mutex)

            @clean_zn_cnt == 1
                                                           ->mutex_locked(&tnc_mutex)
                                                           ->shrink_tnc
                                                             ->destroy_tnc_subtree
                                                             ->atomic_sub(&clean_zn_cnt, 2)
                                                             ->ubifs_assert  <- hit
                                                           ->mutex_unlocked(&tnc_mutex)

            @clean_zn_cnt == -1
->mutex_lock(&tnc_mutex)
->free_obsolete_znodes
   ->atomic_inc(&clean_zn_cnt)
->mutux_unlock(&tnc_mutex)

            @clean_zn_cnt == 0 (correct after shrink)

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-06 18:57:26 -07:00
..
Kconfig UBIFS: remove Kconfig debugging option 2012-05-16 19:53:46 +03:00
Makefile UBIFS: remove Kconfig debugging option 2012-05-16 19:53:46 +03:00
budget.c No big changes for 3.7 in UBIFS: 2012-10-02 20:47:48 -07:00
commit.c UBIFS: print less 2012-08-31 17:32:58 +03:00
compress.c UBIFS: comply with coding style 2012-08-31 17:32:57 +03:00
debug.c fs/ubifs: use rbtree postorder iteration helper instead of opencoding 2014-01-23 16:37:03 -08:00
debug.h UBIFS: print less 2012-08-31 17:32:58 +03:00
dir.c ubifs: switch to %pd 2013-10-24 23:34:51 -04:00
file.c UBIFS: fix an mmap and fsync race condition 2014-07-06 18:57:26 -07:00
find.c UBIFS: fix mounting problems after power cuts 2012-10-26 16:26:44 +03:00
gc.c UBIFS: remove unnecessary code in ubifs_garbage_collect 2013-10-22 13:34:27 +01:00
io.c UBI: Kill data type hint 2012-05-20 20:25:59 +03:00
ioctl.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
journal.c ubifs: switch to %pd 2013-10-24 23:34:51 -04:00
key.h
log.c fs/ubifs: use rbtree postorder iteration helper instead of opencoding 2014-01-23 16:37:03 -08:00
lprops.c UBIFS: introduce categorized lprops counter 2012-10-26 16:00:26 +03:00
lpt.c UBIFS: print less 2012-08-31 17:32:58 +03:00
lpt_commit.c UBIFS: rename random32() to prandom_u32() 2013-01-15 15:45:27 +02:00
master.c UBI: Kill data type hint 2012-05-20 20:25:59 +03:00
misc.h
orphan.c fs/ubifs: use rbtree postorder iteration helper instead of opencoding 2014-01-23 16:37:03 -08:00
recovery.c fs/ubifs: use rbtree postorder iteration helper instead of opencoding 2014-01-23 16:37:03 -08:00
replay.c UBIFS: print less 2012-08-31 17:32:58 +03:00
sb.c No big changes for 3.7 in UBIFS: 2012-10-02 20:47:48 -07:00
scan.c UBIFS: comply with coding style 2012-08-31 17:32:57 +03:00
shrinker.c UBIFS: Remove incorrect assertion in shrink_tnc() 2014-07-06 18:57:26 -07:00
super.c fs/ubifs: use rbtree postorder iteration helper instead of opencoding 2014-01-23 16:37:03 -08:00
tnc.c fs/ubifs: use rbtree postorder iteration helper instead of opencoding 2014-01-23 16:37:03 -08:00
tnc_commit.c UBIFS: rename random32() to prandom_u32() 2013-01-15 15:45:27 +02:00
tnc_misc.c UBIFS: print less 2012-08-31 17:32:58 +03:00
ubifs-media.h
ubifs.h fs: convert fs shrinkers to new scan/count API 2013-09-10 18:56:31 -04:00
xattr.c ubifs: switch to %pd 2013-10-24 23:34:51 -04:00