linux/fs/jbd
Neil Brown 28ae094c62 ext3 can fail badly when device stops accepting BIO_RW_BARRIER requests
Some devices - notably dm and md - can change their behaviour in response
to BIO_RW_BARRIER requests.  They might start out accepting such requests
but, after reconfiguration, find that they no longer can.

ext3 (and other filesystems) deal with this by always testing if
BIO_RW_BARRIER requests fail with EOPNOTSUPP, and retrying the write
requests without the barrier (probably after waiting for any pending writes
to complete).
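
The retry described above can be sketched in a small userspace model.  This
is only an illustration: model_dev, model_submit and
model_write_with_fallback are hypothetical names, not kernel interfaces, and
the step of waiting for pending writes is reduced to a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model: it only tracks whether barriers are accepted. */
struct model_dev {
	bool accepts_barriers;
	int writes_completed;
};

/* Submit one write; a barrier write fails if the device rejects barriers. */
static int model_submit(struct model_dev *dev, bool barrier)
{
	assert(dev != NULL);
	if (barrier && !dev->accepts_barriers)
		return -EOPNOTSUPP;
	dev->writes_completed++;
	return 0;
}

/* Try a barrier write first, then fall back to a plain write. */
static int model_write_with_fallback(struct model_dev *dev, bool *used_barrier)
{
	int ret = model_submit(dev, true);

	*used_barrier = (ret == 0);
	if (ret == -EOPNOTSUPP) {
		/* A real filesystem would wait for pending writes here. */
		ret = model_submit(dev, false);
	}
	return ret;
}
```

In this model the barrier is a property of a single request, so nothing is
left behind once the request completes.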

However, there is a bug in ext3's handling of this.

When ext3 (jbd actually) decides to submit a BIO_RW_BARRIER request, it
sets the buffer_ordered flag on the buffer head.  If the request completes
successfully, the flag STAYS SET.

Other code might then write the same buffer_head after the device has been
reconfigured to not accept barriers.  This write will then fail, but the
"other code" is not ready to handle EOPNOTSUPP errors and the error will be
treated as fatal.

This can be seen without having to reconfigure a device at exactly the
wrong time by putting:

		if (buffer_ordered(bh))
			printk("OH DEAR, an ordered buffer\n");

in the while loop in "commit phase 5" of journal_commit_transaction.

If it ever prints the "OH DEAR ..." message (as it does sometimes for
me), then that request could (in different circumstances) have failed
with EOPNOTSUPP, but that isn't tested for.

My proposed fix is to clear the buffer_ordered flag after it has been
used, as in the following patch.
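
The effect of clearing the flag can be illustrated with a userspace sketch.
It is not the patch itself: model_bh stands in for a buffer head carrying
the buffer_ordered flag, and every name here (model_dev, model_bh,
model_submit_bh, model_commit_write, model_later_write) is hypothetical.
The clear_after_use parameter exists only to compare the two behaviours.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device and for a buffer head. */
struct model_dev {
	bool accepts_barriers;
};

struct model_bh {
	bool ordered;	/* models the buffer_ordered flag */
};

/* A write of an "ordered" buffer is a barrier request in this model. */
static int model_submit_bh(const struct model_dev *dev,
			   const struct model_bh *bh)
{
	assert(dev != NULL && bh != NULL);
	return (bh->ordered && !dev->accepts_barriers) ? -EOPNOTSUPP : 0;
}

/* Commit-path write: sets the flag, retries without it on EOPNOTSUPP. */
static int model_commit_write(const struct model_dev *dev,
			      struct model_bh *bh, bool clear_after_use)
{
	int ret;

	bh->ordered = true;
	ret = model_submit_bh(dev, bh);
	if (ret == -EOPNOTSUPP) {
		bh->ordered = false;
		return model_submit_bh(dev, bh);
	}
	if (clear_after_use)
		bh->ordered = false;	/* the idea behind the proposed fix */
	return ret;
}

/* "Other code": writes the same buffer and has no EOPNOTSUPP handling. */
static int model_later_write(const struct model_dev *dev,
			     const struct model_bh *bh)
{
	return model_submit_bh(dev, bh);
}
```

With clear_after_use false, a successful commit write leaves the flag set,
so a later write fails with -EOPNOTSUPP once the device stops accepting
barriers.  With clear_after_use true, the same later write succeeds.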

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:44 -08:00
..
Makefile Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
checkpoint.c spinlock: lockbreak cleanup 2008-01-30 13:31:20 +01:00
commit.c ext3 can fail badly when device stops accepting BIO_RW_BARRIER requests 2008-02-08 09:22:44 -08:00
journal.c make jbd/journal.c:__journal_abort_hard() static 2008-02-06 10:41:20 -08:00
recovery.c BKL-removal: remove incorrect comment refering to lock_kernel() from jbd/jbd2 2008-02-06 10:41:20 -08:00
revoke.c Group short-lived and reclaimable kernel allocations 2007-10-16 09:43:00 -07:00
transaction.c jbd: do not try lock_acquire after handle made invalid 2008-01-17 15:38:59 -08:00