linux/fs/xfs
Dave Chinner 7271d243f9 xfs: don't serialise adjacent concurrent direct IO appending writes
For append write workloads, extending the file requires a certain
amount of exclusive locking to be done up front to ensure sanity in
things like ensuring that we've zeroed any allocated regions
between the old EOF and the start of the new IO.

For single threads, this typically isn't a problem, and for large
IOs we don't serialise enough for it to be a problem for two
threads on really fast block devices. However for smaller IO and
larger thread counts we have a problem.

Take 4 concurrent sequential, single block sized and aligned IOs.
After the first IO is submitted but before it completes, we end up
with this state:

        IO 1    IO 2    IO 3    IO 4
      +-------+-------+-------+-------+
      ^       ^
      |       |
      |       |
      |       |
      |       \- ip->i_new_size
      \- ip->i_size

And the IO is done without exclusive locking because offset <=
ip->i_size. When we submit IO 2, we see offset > ip->i_size, and
grab the IO lock exclusive, because there is a chance we need to do
EOF zeroing. However, there is already an IO in progress that avoids
the need for IO zeroing because offset <= ip->i_new_size. hence we
could avoid holding the IO lock exlcusive for this. Hence after
submission of the second IO, we'd end up this state:

        IO 1    IO 2    IO 3    IO 4
      +-------+-------+-------+-------+
      ^               ^
      |               |
      |               |
      |               |
      |               \- ip->i_new_size
      \- ip->i_size

There is no need to grab the i_mutex of the IO lock in exclusive
mode if we don't need to invalidate the page cache. Taking these
locks on every direct IO effective serialises them as taking the IO
lock in exclusive mode has to wait for all shared holders to drop
the lock. That only happens when IO is complete, so effective it
prevents dispatch of concurrent direct IO writes to the same inode.

And so you can see that for the third concurrent IO, we'd avoid
exclusive locking for the same reason we avoided the exclusive lock
for the second IO.

Fixing this is a bit more complex than that, because we need to hold
a write-submission local value of ip->i_new_size to that clearing
the value is only done if no other thread has updated it before our
IO completes.....

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
2011-10-11 21:14:59 -05:00
..
Kconfig quota: Make QUOTACTL config be selected by its users 2010-10-05 12:16:37 +02:00
Makefile xfs: fix tracing builds inside the source tree 2011-08-22 16:37:24 -05:00
kmem.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
kmem.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
mrlock.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
time.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
uuid.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
uuid.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs.h xfs: don't expect xfs headers to be in subdirectories 2011-08-12 13:57:55 -05:00
xfs_acl.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_acl.h xfs: Fix build breakage in xfs_iops.c when CONFIG_FS_POSIX_ACL is not set 2011-08-01 02:35:04 -04:00
xfs_ag.h xfs: Remove the macro XFS_BUF_PTR 2011-07-25 15:03:13 -05:00
xfs_alloc.c xfs: Remove the macro XFS_BUF_ERROR and family 2011-07-25 14:57:46 -05:00
xfs_alloc.h xfs: do not discard alloc btree blocks 2011-05-24 11:17:22 -05:00
xfs_alloc_btree.c xfs: remove leftovers of the old btree tracing code 2011-07-13 13:43:50 +02:00
xfs_alloc_btree.h
xfs_aops.c xfs: fix a use after free in xfs_end_io_direct_write 2011-09-14 08:56:35 -05:00
xfs_aops.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_attr.c xfs: Remove the macro XFS_BUF_ERROR and family 2011-07-25 14:57:46 -05:00
xfs_attr.h xfs: convert attr to use unsigned names 2010-01-20 10:47:48 +11:00
xfs_attr_leaf.c xfs: byteswap constants instead of variables 2011-07-08 14:36:05 +02:00
xfs_attr_leaf.h [XFS] Remove macro-to-function indirections in attr code 2009-01-09 15:46:44 +11:00
xfs_attr_sf.h xfs: convert attr to use unsigned names 2010-01-20 10:47:48 +11:00
xfs_bit.c
xfs_bit.h [XFS] Remove macro-to-function indirections in the mask code 2009-01-09 15:53:54 +11:00
xfs_bmap.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux 2011-08-08 07:06:24 -05:00
xfs_bmap.h xfs: remove the unused XFS_BMAPI_RSVBLOCKS flag 2011-05-25 10:48:36 -05:00
xfs_bmap_btree.c xfs: remove leftovers of the old btree tracing code 2011-07-13 13:43:50 +02:00
xfs_bmap_btree.h xfs: make several more functions static 2010-01-15 15:31:38 -06:00
xfs_btree.c xfs: Remove the macro XFS_BUF_ERROR and family 2011-07-25 14:57:46 -05:00
xfs_btree.h xfs: Remove the macro XFS_BUF_PTR 2011-07-25 15:03:13 -05:00
xfs_buf.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_buf.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_buf_item.c "xfs: fix error handling for synchronous writes" revisited 2011-08-10 17:00:21 -05:00
xfs_buf_item.h xfs: use struct list_head for the buf cancel table 2010-12-16 16:05:22 -06:00
xfs_da_btree.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux 2011-08-08 07:06:24 -05:00
xfs_da_btree.h xfs: remove the dead XFS_DABUF_DEBUG code 2011-07-13 13:43:50 +02:00
xfs_dfrag.c xfs: cleanup duplicate initializations 2011-04-28 13:25:29 -05:00
xfs_dfrag.h xfs: clean up inconsistent variable naming in xfs_swap_extent 2010-01-15 15:31:23 -06:00
xfs_dinode.h xfs: Remove the macro XFS_BUF_PTR 2011-07-25 15:03:13 -05:00
xfs_dir2.c xfs: get rid of open-coded S_ISREG(), etc. 2011-07-26 15:05:16 -04:00
xfs_dir2.h xfs: reshuffle dir2 headers 2011-07-13 13:43:48 +02:00
xfs_dir2_block.c xfs: reshuffle dir2 headers 2011-07-13 13:43:48 +02:00
xfs_dir2_data.c xfs: reshuffle dir2 headers 2011-07-13 13:43:48 +02:00
xfs_dir2_format.h xfs: cleanup struct xfs_dir2_free 2011-07-13 13:43:48 +02:00
xfs_dir2_leaf.c xfs: factor out xfs_dir2_leaf_find_stale 2011-07-13 13:43:48 +02:00
xfs_dir2_node.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-07-25 13:56:39 -07:00
xfs_dir2_priv.h xfs: reshuffle dir2 headers 2011-07-13 13:43:48 +02:00
xfs_dir2_sf.c xfs: reshuffle dir2 headers 2011-07-13 13:43:48 +02:00
xfs_discard.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_discard.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_dquot.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_dquot.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_dquot_item.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_dquot_item.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_error.c xfs: Convert remaining cmn_err() callers to new API 2011-03-07 10:08:35 +11:00
xfs_error.h xfs: kill support/debug.[ch] 2011-03-07 10:09:35 +11:00
xfs_export.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_export.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_extfree_item.c xfs: fix efi item leak on forced shutdown 2011-01-28 09:01:33 -06:00
xfs_extfree_item.h xfs: Pull EFI/EFD handling out from under the AIL lock 2010-12-20 11:59:49 +11:00
xfs_file.c xfs: don't serialise adjacent concurrent direct IO appending writes 2011-10-11 21:14:59 -05:00
xfs_filestream.c xfs: fix misspelled S_IS...() 2011-07-26 15:05:30 -04:00
xfs_filestream.h xfs: clean up filestreams helpers 2010-07-26 13:16:51 -05:00
xfs_fs.h xfs: consolidate & clarify mount sanity checks 2011-07-08 11:32:51 -05:00
xfs_fs_subr.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_fsops.c Revert "xfs: fix filesystsem freeze race in xfs_trans_alloc" 2011-07-11 10:21:03 -05:00
xfs_fsops.h xfs: ensure log covering transactions are synchronous 2011-01-11 20:28:17 -06:00
xfs_globals.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_ialloc.c xfs: Remove the macro XFS_BUF_ERROR and family 2011-07-25 14:57:46 -05:00
xfs_ialloc.h xfs: rationalize xfs_inobt_lookup* 2009-09-01 12:45:39 -05:00
xfs_ialloc_btree.c xfs: remove leftovers of the old btree tracing code 2011-07-13 13:43:50 +02:00
xfs_ialloc_btree.h xfs: remove superflous inobt macros 2009-02-09 08:37:14 +01:00
xfs_iget.c xfs: remove leftovers of the old btree tracing code 2011-07-13 13:43:50 +02:00
xfs_inode.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux 2011-08-08 07:06:24 -05:00
xfs_inode.h xfs: get rid of open-coded S_ISREG(), etc. 2011-07-26 15:05:16 -04:00
xfs_inode_item.c xfs: remove wrappers around b_fspriv 2011-07-13 13:43:49 +02:00
xfs_inode_item.h xfs: simplify inode to transaction joining 2010-07-26 13:16:36 -05:00
xfs_inum.h xfs: cleanup shortform directory inode number handling 2011-07-08 14:35:03 +02:00
xfs_ioctl.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_ioctl.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_ioctl32.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_ioctl32.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_iomap.c Revert "xfs: fix filesystsem freeze race in xfs_trans_alloc" 2011-07-11 10:21:03 -05:00
xfs_iomap.h xfs: kill xfs_iomap 2010-12-16 16:05:51 -06:00
xfs_iops.c xfs: fix xfs_mark_inode_dirty during umount 2011-08-31 17:59:39 -05:00
xfs_iops.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_itable.c xfs: fix variable set but not used warnings 2011-04-08 08:09:12 -05:00
xfs_itable.h xfs: remove block number from inode lookup code 2010-06-24 11:35:17 +10:00
xfs_linux.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_log.c xfs: Remove the macro XFS_BUF_SET_PTR 2011-07-25 15:03:17 -05:00
xfs_log.h xfs: exact busy extent tracking 2011-04-28 13:18:04 -05:00
xfs_log_cil.c xfs: add online discard support 2011-05-24 11:17:13 -05:00
xfs_log_priv.h xfs: exact busy extent tracking 2011-04-28 13:18:04 -05:00
xfs_log_recover.c xfs: replace xfs_buf_geterror() with bp->b_error 2011-08-12 13:39:40 -05:00
xfs_log_recover.h xfs: Clean up XFS_BLI_* flag namespace 2010-05-24 10:33:39 -05:00
xfs_message.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_message.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_mount.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux 2011-08-08 07:06:24 -05:00
xfs_mount.h xfs: Remove the second parameter to xfs_sb_count() 2011-07-20 18:35:03 -05:00
xfs_mru_cache.c xfs: convert to alloc_workqueue() 2011-02-01 11:42:43 +01:00
xfs_mru_cache.h xfs: Kill filestreams cache flush 2010-01-15 15:34:22 -06:00
xfs_qm.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_qm.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_qm_bhv.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_qm_stats.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_qm_stats.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_qm_syscalls.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_quota.h xfs: Convert xlog_warn to new logging interface 2011-03-07 10:01:35 +11:00
xfs_quota_priv.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_quotaops.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_rename.c xfs: get rid of open-coded S_ISREG(), etc. 2011-07-26 15:05:16 -04:00
xfs_rtalloc.c xfs: Remove the macro XFS_BUF_PTR 2011-07-25 15:03:13 -05:00
xfs_rtalloc.h xfs: Remove the macro XFS_BUF_PTR 2011-07-25 15:03:13 -05:00
xfs_rw.c xfs: Remove the macro XFS_BUFTARG_NAME 2011-07-25 15:03:37 -05:00
xfs_rw.h xfs: only clear the suid bit once in xfs_write 2010-02-12 13:43:57 -06:00
xfs_sb.h xfs: Remove the macro XFS_BUF_PTR 2011-07-25 15:03:13 -05:00
xfs_stats.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_stats.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_super.c xfs: fix ->write_inode return values 2011-09-01 09:46:11 -05:00
xfs_super.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_sync.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_sync.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_sysctl.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_sysctl.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_trace.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_trace.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_trans.c xfs: use a cursor for bulk AIL insertion 2011-07-20 18:37:20 -05:00
xfs_trans.h Revert "xfs: fix filesystsem freeze race in xfs_trans_alloc" 2011-07-11 10:21:03 -05:00
xfs_trans_ail.c xfs: set cursor in xfs_ail_splice() even when AIL was empty 2011-08-09 15:30:43 -05:00
xfs_trans_buf.c xfs: Remove the macro XFS_BUF_TARGET 2011-07-25 15:03:31 -05:00
xfs_trans_dquot.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_trans_extfree.c xfs: Pull EFI/EFD handling out from under the AIL lock 2010-12-20 11:59:49 +11:00
xfs_trans_inode.c xfs: remove i_transp 2011-07-08 14:34:47 +02:00
xfs_trans_priv.h xfs: convert AIL cursors to use struct list_head 2011-07-20 18:37:46 -05:00
xfs_trans_space.h xfs: remove superflous inobt macros 2009-02-09 08:37:14 +01:00
xfs_types.h xfs: exact busy extent tracking 2011-04-28 13:18:04 -05:00
xfs_utils.c xfs: remove xfs_cred.h 2010-10-18 15:08:06 -05:00
xfs_utils.h xfs: remove xfs_cred.h 2010-10-18 15:08:06 -05:00
xfs_vnode.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_vnodeops.c xfs: replace xfs_buf_geterror() with bp->b_error 2011-08-12 13:39:40 -05:00
xfs_vnodeops.h xfs: split xfs_setattr 2011-07-08 14:34:23 +02:00
xfs_xattr.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00