linux/fs/btrfs
Filipe Manana 44f714dae5 Btrfs: improve performance on fsync against new inode after rename/unlink
With commit 56f23fdbb6 ("Btrfs: fix file/data loss caused by fsync after
rename and new inode") we got simple fix for a functional issue when the
following sequence of actions is done:

  at transaction N
  create file A at directory D
  at transaction N + M (where M >= 1)
  move/rename existing file A from directory D to directory E
  create a new file named A at directory D
  fsync the new file
  power fail

The solution was to simply detect such scenario and fallback to a full
transaction commit when we detect it. However this turned out to had a
significant impact on throughput (and a bit on latency too) for benchmarks
using the dbench tool, which simulates real workloads from smbd (Samba)
servers. For example on a test vm (with a debug kernel):

Unpatched:
Throughput 19.1572 MB/sec  32 clients  32 procs  max_latency=1005.229 ms

Patched:
Throughput 23.7015 MB/sec  32 clients  32 procs  max_latency=809.206 ms

The patched results (this patch is applied) are similar to the results of
a kernel with the commit 56f23fdbb6 ("Btrfs: fix file/data loss caused
by fsync after rename and new inode") reverted.

This change avoids the fallback to a transaction commit and instead makes
sure all the names of the conflicting inode (the one that had a name in a
past transaction that matches the name of the new file in the same parent
directory) are logged so that at log replay time we don't lose neither the
new file nor the old file, and the old file gets the name it was renamed
to.

This also ends up avoiding a full transaction commit for a similar case
that involves an unlink instead of a rename of the old file:

  at transaction N
  create file A at directory D
  at transaction N + M (where M >= 1)
  remove file A
  create a new file named A at directory D
  fsync the new file
  power fail

Signed-off-by: Filipe Manana <fdmanana@suse.com>
2016-08-01 07:32:14 +01:00
..
tests Btrfs: fix error return code in btrfs_init_test_fs() 2016-06-23 10:44:39 -07:00
acl.c posix_acl: Inode acl caching fixes 2016-03-31 00:30:15 -04:00
async-thread.c btrfs: async-thread: Fix a use-after-free error for trace 2016-01-25 16:50:26 -08:00
async-thread.h btrfs: async_thread: Fix workqueue 'max_active' value when initializing 2015-08-31 11:46:40 -07:00
backref.c Merge branch 'cleanups-4.7' into for-chris-4.7-20160525 2016-05-25 22:51:03 +02:00
backref.h
btrfs_inode.h Merge branch 'cleanups-4.7' into for-chris-4.7-20160525 2016-05-25 22:51:03 +02:00
check-integrity.c btrfs: Use correct format specifier 2016-06-17 18:32:40 +02:00
check-integrity.h
compression.c btrfs: make find_workspace warn if there are no workspaces 2016-05-10 09:46:16 +02:00
compression.h btrfs: move btrfs_compression_type to compression.h 2016-03-11 17:12:46 +01:00
ctree.c Btrfs: fix error handling in map_private_extent_buffer 2016-06-23 10:44:40 -07:00
ctree.h Btrfs: add tracepoints for flush events 2016-07-07 18:45:53 +02:00
delayed-inode.c Btrfs: change delayed reservation fallback behavior 2016-07-07 18:45:53 +02:00
delayed-inode.h Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes 2016-06-25 06:20:10 -07:00
delayed-ref.c btrfs: drop null testing before destroy functions 2016-02-18 11:46:03 +01:00
delayed-ref.h btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
dev-replace.c Merge branch 'cleanups-4.7' into for-chris-4.7-20160525 2016-05-25 22:51:03 +02:00
dev-replace.h btrfs: refactor btrfs_dev_replace_start for reuse 2016-04-28 10:59:13 +02:00
dir-item.c
disk-io.c Btrfs: Force stripesize to the value of sectorsize 2016-06-23 10:44:42 -07:00
disk-io.h Btrfs: self-tests: Support non-4k page size 2016-06-02 19:23:14 +02:00
export.c BTRFS: support NFSv2 export 2015-10-06 06:55:23 -07:00
export.h
extent_io.c Btrfs: fix error handling in map_private_extent_buffer 2016-06-23 10:44:40 -07:00
extent_io.h Btrfs: self-tests: Support non-4k page size 2016-06-02 19:23:14 +02:00
extent_map.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
extent_map.h btrfs: cleanup, stop casting for extent_map->lookup everywhere 2016-01-15 19:22:28 +01:00
extent-tree.c Btrfs: avoid deadlocks during reservations in btrfs_truncate_block 2016-07-20 16:58:04 -07:00
file-item.c btrfs: sink gfp parameter to set_extent_bits 2016-04-29 11:01:47 +02:00
file.c Btrfs: add missing check for writeback errors on fsync 2016-08-01 07:21:13 +01:00
free-space-cache.c Btrfs: self-tests: Support non-4k page size 2016-06-02 19:23:14 +02:00
free-space-cache.h btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
free-space-tree.c Revert "btrfs: synchronize incompat feature bits with sysfs files" 2016-01-29 08:19:37 -08:00
free-space-tree.h Btrfs: implement the free space B-tree 2015-12-17 12:16:47 -08:00
hash.c btrfs: advertise which crc32c implementation is being used at module load 2016-06-06 14:08:28 +02:00
hash.h btrfs: advertise which crc32c implementation is being used at module load 2016-06-06 14:08:28 +02:00
inode-item.c btrfs: rename btrfs_std_error to btrfs_handle_fs_error 2016-04-28 10:36:54 +02:00
inode-map.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
inode-map.h Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots 2016-01-15 19:25:02 +01:00
inode.c Btrfs: improve performance on fsync against new inode after rename/unlink 2016-08-01 07:32:14 +01:00
ioctl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-27 17:14:05 -07:00
Kconfig
locking.c btrfs: cleanup, remove stray return statements 2016-01-07 14:30:52 +01:00
locking.h
lzo.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
Makefile Btrfs: add free space tree sanity tests 2015-12-17 12:16:47 -08:00
math.h
ordered-data.c btrfs: fix disk_i_size update bug when fallocate() fails 2016-06-23 10:44:41 -07:00
ordered-data.h Btrfs: fix race setting block group readonly during device replace 2016-05-30 12:58:21 +01:00
orphan.c
print-tree.c btrfs: teach print_leaf about temporary item subtypes 2016-02-11 16:15:43 +01:00
print-tree.h
props.c btrfs: move btrfs_compression_type to compression.h 2016-03-11 17:12:46 +01:00
props.h
qgroup.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
qgroup.h btrfs: qgroup: Check if qgroup reserved space leaked 2015-10-21 18:41:10 -07:00
raid56.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
raid56.h Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation 2015-08-09 07:34:26 -07:00
rcu-string.h
reada.c Btrfs: fix race between readahead and device replace/removal 2016-05-30 12:58:18 +01:00
relocation.c Btrfs: use FLUSH_LIMIT for relocation in reserve_metadata_bytes 2016-07-07 18:45:53 +02:00
root-tree.c Merge branch 'cleanups-4.7' into for-chris-4.7-20160525 2016-05-25 22:51:03 +02:00
scrub.c Btrfs: fix race setting block group back to RW mode during device replace 2016-05-30 12:58:24 +01:00
send.c Btrfs: send, don't bug on inconsistent snapshots 2016-08-01 07:31:41 +01:00
send.h Btrfs: use linux/sizes.h to represent constants 2016-01-07 14:38:02 +01:00
struct-funcs.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
super.c btrfs: avoid blocking open_ctree from cleaner_kthread 2016-06-17 18:32:40 +02:00
sysfs.c btrfs: sysfs: protect reading label by lock 2016-05-06 15:22:49 +02:00
sysfs.h btrfs: sysfs: introduce helper for syncing bits with sysfs files 2016-01-21 18:50:40 +01:00
transaction.c Btrfs: track transid for delayed ref flushing 2016-06-22 17:54:18 -07:00
transaction.h btrfs: account for non-CoW'd blocks in btrfs_abort_transaction 2016-06-17 18:32:40 +02:00
tree-defrag.c Btrfs: fix locking bugs when defragging leaves 2015-12-18 02:51:32 +00:00
tree-log.c Btrfs: improve performance on fsync against new inode after rename/unlink 2016-08-01 07:32:14 +01:00
tree-log.h Btrfs: fix unreplayable log after snapshot delete + parent dir fsync 2016-03-01 08:23:25 -08:00
ulist.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
ulist.h btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
uuid-tree.c
volumes.c Merge branch 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2016-06-25 08:42:31 -07:00
volumes.h Merge branch 'foreign/jeffm/uapi' into for-chris-4.7-20160516 2016-05-16 15:46:29 +02:00
xattr.c switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
xattr.h btrfs: Switch to generic xattr handlers 2016-05-17 19:17:09 -04:00
zlib.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00