With commit 56f23fdbb6 ("Btrfs: fix file/data loss caused by fsync after
rename and new inode") we got a simple fix for a functional issue that
happens when the following sequence of actions is done:

  at transaction N
  create file A at directory D

  at transaction N + M (where M >= 1)
  move/rename existing file A from directory D to directory E
  create a new file named A at directory D
  fsync the new file
  power fail

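
For reference, the sequence above can be sketched from user space roughly
as follows. This is only an illustration under stated assumptions: the
helper names and the D, E and A paths under 'base' are made up for the
example, sync() is assumed to be how transaction N gets committed, and the
power failure itself is not simulated.

```c
/*
 * Illustrative sketch only: the helper names and the D/E/A paths under
 * 'base' are assumptions made for this example, not part of the patch.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "<base>/<name>" into buf, failing if it does not fit. */
static int join_path(char *buf, size_t len, const char *base,
		     const char *name)
{
	int n = snprintf(buf, len, "%s/%s", base, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Create a new empty file and optionally fsync it before closing. */
static int create_file(const char *path, int do_fsync)
{
	int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
	int ret = 0;

	if (fd < 0)
		return -1;
	if (do_fsync && fsync(fd) < 0)
		ret = -1;
	if (close(fd) < 0)
		ret = -1;
	return ret;
}

/*
 * 'base' must be an existing, writable and empty directory, on a btrfs
 * mount for the scenario to be meaningful. Returns 0 on success and -1
 * on the first failing step.
 */
int rename_then_fsync_new_file(const char *base)
{
	char d[512], e[512], d_a[512], e_a[512];

	if (join_path(d, sizeof(d), base, "D") < 0 ||
	    join_path(e, sizeof(e), base, "E") < 0 ||
	    join_path(d_a, sizeof(d_a), base, "D/A") < 0 ||
	    join_path(e_a, sizeof(e_a), base, "E/A") < 0)
		return -1;
	if (mkdir(d, 0755) < 0 || mkdir(e, 0755) < 0)
		return -1;

	/* Transaction N: create file A at directory D. */
	if (create_file(d_a, 0) < 0)
		return -1;
	/* Assumption: sync() is used to get transaction N committed. */
	sync();

	/* Transaction N + M: rename the old A, create and fsync a new A. */
	if (rename(d_a, e_a) < 0)
		return -1;
	if (create_file(d_a, 1) < 0)
		return -1;

	/* A power failure at this point is what leads to log replay. */
	return 0;
}
```

Calling rename_then_fsync_new_file() on an empty directory of a btrfs
mount performs every step up to, but not including, the power failure,
which needs an external mechanism to simulate.
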
The solution was to simply detect such a scenario and fall back to a full
transaction commit when we detect it. However, this turned out to have a
significant impact on throughput (and a bit on latency too) for benchmarks
using the dbench tool, which simulates real workloads from smbd (Samba)
servers. For example, on a test VM (with a debug kernel):

  Unpatched:
    Throughput 19.1572 MB/sec 32 clients 32 procs max_latency=1005.229 ms

  Patched:
    Throughput 23.7015 MB/sec 32 clients 32 procs max_latency=809.206 ms

The patched results (with this patch applied) are similar to the results
of a kernel with commit 56f23fdbb6 ("Btrfs: fix file/data loss caused by
fsync after rename and new inode") reverted.
This change avoids the fallback to a transaction commit and instead makes
sure all the names of the conflicting inode (the one that had a name in a
past transaction that matches the name of the new file in the same parent
directory) are logged so that at log replay time we lose neither the
new file nor the old file, and the old file gets the name it was renamed
to.
This also ends up avoiding a full transaction commit for a similar case
that involves an unlink instead of a rename of the old file:

  at transaction N
  create file A at directory D

  at transaction N + M (where M >= 1)
  remove file A
  create a new file named A at directory D
  fsync the new file
  power fail

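
The unlink variant can be sketched from user space in a similar way.
Again this is only an illustration: the helper name and the D and A paths
under 'base' are assumptions made for the example, sync() is assumed to
be how transaction N gets committed, and the power failure is not
simulated.

```c
/*
 * Illustrative sketch only: the helper name and the D/A paths under
 * 'base' are assumptions made for this example, not part of the patch.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a new empty file and optionally fsync it before closing. */
static int create_empty_file(const char *path, int do_fsync)
{
	int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
	int ret = 0;

	if (fd < 0)
		return -1;
	if (do_fsync && fsync(fd) < 0)
		ret = -1;
	if (close(fd) < 0)
		ret = -1;
	return ret;
}

/*
 * 'base' must be an existing, writable and empty directory, on a btrfs
 * mount for the scenario to be meaningful. Returns 0 on success and -1
 * on the first failing step.
 */
int unlink_then_fsync_new_file(const char *base)
{
	char d[512], d_a[512];
	int n;

	n = snprintf(d, sizeof(d), "%s/D", base);
	if (n < 0 || (size_t)n >= sizeof(d))
		return -1;
	n = snprintf(d_a, sizeof(d_a), "%s/D/A", base);
	if (n < 0 || (size_t)n >= sizeof(d_a))
		return -1;
	if (mkdir(d, 0755) < 0)
		return -1;

	/* Transaction N: create file A at directory D. */
	if (create_empty_file(d_a, 0) < 0)
		return -1;
	/* Assumption: sync() is used to get transaction N committed. */
	sync();

	/* Transaction N + M: remove the old A, create and fsync a new A. */
	if (unlink(d_a) < 0)
		return -1;
	if (create_empty_file(d_a, 1) < 0)
		return -1;

	/* A power failure at this point is what leads to log replay. */
	return 0;
}
```
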
Signed-off-by: Filipe Manana <fdmanana@suse.com>