15487aa132
Currently we emit a consume-load in atomic_rcu_read. Because of limitations in current compilers, this is overkill for non-Alpha hosts and it is only useful to make Thread Sanitizer work. This patch leaves the consume-load in atomic_rcu_read when compiling with Thread Sanitizer enabled, and resorts to a relaxed load + smp_read_barrier_depends otherwise. On an RMO host architecture, such as aarch64, the performance improvement of this change is easily measurable. For instance, qht-bench performs an atomic_rcu_read on every lookup. Performance before and after applying this patch: $ tests/qht-bench -d 5 -n 1 Before: 9.78 MT/s After: 10.96 MT/s Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1464120374-8950-4-git-send-email-cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>