5988765741
The find_first_set and find_last_set methods are not optimal for NEON. They can be improved by synthesizing them with horizontal adds (vaddv), which reduces the generated assembly code.

In the following cases, vaddvq_s16 generates 2 instructions, whereas vpadd_s16 generates 4 instructions:

    # vaddvq_s16
    vaddvq_s16(__asint);
    // addv h0, v1.8h
    // smov w1, v0.h[0]

    # vpadd_s16
    vpaddq_s16(vpaddq_s16(vpaddq_s16(__asint, __zero), __zero), __zero)[0]
    // addp v1.8h,v1.8h,v2.8h
    // addp v1.8h,v1.8h,v2.8h
    // addp v1.8h,v1.8h,v2.8h
    // smov w1, v1.h[0]

libstdc++-v3/ChangeLog:

    * include/experimental/bits/simd_neon.h: Replace repeated vpadd calls with a single vaddv for aarch64.

| ||
---|---|---|
.. | ||
config | ||
doc | ||
include | ||
libsupc++ | ||
po | ||
python | ||
scripts | ||
src | ||
testsuite | ||
acinclude.m4 | ||
aclocal.m4 | ||
ChangeLog | ||
ChangeLog-1998 | ||
ChangeLog-1999 | ||
ChangeLog-2000 | ||
ChangeLog-2001 | ||
ChangeLog-2002 | ||
ChangeLog-2003 | ||
ChangeLog-2004 | ||
ChangeLog-2005 | ||
ChangeLog-2006 | ||
ChangeLog-2007 | ||
ChangeLog-2008 | ||
ChangeLog-2009 | ||
ChangeLog-2010 | ||
ChangeLog-2011 | ||
ChangeLog-2012 | ||
ChangeLog-2013 | ||
ChangeLog-2014 | ||
ChangeLog-2015 | ||
ChangeLog-2016 | ||
ChangeLog-2017 | ||
ChangeLog-2018 | ||
ChangeLog-2019 | ||
ChangeLog-2020 | ||
config.h.in | ||
configure | ||
configure.ac | ||
configure.host | ||
crossconfig.m4 | ||
fragment.am | ||
linkage.m4 | ||
Makefile.am | ||
Makefile.in | ||
README |
file: libstdc++-v3/README

New users may wish to point their web browsers to the file index.html in the 'doc/html' subdirectory. It contains brief building instructions and notes on how to configure the library in interesting ways.
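
The commit message at the top of this page contrasts two ways of summing the lanes of a 16-bit NEON vector. The following is a minimal sketch of that comparison, not the code in simd_neon.h. The helper names `add16`, `sum_pairwise`, and `sum_across` are hypothetical and exist only for illustration. On aarch64 the sketch calls the `vpaddq_s16` and `vaddvq_s16` intrinsics from `<arm_neon.h>`. On other targets it falls back to a scalar model of the same lane arithmetic, on the assumption that converting an out-of-range value to `int16_t` wraps modulo 2^16 (guaranteed from C++20, and the behaviour of GCC and Clang before that).

```cpp
#include <cassert>
#include <cstdint>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical helpers for illustration; none of these names come from simd_neon.h.

// Wrap-around 16-bit addition, modelling how a single vector lane behaves.
inline std::int16_t add16(std::int16_t a, std::int16_t b)
{
  const unsigned sum = static_cast<std::uint16_t>(a) + static_cast<std::uint16_t>(b);
  return static_cast<std::int16_t>(static_cast<std::uint16_t>(sum));
}

// Older approach: three pairwise adds against a zero vector, then read lane 0.
inline std::int16_t sum_pairwise(const std::int16_t (&lanes)[8])
{
#if defined(__aarch64__)
  const int16x8_t zero = vdupq_n_s16(0);
  int16x8_t v = vld1q_s16(lanes);
  v = vpaddq_s16(v, zero);  // 8 lanes -> 4 partial sums in the low half
  v = vpaddq_s16(v, zero);  // 4 -> 2
  v = vpaddq_s16(v, zero);  // 2 -> 1, total in lane 0
  return vgetq_lane_s16(v, 0);
#else
  // Scalar model: each round adds adjacent pairs; the upper half comes from zero.
  std::int16_t v[8];
  for (int i = 0; i < 8; ++i) v[i] = lanes[i];
  for (int round = 0; round < 3; ++round) {
    std::int16_t next[8] = {};
    for (int i = 0; i < 4; ++i) next[i] = add16(v[2 * i], v[2 * i + 1]);
    for (int i = 0; i < 8; ++i) v[i] = next[i];
  }
  return v[0];
#endif
}

// Newer approach: a single add-across-vector reduction.
inline std::int16_t sum_across(const std::int16_t (&lanes)[8])
{
#if defined(__aarch64__)
  return vaddvq_s16(vld1q_s16(lanes));
#else
  std::int16_t acc = 0;
  for (std::int16_t x : lanes) acc = add16(acc, x);
  return acc;
#endif
}
```

Both functions are meant to return the same value for any input. The difference the commit message describes is in the instruction count of the generated code, which this sketch does not measure.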