linux/arch
Eric Biggers ede9622162 crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS
Add an ARM NEON-accelerated implementation of Speck-XTS.  It operates on
128-byte chunks at a time, i.e. 8 blocks for Speck128 or 16 blocks for
Speck64.  Each 128-byte chunk goes through XTS preprocessing, then is
encrypted/decrypted (doing one cipher round for all the blocks, then the
next round, etc.), then goes through XTS postprocessing.

The performance depends on the processor but can be about 3 times faster
than the generic code.  For example, on an ARMv7 processor we observe
the following performance with Speck128/256-XTS:

    xts-speck128-neon:     Encryption 107.9 MB/s, Decryption 108.1 MB/s
    xts(speck128-generic): Encryption  32.1 MB/s, Decryption  36.6 MB/s

In comparison to AES-256-XTS without the Cryptography Extensions:

    xts-aes-neonbs:        Encryption  41.2 MB/s, Decryption  36.7 MB/s
    xts(aes-asm):          Encryption  31.7 MB/s, Decryption  30.8 MB/s
    xts(aes-generic):      Encryption  21.2 MB/s, Decryption  20.9 MB/s

Speck64/128-XTS is even faster:

    xts-speck64-neon:      Encryption 138.6 MB/s, Decryption 139.1 MB/s

Note that as with the generic code, only the Speck128 and Speck64
variants are supported.  Also, for now only the XTS mode of operation is
supported, to target the disk and file encryption use cases.  The NEON
code also only handles the portion of the data that is evenly divisible
into 128-byte chunks, with any remainder handled by a C fallback.  Of
course, other modes of operation could be added later if needed, and/or
the NEON code could be updated to handle other buffer sizes.

The XTS specification is only defined for AES which has a 128-bit block
size, so for the GF(2^64) math needed for Speck64-XTS we use the
reducing polynomial 'x^64 + x^4 + x^3 + x + 1' given by the original XEX
paper.  Of course, when possible users should use Speck128-XTS, but even
that may be too slow on some processors; Speck64-XTS can be faster.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22 22:16:55 +08:00
..
alpha pci-v4.16-changes 2018-02-06 09:59:40 -08:00
arc The core framework has a handful of patches this time around, mostly due 2018-02-01 16:56:07 -08:00
arm crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS 2018-02-22 22:16:55 +08:00
arm64 KVM changes for 4.16 2018-02-10 13:16:35 -08:00
blackfin unify {de,}mangle_poll(), get rid of kernel-side POLL... 2018-02-11 14:37:22 -08:00
c6x The core framework has a handful of patches this time around, mostly due 2018-02-01 16:56:07 -08:00
cris vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
frv unify {de,}mangle_poll(), get rid of kernel-side POLL... 2018-02-11 14:37:22 -08:00
h8300 The core framework has a handful of patches this time around, mostly due 2018-02-01 16:56:07 -08:00
hexagon The core framework has a handful of patches this time around, mostly due 2018-02-01 16:56:07 -08:00
ia64 vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
m32r The core framework has a handful of patches this time around, mostly due 2018-02-01 16:56:07 -08:00
m68k unify {de,}mangle_poll(), get rid of kernel-side POLL... 2018-02-11 14:37:22 -08:00
metag The core framework has a handful of patches this time around, mostly due 2018-02-01 16:56:07 -08:00
microblaze Microblaze patches for 4.16-rc1 2018-02-02 09:48:36 -08:00
mips unify {de,}mangle_poll(), get rid of kernel-side POLL... 2018-02-11 14:37:22 -08:00
mn10300 The core framework has a handful of patches this time around, mostly due 2018-02-01 16:56:07 -08:00
nios2 nios2 update for v4.16-rc1 2018-02-11 13:52:32 -08:00
openrisc The core framework has a handful of patches this time around, mostly due 2018-02-01 16:56:07 -08:00
parisc The core framework has a handful of patches this time around, mostly due 2018-02-01 16:56:07 -08:00
powerpc vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
riscv RISC-V changes for 4.16 2018-02-07 11:33:08 -08:00
s390 KVM changes for 4.16 2018-02-10 13:16:35 -08:00
score arch/score/kernel/setup.c: combine two seq_printf() calls into one call in show_cpuinfo() 2018-02-06 18:32:47 -08:00
sh libnvdimm for 4.16 2018-02-06 10:41:33 -08:00
sparc unify {de,}mangle_poll(), get rid of kernel-side POLL... 2018-02-11 14:37:22 -08:00
tile The core framework has a handful of patches this time around, mostly due 2018-02-01 16:56:07 -08:00
um mconsole_proc(): don't mess with file->f_pos 2018-02-09 19:28:01 -08:00
unicore32 lib: optimize cpumask_next_and() 2018-02-06 18:32:44 -08:00
x86 crypto: aesni - Update aesni-intel_glue to use scatter/gather 2018-02-22 22:16:52 +08:00
xtensa unify {de,}mangle_poll(), get rid of kernel-side POLL... 2018-02-11 14:37:22 -08:00
.gitignore
Kconfig Makefile: introduce CONFIG_CC_STACKPROTECTOR_AUTO 2018-02-06 18:32:44 -08:00