Commit Graph

6 Commits

Author SHA1 Message Date
Ard Biesheuvel 03c9a333fe crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULL
Implement a NEON fallback for systems that do support NEON but have
no support for the optional 64x64->128 polynomial multiplication
instruction that is part of the ARMv8 Crypto Extensions. It is based
on the paper "Fast Software Polynomial Multiplication on ARM Processors
Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and
Ricardo Dahab (https://hal.inria.fr/hal-01506572), but has been reworked
extensively for the AArch64 ISA.

On a low-end core such as the Cortex-A53 found in the Raspberry Pi3, the
NEON based implementation is 4x faster than the table based one, and
is time invariant as well, making it less vulnerable to timing attacks.
When combined with the bit-sliced NEON implementation of AES-CTR, the
AES-GCM performance increases by 2x (from 58 to 29 cycles per byte).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04 09:27:25 +08:00
Ard Biesheuvel 537c1445ab crypto: arm64/gcm - implement native driver using v8 Crypto Extensions
Currently, the AES-GCM implementation for arm64 systems that support the
ARMv8 Crypto Extensions is based on the generic GCM module, which combines
the AES-CTR implementation using AES instructions with the PMULL based
GHASH driver. This is suboptimal, given the fact that the input data needs
to be loaded twice, once for the encryption and again for the MAC
calculation.

On Cortex-A57 (r1p2) and other recent cores that implement micro-op fusing
for the AES instructions, AES executes at less than 1 cycle per byte, which
means that any cycles wasted on loading the data twice hurt even more.

So implement a new GCM driver that combines the AES and PMULL instructions
at the block level. This improves performance on Cortex-A57 by ~37% (from
3.5 cpb to 2.6 cpb)

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04 09:27:23 +08:00
Ard Biesheuvel 6d6254d728 crypto: arm64/ghash-ce - add non-SIMD scalar fallback
The arm64 kernel will shortly disallow nested kernel mode NEON, so
add a fallback to scalar C code that can be invoked in that case.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04 09:27:16 +08:00
Ard Biesheuvel b913a6404c arm64/crypto: improve performance of GHASH algorithm
This patches modifies the GHASH secure hash implementation to switch to a
faster, polynomial multiplication based reduction instead of one that uses
shifts and rotates.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-06-18 12:40:54 +01:00
Ard Biesheuvel 6aa8b209f5 arm64/crypto: fix data corruption bug in GHASH algorithm
This fixes a bug in the GHASH algorithm resulting in the calculated hash to be
incorrect if the input is presented in chunks whose size is not a multiple of
16 bytes.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: fdd2389457 ("arm64/crypto: GHASH secure hash using ARMv8 Crypto Extensions")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-06-18 12:40:53 +01:00
Ard Biesheuvel fdd2389457 arm64/crypto: GHASH secure hash using ARMv8 Crypto Extensions
This is a port to ARMv8 (Crypto Extensions) of the Intel implementation of the
GHASH Secure Hash (used in the Galois/Counter chaining mode). It relies on the
optional PMULL/PMULL2 instruction (polynomial multiply long, what Intel call
carry-less multiply).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-05-14 10:04:07 -07:00