linux/arch/arm/crypto/Makefile

#
# Arch-specific CryptoAPI modules.
#

obj-$(CONFIG_CRYPTO_AES_ARM) += aes-arm.o
obj-$(CONFIG_CRYPTO_AES_ARM_BS) += aes-arm-bs.o
obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o

aes-arm-y	:= aes-armv4.o aes_glue.o
aes-arm-bs-y	:= aesbs-core.o aesbs-glue.o
sha1-arm-y	:= sha1-armv4-large.o sha1_glue.o

quiet_cmd_perl = PERL    $@
      cmd_perl = $(PERL) $(<) > $(@)

$(src)/aesbs-core.S_shipped: $(src)/bsaes-armv7.pl
	$(call cmd,perl)

.PRECIOUS: $(obj)/aesbs-core.S
arm/crypto: Add optimized AES and SHA1 routines Add assembler versions of AES and SHA1 for ARM platforms. This has provided up to a 50% improvement in IPsec/TCP throughout for tunnels using AES128/SHA1. Platform CPU SPeed Endian Before (bps) After (bps) Improvement IXP425 533 MHz big 11217042 15566294 ~38% KS8695 166 MHz little 3828549 5795373 ~51% Signed-off-by: David McCullough <ucdevel@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> 2012-09-06 22:17:02 +02:00			`#`
			`# Arch-specific CryptoAPI modules.`
			`#`

			`obj-$(CONFIG_CRYPTO_AES_ARM) += aes-arm.o`
ARM: add support for bit sliced AES using NEON instructions Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption and around 25% for decryption. This implementation of the AES algorithm does not rely on any lookup tables so it is believed to be invulnerable to cache timing attacks. This algorithm processes up to 8 blocks in parallel in constant time. This means that it is not usable by chaining modes that are strictly sequential in nature, such as CBC encryption. CBC decryption, however, can benefit from this implementation and runs about 25% faster. The other chaining modes implemented in this module, XTS and CTR, can execute fully in parallel in both directions. The core code has been adopted from the OpenSSL project (in collaboration with the original author, on cc). For ease of maintenance, this version is identical to the upstream OpenSSL code, i.e., all modifications that were required to make it suitable for inclusion into the kernel have been made upstream. The original can be found here: http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130 Note to integrators: While this implementation is significantly faster than the existing table based ones (generic or ARM asm), especially in CTR mode, the effects on power efficiency are unclear as of yet. This code does fundamentally more work, by calculating values that the table based code obtains by a simple lookup; only by doing all of that work in a SIMD fashion, it manages to perform better. Cc: Andy Polyakov <appro@openssl.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> 2013-09-16 18:31:38 +02:00			`obj-$(CONFIG_CRYPTO_AES_ARM_BS) += aes-arm-bs.o`
arm/crypto: Add optimized AES and SHA1 routines Add assembler versions of AES and SHA1 for ARM platforms. This has provided up to a 50% improvement in IPsec/TCP throughout for tunnels using AES128/SHA1. Platform CPU SPeed Endian Before (bps) After (bps) Improvement IXP425 533 MHz big 11217042 15566294 ~38% KS8695 166 MHz little 3828549 5795373 ~51% Signed-off-by: David McCullough <ucdevel@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> 2012-09-06 22:17:02 +02:00			`obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o`

ARM: add support for bit sliced AES using NEON instructions Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption and around 25% for decryption. This implementation of the AES algorithm does not rely on any lookup tables so it is believed to be invulnerable to cache timing attacks. This algorithm processes up to 8 blocks in parallel in constant time. This means that it is not usable by chaining modes that are strictly sequential in nature, such as CBC encryption. CBC decryption, however, can benefit from this implementation and runs about 25% faster. The other chaining modes implemented in this module, XTS and CTR, can execute fully in parallel in both directions. The core code has been adopted from the OpenSSL project (in collaboration with the original author, on cc). For ease of maintenance, this version is identical to the upstream OpenSSL code, i.e., all modifications that were required to make it suitable for inclusion into the kernel have been made upstream. The original can be found here: http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130 Note to integrators: While this implementation is significantly faster than the existing table based ones (generic or ARM asm), especially in CTR mode, the effects on power efficiency are unclear as of yet. This code does fundamentally more work, by calculating values that the table based code obtains by a simple lookup; only by doing all of that work in a SIMD fashion, it manages to perform better. Cc: Andy Polyakov <appro@openssl.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> 2013-09-16 18:31:38 +02:00			`aes-arm-y := aes-armv4.o aes_glue.o`
			`aes-arm-bs-y := aesbs-core.o aesbs-glue.o`
			`sha1-arm-y := sha1-armv4-large.o sha1_glue.o`

			`quiet_cmd_perl = PERL $@`
			`cmd_perl = $(PERL) $(<) > $(@)`

			`$(src)/aesbs-core.S_shipped: $(src)/bsaes-armv7.pl`
			`$(call cmd,perl)`

			`.PRECIOUS: $(obj)/aesbs-core.S`