glibc/crypt
Zack Weinberg ea1bd74def New string function explicit_bzero (from OpenBSD).
explicit_bzero(s, n) is the same as memset(s, 0, n), except that the
compiler is not allowed to delete a call to explicit_bzero even if the
memory pointed to by 's' is dead after the call.  Right now, this effect
is achieved externally by having explicit_bzero be a function whose
semantics are unknown to the compiler, and internally, with a no-op
asm statement that clobbers memory.  This does mean that small
explicit_bzero operations cannot be expanded inline as small memset
operations can, but on the other hand, small memset operations do get
deleted by the compiler.  Hopefully full compiler support for
explicit_bzero will happen relatively soon.
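
The technique described above can be sketched roughly as follows.  This is
only an illustration, not the code added by this commit: the name
sketch_explicit_bzero is hypothetical, and the empty asm statement relies
on the GCC/Clang extended-asm extension.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name, chosen so it does not clash with the libc symbol. */
void
sketch_explicit_bzero (void *s, size_t n)
{
  /* Ordinary zeroing; on its own the compiler may treat this as a
     dead store and delete it. */
  memset (s, 0, n);

  /* Empty asm statement (GCC/Clang extension).  It emits no
     instructions, but the "memory" clobber tells the compiler that
     memory may be read here, so the stores above must be kept. */
  __asm__ __volatile__ ("" : : "r" (s) : "memory");
}
```

A caller would use it exactly like memset (s, 0, n) on a buffer that is
about to go out of scope.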

There are two new tests: test-explicit_bzero.c verifies the
visible semantics in the same way as the existing test-bzero.c,
and tst-xbzero-opt.c verifies the not-being-optimized-out property.
The latter is conceptually based on a test written by Matthew Dempsky
for the OpenBSD regression suite.

The crypt() implementation has an immediate use for this new feature.
We avoid having to add a GLIBC_PRIVATE alias for explicit_bzero
by running all of libcrypt's calls through the fortified variant,
__explicit_bzero_chk, which is in the impl namespace anyway.  Currently
I'm not aware of anything in libc proper that needs this, but the
glue is all in place if it does become necessary.  The legacy DES
implementation wasn't bothering to clear its buffers, so I added that,
mostly for consistency's sake.
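
The pattern of clearing key-dependent intermediate data before returning
might look like the sketch below.  It assumes glibc 2.25 or later (for
explicit_bzero); toy_digest is a hypothetical stand-in and is neither a
real hash nor code from libcrypt.

```c
#define _DEFAULT_SOURCE 1   /* exposes explicit_bzero in <string.h> on glibc */
#include <stddef.h>
#include <string.h>

/* Hypothetical routine: copies the key into a scratch buffer, derives
   a toy value from it, then wipes the scratch buffer. */
unsigned long
toy_digest (const char *key)
{
  unsigned char scratch[64];
  size_t len = strlen (key);
  if (len > sizeof scratch)
    len = sizeof scratch;
  memcpy (scratch, key, len);

  unsigned long h = 5381;
  for (size_t i = 0; i < len; i++)
    h = h * 33 + scratch[i];

  /* scratch is dead after this point, so a plain memset could be
     optimized away; explicit_bzero may not be. */
  explicit_bzero (scratch, sizeof scratch);
  return h;
}
```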

	* string/explicit_bzero.c: New routine.
	* string/test-explicit_bzero.c, string/tst-xbzero-opt.c: New tests.
	* string/Makefile (routines, strop-tests, tests): Add them.
	* string/test-memset.c: Add ifdeffage for testing explicit_bzero.
	* string/string.h [__USE_MISC]: Declare explicit_bzero.

	* debug/explicit_bzero_chk.c: New routine.
	* debug/Makefile (routines): Add it.
	* debug/tst-chk1.c: Test fortification of explicit_bzero.
	* string/bits/string3.h: Fortify explicit_bzero.

	* manual/string.texi: Document explicit_bzero.
	* NEWS: Mention addition of explicit_bzero.

	* crypt/crypt-entry.c (__crypt_r): Clear key-dependent intermediate
	data before returning, using explicit_bzero.
	* crypt/md5-crypt.c (__md5_crypt_r): Likewise.
	* crypt/sha256-crypt.c (__sha256_crypt_r): Likewise.
	* crypt/sha512-crypt.c (__sha512_crypt_r): Likewise.

	* include/string.h: Redirect internal uses of explicit_bzero
	to __explicit_bzero_chk[_internal].
	* string/Versions [GLIBC_2.25]: Add explicit_bzero.
	* debug/Versions [GLIBC_2.25]: Add __explicit_bzero_chk.
	* sysdeps/arm/nacl/libc.abilist
	* sysdeps/unix/sysv/linux/aarch64/libc.abilist
	* sysdeps/unix/sysv/linux/alpha/libc.abilist
	* sysdeps/unix/sysv/linux/arm/libc.abilist
	* sysdeps/unix/sysv/linux/hppa/libc.abilist
	* sysdeps/unix/sysv/linux/i386/libc.abilist
	* sysdeps/unix/sysv/linux/ia64/libc.abilist
	* sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
	* sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
	* sysdeps/unix/sysv/linux/microblaze/libc.abilist
	* sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
	* sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
	* sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
	* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
	* sysdeps/unix/sysv/linux/nios2/libc.abilist
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
	* sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
	* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
	* sysdeps/unix/sysv/linux/sh/libc.abilist
	* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
	* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
	* sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist
	* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
	* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist:
	Add entries for explicit_bzero and __explicit_bzero_chk.
2016-12-16 16:21:54 -05:00
badsalttest.c
Banner
cert.c
cert.input
crypt_util.c
crypt-entry.c New string function explicit_bzero (from OpenBSD). 2016-12-16 16:21:54 -05:00
crypt-private.h
crypt.c
crypt.h
Makefile
md5-block.c
md5-crypt.c New string function explicit_bzero (from OpenBSD). 2016-12-16 16:21:54 -05:00
md5.c
md5.h
md5c-test.c
md5test-giant.c
md5test.c
README.ufc-crypt
sha256-block.c crypt: Use internal names for the SHA-2 block functions 2016-10-28 21:49:21 +02:00
sha256-crypt.c New string function explicit_bzero (from OpenBSD). 2016-12-16 16:21:54 -05:00
sha256.c crypt: Use internal names for the SHA-2 block functions 2016-10-28 21:49:21 +02:00
sha256.h
sha256c-test.c
sha256test.c
sha512-block.c crypt: Use internal names for the SHA-2 block functions 2016-10-28 21:49:21 +02:00
sha512-crypt.c New string function explicit_bzero (from OpenBSD). 2016-12-16 16:21:54 -05:00
sha512.c crypt: Use internal names for the SHA-2 block functions 2016-10-28 21:49:21 +02:00
sha512.h
sha512c-test.c
sha512test.c
speeds.c
ufc-crypt.h
ufc.c
Versions

The following is the README for UFC-crypt, with those portions deleted
that are known to be incorrect for the implementation used with the
GNU C library.


	UFC-crypt: ultra fast 'crypt' implementation
	============================================

	@(#)README	2.27 11 Sep 1996

Design goals/non goals:
----------------------

- Crypt implementation plugin compatible with crypt(3)/fcrypt.

- High performance when used for password cracking.

- Portable to most 32/64 bit machines.

- Startup time/mixed salt performance not critical.

Features of the implementation:
------------------------------

- On most machines, UFC-crypt runs 30-60 times faster than crypt(3) when
  invoked repeatedly with the same salt and varying passwords.

- With mostly constant salts, performance is about two to three times
  that of the default fcrypt implementation shipped with Alec
  Muffett's 'Crack' password cracker. For instructions on how to
  plug UFC-crypt into 'Crack', see below.

- With alternating salts, performance is only about twice
  that of crypt(3).

- Requires 165 kb for tables.

Author & licensing etc
----------------------

UFC-crypt was created by Michael Glad, email: glad@daimi.aau.dk, and has
been donated to the Free Software Foundation, Inc. It is covered by the
GNU library license version 2, see the file 'COPYING.LIB'.

NOTES FOR USERS OUTSIDE THE US:
------------------------------

The US government limits the export of DES based software/hardware.
This software was written in Aarhus, Denmark. It can therefore be retrieved
from ftp sites outside the US without breaking US law. Please do not
ftp it from American sites.

Benchmark table:
---------------

The table shows how many operations per second UFC-crypt can
do on various machines.

|--------------|-------------------------------------------|
|Machine       |  SUN*  SUN*   HP*     DecStation   HP     |
|              | 3/50   ELC  9000/425e    3100    9000/720 |
|--------------|-------------------------------------------|
| Crypt(3)/sec |  4.6    30     15         25        57    |
| Ufc/sec      |  220   990    780       1015      3500    |
|--------------|-------------------------------------------|
| Speedup      |   48    30     52         40        60    |
|--------------|-------------------------------------------|

*) Compiled using special assembly language support module.

It seems as if performance is limited by CPU bus and data cache capacity.
This also makes the benchmarks debatable compared to a real test with
UFC-crypt wired into Crack. However, the table gives an outline of
what can be expected.

Optimizations:
-------------

Here are the optimizations used relative to an ordinary implementation
such as the one said to be used in crypt(3).

Major optimizations
*******************

- Keep data packed as bits in integer variables -- allows for
  fast permutations & parallel xor's in CPU hardware.

- Let adjacent final & initial permutations collapse.

- Keep working data in 'E expanded' format all the time.

- Implement DES 'f' function mostly by table lookup.

- Calculate the above function on a 12 bit basis rather than 6,
  which would be the most natural.

- Implement setup routines so that performance is limited by the DES
  inner loops only.

- Instead of doing salting in the DES inner loops, modify the above tables
  each time a new salt is seen. According to the BSD crypt code this is
  ugly :-)
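
The table lookup and 12 bit points above can be sketched as follows. This
is a toy under stated assumptions: toy_sbox is a made-up 6 bit to 4 bit
substitution, not a real DES S-box, and the names pair_table,
build_pair_table, lookup_pair and lookup_separate are hypothetical. The
real UFC-crypt tables hold wider results in 'E expanded' format.

```c
#include <stdint.h>

/* Made-up substitution for illustration only, NOT a DES S-box. */
static uint8_t toy_sbox(unsigned box, unsigned in6)
{
    return (uint8_t)((in6 * 7u + box * 3u + (in6 >> 3)) & 0xFu);
}

/* One entry per 12 bit input: the combined output of two 6 bit boxes. */
static uint8_t pair_table[4096];

/* Precompute the table once, so setup cost stays out of the inner loop. */
void build_pair_table(void)
{
    for (unsigned in12 = 0; in12 < 4096; in12++) {
        unsigned hi = in12 >> 6;     /* upper 6 bits feed box 0 */
        unsigned lo = in12 & 0x3Fu;  /* lower 6 bits feed box 1 */
        pair_table[in12] = (uint8_t)((toy_sbox(0, hi) << 4) | toy_sbox(1, lo));
    }
}

/* 12 bit basis: a single table access covers two boxes. */
uint8_t lookup_pair(unsigned in12)
{
    return pair_table[in12 & 0xFFFu];
}

/* 6 bit basis: the same result computed with one step per box. */
uint8_t lookup_separate(unsigned in12)
{
    unsigned hi = (in12 >> 6) & 0x3Fu;
    unsigned lo = in12 & 0x3Fu;
    return (uint8_t)((toy_sbox(0, hi) << 4) | toy_sbox(1, lo));
}
```

The trade is memory for speed: the table grows from 64 entries per box to
4096 entries per pair of boxes, while the number of lookups is halved.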

Minor (dirty) optimizations
***************************

- Combine iterations of DES inner loop so that DES only loops
  8 times. This saves a lot of variable swapping.

- Implement key access by a walking pointer rather than coding
  as array indexing.

- As described, the table based f function uses a 3 dimensional array:

	sb ['number of 12 bit segment']['12 bit index']['48 bit half index']

  Code the routine with 4 (one dimensional) vectors.

- Design the internal data format & uglify the DES loops so that
  the compiler does not need to do bit shifts when indexing vectors.
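
The first two items above (8 combined iterations and a walking key
pointer) can be illustrated with a toy Feistel network. This is a sketch
only: toy_f is a stand-in round function, not the DES f function, and the
function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in round function for illustration, NOT the DES f function. */
static uint32_t toy_f(uint32_t half, uint32_t subkey)
{
    uint32_t x = half ^ subkey;
    return ((x << 5) | (x >> 27)) + 0x9E3779B9u;
}

/* Textbook form: 16 rounds, swapping the halves after every round and
   fetching subkeys by array indexing. */
void feistel_16_swapping(uint32_t *l, uint32_t *r, const uint32_t ks[16])
{
    for (int i = 0; i < 16; i++) {
        uint32_t t = *l ^ toy_f(*r, ks[i]);
        *l = *r;
        *r = t;
    }
}

/* Combined form: 8 iterations of two rounds each.  The roles of the
   two halves alternate inside the loop body, so no swap is needed,
   and subkeys are fetched through a walking pointer. */
void feistel_8_unrolled(uint32_t *l, uint32_t *r, const uint32_t ks[16])
{
    const uint32_t *k = ks;
    uint32_t a = *l, b = *r;
    for (int i = 0; i < 8; i++) {
        a ^= toy_f(b, *k++);
        b ^= toy_f(a, *k++);
    }
    *l = a;
    *r = b;
}
```

Both functions produce the same pair of halves for the same input and key
schedule; the second simply avoids the temporary and the swap.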

Revision history
****************

UFC patchlevel 0: base version; released to alt.sources on Sep 24 1991
UFC patchlevel 1: patch released to alt.sources on Sep 27 1991.
		  No longer rebuilds sb tables when seeing a new salt.
UFC-crypt pl0:	  Essentially UFC pl 1. Released to comp.sources.misc
		  on Oct 22 1991.
UFC-crypt pl1:    Released to comp.sources.misc in March 1992
		  * setkey/encrypt routines added
		  * added validation/benchmarking programs
		  * reworked keyschedule setup code
		  * memory demands reduced
		  * 64 bit support added