No change to generated assembly.
md32_common.h is a typical OpenSSL macro horror show - copy the update,
transform and final functions from md32_common.h, manually expanding the
macros for SHA256. This will allow for further clean up to occur.
No change in generated assembly.
ok beck@ tb@
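For illustration, a minimal sketch of the update path that the md32_common.h
HASH_UPDATE machinery expands to for SHA256; the context layout and the
sha256_block_data_order() prototype below are simplified assumptions, not the
committed code.

#include <stdint.h>
#include <string.h>

/*
 * Sketch of the expanded update function. The real context splits the
 * length into Nl/Nh; this simplification keeps a single 64 bit counter.
 */
typedef struct {
	uint32_t h[8];		/* chaining state */
	uint64_t len_bits;	/* total message length in bits */
	uint8_t data[64];	/* buffered partial block */
	unsigned int num;	/* bytes currently buffered */
} SHA256_CTX_SKETCH;

/* Compression function: processes num contiguous 64 byte blocks. */
void sha256_block_data_order(SHA256_CTX_SKETCH *ctx, const void *in,
    size_t num);

int
sha256_update_sketch(SHA256_CTX_SKETCH *c, const void *data_, size_t len)
{
	const uint8_t *data = data_;
	size_t n;

	if (len == 0)
		return 1;

	c->len_bits += (uint64_t)len << 3;

	/* Top up and flush a previously buffered partial block. */
	if (c->num > 0) {
		n = 64 - c->num;
		if (n > len)
			n = len;
		memcpy(c->data + c->num, data, n);
		c->num += n;
		data += n;
		len -= n;
		if (c->num == 64) {
			sha256_block_data_order(c, c->data, 1);
			c->num = 0;
		}
	}

	/* Hash whole blocks straight from the caller's buffer. */
	if (len >= 64) {
		sha256_block_data_order(c, data, len / 64);
		data += len - (len % 64);
		len %= 64;
	}

	/* Stash any trailing partial block for the next call. */
	if (len > 0) {
		memcpy(c->data, data, len);
		c->num = (unsigned int)len;
	}

	return 1;
}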
This recommits r1.37 of sha512.c, however it uses uint8_t * instead of void *
for the crypto_load_* functions and primarily uses const uint8_t * to track the
input, only casting to const SHA_LONG64 * once we know that it is suitably
aligned. This prevents the compiler from implying alignment based on type.
Tested by tb@ and deraadt@ on platforms with gcc and strict alignment.
ok tb@
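As a rough sketch of that idea (not the committed code), a crypto_load_*-style
helper takes const uint8_t * and loads through memcpy(), so the compiler cannot
infer alignment from the pointer type; the exact signature is an assumption.

#include <endian.h>	/* be64toh(); <sys/endian.h> on some systems */
#include <stdint.h>
#include <string.h>

/* Unaligned-safe load of a big endian 64 bit value into host order. */
static inline uint64_t
crypto_load_be64toh_sketch(const uint8_t *src)
{
	uint64_t v;

	memcpy(&v, src, sizeof(v));	/* no alignment assumed */
	return be64toh(v);
}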
All assembly implementations are required to perform their own alignment
handling. In the case of the C implementation, on strict alignment
platforms, unaligned data will be copied into an aligned buffer. However,
most platforms then perform byte-by-byte reads (via the PULL64 macros).
Instead, remove SHA512_BLOCK_CAN_MANAGE_UNALIGNED_DATA and move the alignment
handling into sha512_block_data_order() - if the data is aligned then simply
perform 64 bit loads and then do endian conversion via be64toh(). If the
data is unaligned then use memcpy() and be64toh() (in the form of
crypto_load_be64toh()). Overall this reduces complexity and can improve
performance (on aarch64 we get a ~10% performance gain with aligned input
and a ~1-2% gain on armv7), while the same movq/bswapq is generated
for amd64 and movl/bswapl for i386.
ok tb@
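A condensed sketch of that strategy, assuming be64toh() from <endian.h> and a
hypothetical sha512_load_block() helper; the real sha512_block_data_order()
does this inline as part of the full compression loop.

#include <endian.h>	/* be64toh(); <sys/endian.h> on some systems */
#include <stdint.h>
#include <string.h>

/*
 * Load one 128 byte SHA-512 block into the message schedule. If the input
 * is 64 bit aligned, load directly and convert with be64toh(); otherwise
 * copy each word out with memcpy() first (the crypto_load_be64toh() idea).
 * A decent compiler turns either path into a load plus byte swap.
 */
static void
sha512_load_block(uint64_t W[16], const uint8_t *in)
{
	const uint64_t *in64;
	uint64_t v;
	int i;

	if (((uintptr_t)in % sizeof(uint64_t)) == 0) {
		/* Aligned input: direct 64 bit loads. */
		in64 = (const uint64_t *)in;
		for (i = 0; i < 16; i++)
			W[i] = be64toh(in64[i]);
	} else {
		/* Unaligned input: memcpy() avoids undefined behaviour. */
		for (i = 0; i < 16; i++) {
			memcpy(&v, in + i * 8, sizeof(v));
			W[i] = be64toh(v);
		}
	}
}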
Avoid reaching around and initialising outside of the macro, cleaning up
the call sites to remove the initialisation. Use a T2 variable to more
closely follow the documented algorithm and remove the gorgeous compound
statement X = Y += A + B + C.
There is no change to the clang generated assembly on aarch64.
ok tb@
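For reference, the documented SHA-512 round from FIPS 180-4 with explicit T1
and T2 temporaries; this is a standalone sketch of the algorithm, not the
committed macro.

#include <stdint.h>

/* SHA-512 round functions as defined in FIPS 180-4. */
#define ROTR(x, n)	(((x) >> (n)) | ((x) << (64 - (n))))
#define Ch(x, y, z)	(((x) & (y)) ^ (~(x) & (z)))
#define Maj(x, y, z)	(((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))
#define Sigma0(x)	(ROTR(x, 28) ^ ROTR(x, 34) ^ ROTR(x, 39))
#define Sigma1(x)	(ROTR(x, 14) ^ ROTR(x, 18) ^ ROTR(x, 41))

/*
 * One round of SHA-512 per the standard: compute T1 and T2, then rotate
 * the working variables. Kt is the round constant and Wt the message
 * schedule word for this round.
 */
static inline void
sha512_round(uint64_t *a, uint64_t *b, uint64_t *c, uint64_t *d,
    uint64_t *e, uint64_t *f, uint64_t *g, uint64_t *h,
    uint64_t Kt, uint64_t Wt)
{
	uint64_t T1, T2;

	T1 = *h + Sigma1(*e) + Ch(*e, *f, *g) + Kt + Wt;
	T2 = Sigma0(*a) + Maj(*a, *b, *c);

	*h = *g;
	*g = *f;
	*f = *e;
	*e = *d + T1;
	*d = *c;
	*c = *b;
	*b = *a;
	*a = T1 + T2;
}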
We currently have three C implementations for SHA-512 - a version that is
optimised for CPUs with minimal registers (specifically i386), a regular
implementation and a semi-unrolled implementation. Testing on a ~15 year
old i386 CPU, the fastest version is actually the semi-unrolled version
(not to mention that we still have an i586 assembly implementation that is
used on i386 instead...).
More decent architectures do not seem to care whether the regular or the
semi-unrolled version is used, presumably since they are effectively doing the
same thing in hardware during execution.
Remove all except the semi-unrolled version.
ok tb@
ok jsing, and kind of tb@ for an earlier version
ok tb@
ok tb@
While here, use KECCAK_BYTE_WIDTH instead of hardcoding the value.
Also buy a vowel for rsiz.
These will make EVP integration easier, as well as being used in the SHA3
implementation itself.
Remove various comments that are unhelpful or obvious. Reformat remaining
comments per style(9).
This is a minimal and readable SHA3 implementation.
ok tb@
This adds support for SHA512/224 and SHA512/256, as specified in FIPS 180-4.
These are truncated versions of the SHA512 hash.
ok tb@
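A hedged usage example, assuming the linked libcrypto exposes EVP_sha512_256()
via the EVP interface (as newer LibreSSL and OpenSSL releases do):

#include <stdio.h>
#include <string.h>
#include <openssl/evp.h>

/*
 * SHA-512/256 runs the SHA-512 compression function with different initial
 * hash values and truncates the 512 bit result to 256 bits. This example
 * assumes EVP_sha512_256() is available in the linked library.
 */
int
main(void)
{
	EVP_MD_CTX *ctx;
	unsigned char md[EVP_MAX_MD_SIZE];
	unsigned int md_len, i;
	const char *msg = "abc";

	if ((ctx = EVP_MD_CTX_new()) == NULL)
		return 1;
	if (!EVP_DigestInit_ex(ctx, EVP_sha512_256(), NULL) ||
	    !EVP_DigestUpdate(ctx, msg, strlen(msg)) ||
	    !EVP_DigestFinal_ex(ctx, md, &md_len)) {
		EVP_MD_CTX_free(ctx);
		return 1;
	}
	EVP_MD_CTX_free(ctx);

	for (i = 0; i < md_len; i++)	/* md_len will be 32 */
		printf("%02x", md[i]);
	printf("\n");

	return 0;
}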
ok tb@
Various code in libcrypto needs bitwise rotation - rather than defining
different versions across the code base, provide a common set that can
be reused. Any sensible compiler optimises these to a single instruction
where the architecture supports it, which means we can ditch the inline
assembly.
On the chance that we need to provide platform specific versions, this
follows the approach used in BN, where an MD crypto_arch.h header could be
added in the future, which would then provide more specific versions of
these functions.
ok tb@
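Roughly what such shared rotate helpers look like; the names and signatures
below are illustrative assumptions, not necessarily the committed ones.

#include <stddef.h>
#include <stdint.h>

/*
 * Shared rotate helpers. Callers pass a shift strictly between zero and
 * the word width; a shift of 0 or the full width would be undefined
 * behaviour with this formulation. Compilers recognise the pattern and
 * emit a single rotate instruction where the architecture has one.
 */
static inline uint32_t
crypto_rol_u32(uint32_t v, size_t shift)
{
	return (v << shift) | (v >> (32 - shift));
}

static inline uint32_t
crypto_ror_u32(uint32_t v, size_t shift)
{
	return (v >> shift) | (v << (32 - shift));
}

static inline uint64_t
crypto_rol_u64(uint64_t v, size_t shift)
{
	return (v << shift) | (v >> (64 - shift));
}

static inline uint64_t
crypto_ror_u64(uint64_t v, size_t shift)
{
	return (v >> shift) | (v << (64 - shift));
}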
It is common to need to store data in a specific endianness - rather than
handrolling and duplicating code to do this, provide a
crypto_store_htobe64() function that converts from host endian to big
endian, before storing the data to a location with unknown alignment.
ok tb@
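In spirit this is just a byte swap followed by memcpy(), so the destination
needs no particular alignment; a sketch, assuming htobe64() from <endian.h>
(or <sys/endian.h> depending on platform) and with the signature as an
assumption.

#include <endian.h>	/* htobe64(); <sys/endian.h> on some platforms */
#include <stdint.h>
#include <string.h>

/* Convert to big endian, then store to a possibly unaligned destination. */
static inline void
crypto_store_htobe64(uint8_t *dst, uint64_t v)
{
	v = htobe64(v);
	memcpy(dst, &v, sizeof(v));
}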
Use htobe64() instead of testing BYTE_ORDER and then handrolling htobe64().
Thanks to tobhe for providing most of the fix via openiked-portable
ok jsing
Rather than sprinkling BYTE_ORDER checks throughout the implementation,
always define PULL64 - on big endian platforms it just becomes a no-op.
ok tb@
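The idea, sketched with __builtin_bswap64() standing in for the handrolled
byte gathering the real PULL64 does; the header and macro details here are
assumptions.

#include <endian.h>	/* BYTE_ORDER/BIG_ENDIAN; <sys/endian.h> on some platforms */
#include <stdint.h>

/*
 * PULL64 converts a 64 bit word taken from the big endian input stream
 * into host order. On big endian platforms the word is already in host
 * order, so the macro collapses to a no-op; on little endian platforms
 * it byte swaps.
 */
#if BYTE_ORDER == BIG_ENDIAN
#define PULL64(x)	(x)
#else
#define PULL64(x)	__builtin_bswap64(x)	/* gcc/clang builtin */
#endif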
ok tb@
In the case that the pure C implementation of SHA512 is being used, the
prototype is unnecessary as the function is declared static and exists
in dependency order. Simply omit the prototype rather than using #ifndef
to toggle the static prefix.
ok tb@
ok tb@
Another set of mechanical replacements for "a,b" with "a, b".
No change in generated assembly.
Mechanically replace "a,b" with "a, b".
No change to generated assembly.
Mechanically replace "a,b" with "a, b", followed with some manual
indentation clean up.
No change in generated assembly.
No change in generated assembly.
MD32_XARRAY (formerly SHA_XARRAY) was added as a workaround for a broken
HP C compiler (circa 1999). Clean it up to simplify the code.
No change in generated assembly.
ok miod@ tb@
This follows what is done for other SHA implementations.
ok miod@ tb@
No intended functional change.