| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Provide aes_xts_encrypt_internal() and call that from aes_xts_cipher().
Have amd64 and i386 provide their own versions that dispatch to
aesni_xts_encrypt()/aesni_xts_decrypt() as appropriate. The
AESNI_CAPABLE code and methods can then be removed.
ok tb@
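A minimal sketch of what such an MD dispatch can look like; the software fallback, the capability check and the exact prototypes are assumptions for illustration, not the actual LibreSSL code.

    /* Hypothetical amd64 version of aes_xts_encrypt_internal(). */
    #include <stddef.h>

    /* Assumed prototypes for the XTS primitives being dispatched to. */
    void aes_xts_sw_cipher(const unsigned char *in, unsigned char *out,
        size_t len, const void *key1, const void *key2,
        const unsigned char iv[16], int encrypt);
    void aesni_xts_encrypt(const unsigned char *in, unsigned char *out,
        size_t len, const void *key1, const void *key2,
        const unsigned char iv[16]);
    void aesni_xts_decrypt(const unsigned char *in, unsigned char *out,
        size_t len, const void *key1, const void *key2,
        const unsigned char iv[16]);

    int crypto_cpu_has_aesni(void);    /* placeholder capability check */

    void
    aes_xts_encrypt_internal(const unsigned char *in, unsigned char *out,
        size_t len, const void *key1, const void *key2,
        const unsigned char iv[16], int encrypt)
    {
        /* Use the AES-NI routines when the CPU supports them. */
        if (crypto_cpu_has_aesni()) {
            if (encrypt)
                aesni_xts_encrypt(in, out, len, key1, key2, iv);
            else
                aesni_xts_decrypt(in, out, len, key1, key2, iv);
            return;
        }
        /* Otherwise fall back to the portable implementation. */
        aes_xts_sw_cipher(in, out, len, key1, key2, iv, encrypt);
    }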
|
|
|
|
|
|
|
|
|
| |
Provide gcm128_amd64.c and gcm128_i386.c, which contain the appropriate
gcm128 initialisation and CPU feature tests for the respective platform.
This allows for all of the #define spaghetti to be removed from gcm128.c
and removes one of the two remaining consumers of crypto_cpu_caps_ia32().
ok tb@
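The shape of such a platform-specific init, as a sketch: the struct, the capability check and the exact signatures are placeholders; the gcm_*_clmul/gcm_*_4bit routine names follow the existing ghash implementations.

    #include <stddef.h>
    #include <stdint.h>

    /* Minimal stand-ins for the gcm128 internals (illustrative only). */
    typedef struct { uint64_t hi, lo; } u128;

    struct gcm128_md {
        u128 Htable[16];
        void (*gmult)(uint64_t Xi[2], const u128 Htable[16]);
        void (*ghash)(uint64_t Xi[2], const u128 Htable[16],
            const uint8_t *inp, size_t len);
    };

    /* Assumed prototypes for the existing C and assembly back ends. */
    void gcm_init_4bit(u128 Htable[16], const uint64_t H[2]);
    void gcm_gmult_4bit(uint64_t Xi[2], const u128 Htable[16]);
    void gcm_ghash_4bit(uint64_t Xi[2], const u128 Htable[16],
        const uint8_t *inp, size_t len);
    void gcm_init_clmul(u128 Htable[16], const uint64_t H[2]);
    void gcm_gmult_clmul(uint64_t Xi[2], const u128 Htable[16]);
    void gcm_ghash_clmul(uint64_t Xi[2], const u128 Htable[16],
        const uint8_t *inp, size_t len);

    int crypto_cpu_has_clmul(void);    /* placeholder capability check */

    /* Per-platform init: pick the PCLMULQDQ code if available, else the
     * portable 4-bit table implementation. */
    void
    gcm128_init_md(struct gcm128_md *ctx, const uint64_t H[2])
    {
        if (crypto_cpu_has_clmul()) {
            gcm_init_clmul(ctx->Htable, H);
            ctx->gmult = gcm_gmult_clmul;
            ctx->ghash = gcm_ghash_clmul;
        } else {
            gcm_init_4bit(ctx->Htable, H);
            ctx->gmult = gcm_gmult_4bit;
            ctx->ghash = gcm_ghash_4bit;
        }
    }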
|
|
|
|
|
|
|
|
| |
Since we always initialise the gmult/ghash function pointers, use the same
implementation of gcm_mul() and gcm_ghash(), regardless of the actual
underlying implementation.
ok tb@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OPENSSL_IA32_SSE2 flag controls whether a number of the perlasm
scripts generate additional implementations that use SSE2 functionality.
In all cases except ghash, the code checks OPENSSL_ia32cap_P for SSE2
support before trying to run SSE2 code. For ghash it generates a CLMUL
based implementation in addition to two different MMX versions (one
hides behind OPENSSL_IA32_SSE2, the other does not); however, this code
does not appear to actually use SSE2. We also disable AES-NI on i386 if
OPENSSL_IA32_SSE2 is not defined.
On OpenBSD we've always defined OPENSSL_IA32_SSE2, so this is effectively
a no-op. The only change is that we now check for MMX rather than SSE2
before using the ghash MMX implementation.
ok bcook@ beck@
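For reference, the generated code's feature tests boil down to checking bits of OPENSSL_ia32cap_P that mirror CPUID.1:EDX (MMX is bit 23, SSE2 is bit 26); the declaration and exact layout are assumptions in this sketch.

    #include <stdint.h>

    extern uint32_t OPENSSL_ia32cap_P[4];    /* assumed shape */

    /* CPUID.1:EDX feature bits mirrored in the first word. */
    #define IA32CAP_MMX    (1U << 23)
    #define IA32CAP_SSE2   (1U << 26)

    static int
    have_mmx(void)
    {
        return (OPENSSL_ia32cap_P[0] & IA32CAP_MMX) != 0;
    }

    static int
    have_sse2(void)
    {
        return (OPENSSL_ia32cap_P[0] & IA32CAP_SSE2) != 0;
    }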
|
|
|
|
|
| |
Fix some things that got missed in the last pass - the majority is use of
post-increment rather than unnecessary pre-increment.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Rework some logic, add explicit numerical checks, move assignment out of
variable declaration and use post-increment/post-decrement unless there is
a specific reason to do pre-increment.
ok kenjiro@ tb@
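A contrived before/after of the kind of change meant here (not the actual diff):

    #include <stddef.h>

    /* Before: assignment buried in the declaration, implicit truth test
     * on a length, pre-increment where post-increment reads better. */
    static void
    xor_block_before(unsigned char *out, const unsigned char *in,
        const unsigned char *pad, size_t len)
    {
        size_t i = 0;

        while (len) {
            out[i] = in[i] ^ pad[i];
            ++i;
            --len;
        }
    }

    /* After: explicit numerical comparison, assignment moved out of the
     * declaration, post-increment/post-decrement. */
    static void
    xor_block_after(unsigned char *out, const unsigned char *in,
        const unsigned char *pad, size_t len)
    {
        size_t i;

        i = 0;
        while (len > 0) {
            out[i] = in[i] ^ pad[i];
            i++;
            len--;
        }
    }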
|
|
|
|
|
|
| |
When checking the GCM tag, use timingsafe_memcmp() instead of memcmp().
ok tb@
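A sketch of what a constant-time tag check looks like; timingsafe_memcmp(3) is in libc on OpenBSD, while the helper and its arguments here are placeholders.

    #include <string.h>    /* timingsafe_memcmp() */

    /* Compare the caller-supplied tag against the computed one without
     * leaking, via timing, how many leading bytes matched. */
    static int
    gcm_tag_ok(const unsigned char *computed, const unsigned char *expected,
        size_t tag_len)
    {
        return timingsafe_memcmp(computed, expected, tag_len) == 0;
    }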
|
| |
|
|
|
|
|
|
|
|
| |
This adds significant complexity to the code. On amd64 and aarch64 it
results in a minimal slowdown for aligned inputs and a performance
improvement for unaligned inputs.
ok beck@ joshua@ tb@
|
| |
|
|
|
|
| |
Discussed with tb@
|
|
|
|
|
|
|
|
| |
The last #else branch in CRYPTO_gcm128_init() doesn't initialize the
function pointers for gmult/ghash, which results in a segfault when
using GCM on architectures taking this branch, notably sparc64.
found by and fix from jca
|
|
|
|
| |
ok jsing@
|
|
|
|
| |
No change in generated assembly.
|
|
|
|
|
|
|
|
|
| |
Instead of using size_t and a PACK macro, store the entries as uint16_t and
then unconditionally left shift by 48 bits. This gives a small performance gain
on some architectures and has the advantage of reducing the size of the
table from 1024 bits to 256 bits.
ok beck@ joshua@ tb@
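Roughly what this amounts to; the constants are the standard GCM 4-bit reduction values, while the helper and its use are only a sketch.

    #include <stdint.h>

    /* GCM 4-bit reduction constants, now stored as 16-bit entries
     * (16 * 2 bytes = 256 bits) instead of size_t entries
     * (16 * 8 bytes = 1024 bits on LP64). */
    static const uint16_t rem_4bit[16] = {
        0x0000, 0x1c20, 0x3840, 0x2460,
        0x7080, 0x6ca0, 0x48c0, 0x54e0,
        0xe100, 0xfd20, 0xd940, 0xc560,
        0x9180, 0x8da0, 0xa9c0, 0xb5e0,
    };

    /* At the point of use the entry is widened and shifted into the top
     * 16 bits of the 64-bit accumulator, replacing the old PACK() macro. */
    static inline uint64_t
    rem_4bit_shifted(unsigned int rem)
    {
        return (uint64_t)rem_4bit[rem & 0xf] << 48;
    }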
|
|
|
|
|
|
|
|
|
| |
The REDUCE1BIT macro is now only used in one place, so just inline it.
Additionally we do not need separate 32 bit and 64 bit versions - just use
the 64 bit version and let the compiler deal with it (we effectively get
the same code on i386).
ok beck@ joshua@
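The 64-bit reduction step being inlined is essentially the following (a sketch of the macro body; the function and variable naming is illustrative):

    #include <stdint.h>

    /* One-bit reduction in GF(2^128) for GCM: if the bit shifted out of
     * the low word is set, fold the reduction polynomial (0xe1 in the
     * top byte) back in.  The 0 - (lo & 1) trick builds an all-ones or
     * all-zeros mask without branching. */
    static inline void
    gcm_reduce_1bit(uint64_t *hi, uint64_t *lo)
    {
        uint64_t t = 0xe100000000000000ULL & (0 - (*lo & 1));

        *lo = (*hi << 63) | (*lo >> 1);
        *hi = (*hi >> 1) ^ t;
    }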
|
|
|
|
|
|
|
|
|
| |
TABLE_BITS is currently always defined as 4; 8 is considered to be
insecure due to timing leaks and 1 is considerably slower. Remove code
that is not regularly tested, does not serve a lot of purpose and is making
clean up harder than it needs to be.
ok tb@
|
|
|
|
|
|
|
|
|
| |
Rather than having defines for GCM_MUL/GHASH (along with the wonder that
is GCM_FUNCREF_4BIT) then conditioning on their availability, provide and
call gcm_mul()/gcm_ghash() unconditionally. This simplifies all of the call
sites.
ok tb@
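In shape, the new helpers are small wrappers around the function pointers; the context layout below is a stand-in, not the real GCM128_CONTEXT.

    #include <stddef.h>
    #include <stdint.h>

    /* Minimal stand-in for the relevant gcm128 context members. */
    struct gcm128_ctx_sketch {
        uint64_t Xi[2];
        const void *Htable;
        void (*gmult)(uint64_t Xi[2], const void *Htable);
        void (*ghash)(uint64_t Xi[2], const void *Htable,
            const uint8_t *in, size_t len);
    };

    /* Call sites can now use these unconditionally instead of the old
     * GCM_MUL/GHASH macros guarded by GCM_FUNCREF_4BIT. */
    static void
    gcm_mul(struct gcm128_ctx_sketch *ctx)
    {
        ctx->gmult(ctx->Xi, ctx->Htable);
    }

    static void
    gcm_ghash(struct gcm128_ctx_sketch *ctx, const uint8_t *in, size_t len)
    {
        ctx->ghash(ctx->Xi, ctx->Htable, in, len);
    }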
|
|
|
|
|
|
|
| |
Also condition on defined(GHASH_CHUNK) since this is used within these
blocks. This makes the conditionals consistent with other usage.
Fixes build with TABLE_BITS == 1.
|
|
|
|
| |
ok tb@
|
|
|
|
|
|
|
|
| |
A modern compiler will unroll these loops - LLVM produces identical code
(at least on arm64). Drop the manually unrolled version and have code that
is more readable and maintainable.
ok tb@
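For example, a straightforward block XOR like the one below is something LLVM will unroll (or vectorise) on its own; keeping a hand-unrolled copy buys nothing. Illustrative only.

    #include <stdint.h>

    /* The readable version: modern compilers unroll this themselves. */
    static void
    xor_block16(uint8_t out[16], const uint8_t a[16], const uint8_t b[16])
    {
        int i;

        for (i = 0; i < 16; i++)
            out[i] = a[i] ^ b[i];
    }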
|
|
|
|
| |
ok beck@ tb@
|
|
|
|
|
|
|
|
|
|
|
| |
We're already using 64 bit variables, so just continue to do so and let
the compiler deal with code generation. While here, use unsigned right
shifts instead of relying on signed right shifts and implementation-defined
behaviour (which the original code did).
Feedback from lucas@
ok beck@ tb@
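The pattern in question, as a standalone illustration: building an all-ones/all-zeros mask from a single bit without going through an arithmetic right shift of a signed value.

    #include <stdint.h>

    /* Old idiom: right-shifting a signed value to smear the top bit
     * across the word relies on implementation-defined behaviour. */
    static uint64_t
    topbit_mask_signed(uint64_t v)
    {
        return (uint64_t)((int64_t)v >> 63);
    }

    /* Well-defined replacement: isolate the top bit with an unsigned
     * shift and negate it to get the same all-ones/all-zeros mask. */
    static uint64_t
    topbit_mask_unsigned(uint64_t v)
    {
        return 0 - (v >> 63);
    }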
|
|
|
|
|
|
|
|
|
| |
This appears to have been broken since 2013 when OpenSSL commit 3b4be0018b5
landed. This added in_t and out_t variables, but continued to use in and
out instead. Yet another reason why untested conditional code is a bad
thing.
ok beck@ tb@
|
|
|
|
|
|
|
| |
We do not build with OPENSSL_SMALL_FOOTPRINT, and removing it gets rid of
more untested code paths.
Requested by tb@ (and it was already on my TODO list!)
|
| |
|
|
|
|
|
|
| |
While here, tidy up the assignment of n and test directly.
ok tb@
|
|
|
|
| |
ok tb@
|
|
|
|
| |
ok tb@
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OPENSSL_cpu_caps() change after the last bump missed a crucial bit:
there is more MD mess in the MI code than anticipated, with the result
that AES is now used without AES-NI on amd64 and i386, hurting machines
that previously greatly benefitted from it.
Temporarily add an internal crypto_cpu_caps_ia32() API that returns
OPENSSL_ia32cap_P or 0, as OPENSSL_cpu_caps() previously did. This can
be improved after the release.
Regression reported and fix tested by Mark Patruck.
No impact on public ABI or API.
with/ok jsing
PS: Next time my pkg_add feels very slow, I should perhaps not mechanically
blame IEEE 802.11...
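In shape, the temporary accessor is little more than this (a sketch; the declaration of OPENSSL_ia32cap_P and the exact return width are assumptions):

    #include <stdint.h>

    /* The usual ia32 capability words, filled in at startup. */
    extern uint32_t OPENSSL_ia32cap_P[4];    /* layout assumed */

    /* Hand MD code the raw capability bits on amd64/i386; every other
     * architecture simply gets 0, matching what OPENSSL_cpu_caps()
     * used to hand out. */
    uint64_t
    crypto_cpu_caps_ia32(void)
    {
    #if defined(__x86_64__) || defined(__i386__)
        return (uint64_t)OPENSSL_ia32cap_P[1] << 32 | OPENSSL_ia32cap_P[0];
    #else
        return 0;
    #endif
    }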
|
|
|
|
|
|
|
|
| |
gcm_{gmult,ghash}_4bit(), aesni_ccm64_decrypt_blocks(), aes_cbc_encrypt(),
and aesni_xts_{en,de}crypt() were overlooked in previous passes.
Found with a diff for ld.lld by kettenis
ok kettenis
|
|
|
|
|
|
|
|
|
| |
cet.h is needed for other platforms to emit the relevant .gnu.properties
sections that are necessary for them to enable IBT. It also avoids issues
with older toolchains on macOS that explode on encountering endbr64.
based on a diff by kettenis
ok beck kettenis
|
|
|
|
|
|
|
| |
This is a variant of the same logic error fixed in ghash-x86_64.pl r1.6.
The code path is only reachable on machines without FXSR or PCLMUL.
ok jsing
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The assembly code for gcm_ghash_4bit() reads one too many times from Xi,
resulting in a four byte overread. Prevent this by not loading the next
value in the final iteration of the loop. If another full iteration is
required the next Xi value will be loaded at the top of the outer_loop.
Many thanks to Douglas Gliner <Douglas.Gliner at sony dot com> for finding
and reporting this issue, along with a detailed reproducer.
Same diff from deraadt@
ok tb@
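In C terms the fix follows this pattern: only prefetch the next value when the loop will actually run again. A simplified analogy of the assembly change, not the routine itself.

    #include <stddef.h>
    #include <stdint.h>

    /* The next input byte used to be loaded unconditionally at the
     * bottom of the loop, reading past the end of the buffer on the
     * last iteration.  Only load it when another iteration will
     * consume it. */
    static uint32_t
    fold_bytes(const uint8_t *xi, size_t len)
    {
        uint32_t acc = 0;
        uint8_t next;
        size_t i;

        if (len == 0)
            return 0;

        next = xi[0];
        for (i = 0; i < len; i++) {
            acc = acc * 31 + next;

            /* Was: next = xi[i + 1];  -- one read too many. */
            if (i + 1 < len)
                next = xi[i + 1];
        }
        return acc;
    }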
|
|
|
|
|
|
| |
Replace a pile of byte order handling mess with htobe*() and be*toh().
ok tb@
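For example, serialising the GCM length block becomes a pair of htobe64() calls from <endian.h> instead of hand-rolled shifting; the helper below is only a sketch.

    #include <endian.h>    /* htobe64()/be64toh() */
    #include <stdint.h>
    #include <string.h>

    /* Write the bit lengths of the AAD and ciphertext as two
     * big-endian 64-bit values, forming the final GHASH block. */
    static void
    gcm_len_block(uint8_t block[16], uint64_t aad_len, uint64_t ct_len)
    {
        uint64_t alen = htobe64(aad_len << 3);    /* lengths in bits */
        uint64_t clen = htobe64(ct_len << 3);

        memcpy(block, &alen, 8);
        memcpy(block + 8, &clen, 8);
    }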
|
|
|
|
| |
ok tb@
|
|
|
|
| |
ok tb@
|
|
|
|
| |
Found by, compile tested & ok bluhm.
|
|
|
|
| |
ok jsing
|
|
|
|
| |
ok jsing, and kind of tb@ for an earlier version
|
|
|
|
| |
ok jsing
|
|
|
|
| |
ok miod
|
|
|
|
|
|
|
|
| |
At least gcc 12 on Fedora is very unhappy about a plain .rodata and throws
Error: unknown pseudo-op: `.rodata'. So add a .section in front of it to
make it happy.
ok deraadt miod
|
|
|
|
| |
responsible for getting the proper address of those blocks.
|