path: root/src/lib/libcrypto/arch

Commit message  (Author, Age; Files changed, Lines -/+)
* Mop up the now unused RC4_CHUNK defines.  (jsing, 7 days ago; 13 files, -130/+0)
  ok tb@
* Hook additional s2n-bignum routines to the amd64 build.  (jsing, 9 days ago; 1 file, -1/+11)
* Add CPU feature detection for ADX on amd64.  (jsing, 9 days ago; 2 files, -5/+10)
  Add detection of Multi-Precision Add-Carry Instruction Extensions on amd64.
  s2n-bignum provides a number of fast multiplication routines that can
  leverage these instructions.
  ok tb@
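
  For reference, a minimal sketch of how such a check can be done from C,
  assuming the compiler's <cpuid.h> helpers; ADX is reported in CPUID leaf 7,
  EBX bit 19. The flag value and function name are illustrative, not
  LibreSSL's actual symbols.

    #include <cpuid.h>
    #include <stdint.h>

    #define CPU_CAP_ADX (1ULL << 0)  /* illustrative flag, not LibreSSL's value */

    /* CPUID leaf 7, subleaf 0: ADX support is reported in EBX bit 19. */
    static uint64_t
    cpu_caps_detect_adx(void)
    {
        unsigned int eax, ebx, ecx, edx;

        if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) == 0)
            return 0;

        return (ebx & (1U << 19)) ? CPU_CAP_ADX : 0;
    }
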
* Remove DES_UNROLL from opensslconf.h.  (jsing, 2025-07-27; 13 files, -156/+0)
  This is no longer used in the DES code.
  ok tb@
* Remove BN_LLONG defines/undefs from opensslconf.h.  (jsing, 2025-07-23; 13 files, -65/+0)
  These have been ineffective since r1.19 of bn.h, when BN_LLONG/BN_ULLONG
  defines/undefs were added based on _LP64.
  ok tb@
* Remove crypto_cpu_caps_ia32()  (jsing, 2025-07-22; 4 files, -18/+4)
  There are no more consumers of crypto_cpu_caps_ia32(), so remove it.
  ok bcook@ joshua@ tb@
* Move AES-NI for ECB out of EVP.  (jsing, 2025-07-22; 2 files, -2/+4)
  Make aes_ecb_encrypt_internal() replaceable and provide machine dependent
  versions for amd64 and i386, which dispatch to AES-NI if appropriate. Remove
  the AES-NI specific EVP methods for ECB. This removes the last of the
  machine dependent code from EVP AES.
  ok bcook@ joshua@ tb@
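
  A rough sketch of the pattern this describes (a default implementation that
  a machine dependent file can override), with assumed prototypes and an
  illustrative capability flag; the real declarations live in LibreSSL's
  internal headers.

    #include <stddef.h>
    #include <stdint.h>

    /* Assumed prototypes - sketch only. */
    void aes_ecb_encrypt_generic(const uint8_t *in, uint8_t *out, size_t len,
        const void *key, int encrypt);
    void aesni_ecb_encrypt(const uint8_t *in, uint8_t *out, size_t len,
        const void *key, int encrypt);

    extern uint64_t crypto_cpu_caps_amd64;         /* capability bits (per the log) */
    #define CRYPTO_CPU_CAPS_AMD64_AES (1ULL << 1)  /* illustrative value */

    /* amd64/i386 provide this override; other platforms keep the generic code. */
    void
    aes_ecb_encrypt_internal(const uint8_t *in, uint8_t *out, size_t len,
        const void *key, int encrypt)
    {
        if (crypto_cpu_caps_amd64 & CRYPTO_CPU_CAPS_AMD64_AES)
            aesni_ecb_encrypt(in, out, len, key, encrypt);
        else
            aes_ecb_encrypt_generic(in, out, len, key, encrypt);
    }
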
* Move AES-NI from EVP to AES for CCM mode.  (jsing, 2025-07-21; 2 files, -2/+4)
  The mode implementation for CCM has two variants - one takes the block
  function, while the other takes a "ccm64" function. The latter is expected
  to handle the lower 64 bits of the IV/counter, but only for 16 byte blocks.
  The AES-NI implementation for CCM currently uses the second variant.
  Provide aes_ccm64_encrypt_internal() as a function that can be replaced on a
  machine dependent basis, along with an aes_ccm64_encrypt_generic() function
  that provides the default implementation and can be used as a fallback. Wire
  up the AES-NI version for amd64 and i386, change EVP's aes_ccm_cipher() to
  use CRYPTO_ctr128_{en,de}crypt_ccm64() with aes_ccm64_encrypt_internal() and
  remove the various AES-NI specific EVP_CIPHER methods for CCM.
  ok tb@
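
  To illustrate what a "ccm64" style routine handles, a simplified sketch that
  processes whole 16 byte blocks and increments only the low 64 bits of the
  counter; it omits the CBC-MAC half of CCM, and the names and prototypes are
  illustrative rather than LibreSSL's.

    #include <stddef.h>
    #include <stdint.h>

    typedef void (*block128_f)(const uint8_t in[16], uint8_t out[16],
        const void *key);

    static void
    ccm64_ctr_encrypt_sketch(const uint8_t *in, uint8_t *out, size_t blocks,
        const void *key, uint8_t ctr[16], block128_f block)
    {
        uint8_t keystream[16];
        uint64_t lo;
        size_t i;

        /* Low 64 bits of the counter block, big endian. */
        lo = ((uint64_t)ctr[8] << 56) | ((uint64_t)ctr[9] << 48) |
            ((uint64_t)ctr[10] << 40) | ((uint64_t)ctr[11] << 32) |
            ((uint64_t)ctr[12] << 24) | ((uint64_t)ctr[13] << 16) |
            ((uint64_t)ctr[14] << 8) | (uint64_t)ctr[15];

        while (blocks-- > 0) {
            block(ctr, keystream, key);
            for (i = 0; i < 16; i++)
                out[i] = in[i] ^ keystream[i];
            in += 16;
            out += 16;

            /* Only the low 64 bits are advanced. */
            lo++;
            for (i = 0; i < 8; i++)
                ctr[15 - i] = (uint8_t)(lo >> (8 * i));
        }
    }
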
* Simplify AES-XTS implementation and remove AES-NI specific code from EVP.  (jsing, 2025-07-13; 2 files, -2/+4)
  Provide aes_xts_encrypt_internal() and call that from aes_xts_cipher(). Have
  amd64 and i386 provide their own versions that dispatch to
  aesni_xts_encrypt()/aesni_xts_decrypt() as appropriate. The AESNI_CAPABLE
  code and methods can then be removed.
  ok tb@
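
  A sketch of what the machine dependent override might look like, choosing
  between the AES-NI encrypt and decrypt entry points; the parameter lists are
  assumed for illustration, and the real version also falls back to a generic
  implementation when AES-NI is unavailable.

    #include <stddef.h>
    #include <stdint.h>

    /* Assumed prototypes for the AES-NI assembly entry points. */
    void aesni_xts_encrypt(const uint8_t *in, uint8_t *out, size_t len,
        const void *key1, const void *key2, const uint8_t iv[16]);
    void aesni_xts_decrypt(const uint8_t *in, uint8_t *out, size_t len,
        const void *key1, const void *key2, const uint8_t iv[16]);

    void
    aes_xts_encrypt_internal(const uint8_t *in, uint8_t *out, size_t len,
        const void *key1, const void *key2, const uint8_t iv[16], int encrypt)
    {
        if (encrypt)
            aesni_xts_encrypt(in, out, len, key1, key2, iv);
        else
            aesni_xts_decrypt(in, out, len, key1, key2, iv);
    }
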
* Provide accelerated SHA-1 for aarch64.  (jsing, 2025-06-28; 2 files, -2/+5)
  Provide an assembly implementation of SHA-1 for aarch64 using the ARM
  Cryptographic Extension (CE). This results in around a 2x speed up for
  larger block sizes.
  ok tb@
* Rework gcm128 implementation selection for amd64/i386.  (jsing, 2025-06-28; 4 files, -4/+17)
  Provide gcm128_amd64.c and gcm128_i386.c, which contain the appropriate
  gcm128 initialisation and CPU feature tests for the respective platform.
  This allows for all of the #define spaghetti to be removed from gcm128.c and
  removes one of the two remaining consumers of crypto_cpu_caps_ia32().
  ok tb@
* Add CLMUL and MMX to machine dependent CPU capabilities for i386.  (jsing, 2025-06-28; 2 files, -4/+10)
  ok tb@
* Add CLMUL to machine dependent CPU capabilities for amd64.  (jsing, 2025-06-28; 2 files, -4/+7)
  ok tb@
* Move AES-NI from EVP to AES for CTR mode.  (jsing, 2025-06-27; 2 files, -4/+6)
  The mode implementation for CTR has two variants - one takes the block
  function, while the other takes a "ctr32" function. The latter is expected
  to handle the lower 32 bits of the IV/counter, but is not expected to handle
  overflow. The AES-NI implementation for CTR currently uses the second
  variant.
  Provide aes_ctr32_encrypt_internal() as a function that can be replaced on a
  machine dependent basis, along with an aes_ctr32_encrypt_generic() function
  that provides the default implementation and can be used as a fallback. Wire
  up the AES-NI version for amd64 and i386, change AES_ctr128_encrypt() to use
  CRYPTO_ctr128_encrypt_ctr32() (which calls aes_ctr32_encrypt_internal()) and
  remove the various AES-NI specific EVP_CIPHER methods for CTR. Callers of
  AES_ctr128_encrypt() will now use AES-NI, if available.
  ok tb@
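
  The "ctr32" contract comes down to the increment step: only the 32 bit big
  endian counter in the last four bytes of the IV is advanced, and the caller
  splits requests so it never wraps. A small sketch with an illustrative
  helper name.

    #include <stdint.h>

    /*
     * Advance only ivec[12..15]; there is deliberately no carry into
     * ivec[0..11], which is why the caller must prevent the 32 bit counter
     * from wrapping.
     */
    static void
    ctr32_increment(uint8_t ivec[16])
    {
        uint32_t ctr;

        ctr = ((uint32_t)ivec[12] << 24) | ((uint32_t)ivec[13] << 16) |
            ((uint32_t)ivec[14] << 8) | (uint32_t)ivec[15];
        ctr++;
        ivec[12] = (uint8_t)(ctr >> 24);
        ivec[13] = (uint8_t)(ctr >> 16);
        ivec[14] = (uint8_t)(ctr >> 8);
        ivec[15] = (uint8_t)ctr;
    }
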
* Integrate AES-NI into the AES code.  (jsing, 2025-06-15; 2 files, -2/+6)
  Currently, the AES-NI code is only integrated into EVP - add code to
  integrate AES-NI into AES. Rename the assembly provided functions and
  provide C versions for the original names, which check for AES-NI support
  and dispatch to the appropriate function. This means that the AES_* public
  API will now use AES-NI, if available.
  ok tb@
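
  For context, a small standalone example of the public API this affects: a
  caller using AES_set_encrypt_key()/AES_encrypt() directly (rather than EVP)
  now picks up AES-NI transparently when the CPU supports it. The all-zero key
  and block are placeholders.

    #include <stdio.h>
    #include <openssl/aes.h>

    int
    main(void)
    {
        unsigned char key[16] = { 0 };   /* demo values only */
        unsigned char in[16] = { 0 };
        unsigned char out[16];
        AES_KEY aes_key;
        int i;

        if (AES_set_encrypt_key(key, 128, &aes_key) != 0) {
            fprintf(stderr, "AES_set_encrypt_key failed\n");
            return 1;
        }

        /* Dispatches to AES-NI internally when available. */
        AES_encrypt(in, out, &aes_key);

        for (i = 0; i < 16; i++)
            printf("%02x", out[i]);
        printf("\n");

        return 0;
    }
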
* Provide machine dependent CPU capabilities for i386.  (jsing, 2025-06-15; 2 files, -3/+17)
  This indicates if AES-NI is available via CRYPTO_CPU_CAPS_I386_AES.
  ok tb@
* Provide CRYPTO_CPU_CAPS_AMD64_AES in machine dependent CPU capabilities.  (jsing, 2025-06-15; 2 files, -4/+7)
  ok tb@
* Remove BF_PTR  (tb, 2025-06-11; 13 files, -65/+0)
  In bf_local.h r1.2, openssl/opensslconf.h was pulled out of the
  HEADER_BF_LOCL_H header guard, so BF_PTR was never defined from
  opensslfeatures.h. Thus, alpha, mips64, sparc64 haven't used the path that
  is supposedly optimized for them. On the M3k the speed gain of bf-cbc with
  BF_PTR is roughly 5%, so not really great. This is blowfish, so I don't
  think we want to carry complications for alpha and mips64 only.
  ok jsing kenjiro
* one DES_LONG hid in arch/sh/opensslconf.h  (tb, 2025-06-09; 1 file, -8/+0)
* Move (mostly) MI constants to proper headers  (tb, 2025-06-09; 13 files, -408/+0)
  Most of the constants here are only defined if a specific header is in
  scope. So move the machine-independent macros to those headers and lose the
  header guards. Most of these should actually be typedefs but let's change
  this when we're bumping the major since this technically has ABI impact.
  IDEA_INT, RC2_INT and RC4_INT are always unsigned int.
  DES_LONG is always unsigned int except on i386.
  This preserves the existing situation on OpenBSD. If you're using portable
  on i386 with a compiler that does not define __i386__, there's an ABI break.
  ok jsing
* Make OPENSSL_IA32_SSE2 the default for i386 and remove the flag.  (jsing, 2025-06-09; 1 file, -2/+1)
  The OPENSSL_IA32_SSE2 flag controls whether a number of the perlasm scripts
  generate additional implementations that use SSE2 functionality. In all
  cases except ghash, the code checks OPENSSL_ia32cap_P for SSE2 support
  before trying to run SSE2 code. For ghash it generates a CLMUL based
  implementation in addition to different MMX versions (one MMX version hides
  behind OPENSSL_IA32_SSE2, the other does not), however this does not appear
  to actually use SSE2. We also disable AES-NI on i386 if OPENSSL_IA32_SSE2 is
  not defined.
  On OpenBSD, we've always defined OPENSSL_IA32_SSE2 so this is effectively a
  no-op. The only change is that we now check MMX rather than SSE2 for the
  ghash MMX implementation.
  ok bcook@ beck@
* Stop defining OPENSSL_IA32_SSE2 on amd64.  (jsing, 2025-06-09; 1 file, -2/+1)
  This no longer does anything on this architecture.
  ok bcook@ beck@
* Remove ${MULTIPLE_OF_EIGHT}_BIT*  (tb, 2025-06-08; 13 files, -144/+0)
  These are unused internally and very few things look at them, none of which
  should really matter to us, except possibly free pascal on Windows. sizeof
  has been available since forever...
  ok jsing
* Garbage collect DES_PTR  (tb, 2025-06-08; 13 files, -78/+0)
  pointed out by/ok jsing
* Remove DES_RISC*  (tb, 2025-06-08; 13 files, -715/+0)
  codesearch.debian.net only shows some legacy openssl patches plus binkd (a
  FidoNet mailer) as the sole potential user. net-snmp and a strongswan DES
  plugin bundle some opt-in libdes/openssl legacy things. If this should break
  any of this, I don't think we need to care. If you're really going to use
  DES you can also use a non-bleeding-edge libressl.
  We can remove the big 'default values' block because one of DES_RISC1,
  DES_RISC2, DES_UNROLL is always defined (you can ignore DES_PTR for this),
  so this is dead support code for mostly dead platforms.
  ok kenjiro
* Rename the header guard of des.h to HEADER_DES_H  (tb, 2025-06-05; 13 files, -13/+13)
  libdes is dead, Jim. Only its successors continue to haunt us.
  discussed with jsing
* Remove preprocessor branching on HEADER_DES_H  (tb, 2025-06-05; 13 files, -13/+13)
  This was the header guard for des_old.h introduced in 2002 and removed in
  2014. The header guard for des.h is HEADER_NEW_DES_H for the sake of
  inconsistency (ostensibly due to backward compat concerns with libdes).
  ok jsing
* opensslconf.h: remove md2 leftovers  (tb, 2025-06-05; 13 files, -52/+0)
  md2.h left on Apr 15, 2014, along with jpake and seed. In particular,
  HEADER_MD2_H is never defined. These bits have been dead ever since.
  ok jsing
* Disable libcrypto assembly on arm.  (jsing, 2025-05-24; 5 files, -257/+2)
  The arm CPU capability detection uses SIGILL and is unsafe to call from some
  contexts. Furthermore, this is only useful to detect NEON support, which is
  then unused on OpenBSD due to __STRICT_ALIGNMENT. Requiring a minimum of
  ARMv7+VFP+NEON is also not unreasonable.
  The SHA-1, SHA-256 and SHA-512 (non-NEON) C code performs within ~5% of the
  assembly, as does RSA when using the C based Montgomery multiplication. The
  C versions of the AES and GHASH code are around ~40-50% of the assembly,
  however if you care about performance you really want to use
  Chacha20Poly1305 on this platform.
  This will enable further clean up to proceed.
  ok joshua@ kenjiro@ tb@
* Remove BS-AES and VP-AES from EVP.  (jsing, 2025-04-18; 2 files, -8/+2)
  The bitsliced and vector permutation AES implementations were created around
  2009, in attempts to speed up AES on Intel hardware. Both require SSSE3,
  which existed from around 2006. Intel introduced AES-NI in 2008 and a large
  percentage of Intel/AMD CPUs made in the last 15 years include it. AES-NI is
  significantly faster and requires less code. Furthermore, the BS-AES and
  VP-AES implementations are wired directly into EVP (as is AES-NI currently),
  which means that any consumers of the AES_* API are not able to benefit from
  acceleration.
  Removing these greatly simplifies the EVP AES code - if you just happen to
  have a CPU that supports SSSE3 but not AES-NI, then you'll now use the
  regular AES assembly implementations instead.
  ok kettenis@ tb@
* Provide an accelerated SHA-512 assembly implementation for aarch64.  (jsing, 2025-03-12; 2 files, -2/+7)
  This provides a SHA-512 assembly implementation that makes use of the ARM
  Cryptographic Extension (CE), which is found on many arm64 CPUs. This gives
  a performance gain of up to 2.5x on an Apple M2 (dependent on block size).
  If an aarch64 machine does not have SHA512 support, then we'll fall back to
  using the existing C implementation.
  ok kettenis@ tb@
* Support OPENSSL_NO_FILENAMES  (tb, 2025-03-09; 13 files, -0/+130)
  Some people are concerned that leaking a user name is a privacy issue.
  Allow disabling the __FILE__ and __LINE__ arguments in the error stack to
  avoid this. This can be improved a bit in tree.
  From Viktor Szakats in https://github.com/libressl/portable/issues/761
  ok bcook jsing
* Provide an accelerated SHA-256 assembly implementation for aarch64.  (jsing, 2025-03-07; 2 files, -2/+9)
  This provides a SHA-256 assembly implementation that makes use of the ARM
  Cryptographic Extension (CE), which is found on many arm64 CPUs. This gives
  a performance gain of up to 7.5x on an Apple M2 (dependent on block size).
  If an aarch64 machine does not have SHA2 support, then we'll fall back to
  using the existing C implementation.
  ok kettenis@ tb@
* Replace Makefile based SHA*_ASM defines with HAVE_SHA_* defines.  (jsing, 2025-02-14; 17 files, -38/+80)
  Currently, SHA{1,256,512}_ASM defines are used to remove the C
  implementation of sha{1,256,512}_block_data_order() when it is provided by
  assembly. However, this prevents the C implementation from being used as a
  fallback.
  Rename the C sha*_block_data_order() to sha*_block_generic() and provide a
  sha*_block_data_order() that calls sha*_block_generic(). Replace the
  Makefile based SHA*_ASM defines with two HAVE_SHA_* defines that allow these
  functions to be compiled in or removed, such that machine specific versions
  can be provided. This should effectively be a no-op on any platform that
  defined SHA{1,256,512}_ASM.
  ok tb@
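
  A sketch of the compile-time fallback pattern described here, using SHA-256
  as the example; the HAVE_* macro spelling and the context type are assumed
  for illustration, not taken from the tree.

    #include <stddef.h>

    typedef struct sha256_ctx_st SHA256_CTX;  /* stand-in for the real context */

    /* Portable C implementation (formerly sha256_block_data_order()). */
    void sha256_block_generic(SHA256_CTX *ctx, const void *in, size_t num);

    #ifndef HAVE_SHA256_BLOCK_DATA_ORDER
    /*
     * Default wrapper: compiled out when an architecture defines the HAVE_*
     * macro and supplies its own sha256_block_data_order().
     */
    void
    sha256_block_data_order(SHA256_CTX *ctx, const void *in, size_t num)
    {
        sha256_block_generic(ctx, in, num);
    }
    #endif
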
* Mop up RC4_INDEX.  (jsing, 2025-01-27; 13 files, -91/+0)
  The RC4_INDEX define switches between base pointer indexing and per-byte
  pointer increment. This supposedly made a huge difference to performance on
  x86 at some point, however compilers have improved somewhat since then.
  There is no change (or effectively no change) in generated assembly on the
  majority of LLVM platforms and even when there is some change (e.g.
  aarch64), there is no noticeable performance difference.
  Simplify the (still messy) macros/code and mop up RC4_INDEX.
  ok tb@
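
  A trivial illustration of the two styles the define toggled between - base
  pointer indexing versus per-byte pointer increment - using a stand-in loop
  rather than the actual RC4 macros.

    #include <stddef.h>
    #include <stdint.h>

    /* Base pointer indexing: the array is always addressed as d[i]. */
    static uint8_t
    sum_indexed(const uint8_t *d, size_t len)
    {
        uint8_t acc = 0;
        size_t i;

        for (i = 0; i < len; i++)
            acc += d[i];
        return acc;
    }

    /* Per-byte pointer increment: the pointer itself is advanced each step. */
    static uint8_t
    sum_incremented(const uint8_t *d, size_t len)
    {
        uint8_t acc = 0;

        while (len-- > 0)
            acc += *d++;
        return acc;
    }
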
* Provide a readable assembly implementation for MD5 on amd64.  (jsing, 2025-01-24; 1 file, -2/+2)
  This appears to be about 5% faster than the current perlasm version on a
  modern Intel CPU. While here rename md5_block_asm_data_order to
  md5_block_data_order, for consistency with other hashes.
  ok tb@
* Provide a SHA-1 assembly implementation for amd64 using SHA-NI.  (jsing, 2024-12-06; 1 file, -1/+2)
  This provides a SHA-1 assembly implementation for amd64, which uses the
  Intel SHA Extensions (aka SHA New Instructions or SHA-NI). This provides a
  2-2.5x performance gain on some Intel CPUs and many AMD CPUs.
  ok tb@
* Provide a replacement assembly implementation for SHA-1 on amd64.  (jsing, 2024-12-04; 1 file, -2/+3)
  As already done for SHA-256 and SHA-512, replace the perlasm generated SHA-1
  assembly implementation with one that is actually readable. Call the
  assembly implementation from a C wrapper that can, in the future, dispatch
  to alternate implementations.
  On a modern CPU the performance is around 5% faster than the base
  implementation generated by sha1-x86_64.pl, however it is around 15% slower
  than the excessively complex SSSE3/AVX version that is also generated by the
  same script (a SHA-NI version will greatly outperform this and is much
  cleaner/simpler).
  ok tb@
* Provide a SHA-256 assembly implementation for amd64 using SHA-NI.  (jsing, 2024-11-16; 1 file, -1/+2)
  This provides a SHA-256 assembly implementation for amd64, which uses the
  Intel SHA Extensions (aka SHA New Instructions or SHA-NI). This provides a
  3-5x performance gain on some Intel CPUs and many AMD CPUs.
  ok tb@
* Provide a replacement assembly implementation for SHA-512 on amd64.  (jsing, 2024-11-16; 1 file, -6/+3)
  Replace the perlasm generated SHA-512 assembly with a more readable version
  and the same C wrapper introduced for SHA-256. As for SHA-256, on a modern
  CPU the performance is largely the same.
  ok tb@
* Add CPU capability detection for the Intel SHA extensions (aka SHA-NI).  (jsing, 2024-11-16; 2 files, -5/+27)
  This also provides a crypto_cpu_caps_amd64 variable that can be checked for
  CRYPTO_CPU_CAPS_AMD64_SHA.
  ok tb@
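
  A minimal sketch of how the SHA extensions can be detected and recorded in
  such a caps variable, assuming the standard CPUID interface (leaf 7, EBX bit
  29); the flag value shown is illustrative, not LibreSSL's actual definition.

    #include <cpuid.h>
    #include <stdint.h>

    #define CRYPTO_CPU_CAPS_AMD64_SHA (1ULL << 0)  /* illustrative value */

    uint64_t crypto_cpu_caps_amd64;

    /* CPUID leaf 7, subleaf 0: the SHA extensions are reported in EBX bit 29. */
    static void
    crypto_cpu_caps_init_sketch(void)
    {
        unsigned int eax, ebx, ecx, edx;

        if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) == 0)
            return;

        if (ebx & (1U << 29))
            crypto_cpu_caps_amd64 |= CRYPTO_CPU_CAPS_AMD64_SHA;
    }
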
* Add comment for crypto_cpu_caps_aarch64.  (jsing, 2024-11-12; 1 file, -1/+2)
* Check the correct variable in cpuid().  (jsing, 2024-11-12; 2 files, -4/+4)
* Provide a replacement assembly implementation for SHA-256 on amd64.  (jsing, 2024-11-08; 1 file, -6/+3)
  Replace the perlasm generated SHA-256 assembly implementation with one that
  is actually readable. Call the assembly implementation from a C wrapper that
  can, in the future, dispatch to alternate implementations.
  Performance is similar (or even better) on modern CPUs, while somewhat
  slower on older CPUs (this is in part due to the wrapper, the impact of
  which is more noticeable with small block sizes).
  Thanks to gkoehler@ and tb@ for testing.
  ok tb@
* Replace aarch64 CPU capabilities detection code.  (jsing, 2024-11-08; 6 files, -261/+114)
  Replace the aarch64 CPU detection code with a version that parses ISAR0,
  avoiding signal handling and SIGILL. This gets ISAR0 via sysctl(), but this
  can be adapted to other mechanisms for other platforms (or alternatively the
  same can be achieved via HWCAP). This now follows the same naming/design as
  used by amd64 and i386, hence define HAVE_CRYPTO_CPU_CAPS_INIT for aarch64.
  ok kettenis@ tb@
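
  A sketch of that approach, assuming OpenBSD's machdep sysctl for
  ID_AA64ISAR0 (the CPU_ID_AA64ISAR0 MIB name is an assumption here) and
  illustrative capability flags; the field positions are the architectural
  ID_AA64ISAR0_EL1 definitions (AES in bits 7:4, SHA2 in bits 15:12).

    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <machine/cpu.h>

    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative capability bits - not LibreSSL's actual values. */
    #define CAP_AARCH64_AES    (1ULL << 0)
    #define CAP_AARCH64_SHA256 (1ULL << 1)
    #define CAP_AARCH64_SHA512 (1ULL << 2)

    static uint64_t
    crypto_cpu_caps_aarch64_sketch(void)
    {
        int mib[2] = { CTL_MACHDEP, CPU_ID_AA64ISAR0 };  /* assumed MIB name */
        uint64_t isar0 = 0, caps = 0;
        size_t len = sizeof(isar0);

        if (sysctl(mib, 2, &isar0, &len, NULL, 0) == -1)
            return 0;

        /* ID_AA64ISAR0_EL1: AES field in bits [7:4], SHA2 field in [15:12]. */
        if (((isar0 >> 4) & 0xf) >= 1)
            caps |= CAP_AARCH64_AES;
        if (((isar0 >> 12) & 0xf) >= 1)
            caps |= CAP_AARCH64_SHA256;
        if (((isar0 >> 12) & 0xf) >= 2)
            caps |= CAP_AARCH64_SHA512;

        return caps;
    }
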
* cryptlib.h: adjust header guard for upcoming surgery  (tb, 2024-11-05; 13 files, -13/+13)
  It is gross that an internal detail leaked into a public header, but, hey,
  it's openssl. No hack is too terrible to appear in this library.
  opensslconf.h needs major pruning but the day that happens is not today.
  ok jsing
* Clean up PPC CPU capabilities and Montgomery code.  (jsing, 2024-11-01; 2 files, -12/+4)
  ppc64-mont.pl (which produces bn_mul_mont_fpu64()) is unused on both powerpc
  and powerpc64, so remove it. ppccap.c doesn't actually contain anything to
  do with CPU capabilities - it just provides a bn_mul_mont() that calls
  bn_mul_mont_int() (which ppc-mont.pl generates). Change ppc-mont.pl to
  generate bn_mul_mont() directly and remove ppccap.c.
  ok tb@
* Remove IA32 specific code from cryptlib.c.  (jsing, 2024-10-19; 4 files, -6/+20)
  Move the IA32 specific code to arch/{amd64,i386}/crypto_cpu_caps.c, rather
  than polluting cryptlib.c with machine dependent code. A stub version of
  crypto_cpu_caps_ia32() still remains for now.
* Remove unused sparc CPU capability detection code.  (jsing, 2024-10-19; 1 file, -5/+1)
  This has been unused for a long time - it can be found in the attic if
  someone wants to clean it up and enable it in the future.
  ok tb@
* Provide crypto_cpu_caps_init() for i386.  (jsing, 2024-10-18; 3 files, -10/+120)
  This is the same CPU capabilities code that is now used for amd64. Like
  amd64, we now only populate OPENSSL_ia32cap_P with bits used by perlasm.
  Discussed with tb@