summaryrefslogtreecommitdiff
path: root/src/lib/libcrypto/arch/amd64 (follow)
Commit message (Collapse)AuthorAgeFilesLines
* This commit was manufactured by cvs2git to create tag 'tb_20250414'.tb_20250414cvs2svn2025-04-144-405/+0
|
* Support OPENSSL_NO_FILENAMEStb2025-03-091-0/+10
| | | | | | | | | | Some people are concerned that leaking a user name is a privacy issue. Allow disabling the __FILE__ and __LINE__ argument in the error stack to avoid this. This can be improved a bit in tree. From Viktor Szakats in https://github.com/libressl/portable/issues/761 ok bcook jsing
* Replace Makefile based SHA*_ASM defines with HAVE_SHA_* defines.jsing2025-02-142-5/+11
| | | | | | | | | | | | | | | | Currently, SHA{1,256,512}_ASM defines are used to remove the C implementation of sha{1,256,512}_block_data_order() when it is provided by assembly. However, this prevents the C implementation from being used as a fallback. Rename the C sha*_block_data_order() to sha*_block_generic() and provide a sha*_block_data_order() that calls sha*_block_generic(). Replace the Makefile based SHA*_ASM defines with two HAVE_SHA_* defines that allow these functions to be compiled in or removed, such that machine specific verisons can be provided. This should effectively be a no-op on any platform that defined SHA{1,256,512}_ASM. ok tb@
* Mop up RC4_INDEX.jsing2025-01-271-7/+0
| | | | | | | | | | | | | The RC4_INDEX define switches between base pointer indexing and per-byte pointer increment. This supposedly made a huge difference to performance on x86 at some point, however compilers have improved somewhat since then. There is no change (or effectively no change) in generated assembly on a the majority of LLVM platforms and even when there is some change (e.g. aarch64), there is no noticable performance difference. Simplify the (still messy) macros/code and mop up RC4_INDEX. ok tb@
* Provide a readable assembly implementation for MD5 on amd64.jsing2025-01-241-2/+2
| | | | | | | | | | This appears to be about 5% faster than the current perlasm version on a modern Intel CPU. While here rename md5_block_asm_data_order to md5_block_data_order, for consistency with other hashes. ok tb@
* Provide a SHA-1 assembly implementation for amd64 using SHA-NI.jsing2024-12-061-1/+2
| | | | | | | | This provides a SHA-1 assembly implementation for amd64, which uses the Intel SHA Extensions (aka SHA New Instructions or SHA-NI). This provides a 2-2.5x performance gain on some Intel CPUs and many AMD CPUs. ok tb@
* Provide a replacement assembly implementation for SHA-1 on amd64.jsing2024-12-041-2/+3
| | | | | | | | | | | | | As already done for SHA-256 and SHA-512, replace the perlasm generated SHA-1 assembly implementation with one that is actually readable. Call the assembly implementation from a C wrapper that can, in the future, dispatch to alternate implementations. On a modern CPU the performance is around 5% faster than the base implementation generated by sha1-x86_64.pl, however it is around 15% slower than the excessively complex SSSE2/AVX version that is also generated by the same script (a SHA-NI version will greatly outperform this and is much cleaner/simpler). ok tb@
* Provide a SHA-256 assembly implementation for amd64 using SHA-NI.jsing2024-11-161-1/+2
| | | | | | | | This provides a SHA-256 assembly implementation for amd64, which uses the Intel SHA Extensions (aka SHA New Instructions or SHA-NI). This provides a 3-5x performance gain on some Intel CPUs and many AMD CPUs. ok tb@
* Provide a replacement assembly implementation for SHA-512 on amd64.jsing2024-11-161-6/+3
| | | | | | | | Replace the perlasm generated SHA-512 assembly with a more readable version and the same C wrapper introduced for SHA-256. As for SHA-256, on a modern CPU the performance is largely the same. ok tb@
* Add CPU capability detection for the Intel SHA extensions (aka SHA-NI).jsing2024-11-162-5/+27
| | | | | | | This also provides a crypto_cpu_caps_amd64 variable that can be checked for CRYPTO_CPU_CAPS_AMD64_SHA. ok tb@
* Check the correct variable in cpuid().jsing2024-11-121-2/+2
|
* Provide a replacement assembly implementation for SHA-256 on amd64.jsing2024-11-081-6/+3
| | | | | | | | | | | | | Replace the perlasm generated SHA-256 assembly implementation with one that is actually readable. Call the assembly implementation from a C wrapper that can, in the future, dispatch to alternate implementations. Performance is similar (or even better) on modern CPUs, while somewhat slower on older CPUs (this is in part due to the wrapper, the impact of which is more noticable with small block sizes). Thanks to gkoehler@ and tb@ for testing. ok tb@
* cryptlib.h: adjust header guard for upcoming surgerytb2024-11-051-1/+1
| | | | | | | | It is gross that an internal detail leaked into a public header, but, hey, it's openssl. No hack is too terrible to appear in this library. opensslconf.h needs major pruning but the day that happens is not today. ok jsing
* Remove IA32 specific code from cryptlib.c.jsing2024-10-192-3/+10
| | | | | | Move the IA32 specific code to arch/{amd64,i386}/crypto_cpu_caps.c, rather than polluting cryptlib.c with machine dependent code. A stub version of crypto_cpu_caps_ia32() still remains for now.
* Provide crypto_cpu_caps_init() for amd64.jsing2024-10-183-10/+120
| | | | | | | | | | | This is a CPU capability detection implementation in C, with minimal inline assembly (for cpuid and xgetbv). This replaces the assembly mess generated by x86_64cpuid.pl. Rather than populating OPENSSL_ia32cap_P directly with CPUID output, just set the bits that the remaining perlasm checks (namely AESNI, AVX, FXSR, INTEL, HT, MMX, PCLMUL, SSE, SSE2 and SSSE3). ok joshua@ tb@
* Unexport OPENSSL_cpuid_setup and OPENSSL_ia32cap_Ptb2024-08-311-1/+0
| | | | | | | | | This allows us in particular to get rid of the MD Symbols.list which were needed on amd64 and i386 for llvm 16 a while back. OPENSSL_ia32cap_P was never properly exported since the symbols were marked .hidden in the asm. ok beck jsing
* Provide and use crypto_arch.h.jsing2024-08-112-8/+35
| | | | | | | | Provide a per architecture crypto_arch.h - this will be used in a similar manner to bn_arch.h and will allow for architecture specific #defines and static inline functions. Move the HAVE_AES_* and HAVE_RC4_* defines here. ok tb@
* enable -fret-clean on amd64, for libc libcrypto ld.so kernel, and all thederaadt2024-06-041-1/+3
| | | | | ssh tools. The dynamic objects are entirely ret-clean, static binaries will contain a blend of cleaning and non-cleaning callers.
* Always use C functions for AES_{encrypt,decrypt}().jsing2024-03-291-1/+3
| | | | | | | Always provide AES_{encrypt,decrypt}() via C functions, which then either use a C implementation or call the assembly implementation. ok tb@
* Move camellia to primary Makefile.jsing2024-03-291-5/+1
| | | | These files are now built on all platforms.
* Stop building camellia assembly on amd64 and i386.jsing2024-03-291-3/+4
| | | | | | | This is a legacy algorithm and the assembly is only marginally faster than the C code. Discussed with beck@ and tb@
* Move aes_core.c to the primary Makefile.jsing2024-03-291-2/+1
| | | | This is now built on all platforms.
* Always use C functions for AES_set_{encrypt,decrypt}_key().jsing2024-03-291-1/+4
| | | | | | | | Always include aes_core.c and provide AES_set_{encrypt,decrypt}_key() via C functions, which then either use a C implementation or call the assembly implementation. ok tb@
* Move wp_block.c to the primary Makefile.jsing2024-03-291-3/+1
| | | | This is now built on all platforms.
* Stop building whirlpool assembly on amd64 and i386.jsing2024-03-291-3/+2
| | | | | | | This is a legacy algorithm and the assembly is only marginally faster than the C code. Discussed with beck@ and tb@
* Merge aes_cbc.c into aes.c now that aes_cbc.c is used on all platforms.jsing2024-03-281-2/+1
|
* Make AES_cbc_encrypt() always be a C function.jsing2024-03-281-1/+3
| | | | | | | | Rename the assembly generated functions from AES_cbc_encrypt() to aes_cbc_encrypt_internal(). Always include aes_cbc.c and change it to use defines that are similar to those used in BN. ok tb@
* Remove OPENSSL_UNISTD definetb2024-03-281-3/+0
|
* Move rc4.c to primary Makefile.jsing2024-03-281-2/+1
| | | | This is now built on all platforms.
* Use C functions for RC4 public API.jsing2024-03-281-1/+4
| | | | | | | | | | | | | | Rather than having public API switch between C and assembly, always use C functions as entry points, which then call an assembly implementation (if available). This makes it significantly easier to deal with symbol aliasing/namespaces and it also means we benefit from vulnerability prevention provided by the C compiler. Rename the assembly generated functions from RC4() to rc4_internal() and RC4_set_key() to rc4_set_key_internal(). Always include rc4.c and change it to use defines that are similar to those used in BN. ok beck@ joshua@ tb@
* Move des sources to primary Makefile.jsing2024-03-281-3/+1
| | | | | Now that all platforms use a C des implementation, move it to the primary Makefile.
* Remove assembly for stitched modes.jsing2024-03-271-4/+1
| | | | | The stitched modes have been removed, so having assembly for them is of little use.
* Move bf_enc.c to the primary Makefile.jsing2024-03-271-3/+1
| | | | | Now that all architectures are using bf_enc.c, it does not make sense to have it in every Makefile.inc file.
* split the Symbols.list up so that arch specific symbols do not end up everywhererobert2023-11-121-0/+1
| | | | ok tb@
* Stop building GF2m assemblytb2023-04-151-3/+1
| | | | | | | GF2m support will be removed shortly. In the interim drop some of this unused code already and let it fall back to the C implementation. ok jsing
* Enable s2n-bignum word_clz() on amd64.jsing2023-02-161-1/+2
| | | | | | | | | The BN_num_bits_word() function is a hot path, being called more than 80 million times during a libcrypto regress run. The word_clz() implementation uses five instructions to do the same as the generic code that uses more than 60 instructions. Discussed with tb@
* Use s2n-bignum assembly implementations for libcrypto bignum on amd64.jsing2023-01-291-2/+11
| | | | | | | This switches the core bignum assembly implementations from x86_64-gcc.c to s2n-bignum for amd64. ok miod@ tb@
* Provide an implementation of bn_sqr() that calls s2n-bignum's bignum_sqr().jsing2023-01-211-1/+6
| | | | ok tb@
* Remove unused Elliptic Curve code.jsing2023-01-141-5/+1
| | | | | | | | | | | | | For various reasons, the ecp_nistp* and ecp_nistz* code is unused. While ecp_nistp* was being compiled, it is disabled due to OPENSSL_NO_EC_NISTP_64_GCC_128 being defined. On the other hand, ecp_nistz* was not even being built. We will bring in new versions or alternative versions of such code, if we end up enabling it in the future. For now it is just causing complexity (and grep noise) while trying to improve the EC code. Discussed with tb@
* spelling fixes; from paul tagliamontejmc2022-12-261-2/+2
| | | | | | | i removed the arithmetics -> arithmetic changes, as i felt they were not clearly correct ok tb
* sprinkle a few missing dependencies on perl scripts internal bits.espie2017-08-201-5/+8
| | | | 'it works' deraadt@
* Disable ec assembly for amd64 pending fixes for ssh, and bumpbeck2016-11-111-4/+4
| | | | majors appropriately
* Ride the current major bump and enable assembler code for nist 256p curve,miod2016-11-041-1/+5
| | | | | | | | on amd64 only for now. Stanzas to enable it on arm, i386 and sparc64 are provided but commented out for lack of testing due to the machine room being currently in storage. ok jsing@
* Remove I386_ONLY define. It was only used to prefer amiod2016-11-041-3/+0
| | | | | | | faster-on-genuine-80386-but-slower-on-80486-onwards innstruction sequence in the SHA512 code, and had not been enabled in years, if at all. ok tom@ bcook@
* Pass "openbsd" instead of "openbsd-elf" as the "flavour" to the perl assemblermiod2015-09-111-2/+2
| | | | | machinery. OpenBSD has never been not ELF on amd64, and changing this will actually make -portable life slightly easier in the near future.
* Disable ENGINE_load_dynamic (dynamic engine support).bcook2015-06-191-1/+0
| | | | | | | We do not build, test or ship any dynamic engines, so we can remove the dynamic engine loader as well. This leaves a stub initialization function in its place. ok beck@, reyk@, miod@
* Add the Cammelia cipher to libcrypto.miod2014-11-171-1/+4
| | | | | | | | | | | | | | | | | | There used to be a strong reluctance to provide this cipher in LibreSSL in the past, because the licence terms under which Cammelia was released by NTT were free-but-not-in-the-corners, by restricting the right to modify the source code, as well retaining the right to enforce their patents against anyone in the future. However, as stated in http://www.ntt.co.jp/news/news06e/0604/060413a.html , NTT changed its mind and made this code truly free. We only wish there had been more visibility of this, for we could have had enabled Cammelia earlier (-: Licence change noticed by deraadt@. General agreement from the usual LibreSSL suspects. Crank libcrypto.so minor version due to the added symbols.
* Guard RSA / RC4-5 ASM when NO_ASM is not definedbcook2014-08-111-1/+3
| | | | | | | | | Most assembly blocks remain inactive if OPENSSL_NO_ASM is not defined, only enabling inline assembly, but the RSA / RC4-5 blocks (used only in amd64 systems) turn on implicitly. Guard these two as well. This simplifies enabling just inline ASM in portable, no effective change in OpenBSD.
* i'm a dumbdumb. fix build.tedu2014-07-111-1/+1
|
* move all the feature settings to a common header.tedu2014-07-111-72/+1
| | | | probably ok beck jsing miod