path: root/src/lib/libcrypto/bn
Commit message | Author | Age | Files | Lines
* Remove BN_DIV2W. | jsing | 2025-09-07 | 2 | -26/+2
  The BN_DIV2W define provides a code path for double word division via the C compiler, which is only enabled on hppa. Simplify the code and mop this up. ok tb@
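As a rough illustration of what "double word division via the C compiler" means, the sketch below divides a two-word value by a single word using a double-width integer type. It assumes 32-bit words so that `uint64_t` can serve as the double word; `div2w` and its signature are hypothetical and are not taken from libcrypto.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: divide the two-word value (h:l) by d.
 * Requires h < d so that the quotient fits in a single word.
 */
uint32_t
div2w(uint32_t h, uint32_t l, uint32_t d)
{
	uint64_t n;

	assert(d != 0 && h < d);

	/* Join the two words and let the compiler emit the wide division. */
	n = ((uint64_t)h << 32) | l;
	return (uint32_t)(n / d);
}
```

Without a double-width type, the same quotient has to be built from single-word operations, which is the path that remains after this commit.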
* Re-enable bn_sqr_words() assembly. | jsing | 2025-09-07 | 3 | -8/+8
  This is now only on amd64.
* Rename old assembly bn_sqr_words() to bn_sqr_word_wise(). | jsing | 2025-09-07 | 6 | -30/+27
  bn_sqr_words() does not actually compute the square of the words, it only computes the square of each individual word - rename it to reflect reality. Discussed with tb@
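The difference between squaring each word and squaring the whole word array can be sketched as follows. This is an illustration only: it assumes 32-bit words, and the names `sqr_word_wise` and `sqr_words` with these signatures are hypothetical stand-ins, not the libcrypto functions.

```c
#include <assert.h>
#include <stdint.h>

/* Square each word on its own: r[2i+1]:r[2i] = a[i]^2. No cross terms. */
void
sqr_word_wise(uint32_t *r, const uint32_t *a, int n)
{
	int i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		uint64_t s = (uint64_t)a[i] * a[i];

		r[2 * i] = (uint32_t)s;
		r[2 * i + 1] = (uint32_t)(s >> 32);
	}
}

/* Schoolbook square of the whole array: r (2n words) = a * a. */
void
sqr_words(uint32_t *r, const uint32_t *a, int n)
{
	int i, j;

	assert(n > 0);
	for (i = 0; i < 2 * n; i++)
		r[i] = 0;
	for (i = 0; i < n; i++) {
		uint64_t carry = 0;

		for (j = 0; j < n; j++) {
			uint64_t t = (uint64_t)a[i] * a[j] + r[i + j] + carry;

			r[i + j] = (uint32_t)t;
			carry = t >> 32;
		}
		r[i + n] = (uint32_t)carry;
	}
}
```

For the two-word input {1, 1}, which represents 2^32 + 1, the word-wise version yields {1, 0, 1, 0}, while the full square yields {1, 2, 1, 0} because it includes the cross term 2 * 2^32.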
* Disable assembly bn_sqr_words() again for now. | jsing | 2025-09-02 | 3 | -8/+8
  The old assembly bn_sqr_words() does not actually square words in the bignum sense. These will have to be renamed (once I come up with a name for whatever it actually does) before we can roll forward again. Found the hard way by Janne Johansson.
* Add const here as well... | jsing | 2025-09-01 | 1 | -2/+2
* Use bn_mul_words() from bn_mod_mul_words(). | jsing | 2025-09-01 | 1 | -5/+3
  Use bn_mul_words() and bn_montgomery_reduce_words(), rather than using bn_montgomery_multiply_words(). This provides better performance on architectures that have assembly optimised bn_mul_words(), such as amd64.
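The "multiply, then reduce" structure can be sketched with a single-word Montgomery example. This is a toy under stated assumptions, not the libcrypto code: it uses one 32-bit word with R = 2^32, requires an odd modulus below 2^31, is not constant time, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compute -n^{-1} mod 2^32 for odd n by Newton iteration. */
uint32_t
mont_n0(uint32_t n)
{
	uint32_t inv = n;	/* correct to 3 bits: n * n == 1 (mod 8) */
	int i;

	assert(n & 1);
	for (i = 0; i < 5; i++)
		inv *= 2u - n * inv;	/* each round doubles the correct bits */
	return 0u - inv;
}

/* Montgomery reduction: t / 2^32 mod n, for t < n * 2^32 and n < 2^31. */
uint32_t
mont_reduce(uint64_t t, uint32_t n, uint32_t n0)
{
	uint32_t m = (uint32_t)t * n0;
	uint64_t u = (t + (uint64_t)m * n) >> 32;

	/* Final conditional subtraction; this branch is not constant time. */
	return (uint32_t)(u >= n ? u - n : u);
}

/* A plain multiplication followed by a separate reduction step. */
uint32_t
mont_mul(uint32_t a, uint32_t b, uint32_t n, uint32_t n0)
{
	return mont_reduce((uint64_t)a * b, n, n0);
}

/* Convert a (with a < n) into Montgomery form: a * 2^32 mod n. */
uint32_t
to_mont(uint32_t a, uint32_t n)
{
	return (uint32_t)(((uint64_t)a << 32) % n);
}
```

Keeping the multiplication separate means that whichever multiply routine is fastest on a platform can be used unchanged, with the reduction applied afterwards.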
* Constify bn_mul_words(). | jsing | 2025-09-01 | 3 | -6/+9
* Use bn_sqr_words() from bn_mod_sqr_words(). | jsing | 2025-09-01 | 1 | -5/+3
  Use bn_sqr_words() and bn_montgomery_reduce_words(), rather than using bn_montgomery_multiply_words(). This provides better performance on architectures that have assembly optimised bn_sqr_words(), such as amd64. ok tb@
* Provide bn_mul_words() on amd64. | jsing | 2025-09-01 | 2 | -2/+12
  This uses s2n-bignum's bignum_mul() and provides significant performance gains for a range of multiplication sizes.
* Reorder functions since they've been renamed. | jsing | 2025-08-31 | 1 | -17/+17
* Rename prototype for bn_mul_normal(). | jsing | 2025-08-31 | 1 | -2/+2
  This was missed in the previous commit.
* Rename bn_mul_words()/bn_mul_add_words(). | jsing | 2025-08-30 | 14 | -111/+106
  Most bn_.*_words() functions operate on two word arrays, however bn_mul_words() and bn_mul_add_words() operate on one word array and multiply by a single word. Rename these to bn_mulw_words() and bn_mulw_add_words() to reflect this, following the naming scheme that we use for primitives. This frees up bn_mul_words() to actually be used for multiplying two word arrays. Rename bn_mul_normal() to bn_mul_words(), which will then become one of the possible assembly integration points. ok tb@
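The distinction behind the rename can be sketched as follows: two routines multiply a word array by a single word, and a third multiplies two word arrays and can be built from the first two. The sketch assumes 32-bit words, and the signatures are illustrative guesses, not the libcrypto prototypes.

```c
#include <assert.h>
#include <stdint.h>

/* r = a * w (word array times a single word); returns the carry word. */
uint32_t
mulw_words(uint32_t *r, const uint32_t *a, int n, uint32_t w)
{
	uint64_t carry = 0;
	int i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		uint64_t t = (uint64_t)a[i] * w + carry;

		r[i] = (uint32_t)t;
		carry = t >> 32;
	}
	return (uint32_t)carry;
}

/* r += a * w (multiply by a single word and accumulate); returns the carry. */
uint32_t
mulw_add_words(uint32_t *r, const uint32_t *a, int n, uint32_t w)
{
	uint64_t carry = 0;
	int i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		uint64_t t = (uint64_t)a[i] * w + r[i] + carry;

		r[i] = (uint32_t)t;
		carry = t >> 32;
	}
	return (uint32_t)carry;
}

/* r (na + nb words) = a * b (two word arrays), built from the primitives. */
void
mul_words(uint32_t *r, const uint32_t *a, int na, const uint32_t *b, int nb)
{
	int i;

	assert(na > 0 && nb > 0);
	r[na] = mulw_words(r, a, na, b[0]);
	for (i = 1; i < nb; i++)
		r[na + i] = mulw_add_words(&r[i], a, na, b[i]);
}
```

With this split, a platform can supply an optimised two-array multiply as a single integration point, independent of the single-word primitives.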
* Rework bn_sqr() to use bn_sqr_words(). | jsing | 2025-08-30 | 4 | -26/+27
  Rework some of the squaring code so that it calls bn_sqr_words() and use this as the integration point for assembly. Convert bn_sqr_normal() to bn_sqr_words(), which is then used on architectures that do not provide their own version. This means that we resume using the assembly version of bn_sqr_words() on i386, mips64 and powerpc, which can provide considerable performance gains. ok tb@
* Use faster versions of bignum_{mul,sqr}_{4_8,6_12,8_16}() if possible. | jsing | 2025-08-14 | 1 | -10/+41
  If ADX instructions are available, use the non-_alt version of s2n-bignum's bignum_{mul,sqr}_{4_8,6_12,8_16}(), which are faster than the _alt non-ADX versions. ok tb@
* Provide amd64 specific versions of bn_mul_comba6() and bn_sqr_comba6(). | jsing | 2025-08-14 | 2 | -2/+22
  These use s2n-bignum's bignum_mul_6_12_alt() and bignum_sqr_6_12_alt() functions. ok tb@
* Provide bn_mod_add_words() and bn_mod_sub_words() on amd64. | jsing | 2025-08-14 | 2 | -2/+25
  These use s2n-bignum's bignum_modadd() and bignum_modsub() routines. ok tb@
* Add special handling for multiplication and squaring of BNs with six words. | jsing | 2025-08-14 | 2 | -2/+6
  In these cases make use of bn_mul_comba6() or bn_sqr_comba6(), which are faster than the normal path. ok tb@
* Revise include to match the name that we use. | jsing | 2025-08-12 | 10 | -20/+20
* Replace SPDX-License-Identifier with actual license. | jsing | 2025-08-12 | 10 | -20/+130
* Add RCS tags to new files. | jsing | 2025-08-12 | 10 | -0/+20
* Bring in bignum_mod{add,sub}() from s2n-bignum. | jsing | 2025-08-12 | 2 | -0/+185
  These provide modular addition and subtraction.
* Bring in bignum_{mul,sqr}_{4_8,8_16}() from s2n-bignum. | jsing | 2025-08-12 | 4 | -0/+877
  These provide fast multiplication and squaring of inputs with 4 words or 8 words, producing an 8 or 16 word result. These versions require the CPU to support ADX instructions, while the _alt versions that have previously been imported do not.
* Bring in bignum_{mul,sqr}_6_12{,_alt}() from s2n-bignum. | jsing | 2025-08-12 | 4 | -0/+807
  These provide fast multiplication and squaring of inputs with 6 words, producing a 12 word result. The non-_alt versions require the CPU to support ADX instructions, while the _alt versions do not.
* Add RCS tags. | jsing | 2025-08-12 | 2 | -0/+4
* Add const to bignum_*() function calls. | jsing | 2025-08-12 | 1 | -16/+16
  Now that s2n-bignum has marked various inputs as const, we can do the same. In most cases we were casting away const, which we no longer need to do.
* Sync headers from s2n-bignum. | jsing | 2025-08-12 | 2 | -236/+588
  This effectively brings in new function prototypes, a chunk of const additions and some new defines.
* Add RCS tags. | jsing | 2025-08-11 | 11 | -0/+22
* Resync s2n-bignum primitives for amd64 with upstream. | jsing | 2025-08-11 | 11 | -115/+113
  This amounts to whitespace changes and label renaming.
* Speed up bn_mod_{mul,sqr}_words() for specific inputs. | jsing | 2025-08-05 | 1 | -3/+25
  Use bn_{mul,sqr}_comba{4,6,8}() and bn_montgomery_reduce_words() for specific input sizes. This is significantly faster than using bn_montgomery_multiply_words(). ok tb@
* Provide bn_sqr_comba6(). | jsing | 2025-08-05 | 2 | -2/+48
  This allows for fast squaring of a 6 word array. ok tb@
* Provide bn_mul_comba6(). | jsing | 2025-08-05 | 2 | -2/+63
  This allows for fast multiplication of two 6 word arrays. ok tb@
* Mark the inputs to bn_mul_comba{4,8}() as const. | jsing | 2025-08-05 | 3 | -9/+9
  This makes it consistent with bn_sqr_comba{4,8}() and simplifies an upcoming change. ok tb@
* Avoid signed overflow in BN_MONT_CTX_set() | tb | 2025-08-03 | 1 | -2/+3
  ri is an int, so the check relied on signed overflow (UB). It's not really reachable, but shrug. Reported by smatch via jsg. ok beck jsing kenjiro
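A general way to avoid this class of bug is to test the operands against INT_MAX before doing the arithmetic, rather than computing first and inspecting the result. The sketch below shows the pattern for doubling an int; `double_checked` is a hypothetical helper, and the actual expression changed in BN_MONT_CTX_set() is not shown on this page.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: store 2 * ri in *out and return 1 if the result
 * fits in an int, otherwise return 0 and leave *out untouched.
 */
int
double_checked(int ri, int *out)
{
	assert(out != NULL);

	/*
	 * A test such as "ri * 2 < ri" would be undefined behaviour on
	 * overflow, so check the operand before multiplying.
	 */
	if (ri < 0 || ri > INT_MAX / 2)
		return 0;
	*out = ri * 2;
	return 1;
}
```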
* Avoid signed overflow in BN_mul() | tb | 2025-08-03 | 1 | -3/+4
  Reported by smatch via jsg. ok beck jsing kenjiro
* Provide bn_mod_sqr_words() and call it from ec_field_element_sqr(). | jsing | 2025-08-02 | 2 | -2/+18
  For now this still calls bn_montgomery_multiply_words(), however it can be optimised further in the future.
* Make OPENSSL_IA32_SSE2 the default for i386 and remove the flag. | jsing | 2025-06-09 | 2 | -4/+2
  The OPENSSL_IA32_SSE2 flag controls whether a number of the perlasm scripts generate additional implementations that use SSE2 functionality. In all cases except ghash, the code checks OPENSSL_ia32cap_P for SSE2 support, before trying to run SSE2 code. For ghash it generates a CLMUL based implementation in addition to different MMX versions (one MMX version hides behind OPENSSL_IA32_SSE2, the other does not), however this does not appear to actually use SSE2. We also disable AES-NI on i386 if OPENSSL_IA32_SSE2. On OpenBSD, we've always defined OPENSSL_IA32_SSE2 so this is effectively a no-op. The only change is that we now check MMX rather than SSE2 for the ghash MMX implementation. ok bcook@ beck@
* bn_gcd: fix wacky indentation found by smatch | tb | 2025-06-02 | 1 | -3/+5
  via/ok jsg
* Implement EC field element operations. | jsing | 2025-05-25 | 2 | -30/+45
  Provide EC_FIELD_ELEMENT and EC_FIELD_MODULUS, which allow for operations on fixed width fields in constant time. These can in turn be used to implement Elliptic Curve cryptography for prime fields, without needing to use BN. This will improve the code, reduce timing leaks and enable further optimisation. ok beck@ tb@
* Provide bn_mod_{add,sub,mul}_words(). | jsing | 2025-05-25 | 3 | -4/+92
  These implement constant time modular addition, subtraction and multiplication in the Montgomery domain. ok tb@
* Fix previous. | jsing | 2025-05-25 | 2 | -71/+4
* Provide additional variants of bn_add_words()/bn_sub_words(). | jsing | 2025-05-25 | 3 | -6/+190
  Move bn_add_words() and bn_sub_words() from bn_add.c to bn_add_sub.c. These have effectively been replaced in the previous rewrites. Remove the asserts - if bad lengths are passed the results will be incorrect and things will fail (these should use size_t instead of int, but that is a problem for another day). Provide bn_sub_words_borrow(), which computes a subtraction but only returns the resulting borrow. Provide bn_add_words_masked() and bn_sub_words_masked(), which perform a masked addition or subtraction. These can also be used to implement constant time addition and subtraction, especially for reduction. ok beck@ tb@
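How a borrow-only subtraction and a masked subtraction combine into a branch-free reduction can be sketched as follows. This is a minimal sketch assuming 32-bit words; the function names echo the commit message, but the signatures and bodies are illustrative guesses rather than the libcrypto implementations.

```c
#include <assert.h>
#include <stdint.h>

/* Return the borrow (0 or 1) of a - b without storing the difference. */
uint32_t
sub_words_borrow(const uint32_t *a, const uint32_t *b, int n)
{
	uint32_t borrow = 0;
	int i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		uint64_t t = (uint64_t)a[i] - b[i] - borrow;

		borrow = (uint32_t)(t >> 63);	/* set if the word underflowed */
	}
	return borrow;
}

/* r = a - (b & mask), where mask is 0 or all ones; returns the borrow. */
uint32_t
sub_words_masked(uint32_t *r, const uint32_t *a, const uint32_t *b,
    uint32_t mask, int n)
{
	uint32_t borrow = 0;
	int i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		uint64_t t = (uint64_t)a[i] - (b[i] & mask) - borrow;

		r[i] = (uint32_t)t;
		borrow = (uint32_t)(t >> 63);
	}
	return borrow;
}

/* r = a mod m for 0 <= a < 2m, with no data-dependent branch. */
void
reduce_once(uint32_t *r, const uint32_t *a, const uint32_t *m, int n)
{
	/* borrow == 0 means a >= m, which turns the mask into all ones. */
	uint32_t mask = sub_words_borrow(a, m, n) - 1;

	(void)sub_words_masked(r, a, m, mask, n);
}
```

The same sequence of word operations runs whether or not the modulus is subtracted, which is why a mask is preferred over an `if` in this kind of code.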
* Fix handling of different length inputs in bn_sub(). | jsing | 2025-05-25 | 1 | -3/+3
  In the diff_len < 0 case, it incorrectly uses 0 - b[0], which mishandles the borrow - fix this by using bn_subw_subw(). Do the same in the diff_len > 0 case for consistency. Note that this is never currently reached since BN_usub() requires a >= b.
  ok beck@ tb@
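The borrow problem can be seen with a single word. The sketch below assumes 32-bit words and assumes that a primitive of this shape computes a - b - borrow_in; `subw_subw` is a hypothetical stand-in whose signature is a guess, not the libcrypto prototype.

```c
#include <assert.h>
#include <stdint.h>

/* r = a - b - borrow_in; *borrow_out is set to 1 if the result underflowed. */
void
subw_subw(uint32_t a, uint32_t b, uint32_t borrow_in, uint32_t *borrow_out,
    uint32_t *r)
{
	uint64_t t = (uint64_t)a - b - borrow_in;

	assert(borrow_in <= 1);
	*r = (uint32_t)t;
	*borrow_out = (uint32_t)(t >> 63);
}
```

With an incoming borrow of 1, the correct upper word for 0 - 5 is 0xFFFFFFFA with a borrow out, whereas a plain 0 - 5 gives 0xFFFFFFFB and loses both the incoming and the outgoing borrow.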
* Use err_local.h rather than err.h in most places | tb | 2025-05-10 | 14 | -37/+28
  ok jsing
* const correct BN_MONT_CTX_copy() | tb | 2025-03-09 | 2 | -4/+4
  ok jsing
* Convert bn_exp to BN_MONT_CTX_create() | tb | 2025-02-13 | 1 | -53/+38
  This simplifies the handling of the BN_MONT_CTX passed in and unifies the exit paths. Also zap some particularly insightful comments by our favorite captain. ok jsing
* Convert BPSW to BN_MONT_CTX_create() | tb | 2025-02-13 | 1 | -5/+2
  ok jsing
* Convert BN_MONT_CTX_set_locked() to BN_MONT_CTX_create() | tb | 2025-02-13 | 1 | -4/+2
  ok jsing
* bn: add internal BN_MONT_CTX_create() | tb | 2025-02-13 | 2 | -2/+22
  This does what the public BN_MONT_CTX_new() should have done in the first place rather than doing the toolkit thing of returning an invalid object that you need to figure out how to populate and with what because the docs are abysmal. It takes the required arguments and calls BN_MONT_CTX_set(), which all callers do immediately after _new() (except for DSA which managed to squeeze 170 lines of garbage between the two calls). ok jsing
* Rename BN_mod_exp_recp() to BN_mod_exp_reciprocal() | tb | 2025-02-12 | 2 | -5/+5
  (leaving out a dotasm comment that would become harder to read than it already is)
* bn_recp: reformat another ugly comment | tb | 2025-02-04 | 1 | -5/+6