path: root/src/lib/libcrypto/bn
Commit message (Author, Age, Files, Lines)
...
* Reimplement bn_sqr_comba{4,8}(). (jsing, 2023-02-17, 2 files, -102/+110)
| Use bignum primitives rather than the current mess of macros. The sqr_add_c macro gets replaced with bn_mulw_addtw(), while the sqr_add_c2 macro gets replaced with bn_mul2_mulw_addtw(). The variables in the comba functions have also been reordered, so that the patterns are easier to understand - the compiler can take care of optimising the inputs and outputs to avoid register moves. ok tb@
* Enable s2n-bignum word_clz() on amd64. (jsing, 2023-02-16, 3 files, -3/+15)
| The BN_num_bits_word() function is a hot path, being called more than 80 million times during a libcrypto regress run. The word_clz() implementation uses five instructions to do the same as the generic code that uses more than 60 instructions. Discussed with tb@
* Use bn_addw() in bn_mulw(), rather than duplicating add with carry code. (jsing, 2023-02-16, 1 file, -12/+7)
|
* Change include from _internal_s2n_bignum.h to s2n_bignum_internal.h. (jsing, 2023-02-16, 1 file, -1/+1)
|
* Include the ISC license from s2n-bignum's LICENSE file. (jsing, 2023-02-16, 1 file, -1/+12)
|
* Bring in word_clz.S from s2n-bignum for amd64. (jsing, 2023-02-16, 1 file, -0/+48)
|
* Rename bn_umul_hilo() to bn_mulw(). (jsing, 2023-02-16, 9 files, -105/+109)
| This keeps the naming consistent with the other bignum primitives that have been recently introduced. Also, use 1/0 instead of h/l (e.g. a1 instead of ah), as this keeps consistency with other primitives and allows for naming that works with double word, triple word and quadruple word inputs/outputs. Discussed with tb@
* Add missing masks to accumulator version of bn_umul_hilo() (jsing, 2023-02-16, 1 file, -1/+5)
|
* Reimplement bn_add_words() and bn_sub_words() using bignum primitives. (jsing, 2023-02-16, 2 files, -111/+88)
| This removes the effectively duplicate BN_LLONG version of bn_add_words() and simplifies the code considerably. ok tb@
* Place bn_mul_add_words() after bn_mul_words(). (jsing, 2023-02-15, 1 file, -39/+39)
|
* zap tab (tb, 2023-02-15, 1 file, -2/+2)
|
* Remove the misnamed and now unused mul, mul_add and mul_add_c macros. (jsing, 2023-02-14, 1 file, -122/+2)
| There were only three versions of each one... ok tb@
* Reimplement bn_mul_words(), bn_mul_add_words() and bn_mul_comba{4,8}(). (jsing, 2023-02-14, 1 file, -235/+152)
| Use bignum primitives rather than the current mess of macros, which also allows us to remove the essentially duplicate versions of bn_mul_words() and bn_mul_add_words() for BN_LLONG. The "mul" macro gets replaced by bn_mulw_addw(), "mul_add" with bn_mulw_addw_addw() and "mul_add_c" with bn_mulw_addtw() (where 'w' indicates single word input and 'tw' indicates triple word input). The variables in the comba functions have also been reordered, so that the patterns are easier to understand - the compiler can take care of optimising the inputs and outputs to avoid register moves. ok tb@
* Provide big number primitives for word addition/multiplication. (jsing, 2023-02-14, 1 file, -1/+114)
| These use a consistent naming scheme and are implemented using bitwise/constant time style operations, which should generally be safe on all platforms (until a compiler decides to optimise and use branches). More optimised versions can be provided for a given architecture. ok tb@
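The "bitwise/constant time style" described in this message can be illustrated with a minimal C sketch of a single-word add that reports its carry without branching. The name sketch_addw() and the fixed 64-bit word size are assumptions made for illustration; this is not claimed to be the code from the commit.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: add two 64-bit words, producing a carry word (0 or 1)
 * and a sum word. The carry is derived with bitwise operations only, so the
 * source contains no data-dependent branch.
 */
static inline void
sketch_addw(uint64_t a, uint64_t b, uint64_t *out_carry, uint64_t *out_sum)
{
	uint64_t sum = a + b;	/* wraps modulo 2^64 */

	/*
	 * A carry leaves the top bit if both top bits are set, or if at least
	 * one is set and the top bit of the sum is clear.
	 */
	*out_carry = ((a & b) | ((a | b) & ~sum)) >> 63;
	*out_sum = sum;
}
```

As the message itself warns, nothing stops a compiler from turning such an expression into a branch, so the property holds at the source level only.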
* Make BN_is_zero() check word values. (jsing, 2023-02-14, 1 file, -4/+9)
| Rather than completely relying on top, check the words of a bignum. This gets us one step away from being dependent on top and additionally means that we correctly report zero even if top is not yet correct. ok tb@
* Fix a -0 corner case in BN_div_internal() (jsing, 2023-02-14, 1 file, -3/+5)
| If the numerator is negative, the numerator and divisor are the same length (in words) and the absolute value of the divisor > the absolute value of the numerator, the "no_branch" case produces -0 since negative has already been set. Call BN_set_negative() at the end of the function to avoid this. ok tb@
* Reimplement BN_num_bits_word(). (jsing, 2023-02-14, 1 file, -20/+25)
| Provide a simpler and more readable bn_word_clz() function that returns the number of leading zeros for a given BN_ULONG, then implement BN_num_bits_word() using bn_word_clz(). This is a hot path and bn_word_clz() can now be replaced with architecture specific versions where possible. ok tb@
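The relationship between a leading-zero count and the bit length of a word can be sketched as follows. The names sketch_word_clz() and sketch_num_bits_word() are hypothetical, a 64-bit word is assumed, and the binary search below branches on the value, so it illustrates the arithmetic rather than the committed implementation.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: count the leading zero bits of a 64-bit word by
 * binary search. Returns 64 for an input of zero.
 */
static int
sketch_word_clz(uint64_t w)
{
	int bits = 64;
	int shift;

	for (shift = 32; shift > 0; shift >>= 1) {
		uint64_t hi = w >> shift;

		/* If the upper half is non-zero, the top bit lives there. */
		if (hi != 0) {
			bits -= shift;
			w = hi;
		}
	}
	/* w is now 0 or 1; a remaining 1 accounts for one more used bit. */
	return bits - (int)w;
}

/* Bit length of a word: word size minus the number of leading zeros. */
static int
sketch_num_bits_word(uint64_t w)
{
	return 64 - sketch_word_clz(w);
}
```

Keeping the leading-zero count as a separate function is what allows an architecture-specific replacement, such as a single count-leading-zeros instruction, to be dropped in.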
* Make BN_set_negative() closer to constant time. (jsing, 2023-02-14, 1 file, -2/+3)
| ok tb@
* Provide bn_ct_{eq,ne}_zero{,_mask}() inline functions. (jsing, 2023-02-14, 1 file, -1/+33)
| These will be used to test a BN_ULONG in cases where constant time style behaviour is required. ok tb@
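A common way to write such zero tests without branching is shown in the sketch below. The sketch_ prefixed names and the 64-bit word are assumptions for illustration; the commit's actual definitions may differ.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: returns 1 if w is non-zero, 0 otherwise.
 * For any non-zero w, either w or its two's complement negation has the
 * top bit set, so OR-ing them and shifting isolates the answer.
 */
static inline int
sketch_ct_ne_zero(uint64_t w)
{
	return (int)((w | (0 - w)) >> 63);
}

/* All-ones mask if w is non-zero, all-zeros otherwise. */
static inline uint64_t
sketch_ct_ne_zero_mask(uint64_t w)
{
	return 0 - (uint64_t)sketch_ct_ne_zero(w);
}

/* Returns 1 if w is zero, 0 otherwise. */
static inline int
sketch_ct_eq_zero(uint64_t w)
{
	return 1 - sketch_ct_ne_zero(w);
}

/* All-ones mask if w is zero, all-zeros otherwise. */
static inline uint64_t
sketch_ct_eq_zero_mask(uint64_t w)
{
	return 0 - (uint64_t)sketch_ct_eq_zero(w);
}
```

The mask variants are useful for selecting between two values with AND/OR instead of an if statement.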
* Avoid negative zero. (jsing, 2023-02-13, 10 files, -36/+40)
| Whenever setting negative to one (or when it could potentially be one), always use BN_set_negative() since it checks for a zero valued bignum and will not permit negative to be set in this case. Since BN_is_zero() currently relies on top == 0, call BN_set_negative() after top has been set (or bn_correct_top() has been called). This fixes a long standing issue where -0 and +0 have been permitted, however multiple code paths (such as BN_cmp()) fail to treat these as equivalent. Prompted by Guido Vranken who is adding negative zero fuzzing to oss-fuzz. ok tb@
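The rule described here, that the sign flag is only ever set through a setter that refuses to mark zero as negative, can be sketched with a toy structure. The struct sketch_bn type and both function names are hypothetical and are not the libcrypto BIGNUM layout; the zero test in this sketch inspects the words rather than relying on top.

```c
#include <stdint.h>

#define SKETCH_WORDS 4

/* Hypothetical toy bignum, for illustration only. */
struct sketch_bn {
	uint64_t d[SKETCH_WORDS];	/* little-endian words */
	int top;			/* number of words in use */
	int neg;			/* 1 if negative, 0 otherwise */
};

/* Zero test that ORs together all words in use. */
static int
sketch_is_zero(const struct sketch_bn *bn)
{
	uint64_t acc = 0;
	int i;

	for (i = 0; i < bn->top; i++)
		acc |= bn->d[i];
	return acc == 0;
}

/* Set the sign, but never allow a zero value to become negative. */
static void
sketch_set_negative(struct sketch_bn *bn, int neg)
{
	bn->neg = (neg != 0) && !sketch_is_zero(bn);
}
```

Because the setter reads the value, it has to be called after the words and top are final, which matches the ordering requirement stated in the message.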
* Simplify BN_set_negative(). (jsing, 2023-02-13, 1 file, -6/+3)
| ok tb@
* Remove bn_exp2.c, which is now empty. (jsing, 2023-02-11, 1 file, -116/+0)
|
* Bye bye x86_64-gcc.c. (jsing, 2023-02-11, 1 file, -559/+0)
| This is no longer used, since we're now using s2n-bignum functions instead.
* Use .section .rodata instead of a plain .rodata (tb, 2023-02-09, 1 file, -1/+1)
| At least gcc 12 on Fedora is very unhappy about a plain .rodata and throws Error: unknown pseudo-op: `.rodata'. So add a .section in front of it to make it happy. ok deraadt miod
* Pull in bn_internal.h for the generic version of bn_umul_hilo() (jsing, 2023-02-09, 1 file, -1/+2)
|
* Clean up bn_sqr_words() (jsing, 2023-02-09, 2 files, -53/+10)
| Currently there are two versions of bn_sqr_words(), which call the sqr or sqr64 macro. Replace this with a single version that calls bn_umul_hilo() and remove the various implementations of the sqr macro. The only slight downside is that sqr64 does three multiplications instead of four, given that the second and third terms are identical. However, this is a minimal gain for the amount of duplication and entanglement it introduces. ok tb@
* Remove bn_sqr_words() on amd64. (jsing, 2023-02-04, 2 files, -11/+2)
| s2n-bignum's bignum_sqr() is not the same as bn_sqr_words() (which only computes a partial result, unlike the former). This went unnoticed since bn_sqr() is called directly on amd64, hence bn_sqr_words() is currently unused.
* Fix output constraints for bn_umul_hilo(). (jsing, 2023-02-04, 4 files, -8/+8)
| When bn_umul_hilo() is implemented using an instruction pair, mark the first output with a constraint that prevents the output from overlapping with the inputs ("&"). Otherwise the first instruction can overwrite the inputs, which then results in the second instruction producing an incorrect value.
* Move BN_mod_exp2_mont() to bn_exp.c. (jsing, 2023-02-03, 2 files, -188/+186)
|
* Reorder functions in bn_exp.c to be slightly sensible... (jsing, 2023-02-03, 1 file, -282/+279)
| No functional change intended.
* Clean up and simplify BN_mod_lshift{,_quick}(). (jsing, 2023-02-03, 1 file, -38/+34)
| BN_mod_lshift() already has a BN_CTX available, make use of it rather than calling BN_dup() and BN_free(). In BN_mod_lshift_quick(), BN_copy() already handles dst == src, so avoid checking this before the call. The max_shift == 0 case can also be handled without code duplication. And as with other *_quick() functions, use BN_ucmp() and BN_usub() directly given the 0 <= a < m constraint. ok tb@
* Clean up BN_mod_mul() and simplify BN_mod_sqr(). (jsing, 2023-02-03, 1 file, -14/+16)
| Use the same naming/code pattern in BN_mod_mul() as is used in BN_mul(). Note that the 'rr' allocation is unnecessary, since both BN_mul() and BN_sqr() handle the case where r == a || r == b. However, it avoids a potential copy on the exit from BN_mul()/BN_sqr(), so leave it in place for now. Turn BN_mod_sqr() into a wrapper that calls BN_mod_mul(), since it already calls BN_sqr() in the a == b case. The supposed gain of calling BN_mod_ct() instead of BN_nnmod() does not really exist. ok tb@
* Simplify BN_mod_{lshift1,sub}_quick(). (jsing, 2023-02-03, 1 file, -13/+19)
| The BN_mod_.*_quick() functions require that their inputs are non-negative and are already reduced. As such, they can and should use BN_ucmp() and BN_usub() instead of BN_cmp() and BN_add()/BN_sub() (which internally call BN_uadd()/BN_usub() and potentially BN_cmp()). ok tb@
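Why unsigned compare and subtract are sufficient under the "non-negative and already reduced" precondition can be seen in a single-word sketch. The function names are hypothetical and plain 64-bit integers stand in for bignums; the additional bound on m exists only so that doubling cannot overflow a machine word.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch of (a - b) mod m for 0 <= a, b < m.
 * Only an unsigned compare and unsigned subtractions are needed.
 */
static uint64_t
sketch_mod_sub_quick(uint64_t a, uint64_t b, uint64_t m)
{
	assert(a < m && b < m);

	if (a >= b)
		return a - b;
	/* a < b: the true result is negative, so wrap it around m. */
	return m - (b - a);
}

/*
 * Hypothetical sketch of (2 * a) mod m for 0 <= a < m.
 * Assumes m <= 2^63 so that the shift cannot overflow.
 */
static uint64_t
sketch_mod_lshift1_quick(uint64_t a, uint64_t m)
{
	uint64_t r;

	assert(m <= (UINT64_C(1) << 63));
	assert(a < m);

	r = a << 1;
	/* 2a < 2m, so at most one subtraction of m is required. */
	if (r >= m)
		r -= m;
	return r;
}
```

Since no sign ever has to be inspected, the signed entry points and their internal comparisons add nothing for these callers.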
* Simplify BN_nnmod(). (jsing, 2023-02-03, 1 file, -13/+12)
| In the case that the result is negative (i.e. one of a or m is negative), the positive result can be achieved via a single BN_usub(). This simplifies BN_nnmod() and avoids indirection via BN_add()/BN_sub(), which do BN_cmp() and then call into BN_uadd()/BN_usub(). ok tb@
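The single-subtraction fix-up can be sketched with machine integers. The name sketch_nnmod() is hypothetical, and 32-bit inputs are widened to 64 bits purely to keep the sketch free of overflow; C's truncating % stands in for the signed remainder computed before the fix-up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: non-negative remainder of a modulo |m|.
 */
static uint32_t
sketch_nnmod(int32_t a, int32_t m)
{
	int64_t r, abs_m;

	assert(m != 0);

	/* C's % truncates toward zero, so r has the sign of a and |r| < |m|. */
	r = (int64_t)a % (int64_t)m;

	if (r < 0) {
		/* One subtraction of magnitudes: |m| - |r|. */
		abs_m = m < 0 ? -(int64_t)m : (int64_t)m;
		r = abs_m - (-r);
	}
	return (uint32_t)r;
}
```

For example, -7 modulo 3 first yields a remainder of -1, and the fix-up computes 3 - 1 = 2.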
* Turn BN_mod_{ct,nonct}() into symbols. (jsing, 2023-02-03, 2 files, -6/+19)
| Also use accurate/useful variable names. ok tb@
* Remove AIX toc data after every function. NFC (miod, 2023-02-02, 2 files, -35/+0)
|
* Refactor BN_uadd() and BN_usub(). (jsing, 2023-02-02, 3 files, -39/+99)
| Unlike bn_add_words()/bn_sub_words(), the s2n-bignum bignum_add() and bignum_sub() functions correctly handle inputs with differing word lengths. This means that they can be called directly, without needing to fix up any remaining words manually. Split BN_uadd() in two - the default bn_add() implementation calls bn_add_words(), before handling the carry for any remaining words. Likewise split BN_usub() in two - the default bn_sub() implementation calls bn_sub_words(), before handling the borrow for any remaining words. On amd64, provide an implementation of bn_add() that calls s2n-bignum's bignum_add() directly, similarly with an implementation of bn_sub() that calls s2n-bignum's bignum_sub() directly. ok tb@
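The shape of the default addition path, a fixed-length word add followed by carry propagation through the remaining words of the longer input, can be sketched as follows. All names are hypothetical, 64-bit words in little-endian order are assumed, and the caller is assumed to provide room for one extra result word.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: add n words of a and b into r, returning the carry. */
static uint64_t
sketch_add_words(uint64_t *r, const uint64_t *a, const uint64_t *b, size_t n)
{
	uint64_t carry = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t s = a[i] + carry;
		uint64_t c1 = s < carry;	/* carry from adding the carry in */

		r[i] = s + b[i];
		carry = c1 | (r[i] < s);	/* carry from adding b[i] */
	}
	return carry;
}

/*
 * Hypothetical sketch: r = a + b for inputs of differing word lengths.
 * r must hold max(a_len, b_len) + 1 words. Returns the words used.
 */
static size_t
sketch_add(uint64_t *r, const uint64_t *a, size_t a_len,
    const uint64_t *b, size_t b_len)
{
	uint64_t carry;
	size_t i;

	/* Make a the longer operand. */
	if (a_len < b_len) {
		const uint64_t *tp = a;
		size_t tl = a_len;

		a = b; a_len = b_len;
		b = tp; b_len = tl;
	}

	/* Common part: both operands contribute a word. */
	carry = sketch_add_words(r, a, b, b_len);

	/* Remaining words of a: only the carry is left to add. */
	for (i = b_len; i < a_len; i++) {
		r[i] = a[i] + carry;
		carry = r[i] < carry;
	}
	r[a_len] = carry;
	return a_len + (size_t)carry;
}
```

A routine that accepts both lengths directly makes the second loop unnecessary, which is the simplification the message attributes to the s2n-bignum functions.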
* Move all data blocks from .text to .rodata and clean up and homogenize code (miod, 2023-02-02, 1 file, -1/+0)
| responsible for getting the proper address of those blocks.
* Move all data blocks from .text to .rodata and clean up and homogenize code (miod, 2023-02-01, 3 files, -9/+10)
| responsible for getting the proper address of those blocks. ok tb@ jsing@
* Pull the MONT_WORD define to the top. (jsing, 2023-02-01, 1 file, -3/+3)
| Reordering functions with defines hiding in the middle leads to fun outcomes... and apparently the non-MONT_WORD code is broken, at least on aarch64.
* Move BN_MONT_CTX_* functions to the top of the file. (jsing, 2023-02-01, 1 file, -221/+221)
| No functional change.
* Remove the now empty bn_asm.c. (jsing, 2023-01-31, 1 file, -65/+0)
| This rather misnamed file (bn_asm.c) previously contained the C code that was needed to build libcrypto bignum on platforms that did not have assembly implementations of the functions it contained.
* Simplify bn_div_3_words(). (jsing, 2023-01-31, 1 file, -49/+15)
| Make use of bn_umul_hilo() and remove the tangle of preprocessor directives that implement different code paths depending on what defines exist. ok tb@
* Provide inline assembly bn_umul_hilo() for alpha/powerpc64/riscv64. (jsing, 2023-01-31, 3 files, -3/+67)
| These should work, but are currently untested and disabled. ok tb@
* Provide inline assembly versions of bn_umul_hilo() for aarch64/amd64/i386. (jsing, 2023-01-31, 3 files, -3/+67)
| ok tb@
* Provide bn_umul_hilo(). (jsing, 2023-01-31, 1 file, -0/+159)
| The bignum code needs to be able to multiply two words, producing a double word result. Some architectures do not have native support for this, hence a pure C version is required. bn_umul_hilo() provides this functionality. There are currently two implementations, both of which are branch free. The first uses bitwise operations for the carry, while the second uses accumulators. The accumulator version uses fewer instructions, however requires more variables/registers and seems to be slower, at least on amd64/i386. The accumulator version may be faster on architectures that have more registers available. Further testing can be performed and one of the two implementations can be removed at a later date. ok tb@
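A branch-free, pure C word multiplication of the kind described can be sketched by splitting each word into halves. The name sketch_umul_hilo() and the 64-bit word size are assumptions; this is a generic schoolbook sketch that sums the middle terms in one accumulator, and it is not claimed to match either of the two committed implementations.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: multiply two 64-bit words, producing the high and
 * low words of the 128-bit product, using only 64-bit arithmetic.
 */
static inline void
sketch_umul_hilo(uint64_t a, uint64_t b, uint64_t *out_h, uint64_t *out_l)
{
	const uint64_t mask = 0xffffffff;
	uint64_t al = a & mask, ah = a >> 32;
	uint64_t bl = b & mask, bh = b >> 32;

	/* Four 32x32 -> 64 bit partial products. */
	uint64_t ll = al * bl;
	uint64_t lh = al * bh;
	uint64_t hl = ah * bl;
	uint64_t hh = ah * bh;

	/*
	 * Sum everything that lands in bits 32..63 of the product. Each of
	 * the three terms is below 2^32, so the sum cannot overflow.
	 */
	uint64_t mid = (ll >> 32) + (lh & mask) + (hl & mask);

	*out_l = (mid << 32) | (ll & mask);
	*out_h = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}
```

On architectures with a widening multiply, or a multiply-high instruction paired with a regular multiply, the same result is available in one or two instructions, which is what the inline assembly versions listed above provide.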
* Correctly detect a < b in BN_usub(). (jsing, 2023-01-31, 1 file, -1/+5)
| BN_usub() requires that a >= b and should return an error in the case that a < b. This is currently only detected by checking the number of words in a versus b - if they have the same number of words, the top word is not checked, so a call with a < b succeeds and produces an incorrect result. Fix this by checking for the case where a and b have an equal number of words, yet there is a borrow returned from bn_sub_words(). ok miod@ tb@
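The check described here can be sketched on plain word arrays: when both inputs have the same number of words, a borrow out of the word-wise subtraction is the signal that a < b. All names are hypothetical, 64-bit little-endian words are assumed, and inputs are assumed to carry no leading zero words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: r = a - b over n words, returning the borrow. */
static uint64_t
sketch_sub_words(uint64_t *r, const uint64_t *a, const uint64_t *b, size_t n)
{
	uint64_t borrow = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t d = a[i] - b[i];
		uint64_t b1 = a[i] < b[i];	/* borrow from a[i] - b[i] */
		uint64_t b2 = d < borrow;	/* borrow from the borrow in */

		r[i] = d - borrow;
		borrow = b1 | b2;
	}
	return borrow;
}

/*
 * Hypothetical sketch: unsigned r = a - b, requiring a >= b.
 * Returns 1 on success and 0 if a < b is detected. r needs a_len words.
 */
static int
sketch_usub(uint64_t *r, const uint64_t *a, size_t a_len,
    const uint64_t *b, size_t b_len)
{
	uint64_t borrow;
	size_t i;

	/* Fewer words in a than in b means a < b. */
	if (a_len < b_len)
		return 0;

	borrow = sketch_sub_words(r, a, b, b_len);

	/* Propagate the borrow through any remaining words of a. */
	for (i = b_len; i < a_len; i++) {
		uint64_t w = a[i];

		r[i] = w - borrow;
		borrow = w < borrow;
	}

	/* A final borrow, including the equal length case, means a < b. */
	return borrow == 0;
}
```

Comparing word counts alone would let a one-word 5 minus a one-word 7 through; the final borrow test rejects it.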
* Remove sparc related files from libcrypto. (jsing, 2023-01-31, 2 files, -1497/+0)
| The sparc platform got retired a while back, however some parts remained hiding in libcrypto. Mop these up (along with the bn_arch.h that I introduced). Spotted by and ok tb@
* Remove the now empty/unused bn_depr.c. (jsing, 2023-01-29, 1 file, -64/+0)
|
* Use s2n-bignum assembly implementations for libcrypto bignum on amd64. (jsing, 2023-01-29, 1 file, -1/+79)
| This switches the core bignum assembly implementations from x86_64-gcc.c to s2n-bignum for amd64. ok miod@ tb@