| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
|
|
|
|
|
| |
script is run. This is more of an issue with uint16_t now than it
was with prime_t aka BN_ULONG before r1.6.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
A negative input to BN_mod_exp_mont_consttime() is not correctly reduced,
remaining negative (when it should be in the range [0, m)). Fix this by
unconditionally calling BN_nnmod() on the input.
Fixes ossfuzz #55997.
ok tb@
|
|
|
|
|
|
|
|
|
| |
Currently, the use of BN_div_word() can result in -0 - avoid this by
setting negative again, at the end of the computation.
Should fix oss-fuzz 56667.
ok tb@
|
|
|
|
|
|
|
|
|
|
|
|
| |
A sign handling bug was introduced to BN_add_word() in bn_word.c r1.18.
When handling addition to a negative bignum, the BN_sub_word() call can
result in the sign being flipped, which we need to account for. Use the
same code in BN_sub_word() - while not technically needed here it keeps
the code consistent.
Issue discovered by tb@
ok tb@
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than calling bn_mul_add_words() twice - once to multiply and once
to reduce - perform the multiplication and reduction in a single pass using
bn_mulw_addw_addw() directly. Also simplify the addition of the resulting
carries, which in turn allows us to avoid zeroing the top half of the
temporary words.
This provides a ~20-25% performance improvement for RSA operations on
aarch64.
ok tb@
|
|
|
|
|
|
|
|
|
| |
Call bn_mulw_addw() rather than doing bn_mulw() follow by bn_addw(). This
simplifies the code slightly, plus on some platforms bn_mulw_addw() can
be optimised (and bn_mulw_addtw() will then benefit from such an
optimisation).
ok tb@
|
|
|
|
|
|
|
| |
BN_clear_free() is a wrapper that calls BN_free() - call BN_free() directly
instead.
ok tb@
|
|
|
|
|
|
|
|
|
|
|
|
| |
The assembly bn_mul_mont() implementations effectively use alloca() to
allocate space for computation (at up to 8x the input size), without
any limitation. This means that sufficiently large inputs lead to the
stack being blown. Prevent this by using the C based implementation
instead.
Thanks to Jiayi Lin <jlin139 at asu dot edu> for reporting this to us.
ok beck@ tb@
|
|
|
|
|
|
|
|
|
|
|
| |
Provide a constant-time-style Montgomery multiplication implementation.
Use this in place of the assembly bn_mul_mont() on platforms that either
do not have an assembly implementation or have not compiled it in.
Also use this as the fallback version for bn_mul_mont(), rather than
falling back to a non-constant time implementation.
ok beck@ tb@
|
|
|
|
|
|
|
|
|
|
| |
Pull out the simplistic implementation (using BN_mul() or BN_sqr()) into a
bn_mod_mul_montgomery_simple() function. Provide bn_mod_mul_montgomery()
with an implementation that changes depending on if the assembly
bn_mul_mont() is available or not. Turn BN_mod_mul_montgomery() and
BN_to_montgomery() into callers of bn_mod_mul_montgomery().
ok beck@ tb@
|
|
|
|
|
|
| |
This came from bn_asm.c and did not even compile until recently.
ok beck@ tb@
|
| |
|
|
|
|
|
|
|
|
| |
Rename BN_from_montgomery_word() to bn_montgomery_reduce() and rewrite it
to be simpler and clearer, moving further towards constant time in the
process. Clean up BN_from_montgomery() in the process.
ok tb@
|
|
|
|
|
|
| |
macOS aarch64 assembly dialect treats ; as comment instead of a newline
ok tb@, jsing@
|
|
|
|
| |
ok miod
|
|
|
|
| |
Requested by tb@
|
|
|
|
|
|
|
|
| |
Now that bn_sub() handles word arrays with potentially different lengths,
we no longer need bn_sub_part_words() - call bn_sub() instead. This allows
us to entirely remove the unnecessarily complex bn_sub_part_words() code.
ok tb@
|
|
|
|
|
|
|
|
| |
Rather than working on BIGNUMs, change bn_add()/bn_sub() to operate on word
arrays that potentially differ in length. This matches the behaviour of
s2n-bignum's bignum_add() and bignum_sub().
ok tb@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OpenSSL commit 4d524040bc8 changed BN_MONT_CTX_set() so that it computed
a 64 bit N^-1 on both BN_BITS2 == 32 and BN_BITS2 == 64 platforms. However,
the way in which this was done was to duplicate half the code and wrap it
in #ifdef.
Rewrite this code to use a single code path on all platforms, with #ifdef
being limited to setting an additional word in the temporary N and storing
the result on BN_BITS2 == 32 platforms. Also remove stack based BIGNUM in
favour of using the already present BN_CTX.
ok tb@
|
|
|
|
| |
ok tb@
|
|
|
|
|
|
|
|
|
|
|
| |
It does not make sense to use code that is slower, currently broken and
prevents the use of assembly Montgomery implementations.
This is the result of `unifdef -m -DMONT_WORD`, followed by some manual
clean up and the removal of the Ni bignum from BN_MONT_CTX (which was only
used in the non-MONT_WORD case).
ok miod@ tb@
|
| |
|
|
|
|
|
|
|
|
| |
Rewrite and simplify BN_MONT_CTX_set_locked - in particular, only hold the
lock for a short period of time, rather than holding a write lock for a
module across an expensive operation.
ok tb@
|
|
|
|
|
|
|
|
|
| |
Use calloc() rather than malloc() with manual initialisation of all struct
members to zero, use memset() instead of manually initialising all struct
members to zero, use consistent naming, use BN_free() instead of
BN_clear_free() (since it is the same thing).
ok tb@
|
| |
|
|
|
|
|
|
| |
No code outside of bn_mont.c needs access to it.
ok tb@
|
| |
|
|
|
|
|
|
|
| |
No, I'm not trying to overwhelm you... however, we really no longer need
this clutter.
ok tb@
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use bignum primitives rather than the current mess of macros.The sqr_add_c
macro gets replaced with bn_mulw_addtw(), while the sqr_add_c2 macro gets
replaced with bn_mul2_mulw_addtw().
The variables in the comba functions have also been reordered, so that the
patterns are easier to understand - the compiler can take care of
optimising the inputs and outputs to avoid register moves.
ok tb@
|
|
|
|
|
|
|
|
|
| |
The BN_num_bits_word() function is a hot path, being called more than
80 million times during a libcrypto regress run. The word_clz()
implementation uses five instructions to do the same as the generic code
that uses more than 60 instructions.
Discussed with tb@
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This keeps the naming consistent with the other bignum primitives that have
been recently introduced. Also, use 1/0 intead of h/l (e.g. a1 instead of
ah), as this keeps consistency with other primitives and allows for naming
that works with double word, triple word and quadruple word inputs/outputs.
Discussed with tb@
|
| |
|
|
|
|
|
|
|
| |
This removes the effectively duplicate BN_LLONG version of bn_add_words()
and simplifies the code considerably.
ok tb@
|
| |
|
| |
|
|
|
|
|
|
| |
There were only three versions of each one...
ok tb@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use bignum primitives rather than the current mess of macros, which also
allows us to remove the essentially duplicate versions of
bn_mul_words() and bn_mul_add_words() for BN_LLONG.
The "mul" macro gets replaced by bn_mulw_addw(), "mul_add" with
bn_mulw_addw_addw() and "mul_add_c" with bn_mulw_addtw() (where 'w'
indicates single word input and 'tw' indicates triple word input).
The variables in the comba functions have also been reordered, so that the
patterns are easier to understand - the compiler can take care of
optimising the inputs and outputs to avoid register moves.
ok tb@
|
|
|
|
|
|
|
|
|
|
| |
These use a consistent naming scheme and are implemented using
bitwise/constant time style operations, which should generally be safe on
all platforms (until a compiler decides to optimise and use branches).
More optimised versions can be provided for a given architecture.
ok tb@
|
|
|
|
|
|
|
|
| |
Rather than completely relying on top, check the words of a bignum.
This gets us one step away from being dependent on top and additionally
means that we correctly report zero even if top is not yet correct.
ok tb@
|
|
|
|
|
|
|
|
|
|
| |
If the numerator is negative, the numerator and divisor are the same
length (in words) and the absolute value of the divisor > the absolute
value of the numerator, the "no_branch" case produces -0 since negative
has already been set. Call BN_set_negative() at the end of the function
to avoid this.
ok tb@
|
|
|
|
|
|
|
|
|
|
| |
Provide a simpler and more readable bn_word_clz() function that returns the
number of leading zeros for a given BN_ULONG, then implement
BN_num_bits_word() using bn_word_clz(). This is a hot path and
bn_word_clz() can now be replaced with architecture specific versions where
possible.
ok tb@
|
|
|
|
| |
ok tb@
|
|
|
|
|
|
|
| |
These will be used to test a BN_ULONG in cases where constant time style
behaviour is required.
ok tb@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Whenever setting negative to one (or when it could potentially be one),
always use BN_set_negative() since it checks for a zero valued bignum and
will not permit negative to be set in this case. Since BN_is_zero()
currently relies on top == 0, call BN_set_negative() after top has been
set (or bn_correct_top() has been called).
This fixes a long standing issue where -0 and +0 have been permitted,
however multiple code paths (such as BN_cmp()) fail to treat these as
equivalent.
Prompted by Guido Vranken who is adding negative zero fuzzing to oss-fuzz.
ok tb@
|