| Commit message | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
| |
Use bignum primitives rather than the current mess of macros. The sqr_add_c
macro gets replaced with bn_mulw_addtw(), while the sqr_add_c2 macro gets
replaced with bn_mul2_mulw_addtw().
The variables in the comba functions have also been reordered, so that the
patterns are easier to understand - the compiler can take care of
optimising the inputs and outputs to avoid register moves.
ok tb@
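As a rough illustration of the primitive named above, the following sketch shows the semantics a triple word multiply-accumulate could have: the double word product a * b is added into a triple word accumulator. The function name, parameter order and use of unsigned __int128 are assumptions for illustration, not the committed bn_internal.h interface; bn_mul2_mulw_addtw() would do the same with 2 * a * b for the cross terms previously handled by sqr_add_c2.

    /*
     * Illustrative only: add the double word product a * b into the triple
     * word accumulator (c2:c1:c0), returning the result in (r2:r1:r0).
     * Assumes a 64 bit BN_ULONG and compiler support for unsigned __int128.
     */
    #include <stdint.h>

    typedef uint64_t BN_ULONG;

    static inline void
    mulw_addtw_sketch(BN_ULONG a, BN_ULONG b, BN_ULONG c2, BN_ULONG c1,
        BN_ULONG c0, BN_ULONG *r2, BN_ULONG *r1, BN_ULONG *r0)
    {
        unsigned __int128 t;

        t = (unsigned __int128)a * b + c0;
        *r0 = (BN_ULONG)t;
        t = (t >> 64) + c1;
        *r1 = (BN_ULONG)t;
        *r2 = c2 + (BN_ULONG)(t >> 64);
    }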
|
|
|
|
|
|
|
|
|
| |
The BN_num_bits_word() function is a hot path, being called more than
80 million times during a libcrypto regress run. The word_clz()
implementation uses five instructions to do the same as the generic code
that uses more than 60 instructions.
Discussed with tb@
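A minimal sketch of the kind of architecture-backed replacement being described, assuming a 64 bit BN_ULONG and a compiler that provides __builtin_clzll(); the name mirrors the function mentioned above, but the body is illustrative rather than the committed code.

    #include <stdint.h>

    typedef uint64_t BN_ULONG;

    static inline int
    bn_word_clz(BN_ULONG w)
    {
        /* The builtin is undefined for zero, so handle that case first. */
        if (w == 0)
            return 64;

        return __builtin_clzll(w);
    }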
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This keeps the naming consistent with the other bignum primitives that have
been recently introduced. Also, use 1/0 instead of h/l (e.g. a1 instead of
ah), as this keeps consistency with other primitives and allows for naming
that works with double word, triple word and quadruple word inputs/outputs.
Discussed with tb@
|
| |
|
|
|
|
|
|
|
| |
This removes the effectively duplicate BN_LLONG version of bn_add_words()
and simplifies the code considerably.
ok tb@
|
| |
|
| |
|
|
|
|
|
|
| |
There were only three versions of each one...
ok tb@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use bignum primitives rather than the current mess of macros, which also
allows us to remove the essentially duplicate versions of
bn_mul_words() and bn_mul_add_words() for BN_LLONG.
The "mul" macro gets replaced by bn_mulw_addw(), "mul_add" with
bn_mulw_addw_addw() and "mul_add_c" with bn_mulw_addtw() (where 'w'
indicates single word input and 'tw' indicates triple word input).
The variables in the comba functions have also been reordered, so that the
patterns are easier to understand - the compiler can take care of
optimising the inputs and outputs to avoid register moves.
ok tb@
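The following sketch illustrates the single word primitives named above, assuming a 64 bit BN_ULONG and unsigned __int128; the function names and parameter order are approximations, not the committed prototypes.

    #include <stdint.h>

    typedef uint64_t BN_ULONG;

    /* (r1:r0) = a * b + c */
    static inline void
    mulw_addw_sketch(BN_ULONG a, BN_ULONG b, BN_ULONG c,
        BN_ULONG *r1, BN_ULONG *r0)
    {
        unsigned __int128 t = (unsigned __int128)a * b + c;

        *r1 = (BN_ULONG)(t >> 64);
        *r0 = (BN_ULONG)t;
    }

    /* (r1:r0) = a * b + c + d */
    static inline void
    mulw_addw_addw_sketch(BN_ULONG a, BN_ULONG b, BN_ULONG c, BN_ULONG d,
        BN_ULONG *r1, BN_ULONG *r0)
    {
        unsigned __int128 t = (unsigned __int128)a * b + c + d;

        *r1 = (BN_ULONG)(t >> 64);
        *r0 = (BN_ULONG)t;
    }

Neither sum can overflow the double word: the product is at most (2^64 - 1)^2 and adding one or two single words still fits in 128 bits.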
|
|
|
|
|
|
|
|
|
|
| |
These use a consistent naming scheme and are implemented using
bitwise/constant time style operations, which should generally be safe on
all platforms (until a compiler decides to optimise and use branches).
More optimised versions can be provided for a given architecture.
ok tb@
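The bitwise/constant time style referred to above boils down to deriving masks from words without branching. A minimal sketch, with an illustrative name and assuming a 64 bit BN_ULONG:

    #include <stdint.h>

    typedef uint64_t BN_ULONG;

    /* Return ~0 if w != 0, otherwise 0, without branching on w. */
    static inline BN_ULONG
    ct_ne_zero_mask_sketch(BN_ULONG w)
    {
        /* For non-zero w, either w or -w has the top bit set. */
        return 0 - ((w | (0 - w)) >> 63);
    }

Such a mask can then be used to select between values, e.g. (a & mask) | (b & ~mask), until a compiler decides to turn it back into a branch.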
|
|
|
|
|
|
|
|
| |
Rather than completely relying on top, check the words of a bignum.
This gets us one step away from being dependent on top and additionally
means that we correctly report zero even if top is not yet correct.
ok tb@
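A minimal sketch of the approach, assuming access to the internal BIGNUM representation (bn->d and bn->top); the committed code may differ in detail:

    static int
    bn_is_zero_sketch(const BIGNUM *bn)
    {
        BN_ULONG bits = 0;
        int i;

        /* OR the words together rather than trusting top == 0. */
        for (i = 0; i < bn->top; i++)
            bits |= bn->d[i];

        return bits == 0;
    }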
|
|
|
|
|
|
|
|
|
|
| |
If the numerator is negative, the numerator and divisor are the same
length (in words) and the absolute value of the divisor > the absolute
value of the numerator, the "no_branch" case produces -0 since negative
has already been set. Call BN_set_negative() at the end of the function
to avoid this.
ok tb@
|
|
|
|
|
|
|
|
|
|
| |
Provide a simpler and more readable bn_word_clz() function that returns the
number of leading zeros for a given BN_ULONG, then implement
BN_num_bits_word() using bn_word_clz(). This is a hot path and
bn_word_clz() can now be replaced with architecture specific versions where
possible.
ok tb@
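A portable sketch of the two pieces described above, assuming a 64 bit BN_ULONG; the committed bn_word_clz() is generic over the word size and may be written differently, but the relationship to BN_num_bits_word() is the same:

    #include <stdint.h>

    typedef uint64_t BN_ULONG;
    #define BN_BITS2 64

    static int
    word_clz_sketch(BN_ULONG w)
    {
        int n = 0;

        if (w == 0)
            return BN_BITS2;
        while ((w & ((BN_ULONG)1 << (BN_BITS2 - 1))) == 0) {
            w <<= 1;
            n++;
        }
        return n;
    }

    int
    BN_num_bits_word(BN_ULONG w)
    {
        /* Zero has no bits set, which works out since clz(0) == BN_BITS2. */
        return BN_BITS2 - word_clz_sketch(w);
    }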
|
|
|
|
| |
ok tb@
|
|
|
|
|
|
|
| |
These will be used to test a BN_ULONG in cases where constant time style
behaviour is required.
ok tb@
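As an illustration of what such a test might look like (the name is hypothetical and the committed helpers may differ), a branch free 0/1 test of a BN_ULONG:

    #include <stdint.h>

    typedef uint64_t BN_ULONG;

    /* Return 1 if w != 0, otherwise 0, without branching on w. */
    static inline int
    ct_ne_zero_sketch(BN_ULONG w)
    {
        return (int)((w | (0 - w)) >> 63);
    }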
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Whenever setting negative to one (or when it could potentially be one),
always use BN_set_negative() since it checks for a zero valued bignum and
will not permit negative to be set in this case. Since BN_is_zero()
currently relies on top == 0, call BN_set_negative() after top has been
set (or bn_correct_top() has been called).
This fixes a long standing issue where -0 and +0 have been permitted,
however multiple code paths (such as BN_cmp()) fail to treat these as
equivalent.
Prompted by Guido Vranken who is adding negative zero fuzzing to oss-fuzz.
ok tb@
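A minimal sketch of the behaviour relied upon here, assuming access to the internal neg field; the committed BN_set_negative() may differ in detail:

    void
    BN_set_negative(BIGNUM *bn, int neg)
    {
        /* Refuse to mark a zero valued bignum negative, so -0 cannot occur. */
        bn->neg = (neg != 0) && !BN_is_zero(bn);
    }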
|
|
|
|
| |
ok tb@
|
| |
|
|
|
|
| |
This is no longer used, since we're now using s2n-bignum functions instead.
|
|
|
|
|
|
|
|
| |
At least gcc 12 on Fedora is very unhappy about a plain .rodata and throws
Error: unknown pseudo-op: `.rodata'. So add a .section in front of it to
make it happy.
ok deraadt miod
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Currently there are two versions of bn_sqr_words(), which call the sqr or
sqr64 macro. Replace this with a single version that calls bn_umul_hilo()
and remove the various implementations of the sqr macro. The only slight
downside is that sqr64 only needed three multiplications rather than four,
since its second and third terms are identical. However, this is a minimal
gain for the amount of duplication and entanglement it introduces.
ok tb@
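A sketch of the single version described above: each input word is squared into two output words. The bn_umul_hilo() call form shown here is an assumption about its interface, not the committed prototype.

    /* Interface assumed for illustration. */
    void bn_umul_hilo(BN_ULONG a, BN_ULONG b, BN_ULONG *out_h, BN_ULONG *out_l);

    void
    bn_sqr_words_sketch(BN_ULONG *r, const BN_ULONG *a, int n)
    {
        /* Square each word of a into two words of r (a partial result). */
        while (n-- > 0) {
            BN_ULONG hi, lo;

            bn_umul_hilo(a[0], a[0], &hi, &lo);
            r[0] = lo;
            r[1] = hi;

            a++;
            r += 2;
        }
    }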
|
|
|
|
|
|
|
| |
s2n-bignum's bignum_sqr() is not the same as bn_sqr_words() (which only
computes a partial result, unlike the former). This went unnoticed since
bn_sqr() is called directly on amd64, hence bn_sqr_words() is currently
unused.
|
|
|
|
|
|
|
|
| |
When bn_umul_hilo() is implemented using an instruction pair, mark the
first output with a constraint that prevents the output from overlapping
with the inputs ("&"). Otherwise the first instruction can overwrite the
inputs, which then results in the second instruction producing an
incorrect value.
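An illustrative (aarch64 flavoured, not the committed code) example of the problem and the fix: the first instruction of the pair writes its output before the second has read the inputs, so that output must be marked early clobber.

    /*
     * (h:l) = a * b. Without the '&' on the low word output, the register
     * allocator may place it in the same register as one of the inputs,
     * which mul would then overwrite before umulh reads them.
     */
    #define umul_hilo_sketch(h, l, a, b) \
        __asm__ ("mul   %0, %2, %3\n\t" \
                 "umulh %1, %2, %3" \
                 : "=&r" (l), "=r" (h) \
                 : "r" (a), "r" (b))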
|
| |
|
|
|
|
| |
No functional change intended.
|
|
|
|
|
|
|
|
|
|
|
|
| |
BN_mod_lshift() already has a BN_CTX available, make use of it rather than
calling BN_dup() and BN_free().
In BN_mod_lshift_quick(), BN_copy() already handles dst == src, so avoid
checking this before the call. The max_shift == 0 case can also be handled
without code duplication. And as with other *_quick() functions, use
BN_ucmp() and BN_usub() directly given the 0 <= a < m constraint.
ok tb@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the same naming/code pattern in BN_mod_mul() as is used in BN_mul().
Note that the 'rr' allocation is unnecessary, since both BN_mul() and
BN_sqr() handle the case where r == a || r == b. However, it avoids a
potential copy on the exit from BN_mul()/BN_sqr(), so leave it in place
for now.
Turn BN_mod_sqr() into a wrapper that calls BN_mod_mul(), since it already
calls BN_sqr() in the a == b case. The supposed gain of calling BN_mod_ct()
instead of BN_nnmod() does not really exist.
ok tb@
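The wrapper shape described above is essentially the following; error handling beyond what BN_mod_mul() already does is omitted:

    #include <openssl/bn.h>

    int
    BN_mod_sqr(BIGNUM *r, const BIGNUM *a, const BIGNUM *m, BN_CTX *ctx)
    {
        /* BN_mod_mul() calls BN_sqr() itself when both factors are equal. */
        return BN_mod_mul(r, a, a, m, ctx);
    }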
|
|
|
|
|
|
|
|
|
| |
The BN_mod_.*_quick() functions require that their inputs are non-negative
and are already reduced. As such, they can and should use BN_ucmp() and
BN_usub() instead of BN_cmp() and BN_add()/BN_sub() (which internally call
BN_uadd()/BN_usub() and potentially BN_cmp()).
ok tb@
|
|
|
|
|
|
|
|
|
| |
In the case that the result is negative (i.e. one of a or m is negative),
the positive result can be achieved via a single BN_usub(). This simplifies
BN_nnmod() and avoids indirection via BN_add()/BN_sub(), which do BN_cmp()
and then call into BN_uadd()/BN_usub().
ok tb@
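A sketch of how that single BN_usub() fits in (error handling is abbreviated and the committed code may differ in detail): after a plain BN_mod(), a negative remainder lies in (-|m|, 0) and is mapped into [0, |m|) by computing |m| - |r|.

    #include <openssl/bn.h>

    int
    nnmod_sketch(BIGNUM *r, const BIGNUM *a, const BIGNUM *m, BN_CTX *ctx)
    {
        if (!BN_mod(r, a, m, ctx))
            return 0;
        if (!BN_is_negative(r))
            return 1;

        /* r is negative: replace it with |m| - |r|, which is non-negative. */
        return BN_usub(r, m, r);
    }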
|
|
|
|
|
|
| |
Also use accurate/useful variable names.
ok tb@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unlike bn_add_words()/bn_sub_words(), the s2n-bignum bignum_add() and
bignum_sub() functions correctly handle inputs with differing word
lengths. This means that they can be called directly, without needing to
fix up any remaining words manually.
Split BN_uadd() in two - the default bn_add() implementation calls
bn_add_words(), before handling the carry for any remaining words.
Likewise split BN_usub() in two - the default bn_sub() implementation
calls bn_sub_words(), before handling the borrow for any remaining words.
On amd64, provide an implementation of bn_add() that calls s2n-bignum's
bignum_add() directly, similarly with an implementation of bn_sub() that
calls s2n-bignum's bignum_sub() directly.
ok tb@
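A sketch of the default bn_add() shape described above (the word level signature here is an assumption; only the carry handling is of interest): bn_add_words() adds the common length and returns the carry, which is then propagated through the remaining words of the longer input.

    /* From the existing word level primitives. */
    BN_ULONG bn_add_words(BN_ULONG *rp, const BN_ULONG *ap, const BN_ULONG *bp,
        int num);

    /* Assumes a_len >= b_len; returns the final carry. */
    BN_ULONG
    bn_add_sketch(BN_ULONG *r, const BN_ULONG *a, int a_len,
        const BN_ULONG *b, int b_len)
    {
        BN_ULONG carry;
        int i;

        carry = bn_add_words(r, a, b, b_len);

        for (i = b_len; i < a_len; i++) {
            r[i] = a[i] + carry;
            carry = r[i] < carry ? 1 : 0;
        }

        return carry;
    }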
|
|
|
|
| |
responsible for getting the proper address of those blocks.
|
|
|
|
|
|
| |
responsible for getting the proper address of those blocks.
ok tb@ jsing@
|
|
|
|
|
|
| |
Reordering functions with defines hiding in the middle leads to fun
outcomes... and apparently the non-MONT_WORD code is broken, at least on
aarch64.
|
|
|
|
| |
No functional change.
|
|
|
|
|
|
| |
This rather misnamed file (bn_asm.c) previously contained the C code that
was needed to build libcrypto bignum on platforms that did not have
assembly implementations of the functions it contained.
|
|
|
|
|
|
|
| |
Make use of bn_umul_hilo() and remove the tangle of preprocessor directives
that implement different code paths depending on what defines exist.
ok tb@
|
|
|
|
|
|
| |
These should work, but are currently untested and disabled.
ok tb@
|
|
|
|
| |
ok tb@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bignum code needs to be able to multiply two words, producing a
double word result. Some architectures do not have native support for
this, hence a pure C version is required. bn_umul_hilo() provides this
functionality.
There are currently two implementations, both of which are branch free.
The first uses bitwise operations for the carry, while the second uses
accumulators. The accumulator version uses fewer instructions, however
requires more variables/registers and seems to be slower, at least on
amd64/i386. The accumulator version may be faster on architectures that
have more registers available. Further testing can be performed and one
of the two implementations can be removed at a later date.
ok tb@
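For reference, a portable sketch of a word multiply producing a double word result using half word decomposition, assuming a 64 bit BN_ULONG. The two committed variants differ in how they combine the partial products; this one folds them in with explicit carries.

    #include <stdint.h>

    typedef uint64_t BN_ULONG;

    static void
    umul_hilo_sketch(BN_ULONG a, BN_ULONG b, BN_ULONG *out_h, BN_ULONG *out_l)
    {
        BN_ULONG a1 = a >> 32, a0 = a & 0xffffffffUL;
        BN_ULONG b1 = b >> 32, b0 = b & 0xffffffffUL;
        BN_ULONG lo, hi, mid1, mid2, carry;

        lo = a0 * b0;
        hi = a1 * b1;
        mid1 = a0 * b1;
        mid2 = a1 * b0;

        /* Fold the two middle partial products into lo, carrying into hi. */
        lo += mid1 << 32;
        carry = lo < (mid1 << 32);
        hi += (mid1 >> 32) + carry;

        lo += mid2 << 32;
        carry = lo < (mid2 << 32);
        hi += (mid2 >> 32) + carry;

        *out_h = hi;
        *out_l = lo;
    }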
|
|
|
|
|
|
|
|
|
|
|
|
| |
BN_usub() requires that a >= b and should return an error in the case that
a < b. This is currently only detected by checking the number of words in
a versus b - if they have the same number of words, the top word is not
checked, so a < b goes undetected and the call succeeds, producing an
incorrect result.
Fix this by checking for the case where a and b have an equal number of
words, yet there is a borrow returned from bn_sub_words().
ok miod@ tb@
|
|
|
|
|
|
|
|
| |
The sparc platform got retired a while back, however some parts remained
hiding in libcrypto. Mop these up (along with the bn_arch.h that I
introduced).
Spotted by and ok tb@
|
| |
|
|
|
|
|
|
|
| |
This switches the core bignum assembly implementations from x86_64-gcc.c to
s2n-bignum for amd64.
ok miod@ tb@
|