| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
For now this still calls bn_montgomery_multiply_words(), however it can
be optimised further in the future.
|
|
|
|
|
|
|
|
|
|
| |
Provide EC_FIELD_ELEMENT and EC_FIELD_MODULUS, which allow for operations
on fixed width fields in constant time. These can in turn be used to
implement Elliptic Curve cryptography for prime fields, without needing
to use BN. This will improve the code, reduces timing leaks and enable
further optimisation.
ok beck@ tb@
|
|
|
|
|
|
|
| |
These implement constant time modular addition, subtraction and
multiplication in the Montegomery domain.
ok tb@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move bn_add_words() and bn_sub_words() from bn_add.c to bn_add_sub.c.
These have effectively been replaced in the previous rewrites. Remove
the asserts - if bad lengths are passed the results will be incorrect
and things will fail (these should use size_t instead of int, but that
is a problem for another day).
Provide bn_sub_words_borrow(), which computes a subtraction but only
returns the resulting borrow. Provide bn_add_words_masked() and
bn_sub_words_masked(), which perform an masked addition or subtraction.
These can also be used to implement constant time addition and subtraction,
especially for reduction.
ok beck@ tb@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On BN_ULLONG architectures, the C compiler can usually do a decent job
of optimising primitives, however it struggles to see through primitive
calls due to type narrowing. As such, providing explicit versions of
compound primitives can result in the production of more optimal code.
For example, on arm the bn_mulw_addw_addw() primitive can be replaced
with a single umaal instruction, which provides significant performance
gains.
Rather than intermingling #ifdef/#else throughout the header, the
BN_ULLONG defines are pulled up above the normal functions. This also
allows complex compound primitives to be reused. The conditionals have also
been changed from BN_LLONG to BN_ULLONG, since that is what really matters.
ok tb@
|
|
|
|
|
|
|
|
|
|
| |
On some architectures, we can provide an optimised (often single
instruction) count-leading-zero implementation. In order to do this
effectively, provide bn_clzw() as a static inline that can be replaced
by an architecture specific version. The default implementation defers
to the bn_word_clz() function (which may also be architecture specific).
ok tb@
|
|
|
|
|
|
|
|
| |
Provide bn_bitsize(), which performs a constant time scan of a BN in order
to determine the bit size of the BN value. Use this for BN_num_bits() such
that it is no longer dependent on the bn->top value.
ok tb@
|
|
|
|
|
|
|
|
| |
This includes bn_qwaddqw(), bn_qwsubqw(), bn_qwmulw_addw() and
bn_qwmulw_addqw_addw(). These can typically be optimised on architectures
that have a reasonable number of general purpose registers.
ok tb@
|
|
|
|
|
|
|
|
|
| |
Call bn_mulw_addw() rather than doing bn_mulw() follow by bn_addw(). This
simplifies the code slightly, plus on some platforms bn_mulw_addw() can
be optimised (and bn_mulw_addtw() will then benefit from such an
optimisation).
ok tb@
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use bignum primitives rather than the current mess of macros.The sqr_add_c
macro gets replaced with bn_mulw_addtw(), while the sqr_add_c2 macro gets
replaced with bn_mul2_mulw_addtw().
The variables in the comba functions have also been reordered, so that the
patterns are easier to understand - the compiler can take care of
optimising the inputs and outputs to avoid register moves.
ok tb@
|
| |
|
|
|
|
|
|
|
|
|
| |
This keeps the naming consistent with the other bignum primitives that have
been recently introduced. Also, use 1/0 intead of h/l (e.g. a1 instead of
ah), as this keeps consistency with other primitives and allows for naming
that works with double word, triple word and quadruple word inputs/outputs.
Discussed with tb@
|
| |
|
|
|
|
|
|
|
| |
This removes the effectively duplicate BN_LLONG version of bn_add_words()
and simplifies the code considerably.
ok tb@
|
| |
|
|
|
|
|
|
|
|
|
|
| |
These use a consistent naming scheme and are implemented using
bitwise/constant time style operations, which should generally be safe on
all platforms (until a compiler decides to optimise and use branches).
More optimised versions can be provided for a given architecture.
ok tb@
|
|
|
|
|
|
|
| |
These will be used to test a BN_ULONG in cases where constant time style
behaviour is required.
ok tb@
|
|
The bignum code needs to be able to multiply two words, producing a
double word result. Some architectures do not have native support for
this, hence a pure C version is required. bn_umul_hilo() provides this
functionality.
There are currently two implementations, both of which are branch free.
The first uses bitwise operations for the carry, while the second uses
accumulators. The accumulator version uses fewer instructions, however
requires more variables/registers and seems to be slower, at least on
amd64/i386. The accumulator version may be faster on architectures that
have more registers available. Further testing can be performed and one
of the two implementations can be removed at a later date.
ok tb@
|