-
Notifications
You must be signed in to change notification settings - Fork 89
Description
The current division algorithm uses a BIGNUM structure for 'current' while its task can be completed with natural integer arithmetics. In my local fork (sorry, it's still a hot mess), I have changed the algorithm as follows, where several names and small things have been altered, but should still be recognizable fine:
//
// Use an integer of natural size to store the current Bit index as 'current'
// is always a 2's potency. While a BIGNUM can theoretically hold a
// 2's potency eight times larger than what can represent as Bit index with a
// natural integer (Bytes vs Bits), this cannot happen within this function
// as 'a' aligned to the next 2's potency would need to be just as big for
// this to be the case. This cannot happen due to the address space
// limitation.
//
size_t currentBitIndex = 0; // int current = 1;
bignum_u_assign_const(denom, b); // denom = b
bignum_u_assign_const(tmp, a); // tmp = a
const DTYPE half_max = 1 + (DTYPE)(((1ULL << (8 * WORD_SIZE)) - 1U) / 2);
bool overflow = false;
while (bignum_u_cmp(denom, a) != LARGER) // while (denom <= a) {
{
if (denom->d[BN_U_DLEN(denom) - 1] >= half_max)
{
overflow = true;
break;
}
++currentBitIndex; // current <<= 1;
_lshift_bit_const(denom, 1); // denom <<= 1;
}
if (!overflow)
{
_rshift_bit_const(denom, 1); // denom >>= 1;
--currentBitIndex; // current >>= 1;
}
bignum_u_init_const(c); // int answer = 0;
//
// currentBitIndex cannot add-wraparound to reach this value as reasoned in
// the comment before.
//
while(currentBitIndex != (size_t)0U - 1U) // while (current != 0)
{
if (bignum_u_cmp(tmp, denom) != SMALLER) // if (dividend >= denom)
{
bignum_u_sub(tmp, denom, tmp); // dividend -= denom;
bignum_u_or_int (c, 1, currentBitIndex); // answer |= current;
}
--currentBitIndex; // current >>= 1;
_rshift_bit_const(denom, 1); // denom >>= 1;
}
Along with this newly introduced function:
void bignum_u_or_int(ubn* a, DTYPE v, size_t nbits)
{
require(a, "a is null");
const uint8_t nbits_pr_word = (WORD_SIZE * 8);
size_t nwords = nbits / nbits_pr_word;
uint8_t nwordbits = nbits % nbits_pr_word;
if (nwords >= BN_U_DLEN (a)) {
return;
}
a->d[nwords] |= (v << nwordbits);
}
BN_U_DLEN returns the array size of the current sturcture instance, which I added because I cannot affort having every BIGNUM at the maximum required size among all.
This new function is also a really important one performance-wise for different tasks as it allows super-fast construction of 2's potencies. For example, I also used it for the calculation of R², used for Montgomery arithmetics, where R is a 2's potency. The BoringSSL algorithm utilizes BIGNUM shift, pow and mul algorithms for its construction, when with this function, it can be achieved easily with just a preceeding BIGNUM init.