Anyway, the overall algorithm isn't that complicated. If we break down one of the input numbers (let's call them 'a' and 'b') into a string of bits, we can loop through the bits, and if one bit is set, shift the other number left as far as that bit, and add it to an accumulator. Because we know the bounds of the loop (one iteration for every bit in an int), we can ask the compiler to unroll the loop (thereby eliminating the conditional). We can eliminate the "if a particular bit is set" conditional by creating a mask of all 0s or all 1s based on the bit, and using the mask on the shifted 'b'.

Another problem is negative numbers. However, we can create a branchless absolute value function, and then multiply the absolute value of the two inputs. It's also simple enough to xor the sign bits of the two input functions to tell if the output should be negative. We can also create a branchless "negate" function (and thereby create a "subtract" function). We can also use the mask trick to only negate the output if the xor of the two input sign bits is high.

Anyway, here's the code!

#include <stdlib.h>

#include <stdio.h>

#include <stdint.h>

#include <inttypes.h>

inline int32_t abs(const int32_t a) {

const int32_t v = a >> (sizeof(int32_t)*8-1);

return (a ^ v) + v;

}

inline int32_t negate(const int32_t a) {

const int32_t negone = -1;

return (a ^ negone) + 1;

}

inline int32_t subtract(const int32_t a, const int32_t b) {

return a + negate(b);

}

__attribute__((optimize("unroll-loops")))

inline int64_t unsigned_mult(const int32_t a, const int32_t b) {

int i;

int64_t out = 0;

for (i = 0; i < sizeof(a)*8; i++) {

const uint64_t mask = -1 ^ subtract((a & (1 << i)), 1);

const uint64_t big_b = b;

out += (big_b << i) & mask;

}

return out;

}

inline int64_t mult(const int32_t a, const int32_t b) {

const int thirtyone = sizeof(int32_t)*8-1;

const int32_t sign = ((a & (1 << thirtyone)) ^ (b & (1 << thirtyone))) >> thirtyone;

const int64_t unsigned_answer = unsigned_mult(abs(a), abs(b));

return (negate(unsigned_answer) & sign) + (unsigned_answer & (~sign));

}

## No comments:

## Post a Comment