Pages: [1]   Go Down
Author Topic: Multiplying 8 by 16 bits...  (Read 1068 times)
0 Members and 1 Guest are viewing this topic.
Knivsta,Sweden
Offline Offline
Sr. Member
****
Karma: 0
Posts: 274
Low level's cool
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

I thought this would be an easy operation...
I have a byte value that I want to make little larger by multiplying with a little higher than 1.
This is in a tight loop, so I'd like to avoid floating point math.
Let's say I want to multiply by 1.5. I could store 256*1.5 as an integer value in a word.  Multiplying by that instead gives a value that is 256 times too large, so I'll just shift result down 8 steps.   This is what's normally called "fixed point".

Code:
byte b = whatever;
word w = 0x180;   // 1.5 in fixed point notation

result = (b * w) >> 8;

The above fails. Apparently, the 24 bit result of b*w is truncated to a 16 bit int before shifting.

I tried to multiply b with the highbyte and lowbyte of w separately and then adding the results up, keeping only the upper 16 bits:

Code:
result = b * highByte(w) + highByte(b * lowByte(w));

That at least gives me the correct result, but I looked at the assembler... the expression above does 2 multiplications of bytes, which should map directly to the cpu's mul instruction. But the actual code generated has no less than 6 mul instructions in it.

Is there a way, short of writing inline assembler, to express a 8*16 bit multiplication that generates sane code?
Logged

0
Offline Offline
God Member
*****
Karma: 2
Posts: 596
Arduino rocks
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

b is a byte and w is 2 bytes, so I think when the compiler encounters b * w it tries to store the intermediate result in a variable that is large as the largest operand, i.e. 2 bytes.
I think if you want to have an intermediate variable of 3 bytes you should cast b or w to long.
Logged

Knivsta,Sweden
Offline Offline
Sr. Member
****
Karma: 0
Posts: 274
Low level's cool
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
you should cast b or w to long

if I do that, it will call the compiler's builtin 32x32 bit __mulsi3 function, which contains 10 mul instructions. I still don't want more than 2...
« Last Edit: July 14, 2010, 11:05:48 am by drhex » Logged

0
Offline Offline
Faraday Member
**
Karma: 16
Posts: 2857
ruggedcircuits.com
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Anything wrong with:
Code:
result = (b*3)/2;

to multiply by 1.5?

http://www.ruggedcircuits.com
« Last Edit: July 14, 2010, 11:05:55 am by RuggedCircuits » Logged

Knivsta,Sweden
Offline Offline
Sr. Member
****
Karma: 0
Posts: 274
Low level's cool
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Code:
result = (b*3)/2;


it is more like "some user-changeable value between 1.00 and 4.00"
Logged

0
Offline Offline
Faraday Member
**
Karma: 16
Posts: 2857
ruggedcircuits.com
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

I only see 2 multiplications in this test code:
Code:
#include <avr/io.h>

#define lowByte(w) ((uint8_t) ((w) & 0xff))
#define highByte(w) ((uint8_t) ((w) >> 8))

uint8_t func(uint8_t b, uint16_t w)
{
  return b*highByte(w) + highByte(b*lowByte(w));
}
/*
  Compiled with avr-gcc -mmcu=atmega328p -O3 -S test.c

      .text
.global      func
      .type      func, @function
func:
      mul r24,r22
      movw r18,r0
      clr r1
      mov r18,r19
      lsl r19
      sbc r19,r19
      mul r24,r23
      mov r24,r0
      clr r1
      add r24,r18
      ret
*/

http://www.ruggedcircuits.com
Logged

Knivsta,Sweden
Offline Offline
Sr. Member
****
Karma: 0
Posts: 274
Low level's cool
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Code:
result = (b*3)/2;

.. but you're on to something there. I don't need that many different values to multiply with. As long as the number to divide by can be a power of 2 it might work.
Logged

Knivsta,Sweden
Offline Offline
Sr. Member
****
Karma: 0
Posts: 274
Low level's cool
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
I only see 2 multiplications in this test code:

The result of the (fixed point) multiplication can be greater than 255 so the return type needs to be unit16_t, which results in more muls.
Logged

0
Offline Offline
Faraday Member
**
Karma: 16
Posts: 2857
ruggedcircuits.com
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Well, I got it down to 2 multiplications through this annoying code:

Code:
#include <avr/io.h>

#define lowByte(w) ((uint8_t) ((w) & 0xff))
#define highByte(w) ((uint8_t) ((w) >> 8))

uint16_t hi(uint8_t b, uint16_t w)
{
  return b*highByte(w);
}

uint8_t lo(uint8_t b, uint16_t w)
{
  return highByte(b*lowByte(w));
}

uint16_t func(uint8_t b, uint16_t w)
{
  return hi(b,w) + lo(b,w);
}
/*
  Compiled with avr-gcc -mmcu=atmega328p -O3 -S -fno-inline test.c
*/

I agree that it shouldn't expand out to 6 multiplications. If you take the -fno-inline flag out it goes back to 6 muls.

Perhaps in-line assembly is the solution here.

http://www.ruggedcircuits.com
Logged

Knivsta,Sweden
Offline Offline
Sr. Member
****
Karma: 0
Posts: 274
Low level's cool
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Code:
{ 1, 149, 87, 203, 237, 69, 161, 47, 219, 4};
{ 0,   7,  6,   7,   7,  5,   6,  4,   6, 0};

The solution. Above is my list of 10 values, geometrically spread from 1.00 to 4.00.
I multiply the byte with a value in the first row, and then shift down the resulting word by the corresponding value in the second row.

Thanks to RuggedCircuits for the hint.
Logged

Pages: [1]   Go Up
Jump to: