optimizing code: help avoiding modulus..

Hi
i have came up with an arduino program that outputs a waveform at a chosen frequency.
But...the interrupt vector that takes care of the output of the waveform's step's values includes a modulus (a % b):

 if (conta % pitch==0){  
    PORTD=out[n++];

The modulus really slows the whole interrupt quite a lot, according to some timing test's i've done,
(more info also on these links
-http://www.engblaze.com/faster-code-fridays-understand-division-and-speed-of-operations/
-http://bleaklow.com/2012/06/20/sensor_smoothing_and_optimised_maths_on_the_arduino.html )

right now the program still works fairly well, but in the future i was hoping to build it into a polyphonic synth, and for that reason i'm looking to make the interrupt routine much faster, hopefully by changing the modulus part of the code..
well, i have tried different options (creating a macro that calculated the remainder in a small loop, utilizing bitshifting and multiplications to avoid any divisions, since every other arithmetic operation is handled by the hardware).

even by google i was not able to find any code which would calculate the reminder of two numbers in a speedier way.

any suggestions?

the whole code i'm using right now:

#include <math.h>
#include <TimerOne.h>
#include <avr/interrupt.h> 
#include <stdlib.h> 

#define max_steps 200
#define interrupt_freq 40000 //25 microseconds = 40 Khz... minimum time for the interrupt vector to execute..
                             //the "conta % pitch" takes A LOT of cycles...


volatile char out[max_steps+20]; //just in case
volatile int n=0;
volatile unsigned int conta=0;
volatile unsigned int pitch=0;
volatile unsigned int steps=0;

double freq=0;
int tr=0;


void setup(){

  DDRD  = B11111111;    //port C set all as output for dac

  freq = 261.63;  //middle DO note freq
  
  set_freq(freq); 
  fill_table(steps);

  Timer1.initialize(1000000/interrupt_freq); //converts HZ to micros..
  Timer1.attachInterrupt( DAC_out );
}


//outputs the values using a R2R DAC...

void DAC_out(){ 
  conta++;  //overflow should not be too much of a problem..in case change to unsigned long..

  if (conta % pitch==0){  //TODO: change modulus to something without any divisions for speed
    PORTD=out[n++];
    if (n > steps) n=0;
  }
}



//determine the number of steps the waveform must have and the 
//pitch prescaler to meet the desired frequency

void set_freq(double freq){
  pitch=round(interrupt_freq/(freq*(max_steps-1))+0.5); //rounds always up, by adding 0.5
  steps=interrupt_freq/(freq*pitch);
}

//fills the output array with the value of each waveform step...

void fill_table(int wavesteps){
  for (int x=0; x<wavesteps ; x++){
    out[x]=triangle(x,wavesteps)+127;
  }
}



void loop(){
}

/////////////_____-----_____----- WaveShape functions -----_____-----_____\\\\\\\\\\\\\

int sinewave(int x,int num_steps){  //---ok
  double c=map(x,0,num_steps,0,PI);
  double d=map(tr,0,1023,0,60);
  return (((128-d)*sin(c)))+((d)*sin((10)*c));
}

int square(int x,int num_steps){ //---ok
  int y;
  if (tr==0)  y=-1;
  else y=map(tr,0,1023,5,num_steps);
  if (x<=y) return 128;   
  else return -128;
}

int triangle(int x,int num_steps){  //---ok
  double c=map(tr,0,1023,1,num_steps);
  if (x<=tr) return (256/c)*x-128;   
  else return (256 / (c-num_steps))*(x-num_steps)-128; 
}

int noise(int x,int num_steps){   //---ok
  double c=map(x,0,num_steps,0,PI);
  if (random(1023)+5<tr) return random(256)-128;
  else return 128*sin(c);
}

Several things...

conta does not need to be volatile. That change should shave off a few machine instructions.

• This eliminates the modulus, is much faster, and (assuming I did not make a mistake) is functionally equivalent...

void DAC_out(){ 
  conta++;  //overflow should not be too much of a problem..in case change to unsigned long..

  if (conta >= pitch){
    PORTD=out[n++];
    if (n > steps) n=0;
    conta = 0;
  }
}

pitch is shared with the interrupt service routine and is multi-byte so access needs to be protected (disable/enable interrupts).

pitch does not need to be volatile but I would leave it volatile to indicate it is shared with an interrupt service routine. I believe the generated code will be identical however it is declared.

steps is shared with the interrupt service routine and is multi-byte so access needs to be protected (disable/enable interrupts).

steps does not need to be volatile but I would leave it volatile to indicate it is shared with an interrupt service routine. I believe the generated code will be identical however it is declared.

thanks for the suggestions, i changed conta to not be volatile anymore...

the basic idea behind the modulus is to create polyphony, as explained in this link --> little-scale: Simple polyphonic synth with just arduino.
right now my program has a single monophonic output, but i was hoping to implement polyphony once i figured out how to speed up the interrupt routine...
so, yes, your solution would work right now, but it's not the kind of solution i was after (unless there is a way to implement polyphony with your solution that i'm missing)..

The problem is that Mr. Tomczak used mod for two very different purposes which is generally a very bad idea. Instead, if you create a second variable to serve the second purpose, it becomes trivial to eliminate the modulus...

uint16_t accumulator[4];
...
uint16_t limit;
limit = analogRead(j) >> 2 & B11111110;
++accumulator[j];
if ( accumulator[j] >= limit )
{
  accumulator[j] = 0;
}
uint16_t value = accumulator[j] / (limit / 2);
state = state + value;

Depending on the relative sizes of the numbers, subtracting pitch from conta until it is less than conta gives you the remainder if it was divided (modulus). This can be done in a very tight loop and if the numbers are close only a few subtractions are necessary.

i tried that already, but it ended up consuming more cycles than just doing "conta % pitch" .... :frowning:
things just had to be tricky...

just to let you guys know, i also found out that the reminder of any number divided by 2^n could be calculated as num & (2^n - 1), and so i changed the code to only give pitch prescalers which were powers of 2, but that had the side effect of providing very
inaccurate frequencies (it would yield 300 hz instead of 350 hz, which is really quite noticeable..)

another alternative i explored was considering A % B as A - B * (A/B)... i tried precomputing 1/B before and storing the value in another variable, say C... so technically i could do A - B*(A*C)......but the problem with this is that it always gave 0 as a result.. since 1/B would always give 0 in integer math...

Listening to the demonstration video -

It doesn't look like the most promising basis for a polyphonic synth project.

DDS is a lot less computationally expensive and sounds an awful lot better as well.

Duane B

rcarduino.blogspot.com

The code in reply #1 should work but all the variables used in the IRQ should be volatile.

However I would change the order of the statements a bit especially for the var n which is used as an index.
First test if it is out of range before using it.

void DAC_out()
{ 
  conta++;  //overflow should not be too much of a problem..in case change to unsigned long..
  if (conta >= pitch)
  {
    conta = 0;
    if (n >= steps) n=0;
    PORTD=out[n++];
  }
}

If pitch is a known constant, it can be made easier.

Otherwise, you can use successive subtract; or division on a machine with hardware dividers / multipliers:

  if (conta - (conta / pitch) * pitch) {
    PORTD=...

Successive subtraction can be done line this

  while (conta >= pitch) conta -= pitch;
  if (conta==0) {
    PORTD=...

This will destroy conta so if you don't want that, use that code on a copy of conta.

If pitch is a known constant, it can be made easier.

in the first post of OP - pitch is set in set_freq() outside the Interrupt routine, so it is possibly not constant

I did some quick testing.

Without optimization, the division approach is slightly faster than the successive subtraction approach and slightly faster than the straight modulus approach;

With full optimization, the they are about as fast: 1.5ms for about 250 modulus calculations. I used two saves in the successive subtraction approach so it potentially is the fastest of the bunch.

Still, 6us per modulus calculation shouldn't be that bad - I don't see any need to optimize that, even in an isr.

ok...so i've tried some different options, including the one

  if (conta >= pitch)
  ...

..i have to say i kind of liked this solution quite a lot, as it permitted me to run the isr up to a frequency of 100 Khz, which is 2.5x faster than the modulus solution... i figured out that if i want to implement polyphony i can simply use more counter variable (conta1, conta2 and so on..)..

so right now i have connected some pots to modulate the frequency in real time, and the results are really quite nice ,even though i only had linear pots, and no log ones,so the scale is off...but still , good stuff...

thanks to everyone who replied :smiley: