4 Voice Synthesizer Code Question

I've been playing around with this code

dzlonline/the_synth - A simple to use 4 polyphonic wavetable synthesizer library for Arduino.

It works well, and I understand how to play notes and sounds with it. However, I'm trying to understand how it works by looking at the code.

I understand part of it. There are wave tables that have the shape of various sound waves like sine, triangle, etc. But there is a lot going on that I don't understand. I want to be able to modify it, but I'd like to get an overall feeling for how it works.

My goal is to understand why every calculation is done and then rewrite the code with clearer variable names and lots of comments. It should then be easier for me and others to modify.

It does appear that when the waves of all 4 voices are combined at every point in time, the final output is shifted right by 2 to get the final value back down to a byte value. I'd like to change this so that the output is louder. Most of the time, the 4 waves won't add up to be more than 255, so I'd like to increase the dynamic range for softer sounds without clipping.

Any ideas?

You only have a byte to express the PWM value that gets turned into the analogue output. You need an external D/A converter to give you that dynamic range. Normally a 12-bit one is a good compromise.
Those variable names all in upper case are fixed and refer to internal registers in the processor.
A lot of the code is concerned with getting the PWM frequency correct.

Grumpy_Mike:
You only have a byte to express the PWM value that gets turned into the analogue output. You need an external D/A converter to give you that dynamic range. Normally a 12-bit one is a good compromise.
Those variable names all in upper case are fixed and refer to internal registers in the processor.
A lot of the code is concerned with getting the PWM frequency correct.

I ordered one of those 32-bit STM32 microcontroller boards from China that supposedly can be developed using the Arduino software. I am trying to understand how the arduino synth works so I can hopefully port it to the STM32 and make more interesting musical sounds and more simultaneous voices. Eventually, I'd like it to be able to directly play midi data so I can easily incorporate music into projects.

Right now, I'm tweaking MIDI data from songs to get it into a form that I can use for the existing 4-voice synth. My goal is to make a Christmas clock like the one we have hanging on our wall that plays a different Christmas song every hour. Unlike the one on the wall that looks like an ordinary quartz analog clock, mine will have LED segments, lots of wires sticking out, and will make my wife shudder. But at least I can control the songs it plays :)

Take a look at my book for the full background to this topic.

Try this for fun if you have a moment.
William Tell converted

This synth uses 2 timers to generate audio PWM on a pin.

The fast-mode PWM on T2 outputs 0 to 255.

The zero basis for this is a value of 127, so your speaker will modulate about this "center" point. This is about 2.5 volts if you think of it as analog.

In the code: (OCR2A = OCR2B = 127 + .....)

The CPU clock of 16,000,000 cycles per second divided by the T2 period of 256 counts gives you 62,500 PWM cycles per second.

Next, the Timer 1 interrupt fires 20,000 times per second. This is where the output value to the PWM (speaker) is updated. (62,500/20,000 = 3.125 PWM cycles per wave update.)

For each voice, multiply its current wave value by its volume. Since the wave value is a signed byte, the product can be as large as 127*255 in magnitude. Shift the result right 8 bits (divide by 256). Add the 4 values together and shift right 2 bits (divide by 4).
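
A minimal sketch of the mixing step just described, written as plain C++ for a desktop compiler rather than as the library's interrupt code. The names `mixVoices`, `sample` and `amp` are invented for illustration, and the sketch assumes wave samples stay within -127..127:

```cpp
#include <cstdint>

// Hypothetical helper, not part of the_synth: mixes four signed 8-bit
// wavetable samples, each scaled by an 8-bit volume, into one PWM duty value.
// Assumes samples stay within -127..127, as the wave tables are described.
uint8_t mixVoices(const int8_t sample[4], const uint8_t amp[4]) {
    int sum = 0;
    for (int v = 0; v < 4; ++v) {
        // sample * amp spans about -32385..32385; dividing by 256 brings it
        // back to about -127..126. (>> on a negative int is an arithmetic
        // shift on mainstream compilers, which this sketch relies on.)
        sum += (static_cast<int>(sample[v]) * amp[v]) >> 8;
    }
    // Divide by 4 so that four full-scale voices still fit in a byte,
    // then bias around the 127 "center" of the PWM range.
    return static_cast<uint8_t>(127 + (sum >> 2));
}
```

With all four voices silent the result is 127. A single voice at full scale only moves the output from 127 to 158, which is the "too quiet" effect raised at the top of the thread.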

I think you understand that part pretty good.

Now for the Wave modulation and volume modulation.

For the wave. A sine wave (or other) is stored over 256 bytes of memory.
(Goes from 0 to +127 to 0 to -127 to 0)

Since the interrupt runs 20,000 times per second, we can calculate a "step" along the wave. Say I want to make an "A" note at 440 Hz. 20,000/440 = 45.45 samples to traverse the wave, or an offset of about 5.6 table entries per sample (256/45.45). The "PITCHS" table holds this offset multiplied by 256. So "A" 440 Hz = 440 * 256 / 20000 * 256 = 1441.79, truncated to 1441 = 0x05A1. If you look in the table you will find this value.
So each time through the 20 kHz loop, the current 16-bit position is updated by adding this value. The program only looks at the high byte of the result to figure out where to read the wave (dividing by 256 without dividing).
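
The pitch calculation above can be sketched in portable C++. This is only an illustration of the arithmetic, not the library's table generator; `tuningWord`, `nextIndex` and `kSampleRate` are made-up names:

```cpp
#include <cstdint>

// Sample rate of the Timer 1 interrupt, as given in the post.
constexpr uint32_t kSampleRate = 20000;

// Hypothetical helper: the 16-bit step added to the phase accumulator on
// every sample so that a 256-entry wave repeats freqHz times per second.
// step = freqHz * 256 (table entries) * 256 (fractional bits) / sampleRate
constexpr uint16_t tuningWord(uint32_t freqHz) {
    return static_cast<uint16_t>((freqHz * 65536u) / kSampleRate);
}

// Advances the accumulator and returns the table index (its high byte).
inline uint8_t nextIndex(uint16_t &phase, uint16_t step) {
    phase = static_cast<uint16_t>(phase + step);  // wraps at 65536 = one full wave
    return static_cast<uint8_t>(phase >> 8);
}
```

`tuningWord(440)` gives 1441 (0x05A1), matching the table entry quoted above. Starting from a phase of 0, successive calls to `nextIndex` return 5, then 11, and so on, which is the "about 5.6 entries per sample" step with the fraction carried in the low byte.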

Now for the volume...
Very similar, in that the high byte of a 16-bit integer (word) is used to scan the envelope.
This program updates the volume scan at 1/4 the rate. That's the divider code. So the effective volume update rate is only 5 kHz. When the high bit of that byte gets set (index above 127), the program turns off the channel.
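
A rough sketch of how such a divider could work, assuming a round-robin where each interrupt services the envelope of just one of the four voices. The struct, the function name and the 128-entry `table` are assumptions for illustration; the field comments only point at the library variables they loosely correspond to:

```cpp
#include <cstdint>

constexpr int kVoices = 4;

struct EnvelopeState {
    uint16_t pos[kVoices]  = {};  // 16-bit envelope position (like EPCW)
    uint16_t step[kVoices] = {};  // added per update (like EFTW)
    uint8_t  amp[kVoices]  = {};  // resulting volume (like AMP)
    uint8_t  divider = 0;         // selects which voice is serviced
};

// Called once per 20 kHz sample. Only one voice is serviced per call, so
// each voice's envelope advances at 20000 / 4 = 5000 Hz.
// `table` is a hypothetical 128-entry envelope shape.
void envelopeTick(EnvelopeState &s, const uint8_t table[128]) {
    s.divider = (s.divider + 1) & 0x03;
    const int v = s.divider;
    if (!(s.pos[v] & 0x8000)) {
        // Still inside the envelope: move along it.
        s.pos[v] = static_cast<uint16_t>(s.pos[v] + s.step[v]);
    }
    // High bit set means the index would be above 127: silence the voice.
    s.amp[v] = (s.pos[v] & 0x8000) ? 0 : table[s.pos[v] >> 8];
}
```

Because the position stops advancing once its high bit is set, a finished voice stays at amplitude 0 until something resets `pos` to 0, which is what the trigger code shown later in the thread does with `EPCW[voice]=0`.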

Clear as mud?

Richard

Thanks. I am starting to understand the code. I'm modifying it now so that I can make envelopes on the fly for notes. That way I can control attack, decay, sustain length, and release rate. I'm not sure how the note length value is calculated that you can set, but I did figure out that the length doubles every time you add 12 to it. So I made a function that lets you set a real note length (in time), and I'll use that with my envelope generator to try and make some decent sounding notes.

I finished the code, but haven't actually tested it yet, so I don't know how good the sound will actually be.

My ultimate goal is to make a musical clock. One that plays a different song every hour. And I want the notes to sound halfway decent.

My second ultimate goal is to understand the code well enough to eventually port it (or some of it) to an ATTiny so I can make musical birthday cards :) I have code that can make multiple voice sounds with clicks, but it doesn't sound that great and I'd rather go the PWM route if I can.

Thanks for the info.

I have modified the code somewhat customizing the wave forms and envelopes.
There is an error in synth.h

 void mTrigger(unsigned char voice,unsigned char MIDInote)
  {
    PITCH[voice]=pgm_read_word(&PITCHS[MIDInote]);
    EPCW[voice]=0;
    FTW[divider] = PITCH[voice] + (int)   (((PITCH[voice]>>6)*(EPCW[voice]>>6))/128)*MOD[voice];
  }

FTW[divider] should be FTW[voice].
I don't use pitch decay or increase during sustain, so my version removes that term:

 void mTrigger(unsigned char voice,unsigned char MIDInote)
  {
    PITCH[voice]=pgm_read_word(&PITCHS[MIDInote]);
    EPCW[voice]=0;
    FTW[voice] = PITCH[voice] ;
  }

So the length of the note is looked up in a table. This is how long the 127 steps of amplification take.
The envelopes included are basic. I use a spreadsheet to create envelopes and waves.

Another modification I have made is to increase to 6 notes, reducing the amplitude. I also halved the note step frequency and doubled the waveform, so that the base frequency repeats 2 times in 256 bytes. This allows the 3rd harmonic to be included, but reduces accuracy.

I attached an image of a waveform with this double cycle.

Cheers, Richard

PS: Did you try out the William Tell song? I think this can help you, as it shows a complete 4-part song score.

That's a good idea to remove the pitch modulation. I wouldn't use it except for sound effects. I have software I found that converts midi to notes and time durations so I can make 4 part songs fairly easily. Right now I'm working on a simple notation for entering songs and envelope shapes. I'm writing a tutorial so people can make their own music clock.

I just got my arduino due clone today that I ordered from China, so I may try to get that working with its 12 bit dac. And I want to get a minimal version running on the ATTiny85 for musical birthday cards. So much to do.

shawnlg:
.... so I may try to get that working with its 12 bit dac.

So if I had a 12bit DAC I would try something like this modification in the interrupt.

// up in declarations .. volatile unsigned int dacWord=0;
.......  
          dacWord = 2048 +
           ((
           (((signed char)pgm_read_byte(wavs[0] + ((unsigned char *) & (PCW[0] += FTW[0]))[1]) * AMP[0]) >> 6) +
           (((signed char)pgm_read_byte(wavs[1] + ((unsigned char *) & (PCW[1] += FTW[1]))[1]) * AMP[1]) >> 6) +
           (((signed char)pgm_read_byte(wavs[2] + ((unsigned char *) & (PCW[2] += FTW[2]))[1]) * AMP[2]) >> 6) +
           (((signed char)pgm_read_byte(wavs[3] + ((unsigned char *) & (PCW[3] += FTW[3]))[1]) * AMP[3]) >> 6) 
                   ) );
// Code to write the lower 12 bits of dacWord to the DAC would go here.
}

So if you add any more voices you will need to shift by 7, or reduce the wave amplitude.
The sum of the channels should stay within about ±2047 for a 12-bit output.
Remember that this section of code needs to execute at a fair pace.
On the UNO the interrupt takes about 100 clock cycles, so at 20,000 interrupts per second it uses about 12% of the CPU.
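
The headroom and CPU-load figures above can be checked with a few lines of plain C++. This is a back-of-the-envelope sketch; `fitsIn12Bits` and `isrLoad` are made-up names, and it assumes wave samples of -127..127 and envelope values of 0..255:

```cpp
#include <cstdint>

// Hypothetical check: does the 12-bit DAC word stay in 0..4095 when every
// voice is at full scale at the same moment?
bool fitsIn12Bits(int voices, int shift) {
    const int32_t hi = 2048 + voices * ((127 * 255) >> shift);
    const int32_t lo = 2048 + voices * ((-127 * 255) >> shift);
    return lo >= 0 && hi <= 4095;
}

// Fraction of CPU time spent in the interrupt.
double isrLoad(double cyclesPerIsr, double isrRateHz, double cpuHz) {
    return cyclesPerIsr * isrRateHz / cpuHz;
}
```

Four voices with a shift of 6 fit; a fifth voice at the same shift does not, while a shift of 7 leaves room for up to eight. `isrLoad(100, 20000, 16000000)` works out to 0.125, i.e. the roughly 12% quoted above.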

That's the code I get confused with. I know it's figuring out which piece of the wave to read at the sample time. But I never would have come up with it.

shawnlg:
That's the code I get confused with. I know it's figuring out which piece of the wave to read at the sample time. But I never would have come up with it.

(((signed char)pgm_read_byte(wavs[0] + ((unsigned char *) & (PCW[0] += FTW[0]))[1]) * AMP[0]) >> 6)

I read this code like this:
read the signed character at the base address of the wave, offset by the MSB of (last position + frequency step), saving the new position; multiply this by the amplitude multiplier (the envelope); then divide the result by 64.

I know it's clear as mud.
Ok... the pgm_read_byte() is a function to read data from the program space which is constant. This is where the tables live.

So... read a signed 8 bit value from memory base address + the offset along the wave.
The offset is 0 to 255, or unsigned char...

OK, good so far. Now to the offset calculation.
PCW[n] is 16-bit, as is FTW[n] (unsigned int).
So the 16-bit part is PCW[n] = PCW[n] + FTW[n]. Notice the +=, the shorthand assignment.
The result is cast to an unsigned char array, which has elements 0 and 1. The AVR stores the bytes in LSB, MSB order and we want only the MSB, hence the [1].

We need to calculate the step in 16 bit math for frequency accuracy, but are only looking at the MSB.
So if the accumulator PCW was at 0x6030 and the frequency step is 0x037A, then the next accumulator value is 0x63AA. The offset on the wave has changed from 0x60 to 0x63; next would be 67, 6A, 6E, 71, 75, 78, etc. (all hex).
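
The worked example above can be reproduced on a PC. The sketch below shows the pointer-cast trick next to a plain shift; the function names are invented, and the cast version only returns the high byte on a little-endian CPU (which the AVR is):

```cpp
#include <cstdint>

// Reads the high byte the way the library's expression does: view the 16-bit
// accumulator as two bytes and take element [1]. This is only the high byte
// on a little-endian CPU such as the AVR (low byte stored first).
inline uint8_t highByteViaCast(const uint16_t &phase) {
    return reinterpret_cast<const unsigned char *>(&phase)[1];
}

// Portable equivalent: shift the accumulator right by 8 bits.
inline uint8_t highByteViaShift(uint16_t phase) {
    return static_cast<uint8_t>(phase >> 8);
}

// Hypothetical helper: add the step, then report where to read the wave.
inline uint8_t stepAndIndex(uint16_t &phase, uint16_t step) {
    phase = static_cast<uint16_t>(phase + step);
    return highByteViaShift(phase);
}
```

Starting from 0x6030 with a step of 0x037A, repeated calls to `stepAndIndex` return 0x63, 0x67, 0x6A, 0x6E, 0x71, 0x75, 0x78. The uneven jumps of 3 and 4 are the fractional low byte carrying over into the index.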

So the outer math is signed char (+/- 127) * envelope (0 to 255) and shift(divide) to get back in bounds of the +/- 127...
I hope this helps you with your understanding

Rich

Qsilverrdc:
I hope this helps you with your understanding

Rich

Yes it does. I moved the envelope out of program memory so I could change it on the fly. I want to see if I can add reverb also. Just trying to make more interesting sounds.

shawnlg:
That's a good idea to remove the pitch modulation. I wouldn't use it except for sound effects. I have software I found that converts midi to notes and time durations so I can make 4 part songs fairly easily. Right now I'm working on a simple notation for entering songs and envelope shapes. I'm writing a tutorial so people can make their own music clock.

I just got my arduino due clone today that I ordered from China, so I may try to get that working with its 12 bit dac. And I want to get a minimal version running on the ATTiny85 for musical birthday cards. So much to do.

Hi, I've been also fiddling with my Due with different code but have got the William Tell Overture to work quite well with a 12 bit DAC (MCP4921).

Took me a while to figure out the song format but it's really quite simple.

How do you convert the MIDI to notes and time durations? I've been looking around all day to find something so I can experiment with more MIDI files.

I was also looking at the 4 part song format and devised a simple way to compress it a bit (5671 bytes down from 10740)

The file is just repeats of { duration, b1, b2, b3, b4 } and since the code uses the same envelopes for each column it doesn't really matter if b1 - b4 are swapped around. Duration seems to always be < 127 as are the values in b1 - b4. I reorganized so the rows are variable length... basically if the high bit is set then that's it for the row.

A row with 32, 65, 255, 255, 255 becomes 32, 128|65.
A row with 32, 65, 66, 67, 68 becomes 32, 65, 66, 67, 128|68.
A row with 2, 0, 0, 0, 0 becomes just 128|2 (just extra delay).
A row with 2, 255, 255, 255, 255 just becomes 128|2.
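
One way the row packing above might look in code. This is a sketch under assumptions rather than the poster's actual implementation: `packRow` and `unpackRow` are invented names, both 0 and 255 are treated as "no note" (as in the examples), and durations and notes are assumed to be below 128 so bit 7 is free to mark the end of a row:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical encoder for the variable-length rows described above.
// Input: duration plus four note bytes, where 0 or 255 means "no note".
// Output: duration followed by the real notes, with bit 7 set on the
// last byte of the row. Assumes duration and notes are below 128.
std::vector<uint8_t> packRow(uint8_t duration, const uint8_t notes[4]) {
    std::vector<uint8_t> out{duration};
    for (int i = 0; i < 4; ++i) {
        if (notes[i] != 0 && notes[i] != 255) out.push_back(notes[i]);
    }
    out.back() |= 0x80;  // end-of-row marker
    return out;
}

// Matching decoder: reads one row starting at `pos`, advances `pos`,
// and returns {duration, notes...} with the marker bit stripped.
std::vector<uint8_t> unpackRow(const std::vector<uint8_t> &data, std::size_t &pos) {
    std::vector<uint8_t> row;
    while (pos < data.size()) {
        const uint8_t b = data[pos++];
        row.push_back(b & 0x7F);
        if (b & 0x80) break;  // marker bit ends the row
    }
    return row;
}
```

On a small AVR the same idea would normally read from a flat `PROGMEM` byte array instead of a `std::vector`; the vector is used here only to keep the example short and testable on a PC.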

The other modification I made was when a note is played it just plays using the 1st voice that's not playing. Had to do this as sometimes the note hadn't finished before wanting to play again and cutting a note in the middle produces clicks and pops.
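
A minimal sketch of that "first voice that's not playing" rule. How the poster actually detects an idle voice is not stated, so this assumes a voice counts as free when its current envelope amplitude is 0; `firstFreeVoice` and `kVoiceCount` are made-up names:

```cpp
#include <cstdint>

constexpr int kVoiceCount = 4;

// Hypothetical allocator, not from the library: treats a voice as free when
// its current envelope amplitude is 0 and returns the first such voice.
// Returns -1 when every voice is still sounding, so the caller can decide
// whether to drop the note or steal a voice.
int firstFreeVoice(const uint8_t amp[kVoiceCount]) {
    for (int v = 0; v < kVoiceCount; ++v) {
        if (amp[v] == 0) return v;
    }
    return -1;
}
```

One caveat with this assumption: an envelope that passes through 0 in the middle of a note would make the voice look free too early, so checking whether the envelope position has run past its end may be a safer test.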