This is a code to time test the particular routine.
The Problem is with the LOD(look at code for LOD). with the LOD the output is around 1100 ms.
without it it is around 150 ms.
also if i replace 't' with a constant then output is around 150-200 ms.
So why is the access to the RAM taking So Much Time? and is it possible to optimise this code?
As i sai, if i replace the variable 't' with a constant, the time taken for the routine to run 10000 times decreases more than five times. so obviously, something about accessing the ram with a variable index, is making it go terribly slow.
by the way, just in case you haven't read the code., please have a look at the line commented as 'Line of Discussion' to see what i mean.
You seem to have confused the compiler. Confused me too. It's not the RAM access that is slow, it's the nature of what leads up to it. Can you explain what you are doing exactly?
The routine im trying to timetest is part of my project to sample various sound waveforms at different frequencies.
the logic is that 'buff' will contain 8-bit samples of a single wave pulse(i.e 0 to 2 pi) and this can be parsed at different rates to give different frequencies.
e.g. if i want twice the freq at the same sample rate, i'll read every alternate sample. thrice, every one in three samples. so on and so forth. so the calculation of t is very necessary and im not sure if it causes much of a problem because without the LOD or if 't' is replaced with a constant, speeds increase dramatically.
declaring variables to be volatile was necessary bcuz these variables are shared betw interrupt routines in the original project.
declaring variables to be volatile was necessary bcuz these variables are shared betw interrupt routines in the original project.
Just because something is shared between interrupt and background doesn't necessarily mean they have to be qualified as volatile.
You need to analyze their usage carefully.
maharjun:
e.g. if i want twice the freq at the same sample rate, i'll read every alternate sample. thrice, every one in three samples. so on and so forth. so the calculation of t is very necessary and im not sure if it causes much of a problem because without the LOD or if 't' is replaced with a constant, speeds increase dramatically.
Can't you move the calculation of t outside the loop? You don't change the sample rate, per instance of i do you?
maharjun:
The routine im trying to timetest is part of my project to sample various sound waveforms at different frequencies.
the logic is that 'buff' will contain 8-bit samples of a single wave pulse(i.e 0 to 2 pi) and this can be parsed at different rates to give different frequencies.
e.g. if i want twice the freq at the same sample rate, i'll read every alternate sample. thrice, every one in three samples. so on and so forth. so the calculation of t is very necessary and im not sure if it causes much of a problem because without the LOD or if 't' is replaced with a constant, speeds increase dramatically.
declaring variables to be volatile was necessary bcuz these variables are shared betw interrupt routines in the original project.
The routine im trying to timetest is part of my project to sample various sound waveforms at different frequencies.
Your test is testing your test method and not actually sampling waveforms at different frequencies.
Even if you fix the method, it's not going to tell you what you want.
in my original project, the routine is the code of the timer1_compA interrupt vector. so basically, visualise the loop as repeated instances of the interrupt. the slow nature severely limits the Sample rate of the audio.
Can't you move the calculation of t outside the loop? You don't change the sample rate, per instance of i do you?
no im not sure which loop u mean. if its the while loop then obviously not because then it wouldnt be much of a simulation (see above). Also, the calculation depends on noteTime[y] (ive used y istead of i due to italics confusion.) which gets incremented each time.
if its the for loop then no obviously because it depends on i.
Why cast to a 32-bit integer when you are indexing into an array of only 1024 elements?
because noteTime[y] *notefact[y] can get pretty big pretty fast and i have no idea how overflows work when converting from a floating point to an integer type.
More obviously to me, calculating that variable index is the slow part.
This is the FACT:
you remove the LOD(pls read code of my first post loop() function) things are fast;
i dunno how this could be related to the previous line of calculation.
if anyone could please explain this, id be grateful.
Your test is testing your test method and not actually sampling waveforms at different frequencies.
yes obviously. im only trying to see how much time the routine takes. this code will sample 4 notes (simultaneously) if i copy and paste it back into my interrupt routine and fillup buff[] with proper waveform data.
declaring variables to be volatile was necessary bcuz these variables are shared betw interrupt routines in the original project.
Just because something is shared between interrupt and background doesn't necessarily mean they have to be qualified as volatile.
You need to analyze their usage carefully.
I dispute that glib statement - in what circumstances will a non-volatile variable be guaranteed safe to communicate between threads or between ISR and main code?
maharjun:
in my original project, the routine is the code of the timer1_compA interrupt vector. so basically, visualise the loop as repeated instances of the interrupt. the slow nature severely limits the Sample rate of the audio.
I saw that code. It needs major changing as in rewrite.
How many samples per second do you need? And how many bits each? The ADC can give you 8-bit samples more quickly than 10-bit samples (at 105 usec each if you stay on the same pin, else when you change pins it's sample twice and throw the first out). That still lets you go for about 2000 10-bit samples per second with cycles left over a pretty decent bit of processing. If you can tighten that up then you might get 4000-5000 per sec and if you go to 8-bit samples you can get more.
GoForSmoke:
I saw that code. It needs major changing as in rewrite.
How many samples per second do you need? And how many bits each? The ADC can give you 8-bit samples more quickly than 10-bit samples (at 105 usec each if you stay on the same pin, else when you change pins it's sample twice and throw the first out). That still lets you go for about 2000 10-bit samples per second with cycles left over a pretty decent bit of processing. If you can tighten that up then you might get 4000-5000 per sec and if you go to 8-bit samples you can get more.
OTOH I like my audio at either 22050 or 44100.
ya i rewrote that code and things are working fine except that there is an upper cap of 8000 Samples/sec in order to support 4 note polyphony. that is why i am concerned as to why this routine is taking such a long time. someone above has posted the assembly thats produced by the LOD. if you or anyone could explain why thats necessary, and/or how it could be modified it would be of tremendous help.
for my case i would like it to be atleast 20000 Hz
I think that you might need a DSP chip or audio shield in your project. 20000/sec is 50 usec to sample and store. At 8 bits per sample your polyphony is going to get muddy even if you could go that fast.
MarkT:
I dispute that glib statement - in what circumstances will a non-volatile variable be guaranteed safe to communicate between threads or between ISR and main code?
This one, for example:
volatile const int sampleRate = 8000;
It doesn't change, it hardly needs to be volatile.
More obviously to me, calculating that variable index is the slow part.
This is the FACT:
you remove the LOD(pls read code of my first post loop() function) things are fast;
i dunno how this could be related to the previous line of calculation.
if anyone could please explain this, id be grateful.
I can explain it, easily.
Consider this:
if(noteID[i] != -1)
{
t = ((uint32_t)(notefact[i]*noteTime[i]+0.5)) & 0x000003FF;
temp2_byte += *(buff+t); //THIS is the line of discussion henceforth
//refferred to as LOD
noteTime[i]++;
}
It's an optimizing compiler. That's the only place you use t. So if you remove the memory access:
The compiler says "hey, he never uses t, why calculate it?" so it generates:
if(noteID[i] != -1)
{
noteTime[i]++;
}
... if 't' is replaced with a constant, speeds increase dramatically ...
As you say, if you replace t with a constant it is fast. Surely that must convince you that it isn't the memory access that is slow, it is calculating t. What other proof do you need? You admit it yourself.