
Topic: Fast Data Type

davewking

I need to store 512 bits that I'll read one at a time in a loop.  I tried using an array, but that's really slow.  I could create 64 bytes and somehow use those, but I'm hoping there's a better way.  Basically the code needs to do something like this:

int array[512];

for(int i = 0; i < 512; i++)
{
  array[i] = digitalRead(2);
}

I'm actually using PIND to do the read because digitalRead is too slow, and that seems to be working well.  Any ideas on something I can use?

Thanks

Korman

#1
Sep 25, 2010, 11:55 am Last Edit: Sep 25, 2010, 10:02 pm by Korman Reason: 1
What do you consider slow and what data read rate do you consider adequate on your 16MHz CPU?

Let's take a quick look at what the assembler offers. I will assume here that the compiler uses it reasonably efficiently; if not, you will have to break out the assembler.

The basic data type of the ATmega is the byte. Loading and storing a byte can be done in 2 CPU cycles, a word takes 4. So using char instead of int will cut down on load and store time. To make things even better, doing a bit-shift in a register takes 1 CPU cycle.

So if you use your int array, for 8 bits you use 8*4=32 CPU cycles for the storing alone. If you use a char array and set bits, you use 7 cycles for the shifting, 2 cycles for writing out the byte and up to 8 cycles to handle the bits set to 1, giving a total of 9 to 17 cycles. Getting the data from the port or checking whether the bit is 1 is assumed to be equivalent, as is the loop control.

In sum, the gain is nice, but not dramatic. The main benefit, in my opinion, is that you use 64B of storage instead of 1kB, which is a dramatic gain considering that you have only 1kB (or 2kB) of RAM available on your CPU (ATmega168 or ATmega328 assumed). Your int array would take up the whole RAM of an ATmega168. To put it mildly, this is a very bad idea.

And I agree, getting rid of the digitalRead is a good idea.

Perhaps that gives you some ideas.
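
To make the char-array idea a bit more concrete, here is a minimal sketch in plain C, written so it can be tried on a PC as well. The names `sample_fn`, `pack8` and `demo_sample` are made up for illustration; on the ATmega the sample would come from something like `(PIND >> 2) & 1`. What the compiler makes of it in terms of cycles is a separate question.

```c
#include <stdint.h>

/* Hypothetical sample source: returns 0 or 1 for each reading.
   On the ATmega this could be (PIND >> 2) & 1; here it is a
   function pointer so the sketch also runs off-target. */
typedef uint8_t (*sample_fn)(void);

/* Shift eight samples into one byte, first sample ends up in bit 7. */
static uint8_t pack8(sample_fn read_sample)
{
    uint8_t b = 0;
    for (uint8_t n = 0; n < 8; n++) {
        b = (uint8_t)(b << 1);      /* make room for the next bit */
        if (read_sample()) {
            b |= 1;                 /* set bit 0 when the input is high */
        }
    }
    return b;
}

/* Demo stand-in for the pin: yields 1,0,1,1,0,0,1,0 repeatedly. */
static uint8_t demo_sample(void)
{
    static const uint8_t pattern[8] = {1, 0, 1, 1, 0, 0, 1, 0};
    static uint8_t pos = 0;
    uint8_t v = pattern[pos];
    pos = (uint8_t)((pos + 1) & 7);
    return v;
}

/* Storage cost of 512 one-bit readings, in bytes. */
enum {
    BYTES_AS_INT16 = 512 * sizeof(int16_t), /* int is 16 bits on the AVR */
    BYTES_PACKED   = 512 / 8
};
```

Calling `pack8(demo_sample)` once consumes eight readings and returns them as a single byte, so 64 such calls cover all 512 readings.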

Korman

deSilva

You should use some "loop unrolling", e.g.
Code: [Select]
for (byte i=0; i<64;++i){
byte b = (PIND>>theBit)&1;
b |= (PIND>>(theBit-1))&2;
b |= (PIND>>(theBit-2))&4;
b |= (PIND>>(theBit-3))&8;
b |= (PIND>>(theBit-4))&16;
b |= (PIND>>(theBit-5))&32;
b |= (PIND>>(theBit-6))&64;
b |= (PIND>>(theBit-7))&128;
// change shift direction if theBit <7 :-)
theData[i]=b;
}//for

Korman

#3
Sep 25, 2010, 12:23 pm Last Edit: Sep 25, 2010, 12:32 pm by Korman Reason: 1
If efficiency is necessary, don't mess with doing maths on PIND. I would do:

Code: [Select]

; Bit 7
CLR Rx
SBIC PIND, PIND2
ORI Rx, 1

; Bit 6
LSL Rx
SBIC PIND, PIND2
ORI Rx, 1

; Bit 5
LSL Rx
SBIC PIND, PIND2
ORI Rx, 1

; Bit 4
LSL Rx
SBIC PIND, PIND2
ORI Rx, 1

; Bit 3
LSL Rx
SBIC PIND, PIND2
ORI Rx, 1

; Bit 2
LSL Rx
SBIC PIND, PIND2
ORI Rx, 1

; Bit 1
LSL Rx
SBIC PIND, PIND2
ORI Rx, 1

; Bit 0
LSL Rx
SBIC PIND, PIND2
ORI Rx, 1


Or if the bits should be stored in the opposite direction use
Code: [Select]

LSR Rx
SBIC PIND, PIND2
ORI Rx, 0x80


The main problem here is a slightly uneven read rate when the byte needs to be written. If you want the reads spaced as evenly as possible, you will probably fare better with a loop, slowing down the reading between the bits with nop if necessary.

But all this depends on what you're trying to do.

Korman

deSilva

Using assembly code is not fair  :o

Korman

#5
Sep 25, 2010, 03:34 pm Last Edit: Sep 25, 2010, 03:44 pm by Korman Reason: 1
Quote
Using assembly code is not fair


No it isn't, but when people start counting CPU-cycles, what else is one supposed to do?

I don't really know the C compiler well enough to know how it will optimise this or what the resulting code will be. In assembler, at least, any mess I create is of my own devising and I know who to blame.

And I think the SBIC and SBIS are really cool instructions. Perhaps I should make me a T-Shirt with them on it. How about:
I know about SBIC and WDR.
How did you waste your life
to end up here?


Or in a more general view how about: ROR >> LOL
Korman

deSilva

Quote
ROR >> LOL

But that IS cool  8-)

davewking

Thanks for all the ideas! There's a lot of good information to go off of.  I agree with the space issues with the array, I was just using that to illustrate what I need.  Like I said I only need to save a 1 or a 0 for each reading.  I might be missing something, but all the examples are only for a single byte. Is there an easy way to do multiple bytes?  From what I've seen it looks like the array is slow to assign to because of the indexing, not the actual assignment.

Again, I really appreciate everyone's help with this.

Korman

Writing out a byte is reasonably fast. The code shown above was only meant to demonstrate various ideas for handling the bitwise stuff. After that group of 8, you just store the byte and start with the next one. Done in assembler, writing to the array and incrementing the pointer would entail a penalty of 2 CPU cycles. If you consider that too slow, it's time to rethink your structure and perhaps even the platform choice again.
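
As a rough C sketch of "store the byte and start with the next one", assuming the samples arrive through a hypothetical `read_sample` callback (on the AVR that would be a direct port read instead). `capture_bits`, `get_bit` and `demo_every_third` are illustrative names only, and the code says nothing about the cycle counts the compiler will actually achieve.

```c
#include <stdint.h>
#include <stddef.h>

#define NUM_BITS  512u
#define NUM_BYTES (NUM_BITS / 8u)

/* Hypothetical sample source returning 0 or 1 per reading. */
typedef uint8_t (*sample_fn)(void);

/* Fill buf with NUM_BITS samples, eight per byte, first sample in bit 7. */
static void capture_bits(uint8_t buf[NUM_BYTES], sample_fn read_sample)
{
    uint8_t *ptr = buf;
    for (uint8_t byte_no = 0; byte_no < NUM_BYTES; byte_no++) {
        uint8_t b = 0;
        for (uint8_t n = 0; n < 8; n++) {
            b = (uint8_t)((b << 1) | (read_sample() ? 1u : 0u));
        }
        *ptr++ = b;   /* one store and pointer bump per eight samples */
    }
}

/* Read back sample number i (0..511) as 0 or 1. */
static uint8_t get_bit(const uint8_t buf[NUM_BYTES], uint16_t i)
{
    return (uint8_t)((buf[i >> 3] >> (7u - (i & 7u))) & 1u);
}

/* Demo stand-in for the pin: every third reading is 1, starting with the first. */
static uint8_t demo_every_third(void)
{
    static uint16_t count = 0;
    return (uint8_t)((count++ % 3u) == 0u);
}
```

After `capture_bits(buf, demo_every_third)`, a later loop can walk `get_bit(buf, i)` for `i` from 0 to 511 to read the stored values back one at a time.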


Korman


westfw

I think you want something like
Code: [Select]
byte bits[512];
byte i;

void loop() {
 byte *ptr=&bits[0];
 i = 0;
 do {
   *ptr++ = PIND;
   i--;
 } while (i != 0);
 do {
   *ptr++ = PIND;
   i--;
 } while (i != 0);
}

Using the two loops prevents the AVR from having to do slower 16-bit math...  This will compile to code that would be hard to write any faster using an equivalent algorithm (i.e. looping)
Code: [Select]
void loop() {
 byte *ptr=&bits[0];
 a8:   e0 e0           ldi     r30, 0x00       ; 0
 aa:   f1 e0           ldi     r31, 0x01       ; 1
 i = 0;
 ac:   90 e0           ldi     r25, 0x00       ; 0
 do {
   *ptr++ = PIND;
 ae:   89 b1           in      r24, 0x09       ; 9
 b0:   81 93           st      Z+, r24
  i--;
 b2:   91 50           subi    r25, 0x01       ; 1
 } while (i != 0);
 b4:   e1 f7           brne    .-8             ; 0xae <loop+0x6>
 
 do {
   *ptr++ = PIND;
 b6:   89 b1           in      r24, 0x09       ; 9
 b8:   81 93           st      Z+, r24
   i--;
 ba:   91 50           subi    r25, 0x01       ; 1
    } while (i != 0);
 bc:   e1 f7           brne    .-8             ; 0xb6 <loop+0xe>
 be:   10 92 00 03     sts     0x0300, r1
}
 c2:   08 95           ret


If you need faster than that, replace the loop with 512 consecutive "*ptr++ = PIND;" lines (ugly!)
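
Note that the loop above stores a whole snapshot of the port per reading, so it needs all 512 bytes. If only one pin matters, the raw snapshots could be squeezed down to 64 bytes afterwards, outside the time-critical loop. A rough sketch in plain C, assuming that is what is wanted; `pack_samples` and its parameters are made-up names for illustration.

```c
#include <stdint.h>

#define NUM_SAMPLES 512u

/* Extract one pin from each raw port snapshot and pack eight readings
   per byte, first reading in bit 7. `pin` is the bit position within
   the port (2 for Arduino digital pin 2 on PIND). */
static void pack_samples(const uint8_t raw[NUM_SAMPLES],
                         uint8_t packed[NUM_SAMPLES / 8u],
                         uint8_t pin)
{
    for (uint16_t i = 0; i < NUM_SAMPLES; i++) {
        uint8_t bit = (uint8_t)((raw[i] >> pin) & 1u);
        if ((i & 7u) == 0u) {
            packed[i >> 3] = 0;   /* start a fresh output byte */
        }
        packed[i >> 3] |= (uint8_t)(bit << (7u - (i & 7u)));
    }
}
```

This keeps the capture loop as tight as the listing above and moves the shifting and masking to a point where timing no longer matters, at the cost of needing the 512-byte buffer during capture.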

deSilva

#10
Sep 26, 2010, 11:15 pm Last Edit: Sep 26, 2010, 11:15 pm by mpeuser Reason: 1
Quote
I think you want something like...

Reading his last posting, I think he does not really know what he wants  ;D

westfw

Quote
he does not really know what he wants

Perhaps.  I provided what I thought was an interesting answer anyway :-)

It wasn't a particularly GOOD answer, however, since array indexing gets nicely optimized by the compiler, and is NOT particularly slow:
Code: [Select]
 for (int j=0; j < 512; j++) {
     bits[j] = PIND;
 c4:   e0 e0           ldi     r30, 0x00       ; 0
 c6:   f1 e0           ldi     r31, 0x01       ; 1
 c8:   89 b1           in      r24, 0x09       ; 9
 ca:   81 93           st      Z+, r24
 } while (i != 0);
 do {
   *ptr++ = PIND;
   i--;
 } while (i != 0);
 for (int j=0; j < 512; j++) {
 cc:   83 e0           ldi     r24, 0x03       ; 3
 ce:   e0 30           cpi     r30, 0x00       ; 0
 d0:   f8 07           cpc     r31, r24
 d2:   d1 f7           brne    .-12            ; 0xc8 <loop+0x20>
     bits[j] = PIND;
 }

Which goes back to what usually ends up getting said when people start talking about "optimizing" their code: you MUST look at the code that actually gets produced!


davewking

Good to know about the indexing, I thought I read somewhere it was slow.  I appreciate everyone's help here, I think I have what I need now.

Thanks!

westfw

Quote
Good to know about the indexing, I thought I read somewhere it was slow.

Oh, in general it can be slow, especially for multi-dimensional arrays where converting some set of indices to the actual address/value involves one or more multiplications.  But in a particularly simple use of an indexed array (as in the example here), all of that slowness might disappear because the compiler is smarter than you think.

I guess, in other words, if there's a simple way of making a simple thing faster, it's possible, even likely, that it has already been done for you automatically...
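
To illustrate where the multiplication comes from, here is a small sketch in plain C with a made-up 4 x 6 table stored row by row. `flat_index`, `sum_indexed` and `sum_walking` are hypothetical names. Whether a given compiler really reduces the first form to the second is something you would have to check in the generated code.

```c
#include <stdint.h>
#include <stddef.h>

#define ROWS 4u
#define COLS 6u

/* Offset of element [row][col] in a row-major ROWS x COLS table:
   one multiplication plus one addition per access. */
static size_t flat_index(size_t row, size_t col)
{
    return row * COLS + col;
}

/* Sum using the explicit index calculation for every element. */
static uint16_t sum_indexed(const uint8_t flat[ROWS * COLS])
{
    uint16_t sum = 0;
    for (size_t r = 0; r < ROWS; r++) {
        for (size_t c = 0; c < COLS; c++) {
            sum = (uint16_t)(sum + flat[flat_index(r, c)]);
        }
    }
    return sum;
}

/* Same result with a walking pointer: the address simply advances by
   one element per step, so no multiplication is needed. */
static uint16_t sum_walking(const uint8_t flat[ROWS * COLS])
{
    const uint8_t *p = flat;
    uint16_t sum = 0;
    for (size_t n = 0; n < ROWS * COLS; n++) {
        sum = (uint16_t)(sum + *p++);
    }
    return sum;
}
```

For a sequential pass like the 512-element loop in this thread, the walking-pointer form is essentially what the disassembly shows (`st Z+`), even though the source was written with an index.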
