Topic: Watchdog ISR in assembly language

robtillaart

@westfw
If C is not part of the problem, .... is it part of the solution then? or ... :)

I know about the digitalwritefast discussion and have seen it pass by several times, thanks anyway. Many (not all) performance problems can be solved by a different design or algorithm; for the rest one can only buy more hardware or lower the requirements, and even that is not always sufficient :)
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Graynomad

#26
Feb 16, 2011, 05:33 am Last Edit: Feb 16, 2011, 05:47 am by Graynomad Reason: 1
I said before

Quote
There is a third option of course, C with direct port manipulation which would presumably be somewhere between the two, I may write this to test it as well.

Well, I've done another test and I was wrong: C with direct port manipulation is BETTER than assembler. I've always known C was good, but I didn't expect this, I have to say.

ASM with delays removed 28uS
C with direct manipulation 26uS

Code size including Arduino overhead

ASM 858 bytes
C 778 bytes

Code size with neither routine 658 bytes

so the size of each ISR is

ASM 200 bytes (858 - 658)
C 120 bytes (778 - 658)

Frankly I don't understand the massive difference in code size. Maybe my numbers are out a bit, but either way this is very interesting.

As an ex ASM writer I think this is probably the end of my assembly writing days  :smiley-eek:

But what do I mean by "C with direct port manipulation"? Here is an example of shifting in a byte:

Code: [Select]
 
 for (i = 0; i < 8; i++) {
   asm ("sbi 5,5");  // clock high
   cmd <<= 1;        // move bits up first so the final bit ends up in bit 0
   cmd |= (PINB >> ASM_PIN_MOSI) & 1; // bit into LSB of byte
   asm ("cbi 5,5");  // clock low
 }


So you could argue that it's really half assembly anyway, and as written it's not directly portable between different CPUs, but that's nothing a few #defines or conditional code blocks can't handle.
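A few #defines of the kind mentioned above might look something like the sketch below. It is only an illustration: the names CLK_PORT, DATA_PIN, CLK_HIGH, CLK_LOW, shift_in_byte and the sim_* variables are made up here, and the data bit (PB4) is an assumption. When built for anything other than an AVR, the port names fall back to ordinary variables so the loop can be compiled and checked on a PC.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#define CLK_PORT PORTB          /* clock is driven on port B */
#define DATA_PIN PINB           /* data is sampled from port B inputs */
#else
/* Host build: plain variables stand in for the hardware registers (assumption). */
static volatile uint8_t sim_portb;
static volatile uint8_t sim_pinb;
#define CLK_PORT sim_portb
#define DATA_PIN sim_pinb
#endif

#define CLK_BIT  5              /* PB5, the clock bit used in the post */
#define DATA_BIT 4              /* PB4, assumed data-in bit */

#define CLK_HIGH() (CLK_PORT |= (uint8_t)(1u << CLK_BIT))
#define CLK_LOW()  (CLK_PORT &= (uint8_t)~(1u << CLK_BIT))

/* Shift in one byte, MSB first: shift before OR-ing so all eight bits are kept. */
static uint8_t shift_in_byte(void)
{
    uint8_t value = 0;
    for (uint8_t i = 0; i < 8; i++) {
        CLK_HIGH();
        value <<= 1;                            /* make room for the new bit */
        value |= (DATA_PIN >> DATA_BIT) & 1u;   /* new bit goes into the LSB */
        CLK_LOW();
    }
    return value;
}
```

Moving to another CPU should then only mean changing the block of #defines at the top, not the loop itself.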

But I think the point is that it's all written within the C/C++/Arduino environment and I think that's a big plus. It's also 100 times more readable than the 150 lines of assembly language.

Quote
BTW what logic analyzer do you use? there is still one my wishlist

Rob, take it off your wish list and put it on your desk. I can't even imagine how people debug code that deals with IO without one. I reckon if everybody had an LA this forum would be half the size, as there would never be an "I'm sending X and receiving Y" thread again  :)

Anyway, to answer your question, I use the Saleae Logic and I can't recommend it highly enough. At $170 it's a no brainer to me. In the "old" days with external peripherals you needed to spend $1000s on a 32-channel 100MHz job, but these days all the hard stuff is inside the chip and we're mostly looking at serial comms of one type or another, so these new USB LAs are just the ticket for 99% of work IMO.

I also find it useful for non-IO code. For example, to see how often a function is called I pulse a pin and look at it with the LA. To see how many times I go through a loop I do the same.
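The pin-pulse trick could be sketched roughly like this. PROBE_PORT, PROBE_BIT, probe_pulse and the checksum function are all hypothetical names chosen for the example, and bit 0 of port B is just an assumed spare pin. On a real AVR the pin would also have to be set as an output first (via DDRB); on a PC the port is replaced by a plain variable so the code still builds.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#define PROBE_PORT PORTB        /* assumed: the spare pin lives on port B */
#else
/* Host build: a plain variable stands in for the port register (assumption). */
static volatile uint8_t sim_port;
#define PROBE_PORT sim_port
#endif

#define PROBE_BIT 0             /* hypothetical spare pin wired to the analyzer */

/* Emit one short high pulse that the logic analyzer can count or time. */
static inline void probe_pulse(void)
{
    PROBE_PORT |= (uint8_t)(1u << PROBE_BIT);
    PROBE_PORT &= (uint8_t)~(1u << PROBE_BIT);
}

/* Example code under test: one pulse per pass through the loop. */
static uint16_t checksum(const uint8_t *data, uint8_t len)
{
    uint16_t sum = 0;
    for (uint8_t i = 0; i < len; i++) {
        probe_pulse();          /* shows up as one edge pair per iteration */
        sum += data[i];
    }
    return sum;
}
```

Counting the pulses on the analyzer trace gives the number of loop passes, and the gap between pulses gives a rough time per pass.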


______
Rob

Rob Gray aka the GRAYnomad www.robgray.com

westfw

Code: [Select]
    asm ("sbi 5,5");  // clock high

The current compiler is pretty good at turning C code like "PORTB |= (1<<PB5);" into a single sbi instruction when it is possible.

You did get rid of ISR_NAKED in your C implementation, right?

Graynomad

Quote
You did get rid of ISR_NAKED in your C implementation, right?

He he, yep, after wondering for a while why the code kept restarting :)


Graynomad

#29
Feb 16, 2011, 10:18 am Last Edit: Feb 16, 2011, 10:29 am by Graynomad Reason: 1
Hold the bus, C isn't quite as good as I thought. I was bitten in the bum by the optimiser.

I changed the code as westfw suggested. Here are the three bytes being received.

Code: [Select]
   
 for (i = 0; i < 8; i++) {
   PORTB |= (1<<PB5);            // clock high
   cmd <<= 1;                    // move bits up first
   cmd |= (PINB >> PB4) & 1;     // I commented this line
   PORTB &= ~(1<<PB5);           // clock low
 }
 for (i = 0; i < 8; i++) {
   PORTB |= (1<<PB5);
   addr_lo <<= 1;
   addr_lo |= (PINB >> PB4) & 1;
   PORTB &= ~(1<<PB5);
 }
 for (i = 0; i < 8; i++) {
   PORTB |= (1<<PB5);
   addr_hi <<= 1;
   addr_hi |= (PINB >> PB4) & 1;
   PORTB &= ~(1<<PB5);
 }


This runs like a cut snake at 27uS for the ISR.

For reasons that I can't remember I commented out the PINB line for the first byte and the ISR blew out to 68uS.

WTF?

It seems that the compiler decided "(PINB >> PB4) & 1" would be the same all three times and didn't bother repeating the code; presumably it did that once at the start of the ISR and just did assignments thereafter. If I'd had the other Arduino hooked up I would have realised I was getting duff data, but with MISO unconnected FF is what I expected, so I didn't look into it.

As I don't know if you can make PINB volatile, I made the variables being assigned volatile instead, and that seems to work.
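On the question of making PINB volatile: as far as the avr-libc headers go, the I/O register names appear to be defined already as a dereference of a volatile byte pointer, so each mention of PINB in the source should produce its own read. The sketch below shows that general pattern only. REG8, FAKE_PIN, fake_pin_register and sample_twice are invented for illustration, and an ordinary variable stands in for the hardware register so it can be tried on a PC.

```c
#include <stdint.h>

/* Hypothetical macro: access a byte through a volatile pointer, the same
   general shape avr-libc uses for its register definitions. */
#define REG8(addr) (*(volatile uint8_t *)(addr))

/* Stand-in for a hardware input register (assumption for host testing). */
static uint8_t fake_pin_register;
#define FAKE_PIN REG8(&fake_pin_register)

/* Sample bit 4 twice; volatile access means two separate reads are made. */
static uint8_t sample_twice(void)
{
    uint8_t first  = (FAKE_PIN >> 4) & 1u;   /* first read of the register */
    uint8_t second = (FAKE_PIN >> 4) & 1u;   /* second read, not reused from the first */
    return (uint8_t)((first << 1) | second);
}
```

If the register access is volatile like this, the timing change is more likely to come from what the compiler does with the variables being assigned than from the reads themselves, which would fit with the fix of making those variables volatile.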

So the C version takes 68uS, more than twice as slow as assembler at 28uS (thank goodness, the very foundations of my world had taken a beating for a minute there) but not too shabby.

