YUN too slow or defective?

I’m trying to read data from a digital caliper into my YUN but somehow it does not work as expected.

There are basically two signals, CLK and SDA, and I found several resources where people did this with Arduinos or similar microcontrollers, and it never seemed to be a problem.

The data block consists of two words of 24 bits each, transmitted in about 460µs (micro), followed by something like 175 ms (milli) of “nothing”.

I can see the data on the oscilloscope as well as in a logic analyser, so technically everything seems to be OK. Decoding that data manually also gives me proper results.

However, when reading via the Arduino, it seems that not all bits are seen. I tried doing it in a loop as well as via interrupts.

The code is pretty simple, but as I could not find a way to make it work, I created a very basic test routine that does nothing but read a digital pin, to see how many reads are possible in something like 1000µs:

int PIN_CLK = 3;
int PIN_SDA = 2;
int PIN_LED = 13;

unsigned long tStart = 0;
unsigned long tNow   = 0;
unsigned long nReadCount = 0;
int nReadVal;

void setup() {
  
  pinMode(PIN_CLK, INPUT);
  pinMode(PIN_SDA, INPUT);
  pinMode(PIN_LED, OUTPUT);

  Serial.begin(115200);
  while (!Serial){};           // wait for the USB serial connection

  Serial.println("ReadCount test 001");
  digitalWrite(PIN_LED, HIGH);
}

void loop() {
  digitalWrite(PIN_LED, LOW);
  
  tStart = micros();
  tNow   = tStart;
  nReadCount = 0;
  
  while (tNow < tStart+1000){
    tNow = micros();
    nReadVal = digitalRead(PIN_CLK);
    // nReadVal = digitalRead(PIN_SDA);
    nReadCount++;
  }  

  Serial.print( "Time taken: ");
  Serial.print( tNow-tStart);
  Serial.print( "  Number of reads: ");
  Serial.println( nReadCount);
  
  digitalWrite(PIN_LED, HIGH);
  delay(200);
  
}

If I run this, I see that “only” 81 reads are possible within 1000µs, and just about 60 if I read the second pin as well. In this scenario I am not even doing anything with the data, and still I would not be able to get the 48 bits in time.

What’s wrong?

I’m attaching the sketch as well.

Thanks for any help

regards

Frank

DigitalReadTest.ino (848 Bytes)

This piece of code

while (tNow < tStart+1000){
    tNow = micros();
    nReadVal = digitalRead(PIN_CLK);
    // nReadVal = digitalRead(PIN_SDA);
    nReadCount++;
  }

has a lot more than digitalRead() within it.

If you want to measure the time for digitalRead() try this

volatile byte dd;
unsigned long tstart, tend;

tstart = micros();
for (int n = 0; n < 1000; n++) {
    dd = digitalRead(PIN_CLK);
}
tend = micros();
Serial.println(tend - tstart);

I have used volatile in the hope that the compiler won’t just optimise away the FOR loop because it thinks the result is not used anywhere.

If digitalRead() is too slow you can use port manipulation
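
For example, something along these lines (untested) does the slow pin-number-to-register lookup only once in setup(), using the standard Arduino macros, so each read in the loop becomes a single register access:

const int PIN_CLK = 3;        // same clock pin as in your sketch

volatile uint8_t *clkInReg;   // input register of the port the clock pin lives on
uint8_t clkBitMask;           // bit mask of the clock pin within that port

void setup() {
  pinMode(PIN_CLK, INPUT);
  Serial.begin(115200);
  // Do the slow pin-to-port translation once, up front
  clkInReg   = portInputRegister(digitalPinToPort(PIN_CLK));
  clkBitMask = digitalPinToBitMask(PIN_CLK);
}

void loop() {
  // Reading the pin is now a single register access plus a mask test
  byte clkState = (*clkInReg & clkBitMask) ? HIGH : LOW;
  // ... do something with clkState ...
}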

…R

DDTechG:
The data block consists of two words of 24 bits each, transmitted in about 460µs (micro)

48 bits in 460us sounds like I2C running at a fairly standard 100 kHz, and it sounds like the caliper is the master and the Yun will be the slave. That's quite unusual, a measurement device like that is usually the slave.

if I run this, I see that "only" 81 reads are possible within 1000µs

That comes out to about 12.3 us per read, or 197 CPU clock cycles. That's not bad considering that each "read" is not only a call to digitalRead() but is also incrementing a 32 bit value (nReadCount), calling a function to read another 32 bit value (micros()), then doing a 32 bit addition on that value and a 32 bit comparison. You're doing quite a lot of work besides just reading the input pin.

I tried your sketch, and got a fairly consistent 80 reads per 1000us. Then I got rid of the loop overhead by simply reading tStart, doing 1000 digitalRead calls in-line (no loop) then reading tNow. The difference is 6340 us, which comes down to 6.34 us per nReadVal = digitalRead(PIN_CLK); call. So it would look like your timing test had about 50% loop overhead. (I still have a bit of overhead in mine, one of the micros() reads, but that should be negligible compared to the 1000 reads.)

Simply calling digitalRead() appears to take about 100 CPU cycles. That's not too bad considering all of the overhead involved to translate the pin number to a port register address and pin mask, as well as the other overhead that makes calling digitalRead() so easy and safe. Of course, all of that internal overhead means that calling digitalRead() flat-out will only let you sample at about 157 kHz tops, which will get you about 72 reads in during your 460us window. There is no way you will read the 96 clock edges that occur in that time, let alone also sample the data line. digitalRead() was designed to make digital I/O easy and safe, but the cost of that is speed. There is simply no way you will be able to read that data using digitalRead().

You can boost your speed dramatically by using direct port/bit manipulation to read your data stream. But to do that, you will have to be cognizant of the processor's abilities and write highly optimized code. After all, with the 16 MHz clock speed, you will only have 160 CPU cycles per bit, or 80 CPU cycles per SCL edge to capture the data. That's not many clock cycles to find the edges of the SCL signal, AND read the SDA line, AND check for start and stop conditions. You will also need some cycles to actually decode the protocol, count bits, store data, etc.

I don't think that bit-banging a 100 kHz I2C interface is practical with a 16 MHz processor. Sounds like an exercise in futility unless you are a VERY sharp programmer and you bypass the Arduino libraries completely and work with the bare metal registers. Even then, it will be a battle.

You really need to use the I2C hardware to greatly lessen the load on the processor. While I've not used it, the Arduino Wire library appears to allow the Arduino to act as an I2C slave. Take a look at the Master Writer/Slave Receiver Example for ideas.
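
A minimal slave-receiver skeleton would look something like this (untested, and the address 8 is just a placeholder; it obviously only helps if the caliper really is acting as an I2C master and addressing the Yun):

#include <Wire.h>

// Called from the Wire library whenever the master sends us data
void receiveEvent(int howMany) {
  while (Wire.available()) {
    byte b = Wire.read();
    Serial.print(b, HEX);
    Serial.print(' ');
  }
  Serial.println();
}

void setup() {
  Wire.begin(8);                 // join the bus as a slave at address 8 (placeholder)
  Wire.onReceive(receiveEvent);  // register the receive handler
  Serial.begin(115200);
}

void loop() {
  delay(100);
}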

If that still doesn't give you the speed you need, you will either have to access the I2C hardware directly, or you will need to go with a board that has a faster processor.

Robin, Shapeshifter,

thanks for the quick response.

The sources I found seemed to “simply just have done it”, so I did not even have the idea of being close to any limits at all. None of them mentioned optimization issues, and the word even was that it would be no problem to deal with more than one of these calipers.

Most of the code I found uses loops to wait for a falling or rising edge (while (digitalRead(CLK)==LOW)… ). I’m now wondering how this could ever work, but maybe they used different calipers with slightly different speed or protocol.
I tried to do it by attaching an interrupt to the rising edge of the clock (the signal is inverted due to level shifting) and collecting the data in the ISR, and found that I never got to the final bit before the next cycle started. That’s why I tried the little test routine.

Robin2:
This piece of code … has a lot more than digitalRead() within it. If you want to measure the time for digitalRead() try this…

I did not mean to measure the speed of digitalRead() per se; I just wanted to see the performance without any possibly “bad” code I'd need for data handling etc. However, I was not aware of the overhead I added in this little test routine. I've been developing code for ages, but I'm quite inexperienced in the microcontroller field, where these things matter. Thanks for pointing this out.

Robin2:
…If digitalRead() is too slow you can use port manipulation

Yes, I will surely give this a try.

ShapeShifter:
48 bits in 460us sounds like I2C running at a fairly standard 100 kHz, and it sounds like the caliper is the master and the Yun will be the slave. That’s quite unusual, a measurement device like that is usually the slave.

I doubt that. I think it is a proprietary protocol. There is no negotiation etc. The caliper seems to broadcast its data to the world, no matter if you like to hear it or not.
I remember coding an I2C communication with a Sensirion SHT sensor. That was really two devices talking to each other.

I’m attaching a screenshot of the data packet. The first word seems to be some kind of absolute value and the second one is relative to the last time “zero” was pressed; i.e. the value displayed.
The signal is inverted.

I could actually live with one or the other value; I just wanted to read both, because they are “there” and I'm currently just playing around. Actually, all the sources read just 24 bits, and in most cases not even that: they cut off after 16 bits and then just read bit 21 for +/- (two’s complement).

ShapeShifter:
…I tried your sketch, and got a fairly consistent 80 reads per 1000us.

Exactly what I get on mine, so the device seems to be OK. Thanks for testing.

Thanks again. I’ll see if I get any further with that. I will also try some of the code I found, just to see if it would work for me at all.

Frank

DDTechG:
The sources I found seemed to "simply just have done it"

Are these examples of having done it with Arduino? Or some other processor?

The big bottleneck is the digitalRead() function, which makes things easy, but adds overhead. If the sample code was another processor and used direct port I/O, that would likely explain the differences.

Most of the code I found uses loops to wait for a falling or rising edge (while (digitalRead(CLK)==LOW)... ). I'm now wondering how this could ever work, but maybe they used different calipers with slightly different speed or protocol.

Hmmm... so it does sound like Arduino code, but I have to agree that given your sample waveform I don't see how it's feasible. On that capture, it looks like the minor tick marks are 10 us. Given that the digitalRead() time was measured as 6.34 us above, you won't even get two reads in during one of those minor tick marks, so I don't see how you could catch those skinny clock pulses.

Perhaps they were reading a device with slower output, or there is a way to tell the device to use a slower send speed, or perhaps they were using a much faster processor like a Due or Teensy board?

I doubt that. I think it is a proprietary protocol. There is no negotiation etc. The caliper seems to broadcast its data to the world, no matter if you like to hear it or not.

I agree. Now that I see the waveform, this is definitely not I2C. Because of the use of the names SCL and SDA, I was assuming I2C. It looks like the device is simply shifting out the data bits, so I wonder if SPI hardware would make this easier to read? After all, SPI is basically a shift register.

Are there any other data lines that would make it easier to detect the start and stop of a transfer? That would probably make a SPI slave implementation easy. Or is the only option to look for the idle gaps in the data?

I will also try some of the code I found, just to see if it would work for me at all.

I'd like to see some of that sample code. Do you have a link to it?

Also, do you have a link to information on the caliper? It may give me some ideas of a better implementation.

DDTechG:
Most of the code I found uses loops to wait for a falling or rising edge (while (digitalRead(CLK)==LOW)… ). I’m now wondering how this could ever work, but maybe they used different calipers with slightly different speed or protocol.
I tried to do it by attaching an interrupt to the rising edge of the clock (the signal is inverted due to level shifting) and collecting the data in the ISR, and found that I never got to the final bit before the next cycle started. That’s why I tried the little test routine.

Maybe the most useful thing would be to describe what you are trying to achieve and maybe someone here can suggest a practical solution - rather than trying to “fix” existing unsuitable code.

…R

I did some poking around, and found a few sites that are reading Chinese digital calipers using some simple code like you mention. It would appear you have a caliper that sends data much faster than what is expected by those pages with sample code.

According to your logic analyzer capture, your caliper takes approximately 170 us to send 24 bits.

THIS ONE, which uses a simple polling loop to watch the clock, is expecting a single clock cycle to take 300 us. Your caliper is more than 40 times faster than this.

THIS ONE, which uses an interrupt on the clock edge to read the data bit, is expecting the 24 bits of data to take almost 8 ms. Your caliper is almost 50 times faster than this.

THIS ONE, which also uses a polling loop, is expecting the data to take 780 us. Your caliper is about 4.5 times faster than this.

It looks to me like the problem is not that the Yun is slow, but that you have a particularly fast data rate out of your caliper.


It seems to me like this data stream would be a good candidate to go through the SPI interface. There is an interesting discussion about it HERE. The problem is getting a slave select signal to know when there is the start of the data.

Fortunately, the data is bursty with long dead time between samples. I suppose it wouldn't be too hard to make a simple RC low pass filter that takes the clock line and looks for the dead time between readings, and turns that into a slave select signal. Basically, you want the filter output to go low within about 20 us of the clock line going low, but not go high during the short positive clock pulses. It would only go high when the clock line is high for about 20 us. That filtered signal feeds the SS pin, unfiltered clock goes to the SCK pin, and data goes to the MOSI pin.

Then, you just need to set up the SPI hardware in slave mode, and write an SPI ISR to catch the incoming data one byte at a time. By doing this, you are now working at the byte level and will not have to deal with individual bits. You'll have about 56 microseconds to process each byte -- not a lot, so you will still have to be efficient within the ISR: read the byte and store it, decrement a byte counter, and set a flag when all 6 bytes have been read. The loop() function then sees that the flag is set and processes the bytes into real values.
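
In rough outline, the 32U4 side could look something like this (untested, and assuming the level-shifted clock and data can actually be routed to the SCK and MOSI pins, with the filtered signal on SS; the CPOL/CPHA and DORD bits in SPCR will very likely need adjusting to match the caliper's clock polarity and LSB-first bit order):

volatile byte packet[6];        // 48 bits = 6 bytes per burst
volatile byte byteCount = 0;
volatile bool packetReady = false;

void setup() {
  Serial.begin(115200);
  // Slave mode: leave MSTR clear, enable SPI and its "transfer complete" interrupt
  SPCR = _BV(SPE) | _BV(SPIE);
}

// Fires once per received byte -- keep it minimal
ISR(SPI_STC_vect) {
  if (byteCount < 6) {
    packet[byteCount++] = SPDR;
    if (byteCount == 6) packetReady = true;
  }
}

void loop() {
  if (packetReady) {
    // decode the two 24-bit words from packet[] here
    byteCount = 0;
    packetReady = false;
  }
}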

Robin, ShapeShifter,

thanks for your effort in this.

Robin2:
Maybe the most useful thing would be to describe what you are trying to achieve and maybe someone here can suggest a practical solution - rather than trying to “fix” existing unsuitable code.

Well, currently I am more or less playing around. The final idea is that the Yún acts as an arbitrator/controller between the user (providing the interface) and one or more other processors that perform delegated tasks. It’s got something to do with machining. I doubt that multiple scales, plus additional sensing and controlling, can be handled by one of these chips or in one “big” program, and I don’t see the necessity to do so. If it later turns out that distributing the tasks is not necessary, fine, but currently I think “chip per task”.

I’m sure this kind of controller already exists, but I have some ideas for the interface and “accessories” and would like to use my own hardware for that.

I’m currently not able to come up with a proper solution “just like that”, and I’m aware that a good amount of learning curve is ahead. But I’m not in a hurry and not afraid of learning.

When I wrote “playing”, I meant I’m working on a “proof of concept” in order to see if the world could be the way little Frank believes it could be.
I’m working with the calipers because they are cheap (~$10) and handy to work with on the table or a model. Later it will surely be glass or magnetic scales (with another protocol). With the calipers I can simulate bigger ones and test the interaction with the other elements and the interface.

Reading the information from the caliper per se seemed quite clear to me, but I came to a point where I really was not sure whether my Yún was defective or had quirky firmware loaded, because I underestimated the internal clock speed of the packet bursts, especially since the actual repetition rate of the data is only around 5 Hz.

ShapeShifter:
I did some poking around, and found a few sites that are reading Chinese digital calipers using some simple code like you mention. It would appear you have a caliper that sends data much faster than what is expected by those pages with sample code…

Yes, those sites were among the ones I found.
Funny thing is that almost all of these calipers, as well as electronic micrometers, have such an interface. There is a little cap that protects the contacts, which can be removed by hand but nicely snaps back into place. That is something that really raises production costs in a segment where probably every 1/10 of a cent matters. But on the other hand, not a single (official) document about these interfaces can be found. Every available source seems to be from people who were curious and started to “hack” them, armed with their oscilloscopes and intellect.

There seem to be two basic types of protocols used by these “Chinese calipers”. One is called “BCD”, the other “24-bit data”.

BCD sends six packets, one for each digit, plus information on +/- and on inch/mm. The sources you mentioned mostly seem to deal with this protocol.

Some good information can be found HERE at robotroom.

The other one is the one my calipers have. It is also nicely described HERE at robotroom.
But the original source most people refer to is by Shumatech.
This seems to be an older source; I already stumbled over it years ago, when I first found this port on my caliper and curiously “googled” around.

Shumatech already wrote: “The serial data stream is 48 bits long and is clocked by the scale at a nominal frequency of 90 kHz, although the exact frequency seems to vary somewhat between different scale models. On the scale I used for testing, the clock frequency was about 77 kHz as shown in the following oscilloscope snapshot…”

So mine does not seem to be particularly fast. It seems to be the protocol. I bought one of these calipers in 2009, and it is as “fast” as the others.
… Looking again at the LA waveforms from Shumatech and robotroom, yes, mine obviously is a bit faster. Lots of energy down here in Germany :-).

Besides the two protocols mentioned, Mitutoyo seems to use a special one, and sometimes you stumble over mentions of another protocol, “Digimatic”. But I’m not sure whether that is in fact one of the other two.

Other sources, not Arduino related: NerdKits or Robocombo

My idea was not to hamstring the whole Yún by looping and waiting, but to work with interrupts, as some of the sources did as well.

Pseudo-coded, this would be something like:

void setup() {

  ...  
  
  // initialize last interrupt time a few ms forward
  // so that we catch a new cycle for sure and have some
  // time for things to "set" upon startup.
  tLast = micros() + 50000;     
    
  
   // signal comes in inverted so catch on rising 
   attachInterrupt(INT_CLK, onClock, RISING);
  
}

void loop() {
  
    
    // print out the values when certain conditions are met
    // changed values or a set flag when data is ready for display
    if ( (NewValAbs != OldValAbs) || (NewValRel != OldValRel) ){
            PrintValues();    
    } else {
       
   
    }      
   
}


void onClock(){
    // interrupt delegate on a rising edge.
    // global variables modified here must be declared volatile
    
     cli();
    
     unsigned long tNow = micros();  
    
     // read data value on rising edge of clock as 
     // signal is inverted. Data therefore also needs to be inverted.
     //
     // digitalRead() obviously is too slow in this scenario. Should
     // read port directly.
     
     unsigned int  lData = digitalRead(PIN_SDA);

    if ( (tNow - tLast) > 10000 ){
        // Pause > 175ms, begin of new data set
        // in "fast read" mode packets are being sent every 20ms
        // 10ms should cover both 
          
        //digitalWrite(PIN_LED, LOW);        
        nCurrentBit = 0;
        nCurrentSet = 0;  // 0 = absolute data, 1 = relative data, 2 = outside / not started, 3 = data ready for display
        
        tmpVal      = 0;        
    }
    
    
    if (nCurrentSet < 2){
        if(lData == LOW){            
            tmpVal |= 1 << nCurrentBit ;          
        }  
            
        if (nCurrentBit == 23){
          
             // negative data ? 2's complement
             ....


            if (nCurrentSet == 1){
                NewValRel = tmpVal;        
                nCurrentSet = 3;
                //digitalWrite(PIN_LED, HIGH);
            }
            
            if (nCurrentSet == 0){
                NewValAbs = tmpVal;
                nCurrentBit = 0;
                nCurrentSet = 1;     // then read in the relative values next
            }
          
        } else {
          
          nCurrentBit++ ;
          
        }        
    }      
        
    tLast = tNow;
    
   sei();
    
};

Something like that was the idea.

Frank

This may be very wide of the mark, but I wonder if the code in yet another software serial would give some useful ideas. It uses interrupts to detect regular Serial data but the principle should be adaptable to any datastream.

I wrote it with a view to it being reasonably easy to follow and modify.

...R

Robin,

Robin2:
This may be very wide of the mark, but I wonder if the code in yet another software serial would give some useful ideas. ...

Sounds great. I'm sure that I'll find valuable information in there. I will have a look at the article and code as soon as I can (maybe tonight) and report back.

Thanks

Frank

DDTechG:
So mine does not seem to be particularly fast.

No, not unusually fast, but it is on the faster end of the range. And those sites you mention don’t particularly talk about how they are actually reading the data and what processor they are using. They may be using a much faster processor. Of the sites you mention, only the first one really goes into the code at all, and all it says is that the code must be fully dedicated to the process, without saying what processor he’s using.

I guess my point is that unless a post mentions the speed of the data being captured, AND the type and speed of the processor being used, it’s not really fair to assume they are using a caliper as fast as yours and an Arduino running at only 16 MHz. In my quick, non-exhaustive searching, I’ve found people who are definitely using a 16 MHz Arduino with a slow caliper, and people who have fast calipers with unnamed processors. I’ve not yet seen one with a fast caliper and a basic Arduino. The closest I’ve seen is this one, which doesn’t explicitly mention his caliper speed but references the page mentioning a range of 77 to 90 kHz: he is using a 14 MHz AVR processor (same family as the Arduino), but with highly optimized direct port manipulation, not digitalRead().

For your ISR pseudocode, I think you should (and will need to) do some optimizing. First off, there is no need to clear and set the interrupt flags, that is handled automatically for you. It doesn’t hurt to redundantly disable interrupts at the beginning of the routine (other than burning a few clock cycles), but by re-enabling them at the end you are opening a window where another interrupt could come in after enabling them and before returning, which could quickly eat up your stack with recursion.

Inside your ISR, you are tracking the bit state with a counter and a mask. You can make it more efficient by just using a mask. Use a 32 bit variable as the mask, and OR it into your temporary value if the data bit is set, then shift it over one place. Keep doing this until the mask is zero (the bit is shifted off the end) and that is your signal that the word is complete. Since you only need 24 bits and not 32, start your mask off at 0x00000100, so after 24 left shifts it will be zero. Another advantage of this is that when the end of the data is reached, any additional clock pulses will be ignored since the mask is clear and it makes no difference if it is ORed into the working value.

Then, in your current pseudocode, you are looking for the end of the value and processing the final value in the ISR, and setting up for the next value. Don’t do this in the ISR, let the background task handle it. When the mask is zero, the background task will know that all of the bits have been read, and it can process the temporary value. Because of the way the mask was set up, the value will be shifted over 8 bits, so now it needs to be shifted back. If you do this as a signed 32 bit value, the sign will be extended as you shift, so if the MSB was set, it will be extended so that the MSB and the intervening high order bits are set, which will automatically take care of the twos-complement conversion for you.

Some pseudocode like this is the direction I would suggest:

volatile long int mask;         // The next bit to be received
volatile long int working;      // The value currently being received
volatile unsigned long tClk;    // The time that the last clock pulse was detected
unsigned long tLast;            // The last time that a clock pulse was detected
byte currentSet;                // The data item being processed
long int relative;              // The current relative position
long int absolute;              // The current absolute position



void loop()
{
    if ( (tClk - tLast) > 10000 )
    {
        // Pause > 175ms, begin of new data set
        // in "fast read" mode packets are being sent every 20ms
        // 10ms should cover both 
        tLast = tClk;           // Remember when the last clock was seen.
        mask = 0x00000100;      // Set up the mask for the first bit
        working = 0;            // Clear out the working value, ready for new reception
        currentSet = 1;         // Ready to read the first value in the set.
    }

    // Mask will be zero when a complete word has been received.
    if (mask == 0)
    {
        // A complete word has been received.
        switch (currentSet)
        {
            case 1: absolute = working;
                    // Got the absolute position, shift it into the proper place.
                    // This will sign-extend a negative value so it reads properly.
                    absolute >>= 8;
                    // Get ready to read the next set.
                    currentSet = 2;

                    // Do whatever else needs to be done with a new value

                    break;

            case 2: relative = working;
                    // Got the relative position, shift it into the proper place.
                    // This will sign-extend a negative value so it reads properly.
                    relative >>= 8;
                    // No more data is expected until the next long pause.
                    currentSet = 0;

                    // Do whatever else needs to be done with a new value

                    break;
        }

        // Set up to read a new value
        mask = 0x00000100;      // Set up the mask for the first bit
        working = 0;            // Clear out the working value, ready for new reception
    }
}



void onClock()
{
    // interrupt delegate on a rising edge.
    // global variables modified here must be declared volatile
    tClk = micros();  

    // read data value on rising edge of clock as 
    // signal is inverted. Data therefore also needs to be inverted.
    // If the data bit is low, OR in the current mask value.
    // Direct port I/O can make this faster
    if (digitalRead(PIN_SDA) == LOW)
        working |= mask;

    // Shift the mask over in preparation for the next bit
    mask <<= 1;
}

Note that the ISR does the bare minimum amount of processing, which is what you want an ISR to do: get in, do only what absolutely must be done, then get out. Everything else is handled in the background. Note that this means the background can’t be spending a lot of time doing other things: there absolutely should not be any delay() calls in the background. With this configuration, it’s important that the background get around to saving the working value and set up for a new read before the next clock pulse, in that dead time between data values.

Robin, ShapeShifter,

Robin2:
... I wonder if the code in yet another software serial would give some useful ideas. It uses interrupts to detect regular Serial data but the principle should be adaptable to any datastream.

I'll still need to go into it more thoroughly and see that I truly understand everything, but yes, I already got some good info out of it. It seems to hold everything I need, especially the interrupt stuff.

ShapeShifter:
I guess my point is that unless ... , it's not really fair to assume they are using a caliper as fast as yours and an Arduino running at only 16 MHz....

I did not mean to say it must be doable with an Arduino. I found those different sources, of which some read calipers with an Arduino, others used different but (possibly) similar processors, and again others just dealt with the protocols in an analytical way. I somehow got the impression that reading all of them should be no big deal at all and, apart from the different protocol structures, did not see the important differences that apparently are there.

It's a bit like learning to dive. It's impressive and you see some nice big fish, but as you're struggling with the technique and the new environment, the beautiful little things happening right in front of your eyes simply escape your attention.

So I put them all in one big pot and stirred well. Now the soup tastes a little bitter, and the challenge is to take some of the bitterness out of it.

ShapeShifter:
For your ISR pseudocode, I think you should (and will need to) do some optimizing....

Yes, I know, and I wasn't particularly proud of that part, but I honestly had no idea how I could hand the processing off to another method rather than doing it all in the ISR. I did not see the possibility of doing it in the main loop the way you changed it. Thanks for pointing me in that direction.

ShapeShifter:
... there is no need to clear and set the interrupt flags, that is handled automatically ...

I thought I had read somewhere to do so, but looking into Nick Gammon's article again, he says the same. So I mixed that up. And burning clock cycles is not really what I want to do here - thanks.

ShapeShifter:
Inside your ISR, you are tracking the bit state with a counter and a mask. You can make it more efficient by just using a mask. Use a 32 bit variable as the mask, and OR it into your temporary value if the data bit is set,....

Ohhh, how I like that. It took me a moment and a little bit of playing to fully grasp it, but then I was smiling half the evening - so minimalistic and elegant; beautiful.

I think I'm on a better track now. It's still not working properly, but I wanted to respond to your answers and give you an update.

Reading the pin is faster now, reading it directly instead of using digitalRead(), but it seems that the interrupt handling using attachInterrupt(...) takes too much time. If I do nothing more than set tClk and increment a counter every time the ISR fires on the rising edge, the counter only gets up to 37 before the reset after 100ms.

I'll see if, with Robin's code and Nick Gammon's article at hand, I can manage to get the direct interrupt handling to work.

Otherwise I might try a completely different approach. Instead of firing an ISR on every rising edge, I might do that only on the first one, then detachInterrupt() and stay with the signal until the packet is read. It seems there are not enough resources left to handle anything else during that phase anyway.

Another idea is to not attach an interrupt to the clock signal at all, but to detect the time between the data packets and synchronize a timer to that.

I also might try to implement the NerdKits code and see if I can make this work. It looks very interesting.

I'll keep you informed about what I come up with

Happy Easter

Frank

Sounds like you're making progress and have some good ideas how to proceed. Thanks for the update.

DDTechG:
Ohhh, how I like that. It took me a moment and a little bit of playing to fully grasp it, but then I was smiling half the evening - so minimalistic and elegant; beautiful.

Thanks, I've been doing that sort of stuff for a while... :sunglasses:

Happy Easter to you too (and to all)!

Just another quick update.

After thoroughly reading Nick Gammon's article on interrupts again, I see that there is no need to further investigate catching each individual clock pulse with an interrupt.

According to this document, even without the overhead of the attachInterrupt() handling, there is an overhead of ~5 µs.

... However the external interrupts (where you use attachInterrupt) do a bit more, as follows: ...
I count 82 cycles there (5.125 µS in total at 16 MHz) as overhead plus whatever is actually done in the supplied interrupt routine. That is, 2.9375 µS before entering your interrupt handler, and another 2.1875 µS after it returns.

This is already damn close to the 6 to 7 µs clock cycle when data comes in. And these 5 µs do not take into account that the interrupt routine might also be called with a delay. So I'm already almost out of time without doing anything in the ISR.

To make things worse, a bit further down he writes:

... A test shows that, on a 16 MHz Atmega328 processor, a call to micros() takes 3.5625 µS. A call to millis() takes 1.9375 µS. ....

So that is the definitive OUT for this approach. Interrupt handling and saving the time of the last event (tClk = micros()) already use up more time than is available before the next clock signal comes around, and there is no actual data handling yet.

So I'm now going for one of the other approaches mentioned in my last post: either use a timer, or catch only the first clock signal of the packet. With this new information at hand, for the latter I would now catch the first FALLING edge (remember, inverted signal) that initiates the data packet and is followed by some 25-30 µs of silence before the actual data bits come in. That should give me enough time to prepare for collecting the data without slowing things down too much. I'll try that and see how far it takes me.

I'll keep you updated

Frank

DDTechG:
So I'm now going for one of the other approaches mentioned in my last post: either use a timer, or catch only the first clock signal of the packet. With this new information at hand, for the latter I would now catch the first FALLING edge (remember, inverted signal) that initiates the data packet and is followed by some 25-30 µs of silence before the actual data bits come in. That should give me enough time to prepare for collecting the data without slowing things down too much. I'll try that and see how far it takes me.

Given the results of your research so far, this sounds like a valid plan. Catch the start of that first long pulse, and that gives you a little time to get set up. Then spin loop in tight code using direct port I/O to sample the clock and data. While looping to catch clock pulses, you can keep a simple counter (not millis() or micros()) to count how many passes you spent looking for the next edge, and if the count says there were too many iterations, time out and exit the operation. That will help prevent you from getting stuck if there is an interruption or error in the data.
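
The capture part might look roughly like this (untested sketch; it keeps your inverted-signal convention, leaves out the packet framing, i.e. detecting that first long pulse and the idle gap, and just shows the edge wait with a pass counter as the timeout):

const int PIN_CLK = 3;
const int PIN_SDA = 2;
const unsigned int TIMEOUT_PASSES = 5000;   // tune by experiment

volatile uint8_t *clkReg, *sdaReg;
uint8_t clkMask, sdaMask;

void setup() {
  Serial.begin(115200);
  pinMode(PIN_CLK, INPUT);
  pinMode(PIN_SDA, INPUT);
  // Look up the port registers and bit masks once
  clkReg  = portInputRegister(digitalPinToPort(PIN_CLK));
  clkMask = digitalPinToBitMask(PIN_CLK);
  sdaReg  = portInputRegister(digitalPinToPort(PIN_SDA));
  sdaMask = digitalPinToBitMask(PIN_SDA);
}

// Spin until the clock line reaches 'level'; give up after too many passes
bool waitClk(byte level) {
  unsigned int n = 0;
  while (((*clkReg & clkMask) ? HIGH : LOW) != level) {
    if (++n > TIMEOUT_PASSES) return false;
  }
  return true;
}

// Read one 24-bit word, LSB first; returns false on timeout
bool readWord(long &value) {
  value = 0;
  for (byte b = 0; b < 24; b++) {
    if (!waitClk(HIGH)) return false;   // rising edge of the inverted clock
    if ((*sdaReg & sdaMask) == 0)       // data line is inverted too
      value |= (1L << b);
    if (!waitClk(LOW)) return false;    // wait for the clock to drop again
  }
  return true;
}

void loop() {
  long absVal, relVal;
  // Packet framing (catching the long start pulse) would go here first
  if (readWord(absVal) && readWord(relVal)) {
    // use the two values...
  }
}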

ShapeShifter,

ShapeShifter:
...Then spin loop in tight code using direct port I/O to sample the clock and data. While looping to catch clock pulses, you can keep a simple counter (not millis() or micros()) to count....

That's about the plan. Direct I/O already works. The NerdKits code also employs a counter for edge and timeout detection, and I like that. Timeout detection is needed anyway, in case the caliper falls asleep or is switched off. They also read several samples and take the "best guess", which reduces the risk of getting false readings due to glitches.
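
Something like this is what I have in mind for the "best guess" part (just a sketch; readPacket() is a placeholder for whatever the actual capture routine ends up being, returning true when it got a plausible 24-bit reading):

// Placeholder: capture one reading into 'value', return false on timeout/glitch
bool readPacket(long &value);

// Take three samples and accept a value only if at least two of them agree
bool readBestOfThree(long &result) {
  long a, b, c;
  if (!readPacket(a) || !readPacket(b) || !readPacket(c)) return false;
  if (a == b || a == c) { result = a; return true; }
  if (b == c)           { result = b; return true; }
  return false;   // no two samples agree, discard this round
}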

Frank