RFID mis-reads

I have been using cheap EM4100 Manchester-encoded RFID tags in a wildlife research application for some time now. Very occasionally (<once per thousand reads) the tag code is mis-read, generating a record of a tag I have never deployed. This presumably arises because a tag is presented too fast, or doesn’t completely enter the antenna field. It typically occurs within a sequence of correct reads, so mis-reads are easily corrected, or just edited out with trivial loss of information.

So I’m very happy with my system, but it bugs me that I don’t understand mis-reads! If a given tag is mis-read, it is usually mis-read in the same way, consistently generating the same false ID code on different days, sometimes weeks apart.

14 bytes are obtained each time a tag is read.

Examples:
2 48 48 48 48 48 48 48 49 57 54 57 55 3 mis-reads as 2 48 48 48 48 48 48 48 49 57 48 57 49 3
2 48 48 48 48 48 48 48 48 48 66 48 66 3 mis-reads as 2 48 48 48 48 48 48 48 48 49 66 49 66 3
2 48 48 48 48 48 48 48 49 54 65 54 66 3 mis-reads as 2 67 48 48 48 48 48 48 48 68 50 49 50 3
2 48 48 48 48 48 48 48 49 54 66 54 65 3 also mis-reads as 2 67 48 48 48 48 48 48 48 68 50 49 50 3
2 48 48 48 48 48 48 48 48 67 67 67 67 3 mis-reads as 2 48 48 48 48 48 48 48 48 48 48 48 48 3
several other tags also mis-read as 2 48 48 48 48 48 48 48 48 48 48 48 48 3

I appreciate that the difference between numbers that look very different in decimal can result from mis-reading just one 0/1 bit in binary. But can anyone cast light on why the false ID codes should be so consistent?

The same thing, in the same situation, behaves the same. That's not really mysterious is it?

Usually those IDs are read with a CRC or checksum to catch false reads. So if just one or two bits are read wrongly, the reader probably just drops the event. Only in the rare case where more than one bit is wrong but the errors still satisfy the CRC or checksum is the result presented. I guess there are not that many ways a read can go wrong and still produce the correct checksum.

The same thing, in the same situation, behaves the same. That's not really mysterious is it?

I started my post by noting that the same thing in what appears to be the same situation doesn't always produce the same result.

At each read, a mistake could be made on any of 64 bits of data. Usually there's no mistake, but when a mistake is made, it's generally on the same bit. When I think about the many ways a tagged animal might present itself to the (spiral pass-through) reading antenna, I find this surprising.

@pylon: I've been wondering whether there is any checksum calculation. Good point about the finite number of ways to satisfy it. But where is the checksum implemented? It's not in my code. Does it already happen in the reader (an RDM6300, by the way), so that the reader only passes out IDs that have satisfied the checksum?

There's no mention of any checksum in the EM4100 protocol description.

There's no mention of any checksum in the description of the EM4100 protocol

The column parity bits are such a checksum mechanism, a rather simple one but I guess it explains your numbers.

OK, I get it. I didn't comprehend the significance of the parity bits before. I see the check works across both columns and rows too.

I guess your explanation of the consistent mistakes does then follow, but I will try out some actual numbers to fully convince myself!
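For anyone wanting to try actual numbers, here is a minimal Python sketch of the row and column parity check. It assumes the usual EM4100 frame layout (nine header 1s, ten rows of four data bits plus an even parity bit, four column parity bits, one stop bit) and that the ten hex digits reported by the reader map onto the ten data rows in order. The names `em4100_frame` and `frame_ok` are made up for this illustration and do not come from any library.

```python
def em4100_frame(tag_hex):
    """Build a 64-bit frame for a 10-digit hex ID (assumed EM4100 layout)."""
    nibbles = [int(c, 16) for c in tag_hex]
    assert len(nibbles) == 10
    bits = [1] * 9                                   # header: nine 1s
    for n in nibbles:
        row = [(n >> s) & 1 for s in (3, 2, 1, 0)]   # four data bits, MSB first
        bits += row + [sum(row) % 2]                 # even row parity
    for col in range(4):                             # even column parity
        bits.append(sum((n >> (3 - col)) & 1 for n in nibbles) % 2)
    bits.append(0)                                   # stop bit
    return bits


def frame_ok(bits):
    """True if header, stop bit, row parity and column parity all check."""
    if len(bits) != 64 or bits[:9] != [1] * 9 or bits[63] != 0:
        return False
    rows = [bits[9 + 5 * r: 14 + 5 * r] for r in range(10)]
    if any(sum(row) % 2 for row in rows):            # each row must sum to even
        return False
    cols = bits[59:63]
    return all((sum(row[c] for row in rows) + cols[c]) % 2 == 0
               for c in range(4))


good = em4100_frame("0000000196")   # correct ID from the first example
bad = em4100_frame("0000000190")    # the ID it is mis-read as

# Count how many single-bit errors would slip through the parity check.
undetected = 0
for i in range(64):
    flipped = list(good)
    flipped[i] ^= 1
    if frame_ok(flipped):
        undetected += 1
print(undetected)                   # 0

# Positions where the correct and mis-read frames differ.
diff = [i for i, (a, b) in enumerate(zip(good, bad)) if a != b]
print(diff)                         # [55, 56, 60, 61]
```

Under those assumptions no single flipped bit gets through, while the first mis-read in this thread differs from the true frame in only four bits: two data bits in one row and the two matching column parity bits. That kind of "rectangle" of errors is exactly what a row-and-column parity scheme cannot see.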

However, I'm left with a further puzzle: my tags are fully 'read' when I've received 14 bytes by the RFID reader, the first of which is 2 (in decimal), and the last 3. But apparently there are only 10 data bytes in the protocol. So now I'm rather confused about what is actually signalled by the RFID chip, and what is constructed by the reader.

Finding it really frustrating to be so ignorant...

There are 10 hex nibbles that carry data plus one parity nibble. That would be 11 hex nibbles but you get 12.

A search for your reader showed the datasheet, which explains on page 4 that the following two nibbles are an XOR checksum of the previous 5 bytes (10 nibbles).

Doh! I had that data sheet all the time, didn't understand that bit first time round, and didn't think to look at it again. Thanks so much!

Right, after some exploration my brain has advanced a notch, but I'm still confused. The RDM6300 data sheet gives this example:

Example: card number: 62E3086CED
Output data:36H、32H、45H、33H、30H、38H、36H、43H、45H、44H
CHECKSUM: (62H) XOR (E3H) XOR (08H) XOR (6CH) XOR (EDH)=08H

I am happy that line two is the same (in hex) as line one (in ASCII). And I have confirmed for myself in an Excel spreadsheet that doing the XOR operation on pairs of ASCII characters as though they were hex characters, as in line three, yields a checksum of 08 hex.
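The same check can be scripted rather than done in a spreadsheet. This is only a small Python sketch of the datasheet example quoted above; the function name `xor_checksum` is invented here for illustration.

```python
from functools import reduce


def xor_checksum(card_hex):
    """XOR together the bytes named by a hex string such as '62E3086CED'."""
    data = bytes.fromhex(card_hex)          # b'\x62\xe3\x08\x6c\xed'
    return reduce(lambda a, b: a ^ b, data, 0)


card = "62E3086CED"

# The reader sends each hex digit as one ASCII character.
ascii_codes = [f"{ord(c):02X}H" for c in card]
print(ascii_codes)   # ['36H', '32H', '45H', '33H', '30H', '38H', '36H', '43H', '45H', '44H']

print(f"{xor_checksum(card):02X}H")         # 08H
```

The point to notice is that the XOR is taken over the five bytes that the ten characters spell out (62, E3, 08, 6C, ED), not over the ASCII codes of the characters themselves.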

If I apply this to my tags, I come unstuck. For instance:
output data: 48 48 48 48 48 48 48 49 57 54 57 55
ASCII equivalent: HHHHHHHIWTWU
CHECKSUM: won't calculate

In Excel, I'm doing the checksum calculation via a chain of pair-wise XOR calculations using formulae like this:
=DEC2HEX(BITXOR(HEX2DEC(C4),HEX2DEC(D4)))
where for instance the values in C4 and D4 are both HH. (If the values are as in the datasheet example, this works fine.)

If I do the calculations on the output values (e.g. where C4 and D4 are both 48) and leave out the HEX2DEC conversion, I arrive at a checksum that makes sense for my original question, in that the checksum is the same for both correct readings and mis-readings of the same tag.

However, that's not the calculation the datasheet indicates, so I have no faith in it.

Please can someone un-confuse me?

output data: 48 48 48 48 48 48 48 49 57 54 57 55
ASCII equivalent: HHHHHHHIWTWU

Wrong. The ASCII equivalent is 000000019697 (your code prints decimal values, not hex), and in hex 00 ^ 00 ^ 00 ^ 01 ^ 96 = 97.

Thanks again, I’m convinced by that, but could you explain a little more fully? How did you get 000000019697?

How did you get 000000019697?

000000019697 is just a string. The first character '0' has ASCII code 0x30, or 48 in decimal. The example in the datasheet shows the output in hexadecimal, while your code is printing it in decimal. That code is yours; I don't know why you print the received bytes in decimal rather than in hexadecimal or as ASCII characters.
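To make the conversion concrete, here is a rough Python sketch that takes one of the 14-byte lines exactly as posted in this thread (decimal values separated by spaces), turns it back into the ASCII string, and checks the trailing two-character XOR checksum. `decode_frame` is a hypothetical helper written for this post, and it assumes the frame is always 2, ten ID characters, two checksum characters, 3.

```python
def decode_frame(decimal_bytes):
    """Return (card_id, checksum_ok) for a line of 14 space-separated decimals."""
    values = [int(v) for v in decimal_bytes.split()]
    if len(values) != 14 or values[0] != 2 or values[-1] != 3:
        raise ValueError("expected 14 bytes framed by 2 (start) and 3 (end)")
    text = bytes(values[1:13]).decode("ascii")    # e.g. '000000019697'
    card, received = text[:10], int(text[10:], 16)
    calculated = 0
    for b in bytes.fromhex(card):                 # five bytes from ten hex digits
        calculated ^= b
    return card, calculated == received


correct = decode_frame("2 48 48 48 48 48 48 48 49 57 54 57 55 3")
misread = decode_frame("2 48 48 48 48 48 48 48 49 57 48 57 49 3")
print(correct)   # ('0000000196', True)
print(misread)   # ('0000000190', True)
```

Both the correct read and the mis-read from the first example carry a valid checksum. That fits the suggestion earlier in the thread that only errors which still satisfy the checks are ever reported.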

Aaaaaah, I see! Wish I had figured that out myself, but truly grateful for your insight.

I decided to print/store the ID string in decimal rather than hex because it made the ID easier to read, and to sort data lines in Excel based on IDs. I honestly never considered ASCII output.

Cheers.