Stolen Intellectual Property

There are only a finite number of encoding schemes in normal use for these IR devices, for example NEC, Sony and JVC. There cannot be too many surprises because the manufacturers have to work with detectors available on the market, e.g. the Vishay TSOP range.
I guess the ChungHop first tries to identify the encoding scheme by looking at things like the carrier frequency, the length of the header burst, the format of a bit (m microseconds mark, n microseconds space), the number of bits transmitted in a block, any check digit or redundant transmission scheme in use, the repetition code etc. It will do this by performing a series of tries until it gets a match with a known scheme.
Once it has got so far, then it can store the code it has learned from the source device in a relatively simple structure.
I cannot believe that it could work by a brute force recording of every nuance in the IR spectrum during the learning phase and spitting that back at the target device. Of course, matching against known schemes would mean that if, say, a new TV manufacturer came along with a completely new proprietary IR encoding scheme, the ChungHop would be unable to cope without a software update.

So I don’t think you have to do a complex reverse engineering task (unless that itself is really your goal here). You simply have to make an inventory of existing coding schemes and work out how to identify one in use so you can then record and later replay the data block.

What's the point in this? There were applications that did this even back in the PalmOS days, and they were free. ETH Zürich had a database of all IR protocols, when it was top notch. Nowadays virtually all IR controls use RC5, even the transmission frequency is standardised, so where's the point?

You may also be able to use the technique described here: https://www.instructables.com/id/Clone-a-Remote-with-Arduino/
It is very simple but requires some storage space. You'd only have to identify the carrier frequency, then record the lengths of the marks (uS) and spaces (uS) alternately in storage, e.g. 9000, 4500, 562, 562, 562, 1687, ...
To replay, you would simply switch the carrier on and off (at the correct frequency) for the appropriate time intervals. If you are not attempting to validate the code during recording, or clean it up, this may work for you.
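
A minimal sketch of the record-and-replay idea above, written in Python for readability rather than as Arduino code. It assumes the recording is a flat list of alternating mark and space durations in microseconds, starting with a mark, and that the carrier is 38 kHz (an assumption made here for illustration; the real value would come from measurement). The name `replay_schedule` is hypothetical. It only works out what a replay loop would have to do and does not drive any hardware.

```python
def replay_schedule(durations_us, carrier_hz=38_000):
    """Turn alternating mark/space durations into replay steps.

    Each step is (carrier_on, duration_us, carrier_cycles). Marks are the
    even-indexed entries (carrier on), spaces the odd-indexed ones.
    """
    steps = []
    for i, duration in enumerate(durations_us):
        carrier_on = (i % 2 == 0)
        # Nearest whole number of carrier periods in a mark; spaces emit none.
        cycles = round(duration * carrier_hz / 1_000_000) if carrier_on else 0
        steps.append((carrier_on, duration, cycles))
    return steps


recorded = [9000, 4500, 562, 562, 562, 1687]
for step in replay_schedule(recorded):
    print(step)
```

On a microcontroller the same list would be walked with the carrier output enabled for each mark and disabled for each space.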

Take a look for example at the NEC code specification to see how it would work: https://www.ad-notam.com/download/RS232/ad_notam_IR_protocol_DFU.pdf

Incidentally, I wrote a simple parser for NEC code which may help you get started if you go the route of attempting to identify the exact protocol in use: "Lightweight Arduino IR library for NEC remote control devices" (Exhibition / Gallery, Arduino Forum). Alternatively, search for "Arduino IR library" for more comprehensive solutions.
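
To give a feel for what identifying a protocol from raw timings involves, here is a minimal Python sketch (not the library mentioned above). It assumes standard NEC timing: a 9000 uS header mark, a 4500 uS header space, then bits made of a 562 uS mark followed by a 562 uS space (0) or a 1687 uS space (1). The 25% tolerance and the name `decode_nec_bits` are assumptions for illustration.

```python
def decode_nec_bits(durations_us, tolerance=0.25):
    """Decode bits from alternating mark/space durations (microseconds).

    Expects the NEC header (9000 mark, 4500 space) followed by bit pairs.
    Returns a list of 0/1 values, or None if the header does not match.
    """
    def near(value, target):
        return abs(value - target) <= target * tolerance

    if len(durations_us) < 2:
        return None
    if not (near(durations_us[0], 9000) and near(durations_us[1], 4500)):
        return None

    bits = []
    body = durations_us[2:]
    # Walk the body two entries at a time: (mark, space).
    for mark, space in zip(body[0::2], body[1::2]):
        if not near(mark, 562):
            break  # not a bit mark, stop decoding
        if near(space, 562):
            bits.append(0)
        elif near(space, 1687):
            bits.append(1)
        else:
            break  # space is neither a 0 nor a 1
    return bits


print(decode_nec_bits([9000, 4500, 562, 562, 562, 1687]))  # [0, 1]
```

A full decoder would also check the bit count and the inverted address and command bytes that NEC transmits. This sketch stops at classifying the individual bits.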

Your schematic omits the IR receiver, but this would have to be a basic phototransistor or photodiode. You could probably not use a TSOP device, as these are usually optimized for a specific carrier frequency and protocol. There are simple methods for detecting frequency, especially as there are only a small number of discrete frequencies in use.

Thanks to those who responded to my original post. I appreciate and respect your comments, but I'm afraid you missed the point. I've worked with infrared remote controls for sixteen years and understand them well. Over the years I have designed and built systems for capturing IR signals. Two were just for measuring the precise waveform of the signal, and one was for determining the protocol and encoded data of the signal. Chunghop doesn't try to identify an IR signal, and doesn't give a hoot about that. It simply mimics what it sees, and with incredible accuracy.

In September I built a special test platform that included a Chunghop with 11 learnable buttons, a programmer that could read and write the EEPROM chip used by Chunghop, a signal generator that could produce IR signals of virtually any carrier frequency with a stream of PWM data of any length and description I could imagine, and an 8PDT switch for switching the EEPROM back and forth between Chunghop and programmer.

The usual test procedure was as follows: Create an IR signal with the signal generator, record it to verify the details, transfer the EEPROM to the programmer and clear the entire EEPROM, read the EEPROM to verify it holds no data, transfer the EEPROM back to Chunghop and learn the test signal from the signal generator, play it back from Chunghop and compare its recording with the original, transfer the EEPROM to the programmer and read and save the entire memory dump.

Sounds tedious, and it is. But with this approach I was able to show that Chunghop didn't hesitate to copy and reproduce the most non-protocol signals I could think of. Examples of signals that were reproduced exactly are: Sony12 signals with carrier frequencies of 40KHz, 60KHz, 80KHz, and even 160KHz; a signal consisting of 60 data pairs, each 600uS on, 600uS off; signals consisting of just a single on-burst; a Sony12 signal with two and three times the normal number of data bits; signals with carrier frequency as low as 10KHz; and on and on. For more detailed study of the EEPROM data, just one parameter of a signal would be modified and the result compared with the unmodified version. After many hours of poring over the EEPROM printouts I was able to find the one word in the standard 64-word chunk of data that defined carrier frequency. And that's all. I have given up the quest, and no, you can't just buy the 009 processor chip to get this technology because it wouldn't have the burned-in proprietary OTP program that performs this wonderful feat.

Can you post a few samples of the EEPROM dumps you obtained as attached files here?
For each, state the carrier frequency and the test pattern used, say 600us on, 600us off repeated 10 times. Also indicate where in the dump the carrier frequency is encoded.

Here are a few assorted kinds of data.

OK. It looks like you have been working quite systematically.
So for example, in file:
Chunghop copy of sixty pulse signal_4 .JPG
Where and how is the carrier frequency encoded in the data dump? You implied in your previous post that you had succeeded in getting that far.


EDIT


In the meantime, I have looked a bit further at it.

Focusing on the 60 pulse signal dumps only (the 2 Sony files do not match each other), 2 things are noteworthy:

  1. The amount of data used to represent the captured signal is very small, assuming that the screen dumps contain all the data, i.e. that shown in the address range 0x1000 to 0x130C.

  2. That the scope output from the captured data is significantly cleaner than the input data.

To point 1.
Obviously the data has been compressed in some way. For that simple pattern that you have applied, the obvious compression technique would be, during the learn phase, to hold the last mark/space pair in a buffer and record the number of repeats of that mark/space pair. As soon as something comes in which deviates from that scheme, then write the mark/space pair from the buffer into storage together with the repetition factor and carry on. For two or three repetitions, it is probably not worth using the repetition format, so the mark/space pairs would be written explicitly.
So, I am guessing that the carrier frequency, then groups consisting of a mark/space pair together with the repetition factor, are written to storage. In your case of 60 repetitions of a 600uS mark and 600uS space, it would mean recording the carrier frequency, that is some representation of 40168 Hz, a recorded mark (623uS) of 25 carrier periods, a recorded space (557uS) of 22 carrier periods and a repetition factor of 60.
To test this,

  1. You'd have to repeat the experiment with say 10, 20, 100, 255 etc. repetitions to see what changed.
  2. Set up an experiment to subvert any sort of compression. Say send the first mark of 600us , a space of 700uS, a mark of 800uS, a space of 900uS etc. and see how it is stored.

Naturally, there are different ways of encoding data for storage (little/big endian, MSB/LSB First etc.) and it is not impossible that the data is obfuscated making it more difficult to find what you expect to see. If you know how the data is encoded and can also alter the data in the flash memory, then you can easily test any theories about how the data is represented.
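
The repetition-factor idea under point 1 can be sketched as follows. This is only a model of the guess, not the Chunghop's actual storage format, which is unknown. It assumes durations are converted to whole carrier periods and that a pair joins the current group if both values are within one carrier period of the group's first pair. The function names and the tolerance are hypothetical.

```python
def to_periods(duration_us, carrier_hz):
    """Express a duration as the nearest whole number of carrier periods."""
    return round(duration_us * carrier_hz / 1_000_000)


def compress_pairs(durations_us, carrier_hz, tolerance_periods=1):
    """Run-length encode (mark, space) pairs measured in carrier periods.

    Returns a list of [mark_periods, space_periods, repeat_count] groups.
    A pair joins the current group when both values are within
    tolerance_periods of the group's first pair.
    """
    groups = []
    for mark_us, space_us in zip(durations_us[0::2], durations_us[1::2]):
        mark = to_periods(mark_us, carrier_hz)
        space = to_periods(space_us, carrier_hz)
        if groups:
            last = groups[-1]
            if (abs(last[0] - mark) <= tolerance_periods
                    and abs(last[1] - space) <= tolerance_periods):
                last[2] += 1
                continue
        groups.append([mark, space, 1])
    return groups


# 60 repetitions of the recorded 623 uS mark / 557 uS space at 40168 Hz.
print(compress_pairs([623, 557] * 60, 40168))  # [[25, 22, 60]]
```

The sketch does not model the suggestion that two or three repetitions might be written out explicitly. A ramp such as 600, 700, 800, 900 uS produces one group per pair, which is the kind of input the second test above is meant to exercise.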

To point 2.
This supports the idea that the data has been cleaned up/compressed according to the scheme above. Of course there will also be resolution issues. The IR generator has a resolution of say +/- 4uS, the scope will have a resolution of +/- X uS and the recording and playback of the Chunghop will have a resolution of +/- X uS. That is probably why the same numbers keep appearing in the scope output. If everything has a very high resolution say in uS, then there would have been a much wider variety of numbers appearing in the scope output. Further, if the Chunghop does actually do compression as outlined above, then it would have to apply some tolerance when identifying repeating groups. I could guess it has a resolution of 1 or 2 carrier periods.

OLDokasional:
Here are a few assorted kinds of data.

Do you have an IR Scope output of the Sony12 POWER pattern please.

Here are some more.

When looking at an EEPROM data dump, notice that AAh indicates the start of 64 words of learned signal data. The processor clock frequency (4 MHz) divided by the decimal value of the second word gives the carrier frequency. For example, 4 MHz / 99 (63h) = 40,404 Hz. That's a 1% error relative to 40,000 Hz. Try as I might I could never get Chunghop to give that as "64h". And I can't explain why my recording device (an IR Widget) might give a carrier frequency like 40,086 or 40,197 for the actual transmission of the Sony12 signal, other than its own limitations in timing measurements.

Timing errors... Is that 4 MHz you mention really 4 MHz? Maybe it is closer to 3.97 MHz? That'd explain the value there and then. For the MCU it doesn't matter as long as that 3.97 MHz itself is a stable value, it is calibrated out as you do the learning process.
The Sony12 likewise. As long as the frequency is within tolerance it'll be detected just fine, and a just slightly wider tolerance may already give a much cheaper product.
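
The divisor arithmetic discussed in the last two posts can be checked with a few lines of Python. The 4 MHz clock and the 63h value come from the posts above, and the 3.97 MHz figure is the guess just made. The function name `carrier_from_divisor` is hypothetical.

```python
CLOCK_HZ = 4_000_000  # nominal processor clock as stated in the post


def carrier_from_divisor(divisor, clock_hz=CLOCK_HZ):
    """Carrier frequency implied by the second word of a learned block."""
    return clock_hz / divisor


print(round(carrier_from_divisor(0x63)))             # 40404
print(round(carrier_from_divisor(0x64)))             # 40000
# Same stored value, but assuming the clock really runs at 3.97 MHz.
print(round(carrier_from_divisor(0x63, 3_970_000)))  # 40101
```

With a 3.97 MHz clock the stored 63h would correspond to about 40,101 Hz, which falls between the 40,086 and 40,197 Hz readings reported by the IR Widget. That is consistent with the clock-error suggestion but does not prove it.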

Incidentally, many people don't realize the IR receiver in a TV or DVR or whatever is only interested in the demodulated signal. It couldn't care less what the actual carrier frequency is and has no provision for measuring it. That's just the horse it rode in on. Of course the narrow bandpass filter in the receiver chip DOES care, and it will attenuate the signal if the carrier is offset from its center frequency. So carrier frequency error does affect the usable range of a remote control. One of these days I'll do some experiments to see how far from the TV an 80KHz Sony remote signal will work. If you check the data sheet for one of those receiver/de-modulator chips you'll find it attenuates the signal about 50% if the carrier is off-center about 10%.

That, as I understand it, would be illustrated by fig. 5 in the following diagram:

I've spent a few more minutes looking over the results and have been looking for patterns. I'd start with some assumptions, then devise test data (that is, generate IR signals for the device to record) to test or break those assumptions. Based on the "60 pulse data" tests, I've got this as a rough first iteration (see attachment).

ChungHop_01.pdf (691 KB)

Sorry to say for me the data is not concise enough to try and make sense of the ROM dump. Along with the ROM dumps you also need the IR data (like in the "Chunghop copy of sixty pulse signal_4 .JPG" image) that the dump refers to.

I wonder if any value with bit 7 set is taken as a command so 0xAA, 0xAF, 0xC1 are block start/end markers.

I believe also that the experimental data presented is too comprehensive and much simpler test cases should be run initially (if not already done, that is) to attempt to understand how the data is encoded in the eeprom. Having established how the carrier frequency is encoded, by systematically varying this one parameter as he has done, the OP could have gone on to understand how the first mark is encoded. For protocols like NEC, incidentally, this is around 9mS.
For this, the tests would probably look like this:

  1. Send only a single mark but vary the length and see which byte is changed and how.
  2. As above but see what happens when the burst is so long that the byte would overflow to see if it invokes a mechanism to concatenate long bursts.
  3. As 1. above, but with a different carrier frequency to see if the burst length is represented in terms of the carrier frequency or in terms of some internal clock controlled by the MCU. Obviously, if the byte does not change when the carrier frequency is changed, then an internal clock is used.

Then continue by sending a space and a second mark, but maybe with lengths different to the first one to prevent any special handling of repeating values that device may invoke.

I can't imagine that the device does more than identifying a carrier frequency and identifying a series of time periods for which the carrier is switched on and off. For example, I would have difficulty believing that if suddenly a manufacturer introduced an IR protocol based on frequency shift keying using 2 carriers, that this device would simply handle all that.

I think this is all a very interesting exercise in reverse engineering and I hope the OP doesn't give up. He has certainly created a well structured test methodology and lab. Naturally, to build a near replica of the ChungHop, it is not necessary to understand all the details of the existing encoding scheme. Just invent a new one with more or less the same capabilities.

Does comparing these two tests side-by-side reveal anything?

Here's a 100 bit signal.

Memory after learning 100 bit Signal to POWER button.txt (11.8 KB)

Do you have anything to contradict this for the first 4 bytes:

0x1000: block marker (We've seen only 0xAA so far)
0x1001: carrier speed. Number of clock ticks (4 MHz) for one IR carrier period. (We've seen only 0x63 so far)
0x1002: Header mark. units of 40uS e.g. 0x3C ~= 2400us
0x1003: Header space. units of 40uS e.g. 0x0F ~= 600 uS

The first 2 have already been determined elsewhere.
No hex dump has been published here which has a carrier other than 40kHz.
If you can generate a few dumps like those outlined in post #18 we may get further.
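
The proposed reading of the first four bytes can be written down as a small Python sketch so that it is easy to test against new dumps. Everything here is the hypothesis stated above (marker 0xAA, clock divisor, header mark and space in 40 uS units), not a confirmed format. The name `parse_block_header` is hypothetical.

```python
def parse_block_header(block, clock_hz=4_000_000, unit_us=40):
    """Interpret the first four bytes of a learned block (hypothesis only)."""
    marker, divisor, header_mark, header_space = block[:4]
    return {
        "marker_ok": marker == 0xAA,
        "carrier_hz": round(clock_hz / divisor),
        "header_mark_us": header_mark * unit_us,
        "header_space_us": header_space * unit_us,
    }


header = parse_block_header(bytes([0xAA, 0x63, 0x3C, 0x0F]))
print(header)
# {'marker_ok': True, 'carrier_hz': 40404,
#  'header_mark_us': 2400, 'header_space_us': 600}
```

Under this reading a single byte could hold at most 255 x 40 = 10200 uS, so a dump of a signal with a longer first mark would show quickly whether the 40 uS unit holds up.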

Edit 1.

I believe that I may also be close to cracking the encoding scheme which makes the encoded data so compact. The findings are consistent with the data you've published so far for the Sony12 and at the 30/60 repeating groups. Since the data is so sparse (only very few bits set in the unique parts of the data), yet appears to hold so much information, there are only few encoding schemes which are possible. For example, this one allows 60 repetitions of the same pulse to be stored as 15 null bytes. (see attachment)

Edit 2.

The data blocks are protected with a modulo 256 checksum.

00000100: AA 63 0F 15 0F 00 15 00 - 04 AF 00 00 40 80 3F 3F  
00000110: 3F 08 C1 20 C1 40 80 04 - 4C A0 1F 60 20 C2 FF FF  
00000120: FF FF FF 00 00 00 00 00 - 00 00 00 00 00 00 00 00  
00000130: 00 00 00 00 00 00 00 00 - 00 00 00 00 3B FF FF FF

The last (non 0xFF) byte in this example is 0x3B and this is the check sum.
Add up all the preceding bytes in the data block (not including the 0x3B), write the sum in hex and discard everything except the last byte (that is, take the sum modulo 256), and you are also left with 0x3B.
This could be useful if any of the tests involve editing the eeprom data for subsequent use by the Chunghop.
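
The checksum can be verified against the dump above with a short Python sketch. It assumes the checksum covers the 60 bytes from the 0xAA marker up to, but not including, the 0x3B, which means the 0xFF bytes inside the block are part of the sum. The function names are hypothetical.

```python
DUMP = """
AA 63 0F 15 0F 00 15 00 04 AF 00 00 40 80 3F 3F
3F 08 C1 20 C1 40 80 04 4C A0 1F 60 20 C2 FF FF
FF FF FF 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 3B FF FF FF
"""


def checksum(payload):
    """Sum of all payload bytes, keeping only the lowest byte."""
    return sum(payload) % 256


def with_checksum(payload):
    """Return payload plus its checksum byte, e.g. after editing a block."""
    return bytes(payload) + bytes([checksum(payload)])


block = bytes.fromhex("".join(DUMP.split()))
payload, stored = block[:60], block[60]
print(hex(checksum(payload)), hex(stored))  # 0x3b 0x3b
```

After changing any byte in the payload, `with_checksum` gives the 61 bytes to write back. Whether the Chunghop rejects a block with a bad checksum has not been tested here.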

Chunghop IR bit stream encoding.pdf (512 KB)