Stolen Intellectual Property

You hear a lot these days about the Chinese stealing U.S. intellectual property. Most of the time when they “steal” it they don’t have to work at hacking or reverse engineering; the information is simply given to them as part of the conditions required to manufacture in China. This post is NOT intended to discuss the political or moral aspects of this practice, so don’t bother.

I’m looking for one or more team members who would like to put the shoe on the other foot, so to speak, by collaborating on a project to hack a Chinese product and publish the results in a prominent U.S. technical publication where many hobbyists can benefit from it. The device I have selected for the target is one of the simplest, most efficient, best performing designs I have ever seen, and at my age I’ve seen many. The product is called a “Learning Remote Control”. The Chinese company that makes it has many models, but the “crown jewel” in my opinion is the smallest model, shown in the attached schematic. It does everything the models with dozens of learning keys do, only less of it. Let me tell you why I think the design of this product is so commendable.

The entire circuit consists of an 8-bit CMOS processor with OTP program, a 24C16 EEPROM, two resistors, one capacitor, and two LEDs. That’s it! For the 16 pins of the processor, two are for power and ground, two are for the EEPROM, three are for the LEDs, six are for the keyboard matrix, and three are unused. Incredible! The internal oscillator is good enough to provide data (carrier frequency and burst patterns) with an accuracy of 1 or 2 percent. Somehow it gets enough current from a processor pin to drive the IR LED so that it has a control range of at least 37 feet (measured), and that’s with a battery supply of only 3 volts. Learning a signal from a TV remote control takes about 15 seconds: Press the Setup button, press the button to be learned, press the button that learns it, and press Setup again. The infrared emitter LED also serves as a fast optical detector for learning. I’ve learned signals of dozens of different protocols from a universal remote modified for crystal accuracy, and compared the learned signals with the originals to evaluate accuracy. Nearly all published learning circuits use a 3-pin demodulating detector chip. They really learn only the demodulated signal waveform, then modulate it at a fixed carrier frequency when reproducing it. Those same 3-pin chips are the front end of all TV remote control receivers. They don’t actually care what the carrier is, but they have a sharp band pass filter on it, so it makes a difference. The Chinese device actually measures and reproduces the exact carrier frequency from 10 KHz to 100 KHz.

I know how hard it can be to get even hex code from an OTP chip. A couple of years ago I tried taking advantage of the EEPROM to figure out how they stored data for a learned signal, thinking that might give some clues as to how the program worked. I made a test setup where I could switch the EEPROM back and forth from the learner to a device that would read out the EEPROM. I would then record several buttons at different carrier frequencies or other parameters and look at the stored data patterns. Hey, I never said this was an easy project. But if you’re curious to learn how the Chinese have done such a clever thing, and think it might be fun to try and find out, let me know. Of course I would expect to provide some hardware and a lot more information and data to anyone who wants to help. I also expect to write the published article, with due credit of course.

Incidentally, I tried for two years to contact someone at the manufacturer to persuade them to market the OTP chip in the U.S. because there is no ASIC on the market for this. No luck, not even an answer to any of my emails. They need not be afraid someone will use it to compete with them. They have a virtual monopoly on learning remotes. I buy the little unit shown in the schematic for less than $4.

I can’t see my schematic attachment, so I hope it is included.

Tommy Tyler

Who would work on technology of days gone by?

You have the schematic, what's stopping you from simply building it?

The ad009-03 chips are sold at prices of about USD 0.25 a piece, the 24C16 for even less. Lots of sellers of this part on Alibaba. Start importing and market your device, don't bother reverse engineering and building your own as you're never going to match let alone beat that price.

There are only a finite number of encoding schemes in normal use for these IR devices. For example NEC, Sony, JVC etc. There cannot be too many surprises because the manufacturers have to work with detectors available on the market e.g. the Vishay TSOP range.
I guess the ChungHop first tries to identify the encoding scheme by looking at things like the carrier frequency, the length of the header burst, the format of a bit (m microseconds mark, n microseconds space), the number of bits transmitted in a block, any check digit or redundant transmission scheme in use, the repetition code etc. It would do this by making a series of tries until it gets a match with a known scheme.
Once it has got so far, then it can store the code it has learned from the source device in a relatively simple structure.
I cannot believe that it could work by a brute force recording of every nuance in the IR spectrum during the learning phase and spitting that back at the target device. Of course that would mean that if say a new TV manufacturer came along with a completely new proprietary IR encoding scheme, the ChungHop would be unable to cope without a software update.

So I don’t think you have to do a complex reverse engineering task (unless that itself is really your goal here). You simply have to make an inventory of existing coding schemes and work out how to identify one in use so you can then record and later replay the data block.
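A minimal sketch of what such an inventory and matching step might look like, assuming the scheme can be recognised from the carrier and the first mark/space pair alone. The header timings are the commonly published nominal values for NEC, JVC and Sony; the table layout, function names and the 10% tolerance are illustrative assumptions, not anything known about the ChungHop firmware.

```python
# Nominal carrier (Hz) and leader mark/space (microseconds) per scheme.
# Commonly published values; real remotes drift by a few percent.
KNOWN_HEADERS = {
    "NEC":  {"carrier_hz": 38000, "mark_us": 9000, "space_us": 4500},
    "JVC":  {"carrier_hz": 38000, "mark_us": 8400, "space_us": 4200},
    "Sony": {"carrier_hz": 40000, "mark_us": 2400, "space_us": 600},
}


def relative_error(measured, nominal):
    """Absolute deviation from nominal, as a fraction of nominal."""
    return abs(measured - nominal) / nominal


def identify(carrier_hz, timings_us, tolerance=0.10):
    """Return the best-matching scheme name, or None if nothing fits.

    timings_us alternates mark, space, mark, space... starting with the
    header mark. The tolerance value is an assumption for illustration.
    """
    if len(timings_us) < 2:
        return None
    best_name, best_error = None, None
    for name, spec in KNOWN_HEADERS.items():
        worst = max(
            relative_error(carrier_hz, spec["carrier_hz"]),
            relative_error(timings_us[0], spec["mark_us"]),
            relative_error(timings_us[1], spec["space_us"]),
        )
        if worst <= tolerance and (best_error is None or worst < best_error):
            best_name, best_error = name, worst
    return best_name
```

A signal that matches nothing in the table, such as a plain train of 600 µs marks and spaces, returns None. That is exactly the case where an identify-first learner would fail and a raw recorder would not.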

What's the point in this? There were applications that did this even for PalmOS back in the day, and they were free. ETH Zürich had a database of all IR protocols when this was top notch. Nowadays virtually all IR controls use RC5, and even the transmission frequency is standardised, so where's the point?

You may also be able to use a technique described here.
It is very simple but requires some storage space. You'd only have to identify the carrier frequency, then record the lengths of the marks (uS) and spaces (uS) alternately in storage e.g. 9000,4500,562,562,562,1687...
To replay, you would simply switch the carrier on and off (at the correct frequency) for the appropriate time intervals. If you are not attempting to validate the code during recording, or clean it up, this may work for you.
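As a rough sketch of the replay side, the stored durations can be turned into a list of "carrier on for N cycles / carrier off for N cycles" steps. The function name and the (is_mark, cycles) representation are assumptions made for illustration; a real transmitter would drive a timer or PWM output from these counts.

```python
def to_carrier_cycles(carrier_hz, timings_us):
    """Convert alternating mark/space durations (microseconds) into steps.

    Returns a list of (is_mark, cycles) tuples: entries at even positions
    are marks (carrier on), entries at odd positions are spaces (off).
    """
    steps = []
    for index, duration_us in enumerate(timings_us):
        is_mark = index % 2 == 0
        # Number of whole carrier periods closest to the recorded duration.
        cycles = round(duration_us * carrier_hz / 1_000_000)
        steps.append((is_mark, cycles))
    return steps


# The example values quoted above, replayed on a 38 kHz carrier.
steps = to_carrier_cycles(38000, [9000, 4500, 562, 562, 562, 1687])
```

Storing cycle counts instead of microseconds also shrinks the data: at 38 kHz the 562 µs intervals become 21 and fit in a single byte, while the 9000 µs header (342 cycles) does not.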

Take a look for example at the NEC code specification to see how it would work:

Incidentally, I wrote a simple parser for NEC code which may help you get started if you go the route of attempting to identify the exact protocol in use: Lightweight Arduino IR library for NEC remote control devices - Exhibition / Gallery - Arduino Forum or google for Arduino IR library for more comprehensive solutions.
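For orientation, here is a small stand-alone sketch of NEC decoding from a raw mark/space list. It is not the Arduino library mentioned above. It assumes the usual NEC layout (9000 µs / 4500 µs header, 32 bits sent LSB first, a 562 µs mark per bit with a short space for 0 and a long space for 1) and a 25% timing tolerance chosen for illustration. Only the command byte is checked against its inverse, since extended NEC reuses the inverted-address byte.

```python
def encode_nec(address, command):
    """Build the raw mark/space list (microseconds) for one NEC frame."""
    timings = [9000, 4500]
    for byte in (address, address ^ 0xFF, command, command ^ 0xFF):
        for n in range(8):                      # LSB first
            bit = (byte >> n) & 1
            timings += [562, 1687 if bit else 562]
    timings.append(562)                         # trailing stop mark
    return timings


def decode_nec(timings_us, tolerance=0.25):
    """Return (address, command) or None if the list is not a valid frame."""
    def near(value, nominal):
        return abs(value - nominal) <= nominal * tolerance

    if len(timings_us) < 2 + 64 + 1:
        return None
    if not (near(timings_us[0], 9000) and near(timings_us[1], 4500)):
        return None
    bits = []
    for i in range(32):
        mark, space = timings_us[2 + 2 * i], timings_us[3 + 2 * i]
        if not near(mark, 562):
            return None
        if near(space, 562):
            bits.append(0)
        elif near(space, 1687):
            bits.append(1)
        else:
            return None
    # Reassemble four bytes, least significant bit first.
    data = [sum(bit << n for n, bit in enumerate(bits[b * 8:(b + 1) * 8]))
            for b in range(4)]
    address, _address_inv, command, command_inv = data
    if command ^ command_inv != 0xFF:
        return None
    return address, command
```

Round-tripping a frame through encode_nec and decode_nec is a quick way to check the parser before pointing it at real captures.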

Your schematic omits the IR receiver, but this would have to be a basic photo transistor or photo diode. You probably could not use a TSOP device, as these are usually optimized for a specific carrier frequency and protocol. There are simple methods for detecting frequency, especially as there are only a small number of discrete frequencies in use.

Thanks to those who responded to my original post. I appreciate and respect your comments, but I'm afraid you missed the point. I've worked with infrared remote controls for sixteen years and understand them well. Over the years I have designed and built systems for capturing IR signals. Two were just for measuring the precise waveform of the signal, and one was for determining the protocol and encoded data of the signal. Chunghop doesn't try to identify an IR signal, and doesn't give a hoot about that. It simply mimics what it sees, and with incredible accuracy.

In September I built a special test platform that included a Chunghop with 11 learnable buttons, a programmer that could read and write the EEPROM chip used by Chunghop, a signal generator that could produce IR signals of virtually any carrier frequency with a stream of PWM data of any length and description I could imagine, and an 8PDT switch for switching the EEPROM back and forth between Chunghop and programmer. The usual test procedure was as follows: Create an IR signal with the signal generator, record it to verify the details, transfer the EEPROM to the programmer and clear the entire EEPROM, read the EEPROM to verify it holds no data, transfer the EEPROM back to Chunghop and learn the test signal from the signal generator, play it back from Chunghop and compare its recording with the original, transfer the EEPROM to the programmer and read and save the entire memory dump. Sounds tedious, and it is. But with this approach I was able to show that Chunghop didn't hesitate to copy and reproduce the most non-protocol signals I could think of. Examples of signals that were reproduced exactly are: Sony12 signals with carrier frequencies of 40KHz, 60KHz, 80KHz, and even 160KHz; a signal consisting of 60 data pairs, each 600uS on, 600uS off; signals consisting of just a single on-burst; a Sony12 signal with two and three times the normal number of data bits; signals with carrier frequency as low as 10KHz; and on and on. For more detailed study of the EEPROM data just one parameter of a signal would be modified and compared with the unmodified version. After many hours of poring over the EEPROM printouts I was able to find the one word in the standard 64-word chunk of data that defined carrier frequency. And that's all. I have given up the quest, and no, you can't just buy the 009 processor chip to get this technology because it wouldn't have the burned-in proprietary OTP program that performs this wonderful feat.
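The "change one parameter and compare" step lends itself to a small script instead of printouts. A minimal sketch, assuming the two dumps have been saved as equal-length byte strings; the function name and the sample bytes are made up for illustration and are not taken from any real Chunghop dump.

```python
def diff_dumps(baseline, modified):
    """Return (offset, old_byte, new_byte) for every byte that differs."""
    if len(baseline) != len(modified):
        raise ValueError("dumps must be the same length")
    return [
        (offset, old, new)
        for offset, (old, new) in enumerate(zip(baseline, modified))
        if old != new
    ]


# Hypothetical dumps: the same test signal learned twice with only one
# parameter changed between runs. The byte values are invented.
baseline = bytes([0x10, 0x20, 0x30, 0x40])
modified = bytes([0x10, 0x2A, 0x30, 0x40])
changes = diff_dumps(baseline, modified)
```

If a single-parameter change moves exactly one offset, that offset is a strong candidate for where the parameter lives. If many offsets move, the parameter probably feeds into how the rest of the data is scaled or packed.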

Can you post a few samples of the eeprom dumps you obtained as attached files here.
For each, state the carrier frequency and the test pattern used, say 600us on, 600us off repeated 10 times. Also indicate where in the dump the carrier frequency is encoded.

Here are a few assorted kinds of data.

OK. It looks like you have been working quite systematically.
So for example, in file:
Chunghop copy of sixty pulse signal_4 .JPG
Where and how is the carrier frequency encoded in the data dump ? You implied in your previous post that you had succeeded in getting that far.


In the mean time, I have looked a bit further at it.

Focusing on the 60 pulse signal dumps only (the 2 Sony files do not match each other), 2 things are noteworthy:

  1. The amount of data used to represent the captured signal is very small, assuming that the screen dumps contain all the data, i.e. that shown in the address range 0x1000 to 0x130C.

  2. That the scope output from the captured data is significantly cleaner than the input data.

To point 1.
Obviously the data has been compressed in some way. For that simple pattern that you have applied, the obvious compression technique would be, during the learn phase, to hold the last mark/space pair in a buffer and record the number of repeats of that mark/space pair. As soon as something comes in which deviates from that scheme, then write the mark/space pair from the buffer into storage together with the repetition factor and carry on. For two or three repetitions, it is probably not worth using the repetition format, so the mark/space pairs would be written explicitly.
So, I am guessing that the carrier frequency, then groups consisting of a mark/space pair together with the repetition factor are written to storage. In your case of 60 repetitions of a 600uS mark and 600uS space, it would mean recording the carrier frequency, that is some representation of 40168 Hz, a recorded mark (623uS) of 25 carrier periods, a recorded space (557uS) of 22 carrier periods and a repetition factor of 60.
To test this,

  1. You'd have to repeat the experiment with say 10, 20, 100, 255 etc. repetitions to see what changed.
  2. Set up an experiment to subvert any sort of compression. Say send the first mark of 600us , a space of 700uS, a mark of 800uS, a space of 900uS etc. and see how it is stored.
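The compression guessed at under point 1 can be sketched in a few lines. This is purely a model of the hypothesis, not the device's actual format: the run layout [mark, space, repeat], the tolerance of one carrier period, and the function names are all assumptions. Unlike the guess above, it does not treat short runs of two or three repeats specially.

```python
def to_periods(duration_us, carrier_hz):
    """Express a duration in whole carrier periods (nearest count)."""
    return round(duration_us * carrier_hz / 1_000_000)


def compress_pairs(pairs, tolerance=1):
    """Run-length encode (mark, space) pairs given in carrier periods.

    A pair extends the current run if both values are within `tolerance`
    periods of the run's first pair. Returns [mark, space, repeat] lists.
    """
    runs = []
    for mark, space in pairs:
        if runs:
            last_mark, last_space, count = runs[-1]
            if (abs(mark - last_mark) <= tolerance
                    and abs(space - last_space) <= tolerance):
                runs[-1][2] = count + 1
                continue
        runs.append([mark, space, 1])
    return runs


def expand_runs(runs):
    """Inverse of compress_pairs, using each run's stored pair values."""
    return [(mark, space) for mark, space, count in runs for _ in range(count)]


# The sixty-pulse case: 623 us / 557 us measured at 40168 Hz.
pair = (to_periods(623, 40168), to_periods(557, 40168))
sixty = compress_pairs([pair] * 60)
```

Under this model the sixty-pulse signal collapses to a single run of (25, 22) repeated 60 times, while the ramp proposed in test 2 (600, 700, 800, 900 µs...) produces one run per pair and so should visibly inflate the stored data if the hypothesis holds.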

Naturally, there are different ways of encoding data for storage (little/big endian, MSB/LSB First etc.) and it is not impossible that the data is obfuscated making it more difficult to find what you expect to see. If you know how the data is encoded and can also alter the data in the flash memory, then you can easily test any theories about how the data is represented.

To point 2.
This supports the idea that the data has been cleaned up/compressed according to the scheme above. Of course there will also be resolution issues. The IR generator has a resolution of say +/- 4uS, the scope will have a resolution of +/- X uS and the recording and playback of the Chunghop will have a resolution of +/- X uS. That is probably why the same numbers keep appearing in the scope output. If everything has a very high resolution say in uS, then there would have been a much wider variety of numbers appearing in the scope output. Further, if the Chunghop does actually do compression as outlined above, then it would have to apply some tolerance when identifying repeating groups. I could guess it has a resolution of 1 or 2 carrier periods.

Do you have an IR Scope output of the Sony12 POWER pattern please.

Here are some more.

When looking at an EEPROM data dump, notice that AAh indicates the start of 64 words of learned signal data. The decimal value of the second word divided into the processor clock frequency (4 MHz) is the carrier frequency. For example, 4 MHz/ [63h = 99 decimal] = 40,404 Hz. That's a 1% error. Try as I might I could never get Chunghop to give that as "64h". And I can't explain why my recording device (an IR Widget) might give a carrier frequency like 40,086 or 40,197 for the actual transmission of the Sony12 signal, other than its own limitations in timing measurements.
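The arithmetic above, written out as a small sketch. The 4 MHz clock and the divisor interpretation come from the post; the function names are illustrative.

```python
CLOCK_HZ = 4_000_000  # processor clock as stated in the post


def carrier_from_word(divisor, clock_hz=CLOCK_HZ):
    """Carrier frequency implied by the stored divisor word."""
    return clock_hz / divisor


def nearest_divisor(carrier_hz, clock_hz=CLOCK_HZ):
    """Divisor that would best represent a given carrier frequency."""
    return round(clock_hz / carrier_hz)


def percent_error(actual_hz, wanted_hz):
    return 100.0 * (actual_hz - wanted_hz) / wanted_hz


stored = carrier_from_word(0x63)           # about 40404 Hz
error = percent_error(stored, 40_000)      # about +1 %
ideal = nearest_divisor(40_000)            # 100, i.e. 64h
```

One side observation from the same formula: a 10 kHz carrier would need a divisor of 400, which does not fit in 8 bits, so either the stored word is wider than a byte or low carriers are scaled differently. The dumps would have to settle that.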

Timing errors… Is that 4 MHz you mention really 4 MHz? Maybe it is closer to 3.97 MHz? That’d explain the value there and then. For the MCU it doesn’t matter as long as that 3.97 MHz itself is a stable value, it is calibrated out as you do the learning process.
The Sony12 likewise. As long as the frequency is within tolerance it’ll be detected just fine, and a just slightly wider tolerance may already give a much cheaper product.
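That guess is easy to check against the numbers already posted. A sketch, assuming the IR Widget readings of 40,086 Hz and 40,197 Hz were taken from Chunghop playback and that the playback carrier is simply the clock divided by the stored word 63h; both are assumptions, since the post does not say so explicitly.

```python
def implied_clock_hz(measured_carrier_hz, divisor):
    """Clock frequency that would produce the measured carrier."""
    return measured_carrier_hz * divisor


low = implied_clock_hz(40_086, 0x63)    # 3,968,514 Hz
high = implied_clock_hz(40_197, 0x63)   # 3,979,503 Hz
```

Both results land close to 3.97 MHz, under 1% below the nominal 4 MHz, which is in line with the 1 to 2 percent accuracy quoted for the internal oscillator at the top of the thread.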

Incidentally, many people don't realize the IR receiver in a TV or DVR or whatever is only interested in the demodulated signal. It couldn't care less what the actual carrier frequency is and has no provision for measuring it. That's just the horse it rode in on. Of course the narrow bandpass filter in the receiver chip DOES care, and it will attenuate the signal if the carrier is offset from its center frequency. So carrier frequency error does affect the usable range of a remote control. One of these days I'll do some experiments to see how far from the TV an 80KHz Sony remote signal will work. If you check the data sheet for one of those receiver/de-modulator chips you'll find it attenuates the signal about 50% if the carrier is off-center about 10%.

That, as I understand it, would be illustrated by fig. 5 in the following diagram:

I’ve spent a few more minutes looking over the results and have been looking for patterns. I’d start with some assumptions then devise test data (that is generate IR signals for the device to record) to test or break those assumptions. Based on the “60 pulse data” tests, I’ve got this as a rough first iteration (see attachment).

ChungHop_01.pdf (691 KB)

Sorry to say for me the data is not concise enough to try and make sense of the ROM dump. Along with the ROM dumps you also need the IR data (like in the "Chunghop copy of sixty pulse signal_4 .JPG" image) that the dump refers to.

I wonder if any value with bit 7 set is taken as a command so 0xAA, 0xAF, 0xC1 are block start/end markers.

I also believe that the experimental data presented is too comprehensive and that much simpler test cases should be run initially (if not already done, that is) to attempt to understand how the data is encoded in the EEPROM. Having established how the carrier frequency is encoded, by systematically varying this one parameter as he has done, the OP could have gone on to understand how the first mark is encoded. For protocols like NEC, incidentally, this is around 9 ms.
For this, the tests would probably look like this:

  1. Send only a single mark but vary the length and see which byte is changed and how.
  2. As above but see what happens when the burst is so long that the byte would overflow to see if it invokes a mechanism to concatenate long bursts.
  3. As 1. above, but with a different carrier frequency to see if the burst length is represented in terms of the carrier frequency or in terms of some internal clock controlled by the MCU. Obviously, if the byte does not change when the carrier frequency is changed, then an internal clock is used.
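To know what to look for in the dumps, it helps to predict the stored value under each hypothesis before running the tests. A sketch, assuming a mark is stored as a single count in one byte; the 8 µs tick in the second function is a made-up value standing in for whatever internal clock unit the MCU might use.

```python
def mark_in_carrier_periods(mark_us, carrier_hz):
    """Expected count if marks are stored in carrier periods."""
    return round(mark_us * carrier_hz / 1_000_000)


def mark_in_clock_ticks(mark_us, tick_us):
    """Expected count if marks are stored in fixed internal clock ticks."""
    return round(mark_us / tick_us)


def longest_mark_us(carrier_hz, max_count=255):
    """Longest mark a single byte could hold when counting carrier periods."""
    return max_count * 1_000_000 / carrier_hz


# Test 3: same 600 us mark, two carriers.
by_carrier = [mark_in_carrier_periods(600, hz) for hz in (40_000, 60_000)]
by_ticks = [mark_in_clock_ticks(600, 8) for _ in (40_000, 60_000)]

# Test 2: would an NEC-style 9 ms header at 38 kHz overflow one byte?
nec_header = mark_in_carrier_periods(9000, 38_000)
limit_40k = longest_mark_us(40_000)
```

If the stored byte moves from 24 to 36 when the carrier goes from 40 kHz to 60 kHz, the unit is carrier periods; if it stays put, it is an internal clock. A 9 ms header comes to 342 periods at 38 kHz, so under the one-byte assumption any mark longer than about 6.4 ms at 40 kHz must trigger some kind of extension mechanism.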

Then continue by sending a space and a second mark, but maybe with lengths different to the first one to prevent any special handling of repeating values that device may invoke.

I can't imagine that the device does more than identifying a carrier frequency and identifying a series of time periods for which the carrier is switched on and off. For example, I would have difficulty believing that if suddenly a manufacturer introduced an IR protocol based on frequency shift keying using 2 carriers, that this device would simply handle all that.

I think this is all a very interesting exercise in reverse engineering and I hope the OP doesn't give up. He has certainly created a well structured test methodology and lab. Naturally, to build a near replica of the ChungHop, it is not necessary to understand all the details of the existing encoding scheme. Just invent a new one with more or less the same capabilities.

Does comparing these two tests side-by-side reveal anything?