One key point of this article is that there are three different ways to interpret the same IR signal and turn it into a hex code. Understanding these three ways will allow you to get codes from different sources and understand them correctly.
The IR transmission of the code
When you press a button on a Sony remote control, an infrared signal is transmitted. This transmission consists of a 40kHz signal which is turned on and off in a particular pattern. Different buttons correspond to different codes, which cause the signal to be turned on and off in different patterns.
The following waveform shows the IR code transmitted for STOP on my Sony CD remote control (RM-DC335). When the signal is high, a 40 kHz IR signal is transmitted, and when the signal is low, nothing is transmitted. In other words, the signal is actually rapidly turning on and off when it appears to be on in the figure. (The IR receiver demodulated the signal, so you don't see the 40 kHz transitions.)
A Sony IR signal starts with 2400 microseconds on and 600 microseconds off; that's the first wide pulse.
A "1" bit is transmitted with 1200 microseconds on and 600 microseconds off, while a "0" bit is transmitted with 600 microseconds on and 600 microseconds off. (This encoding is also called SIRC or SIRCS encoding.)
You may notice the "on" parts of the waveform appear wider than the "off" parts, even when both are supposed to be 600 microseconds. This is an artifact of the IR receiver, which switches on faster than it switches off.
The above waveform represents one transmission of a 12-bit code. This transmission is normally repeated as long as the button is held down, and a minimum of three times. Each transmission starts 45ms after the previous one started. The Sony protocol also supports 15 and 20 bit codes, which are the same as above except with more bits.
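As a rough illustration, the timing rules above can be written out in a few lines of Python. This is only a sketch under the nominal timings given in the text; the function and constant names are made up for this example, and it does not model the 40 kHz carrier or the repeats.

```python
# Nominal Sony (SIRC) timings in microseconds, taken from the text above.
HEADER_ON = 2400
ONE_ON = 1200
ZERO_ON = 600
OFF = 600
FRAME_PERIOD = 45_000  # each transmission starts 45 ms after the previous one


def sony_frame(bits):
    """Return alternating on/off durations for one transmission.

    `bits` is a string of '0'/'1' characters in the order they are sent.
    The final "off" is stretched so the whole frame lasts FRAME_PERIOD.
    (Hypothetical helper, for illustration only.)
    """
    durations = [HEADER_ON, OFF]
    for bit in bits:
        durations += [ONE_ON if bit == "1" else ZERO_ON, OFF]
    # Replace the last 600 us "off" with the idle time up to the next frame.
    durations[-1] = FRAME_PERIOD - sum(durations[:-1])
    return durations


# The STOP pattern from the waveform, read left to right.
stop = sony_frame("000111010001")
print(stop[:6])   # header followed by the first two bits
print(sum(stop))  # total frame length in microseconds
```

The list alternates on, off, on, off, starting with the header pulse, which is the same shape an IR receiver would report after demodulation.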
For more information on the low-level transmission of Sony codes, see sbprojects.net.
Three ways to interpret the codes
The Sony encoding seems straightforward, but there are several different ways the signal can be interpreted. I will call these official decoding, bit decoding, and bit-1 decoding. Different sources use any of these three, which can cause confusion. I will explain these three decodings, using the previous waveform as an example.
Official decoding
The "official" Sony protocol views the 12-bit code as 7 command bits and 5 address or device bits, transmitted least-significant-bit first (i.e. "backwards"). The device bits specify the type of device receiving the code, and the command bits specify a particular command for this device. In this example, the device bits (yellow) are 10001 when read right-to-left, which is 17 decimal. The command bits (blue) are 0111000 when read right-to-left, which is 56 decimal. The device value 17 corresponds to a CD player, and the command 56 corresponds to the STOP command (details).
Sony 15 bit codes are similar, with 7 command bits and 8 device bits. Sony 20 bit codes have 7 command bits, 5 device bits, and 8 extended device bits.
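The official decoding can be sketched in Python as follows. This is an illustrative helper, not code from any Sony documentation or library: it assumes the bits are given as a string in transmission order, and the name official_decode is invented here.

```python
# Field widths in transmission order: command first, then device,
# then (for 20-bit codes) the extended device bits.
FIELD_WIDTHS = {12: (7, 5), 15: (7, 8), 20: (7, 5, 8)}


def official_decode(bits):
    """Split transmitted bits into (command, device[, extended]) values.

    Each field is sent least-significant-bit first, so it is reversed
    before being converted to a number.
    """
    fields = []
    pos = 0
    for width in FIELD_WIDTHS[len(bits)]:
        fields.append(int(bits[pos:pos + width][::-1], 2))
        pos += width
    return tuple(fields)


command, device = official_decode("000111010001")
print(device, command)  # device 17 (CD player), command 56 (STOP)
```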
Bit decoding
Many IR decoders just treat the signal as a sequence of bits, most-significant-bit first. I will call this bit decoding. Applying this interpretation to the above code, the code is interpreted as 0001 1101 0001 binary, or 1d1 hex, or 465 in decimal. Note that the last bit doesn't really consist of 1200 microseconds on and 600 microseconds off; it consists of 1200 microseconds on followed by a lot of time off. In other words, the transmission is off for 600 microseconds and then continues to be off until the next code is transmitted.
An alternative but equivalent interpretation is to view the code as a 2400 microsecond header, followed by 12 bits, where each bit is off then on (rather than on then off). A "1" bit is 600 microseconds off and 1200 microseconds on, while a "0" bit is 600 microseconds off and 600 microseconds on. This yields the same value as before (1d1 hex, 465 decimal), but avoids the special handling of the last bit.
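Here is a minimal Python sketch of bit decoding using the "off then on" view. The function name, the 900 microsecond threshold (halfway between 600 and 1200), and the header limits are assumptions chosen for illustration, not values from any particular decoder.

```python
def bit_decode(durations, nbits=12, threshold=900):
    """Decode alternating on/off durations (microseconds) into a number.

    durations[0] is the header "on" time; after it come (off, on) pairs,
    one pair per bit, most-significant bit first.
    """
    if len(durations) < 2 * nbits + 1:
        raise ValueError("not enough durations for %d bits" % nbits)
    if not 1800 <= durations[0] <= 3000:  # assumed tolerance around 2400
        raise ValueError("header pulse not found")
    value = 0
    for i in range(nbits):
        on_time = durations[2 + 2 * i]  # skip the "off" before each bit
        value = (value << 1) | int(on_time > threshold)
    return value


# Build ideal timings for the STOP waveform: header, then (off, on) pairs.
stop = [2400]
for bit in "000111010001":
    stop += [600, 1200 if bit == "1" else 600]

print(hex(bit_decode(stop)))  # 0x1d1
```

Because only the "on" times are examined, the long idle period after the last bit never needs special treatment.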
Bit-1 decoding
Many IR decoders drop the last bit, which I will call bit-1 decoding. Because the last bit doesn't end nicely with 600 microseconds off, some IR decoding algorithms treat the signal as 11 bits of data, ending with a final "on" pulse as a trailer (1200 microseconds in this example, since the dropped bit is a 1). In this interpretation, the above code is 000 1110 1000 binary, or 0e8 hex, or 232 decimal. (Note that doubling this value and adding 1 yields the previous decoding of 465.)
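The relationship between the two values is a one-bit shift, which can be sketched in Python. The helper names are invented for this example, and the conversion back assumes the dropped bit was a 1, as it is for the STOP code.

```python
def bit_to_bit1(code):
    """Drop the last transmitted bit (bit decoding -> bit-1 decoding)."""
    return code >> 1


def bit1_to_bit(code):
    """Restore a trailing 1 bit (bit-1 decoding -> bit decoding).

    Assumes the trailing pulse was a long one, i.e. the dropped bit was 1.
    """
    return (code << 1) | 1


print(hex(bit_to_bit1(0x1D1)))  # 0xe8
print(hex(bit1_to_bit(0x0E8)))  # 0x1d1
```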
Discussion
Is the code really 17/56 or 465 or 232? The official decoding is "right" in the sense that it is what the manufacturer intends. In addition, it reveals the internal structure of the code and the codes make more sense. For instance, the buttons 1-9 have consecutive codes with the official decoding, but not with the others. The other decodings are fine to use as long as you're consistent; the main thing is to understand that different sources use different decodings. My Arduino library uses the second bit decoding interpretation. The different decodings can be converted from one form to another with binary arithmetic.
Getting codes from a remote
Probably the easiest way to get the codes for your device is to use your existing remote control and see what codes it transmits. You can use my Arduino IR library to do the decoding, with the IRrecvDump demo program. Take a 3-pin IR decoder, hook it up to an Arduino, and then you can read the values for each button press on the serial port.
Alternatively, you can look at the transmitted codes with an oscilloscope. For the diagrams above, I used an IR receiver module connected to two resistors (to drop the voltage), connected to the line input of my PC. I used the Zeitnitz Soundcard Oscilloscope program to display the signal. This lets you see exactly what is being transmitted, but you will need to stare at the screen, write down a bunch of 0's and 1's, and convert the binary value to get your codes.
Getting codes from LIRC files
The best source for IR codes that I've found is the Linux Infrared Remote Control project (lirc.org), which has a huge collection of config files for various remotes. (LIRC also includes a large collection of device drivers for many types of IR input/output hardware, and a software library.)
The LIRC config file format is documented at WinLIRC, but I will walk through some examples.
Bit decoding
The RM-S530 LIRC file treats the codes as 12 bits long, using what I call bit decoding:
    begin remote
      name Sony
      bits 12
      flags SPACE_ENC|CONST_LENGTH
      eps 30
      aeps 100
      header 2470 557
      one 1243 556
      zero 644 556
      gap 45076
      min_repeat 2
      toggle_bit 0
      begin codes
        sleep 0x0000000000000061
        cd_stop 0x00000000000001D1
        ...
This file indicates that each entry is 12 bits long. A header consists of on for 2470 microseconds and off for 557 microseconds. A one bit consists of on for 1243 microseconds and off for 556 microseconds. A zero bit consists of on for 644 microseconds and off for 556 microseconds. Codes are repeated a minimum of 2 more times, with a gap of 45076 microseconds from start to start as the codes are constant length (CONST_LENGTH).
You may be wondering why these time values don't match the official values of 2400, 1200, and 600 microseconds. First, the LIRC data is generally measured from actual remotes, so the real-world timings don't quite match the theory. Second, IR sensors have some lag in detecting "on" and "off", and they typically stretch out the "on" time by ~100 microseconds, shortening the "off" time equally.
The LIRC file then lists the hex code associated with each button. For example the CD STOP code is hex 1D1, which is the same value as described earlier.
Bit-1 decoding
The LIRC file for the RM-D302 remote treats the codes as 11 bits and a trailing pulse. This is what I call the bit-1 decoding:
    begin remote
      name RM-D302
      bits 11
      flags SPACE_ENC|CONST_LENGTH
      eps 30
      aeps 100
      header 2367 638
      one 1166 637
      zero 565 637
      ptrail 1166
      gap 45101
      min_repeat 2
      toggle_bit 0
      begin codes
        cd_stop 0x00000000000000E8
The key differences from the previous file are the ptrail field, indicating a trailing pulse of 1166 microseconds, and the bit count of 11 instead of 12. Note that the code value is different from the first file, even though the IR transmission is exactly the same. The hex code 0E8 is the same as described earlier under bit-1 decoding.
Pre_data and post_data bits
Some LIRC files break apart the codes into constant pre_data bits, the code itself, and constant post_data bits. For instance, the file for my RM-D335 remote:
    begin remote
      name SONY
      bits 7
      flags SPACE_ENC
      eps 20
      aeps 0
      header 2563 520
      one 1274 520
      zero 671 520
      ptrail 1274
      post_data_bits 4
      post_data 0x8
      gap 25040
      min_repeat 2
      begin codes
        CONTINUE 0x000000000000005C
        ...
        STOP 0x000000000000000E
        ...
This file indicates that each code entry is 7 bits long, but there are also 4 post data bits. This means that after transmitting the 7 bits for the code, 4 additional bits are transmitted with the "post data" hex value 0x8, i.e. 1000 binary.
Putting this together the STOP button has the hex value 0E, which corresponds to the seven bits 000 1110. This is followed by four post data bits 1000, so the total transmission is the eleven bits 000 1110 1000, which is 0E8 hex, the same as before for the bit-1 decoding.
One more thing to notice in this LIRC file is that it doesn't have the CONST_LENGTH flag, and the gap is 25ms instead of 45ms. This indicates the gap is from the end of one code to the start of the next, rather than from the start of one code to the start of the next. Specifying a gap between codes isn't how Sony codes are actually defined, but it's close enough.
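The 25 ms versus 45 ms relationship can be checked with a little arithmetic. The following Python sketch adds up the timings from this file for the STOP code; the function name is invented for the example, and the bit string is the 7 code bits followed by the 4 post data bits.

```python
def frame_on_air_us(bits, header, one, zero, ptrail=0):
    """Time from the start of the header to the end of the trailing pulse.

    header, one and zero are (on, off) pairs in microseconds, as listed
    in a LIRC config file; bits is a string of '0'/'1' characters.
    """
    total = sum(header)
    for bit in bits:
        total += sum(one if bit == "1" else zero)
    return total + ptrail


stop_bits = "0001110" + "1000"  # STOP (0x0E, 7 bits) + post_data (0x8, 4 bits)
on_air = frame_on_air_us(stop_bits, header=(2563, 520),
                         one=(1274, 520), zero=(671, 520), ptrail=1274)
period = on_air + 25040  # add the end-to-start gap from the file

print(on_air)  # about 20 ms on the air
print(period)  # start-to-start time, close to 45 ms
```

Since a "1" bit takes longer than a "0" bit, the start-to-start time computed this way varies slightly from button to button, which is why a fixed end-to-start gap is only an approximation.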
LIRC summary
Note that these LIRC files all indicate exactly the same IR transmission; they just interpret it differently. The first file defines the STOP code as 1D1, the second as E8, and the third as 0E.
What does this mean to you? If you want to get a Sony code from a LIRC file and use it with the Arduino library, you need to have a 12 bit (or 15 or 20 bit) code to pass to the library (bit decoding). Look up the code in the file and extract the specified number of bits. If there are any pre_data or post_data bits, prepend or append them as appropriate. If the result is one bit short and the LIRC file has a ptrail value, append a 1 bit on the end to convert from bit-1 decoding to bit decoding. Convert the result to hex and you should have the proper code for your device, which can be used with the Arduino library.
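The conversion steps above can be sketched in Python. This is an illustrative helper, not part of LIRC or the Arduino library; the function name and parameters are made up, with the parameters mirroring the LIRC field names. It assumes the trailing pulse stands for a 1 bit.

```python
def lirc_to_bit_code(code, bits, pre_data=0, pre_data_bits=0,
                     post_data=0, post_data_bits=0, has_ptrail=False):
    """Combine LIRC fields into a (value, nbits) Sony bit-decoded code."""
    # Keep only the specified number of bits from the listed code.
    value = code & ((1 << bits) - 1)
    # pre_data goes in front of the code, post_data behind it.
    value = (pre_data << bits) | value
    value = (value << post_data_bits) | post_data
    nbits = pre_data_bits + bits + post_data_bits
    # One bit short with a trailing pulse: bit-1 decoding, so append a 1.
    if has_ptrail and nbits in (11, 14, 19):
        value = (value << 1) | 1
        nbits += 1
    if nbits not in (12, 15, 20):
        raise ValueError("unexpected Sony code length: %d bits" % nbits)
    return value, nbits


# STOP as listed in the three files discussed above.
print(lirc_to_bit_code(0x1D1, 12))
print(lirc_to_bit_code(0x0E8, 11, has_ptrail=True))
print(lirc_to_bit_code(0x0E, 7, post_data=0x8, post_data_bits=4,
                       has_ptrail=True))
```

All three calls should produce the same 12-bit value, since the files describe the same transmission.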
Getting codes from hifi-remote.com
The most detailed site I've found on Sony codes specifically is hifi-remote.com/sony. An interesting thing about this site is it analyzes the structure of the codes. While the LIRC files just list the codes, the hifi-remote site tries to explain why the codes are set up the way they are.
Note that this site expresses codes in the official Sony format: the device part of the code is separated from the command part, and each is given as an ordinary number rather than in the order the bits are transmitted. Also, for a 20-bit Sony code, the 13 bit device code is expressed as a 5 bit value and an 8 bit value separated by a period. As a result, you may need to do some binary conversion and reverse the bits to use these codes.
To work through an example, I can look up the data for a Sony CD. There are multiple device codes, but assume for now I know my device code is 17. The table gives the code 56 for STOP. Convert 56 to the 7 bit binary value 0111000 and reverse it to get 0001110. Convert 17 to the 5 bit binary value 10001 and reverse it to get 10001. Put these together to get 000111010001, which is 1D1 hex, as before.
To work through a 20-bit example, if I have a Sony VCR/DVD Combo with a device code of 26.83, the site tells me that the code for "power" is 21. Convert 21 to a 7 bit binary value, 26 to a 5 bit binary value, and 83 to a 8 bit binary value. Then reverse and concatenate the bits: binary 1010100 + 01011 + 11001010, which is a8bca in hex. To confirm, the RMT-V501A LIRC file lists power as A8B with post_data of CA.
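Both worked examples can be reproduced with a short Python sketch. The function names are invented for illustration; the field widths (7 command bits, 5 or 8 device bits, 8 extended bits) follow the description of the official decoding earlier in the article.

```python
def reverse_bits(value, width):
    """Reverse the bit order of `value`, treated as `width` bits wide."""
    if value >= 1 << width:
        raise ValueError("%d does not fit in %d bits" % (value, width))
    return int(format(value, "0%db" % width)[::-1], 2)


def sony_bit_code(command, device, extended=None, device_bits=5):
    """Build the bit-decoded (value, nbits) from official-format numbers."""
    fields = [(command, 7), (device, device_bits)]
    if extended is not None:
        fields.append((extended, 8))
    code = 0
    nbits = 0
    for value, width in fields:
        # Each field is reversed, then appended in transmission order.
        code = (code << width) | reverse_bits(value, width)
        nbits += width
    return code, nbits


print("%X" % sony_bit_code(56, 17)[0])      # CD STOP, device 17
print("%X" % sony_bit_code(21, 26, 83)[0])  # VCR/DVD power, device 26.83
```

For a 15 bit code, the same function can be called with device_bits=8 and no extended value.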
To use this site, you pretty much have to know the device code for your device already. To find that, obtain a code for your device (e.g. from your existing remote or from LIRC), split out and reverse the appropriate bits, and look up the device code on the site. Alternatively, there are only a few different device codes for a particular type of device, so you can just try them all and see what works.
This site also has information on "discrete codes". To understand discrete codes, consider the power button on a remote that toggles between "on" and "off". This may be inconvenient for automated control, since without knowing the current state, you don't know if sending a code will turn the device on or off. The solution is the "discrete code", which provides separate "on" and "off" codes. Discrete codes may also be provided for operations such as selecting an input or mode. Since these codes aren't on the remote, they are difficult to obtain.
Other sites
Additional config files are available at irremote.psiloc.com. These are in LIRC format, but translated to XML. The site remotecentral.com has a ton of information on remotes, but mostly expressed in proprietary formats.