Yamaha DX7 chip reverse-engineering, part 6: the control registers

The Yamaha DX7 digital synthesizer (1983) was the classic synthesizer in 1980s pop music. It uses a technique called FM synthesis to produce complex, harmonically-rich sounds. In this blog post, I look inside its custom "OPS" sound chip and explain the control registers for this chip. By reverse-engineering the circuitry, I found a few undocumented test functions. (This post covers some fairly obscure details of the DX7; you might prefer my previous DX7 posts[1] starting with "DX7 reverse-engineering".)

Die photo of the YM21280 chip with the main functional blocks labeled. Click this photo (or any other) for a larger version.

The die photo above shows the DX7's OPS sound synthesis chip under the microscope, showing its complex silicon circuitry. Unlike modern chips, this chip has just one layer of metal, visible as the whitish lines on top. Around the edges, you can see the 64 bond wires attached to pads; these connect the silicon die to the chip's 64 pins. In this blog post, I'm focusing on the control registers, highlighted in red. I'll outline the other functional blocks briefly. Each of the 96 oscillators has a phase accumulator used to generate the frequency. The sine and exponential functions are implemented with lookup tables in ROMs. Other functional blocks apply the envelope, hold configuration data, and buffer the output values.

The DX7 synthesizer. Photo by rockheim (CC BY-NC-SA 2.0).

The DX7 generates sounds digitally using a technique called FM synthesis. Each note has six oscillators (called "operators") that can be combined in different ways (called "algorithms"). An algorithm is represented by a diagram (below), where an oscillator modulates the oscillator below, as shown by the lines. For instance, in algorithm 1 below, oscillator 6 modulates oscillator 5 which modulates 4 which modulates 3. Oscillator 2 modulates oscillator 1. The output is taken from the bottom oscillators (1 and 3). Meanwhile, oscillator 6 modulates itself, controlled by a user-selectable feedback level. With 32 different algorithms, the DX7 can generate a wide variety of sounds. In the DX7 synthesizer, all 16 notes must use the same algorithm. But from my reverse-engineering, it appears that the chip supports different algorithms for each note, even though the synthesizer doesn't make use of this.

Four of the 32 "algorithms" that can be selected on the DX7.

To the programmer of the DX7 firmware, the sound chip appears to have two write-only registers that control the chip. The diagram below shows the layout of the chip's two registers, as described by Anthony Richardson. The desired algorithm and feedback are written to address 1.[2] Address 0 has bits to turn the "key sync" feature[3] on and off. As for the Mute and Test Register Select bits, my investigation provides some explanation.

Address | Bit 7 | Bit 6          | Bit 5        | Bit 4 | Bit 3 | Bit 2 | Bit 1 | Bit 0 |
0       | Mute  | Clear Key Sync | Set Key Sync | Test Register Select                  |
1       | Algorithm Select (0..31)                              | Feedback Level (0..7) |

(The functionality above is pretty limited, so you might wonder how the synthesizer controls which notes are played. Most of the synthesizer functions are controlled through a second custom chip, the envelope generator chip (EGS). Note and envelope data is written to registers in the EGS chip, which then sends frequency and amplitude data to the sound chip over a special bus.)

The diagram below shows the main components of the register circuitry. The large block at the bottom is the A-register, which holds the algorithm/feedback entries, as 16 8-bit values.[4] The most puzzling feature of the A-register is its size; it holds 16 entries, one for each note, but the DX7 uses the same algorithm/feedback setting for all 16 notes. The second puzzling feature is that although the chip appears to have two 8-bit registers, the implementation is one 9-bit register, and one 5-bit register. Moreover, the 5-bit register can only be modified by writing through the 9-bit register.

Main functional blocks of the register circuitry.

The chip has one address pin (called "DS") which selects between the two registers. When a byte is written to the chip, to either address, the 8 bits along with DS (the address bit) are stored in the 9-bit latches. If DS is 0 (i.e. a write to the control address 0), the bits are decoded to perform any special functions, and the lower 5 bits are loaded into the 5-bit register on the right. If DS is 1 (i.e. a write to the algorithm/feedback address 1), the 8-bit algorithm/feedback value is stored into the A-register, in a location controlled by the 5-bit register.

Updating the algorithm/feedback

The algorithm/feedback A-register holds data for 16 notes. It can be updated in two ways. The first way, used by the DX7, updates all entries with the same value. The second way updates a single entry, allowing different notes to have different algorithms. Both cases involve a write to address 0, followed by writing the algorithm/feedback byte to address 1.

To update all entries, address 0 must be written with a value with the bit pattern 0??1?0?? (where ? indicates a "don't care" bit that can be 0 or 1).[5] This pattern triggers a circuit that constantly loads the value in the data latch into the storage register.

The DX7's CPU controls the OPS chip in this way. Specifically, an update of the algorithm and feedback is performed by writing either 0x30 (if sync is on) or 0x50 (if sync is off) to address 0, and then writing the algorithm/feedback byte to address 1.[2]
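
As a quick sanity check, both firmware values can be tested against the 0??1?0?? pattern with a mask-and-compare. This is only an illustrative sketch; the names `UPDATE_ALL_MASK`, `UPDATE_ALL_VALUE`, and `is_update_all` are made up here and do not come from the DX7 firmware or the chip.

```python
# Bits that matter in the pattern 0??1?0??: bit 7, bit 4 and bit 2.
UPDATE_ALL_MASK = 0b1001_0100
# Required values for those bits: bit 7 = 0, bit 4 = 1, bit 2 = 0.
UPDATE_ALL_VALUE = 0b0001_0000


def is_update_all(control_byte: int) -> bool:
    """Return True if a byte written to address 0 matches 0??1?0??."""
    return (control_byte & UPDATE_ALL_MASK) == UPDATE_ALL_VALUE


# The two values the DX7 firmware writes to address 0.
for value in (0x30, 0x50):
    print(f"0x{value:02X} = {value:08b} -> update all: {is_update_all(value)}")
```

Both 0x30 and 0x50 have bit 4 set and bits 7 and 2 clear, so both match; they differ only in the Set Key Sync and Clear Key Sync bits.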

The second update path changes the algorithm and feedback for a single note. (The DX7 does not use this feature.) It is triggered by writing the bit pattern 0??0nnnn, where nnnn specifies one of the 16 notes.

The implementation of this is a bit tricky because the chip uses shift registers for storage, not RAM. The A-register consists of 8 shift registers (one for each bit), each with 16 stages (one for each note). An entry can only be updated when it is shifted out the end of the shift register, and a new value can be inserted. (This is unlike RAM, where an arbitrary entry can be written.) To update an entry in the shift register, a 4-bit comparator circuit (below) compares the number of the current note with the number of the desired note in the control register. When there is a match, the new value is written to the shift register.

The 4-bit comparator determines when the shift register is at the desired note position. It is built from four exclusive-NOR gates.
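
The two update paths can be sketched as a small software model. This is a toy illustration under several assumptions, not a transcription of the circuit: the class and method names are hypothetical, the model assumes a store happens only while the latched DS bit is 1, it looks only at the low 5 bits of the control byte (ignoring bit 7), and it treats entry 0 of the deque as the stage currently at the shift register's output.

```python
from collections import deque


class ARegisterModel:
    """Toy model of the 16-entry A-register (hypothetical, not the real netlist)."""

    def __init__(self):
        self.entries = deque([0] * 16)  # entries[0] is the stage about to shift out
        self.note = 0                   # note number currently at the output
        self.latch_ds = 0               # address bit of the most recent write
        self.latch_data = 0             # data byte of the most recent write
        self.select = 0                 # 5-bit register, loaded by address-0 writes

    def write(self, ds: int, value: int) -> None:
        """Latch a byte; a write to address 0 also loads the 5-bit register."""
        self.latch_ds, self.latch_data = ds & 1, value & 0xFF
        if self.latch_ds == 0:
            self.select = value & 0x1F

    def clock(self) -> None:
        """Shift one entry out, then recirculate it or replace it."""
        old = self.entries.popleft()
        update_all = (self.select & 0b10100) == 0b10000  # low bits of 0??1?0??
        single = (self.select & 0b10000) == 0            # low bits of 0??0nnnn
        match = (self.select & 0x0F) == self.note        # the 4-bit comparator
        store = self.latch_ds == 1 and (update_all or (single and match))
        self.entries.append(self.latch_data if store else old)
        self.note = (self.note + 1) % 16

    def run_cycle(self) -> list:
        """Clock through all 16 notes and return the entries in note order."""
        for _ in range(16):
            self.clock()
        return list(self.entries)


reg = ARegisterModel()
reg.write(0, 0x30)          # control byte: update every note
reg.write(1, 0x47)          # algorithm/feedback byte
all_notes = reg.run_cycle()

reg.write(0, 0x05)          # control byte: select note 5 only
reg.write(1, 0xAB)
one_note = reg.run_cycle()
```

After the first cycle every entry holds 0x47; after the second, only entry 5 has changed to 0xAB, because the comparator matches on just one of the 16 clock cycles.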

Special command sequences

The logic circuitry recognizes several bit patterns when they are written to address 0, and causes special actions when they are detected. These are not used by the DX7; I think they were used for testing the chip during manufacturing to make tests more predictable and faster.

1???????: Setting the top bit triggers several special actions. Earlier analysis has labeled this bit as "Mute", but I suspect it is more of a "Test Reset" function, resetting the chip to a known state so tests will be predictable. This bit clears the phase accumulators. This bit disables the scale factors, so the output data is unshifted. It also bypasses the output latch, which may output digital note data at a higher rate.

1??????1: In addition to the previous action, this pattern resets the counters that count through the operators and notes, controlling the actions of the chip. This is probably used to start testing the chip from a known state, so the outputs can be compared with expected values.

1?????1?: This causes the low-order bits of the phase register to generate the waveform, rather than the high-order bits. I think this is used for testing so the low-order bits can be examined more directly to find flaws. It also will increase the frequencies by a factor of 1024, which may help run through waveforms faster for testing.
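
The three patterns above can be summarized as a small decoder. This is a sketch of the decoding as described in this post, not verified against hardware; the function name and the action labels are invented for illustration.

```python
def decode_test_bits(control_byte: int) -> set:
    """Return the test actions triggered by a byte written to address 0."""
    actions = set()
    if control_byte & 0x80:                  # 1???????
        actions.add("test_reset")            # clear phase accumulators, disable scaling, bypass latch
        if control_byte & 0x01:              # 1??????1
            actions.add("reset_counters")    # reset the operator/note counters
        if control_byte & 0x02:              # 1?????1?
            actions.add("low_phase_bits")    # waveform driven by low-order phase bits
    return actions


print(decode_test_bits(0x83))   # all three actions
print(decode_test_bits(0x30))   # normal DX7 write: no test actions
```

Note that the two low bits have no test effect in this model unless the top bit is also set, which matches the patterns listed above.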

Conclusion

By looking inside the chip and reverse-engineering the silicon circuits, I learned some details about the internal registers. One interesting discovery is that the chip appears to support separate algorithms for each voice, even though the synthesizer doesn't use this feature. I also uncovered some test functionality.

The Yamaha YM21280 OPS integrated circuit package with the metal lid removed, revealing the silicon die.

I plan to continue investigating the DX7's circuitry, so follow me on Twitter @kenshirriff for updates. I also have an RSS feed. Thanks to Jacques Mattheij and Anthony Richardson for providing the chip and discussion.[6]

Disclaimer: I figured out the behavior described in this post studying the die. It hasn't been tested on an actual DX7 so I don't guarantee that it is correct.

Notes and references

  1. My previous posts on the DX7: DX7 reverse-engineering, The exponential ROM, The log-sine ROM, How algorithms are implemented, and The output circuitry

  2. Looking at the ROM shows how the synthesizer's CPU communicates with the OPS chip. Since the DX7 ROM code has been disassembled, you can view the code that writes to the sound chip here

  3. Oscillator Key Sync is a feature of the DX7. According to the manual, Operator Key Sync "enables you to set the operator so its 'oscillator' begins at the start of the sine wave cycle each time you play a note. When Oscillator Key Sync is off, the sine wave continues so that subtle differences will occur even when you play the note repeatedly." 

  4. The DX7/9 Service Manual shows the "A-register" holding the algorithm and feedback level, so I'll use that name. 

  5. The bit pattern 0??1?0?? looks a bit random. I don't know why this pattern was chosen. The first two bits can be explained, but I don't see a purpose for the last 0 bit. 

  6. For more information on the DX7 internals, see DX7 Technical Analysis, DX7 Hardware, OPLx decapsulated, and the video Emulating the DX7 the hard way

Yamaha DX7 chip reverse-engineering, part V: the output circuitry

The Yamaha DX7 digital synthesizer (1983) was the classic synthesizer in 1980s pop music. It uses a technique called FM synthesis to produce complex, harmonically-rich sounds. In this blog post, I look inside its custom sound chip and explain how the chip's output circuitry works. You might expect it's just a digital output fed into a digital-to-analog converter, but there's much more to it than just that.

Die photo of the YM21280 chip with the main functional blocks labeled. Click this photo (or any other) for a larger version.

The composite die photo above shows the DX7's OPS sound synthesis chip under the microscope, revealing its complex silicon circuitry. Unlike modern chips, this chip has just one layer of metal, visible as the whitish lines on top. Around the edges are the 64 bond wires attached to pads; these connect the silicon die to the chip's 64 pins. The three blocks in red are the focus of this post. The output buffers hold the 16-bit digital values for the 16 notes. The output is controlled by a counter and PLA (Programmable Logic Array). The synthesizer's digital-to-analog conversion uses a sample-and-hold circuit, controlled by the "S/H ctrl" block.

I've discussed the chip's other functional blocks in earlier posts[1], so I'll just give a brief summary here. Each of the 96 oscillators has a phase accumulator used to generate the frequency. The oscillators are combined using the operator computation circuitry in the middle of the chip, under the control of the algorithm ROM. The signal synthesis uses sine and exponential functions, implemented with lookup tables in ROMs.

The Yamaha DX7 synthesizer with its 61-key keyboard and digital controls. Photo by rockheim (CC BY-NC-SA 2.0).

The synthesizer's output circuitry

Before I dive into the details of the chip, I'll explain the synthesizer's output circuit. The heart of the synthesizer is the OPS (Operator S) sound chip that digitally generates the notes. It provides digital values to the digital-to-analog converter (D/A). The resulting analog signal goes through a low-pass filter (LPF). The volume is controlled by a foot pedal and the synthesizer's volume control. Finally, the signal is amplified for the line and headphone outputs.

Block diagram of the output circuit. Based on the DX7/9 Service Manual.

The digital-to-analog conversion is more complex than you might expect. The process starts with the digital-to-analog converter (DAC) chip[2] that takes a 12-bit digital value from the sound chip and converts it to an analog value in the range 0 to 15 volts.[3] The multiplexer allows the overall synthesizer volume to be controlled by MIDI, but with just 8 levels.

Schematic of the volume control and DAC circuit. Based on the DX7 schematics.

The DAC provides 12 bits of resolution, but an additional circuit (below) provides approximately two more bits. This scaler circuit divides the analog signal by 1, 2, 4, or 8, using a resistor network and IC switch. The scaler is controlled by the sound chip through the scale factor signals SF0-SF3. The scaler adds more dynamic range to the digital value; the result is similar to a floating-point value with a sign bit, 11-bit mantissa, and two-bit exponent.[4]

The scaler divides the voltage by 1, 2, 4, or 8. Based on the DX7 schematics.

Next, the signal goes to a sample-and-hold circuit that samples the analog voltage at a point in time and holds it in a capacitor, kind of like an analog memory. An op-amp buffers the capacitor's voltage so it can be "read" without draining the capacitor. There are two hold circuits, used in alternation, so the last two samples are stored and summed to form the circuit's output.[5] The SH1 and SH2 control signals load the analog value into a capacitor, using IC52 as a switch. Finally, the output from the sample-and-hold circuit is filtered,[6] the volume is adjusted,[7] and the signal is amplified for the output (circuitry not shown).

The sample-and-hold circuit. IC52 looks complicated because it uses pairs of switches in parallel. Based on the DX7 schematics.

To summarize, the sound chip interacts with the output circuitry in three ways. The 12-bit digital value (DA1-DA12) is most important as it specifies the output value for each voice. The scale factor signals (SF0-SF3) are also a key contributor to the signal. The sound chip also provides the sample-and-hold control signals (SH1 and SH2).

Time-division multiplexing

The DX7 has 16 voices, so it can play 16 notes at once. Each note is produced by an "algorithm" that combines 6 oscillators in a particular way, so there are 96 oscillators in total. An oscillator can modulate the frequency of another oscillator to generate complex sounds with FM synthesis.

The chip performs all its processing sequentially, one oscillator at a time, rather than computing the notes in parallel. Internally, the chip has one "operator" calculation circuit to combine oscillators. As shown below, the chip starts by processing operator 6 for note 1, then operator 6 for note 2, and so forth through note 16. Then it processes operator 5 for notes 1 through 16. Finally it processes operator 1 for notes 1 through 16, generating the output sound values. It takes a bit over 20 µs to compute all 16 notes in a complete processing cycle.

Timing diagram of sound production. This time interval corresponds to 49.096 kHz. From the DX7/9 Service Manual.
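
The processing order described above can be written out as a short generator. This is just an illustration of the sequence in the text (operator 6 first, notes 1 through 16 within each operator); the function name is hypothetical.

```python
def processing_order():
    """Yield (operator, note) pairs in the order the chip processes them."""
    for operator in range(6, 0, -1):      # operator 6 down to operator 1
        for note in range(1, 17):         # notes 1 through 16
            yield operator, note


slots = list(processing_order())
print(len(slots))     # number of time slots in one processing cycle
print(slots[:3])      # first few slots
print(slots[-1])      # last slot: operator 1 of note 16
```

One pass visits each of the 96 oscillators exactly once, which is why a full processing cycle takes 96 time slots.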

You might expect the chip to combine the 16 notes into a single digital output. However, the sound chip outputs the 16 notes sequentially, using a technique called time-division multiplexing. Each time interval (~20µs) is divided into 16 intervals and one note is output from the chip per interval. (Note that these intervals don't line up with the intervals in the diagram above.) Thus, digital values are output at 786 kilohertz, 16 times the underlying frequency, and the DAC chip converts them to analog at this rate.

As an example, consider two notes that are sine waves with different frequencies. The digital output would look like the image below. You might think that this signal is unusable since it jumps around wildly from point to point.

Output data with two multiplexed sine waves. (Theoretical, not actual DX7 data.)

However, applying a low-pass filter smooths out the waveform (essentially summing nearby points). The result is the waveform below, which shows the sum of the two sine waves.[8] The point is that time-division multiplexing data may look strange, but the analog circuitry's filtering creates a "normal" waveform.

Output data after filtering with a 16 kHz low-pass filter.
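
The idea can be reproduced numerically with a rough sketch. Note the substitution: instead of the DX7's 16 kHz analog low-pass filter, the code below uses a crude boxcar average over each frame of 16 slots, which is enough to show that the interleaved stream collapses to the (scaled) sum of the voices. The frequencies, function names, and the choice of filter are all assumptions for illustration.

```python
import math

VOICES = 16
FRAME_RATE_HZ = 49096   # one frame = one output value per voice


def multiplexed_stream(freqs_hz: dict, frames: int) -> list:
    """Interleave one sample per voice per frame; unused voices output 0."""
    stream = []
    for frame in range(frames):
        t = frame / FRAME_RATE_HZ
        for voice in range(VOICES):
            freq = freqs_hz.get(voice)
            stream.append(math.sin(2 * math.pi * freq * t) if freq else 0.0)
    return stream


def boxcar_filter(stream: list) -> list:
    """Crude low-pass stand-in: average each frame of 16 slots."""
    return [sum(stream[i:i + VOICES]) / VOICES
            for i in range(0, len(stream), VOICES)]


# Two voices playing sine waves at different frequencies, the rest silent.
stream = multiplexed_stream({0: 440.0, 1: 660.0}, frames=200)
filtered = boxcar_filter(stream)
```

Each filtered value equals the sum of the two sine waves at that instant divided by 16, so the jumpy multiplexed data turns back into an ordinary mixed waveform.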

Output buffer

Inside the chip, the output buffer stores values for the 16 notes as they are generated, and outputs them in sequence. Rather than RAM, the chip uses shift registers for storage. The shift registers are arranged in a loop of 16 stages, one stage for each note. On each clock cycle, the values in the shift register move to the next stage. The output value is fed back into the shift register's input so the value is retained. Alternatively, a new value can be stored in the shift register. Shift registers provided an efficient way to store data, but they cannot be accessed arbitrarily; instead, data must be processed as it becomes available.

The schematic below shows how one stage of the shift register is implemented. The chip uses a two-phase clock. In the first phase, clock ϕ1 goes high, turning on the first transistor. The input signal goes through the inverter, through the transistor, and the voltage is stored in the capacitor (kind of like DRAM). In the second phase, clock ϕ2 goes high, turning on the second transistor. The value stored in the capacitor goes through the second inverter, through the second transistor, and to the output, where it enters the next shift register stage.

Schematic of one stage of the shift register.

The die photo below shows the output buffer, with the 16 shift-register loops arranged in columns. These hold the 16-bit sound values (four scale factor bits and 12 data bits). Each shift register is 16 stages long to hold the 16 notes. In the next sections, I'll discuss the bit shifters, the logic, and the output latches.

Closeup of the die, showing the output buffer circuitry.

The scale factor: pseudo floating point

The DX7 uses a 12-bit digital-to-analog converter chip, but the scaling circuit (discussed earlier) will scale the voltage by 1, 2, 4, or 8, which adds more resolution. This isn't quite equivalent to 14-bit resolution; it's more like a floating-point number with a sign, 11-bit mantissa, and 2-bit exponent. This provides more resolution for low signals and reduces signal noise.

Inside the chip, scaling is implemented with a shifter that shifts the data bits by 0 to 3 bit positions. (This is unrelated to the shift registers that hold data.) The shifter (below) is implemented as eleven chevron-shaped logic gates; each gate selects one of four potential bits for each mantissa position.

A data sample is shifted 0 to 3 bits by this shifter circuit.

The operator circuitry generates data as 15 bits (2's complement, so one of the bits provides the sign). The output from the chip is 12 bits, so three bits must be discarded. Normally these are the low-order bits, but by using the shifter, high-order zero bits can be discarded instead, and the external scaler counteracts this. The result is more bits of precision in the output.

The shifter is controlled by the logic circuitry to the left of the buffer, which controls the amount of shift based on the number of leading zeros. (For a negative number, leading 1's.) With 5 leading zeros, the number is shifted left by 3 positions. With 4 leading zeros, the number is shifted by 2 positions. With 3 leading zeros, the number is shifted by 1 position. With 2 or fewer leading zeros, the number is unshifted.

Note that the circuit leaves two leading zeros when it shifts, so it's "wasting" two potential bits of precision. I assume this is because the scaler won't be perfectly linear (due to the resistor imperfections[9]), so you want to avoid switching scale levels for large signals (which don't really need the extra bits).[10]
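
The shift rule can be sketched in a few lines. This is a software approximation under stated assumptions, not the chip's logic: it assumes the "leading zeros" count includes the sign bit and that only the top five bits of the 15-bit value are examined; the function names are hypothetical, and `decode` stands in for the external analog scaler.

```python
def count_sign_run(sample15: int) -> int:
    """Count how many of the top 5 bits (including the sign) equal the sign bit."""
    bits = sample15 & 0x7FFF              # 15-bit two's-complement pattern
    sign = (bits >> 14) & 1
    run = 0
    for position in range(14, 9, -1):     # bits 14 down to 10
        if (bits >> position) & 1 != sign:
            break
        run += 1
    return run


def encode(sample15: int) -> tuple:
    """Return (12-bit output value, shift amount 0..3) for a 15-bit sample."""
    shift = max(0, count_sign_run(sample15) - 2)   # 5 -> 3, 4 -> 2, 3 -> 1, else 0
    mantissa12 = (sample15 << shift) >> 3          # shift left, keep the top 12 bits
    return mantissa12, shift


def decode(mantissa12: int, shift: int) -> int:
    """Undo the shift, as the external scaler does by dividing by 1, 2, 4, or 8."""
    return (mantissa12 << 3) >> shift


for sample in (100, 3000, 5000, -100):
    mantissa, shift = encode(sample)
    print(sample, mantissa, shift, decode(mantissa, shift))
```

For a small value such as 100, simply dropping the three low bits would give 96 after scaling back up, while the shifted encoding recovers 100 exactly; for large values the shift is 0 and nothing is gained, which is the floating-point-like behavior described above.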

The output latches

As mentioned earlier, the 16 notes are output individually, spaced across the interval. This timing doesn't line up with the timing of the output buffer, which shifts to a new note every clock cycle. To fix the timing, two 16-bit latches sit between the output buffer and the output pins. While one latch outputs the current note, the other latch grabs the next note as it is shifted out of the shift register. At the appropriate time, the latches swap roles; the second latch outputs the note while the first latch waits for the next note.

The timing for the latches is fairly tricky to make sure the note data is loaded into the right latch at the right time. These latches are controlled by the chip's master counter, which is the subject of the next section.

To summarize, the sound chip runs at 4.7 MHz. Data values are produced at this rate (but intermittently) and stored into the output buffer. The output latches provide data values to the DAC chip at 786 kHz for an overall audio rate of 49096 Hertz.
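
These three rates are tied together by simple arithmetic, assuming 96 clock cycles per audio sample (16 notes times 6 operators) and 16 multiplexed outputs per sample. The constant names below are just for illustration.

```python
SAMPLE_RATE_HZ = 49096     # overall audio rate
CYCLES_PER_SAMPLE = 96     # 16 notes x 6 operators
NOTES = 16                 # one DAC value per note per sample

chip_clock_hz = SAMPLE_RATE_HZ * CYCLES_PER_SAMPLE   # about 4.7 MHz
dac_rate_hz = SAMPLE_RATE_HZ * NOTES                 # about 786 kHz
cycles_per_dac_value = CYCLES_PER_SAMPLE // NOTES    # clock cycles per output slot

print(chip_clock_hz, dac_rate_hz, cycles_per_dac_value)
```

So each value presented to the DAC stays on the pins for 6 clock cycles, even though the output buffer shifts to a new note on every clock.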

Keeping track of 96 clock cycles: the chip's counter and timing PLA

One complete cycle of the sound chip takes 96 clock cycles: processing all 16 notes through the 6 operators that form an algorithm. Because data can only be accessed when it exits a shift register, everything must be timed so the right data is available at the right time. A critical part of the chip is the counter that keeps track of the current note number and operator number to keep everything synchronized.

On the right of the die photo below is the counter, consisting of seven toggle flip flops: four to count the note number (0-15) and three to count the operator number (0-5). On the left is the PLA that defines what happens for particular time slices. (A Programmable Logic Array (PLA) is similar to a ROM, but implements arbitrary logic.) The PLA has 39 columns, each one implementing an AND gate triggered by a particular counter output, corresponding to a particular operator and note. Below the PLA is some logic; mostly buffers with a few gates.

The chip's main counter, along with the control PLA.

Of the PLA's 39 columns, the 32 columns on the left control the data output latches,[11] two columns control loading data values into the output buffer, one generates the chip's sync output signal, three reset the operator count, and the last increments the operator count.[12]

Sample-and-hold

The chip outputs two signals to control the sample-and-hold circuitry, SH1 and SH2. These signals are activated in alternation to take an analog sample of each digital output.

The sound chip on the DX7 schematic has three missing pins, indicated in red.

The sound chip has three unused pins next to the SH1 and SH2 pins; the DX7 schematic doesn't show pins 6-8. I traced the chip's internal circuitry and found that these pins, in conjunction with the sample-and-hold pins, count out the 16 samples. It appears that the chip is designed to sample-and-hold all 16 notes individually, so the synthesizer could have had separate outputs for all 16 notes.[13]

Moreover, the chip has data buffers to hold separate algorithms for the 16 notes. This would let the chip drive 16 independent voices, each with a separate algorithm. My conclusion is that the sound chip supports much more flexibility than is used in the DX7 synthesizer.

Conclusion

The DX7 generates sounds digitally and then converts the digital values to the analog output. This process turns out to be more complicated than one would expect, with circuitry inside the chip interacting with synthesizer circuitry to scale and adjust the signal. My hope is that my analysis of this process will help DX7 emulators to achieve more accuracy. Looking at the chip's internal circuitry reveals the floating-point format of the output data as well as the function of the three unused pins.

I plan to continue investigating the DX7's circuitry, so follow me on Twitter @kenshirriff for updates. I also have an RSS feed. Thanks to Jacques Mattheij and Anthony Richardson for providing the chip and discussion.[14]

Notes and references

  1. My previous posts on the DX7: DX7 reverse-engineering, The exponential ROM, The log-sine ROM, and How algorithms are implemented

  2. The DAC chip is the BA9221, a 12-bit D/A converter that produces an output current based on a 2's-complement input value. A datasheet is here. The DAC receives an input voltage reference. This voltage reference can be one of 8 values selected by a multiplexer. This allows the overall volume to be set via MIDI, but only with 3-bit resolution (see the DX7 ROM code here). The volume is an exponential function (so linear in decibels) except that 0 is off. Also see this DAC discussion and this StackExchange discussion.

  3. The output from the DAC is centered around 7.5 volts. In other words, a digital value of 0 corresponds to 7.5 volts, with positive digital values above 7.5 volts and negative digital values below 7.5 volts. I would have expected the signals to be centered around 0 volts, which is what the DAC datasheet shows. I think strictly positive voltages were used because they work better with the TC4066 and TC4066 integrated circuit switches (IC 41 for scaling and IC 52 for sample-and-hold respectively). The DX7 converts the signals to zero-centered voltages for the low-pass filter.

  4. The amplitude scaler is built from an R-2R resistor ladder, similar to a DAC circuit. However, the attenuator only has 4 useful values, not the 16 levels you might expect with four control lines, because only one control line can be activated at a time. Combinations of control lines do not yield useful outputs. For example, if the top switch is on, you get the maximum output regardless of the other switches. Other switch combinations are non-monotonic. Thus, the scaler only provides two additional bits of resolution, not four. 

  5. The benefit of keeping two samples is not clear to me. One theory is that this reduces intermodulation distortion between the voices, the effect of one signal on another. With one "solid" sample and one changing sample, the effect of the changing sample will be reduced. Another, more speculative, possibility is that the circuitry was originally designed for stereo, holding one sample for each channel. 

  6. The filter is a sharp low-pass filter around 16 kHz using a Sallen-Key topology. 

  7. The volume is controlled by an external volume pedal and a volume control on the synthesizer. The signal also passes through a relay, which cuts the output when the synthesizer is being reset (presumably to avoid random noise).

    The external volume pedal has an interesting circuit. The pedal is essentially a variable resistor, so you might expect the output signal to pass through it. Instead, the pedal is connected to a photocoupler with an LED and a cadmium sulfide photocell inside. The output signal passes through the photocell and is attenuated as controlled by the LED. I think the motivation behind using a cadmium sulfide photocell instead of a phototransistor is that the photocell is completely resistive, so there is no nonlinear distortion of the signal. 

  8. Time-division multiplexing and filtering isn't perfect, and will contribute some artifacts to the output. In particular, there will be some aliasing, where high frequencies turn into lower frequencies. The low-pass filter will eliminate most of the high frequencies—I believe the DX7's filter is at 16 kHz—but it's not perfect and will add its own color to the sound. Two notes could also interact differently based on their relative positions in the time slice. These artifacts probably contribute to the DX7's characteristic sound. You could consider the artifacts desirable if you're trying to duplicate the DX7 sound. 

  9. The scale resistors are marked on the schematic with Ⓑ, which probably indicates they are higher-precision resistors. Assuming they are 1% resistors, a 1% error in a large signal would be much more error than the benefit of additional bits of precision. For smaller signals, the additional bits reduce the quantization noise, which is probably more important than the nonlinearity error from scaling. 

  10. There's an interesting timing issue for the scale calculation. The scaling logic requires about 3 clock cycles to determine the scale factor, so the straightforward implementation would shift a voice based on the amplitude of an earlier voice. The solution is that the 5 bits for scale calculation are pulled out of the operator shift register six stages (3 clock cycles) earlier. Thus, these shift registers have two outputs; the "early" output gives the scale factor circuitry time to work.

  11. The output buffer has two latches, used by alternating notes. Each latch has one control line to latch a data value and one control line to output the latched value, so there are four control lines in total. Curiously, it appears that the notes aren't output sequentially; the order is 1, 13, 5, 11, 3, 15, 7, 10, 2, 14, 6, 12, 4, 16, 8, 9. I don't know if there's a motivation for this; it's also possible that I'm misinterpreting the circuit. 

  12. A few notes on the PLA outputs in case anyone looks at them more closely. Because signals get delayed through multiple shift registers and clock cycles, things don't happen on the cycle you'd expect. For instance, SYNC is generated 6 cycles before the end. Likewise, loading of the output buffer is triggered midway through operator 6, about 26 cycles later than operator 1 started generating outputs. Most PLA columns are triggered for a specific voice and operator value. The exception is the last column, which increments the operator regardless of the operator value. Curiously, there are three counter reset lines. One resets near the end of operator 1 (as you'd expect). The other two reset near the end of the two invalid operator values (there are 6 operators but 8 possible bit values). Presumably this keeps the synth from starting up in a bad state. Below the PLA are some gates. These are mostly buffering and clock synchronization. 

  13. Someone with a DX7 could probe the three unused pins and verify that they count out the notes. 

  14. For more information on the DX7 internals, see DX7 Technical Analysis, DX7 Hardware, OPLx decapsulated, and the video Emulating the DX7 the hard way.

Yamaha DX7 chip reverse-engineering, part 4: how algorithms are implemented

The Yamaha DX7 digital synthesizer (1983) was the classic synthesizer in 1980s pop music. It uses two custom digital chips to generate sounds with a technique called FM synthesis, producing complex, harmonically-rich sounds. Each note was implemented with one of 32 different patterns of modulation and summing, called algorithms. In this blog post, I look inside the sound chip and explain how the algorithms were implemented.

Die photo of the YM21280 chip with the main functional blocks labeled. Click this photo (or any other) for a larger version.

The die photo above shows the DX7's OPS sound synthesis chip under the microscope, showing its complex silicon circuitry. Unlike modern chips, this chip has just one layer of metal, visible as the whitish lines on top. Around the edges, you can see the 64 bond wires attached to pads; these connect the silicon die to the chip's 64 pins. In this blog post, I'm focusing on the highlighted functional blocks: the operator computation circuitry that combines the oscillators, and the algorithm ROM that defines the different algorithms. I'll outline the other functional blocks briefly. Each of the 96 oscillators has a phase accumulator used to generate the frequency. The sine and exponential functions are implemented with lookup tables in ROMs. Other functional blocks apply the envelope, hold configuration data, and buffer the output values.

The DX7 was the first commercially successful digital synthesizer, using a radically new way of generating sounds. Instead of the analog oscillators and filters of an analog synthesizer, the DX7 generates sounds digitally, using a technique called FM synthesis. The idea is that you start with a sine wave (the carrier signal) and perturb it with another signal (the modulating signal). The modulating signal changes the phase (and thus the frequency) of the carrier, creating complex harmonic structures. The custom chips inside the DX7 made this possible at an affordable price.

The DX7 synthesizer. Photo by rockheim (CC BY-NC-SA 2.0).

FM synthesis

I'll briefly explain how FM synthesis is implemented.1 The DX7 supports 16 simultaneous notes, with 6 operators (oscillators) for each note, 96 oscillators in total. However, to minimize the hardware requirements, the DX7 only has a single digital oscillator circuit. This circuit calculates each operator individually, in sequence. Thus, it takes 96 clock cycles to update all the sounds. To keep track of each oscillator, the DX7 stores 96 phase values, an index into the sine wave table. By incrementing the index at a particular rate, a sine wave is produced at the desired frequency.
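The phase-accumulator idea can be sketched in a few lines of Python. This is only an illustration: the table size, the floating-point sine values, and the names `step_all`, `phases`, and `increments` are my assumptions, not the chip's actual word widths or signals.

```python
import math

TABLE_BITS = 10                      # assumed table size, for illustration only
TABLE_SIZE = 1 << TABLE_BITS
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

NUM_OSCILLATORS = 96                 # 16 notes x 6 operators, from the text


def step_all(phases, increments):
    """Advance every stored phase by its increment and look up each sine value."""
    outputs = []
    for n in range(NUM_OSCILLATORS):
        # A single "oscillator circuit" visits each stored phase in turn.
        phases[n] = (phases[n] + increments[n]) % TABLE_SIZE
        outputs.append(SINE_TABLE[phases[n]])
    return outputs


phases = [0] * NUM_OSCILLATORS
increments = [0] * NUM_OSCILLATORS
increments[0] = 16                   # oscillator 0 completes a cycle every 64 updates
samples = [step_all(phases, increments)[0] for _ in range(64)]
```

A larger increment steps through the table faster and so produces a higher frequency; the stored phase simply wraps around at the end of the table.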

The idea of FM synthesis is to modulate the index into the sine wave table; by perturbing the index, the output sine wave is modified. The diagram below shows the effects of modulation. The top curve shows a sine wave, generated by stepping through the sine wave table at a fixed rate. The second curve shows the effects of a small amount of modulation, perturbing the index into the table. This distorts the sine wave, compressing and stretching it. The third curve shows the effects of a large amount of modulation. The index now sweeps back and forth across the entire table, distorting the sine wave unrecognizably. As you can see, modulation can produce very complex waveforms. These waveforms have a rich harmonic structure, yielding the characteristic sound of the DX7. (I made a webpage here where you can experiment with the effects of modulation.)

Modulation examples. The top sine wave is unmodulated. The middle wave has a small amount of modulation. The bottom wave is highly modulated.
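To make the three curves concrete, here is a rough Python sketch of perturbing the table index with a second sine wave. The table size, step sizes, and `depth` values are arbitrary choices for illustration, and `modulated_wave` is a hypothetical helper, not something from the chip.

```python
import math

TABLE_SIZE = 1024                    # assumed table size
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def modulated_wave(carrier_step, modulator_step, depth, num_samples):
    """Carrier sine whose table index is perturbed by a modulator sine.

    depth is the peak index offset in table entries; 0 gives a plain sine.
    """
    out = []
    carrier_phase = 0
    modulator_phase = 0
    for _ in range(num_samples):
        # The modulator's output shifts the index used for the carrier lookup.
        offset = int(round(depth * SINE_TABLE[modulator_phase]))
        out.append(SINE_TABLE[(carrier_phase + offset) % TABLE_SIZE])
        carrier_phase = (carrier_phase + carrier_step) % TABLE_SIZE
        modulator_phase = (modulator_phase + modulator_step) % TABLE_SIZE
    return out


plain = modulated_wave(8, 16, 0, 128)      # unmodulated
light = modulated_wave(8, 16, 40, 128)     # small index perturbation
heavy = modulated_wave(8, 16, 600, 128)    # index sweeps across much of the table
```

Plotting `plain`, `light`, and `heavy` should give curves in the spirit of the three waveforms above, although the exact shapes depend on the values chosen.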

Algorithms

The above section illustrated how two oscillators can be combined with modulation. The DX7 extends this principle, generating a note by combining six oscillators through modulation and summing. It implements 32 different ways of combining these oscillators, illustrated below, and calls each one an algorithm. The different algorithms provide flexibility and variety in sound creation. Multiple levels of modulation create harmonically-rich sounds. On the other hand, multiple output operators allow different sounds to be combined. An electric piano sound, for example, could have one sound for the hammer thud, a second sound for the body of the tone, and a third sound for the ringing tine, all varying over time.

The 32 algorithms of the DX7 synthesizer.

Looking at algorithm #8, for example, shows the structure of an algorithm. Each box represents an operator (oscillator). Operators 1 and 3 (in blue) are combined to form the output. The remaining operators provide modulation, as indicated by the lines. Operator 2 modulates operator 1. Operators 4 and 5 are combined to modulate operator 3, providing a complex modulation. Operator 6, in turn, modulates operator 5. Finally, the line looping around operator 4 indicates that operator 4 modulates itself. Since each modulation level can vary over time, the resulting sound can be very complex.

Algorithm 8 combines the six operators; two produce outputs.

Shift-register storage

To understand the DX7's architecture, it's important to know that the chip uses shift registers, rather than RAM, for its storage. The idea is that bits are shifted from stage to stage each clock cycle. When a bit reaches the end of the shift register, it can be fed back into the register or a new bit can be inserted. For the phase accumulators, the shift registers are 96 bits long since there are 96 oscillators. Other circuits use 16-bit shift registers to hold values for the 16 voices. The shift register circuitry (below) is dense, but even so, it takes up a large fraction of the chip.

A small part of the shift register storage.

The use of shift registers greatly affects the design of the DX7 chip. In particular, values cannot be accessed arbitrarily, as in RAM. Instead, values can only be used when they exit the shift register, which makes the circuit design much more constrained. Moreover, circuits must be carefully designed so that each path of a computation takes the same number of cycles (e.g. 16 cycles). Shorter paths must be delayed as necessary.2
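The access constraint is easy to see in a small software model. The sketch below is a simplification that stores whole values per stage rather than individual bits; the `ShiftRegister` class and its `clock` method are hypothetical names for illustration.

```python
from collections import deque


class ShiftRegister:
    """Recirculating shift register: one value exits per clock cycle."""

    def __init__(self, length, fill=0):
        self.stages = deque([fill] * length, maxlen=length)

    def clock(self, new_value=None):
        """Shift by one stage and return the value leaving the last stage.

        If new_value is None, the exiting value is fed back into the first
        stage (hold); otherwise new_value is inserted instead (load).
        """
        exiting = self.stages.pop()              # rightmost entry = last stage
        self.stages.appendleft(exiting if new_value is None else new_value)
        return exiting


reg = ShiftRegister(16)                          # one slot per voice
for voice in range(16):
    reg.clock(new_value=100 + voice)             # load a value for each voice
seen = [reg.clock() for _ in range(32)]          # then let it recirculate
```

Each voice's value appears at the output exactly once every 16 clocks, so any circuit that needs it has to be ready at that moment.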

I want to emphasize how unusual this chip is, compared to a microprocessor. You might expect that an algorithm is implemented with code, for example reading operator 2, applying modulation to operator 1, and then storing the result in operator 1. Instead, computation happens continuously in the chip, with data moving into the circuitry every clock cycle as it comes from the shift registers. The chip is more like an assembly line with bits constantly moving on many conveyor belts, and circuits steadily operating on bits as they move by. An advantage of this approach is that every clock cycle, calculations happen in parallel in multiple parts of the chip, providing much higher performance than a microprocessor could in the 1980s.

Implementation of the algorithms

The block diagram below shows the overall structure of the OPS sound chip. The idea is that the envelope chip (EGS) constantly provides frequency (F) and envelope control (EC) values at the top. The DX7's control CPU updates the algorithm (A) if the user selects a new one. The sound chip generates digital data (DA) for the 16 voices, which is fed out at the right. (The DX7's digital-to-analog converter circuitry (DAC) converts these digital values to the analog sound from the synthesizer.)

Diagram showing the architecture of the OPS chip, from the DX7/9 Service Manual.

In more detail, the circuitry in the upper left generates the phase values for the 96 oscillators and looks up the values in the sine wave table. In the lower-left, the highlighted block implements the algorithm, producing two outputs. This block contains its own storage: the memory (M) register and feedback (F) register. It generates a modulation value that modulates the index into the sine wave table. It also produces the digital sound value that is the output from the chip. (This highlighted block is the focus of this article.) At the right, the CPU specifies the algorithm number; the algorithm ROM specifies the algorithm by generating control signals COM, SEL, and so forth.

The DX7 has 96 oscillators, which are updated in sequence. The cycle of 96 updates takes place as shown below. In the first clock cycle, computation starts for operator 6 of voice (channel) 1. In the next clock cycles, operator 6 processing starts for voices 2 through 16. Next, operator 5 is processed for the 16 voices, and likewise for operators 4 to 1. At the end of this cycle, all the notes have been updated. Two factors are important here. First, operators are processed "backward", starting at 6 and ending at 1. Second, for a particular voice, there are 16 clock cycles between successive operators. This means that 16 cycles are available to compute each operator.

A complete processing cycle, as shown in the service manual. The overall update rate is 49.096 kHz, providing reasonable coverage of the audio spectrum.
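The ordering of the 96 slots can be written out directly. This sketch just enumerates the schedule described above; `schedule` is a hypothetical helper, and the slot rate at the end is simple arithmetic from the 49.096 kHz figure, assuming one slot per clock cycle.

```python
NUM_VOICES = 16
OPERATORS = [6, 5, 4, 3, 2, 1]       # processed "backward", per the text


def schedule():
    """Return the (operator, voice) pair whose computation starts on each cycle."""
    return [(op, voice) for op in OPERATORS for voice in range(1, NUM_VOICES + 1)]


slots = schedule()

# Gap between successive operators of the same voice, in clock cycles.
gap = slots.index((5, 1)) - slots.index((6, 1))

# Slot rate implied by the update rate, assuming one slot per clock cycle.
UPDATE_RATE_HZ = 49096
slot_rate_hz = UPDATE_RATE_HZ * len(slots)
```

The 16-cycle gap between operators of one voice is what the rest of the circuitry is built around.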

The diagram below provides more detail of the highlighted block above, the circuitry that modulates the waveform according to a particular algorithm. The effect of modulation is to perturb the phase angle before lookup in the sine wave table.3 At the bottom right, the signal from operator N+1 enters, and is used to compute the modulation for operator N, exiting at the bottom left.

Diagram showing modulation computation, from the patent. Inconveniently, the signal names are inconsistent with the service manual.

The key component is the selector at the left, which selects one of the five modulation choices, based on the control signal S or SEL. Starting at the bottom of the selector, SEL=1 selects the unmodified signal from the input operator; this implements the straightforward modulation of an operator by another. Next, SEL=2 uses the value from the adder (61) for modulation. This allows an operator to be modulated by the sum of operators, for instance in algorithm 7. SEL=3 uses the delayed value from the buffer; this is used solely for algorithm 21, where operator 6 modulates operator 4. SEL=4 and SEL=5 use the self-feedback operator for modulation. Because the feedback value is buffered in the circuitry, it is available at any time, unlike other operators. SEL=4 is used to obtain delayed feedback, for instance when operator 6 modulates operator 4 in algorithm 19. (In most cases, feedback is applied immediately, for instance when operator 6 modulates operator 5, and this uses SEL=1.) SEL=5 handles the self-feedback case; the previous two feedback values are averaged to provide stability.4 The SEL=0 case is not shown; it causes no modulation to be selected so the operator is unmodulated.

Several control signals (A, B, C, D, E) also control the circuit. (Confusingly, the patent diagram below uses the names A and B for the feedback register enable (FREN) line. The memory register enable (MREN) lines are called C and D.) Signals A and B have the same value: they select if the feedback buffer continues to hold the previous value or loads a new value. Signals C and D control the buffer/sum shift register. If C is 1 and D is 0, the register holds its previous value. If C is 0 and D is 1, the input signal is loaded into the register. If both C and D are 1, the input signal is added to the previous value. This register can be used to sum two modulation signals, as in algorithm 7. But it is also used to hold and sum the output signals. (As a consequence, an algorithm can't sum modulation signals and outputs at the same time.) Signal E loads the algorithm's final output value into the output buffer (70). Signal E and buffer 70 are implemented separately, so I won't discuss them further.
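The register behavior described above can be summarized as a pair of small functions. This is a sketch for a single voice with plain integers; the function names are mine, and the case where C and D are both 0 is not spelled out in the text, so treating it as clearing the register is an assumption that falls out of the "hold gate plus input gate" formulation.

```python
def memory_register_next(previous, signal_in, c, d):
    """Next value of the buffer/sum register for one voice.

    C=1, D=0: hold the previous value.
    C=0, D=1: load the input signal.
    C=1, D=1: add the input signal to the previous value.
    C=0, D=0: assumed here to yield 0 (not described in the text).
    """
    held = previous if c else 0
    loaded = signal_in if d else 0
    return held + loaded


def feedback_register_next(previous, signal_in, a):
    """Feedback buffer: load a new value when A (and B) is 1, otherwise hold."""
    return signal_in if a else previous


hold = memory_register_next(7, 3, c=1, d=0)
load = memory_register_next(7, 3, c=0, d=1)
add = memory_register_next(7, 3, c=1, d=1)
```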

The algorithm ROM

The algorithms are defined by a ROM with 9-bit entries that hold the selector value (SEL), the control signals MREN and FREN (A,C,D), and the compensation scaling value COM (which I explain later). Each algorithm needs 6 entries in the ROM to select the action for the 6 operators. Thus, the ROM holds 192 9-bit values (32 algorithms × 6 operators).

The photo below shows the algorithm ROM. It has 32 columns, one for each algorithm and 9 groups of 6 rows: one group for each output bit. From bottom to top, the outputs are three bits for the selector value SEL, two MREN lines and the FREN line, and three bits for the COM value. The groups of 6 diagonal transistors at the left of the ROM select the entry for the current operator.

The algorithm ROM. The metal layer has been removed to show the silicon structure underneath that defines the bits.

The bits are visible in the pattern of the ROM. By examining the ROM closely, I extracted the ROM data. Each entry is formatted as "SEL / A,C,D / COM". (I only show three entries below; the full ROM is in the footnotes.5)

            Operator
Algorithm   6         5         4         3         2         1
1           1/100/0   1/000/0   1/000/1   0/001/0   1/010/1   5/011/0
2           1/000/0   1/000/0   1/000/1   5/001/0   1/110/1   0/011/0
...
8           1/000/0   5/001/0   2/111/1   0/001/0   1/010/1   0/011/0

To see how an algorithm is implemented, consider algorithm 8, for instance.6

Algorithm 8 has four modulators and two carriers.

Processing of an algorithm starts when operator 6's signal value is at the output of the operator block and operator 5's modulation is being computed. Table column 6 above shows SEL=1, A,C,D=000. In the modulation circuit (below), SEL=1 selects the raw signal in (i.e. operator 6's value) for modulation. Thus, operator 6 modulates operator 5, the desired behavior for algorithm 8.

Diagram showing modulation computation.

Next (16 cycles later), operator 5's signal is at the output and operator 4's modulation is being computed. Column 5 of the table shows SEL=5, A,C,D=001. SEL=5 selects the filtered feedback register for self-modulation of operator 4. D=1 causes operator 5's value to be loaded into the shift register, in preparation for modulating operator 3.

Next, operator 4's signal is at the output and operator 3's modulation is being computed. Column 4 shows SEL=2 and A,C,D=111. Bits A (and B) are 1 to load the feedback register with operator 4's value, updating the self-feedback for operator 4. Bits C and D cause operator 4 to be added to the previously-stored operator 5 value. SEL=2 selects this sum for operator 3's modulation, so operator 3 is modulated by both operators 4 and 5. COM=1 indicates this operator is one of 2 outputs, so operator 3's value will be divided by 2 as it is computed.

Next, operator 3's signal is at the output and operator 2's modulation is being computed. Looking at the ROM, SEL=0 results in no modulation of operator 2. D=1 loads operator 3's signal into the summing shift register, in preparation for the output.

Next, operator 2's signal is at the output and operator 1's modulation is being computed. SEL=1 causes operator 1 to be modulated by operator 2. C=1 so the summing shift register continues to hold the operator 3 value, to produce the output. As with operator 3, COM=1 so operator 1's value will be divided by 2 when it is computed.

Finally, operator 1's signal is at the output and operator 6's modulation is being computed. SEL=0 indicates no modulation of operator 6. Control signals C and D are 1 so operator 1 is added to the register (which holds operator 3's value), forming the final output.

This process repeats cyclically, interleaved with processing for the 15 other voices. This section illustrates how a complex algorithm is implemented through the modulator circuitry, directed by a few control signals from the ROM. The other algorithms are implemented in similar ways.7
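The walkthrough above can be checked with a symbolic trace of the algorithm 8 ROM row. In this sketch, values are sets of operator numbers rather than samples, so it only shows which operators end up modulating which. The function names are hypothetical, only the SEL values that algorithm 8 uses are handled, and reading the feedback register before the A bit updates it is an ordering assumption that algorithm 8 doesn't exercise.

```python
# ROM row for algorithm 8, columns for operators 6 down to 1 ("SEL/ACD/COM").
ALGORITHM_8 = "1/000/0 5/001/0 2/111/1 0/001/0 1/010/1 0/011/0"


def parse_entry(text):
    """Split 'SEL/ACD/COM' into integers (sel, a, c, d, com)."""
    sel, acd, com = text.split("/")
    a, c, d = (int(bit) for bit in acd)
    return int(sel), a, c, d, int(com)


def trace(row, passes=2):
    """Follow one voice through the six operator slots.

    Two passes are run so the feedback register is already loaded when the
    self-feedback path reads it.
    """
    entries = [parse_entry(e) for e in row.split()]
    memory = frozenset()        # buffer/sum register contents
    feedback = frozenset()      # feedback register contents
    for _ in range(passes):
        modulators = {}
        for op, (sel, a, c, d, com) in zip(range(6, 0, -1), entries):
            signal = frozenset([op])            # operator now at the output
            target = op - 1 if op > 1 else 6    # operator being computed
            fb_before = feedback
            # C holds the old contents, D brings in the current signal.
            memory = (memory if c else frozenset()) | (signal if d else frozenset())
            if a:
                feedback = signal
            if sel == 0:
                modulators[target] = frozenset()
            elif sel == 1:
                modulators[target] = signal     # raw signal in
            elif sel == 2:
                modulators[target] = memory     # sum from the adder
            elif sel == 5:
                modulators[target] = fb_before  # averaged self-feedback
            else:
                raise ValueError("SEL value not handled in this sketch")
        output = memory                         # register after operator 1's slot
    return modulators, output


modulators, output = trace(ALGORITHM_8)
```

The resulting `modulators` dictionary maps each operator to the set of operators that modulate it, and `output` holds the operators summed into the final value, which can be compared against the algorithm 8 diagram.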

The modulation circuitry

The diagram below shows the circuitry that computes the modulation and output; this functional block is in the center of the chip. The memory register (red) holds 16 values, one for each voice. To its right, the adder (blue) adds to the value in the memory register. The selector (purple) is the heart of the circuit, selecting which value is used for modulation. It is controlled by the selector decoder (orange) at the bottom, which activates a control line based on the 3-bit SEL value. At the far right, the two feedback registers (red) hold the last two feedback values for each of the 16 voices. The feedback adder sums two feedback values to obtain the average. The feedback shifter (yellow) scales the feedback value by a power of 2.

The circuitry that calculates the modulation for the algorithm.

Shift registers

The schematic below shows how one stage of the shift register is implemented. The chip uses a two-phase clock. In the first phase, clock ϕ1 goes high, turning on the first transistor. The input signal goes through the inverter, through the transistor, and the voltage is stored in the capacitor. In the second phase, clock ϕ2 goes high, turning on the second transistor. The value stored in the capacitor goes through the second inverter, through the second transistor, and to the output, where it enters the next shift register stage. Thus, in one clock cycle (ϕ1 and then ϕ2), the input bit is transferred to the output. (The circuit is similar to dynamic RAM in the sense that bits are stored in capacitors. The clock needs to cycle before the charge on the capacitor drains away and data is lost. The inverters amplify and regenerate the bit at each stage.)

Schematic of one stage of the shift register.

The diagram below shows part of the shift register circuitry as it appears on the die. The blue rectangle indicates one shift register stage. The power, ground, and clock wiring is in the metal layer, which was mostly removed in this image. Shift register stages are linked horizontally. Shift registers for separate bits are stacked vertically, with alternating rows mirrored.

Die photo showing a stage of the shift register.

The selector

The selector circuit selects one of the five potential multiplexer values, based on the SEL input. The circuit uses five pass transistors (indicated in yellow) that pass one of the 5 inputs to the driver circuit and then the output. (A sixth transistor pulls the output high if none of the inputs is selected; I've labeled this "x".) The diagram below shows one selector in the top half, and a mirror-image selector below; there are 12 selector circuits in total. The circuit is built around the six vertical select lines. One select line is activated to select a particular value. This turns on the corresponding transistors, allowing that input to flow through the transistors. The result goes through another transistor to synchronize it to the clock, and then an inverter/buffer to drive the output line. The outputs go to the sine-wave circuit, where they modulate the input to the lookup table.

Two stages of the selector.

The adder

The chip contains multiple adders. Two adders are used in the modulation computation: one to sum operators and one to average the two previous feedback values. The adders are implemented with a standard binary circuit called a full adder. A full adder takes two input bits and a carry-in bit. It adds these bits to generate a sum bit and a carry-out bit. By combining full adders, larger binary numbers can be added.

Diagram showing a full adder.

The diagram above shows a full adder stage in the chip. The circuit is built from three relatively complex gates, but if you try the various input combinations, you can see that it produces the sum and carry. (Due to the properties of NMOS circuits, it's more efficient to use a small number of complex gates rather than a larger number of simple gates such as NAND gates.)
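If you'd rather not trace the gates by hand, the logical behavior of a full adder is easy to express in Python. This sketch uses the textbook Boolean equations, not the chip's three complex NMOS gates, and `ripple_add` is a hypothetical helper showing how stages chain together.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(x, y, width):
    """Add two unsigned integers by chaining `width` full adders."""
    carry = 0
    result = 0
    for i in range(width):
        # Each stage consumes the carry produced by the stage below it.
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


total, carry_out = ripple_add(13, 7, 8)
```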

One problem with binary addition is that it can be relatively slow for carries to propagate through all the stages. (This is the binary equivalent of 99999 + 1.) The solution used in the DX7 is pipelining: an addition operation is split across multiple clock cycles, rather than being completed in a single clock cycle. This reduces the number of carries in one clock cycle. Although a particular addition takes several clock cycles, the adders are kept busy with other additions, so one addition is completed every cycle.
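A toy model shows how pipelining keeps the throughput at one addition per clock even though each addition takes several clocks. The 4-bit chunk size follows the footnotes; the 12-bit operand width and the `pipelined_add` function are assumptions to keep the example small, not the chip's actual adder.

```python
CHUNK_BITS = 4                       # chunks of about 4 bits, per the footnotes
NUM_CHUNKS = 3                       # assumed 12-bit operands for this example
MASK = (1 << CHUNK_BITS) - 1


def pipelined_add(pairs):
    """Feed one (x, y) pair per clock; return sums in the order they complete."""
    pipeline = [None] * NUM_CHUNKS   # pipeline[i] is the job working on chunk i
    results = []
    feed = list(pairs) + [None] * NUM_CHUNKS     # extra clocks to drain the pipe
    for item in feed:
        # The job leaving the last stage has had all of its chunks added.
        done = pipeline[-1]
        pipeline = [None] + pipeline[:-1]
        if done is not None:
            results.append(done["sum"] | (done["carry"] << (CHUNK_BITS * NUM_CHUNKS)))
        if item is not None:
            pipeline[0] = {"x": item[0], "y": item[1], "sum": 0, "carry": 0}
        # During this clock, every occupied stage adds only its own chunk,
        # so a carry ripples through at most CHUNK_BITS bits.
        for i, job in enumerate(pipeline):
            if job is None:
                continue
            shift = i * CHUNK_BITS
            total = ((job["x"] >> shift) & MASK) + ((job["y"] >> shift) & MASK) + job["carry"]
            job["sum"] |= (total & MASK) << shift
            job["carry"] = total >> CHUNK_BITS
    return results


pairs = [(0xFFF, 0x001), (0x123, 0x456), (0x0FF, 0x001)]
sums = pipelined_add(pairs)
```

The first pair is the hexadecimal analog of 999 + 1: the carry has to pass through every chunk, but it only crosses one chunk per clock.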

The compensation (COM) computation

In the DX7, different algorithms have different numbers of oscillators in the output, which poses a problem. An algorithm with 6 output oscillators (e.g. #32) would be six times as loud as an algorithm with 1 oscillator (e.g. #16), which would be annoying as the user changes the algorithm. To avoid this problem, the chip scales the level of output oscillators accordingly. For instance, the levels of output oscillators in algorithm #32 are scaled by 1/6 to even out the volumes. This factor is called COM (compensation) in the service manual and ADN (addition channel number) in the patent.8 To implement this scaling, the algorithm ROM holds the output count for each operator, minus 1. For example, algorithm #32 has six output oscillators, each one having a COM value of 5 (i.e. 6-1). For algorithm #1, the two output oscillators are 1 and 3: these have a COM value of 1 (i.e. 2-1). Operators that are used for modulation are not scaled, and have a COM value of 0.

Recall that the envelope scaling is accomplished by adding base-2 logarithms. The COM scaling also uses logarithms, which are subtracted to scale down the output level. A small ROM generates 6-bit logarithms for the COM values 1 through 5, corresponding to scale factors 2 through 6. The diagram below shows the COM circuitry, which is in the upper-right corner of the chip. At the left, the decoder and tiny ROM determine the logarithmic scaling factor from the number of inputs. This is added to the logarithmic envelope level that the chip receives from the envelope chip. The result goes through a few shift register stages for timing reasons.

The COM circuitry adds a compensation level to the envelope to compensate for algorithms with multiple outputs.
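To get a feel for how close the quantized logarithms come to the ideal 1/N scaling, here is a small Python check using the binary values listed in the footnotes. Reading those values as two integer bits and three fraction bits is my interpretation of the table as printed, and the helper names are hypothetical.

```python
import math

# Binary values from the footnote table, keyed by COM value.
COM_LOG2 = {
    0: "00.000",
    1: "01.000",
    2: "01.101",
    3: "10.000",
    4: "10.011",
    5: "10.101",
}


def fixed_to_float(bits):
    """Convert a binary fixed-point string such as '01.101' to a float."""
    whole, frac = bits.split(".")
    return int(whole, 2) + int(frac, 2) / (1 << len(frac))


def com_scale(com):
    """Linear gain implied by subtracting the stored base-2 logarithm."""
    return 2.0 ** -fixed_to_float(COM_LOG2[com])


for com, bits in COM_LOG2.items():
    outputs = com + 1                # COM is the number of outputs minus 1
    stored = fixed_to_float(bits)
    ideal = math.log2(outputs)
    # Summed level of `outputs` equal carriers after scaling (ideal is 1.0).
    print(com, bits, stored, round(ideal, 3), round(com_scale(com) * outputs, 3))
```

Powers of 2 are exact, while the entries for 3, 5, and 6 outputs are slightly larger than the true logarithm, so those algorithms would come out marginally quieter than exact division would give.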

Conclusion

The DX7's algorithm implementation circuitry is at the heart of the chip's sound generation. This circuitry is cleverly designed to implement 32 different algorithms at high speed with the limited hardware of the 1980s. The circuitry runs fast enough to process 16 voices sequentially, each with 6 separate oscillators, while producing outputs fast enough to produce audio signals. By taking advantage of the pipelined architecture built around shift registers, the chip processes a different oscillator during each clock cycle, a remarkable throughput. Overall, I'm impressed with the design of this chip. Its cutting-edge design was the key to the DX7's ability to provide dramatic new sounds at a low price. As a result, the DX7 defined the canonical sound of the 1980s and changed the direction of pop music.

I plan to continue investigating the DX7's circuitry, so follow me on Twitter @kenshirriff for updates. I also have an RSS feed. Also see my previous posts on the DX7: DX7 reverse-engineering, the exponential ROM, The log-sine ROM.

Thanks to Jacques Mattheij and Anthony Richardson for providing the chip and discussion.9

Notes and references

  1. Note that the underlying frequency of the oscillator stays the same during modulation, but the phase is changed. Technically the DX7 uses phase modulation (PM) rather than frequency modulation (FM). The two are closely related—phase modulation with a signal is the same as frequency modulation with the derivative of the signal—so the difference is usually ignored. 

  2. Another complication is that the chip is pipelined. It doesn't simply go through 96 clock cycles, updating one operator each cycle. Instead, the computations for an operator are spread across multiple clock cycles. The result is still that one operator calculation is completed per clock, but different parts of the circuitry are working on different operators at any particular time.

    The reason for pipelining is to handle calculations that won't fit into one clock cycle. For instance, the chip adds 22-bit numbers. Propagating a carry through all 22 adder stages would take too long for one clock cycle. Instead, addition takes place in chunks of about 4 bits. The lowest 4 bits are added in one clock cycle, the next bits in the next clock cycle, and so forth. Thus, the propagation delay during one clock cycle is substantially reduced. The circuit still completes one addition per cycle, even though any particular addition takes multiple cycles. 

  3. The diagram below from the patent shows how this is implemented. The modulation is added to the phase angle to create the index into the sine table, yielding the modulated signal. This signal is scaled by the envelope; instead of multiplying, the base-2 logarithms of both values are added. (Ignore ADN for now; I'll discuss it later.) Finally, the logs are converted back to linear values by an exponential ROM and circuit. The result is the modulated and scaled output signal. The steps in this box take exactly 16 clock cycles, which will turn out to be important. As a result, operator N's values enter the box at the same time that operator N+1's values exit the box. (Remember that operators are processed in reverse order: 6 down to 1.)

    Diagram showing the construction of an operator, from the patent.

    I'll summarize the patent's mathematical notation in case anyone reads it. The phase angle, varying with time, is ωt. kωt indicates the possible use of a frequency modifier k. The modulation function is f(ωmt), a function of the modulation frequency. The envelope, as a function of t, is A(t) for the amplitude or I(t) for the modulation index; that is, applied to an output operator or a modulating operator respectively. On the diagrams, Φ indicates the clock.

  4. When an operator provides feedback to itself (usually operator 6), the modulation uses a special path that averages the previous two values. The patent calls this an "anti-hunting" feature. I think this avoids wild oscillations from self-feedback. Suppose you have a situation where a large modulation signal produces a small output and a small modulation signal produces a large output. This would result in the signal oscillating between small and large every clock cycle, which would be unpleasant. Averaging the previous two values is essentially a low-pass filter and would prevent these wild oscillations. Also note that the self-feedback path allows the feedback level to be controlled by the FBL signal. This shifts the feedback signal, dividing it by a power of 2. 

  5. The full algorithm ROM contents are below. The format is "SEL/ FREN MREN / COM value". Note that algorithm numbers are 1 to 32, while the ROM's binary addresses are 0 to 31.

                Operator
    Algorithm   6         5         4         3         2         1
    1           1/100/0   1/000/0   1/000/1   0/001/0   1/010/1   5/011/0
    2           1/000/0   1/000/0   1/000/1   5/001/0   1/110/1   0/011/0
    3           1/100/0   1/000/1   0/001/0   1/010/0   1/010/1   5/011/0
    4           1/000/0   1/000/1   0/101/0   1/010/0   1/010/1   5/011/0
    5           1/100/2   0/001/0   1/010/2   0/011/0   1/010/2   5/011/0
    6           1/000/2   0/101/0   1/010/2   0/011/0   1/010/2   5/011/0
    7           1/100/0   0/001/0   2/011/1   0/001/0   1/010/1   5/011/0
    8           1/000/0   5/001/0   2/111/1   0/001/0   1/010/1   0/011/0
    9           1/000/0   0/001/0   2/011/1   5/001/0   1/110/1   0/011/0
    10          0/001/0   2/011/1   5/001/0   1/110/0   1/010/1   0/011/0
    11          0/101/0   2/011/1   0/001/0   1/010/0   1/010/1   5/011/0
    12          0/001/0   0/011/0   2/011/1   5/001/0   1/110/1   0/011/0
    13          0/101/0   0/011/0   2/011/1   0/001/0   1/010/1   5/011/0
    14          0/101/0   2/011/0   1/000/1   0/001/0   1/010/1   5/011/0
    15          0/001/0   2/011/0   1/000/1   5/001/0   1/110/1   0/011/0
    16          1/100/0   0/001/0   1/010/0   0/011/0   2/011/0   5/001/0
    17          1/000/0   0/001/0   1/010/0   5/011/0   2/111/0   0/001/0
    18          1/000/0   1/000/0   5/001/0   0/111/0   2/011/0   0/001/0
    19          1/100/2   4/001/2   0/011/0   1/010/0   1/010/2   5/011/0
    20          0/001/0   2/011/2   5/001/0   1/110/2   4/011/2   0/011/0
    21          1/001/3   3/001/3   5/011/0   1/110/3   4/011/3   0/011/0
    22          1/100/3   4/001/3   4/011/3   0/011/0   1/010/3   5/011/0
    23          1/100/3   4/001/3   0/011/0   1/010/3   0/011/3   5/011/0
    24          1/100/4   4/001/4   4/011/4   0/011/4   0/011/4   5/011/0
    25          1/100/4   4/001/4   0/011/4   0/011/4   0/011/4   5/011/0
    26          0/101/0   2/011/2   0/001/0   1/010/2   0/011/2   5/011/0
    27          0/001/0   2/011/2   5/001/0   1/110/2   0/011/2   0/011/0
    28          5/001/0   1/110/0   1/010/2   0/011/0   1/010/2   0/011/2
    29          1/100/3   0/001/0   1/010/3   0/011/3   0/011/3   5/011/0
    30          5/001/0   1/110/0   1/010/3   0/011/3   0/011/3   0/011/3
    31          1/100/4   0/001/4   0/011/4   0/011/4   0/011/4   5/011/0
    32          0/101/5   0/011/5   0/011/5   0/011/5   0/011/5   5/011/5

  6. The DX7/9 service manual explains the steps of algorithms 1 and 21 in detail. 

  7. Note that the algorithms are carefully designed with operator 6 on top and 1 on the bottom, so operators are modulated only by operators with a higher number. This is due to the implementation of the modulation circuitry which processes operators starting with 6 and ending with 1. The 32 algorithms make it look like almost anything is possible, but the hardware imposes several constraints that limit the possibilities. For instance, there is only one sum/delay register so you can't sum modulators and the output at the same time. You can't delay a non-feedback operator after an output takes place; for instance, algorithm 11 has 6 delayed to modulate 3, but only because there haven't been any outputs at that point. An algorithm can only have one self-feedback loop. 

  8. The logarithmic COM values are:

    COM   binary value   value
    0     00.000         log2(1)
    1     01.000         log2(2)
    2     01.101         ≈log2(3)
    3     10.000         log2(4)
    4     10.011         ≈log2(5)
    5     10.101         ≈log2(6)

    Since the computation is done with logarithms, the circuit subtracts these values (or equivalently adds the complement). This is equivalent to dividing by the number of outputs or multiplying by the reciprocal. Note that the COM input is one less than the number of outputs. Entry 0 is not explicitly stored in the ROM but results by default. If the result of the subtraction is negative, gates clamp the envelope at 0. 

  9. For more information on the DX7 internals, see DX7 Technical Analysis, DX7 Hardware, OPLx decapsulated, and the video Emulating the DX7 the hard way.

Yamaha DX7 reverse-engineering, part III: Inside the log-sine ROM

The Yamaha DX7 digital synthesizer (1983) was the classic synthesizer for 1980s pop music. It used two custom digital chips to generate sounds with FM synthesis. In this blog post, I examine the log-sine ROM that digitally produces sine waves inside one of these chips. (This blog post jumps into the details; unless you care about the sine values specifically, my previous DX7 reverse-engineering article is probably more interesting.)

I created the high-resolution die photo below by compositing over a hundred microscope photos. I removed the metal layer from the chip with acid to reveal the silicon and polysilicon wiring underneath. You can see the structure of the functional blocks and the connections between them. The colors are due to variations in thickness of the oxide layer, causing thin-film interference. With the metal layer removed, I could read out the bits from the ROM, reverse-engineer the circuitry, and determine the exact values used for sine-wave generation.

Die photo of the DX7's YM21280 Operator chip. Click this photo (or any other) for a magnified version.

Instead of the analog oscillators and filters of an analog synthesizer, the DX7 generates sounds digitally, using a technique called FM synthesis. The idea is that you start with a sine wave (the carrier signal) and perturb it with another signal (the modulating signal). The modulating signal changes the phase (and thus the frequency) of the carrier, creating complex harmonic structures like the waveform below. These signals are represented as digital values throughout the system; a digital-to-analog converter (DAC) turns the digital representation into an analog voltage for the synthesizer's output.

An example of a complex waveform created by FM synthesis. (I made a tool that lets you experiment with FM synthesis.)

The digital implementation of frequency modulation uses a lookup table that holds a digitized sine wave. By stepping an index through the table at a specific rate, you can produce a sine wave of a fixed frequency. By perturbing this index with another signal, you can produce a modulated sine wave like the one above. The DX7 implements this with a sine-wave table in ROM, an increment value that controls the frequency, and an adder that adds the increment to the table index (i.e. the phase angle) each time step. This ROM is the subject of this blog post.
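The table-stepping idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the chip's arithmetic: the table size, the integer phase units, and the names `fm_samples`, `carrier_inc`, `mod_inc`, and `mod_depth` are all assumptions made for the example.

```python
import math

TABLE_SIZE = 1024   # assumed table length for this sketch
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def fm_samples(carrier_inc, mod_inc, mod_depth, count):
    """Step two table indexes; the modulator perturbs the carrier's index."""
    carrier_phase = 0    # phase accumulators, in units of table entries
    mod_phase = 0
    samples = []
    for _ in range(count):
        mod = SINE_TABLE[mod_phase]
        # Perturb the carrier's table index by the scaled modulator value.
        index = (carrier_phase + round(mod_depth * mod)) % TABLE_SIZE
        samples.append(SINE_TABLE[index])
        # The adder: add the increment to each phase every time step.
        carrier_phase = (carrier_phase + carrier_inc) % TABLE_SIZE
        mod_phase = (mod_phase + mod_inc) % TABLE_SIZE
    return samples
```

With `mod_depth` set to 0, the output is a plain sine wave whose frequency is set by `carrier_inc`; raising `mod_depth` adds harmonics.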

The amplitude of the sine wave is controlled by an envelope, varying over time; multiplying the sine wave by the envelope level yields the output. However, fast multiplication required too much hardware in the 1980s, so the DX7 uses a mathematical shortcut: adding logarithms is equivalent to multiplying the values. The obvious problem is that computing logarithms is harder than multiplying, but the trick is to store the (negated) logarithm of the sine wave in the lookup table (below) instead of the sine wave. This provides the logarithm for free. (The other issue is that you need to perform an exponential to get the final result. I described the exponential ROM and circuit in my previous DX7 article.)

This graph shows the log-sine function over one quarter of the wave, as a 14-bit value. It's not recognizable as a sine function, but will turn into a sine wave after exponentiation.
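The logarithm shortcut can be shown with ordinary floating-point math. This is a minimal sketch of the principle only; the function name `log_domain_scale` and the use of floats (rather than the chip's fixed-point values) are assumptions for illustration.

```python
import math

def log_domain_scale(sine_value, envelope_gain):
    """Scale a positive sine sample by a gain using one addition in the log2 domain."""
    neg_log_sine = -math.log2(sine_value)      # what a log-sine table would supply
    neg_log_env = -math.log2(envelope_gain)    # envelope expressed as an attenuation
    total = neg_log_sine + neg_log_env         # the "multiplication" is a single add
    return 2.0 ** -total                       # exponential step back to a linear value

print(log_domain_scale(0.5, 0.25))   # 0.125, the same as 0.5 * 0.25
```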

The block diagram below shows the structure of the log-sine circuit, computing a 14-bit value from a 12-bit input. The circuitry is somewhat complex to fit a fast, high-accuracy calculation into a small space on the die. The implementation takes advantage of the symmetry of the sine wave so only a quarter-wave needs to be stored. The top bit is used as the sign bit, which inverts the output elsewhere to obtain the negative half of the sine wave. (This also avoids the problem of taking the log of a negative value.) The second bit implements the mirror symmetry of each sine-wave peak by inverting the bits for the second half of the peak.

Block diagram of the sine circuit. Input bits are indicated in green.
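The quarter-wave symmetry can be sketched as follows. To keep the symmetry easy to see, this example stores a plain sine quarter-wave rather than the log-sine values the chip holds, and the half-step offset on the full 12-bit phase is an assumption; the names `QUARTER` and `sine_from_quarter` are hypothetical.

```python
import math

# Quarter-wave table with a half-step offset (plain sine, for illustration only).
QUARTER = [math.sin((n + 0.5) / 1024 * math.pi / 2) for n in range(1024)]

def sine_from_quarter(phase12):
    """Rebuild a full sine wave from a quarter-wave table using a 12-bit phase."""
    sign = (phase12 >> 11) & 1      # top bit: negative half of the wave
    mirror = (phase12 >> 10) & 1    # second bit: second half of each peak
    index = phase12 & 0x3FF         # low 10 bits: position within the quarter
    if mirror:
        index ^= 0x3FF              # invert the bits to walk the table backwards
    value = QUARTER[index]
    return -value if sign else value
```

Because of the half-step offset, simply inverting the index bits gives an exact mirror image, with no sample duplicated at the peak.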

The ROM and associated logic take a 10-bit input address representing a quarter of the sine wave (angles 0 through π/2). A technique called delta encoding is used to reduce the size of the ROM. The idea of delta encoding is that if values change slowly, the difference between two values is considerably smaller than the value itself.1 Specifically, only every fourth value is explicitly stored in the ROM; this value is called an "absolute" value.3 The next three values are stored as deltas: the difference between the value and the previous absolute value.2 An adder circuit adds the absolute value to the difference value, yielding the desired log-sine value.
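Here is a minimal sketch of delta encoding applied to a log-sine table. It assumes the table is -log2(sin) scaled by 1024 with a half-step offset on the index and plain rounding, and it places the absolute value last in each group of four so that the deltas are non-negative. It does not model how the ROM packs the deltas; the function names are hypothetical.

```python
import math

def log_sine(n):
    """Assumed log-sine entry for 10-bit index n (plain rounding)."""
    angle = (n + 0.5) / 1024 * math.pi / 2
    return round(-math.log2(math.sin(angle)) * 1024)

def encode(table):
    """Split the table into groups of four: one absolute value plus three deltas."""
    groups = []
    for base in range(0, len(table), 4):
        absolute = table[base + 3]          # last (smallest) entry of the group
        deltas = [table[base + i] - absolute for i in range(3)]
        groups.append((absolute, deltas))
    return groups

def decode(groups, n):
    """Recover entry n by adding the selected delta to the group's absolute value."""
    absolute, deltas = groups[n >> 2]
    low = n & 3                             # the two lowest bits pick the delta
    delta = 0 if low == 3 else deltas[low]  # the absolute entry has a zero delta
    return absolute + delta

TABLE = [log_sine(n) for n in range(1024)]
GROUPS = encode(TABLE)
```

In this sketch, every absolute value fits in 13 bits and every delta fits in 12 bits, while the decoded values need 14 bits.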

The diagram below labels the main functional blocks of the chip. In this article, I focus on the sine circuit, highlighted in red, but I'll summarize the other blocks. The 96 phase accumulators, implemented with shift registers, are the largest block of the chip. They hold the current table index for each of the DX7's 96 oscillators. The exponential function is implemented by two identical ROMs and associated addition/shifter circuitry. Other major blocks apply the envelope, hold configuration data, compute the operators that combine oscillators, define different operator algorithms, and buffer the output values.

Die with the major functional blocks labeled. This photo shows the metal layer of the chip. (Click for a larger version.)

The ROM

The photo below shows the log-sine ROM. The ROM itself consists of a grid of transistors. At the top, decoder circuits select signal lines based on the address bits. At the right, the diagonal circuits are multiplexers, selecting particular rows of the ROM. To the right of the multiplexers, logic circuits select the delta values. I won't explain these circuits in detail since I discussed the similar circuits for the exponential ROMs in my previous article.

High-resolution image of the sine ROM. Click this image (or any other) for an enlarged image.

By examining the ROM closely, you can see the individual transistors that store bits. A transistor represents a 1, and the lack of a transistor represents a 0. Thus, the data in the ROM is created by the pattern of how the silicon is doped. I was able to read out the ROM data visually by looking at this pattern.

Closeup of the ROM.

The delta representation and the adder

The ROM itself produces 43 output bits, 13 bits for the "absolute" value and 30 bits for the three delta values. Some logic circuitry expands the ROM's 30 bits into three deltas of 12 bits (and a zero delta for the absolute value), taking advantage of some structure in the deltas. This circuitry is just to the right of the ROM and is implemented with AND-OR-INVERT gates. These gates implement 4-to-1 multiplexers, selecting the appropriate delta value based on the 2 lowest input bits.

Next, the adder circuit to the right of the ROM adds the 13-bit absolute value and the 12-bit delta value to generate the final 14-bit value. One interesting feature of the adder is it is pipelined to minimize the delay from carry propagation. I discussed the adder implementation in my previous article so I won't go into details here. The adder is immediately followed by a second adder that adds the envelope value to scale the signal level, taking advantage of the logarithmic representation.

Overall, the log-sine circuit generates 1024 14-bit values. Stored directly, this would take over 14 kilobits, but the ROM is only 5344 bits. The delta representation and ROM compression reduce the ROM size by almost 63%, important for a chip built in the 1980s when transistors were precious. By itself, the delta representation doesn't save much space: a 12-bit delta instead of a 14-bit value. But the ROM's implementation makes the deltas efficient: if a 32-bit row in the ROM is all zeroes, the row can be omitted entirely and the output defaults to 0. For the flat parts of the function, the high-order bits of the deltas are mostly zero, so much of the ROM can be omitted.
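The size comparison is simple arithmetic, sketched below. The uncompressed delta figure is my own estimate from the stated widths (one 13-bit absolute value plus three 12-bit deltas per group of four entries), not a number from the chip.

```python
ENTRIES = 1024
BITS_PER_VALUE = 14
ROM_BITS = 5344                                     # actual ROM size given above

direct_bits = ENTRIES * BITS_PER_VALUE              # storing every value in full
delta_bits = (ENTRIES // 4) * (13 + 3 * 12)         # estimate: deltas stored without compression
saving = 1 - ROM_BITS / direct_bits                 # fraction saved by the real ROM

print(direct_bits, delta_bits, f"{saving:.1%}")     # 14336 12544 62.7%
```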

Conclusion

The DX7 generates its waveforms from a digital sine wave, so producing a high-accuracy value rapidly is key to the synthesizer's performance. By examining the ROM and associated circuitry, I could obtain the exact values that the DX7 uses for the log-sine function. The ROM provides one quarter of the sine wave and the other quarters are formed by symmetry. For a 10-bit input value n, the corresponding angle is ω = (n + 0.5)/1024 × π/2 (note 4) and the output value y is -log2(sin(ω)), represented as the integer round(y×1024) (note 5).
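The formula can be turned into a short table generator. This sketch takes the quarter-wave angle as (n + 0.5)/1024 × π/2 and uses the rounding constant .5002 described in note 5; the names `log_sine_entry` and `TABLE` are hypothetical, and I have not checked the result against the ROM dump bit for bit.

```python
import math

def log_sine_entry(n):
    """Log-sine value for 10-bit index n, following the formula and note 5's rounding."""
    angle = (n + 0.5) / 1024 * math.pi / 2     # half-step offset avoids log(0)
    y = -math.log2(math.sin(angle))            # negated log of the sine
    return int(y * 1024 + 0.5002)              # scale to an integer and round

TABLE = [log_sine_entry(n) for n in range(1024)]
```

The values start above 10,000 at the beginning of the quarter-wave, where the sine is nearly zero, and fall to 0 at the peak.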

The DX7's OPS chip comes in a 64-pin ceramic package with staggered pins. This is known as a Quad Inline Package (QIP). Photo courtesy of Jacques Mattheij.

I plan to continue investigating the DX7's circuitry, so follow me on Twitter @kenshirriff for updates. I also have an RSS feed.

Thanks to Jacques Mattheij and Anthony Richardson for providing the chip and discussion.6

Notes and references

  1. A different chip, the Yamaha YMF262 (1988), was used in computer sound cards such as the Sound Blaster 16. (This chip is also known as OPL3 for FM Operator Type-L.) It uses FM synthesis, but is stripped down compared to the DX7. The chip was reverse-engineered by Matthew Gambrell and Olli Niemitalo, who decapsulated the chip and read out the ROM contents.

    The OPL3 log-sine ROM is similar to the DX7's in some ways, but is lower resolution. The OPL3 ROM is 256 samples long, rather than 1024, and holds 8-bit values, rather than 13-bit values. Both chips use delta encoding, but the OPL3 has one delta-encoded value for each absolute value, while the DX7 has three delta-encoded values. 

  2. To be precise, the three delta values are stored before the absolute value in the ROM. That is, entries 3, 7, 11, ... are absolute, instead of 0, 4, 8, ..., the expected locations. I think this is because the log-sin function is decreasing, so if you want to add the deltas (instead of subtracting), the absolute value needs to be the last of the group, not the first. 

  3. The absolute value in delta encoding is the full, explicit value. It's unrelated to the absolute value function |x|. 

  4. Note that the input to the ROM is incremented by half a bit. This avoids duplication of the 0 value of the waveform when the quarter-wave is mirrored. It also avoids computation of the undefined value log(0). 

  5. The value is rounded to an integer by computing int(y×1024 + .5002). The constant .5002 rounds the value up, with just a tiny bit more that affects a single entry. I'm not sure why the rounding is not exact; perhaps Yamaha used a lower-precision sine or logarithm, which was just enough to change one bit. (Note that the value .0002 is somewhat arbitrary; a slightly larger or smaller number will yield the same result.) 

  6. For more information on the DX7 internals, see DX7 Technical Analysis, DX7 Hardware, OPLx decapsulated, and my previous DX7 articles Reverse-engineering the Yamaha DX7 synthesizer's sound chip from die photos and The Yamaha DX7's exponential circuit.