Tiny transformer inside: Decapping an isolated power transfer chip

I saw an ad for a tiny chip1 that provides 5 volts2 of isolated power: You feed 5 volts in one side, and get 5 volts out the other side. What makes this remarkable is that the two sides can have up to 5000 volts between them. This chip contains a DC-DC converter and a tiny isolation transformer so there's no direct electrical connection from one side to the other. I was amazed that they could fit all this into a package smaller than your fingernail, so I decided to take a look inside.

I obtained a sample chip from Texas Instruments. Robert Baruch of project5474 decapped this chip for me by boiling it in sulfuric acid at 210 °C. This dissolved the epoxy package, leaving a pile of tiny components, shown below with a penny for scale. At the top are two tiny silicon dies, one for the primary circuitry and one for the secondary. Below the dies are two magnetized ferrite plates from the transformer. To the right is one of five pieces of woven glass fiber. At the bottom is a copper heat sink, partially dissolved by the decapping process.3

Components of the chip, on a penny for scale.

The chip also contained two octagonal copper coils that were the transformer windings. The photo below shows the remnants of one coil after decapping. These windings were probably copper traces on tiny printed circuit boards; the pieces of woven glass fiber are the remnants of these boards after the epoxy was dissolved. It appears that the winding consisted of multiple wires in parallel, rather than a coiled wire.

An octagonal transformer winding.

To determine how the components went together, I studied Texas Instruments patents and found a similar power isolation chip (below). Note the structure of the two dies and the coils. A key feature of this patent is that the leads are raised internally, with the dies mounted upside down. This provides better electromagnetic isolation from the circuit board.

Diagram from a Texas Instruments patent, showing the structure of a power isolation chip.

The chip is in a SOIC package, smaller than a fingernail. The mockup image below shows that the silicon dies and the transformer winding are so small that they can fit in this package.4 This power chip is about twice as thick as a standard SOIC package so it can hold the multiple layers of the transformer.

A representation of the chip's internals. This is a composite of the various pieces. The second ferrite plate would go over the transformer coils. The dies are probably upside-down in the actual chip. The chip measures 7.5mm×10.3mm and 2.7mm thick.

The secondary die and its components

The chip contains two silicon dies, one for the primary-side circuitry that receives power and one for the secondary-side circuitry that outputs power. The photo below shows the silicon die for the secondary. The metal layer on top of the chip is visible; I think there are three metal layers in total to provide the chip's wiring. The chip's silicon is not visible in this photo as it is hidden under the metal. At the top and left, bond wires are connected to pads on the die. The left half of the chip is covered with a lot more metal than the right; the left side has the analog power electronics, so it needs high-current wiring.

The secondary-side die. Click for a larger image.

Removing the metal layers5 reveals the underlying silicon (below). This shows the transistors, resistors, and capacitors that make up the chip. There's not a lot of visual similarity between the metal layer and the underlying silicon, but a few of the features match up.

The secondary-side die with the metal removed.

One interesting feature of the chip is "CMP fill". During manufacturing, the layers of the chip were polished flat with Chemical-Mechanical Polishing (CMP). However, regions without any metal wiring are softer and would be polished down too much. To prevent this, empty regions are filled in with a grid of squares, ensuring that the chip is polished to a uniform level. The fill is visible in the photo below as the tiny square boxes at a slight angle. The chip has multiple layers of metal, and each layer has its own fill at a different angle. (The angle prevents the fill from aligning with other features, minimizing stray capacitance and inductance.)

The logo on the primary die, surrounded by CMP fill. The "P" in "UCP" indicates the primary.

At the bottom of the chip, underneath the metal layers, the silicon also has CMP fill, shown below. These raised fill squares are part of the silicon and the lines between the squares are filled with material, probably polysilicon. Note that although the grid is at an angle, each square is parallel with the chip. In other words, the positions of the squares are at an angle, but not the squares themselves.

The secondary silicon die, showing CMP fill surrounding some circuitry.

The diagram below labels some components of the die. The left side has the power components connected to the transformer, while the right side has the control logic.

The chip's logic appears to be built from two blocks of standard-cell circuitry, where each logic element is a fixed design from a library, and these cells are arranged on a grid. The photo below shows a closeup of the silicon implementing this logic. Each block is an MOS transistor, wired together by the metal layers that were on top. The smallest visible features are about 700 nm wide, the wavelength of red light. (This explains why the image is fuzzy.) In comparison, cutting-edge chips are now moving to a 5 nm process, 140 times smaller.

A closeup of standard-cell circuitry.

A large area of the chip consists of capacitors, which are constructed from a metal layer over the silicon, separated by dielectric. The large square regions in the photo below are capacitors; the dielectric appears yellowish, reddish, or greenish, depending on its thickness. These capacitors are connected together by the metal layer to form larger capacitors. (The tiny square pattern between the capacitors is CMP fill, discussed earlier.) I couldn't dissolve the dielectric, so I suspect it is silicon nitride, rather than the silicon dioxide that provides most of the insulation between the die's layers.
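As a rough illustration of why the dielectric material matters for these capacitors, here is a minimal parallel-plate sketch. The plate size and dielectric thickness are hypothetical values chosen for illustration (nothing here was measured from the die), and the relative permittivities are approximate textbook figures.

```python
# Parallel-plate estimate: C = eps0 * eps_r * area / thickness.
EPS0 = 8.854e-12  # vacuum permittivity in farads per meter

def plate_capacitance(side_um, thickness_nm, eps_r):
    """Capacitance in farads of a square plate capacitor (ideal, no fringing)."""
    area_m2 = (side_um * 1e-6) ** 2
    return EPS0 * eps_r * area_m2 / (thickness_nm * 1e-9)

# Hypothetical geometry: a 100 um square plate over a 50 nm dielectric.
# Approximate relative permittivities: silicon nitride ~7.5, silicon dioxide ~3.9.
c_nitride = plate_capacitance(100, 50, 7.5)
c_oxide = plate_capacitance(100, 50, 3.9)

print(f"nitride: {c_nitride * 1e12:.1f} pF, oxide: {c_oxide * 1e12:.1f} pF")
```

Under these assumptions a nitride dielectric gives nearly twice the capacitance of oxide for the same area, and either way a single square holds only picofarads.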

The die has numerous square capacitors.

The horizontal stripes in the silicon below are resistors, formed by doping silicon to produce regions with higher resistance. The resistance is proportional to the length divided by the width, so resistors are long and thin to obtain significant resistance. By connecting the resistor stripes at the ends in a zig-zag pattern, a high-value resistor can be produced.
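The length-over-width relationship can be sketched with the usual sheet-resistance model. The sheet resistance and stripe dimensions below are hypothetical, not values measured from this die, and the sketch ignores contact and corner resistance.

```python
def stripe_resistance(sheet_ohms_per_square, length_um, width_um):
    """Resistance of one stripe: sheet resistance times the number of squares (L / W)."""
    return sheet_ohms_per_square * length_um / width_um

def zigzag_resistance(sheet_ohms_per_square, length_um, width_um, stripes):
    """Stripes joined end to end in a zig-zag are in series, so their resistances add."""
    return stripes * stripe_resistance(sheet_ohms_per_square, length_um, width_um)

# Hypothetical values: 1000 ohms per square, stripes 200 um long and 2 um wide.
one_stripe = stripe_resistance(1000, 200, 2)
eight_stripes = zigzag_resistance(1000, 200, 2, 8)
print(one_stripe, eight_stripes)
```

Doubling the width would halve the resistance, which is why the stripes are kept narrow.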

These long stripes are presumably resistors.

The photo below shows some of the transistors on the chip. The chip uses a wide variety of transistors, ranging from the large power transistor at the bottom to the collection of tiny logic transistors to the left of the "10µm" label. All the transistors are shown at the same scale, so you can see the dramatic range in sizes. (There might be diodes in here too.)

A collection of transistors from the secondary die, all displayed at the same scale for comparison.

The primary die

The photo below shows the primary-side silicon die. Some of the bond wires are attached to the chip at the top. In this photo, some of the metal layer has been removed, showing the underlying wiring. The top side of the chip has the analog power circuitry, mainly capacitors, and it is covered with a mostly-uniform layer of metal.6

The primary-side die with some of the metal removed.

The closeup below shows the primary die midway through removal of the metal and oxide layers. Note that some metal and polysilicon pieces have come loose from the die and are at random angles. This illustrates how the die has a three-dimensional structure, with multiple layers on top of each other. With the oxide removed, the structures in a layer can fall off.

A closeup of the primary die with the metal partially removed.

How the chip works

The basic idea of the chip is straightforward; it operates as an isolated DC-DC converter. The primary side of the chip converts the input voltage into pulses that are fed into the transformer. The secondary side rectifies the pulses to produce the output voltage. Because there is no electrical connection between the primary and secondary—just the transformer—the output voltage is electrically isolated. However, the details are not documented: there are many possible "topologies" for generating and rectifying the pulses, such as a flyback converter, a forward converter, or a bridge converter. Another question is how the output voltage is controlled.7

I studied various TI patents, and I think the chip uses a technique called a "phase-shifted dual-active-bridge", shown below. The primary uses four transistors configured as an H-bridge (on the left) to send positive and negative pulses to the transformer (middle). A similar H-bridge on the secondary side (right) converts the transformer's output back to DC. The reason to use an H-bridge instead of diodes on the secondary side is that by changing the timing, more or less power gets transmitted. In other words, by shifting the phase between the primary's bridge and the secondary's bridge, the voltage can be regulated. (Unlike most converters, neither the pulse frequency nor the pulse width is modified in this approach.)
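To illustrate how shifting the phase changes the transmitted power, here is a sketch using the idealized textbook equation for a single-phase-shift dual-active-bridge converter. This is not taken from the datasheet or the patent. The switching frequency, leakage inductance, and turns ratio below are hypothetical placeholders, and the model ignores losses and dead time.

```python
import math

def dab_power(v_in, v_out, phase_rad, freq_hz, leakage_h, turns_ratio=1.0):
    """Idealized power (watts) transferred by a dual-active-bridge converter.

    phase_rad is the phase shift between the primary and secondary bridges.
    turns_ratio (primary turns / secondary turns) refers v_out to the primary.
    """
    v_out_referred = v_out * turns_ratio
    return (v_in * v_out_referred * phase_rad * (math.pi - abs(phase_rad))
            / (2 * math.pi ** 2 * freq_hz * leakage_h))

# Hypothetical operating point: 5 V in, 5 V out, 8 MHz switching, 500 nH leakage.
for degrees in (0, 15, 30, 60, 90):
    p = dab_power(5.0, 5.0, math.radians(degrees), 8e6, 500e-9)
    print(f"{degrees:3d} degrees -> {p:.3f} W")
```

In this model the power is zero when the two bridges switch in phase and peaks at a 90-degree shift, so the secondary can regulate its output just by adjusting its own timing.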

Diagram from patent 10122367, Isolated phase-shifted DC to DC converter.

Each H-bridge consists of four transistors: two N-channel MOS transistors and two P-channel MOS transistors. The photo below shows six large power transistors that take up a large fraction of the secondary die. Examining their structure, I think the two on the right are N-channel MOSFETs and the other four are P-channel MOSFETs. This would yield the four transistors required for the H-bridge, with two transistors left over for another purpose.

These large power transistors are on the left side of the secondary die photo.

Using the chip

I wired up the chip on a breadboard (below) and it worked as advertised. It's an extremely easy chip to use, requiring just a couple of filter capacitors on the input and output. (While the dies contain numerous capacitors, they are much too small for filtering. External capacitors provide larger capacitances.) I put 5 volts in (lower left) and got 5 volts out (upper right), lighting an LED. When implementing power electronics, it is important to follow layout recommendations to avoid noise and oscillation. However, even though this breadboard did not satisfy any of these recommendations, the chip worked fine. I measured the output at 5 volts, with little noise.

The chip wired up on a breadboard. The chip is mounted on the breakout board in the middle, which allows it to be plugged into the breadboard.

Conclusion

When I saw a chip containing a complete DC-DC converter, I figured there must be some interesting technology inside. Decapping the chip revealed the components, including two silicon dies and tiny planar transformer windings. By studying the pieces and comparing with Texas Instruments patents, I concluded that the chip uses a phase-shifted dual-active-bridge topology for power transfer. (Interestingly, this topology is becoming popular for electric vehicle chargers, although at much higher power.8)

The dies are complex with three layers of metal and small features that can't be resolved optically. I usually examine chips that are decades older and much easier to understand, so this post has more speculation than my typical reverse-engineering. (In other words, I probably got some things wrong.) If you're familiar with modern IC components and recognize any components, please let me know.

I announce my latest blog posts on Twitter, so follow me @kenshirriff for future articles. I also have an RSS feed. Thanks to Robert Baruch for decapping this chip for me and thanks to Texas Instruments for supplying me with a free sample chip.

Notes and references

A lot of people complain about ad targeting, but in this case, the ad (below) was an exact match for my interests. This chip is the UCC12050; the datasheet is here.

Texas Instruments' ad for the power transfer chip, showing how small the chip is.

↩
The chip can output 5V, 3.3V, 5.4V, or 3.7V, selectable by a resistor. The 5.4V and 3.7V values may seem random, but the motivation is they provide an extra 0.4V, allowing the voltage to be regulated by an LDO regulator. The chip doesn't provide a lot of power, just half a watt. ↩
Because of the internal structures in the chip, there is a risk of moisture penetrating the package and accumulating inside. When soldering the chip, this moisture could vaporize, causing the chip to pop like popcorn. To avoid this possibility, the chip was packaged in a special moisture-proof bag that contained moisture indication cards. The chip has moisture sensitivity level 3, indicating it must be soldered within a week of removal from the bag. If the chip exceeds the limit, it must be baked before soldering to drive out the residual moisture.

↩
The moisture-proof bag that held the chip and the moisture indication cards.
It would be interesting to take a cross-section of this chip to see the exact internal layout, like the cross-sections done by @TubeTimeUS. ↩
To remove the layers from the chip, I alternated application of hydrochloric acid (pool acid) to dissolve the metal and application of Armour Etch to remove the silicon dioxide layer. ↩
I accidentally dropped the primary die down the drain while trying to clean it, so I don't have many pictures of the primary die. ↩
Controlling the output voltage in a DC-DC converter can be done in various ways. A common approach is to send feedback from the secondary side to the primary side through an optoisolator, allowing the primary side to adjust the voltage. In another approach, the primary side uses a separate transformer winding to monitor the voltage. Neither of these approaches seems possible with this chip, though: there's no feedback path from the secondary, but the output voltage is selected by the secondary. An inefficient approach would be to put a linear voltage regulator on the secondary side to drop the voltage to the desired value. ↩
I came across an interesting video that shows a dual-active-bridge converter for electric vehicle charging. This converter is powered directly from a 2.5-kilovolt power line, which is a bit scary. ↩

Reverse-engineering the audio amplifier chip in the Nintendo Game Boy Color

The Nintendo Game Boy Color is a handheld game console that was released in 1998. It uses an audio amplifier chip to drive the internal speaker or stereo headphones. In this blog post, I reverse-engineer this chip from die photos and explain how it works.1 It's essentially three power op-amps with some interesting circuitry inside.

Die photo of the audio amplifier chip in the Nintendo Game Boy Color. Click this (or any other image) for a larger image. Photo courtesy of John McMaster.

The photo above shows the chip's silicon die as it appears under a microscope. The white lines are the chip's metal layer, connecting the components. The silicon itself appears greenish and is underneath the metal. The black circles around the outside are the bond wire connections, where tiny wires connected the silicon die to the chip's package. Regions of the chip are treated (doped) to change the electrical properties of the silicon. The next sections explain how components are created from these different types of silicon.

NPN transistors

The amplifier chip is built from transistors known as NPN and PNP bipolar transistors, different from the low-power MOS transistors used in processors. These transistors have three connections: the emitter, the base, and the collector. The magnified photo below shows one of the transistors as it appears on the chip. The slightly different tints in the silicon indicate regions that have been doped to form N and P regions, with dark lines separating the regions. The bubbly silverish areas are the metal layer of the chip on top of the silicon—these form the wires connecting to the collector, emitter, and base.

An NPN transistor in the amplifier chip. The collector (C), emitter (E), and base (B) are labeled, along with N and P doped silicon.

Underneath the photo is a cross-section drawing illustrating how the transistor is constructed. The emitter (E) wire is connected to N+ silicon. Below that is a P layer connected to the base contact (B). And below that is an N+ layer connected (indirectly) to the collector (C). If you look at the vertical cross-section below the 'E', you can find the N-P-N layers that form the transistor.

The photo below shows one of the large output transistors used to drive the speaker. These transistors must produce a high-current output, so they are much larger than the regular transistors and have a different structure. Note the multiple interlocking "fingers" of the emitter and base, surrounded by the large collector. If you look back at the die photo, you can see two of these transistors filling the upper left part of the die.

A large, high-current NPN output transistor in the chip. The collector (C), base (B) and emitter (E) are labeled.

PNP transistors

The chip also uses PNP transistors, which have an entirely different construction, as shown in the diagram below.2 The PNP transistor has a small square emitter (P-silicon), surrounded by a square base region (N-silicon), which in turn is surrounded by the collector (P-silicon). (The emitter metal covers both the emitter and the base, but is only connected to the emitter.) These regions form a P-N-P sandwich horizontally (laterally), unlike the vertical structure of the NPN transistors. Note that although the base region physically surrounds the emitter, the metal connection to the base is further away; the base signal passes through the N and N+ regions, underneath the collector, to reach the base region.

A PNP transistor in the chip. Connections for the collector (C), emitter (E) and base (B) are labeled, along with N and P doped silicon. The base forms a ring around the emitter, and the collector forms a ring around the base.

How resistors are implemented in silicon

Resistors are an important component of analog chips. The photo below shows a long, zig-zagging resistor, connected to metal wiring at the bottom of the photo. (The resistor passes under the metal layer at several points.) The resistor is formed as a strip of P silicon. The resistance is proportional to the length of the resistor, so large-value resistors have a zig-zag shape to fit in the available space. Because resistors are relatively large and inaccurate, chip designs try to minimize the number of resistors required. Even so, an analog chip like this one requires numerous resistors.

A resistor inside the chip, along with the part number. The resistor is a zig-zagging strip of P silicon between two metal contacts. Parts of other resistors are visible at the left and right.

Capacitors

This chip has three large capacitors, one for each amplifier. The photo below shows one of the capacitors. The capacitors are simply a layer of metal over the underlying silicon, separated by a thin insulating oxide layer. In this chip, capacitors are used to ensure the stability of the amplifiers. Because they are large, the three capacitors are easy to spot in the chip die photo.

A capacitor on the chip.

The chip and the Game Boy Color

The role of the audio chip is to take the sound generated by the CPU and amplify it, either for the internal speaker or for external headphones. The photo below shows how the chip appears on the Game Boy motherboard. It also shows the speaker, headphone jack, and the volume control that adjusts the input levels to the amplifier chip.

The Game Boy Color motherboard with key components labeled. Photo from Evan-Amos.

The chip contains three audio amplifiers: one for the speaker and two for the headphones (because they have left and right channels). The design of these three amplifiers is almost identical, except the speaker amplifier uses larger transistors for more output power. The amplifiers use an op-amp, a type of amplifier that uses negative feedback to control the level of amplification. (The feedback resistors are internal to the chip, but it uses external capacitors for filtering.4)53

IC circuits: The current mirror

There are some subcircuits that are very common in analog ICs, but may seem mysterious at first. The current mirror is one of these. The idea is you start with one known current and then you can "clone" multiple copies of the current with a simple transistor circuit, the current mirror. A common use of a current mirror is to replace resistors. As explained earlier, resistors inside ICs are both inconveniently large and inaccurate. It saves space to use a current mirror instead of a resistor whenever possible. Also, the currents produced by a current mirror are nearly identical, unlike the currents produced by two resistors.

The following circuit shows how a current mirror is implemented with PNP transistors.6 A reference current "I" passes through the transistor on the left. (In this case, the current is set by the resistor.) Since all the transistors have the same emitter voltage and base voltage, they source the same current, so the currents through each transistor match the reference current on the left. In this mirror, the three transistors on the right are connected so the total output is 3I. Thus, by using multiple transistors, currents can be generated with precise ratios.
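The current-copying behavior can be sketched with a simple model of a basic bipolar mirror. The current gain (beta) and the reference current are hypothetical values, and the model ignores the Early effect and transistor mismatch. With a finite beta, the base currents of all the transistors are drawn from the reference, so each copy comes out slightly below the reference current.

```python
def mirror_outputs(i_ref, n_outputs, beta=float("inf")):
    """Collector current of each output transistor in a basic BJT current mirror.

    The reference current supplies one collector current plus the base currents
    of all n_outputs + 1 matched transistors: i_ref = i_c * (1 + (n + 1) / beta).
    """
    i_c = i_ref / (1 + (n_outputs + 1) / beta)
    return [i_c] * n_outputs

# Hypothetical values: 100 uA reference, three outputs tied together.
ideal_total = sum(mirror_outputs(100e-6, 3))
real_total = sum(mirror_outputs(100e-6, 3, beta=50))
print(f"ideal: {ideal_total * 1e6:.1f} uA, beta=50: {real_total * 1e6:.1f} uA")
```

In the ideal case the three tied outputs deliver exactly 3I; with a modest beta the total falls a few percent short, which is one reason practical mirrors sometimes add extra transistors.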

Current mirror circuit. The transistors on the right each copy the current on the left.

Six transistors form a current mirror in the chip.

The photo above shows how that current mirror is implemented on the chip with six PNP transistors. Their bases are all connected (top thin metal strip) as are their emitters (wide central strip). The leftmost transistor has its base and collector connected, so it controls the current mirror.

IC component: The differential pair

The second important circuit to understand is the differential pair, the most common two-transistor subcircuit used in analog ICs.7 The differential pair is the basis of an op-amp: it takes two voltages, computes their difference, and amplifies the result. The schematic below shows a simple differential pair. The resistor at the top provides a fixed current I, which is split between the two input transistors. If the input voltages are equal, the current will be split equally into the two branches (I1 and I2). If one of the input voltages is a bit higher than the other, the corresponding transistor will conduct more current, so one branch gets more current and the other branch gets less. The load resistors at the bottom produce an output voltage depending on the current.
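How the fixed current divides between the two branches can be sketched with the idealized textbook equation for a bipolar differential pair. The tail current and the thermal voltage are assumed values. The polarity follows the description above (the higher input gets more current); for a real pair the sign depends on the transistor type.

```python
import math

V_T = 0.02585  # thermal voltage in volts, roughly room temperature (assumed)

def diff_pair_currents(i_tail, v_diff):
    """Split the tail current between two branches of an ideal BJT differential pair.

    v_diff is input 1 minus input 2, in volts. Returns (i1, i2), with i1 + i2 == i_tail.
    """
    i1 = i_tail / (1 + math.exp(-v_diff / V_T))
    return i1, i_tail - i1

# Hypothetical 100 uA tail current with a few small input differences.
for millivolts in (0, 10, 30, 100):
    i1, i2 = diff_pair_currents(100e-6, millivolts / 1000)
    print(f"{millivolts:3d} mV -> I1 = {i1 * 1e6:5.1f} uA, I2 = {i2 * 1e6:5.1f} uA")
```

A difference of only a few tens of millivolts steers most of the current into one branch, which is why the pair works as a sensitive first amplification stage.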

Schematic of a simple differential pair circuit. The current source sends a fixed current I through the differential pair. If the two inputs are equal, the current is split equally.

To improve performance, a differential pair is implemented as shown below. A current mirror at the top provides the fixed current. The two load resistors at the bottom of the differential pair have been replaced by load transistors. The output is taken from one branch of the differential pair and fed into a transistor for more amplification. The output then goes to the amplifier's high-current output stage (not shown). A compensation capacitor stabilizes the circuit.

A differential pair as implemented in the chip.

The diagram below shows the implementation of a differential pair in silicon, corresponding to the schematic above. The circuit has three larger PNP transistors above and three smaller NPN transistors. By following the metal, it can be seen how the circuit corresponds to the schematic.

A differential pair in the headphone amp.

Layout of the chip

The diagram below shows the main functional blocks of the chip. The upper-left part of the chip has the two large driver transistors for the speaker output (one to pull the signal low and the other to pull the signal high). The remaining circuitry for the speaker amplifier includes the differential pair, current mirrors, and other circuits. The headphone amplifier consists of two nearly-identical blocks: one for the left channel and one for the right. The circuitry for the current sources and current mirrors is shared by both headphone channels. The lower-left of the chip contains digital logic to enable the speaker amp or the headphone amp, depending on whether headphones are plugged into the jack and on the state of the enable pin.

The chip with pins and key functional blocks labeled.

Zooming in on the upper-right corner shows the amplifier circuitry for one of the headphone channels. The input signal goes through the differential stage (discussed earlier) and amplification, before going to the output stage, which consists of multiple transistors. Although the speaker amp uses large output transistors, the headphone amp uses 10 regular transistors in parallel; one set to pull the output high and the second to pull the output low. Resistors are used to generate the negative feedback signals for the amplifier. Note that power and ground use much thicker metal traces to support the necessary current.

The headphone amplifier, right channel.

I created a complete schematic of the chip here. I won't explain it in detail here, since its op-amps use a standard architecture, but I'll point out some highlights.9 The headphone amplifiers and the speaker amplifier have very similar designs, but there are a few differences. Most notably, the speaker transistors are larger because the speaker requires more current: not just the output transistors, but many of the other transistors in the circuit. The current mirrors are also structured slightly differently between the headphone amplifiers and the speaker.8 Unlike many amplifier chips, this chip doesn't appear to have any protection if the output is short-circuited.

Part of the reverse-engineered schematic for the AMP-MGB chip. Click here for the full schematic.

Conclusion

This amplifier chip from 1998 has about 100 transistors and is simple enough that the circuitry can be traced out under a microscope. (In comparison, a Pentium II processor from the same time had 7.5 million transistors.) The chip illustrates important analog design functions such as the differential pair and current mirror, and how they can be combined to build an amplifier. People have reverse-engineered many Nintendo chips to help build Nintendo emulators. I don't think knowing the audio chip circuitry helps with emulation, but it's interesting to see how it is constructed.

I announce my latest blog posts on Twitter, so follow me @kenshirriff for future articles. I also have an RSS feed. My KiCad files for the schematic are on Github. Thanks to John McMaster for providing the chip photos; his page is here.

Notes and references

The audio chip is labeled AMP MGB, presumably for "amplifier, Mini-Game Boy". The part number on the 18-pin chip is IR3R53N.

The IR3R53N chip. Photo courtesy of John McMaster.

↩
On this chip, the NPN transistors and PNP transistors look superficially similar, but the PNP transistors are considerably larger. The PNP transistors can also be distinguished by the wide base ring under the square emitter metal. ↩
One interesting thing about the chip is that it has three ground pins (1, 2, and 11), and two power pins (4 and 14). By examining the chip, we can see why there are multiple pins. Most of the chip uses the pin 1 ground. The pin 2 ground is used solely for the speaker output transistor. The pin 11 ground is used by the headphone driver circuitry. The separate grounds prevent transients from the high-current output transistors from affecting the rest of the chip. For the power pins, most of the chip uses pin 4, while pin 14 feeds the various current sources. This ensures the current sources remain stable. ↩
I believe the three external filter capacitors implement a high-pass filter for each channel. ↩↩
The excerpt from the Game Boy Color Schematic below shows how the audio chip is connected. The Game Boy CPU chip provides left and right audio channels to the audio chip inputs (LIN and RIN). The chip provides a single-channel speaker output SPKOUT. It also provides two-channel headphone output: HPLOUT and HPROUT. Each channel has an external capacitor attached for filtering: SPKBC, HPLBC, and HPRBC.4 When headphones are plugged in, this signals the SW pin, causing the chip to switch from the speaker output to the headphone outputs. The SD pin allows the chip to be disabled, but is unused.

Schematic showing the audio chip's role in the Game Boy Color. From Consoles TechWiki.

On the left, the chip receives the audio inputs from the CPU, via a volume control. On the right, the chip is connected to the speaker and headphone jack. The filter capacitors are also connected on the right. The SW input is connected to a switch in the headphone jack; it is normally grounded, but disconnected when headphones are inserted into the jack. ↩
For more information about current mirrors, check Wikipedia or chapter 3 of Designing Analog Chips. ↩
According to Analysis and Design of Analog Integrated Circuits differential pairs are "perhaps the most widely used two-transistor subcircuits in monolithic analog circuits" (p214). For more information about differential pairs, see Wikipedia or chapter 4 of Designing Analog Chips. ↩
The headphone amp or speaker amp are disabled by shutting down their respective current mirrors. Some of the current mirrors remain partially powered, rather than shutting down completely. ↩
The amplifiers use a fairly complex scheme to bias and drive the two output transistors. I'll explain my understanding of it; follow along with the schematic. A standard approach is to use diodes to achieve the biasing. However, this chip uses a complex current mirror setup. Looking at the speaker amplifier circuit, transistor Q128 provides the main amplification. The current sunk by this transistor controls the output. The output pull-up transistor Q126 receives base current from current sources Q118 and Q119. This base current can instead flow through Q124 and Q128 if Q128 is conducting, shutting off Q126. At the same time, if Q128 is conducting, the current through it will be (partially) mirrored by Q122, causing current flow through Q121 to turn on pull-down output transistor Q125. To turn off Q125, this current will flow through Q123 instead. To summarize, if Q128 is conducting, Q125 turns on and the output is pulled low. If Q128 is not conducting, Q126 turns on and the output is pulled high. In between, the output will be linear. (I couldn't find references to this approach anywhere, so please let me know if you have more details about this amplifier configuration.) ↩

Inside the Am2901: AMD's 1970s bit-slice processor

You're probably familiar with modern processors made by Advanced Micro Devices. But AMD's processors go back to 1975, when AMD introduced the Am2901. This chip was a type of processor called a bit-slice processor: each chip processed just 4 bits, but multiple chips were combined to produce a larger word size. This approach was used in the 1970s and 1980s to create a 16-bit, 36-bit, or 64-bit processor (for example), when the whole processor couldn't fit on a single fast chip.1

Die photo of the Am2901 chip. This image shows the metal layers of the chip; the silicon is underneath. Around the edges of the die, tiny bond wires connect the chip to the external pins. (Click the photo for a high-res image.)

The Am2901 chip became very popular, used in diverse systems ranging from the Battlezone video game2 to the VAX-11/730 minicomputer, from the Xerox Star workstation to the F-16 fighter's Magic 372 computer.3 The fastest version of this processor, the Am2901C, used a logic family called emitter-coupled logic (ECL) for high performance. In this blog post, I open up an Am2901C chip, examine its die under a microscope, and explain the ECL circuits that made its arithmetic-logic unit work.

The bit-slice processor

You might wonder how multiple processor chips could work together to support arbitrary word lengths. The key is that a bit-slice processor is a building block, rather than a complete processor,6 and requires separate circuitry to decode instructions and control the system.4 The bit-slice processor chips performed arithmetic or logic operations on the data and contained registers, while a control chip (such as the Am2910) told the bit-slice chips what to do. Each machine instruction was broken down into smaller steps called micro-instructions which were stored in a microcode ROM. Note that the computer's instruction set was defined by the microcode, not by the Am2901, so almost any instruction set could be supported.5
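As a rough illustration of how slices combine into a wider word, here is a minimal Python sketch that chains four 4-bit adders into a 16-bit adder by passing the carry from each slice to the next. The names slice_add and add_wide are hypothetical, and the model covers only addition, not the Am2901's registers or its microcode control.

```python
def slice_add(a4, b4, carry_in):
    """Model one 4-bit slice: return (4-bit sum, carry out)."""
    total = a4 + b4 + carry_in
    return total & 0xF, total >> 4

def add_wide(a, b, slices=4):
    """Add two words by chaining 4-bit slices, lowest slice first."""
    result, carry = 0, 0
    for i in range(slices):
        nibble_a = (a >> (4 * i)) & 0xF
        nibble_b = (b >> (4 * i)) & 0xF
        nibble_sum, carry = slice_add(nibble_a, nibble_b, carry)
        result |= nibble_sum << (4 * i)
    return result, carry
```

Changing the slices argument changes the word size, which is the appeal of the bit-slice approach: the same building block serves a 16-bit or a 36-bit design.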

Bit-slice processors fell in between using a microprocessor chip and building a computer out of simple TTL chips. A processor built out of TTL chips was much faster than a microprocessor of the time, but required boards full of chips. Using a bit-slice processor kept the speed advantage, but reduced the chip count. The bit-slice processor also provided much more flexibility than a microprocessor, allowing the designer to customize the instruction set and other architectural features.

An overview of the die

The photo below shows the Am2901 die, with key functional blocks labeled.7 For this photo, I removed the metal layers so you can see the silicon and the transistors.8 The largest functional block of the chip is the register memory in the center. The chip has sixteen 4-bit registers. (If you look closely, you can see 16 columns and 4 rows in the memory array.) To the left and right of the memory block are the memory driver circuits that read and write the memory.

Die photo of the Am2901 chip with main functional blocks labeled. The circuitry around the outside largely consists of buffers to convert between the external TTL signals and the internal ECL signals.

The chip's arithmetic-logic unit (ALU) performs arithmetic operations (addition or subtraction) or logical operations (And, Or, Exclusive-or). The first section of the ALU is a large block in the lower left of the chip; it consists of four rows since it is a 4-bit ALU. The ALU also contains logic to generate the carry outputs for addition, using a fast technique called carry lookahead.9 Next, the ALU uses the carry values to generate the sum in parallel. Finally, the output circuitry processes and buffers the sum and sends it to the output pin.

The empty squares near the edge of the chip are the pads that connect the chip to the outside world. Next to the pads is the circuitry to send and receive signals. In particular, since the chip communicates with external circuits using TTL signals, but uses ECL circuitry inside, this circuitry converts between TTL and ECL voltages.

The chip has two shifters that can shift a word one bit to the left or right. The Q register is a 4-bit register built from flip flops. Finally, the reference voltage circuitry generates the precision voltage references required by the ECL logic.

How to see the die

To see what's inside a chip usually requires dissolving the plastic case with dangerous acids. However, I bought an Am2901 chip that came in a ceramic package instead of plastic. By simply tapping the chip's seam with a chisel, I popped the two halves of the chip apart, exposing the die inside. The silicon die is the small square in the center of the chip. Thin bond wires connect the pads on the die to the lead frame, which goes to the 40 external pins of the chip.

The Am2901 after separating the two halves of the ceramic package.

I used a special type of microscope called a metallurgical microscope to take high-resolution photographs of the chip. The photograph below shows the AMD logo. Above is a bond wire connected to a pad. The chip has two layers of metal wiring up the circuitry, visible to the right.

I stitched together multiple microscope photos to create the high-resolution images. I describe my process for creating die photos in more detail here. I then removed the metal layers8 and created another set of images of the silicon.

The photo below is a closeup of the silicon, showing four transistors and three resistors. Parts of the silicon are "doped" to give them different properties, and the different doping regions are visible under the microscope. This chip is built with bipolar NPN transistors, different from the MOS transistors in modern computers. The transistor on the left has the base (P-type silicon), emitter (N-type silicon), and collector (N-type silicon) labeled. The whiteish rectangles are the contacts between the silicon and the metal layer which was on top before being removed. The two transistors on the right share a single large collector. On this chip, it is common for multiple transistors to share the collector.

A closeup of the die with metal removed, showing transistors and resistors.

At the bottom are three resistors. A resistor is produced by doping the silicon to increase its resistance. Resistors on integrated circuits generally have poor accuracy. They are also relatively large; these ones are the same size as transistors, while other resistors are even larger. For these reasons, integrated circuit designs try to minimize the number of resistors.

Emitter-coupled logic

Logic circuits can be built in a wide variety of ways. Almost all computers today use a logic family called CMOS (complementary metal-oxide-semiconductor), building gates out of MOS transistors. In the minicomputer era, TTL (transistor-transistor logic) was very popular. Emitter-coupled logic (ECL) was a faster,10 but less common logic family. A disadvantage of ECL was its higher power consumption. (Circuitry in the Cray-2 supercomputer (1985) had to be immersed in Fluorinert coolant because the ECL gates gave off so much heat.)

The first versions of the Am2901 used TTL logic, but in 1979 AMD introduced a faster version, the Am2901C. The Am2901C used ECL logic internally for speed, but supported TTL voltages externally, allowing it to be easily used in TTL computers. The Am2901C, the ECL version, is the one in this blog post.

ECL is based on a differential pair, similar to the circuit inside an op-amp. The idea behind a differential pair (below) is that a fixed current flows through the circuit. If the left input is a higher voltage than the right, the left transistor will turn on and most current will flow through the left branch. Conversely, if the right input is a higher voltage than the left, the right transistor will turn on and most current will flow through the right branch. (Note that the emitters of the transistors are coupled together, thus the name emitter-coupled logic.)

A differential pair. If the left input (red) is higher, most of the current flows along the left path. Conversely, if the right input (blue) is higher, most of the current flows along the right path.
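The current steering can be approximated numerically. The sketch below uses the textbook ideal model of a bipolar differential pair, in which the split depends exponentially on the difference between the input voltages. The 26 mV thermal voltage and the function name split_current are assumptions for illustration, not values measured from this chip.

```python
import math

V_T = 0.026  # thermal voltage in volts, roughly room temperature (assumed)

def split_current(v_left, v_right, i_tail=1.0):
    """Return (left, right) branch currents for an ideal matched pair."""
    x = (v_left - v_right) / V_T
    i_left = i_tail / (1.0 + math.exp(-x))
    return i_left, i_tail - i_left
```

In this model, a difference of a couple hundred millivolts already steers almost all of the current into one branch, which is why a small voltage swing is enough to switch the gate.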

A few modifications turn the differential pair into an ECL gate. First, the voltage into one branch is fixed at a reference voltage, midway between the "0" level and the "1" level. Thus, if the input is higher than the reference voltage, it will be considered a "1", and lower will be a "0". Next, an output transistor (green) is attached to a branch to produce an output by buffering the branch's voltage. The circuit below is an inverter, since if the input is high, the current through the left resistor will pull the output low. To improve performance, the bottom resistor has been replaced with a current sink (purple), built from a transistor and a resistor.11

An ECL inverter. This is based on the differential pair with an output transistor added (green) and the bias resistor replaced with a constant-current circuit (purple). The upper-right resistor can be omitted since no output is connected to it.

A more complex ECL gate can be created by adding more inputs. In the circuit below, a second input transistor (2) has been added in parallel with transistor 1. The current will go through the resistor R1 if input A or input B is 1 (i.e. higher than the reference voltage). In this case, the output is pulled low, creating a NOR gate. Other circuit configurations can implement AND gates, XOR gates, or more complex logic circuits.12

An ECL NOR gate as implemented on the chip.
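The gate's behavior can be sketched as a comparison against the reference voltage. The voltage levels below are arbitrary illustrative values (a 0.8 V swing with the reference midway), and ecl_nor is a hypothetical name; this is a behavioral model, not a circuit simulation.

```python
V_LOW, V_HIGH = 0.0, 0.8         # illustrative logic levels in volts (assumed)
V_REF = (V_LOW + V_HIGH) / 2     # reference midway between "0" and "1"

def ecl_nor(*input_voltages):
    """Behavioral model: any input above V_REF steers current through R1."""
    current_through_r1 = any(v > V_REF for v in input_voltages)
    # Current through R1 pulls the output low; otherwise the output stays high.
    return V_LOW if current_through_r1 else V_HIGH
```

With a single input, the same function models the inverter described earlier.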

The schematic above shows a NOR gate as implemented on the chip. The photos below show the corresponding physical layout of the gate. On the left is the silicon layer of the die, showing the transistors and resistors. The photo on the right shows the metal wiring for the same part of the chip. At the top of the photo, transistors 1 and 2 receive the inputs to the gate. Each transistor has its base at the top and emitter in the middle. The transistors share a collector, the white rectangle below. The resistors R1 and R2 are the indicated rectangles of silicon. The transistors in the middle (including 3 and 4) all share a collector, connected twice to the positive voltage. (The non-numbered transistors and resistors are parts of other gates.)

A NOR gate as implemented on the Am2901 die.

Looking at the wiring on the right, the top layer provides horizontal wiring for the positive supply voltage, reference voltages, the current sink voltage V_CS, and the negative (ground) supply voltage. (Note that the supply and ground are much wider to support higher current.) Underneath this is the wiring connecting the transistors together. At the top, the inputs A and B are wired to the transistor bases. It's harder to trace out the other wiring as it is obscured by the top layer. But, for instance, you can see the connection between transistor 4, the collector of transistors 1 and 2, and R1. By studying the die photos carefully, one can determine all the wiring and reverse-engineer the chip's logic.

The Arithmetic-Logic Unit (ALU)

The arithmetic-logic unit (ALU) in the Am2901 chip performs 4-bit arithmetic or logical operations. It supports 8 different operations: addition, subtraction, and bitwise logic operations.17 (Note that it does not perform multiplication or division.)

The block diagram below shows the structure of the Am2901's ALU. First, a selector (multiplexer) selects the two inputs to the ALU from the potential sources. "D" is the value fed into the chip's data pins, typically the processor's data bus. (This data first goes through circuitry to convert the external TTL voltage levels to the ECL voltage levels used inside the chip.) "A" is the value of one of the 16 entries in the chip's register file, selected by pins A0-A3, and "B" is similar. The constant value 0 can be fed into the ALU. Finally, "Q" is the contents of the Q register (an extra register, separate from the register file). The multiple data sources give the chip a lot of flexibility.

Block diagram of the Am2901 ALU, from the datasheet. The ALU performs one of eight functions on its two 4-bit inputs: R and S. At the right are various outputs from the chip: G, P, carry out, sign, overflow, and zero test.

The two selected values (labeled R and S) are fed into the ALU, which performs the selected operation, yielding the result (F). The ALU also takes a carry-in value and produces a carry-out value (C_N+4); these allow multiple ALUs to be combined for larger words. The G and P outputs are used for carry lookahead, while the other sign, overflow, and zero outputs can be used as condition codes in a processor.

I'll give a brief explanation of the ALU circuitry, starting with the selector. The first two selector boxes below (D and A) select the ALU's first argument, while the last three (A, Q, and B) select the ALU's second argument. Each selector box implements the function Select · (Value ⊕ Invert), where Value is a potential input value, Select is 1 to select that value, and Invert is 1 to invert the value. (Since the ALU is four bits wide, four bits are selected. Each selector box is implemented with four ECL gates; see the footnote for details.13) By enabling one of the Select lines, the desired value is selected. If no Select line is enabled, the value to the ALU is 0.12 Note that the selector can also invert the input; the chip performs subtraction by adding the inverted value.

The first part of the ALU consists of four horizontal layers, one for each bit.
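The selector function and the wired-OR that combines the boxes can be modeled in a few lines. This is a minimal sketch; selector_box and wired_or are hypothetical names, and values are treated as 4-bit integers.

```python
MASK = 0xF  # four bits per value

def selector_box(select, value, invert):
    """Select AND (Value XOR Invert), applied to all four bits."""
    if not select:
        return 0                       # an unselected box outputs 0
    return (value ^ (MASK if invert else 0)) & MASK

def wired_or(*box_outputs):
    """Combine the box outputs the way tied-together outputs would."""
    result = 0
    for output in box_outputs:
        result |= output
    return result
```

Because every unselected box contributes 0, the OR of all the outputs is just the one selected (and possibly inverted) value.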

Once the two ALU inputs have been selected, the ALU computes "Propagate" (P) and "Generate" (G) bits for each pair of input bits. This is part of the carry lookahead,9 used for high-speed addition.

The photo below indicates the remaining parts of the ALU circuitry. (For variety, this die photo shows the metal layer, while the previous showed silicon.) The P and G signals from the previous circuit go to two blocks of carry computation circuitry. The lower carry block computes external P, G, and carry signals that provide carry lookahead across multiple chips; this allows fast addition for larger words.14 The upper carry block computes the carries that are used internally. The "sum" circuitry computes the sum for each bit using the carry, P, and G values. The important thing is that the sum for each bit can be computed in parallel, thanks to the carry lookahead. Finally, the output circuitry converts the internal ECL signals to TTL signals and drives the four output pins.15

The remaining ALU circuitry.

The chip uses some interesting techniques to reuse the adder hardware for its eight operations. The selector circuit described earlier can optionally complement its input. This is used for subtraction, as well as for some logic functions. To perform logic operations (instead of addition/subtraction), the carry computation is disabled. (For a logic operation, each bit position is unaffected by what happens in other bit positions.) Finally, the adder's EXCLUSIVE OR circuit is turned into AND by forcing the P signals high.16 Thus, instead of using eight different circuits for the ALU's eight operations, the chip uses a single circuit with a few carefully-chosen tweaks.17

Conclusion

The Am2901C chip is interesting because it is an example of high-speed ECL circuitry, a relatively uncommon logic family. The chip's ALU is spread across the lower half of the chip, implementing eight different functions and using carry lookahead for high performance. Although the chip is complex, it can be reverse-engineered with careful examination under a microscope.

Bit-slice processors such as the Am2901 were used in minicomputers and many other systems in the 1970s and 1980s. Eventually, though, improvements in CMOS technology permitted a fast processor to be implemented on a single chip, rendering the bit-slice processor obsolete. While the Am2901 had maybe a thousand transistors and ran at 16MHz, AMD now makes processors that have billions of transistors and run at 4GHz.

Follow me @kenshirriff for more reverse engineering. I also have an RSS feed.

Notes and References

Microprocessors on a single chip existed at the time, but they used MOS transistors that were slower than the bipolar transistors used in most minicomputers. They also generally had smaller word sizes. Eventually, CMOS processors became faster than bipolar processors; CMOS is what almost all computers now use. ↩
The Atari Battlezone documentation (p40) doesn't refer to the Am2901 explicitly, but gives it the Atari part number 137004-001 and calls it a "Transistor Array". Moreover, the schematic (p9) obfuscates the Am2901 pinout, showing 20 address pins and 8 data pins, so it looks like a ROM. (In contrast, all the 7400-series chips are described accurately.) Perhaps Atari was attempting to prevent cloning of the video games by hiding the identity of a few key chips. ↩
A popular alternative to the Am2901 in many minicomputers was the 74181 ALU chip. This provided arithmetic and logic functions, but not the registers of the Am2901. ↩
Some complications arise in bit-slice processors, since the slices aren't entirely independent. For instance, when adding two numbers, the carry from one slice needs to be passed into the next slice. Operations such as determining the sign of a number or testing if a number is zero also require the slices to cooperate. The Am2901 has outputs to support these functions. ↩
For a detailed discussion of bit-slice processors, see Introduction to designing with the Am2901. ↩
Is the Am2901 a microprocessor? In my view, the Am2901 is part of a processor and not a complete microprocessor, but it depends on your definition of a microprocessor. I've written a lot more about these definitions in The surprising story of the first microprocessors. Interestingly, the Soviet Union leaned much more towards bit-slice processors (instead of single-chip microprocessors) than the US. While "microprocessor" usually referred to a single-chip processor in the West, bit-slice and single-chip microprocessors weren't really distinguished in the Soviet Union. (According to "Microcomputing in the Soviet Union and Eastern Europe".) ↩
A full block diagram of the Am2901 is below. (Click this or any other image for a larger version.) Note that the multiplexers above the RAM and the Q register implement a 1-bit left shift or right shift; they are labeled as "shifters" on the die photo. The multiplexers above the ALU in the block diagram are physically part of the ALU circuitry on the die.

Block diagram of the Am2901, from the datasheet.

↩
To remove the metal layers from the chip, I alternated applications of Armour Etch to remove the silicon dioxide layer and hydrochloric acid (pool acid) to remove metal. ↩↩
Carry lookahead uses "Generate" and "Propagate" signals to determine if each bit position will always generate a carry or will propagate an incoming carry. For instance, if you're adding 0+0+C (where C is the carry-in), there's no way to get a carry out from that addition, regardless of what C is. On the other hand, if you're adding 1+1+C, there will always be a carry out generated, regardless of C. Finally, for 0+1+C (or 1+0+C), there will be a carry out propagated if there is a carry in. Putting this all together, for each bit position you create a G (generate) signal if both bits are 1, and a P (propagate) signal unless both bits are 0, using simple logic gates.

The formula for computing the carry depends on the bit position. For instance, consider the carry from bit 0 to bit 1. This carry will occur if P₀ is set (i.e. a carry is generated or propagated) and there is either a carry-in or a generated carry. So C₁ = P₀ AND (C_in OR G₀). Higher-order carries have more cases and are progressively more complicated. For example, consider the carry in to bit 2. First, P₁ must be set for a carry out from bit 1. As well, a carry either was generated by bit 1 or propagated from bit 0. Finally, the first carry must have come from somewhere: either carry-in, generated from bit 0 or generated from bit 1. Putting this all together produces the function used by the Am2901: C₂ = P₁ AND (G₁ OR P₀) AND (C_in OR G₀ OR G₁). Formulas for the various carries and external P, G, and carry are given in the datasheet, Figure 9. ↩↩
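The two carry formulas above can be checked against an ordinary ripple carry. The sketch below is a sanity check under the definitions in this note (P is set unless both bits are 0, G is set if both bits are 1); the function names are hypothetical, and c_in stands for the carry into bit 0.

```python
def lookahead_carries(a, b, c_in):
    """Compute C1 and C2 directly from P, G, and the carry-in."""
    p = [((a >> i) | (b >> i)) & 1 for i in range(2)]   # propagate bits
    g = [((a >> i) & (b >> i)) & 1 for i in range(2)]   # generate bits
    c1 = p[0] & (c_in | g[0])
    c2 = p[1] & (g[1] | p[0]) & (c_in | g[0] | g[1])
    return c1, c2

def ripple_carries(a, b, c_in):
    """Compute the same carries one bit at a time, for comparison."""
    carry, carries = c_in, []
    for i in range(2):
        a_i, b_i = (a >> i) & 1, (b >> i) & 1
        carry = (a_i & b_i) | (carry & (a_i | b_i))
        carries.append(carry)
    return tuple(carries)
```

Trying all 32 combinations of two 2-bit inputs and a carry-in shows that the two functions agree, so the lookahead form gives the same carries without waiting for them to ripple.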
ECL gates obtained much of their speed advantage because the transistors were not completely turned on (i.e. saturated). This allowed the transistors to switch the current path rapidly. Additionally, the difference between a "0" voltage and a "1" voltage was small (about 0.8 volts), so signals could switch between the two voltages quickly. In comparison, TTL gates typically had a difference of about 3.2 volts between a "0" and a "1", requiring more time to switch. (Signals could typically switch at about 1 volt per nanosecond, so a larger voltage swing caused nanoseconds of delay.) On the other hand, the small voltage swings of ECL made the circuits more sensitive to electrical noise. ↩
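To put numbers on the swing argument, the arithmetic is just the voltage swing divided by the slew rate. This tiny sketch uses the approximate figures quoted above; swing_time_ns is a hypothetical helper, and real switching time depends on much more than the swing.

```python
SLEW_VOLTS_PER_NS = 1.0  # approximate slew rate quoted in the note above

def swing_time_ns(swing_volts, slew=SLEW_VOLTS_PER_NS):
    """Time for a signal to traverse the given voltage swing."""
    return swing_volts / slew

ecl_ns = swing_time_ns(0.8)   # ECL swing of about 0.8 V
ttl_ns = swing_time_ns(3.2)   # TTL swing of about 3.2 V
```

Under these assumptions the TTL signal needs about four times as long as the ECL signal to cross its swing.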
The current sink at the bottom of the ECL gate provides an essentially-constant current, controlled by the input voltage V_CS. This is an improvement over a simple resistor, since the current through the resistor varies based on the voltage across it, which depends on the input voltages. The current sink circuit also saves space by using a smaller resistor. ↩
The outputs of the ALU select gates are connected together with a wired-OR. The unselected values output 0, so the value on the wire is the desired one. In this way, the circuit implements a multiplexer with minimal circuitry. ↩↩
The diagram below shows the AND-XOR circuit used in the Am2901 ALU that implements A' · (B ⊕ C). I'll briefly explain its operation. If input A is high, current flows through the leftmost transistors, pulling the output low. If B and C are both high, current through the left B and C transistors pulls the output low. If B and C are both low, current through the Vref transistors pulls the output low. If B and C are different, the current is sourced from the "+" transistors so the output remains high. The key point is that a single ECL gate can implement a complex function; in contrast, XOR is difficult with most logic families. (I find ECL logic reminiscent of 1920s-era relay logic because it switches between two paths, rather than switching on or off.)

Schematic of an ECL AND-XOR circuit. It is slightly simplified: the input voltage levels for the lower half need to be a diode drop lower than the upper inputs. I'm not sure of the purpose of the horizontal resistor.

The only reference I've found for complex ECL circuits is The VLSI Handbook chapter 38. ↩
The carry lookahead techniques can be implemented across multiple chips for fast additions larger than 4 bits. Each chip generates a Generate and Propagate signal, indicating if that chip will generate a carry or propagate a carry-in. These signals are combined by a look-ahead carry generator chip such as the Am2902. ↩
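The chip-level Generate and Propagate signals can be sketched with the textbook group formulas. This is an assumption-laden illustration: the function names are hypothetical, and the polarity and exact gate structure of the real outputs are given in the datasheet and are not modeled here.

```python
def slice_pg(a4, b4):
    """Return (group propagate, group generate) for one 4-bit slice."""
    p = [((a4 >> i) | (b4 >> i)) & 1 for i in range(4)]
    g = [((a4 >> i) & (b4 >> i)) & 1 for i in range(4)]
    group_p = p[0] & p[1] & p[2] & p[3]
    group_g = (g[3]
               | (p[3] & g[2])
               | (p[3] & p[2] & g[1])
               | (p[3] & p[2] & p[1] & g[0]))
    return group_p, group_g

def slice_carry_out(a4, b4, carry_in):
    """Carry out of the slice, computed from the group signals alone."""
    group_p, group_g = slice_pg(a4, b4)
    return group_g | (group_p & carry_in)
```

Since the group signals depend only on the slice's own inputs, every slice can produce them at the same time, and an external lookahead generator can then derive each slice's carry-in without waiting for a ripple.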
The output circuitry also includes multiplexers; the chip can either output the ALU result or the A register value. ↩
The chip uses the P and G values to generate the sum of inputs R and S with carry-in C. The sum is R ⊕ S ⊕ C, computed as ((P' ∨ G) ⊕ C)', where P = R∨S and G = R•S. If P is forced to 1, (P' ∨ G) reduces to G, which is R•S. Thus, by changing P, the same circuit can be used to compute the AND of the inputs R and S. ↩
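The role of the (P' ∨ G) term can be confirmed with a truth table. The sketch below is a check of the Boolean algebra only, with hypothetical function names: with the normal P, the term equals the complement of R ⊕ S, and with P forced to 1 it collapses to R•S.

```python
def pg_term(r, s, force_p=False):
    """Evaluate (P' OR G) for single bits, optionally forcing P to 1."""
    p = 1 if force_p else (r | s)
    g = r & s
    return (1 - p) | g

def truth_table(force_p=False):
    """Return the term for inputs (R, S) = 00, 01, 10, 11."""
    return [pg_term(r, s, force_p) for r in (0, 1) for s in (0, 1)]
```

Here truth_table() gives the exclusive-NOR pattern, while truth_table(force_p=True) gives the AND pattern, matching the reduction described in this note.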

The table below shows the eight operations that the ALU can compute. Three of the instruction bits fed into the chip are used to select the operation: I₅, I₄, and I₃. The "Function" column in the table shows the function as documented, while the "Computation" column shows how each bit of the function is computed internally. First, note that the operations all boil down to EXCLUSIVE OR (⊕) or AND (∧). Addition is performed by bitwise EXCLUSIVE OR of the two arguments and the carry bits. Subtraction is performed by complementing an argument and then adding. For example, adding the complement of R (R') is the same as subtracting R. Bit I₃ complements R, while bit I₄ complements S. Note that the EXCLUSIVE OR operations (EXOR and EXNOR) use the same circuitry as addition, but carry computation is blocked. The AND operation is performed by blocking the G signal. Finally, OR is computed using De Morgan's law, which shows that R' ∧ S' = (R ∨ S)'. The point of this is that the Am2901 doesn't need separate circuitry for addition, subtraction, AND, OR, and EXCLUSIVE OR, but reuses most of the circuitry.

Mnemonic	I₅	I₄	I₃	Function	Computation
ADD	0	0	0	R Plus S	R ⊕ S ⊕ Carry
SUBR	0	0	1	S Minus R	R' ⊕ S ⊕ Carry
SUBS	0	1	0	R Minus S	R ⊕ S' ⊕ Carry
OR	0	1	1	R OR S	(R' ∧ S') ⊕ 1
AND	1	0	0	R AND S	R ∧ S
NOTRS	1	0	1	R' AND S	R' ∧ S
EXOR	1	1	0	R EX OR S	R ⊕ S' ⊕ 1
EXNOR	1	1	1	R EX NOR S	R' ⊕ S' ⊕ 1
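The "Computation" column can be turned into a small behavioral model and compared against the "Function" column. This is a sketch, not the chip's gate-level logic: alu and bit_carries are hypothetical names, and supplying a carry-in of 1 for the subtractions is the usual two's-complement convention, assumed here.

```python
MASK = 0xF  # the ALU is 4 bits wide

def bit_carries(a, b, carry_in):
    """Return a 4-bit vector holding the carry into each bit position."""
    carry, vector = carry_in, 0
    for i in range(4):
        vector |= carry << i
        a_i, b_i = (a >> i) & 1, (b >> i) & 1
        carry = (a_i & b_i) | (carry & (a_i | b_i))
    return vector

def alu(i5, i4, i3, r, s, carry_in=0):
    """Compute the 4-bit result using the table's Computation column."""
    r_in = (r ^ MASK) if i3 else r       # bit I3 complements R
    s_in = (s ^ MASK) if i4 else s       # bit I4 complements S
    if not i5 and not (i4 and i3):       # ADD, SUBR, SUBS: XOR with carries
        return r_in ^ s_in ^ bit_carries(r_in, s_in, carry_in)
    if not i5:                           # OR: (R' AND S') XOR 1
        return (r_in & s_in) ^ MASK
    if not i4:                           # AND, NOTRS: plain AND
        return r_in & s_in
    return r_in ^ s_in ^ MASK            # EXOR, EXNOR: XOR, carries blocked
```

For example, alu(0, 0, 1, r, s, carry_in=1) yields S minus R modulo 16, and alu(0, 1, 1, r, s) yields R OR S even though the model contains no OR of the two operands.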

↩↩

Ken Shirriff's blog
