Improvements to the Xerox Alto Mandelbrot drop runtime from 1 hour to 9 minutes

Last week I wrote a Mandelbrot set program for the Xerox Alto, which took an hour to generate the fractal. The point of this project was to learn how to use the Alto's bitmapped display, not make the fastest Mandelbrot set, so I wasn't concerned that this 1970s computer took so long to run. Even so, readers had detailed suggestions form performance improvements, so I figured I should test out these ideas. The results were much better than I expected, dropping the execution time from 1 hour to 9 minutes.

The Mandelbrot set, generated by a Xerox Alto computer.

The Mandelbrot set, generated by a Xerox Alto computer.

The Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced high-resolution bitmapped displays, the GUI, Ethernet and laser printers to the world, among other things. For my program I used BCPL, the primary Alto programming language and a precursor to C. In the photo above, the Alto computer is in the lower cabinet. The black box is the 2.5 megabyte disk drive. The Alto's unusual portrait display and an early optical mouse are on top.

Easy optimization: mirror the Mandelbrot set

The Mandelbrot set is a famous fractal, generated by a simple algorithm. Each point on the plane represents a complex number c. You repeatedly iterate the complex function f(z) = z^2 + c. If the value diverges to infinity, the point is outside the set. Otherwise, the point is inside the set and the pixel is set to black.

Since the Mandelbrot set is symmetrical around the X axis, a simple optimization is to draw both halves at the same time, cutting the time in half. (This only helps if you're drawing the whole set; this doesn't help if you're zoomed in.) I implemented this, cutting the time down to about 30 minutes. The image below shows the mirrored appearance midway through computation.

Drawing both halves of the Mandelbrot set simultaneously doubles the performance.

Drawing both halves of the Mandelbrot set simultaneously doubles the performance.

Improving the code

Embedded programmer Julian Skidmore had detailed comments on how to speed up my original code. I tried his changes and the time dropped from 30 to 24 minutes. Some of his changes were straightforward - calculating the pixel address in memory incrementally instead of using multiplication, and simplifying the loop counting. But one of his changes illustrates how primitive the Alto's instruction set is.

Quick background: my Mandelbrot program is implemented in BCPL, a programming language that was a precursor to C. The program is compiled to Nova machine code, the instruction set used by the popular Data General Nova 16-bit minicomputer. The Alto implements the Nova instruction set through microcode.

Since the Xerox Alto doesn't support floating point, I used 16-bit fixed point arithmetic: 4 bits to the left of the decimal point and 12 bits to the right. After multiplying two fixed point numbers with integer multiplication, the 32-bit result must be divided by 2^12 to restore the decimal point location. Usually if you're dividing by a power of 2, it's faster to do a bit shift. That's what I originally did, in the code below. (In BCPL, % is the OR operation, not modulo. ! is array indexing.)

let x2sp = (x2!0 lshift 4) % (x2!1 rshift 12)

Unfortunately this turns out to be inefficient for a couple reasons. Modern processors usually have a barrel shifter, so you can efficiently shift a word by as many bits as you want. The Alto's instruction set, however, only shifts by one bit. So to right shift by 12 bits, the compiled code calls an assembly subroutine (RSH) that does 12 separate shift instructions, much slower than the single instruction I expected. The second problem is the instruction set (surprisingly) doesn't have a bitwise-OR instruction, so the OR operation is implemented in another subroutine (IOR).1 I took Julz's suggestion and used the MulDiv assembly-language function to multiply two numbers and divide by 4096 instead of shifting. It's still not fast (since the Alto doesn't have hardware multiply and divide), but at least it reduces the number of instructions executed.

Shutting off the display

One way to speed up the Alto is to shut off the display.2 I tried this and improved the time from 24 minutes to 9 minutes, a remarkable improvement. Why does turning off the display make such a big difference?

One of the unusual design features of the Alto is that it performed many tasks in software that are usually performed in hardware, giving the Alto more flexibility. (As Alan Kay put it, "Hardware is just software crystallized early.") For instance, the CPU is responsible for copying data between memory and the disk or Ethernet interface. The CPU also periodically scans memory to refresh the dynamic RAM. These tasks are implemented in microcode, and the hardware switches between tasks as necessary, preempting low priority tasks to perform higher-priority tasks. Executing a user program has the lowest priority and runs only when there's nothing more important to be done.

All this task management was done in hardware, not by the operating system. The Xerox Alto doesn't use a microprocessor chip, but instead has a CPU built out of three boards of simple TTL chips. The board below shows one of the CPU boards, the control board that implements the microcode tasks and controls what the CPU is doing. It has PROMs to hold the microcode, 64-bit RAM chips (yes, just 64 bits) to remember what each task is doing, and more chips to determine which task has the highest priority.

Control board for the Xerox Alto. Part of the CPU, this board holds microcode and handles microcode tasks.

Control board for the Xerox Alto. Part of the CPU, this board holds microcode and handles microcode tasks.

The task that affects the Mandelbrot program is the display task: to display pixels on the screen, the CPU must move the pixels for each scan line from RAM to the display board, 30 times a second, over and over. During this time, the CPU can't run any program instructions, so there's a large performance impact just from displaying pixels on the screen. Thus, not using the display causes the program to run much, much faster. I still set the Mandelbrot pixels in memory though, so when the program is done, I update the display pointer causing the set to immediately appear on the display. Thus, the Mandelbrot set still appears on screen; you just don't see it as it gets drawn.

Microcode: the final frontier

The hardest way to optimize performance on the Alto is to write your own microcode. The Alto includes special microcode RAM, letting you extend the instruction set with new instructions. This feature was used by programs that required optimized graphics such as the Bravo text editor and the Draw program. Games such as Missile Command and Pinball also used microcode for better performance. Writing the Mandelbrot set code in microcode would undoubtedly improve performance. However, Alto microcode is pretty crazy, so I'm not going to try a microcode Mandelbrot.

Conclusion

After writing a Mandelbrot program for the Xerox Alto, I received many suggestions for performance improvements. By implementing these suggestions, the time to generate the Mandelbrot set dropped from one hour to 9 minutes, a remarkable speedup. The biggest speedup came from turning off the display during computation; just putting static pixels on the screen uses up a huge amount of the processing power. My improved Mandelbrot code is on github.

My goal with the Mandelbrot was to learn how to use the Alto's high-resolution display from a BCPL program. Using what I learned with the Mandelbrot, I wrote a program to display images; an example is below.3

The Xerox Alto displaying an image of the Xerox Alto displaying...

The Xerox Alto displaying an image of the Xerox Alto displaying...

Notes and references

  1. The Alto has an AND machine instruction but not an OR instruction, so the OR operation is performed by complementing an argument, ANDing, and complement-adding the complement. I.e. ab plus b. 

  2. Strictly speaking, I left the display on; it just wasn't displaying anything. The Alto uses a complex display system with a display list pointing to arbitrarily-sized blocks of pixels. (For instance, when displaying text, each block is a separate text line. Scrolling the screen just involves updating the list pointers, not moving actual data.) Thus, I could set the display list to NULL while rendering the Mandelbrot into memory. Then when the Mandelbrot was done, I simply updated the display list to make the image appear. 

  3. The recursive picture-in-picture effect is known as the Droste effect. After making this picture, I learned that generating the Droste effect on old computers is apparently a thing

One-hour Mandelbrot: Creating a fractal on the vintage Xerox Alto

I wrote a short program to generate the Mandelbrot set on the Xerox Alto, a groundbreaking minicomputer from the 1970s. The program, in the obsolete BCPL language, ran very slowly—taking almost exactly an hour—but the result below shows off the Alto's monochrome bitmapped display. (Bitmapped displays were a rarity at the time because memory was so expensive.)

The Xerox Alto took an hour to generate the Mandelbrot set.

The Xerox Alto took an hour to generate the Mandelbrot set.

The Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced the GUI, Ethernet and laser printers to the world, among other things. In the photo above, the Alto computer itself is in the lower cabinet. The Diablo disk drive (with the 1970s orange stripe) uses a removable 14 inch disk pack that stores 2.5 megabytes of data. (A bunch of disk packs are visible behind the Alto.) The Alto's display is bitmapped with 606x808 pixels in an unusual portrait orientation, and the optical mouse is next to the display.

Last year Y Combinator received an Alto from computer visionary Alan Kay and I'm helping restore the system, along with Marc Verdiell, Luca Severini and Carl Claunch. My full set of Alto posts is here and Marc's videos are here. I haven't posted an update for a while, but now I can write new programs and download them to the Alto using the Living Computer Museum's Alto file system implementation and gateway to the Alto's 3Mb Ethernet. I decided to start with the Mandelbrot set to take advantage of the Alto's high resolution display.

Marc's latest video shows the Mandelbrot programming running on the Alto.

The Mandelbrot program

The Mandelbrot set algorithm is fairly simple. Each point on the plane represents a complex number c. You repeatedly iterate the complex function f(z) = z^2 + c. If the value diverges to infinity, the point is outside the set. Otherwise, the point is inside the set and the pixel is set to black. Setting the pixel is tricky because the Alto doesn't have a graphics API; you need to determine which bit in memory to set.4

Since the Xerox Alto doesn't support floating point1, I needed a way to represent the numbers with its 16-bit word. I use fixed point arithmetic: 4 bits to the left of the decimal point and 12 bits to the right.2 For instance, the number 1.25 is represented in 16 bits as 1.25*2^12 = 0x1400. These fixed point numbers can be added with standard integer addition. After multiplying two fixed point numbers with integer multiplication, the 32-bit result must be divided by 2^12 (i.e. shifted right by 12) to restore the decimal point location.3

The code (above) is written in BCPL, the main language used on the Alto. BCPL is a precursor to C and many features of C are clearly visible in BCPL: everything from lvalues and rvalues to the ternary operator. You can think of BCPL as C without types; the only BCPL types are 16-bit words along with C-like structs, unions and bitfields. BCPL may look unfamiliar at first, but the code above should be clear if you consider the following syntax differences with C:

  • Blocks are indicated with [ and ] instead of { and }.
  • Indexing is with a!1 instead of a[1].
  • And, Or, and Shift bit operations are &, %, and lshift/rshift.
  • Variable definitions use let.
  • Arrays are defined with vec.

More information on BCPL is in the BCPL Reference Manual and my earlier article on using BCPL with the Alto.

The Xerox Alto, a few minutes into generation of the Mandelbrot set.

The Xerox Alto, a few minutes into generation of the Mandelbrot set.

Why is the Alto so slow?

Running the Mandelbrot set illustrates the amazing improvement in computer speed since the Alto was created in 1973 and the huge changes in computer architecture. On a modern computer, a Javascript program can generate the Mandelbrot set in a fraction of a second, while the Alto took an hour. The first factor is the Alto's slow clock speed of 5.88 MHz, hundreds of times slower than a modern processor. In addition, the Alto doesn't execute machine instructions directly, but uses a relatively inefficient microcode emulator that takes many cycles to perform one machine instruction.

The ALU board from the Xerox Alto. The Alto doesn't use a microprocessor chip, but a CPU built from three boards of integrated circuits.

The ALU board from the Xerox Alto. The Alto doesn't use a microprocessor chip, but a CPU built from three boards of integrated circuits.

Unlike modern computers, the Alto doesn't use a microprocessor chip, but instead has a CPU built from three boards full of simple TTL chips. The photo above shows the arithmetic-logic unit (ALU) board, which uses four 4-bit 74181 ALU chips to perform addition, subtraction and logic operations. You can also see the CPU's registers on this board. The Alto doesn't include a hardware multiplier, but must perform multiplication by repeated shifts and adds. Thus, the Alto performs especially poorly on the Mandelbrot set, which is essentially repeated multiplications.

Conclusion

The Mandelbrot set was a quick program to try out the Alto's graphics. Next I'll try some more complex projects on the Alto. If you want to run my code, it's on Github; you can run it on the ContrAlto simulator if you don't have an Alto available.

If you're interested in retrocomputing fractals, I also generated a Mandelbrot on a 50 year old IBM 1401 mainframe The 1401 generated the Mandelbrot set in 12 minutes—not because it's a faster machine than the Alto, but because the resolution on the line printer was very very low.

Mandelbrot generated on the IBM 1401 mainframe.

Mandelbrot generated on the IBM 1401 mainframe.

Notes and references

  1. There is a floating point library (source) for the Alto. I decided not use use it since the integer Mandelbrot was already very slow. But using floating point would make sense if you wanted to zoom in on the Mandelbrot. 

  2. Fixed-point arithmetic is a common trick for fast Mandelbrot calculation. 

  3. To multiply two 16-bit numbers efficiently, I use the double precision MulFull function (written in Nova assembler) in PressML.asm, part of the Computer History Museum's archived Alto software. 

  4. The hardest part of generating the Alto Mandelbrot was figuring out how to configure the display memory and update it correctly. The details on how the display works are in chapter 4 of the Xerox Alto Hardware Manual. To summarize, the display contents are defined by a linked list of display control blocks (DCBs), which define a rectangular region of pixels on the display. A microcode task reads 16 words of pixels from memory at a time and writes them to the display board, which shifts the pixels out to the monitor. Thus, as each scanline is being written to the CRT, the CPU is busily reading the pixels for that line from memory and feeding them to the display, another reason why the Alto is slow.

    The Alto's Smalltalk environment has a simple graphics API, but we don't have Smalltalk running yet. 

Reverse engineering the 76477 "Space Invaders" sound effect chip from die photos

Remember the old video game Space Invaders? Some of its sound effects were provided by a chip called the 76477 Complex Sound Generation chip. While the sound effects1 produced by this 1978 chip seem primitive today, it was used in many video games, pinball games. But what's inside this chip and how does it work internally? By reverse-engineering the chip from die photos, we can find out. (Photos courtesy of Sean Riddle.) In this article, I explain how the analog circuits of this chip works and show how the hundreds of transistors on the silicon die form the circuits of this complex chip.

The 76477 chip combines several functional blocks to produce a variety of sound effects. A voltage-controlled oscillator (VCO) produces a signal whose frequency depends on the control voltage. A "super low frequency" SLF oscillator generates a triangle wave. Feeding this into the VCO generates a varying pitch, useful for bird chirps, sirens, or the warbling sound of the UFO in Space Invaders. A "one-shot" produces a pulse of a fixed length to control the length of the sound. An envelope generator makes the sound more realistic by ramping its amplitude (volume) up at the start and down at the end. A digital white noise generator can be used for drums, gunshots, explosions and other similar sound effects. Finally a digital mixer combines these signals and feeds them to the output amplifier.

The diagram below indicates the functional blocks on the 76477 die. Looking under a microscope, you can see the circuitry that makes up the chip. The yellowish lines are metal traces that connect the circuits of the die. The reddish and greenish regions are the silicon of the chip, forming transistors and resistors. The black blobs around the edges of the chip show where tiny bond wires connected the die to the integrated circuit pins. Analog circuits are outlined in purple, while digital circuits are in cyan. The 76477 is primarily analog—most control signals are analog, the chip has no digital registers, and most sounds are generated from analog circuits—but about a third of the chip's area is digital logic.2

Functionality blocks inside the 76477 sound chip, indicated on the die. Die photo courtesy of Sean Riddle.

Functionality blocks inside the 76477 sound chip, indicated on the die. Die photo courtesy of Sean Riddle.

The block diagram below shows the chip's functional elements and can be compared to the die photo above. The chip is primarily controlled by resistors (red pins), capacitors (cyan pins) and voltages (violet pins). This made the chip difficult to control with a microprocessor, and more useful for hardwired sounds.

Block diagram of the 76477 sound chip, from the datasheet. Resistor inputs: red, capacitor inputs: cyan, voltage inputs: violet.

Block diagram of the 76477 sound chip, from the datasheet. Resistor inputs: red, capacitor inputs: cyan, voltage inputs: violet.

The remainder of this article will dive into the internals of the 76477 chip. First I'll show how transistors and resistors are built on an integrated circuit. Next I'll explain two key analog building blocks: the current mirror and the comparator. Finally, I'll show the reverse-engineered circuits for the 76477's analog functional modules and discuss how they operate. (I'll describe the chip's digital logic in a future article.)

Integrated circuit transistors and resistors

A bipolar integrated circuit such as the 76477 is built from two types of transistors: NPN and PNP. The diagram below shows two transistors on the 76477 die, with the emitter, base and collector labeled. The N-doped silicon appears reddish, while P-doped silicon appears green. Metal lines (yellowish) on top of the silicon connect the circuits, with outlines visible where metal is connected to the silicon layer. The transistor on the left is an NPN transistor. Internally, the transistor is built vertically, with the emitter on top, the base forming a thin layer beneath the emitter, and the collector region underneath. The transistor on the right is a PNP transistor. The collector forms a ring surrounding the central emitter. 3

Two transistors as they appear on the die of the 76477, showing the Emitter, Base, and Collector.

Two transistors as they appear on the die of the 76477, showing the Emitter, Base, and Collector.

Resistors are an important component of analog circuits. On a silicon chip, they can be formed from a long, narrow region of doped silicon with higher resistance. On an IC, resistors take a lot of space and are inaccurate, so they are generally avoided where possible. The die image below shows three resistors, which appear red in the photo. They are connected to the metal wiring at the contact points marked with blue arrows.

Three resistors (red) on the die of the 76477 chip. The ends of the resistors are connected to the metal layer at contact points marked in blue.

Three resistors (red) on the die of the 76477 chip. The ends of the resistors are connected to the metal layer at contact points marked in blue.

If a metal wire needs to cross another metal wire, the signal can use the silicon layer to pass under the wire. Two of these cross-unders are shown below. The silicon (green) is doped to be lower resistance than in the case of a resistor. Cross-unders are higher resistance than metal wiring, so they are only used when necessary.

A relatively low resistance silicon "wire" (green) passes under two metal wires.

A relatively low resistance silicon "wire" (green) passes under two metal wires.

By carefully examining the die photo, you can pick out the transistors and resistors and determine how they are connected. From this, you can reverse-engineer the chip's circuits.

The current mirror

A key component of most analog circuits is the current mirror, and the 76477 is no exception, containing many current mirrors. A current mirror takes one reference current and "clones" it, generating a current that matches the reference current. Either of the symbols below can be used to indicate a current mirror or current source.

Schematic symbols for a current source.

Schematic symbols for a current source.

The following circuit shows the circuit for a current mirror with two current source outputs. A reference current passes through the transistor on the right. (In this case, the current is set by the resistor.) Since all the transistors have the same emitter voltage and base voltage, they source the same current, so the currents on the left match the reference current on the right.4

Current mirror circuit. The currents on the right copy the reference current on the left.

Current mirror circuit. The currents on the left copy the reference current on the right.

The die segment below shows four PNP current mirror transistors providing a dozen current outputs. Each of the three pinwheel-shaped transistors has four collectors surrounding the central emitter, allowing it to produce four matched current outputs. The lower left transistor is a standard PNP transistor with a ring-shaped collector. The large green rectangle in the center is the shared base connection for the transistors.

Four transistors from current mirrors in the 76477 sound chip. Three of them have four collectors surrounding the emitter, giving them a "pinwheel" appearance.

Four transistors from current mirrors in the 76477 sound chip. Three of them have four collectors surrounding the emitter, giving them a "pinwheel" appearance.

Current mirrors are commonly used to generate bias currents instead of pull-up resistors. Since resistors inside ICs are both inconveniently large and inaccurate, a current mirror is used when possible. If you look at the die image at the beginning of the article, note the large die area dedicated to current mirrors for bias current generation. 5

Comparators

Another key building block of the 76477 is the comparator, comparing two voltages and determining which one is higher. The heart of the comparator is a differential pair, a two-transistor circuit. If both inputs are equal, the transistors will conduct equally and the current will be split equally along both branches. But if one input is lower, that transistor will conduct more, switching most of the current into that branch.

The schematic below shows a typical comparator in the 76477. If the positive input is higher than the negative input, the comparator outputs a 1. Otherwise it outputs a 0. Transistors 3 and 4 form the differential pair. The current to them is supplied by a current mirror above them, and most of the current will be directed to the side with the lower input. Transistors 1 and 2 buffer the inputs (using emitter followers) and are biased by more current mirrors. Transistors 5 and 6 form another current mirror, used as an active load to double the circuit's amplification. Finally, transistors 7 and 8 form an inverter, generating a digital output from the comparator.

Schematic of comparator circuit in 76477 sound chip, slightly simplified.

Schematic of comparator circuit in 76477 sound chip, slightly simplified.

The die image below shows one of the comparators used in the 76477 with the transistors labeled to match the schematic above. Note that transistors pairs 1 and 2, 3 and 4, and 5 and 6 have similar layouts to give them matched characteristics, improving the balance of the comparator. The base and collector of transistor 5 are connected together for the current mirror. The current sources and resistors are on another part of the die, not shown below.

Die image of the 76477 sound chip showing a comparator used in the one-shot circuit.

Die image of the 76477 sound chip showing a comparator used in the one-shot circuit.

The one-shot

The one-shot is a simple circuit that generates one pulse of a set width , triggered when the chip's inhibit signal drops low. This pulse controls the duration of the sound. For instance, a short pulse of noise can be used for a gunshot sound, while a longer noise could be an explosion.

Schematic of one-shot circuit inside the 76477 sound chip.

Schematic of one-shot circuit inside the 76477 sound chip.

The one-shot charges an external capacitor via an external resistor and current mirror. The resistor sets the reference current for the current mirror, and the mirror feeds this current into the capacitor. The advantage of using this charging circuit rather than a simple R-C circuit is that the charging current remains constant, rather than decreasing as the capacitor charges.7 This "charging trick" is used several times in the 76477.

When the capacitor's voltage reaches the comparator's limit level (2.6V), the comparator output goes low and the pulse ends. Thus, the faster the capacitor charges, the shorter the pulse. Digital logic circuitry (not shown) resets the one-shot by discharging the capacitor at the end of the pulse and holds it low until the next pulse is triggered via the inhibit pin.

The super-low frequency oscillator

The next functional block of the 76477 that I'll examine is the super-low frequency (SLF) oscillator. It generates a triangle wave that can control the voltage-controlled oscillator (VCO) to generate warbling sounds by ramping the pitch up and down. The frequency is controlled by an external resistor and an external capacitor. Like the one-shot, a current mirror provides a fixed charging current. However, the SLF oscillator uses a second current mirror to discharge the capacitor at the same rate, generating the triangle wave output.

Building a current mirror from NPN transistors creates a current mirror that sinks current instead of sourcing current, as in the lower current mirror. This current mirror also uses another trick: by using a transistor with two emitters, the current mirror doubles the output current; this is indicated on the schematic with two arrows in the current mirror circle. When the lower current mirror is disabled by the transistor, the capacitor is charged by the upper current mirror, with a current (I) set by the resistor, similar to the one-shot circuit. But with the lower current mirror enabled, the lower current mirror sinks current 2I. Since the upper current mirror is still supplying I to the capacitor, the net current (-I) discharges the capacitor. Thus, by combining two current mirrors, the capacitor can either be charged or discharged with the same current I. This trick will appear again in the VCO.

Schematic of SLF inside the 76477 sound chip.

Schematic of SLF inside the 76477 sound chip.

The final piece of the SLF oscillator is the comparator. The + input is set to an upper limit of 2.46 volts, so the comparator will output 1 as the capacitor charges. When the capacitor reaches the limit voltage, the comparator output drops to 0. The first effect of this is to enable the lower mirror, so the capacitor starts discharging. The second effect is to pull the comparator input low (0.36V) through the hysteresis circuit.6 This keeps the comparator output low until the capacitor has discharged. Thus the circuit "remembers" if it is charging or discharging, without using a flip flop. The square wave output is used by the mixer and the triangle wave output is used by the VCO.

Zooming in on the die of the 76477 sound chip shows the circuitry for the SLF oscillator.

Zooming in on the die of the 76477 sound chip shows the circuitry for the SLF oscillator.

The diagram above shows how the SLF circuit looks on the die. Note the three transistors for the upper current mirror. Below the capacitor pin is a transistor for the lower current mirror with a doubled emitter; this causes the mirror's output current to be doubled. On the left are resistors forming the hysteresis circuit. The comparator circuit is underneath it. The Vcc power trace has been colored red and the ground trace has been colored blue.

The voltage-controlled oscillator (VCO)

The voltage-controlled oscillator generates a pitch that depends on its voltage input, as shown below. The circuit for the VCO has a lot in common with the SLF: it creates a triangle wave by charging and discharging an external capacitor. The main difference is that it charges until the capacitor reaches the control voltage (rather than a fixed voltage). Thus, the voltage input controls the pitch: with a higher control voltage, the capacitor takes longer to charge so the frequency is lower. This control voltage can be provided either from the SLF or an external pin. The "VCO select" pin selects which control voltage to use.

The triangle wave from the SLF oscillator can control the frequency of the Voltage Controlled Oscillator (VCO). From the 76477 datasheet.

The triangle wave from the SLF oscillator can control the frequency of the Voltage Controlled Oscillator (VCO). From the 76477 datasheet.

The VCO's output is a digital square wave that is active during the charging half of the VCO's internal triangle wave. Another pin controls the output's duty cycle, the fraction of the time it is high.8 The triangle wave is compared to the duty cycle control voltage to determine when the output switches on and off. A lower control voltage results in a shorter duty cycle while a higher control voltage results in a longer duty cycle (up to 50%). The digital logic combines the two outputs to yield the final output. The VCO has two comparators in parallel; the VCO select input enables one of them.

Schematic of VCO inside the 76477 sound chip.

Schematic of VCO inside the 76477 sound chip.

Envelope generation

The 76477 provides envelope generation, so the output can smoothly ramp up at the start of a sound and ramp down at the end, making it more realistic9 The diagram below (from the datasheet) shows the linear attack and decay applied to a sound waveform. (The sound below illustrates the random pulses produced by the white noise generator.)

A sound waveform with attack and decay applied.

A sound waveform with attack and decay applied.

The schematic below shows the circuit for envelope generation. As with the other circuits, a capacitor is charged and discharged using current mirrors, but two separate resistors are used so the charge (attack) and discharge (decay) rates can be different. The attack signal (from the digital logic) causes the envelope capacitor to charge through a current mirror at a rate controlled by an external attack resistor. The decay signal (simply the complement of the attack signal) causes the capacitor to discharge, controlled by the external decay resistor. Discharge uses a second current mirror operating as a current sink, but unlike earlier circuits it doesn't double the current. The inhibit signal rapidly discharges the capacitor, resetting the envelope.

Schematic of envelope generator inside the 76477 sound chip.

Schematic of envelope generator inside the 76477 sound chip.

The output circuit

The 76477's output circuit uses four separate current mirrors. The varying reference current for the first current mirror is generated from the envelope voltage and the external amplitude resistor. Unlike the other control resistors, this resistor has a varying voltage applied so it produces a varying reference current. This current controls the output's overall amplitude.

Schematic of output circuit inside the 76477 sound chip.

Schematic of output circuit inside the 76477 sound chip.

The amplitude reference current goes into the second current mirror (lower left). The inhibit signal blocks this current mirror, which is how the inhibit signal blocks the chip's output. A third current mirror (upper right) generates two output currents referenced from the current sunk by the second current mirror. The final current mirror is enabled and disabled by the output signal from the mixer. Thus, the current to the op amp alternates between positive and negative, with magnitude depending on the envelope and the control resistor.

The output op amp drives triple-Darlington emitter-follower output transistors. These transistors are not particularly large, so the 76477 has limited output power. An external feedback resistor to the op amp controls the output's amplification. The die photo below shows part of the output circuit. A capacitor helps stabilize the output so it doesn't oscillate.

Part of the output circuit for the 76477 sound chip.

Part of the output circuit for the 76477 sound chip.

Conclusions

The 76477 is a complex integrated circuit with hundreds of transistors, but by examining the die the operation of the chip can be reverse engineered. The chip uses interesting techniques to generate sounds by combining oscillators, a noise generator and other functional blocks. Outside of analog chips, current mirrors are fairly obscure but the 76477 makes heavy use of current mirrors, with multiple current mirrors in almost every functional unit, driving comparators, generating bias currents, and providing uniform charge/discharge currents.

The chip had several disadvantages that led to its replacement by more advanced chips. The biggest inconvenience is that most of the 76477's parameters are controlled by resistors and capacitors, rather than digitally, making it hard to control the chip with a microprocessor. A second disadvantage of the chip is the sounds were largely digital square waves which gave the sounds a harsh quality rather than a "warm" analog sound. Finally, it was difficult to produce accurate pitches with the VCO, making the chip less useful for music synthesis. For these reasons, digitally-controlled chips such as the AY-3-8910 (1978) surpassed the 76477 in popularity.

I announce my latest blog posts on Twitter, so follow me at kenshirriff. I also have an RSS feed.

Thanks to Sean Riddle for the die photos. I'll end with his die photo of the 76477 after dissolving the metal layer in acid, making it easier to see the resistors and transistors.

Die photo of the 76477 sound chip. The metal layer has been dissolved with acid to reveal the silicon. Colors are enhanced. Photo courtesy of Sean Riddle.

Die photo of the 76477 sound chip. The metal layer has been dissolved with acid to reveal the silicon. Colors are enhanced. Photo courtesy of Sean Riddle.

Notes and references

  1. The Space Invaders schematics show that the video game used seven different circuits to create its different sounds. The 76477 generated the "UFO" sound, while other sounds (saucer hit, explosion, missile, invader hit, etc.) were mostly generated by collections of op amps. 

  2. The 76477's digital circuitry was built with Integrated Injection Logic (I2L), which was developed in the 1970s with the promise of building fast bipolar logic with the VLSI density of MOS logic. Spoiler: I2L lost out to CMOS, the technology used in microprocessors today. For the 76477, the most useful feature of I2L is that it used the same manufacturing process as analog bipolar transistors, allowing the analog and digital circuitry to be combined on one chip. In the 76477, digital logic is used for digitally selecting and combining audio signals, generating white noise with a nonlinear feedback shift register, and control functions. 

  3. Some PNP transistors have a different structure; a PNP transistor with the collector grounded lacks the collector ring, using the grounded substrate as the collector. This makes it more similar to an NPN transistor both in appearance and construction. For details on the internal structure of bipolar IC transistors, see my earlier articles on the 555 and 741 chips. In brief, the NPN transistor is vertical in cross-section with the emitter on top and collector on the bottom. Most of the PNP transistors are lateral (i.e. horizontal) with the emitter on the inside and the collector on the outside, and the base in between (somewhat distant from the base connection). 

  4. Many of the 76477's current mirrors are more complex, with a Darlington pair of transistors between the reference transistors's base and collector. This compensates for base currents and makes the mirror more accurate. See this paper for an explanation. 

  5. For bias generation, you might wonder what provides the initial current to the current mirrors. The answer is an internal resistor, along with several transistors to keep the current stable. The circuit is similar to the self-biased current source described here

  6. Note that the hysteresis circuit used by the SLF is not providing feedback like with an op amp. It provides a comparison voltage of 2.46V while charging and 0.36V while discharging. The SLF comparator's output is open collector, so it can only pull the circuit low. 

  7. If you use a simple resistor-capacitor circuit instead of a current mirror, the capacitor charges more slowly as its voltage increases, resulting in an exponential charging curve. By using a current mirror, the current remains constant so the capacitor charges at a constant rate. The result is a linear triangle wave. 

  8. The datasheet calls the VCO's duty cycle control a "pitch control". This is wrong, since the frequency is unaffected. 

  9. Note that the envelope is the only analog part of the sound; until the envelope ramps are applied, the output is a digital square wave. 

1950's tax preparation: plugboard programming with an IBM 403 Accounting Machine

Long before computers existed, businesses used electromechanical accounting machines for data processing. These one-ton accounting machines were "programmed" through wiring on a plugboard control panel, allowing them to generate complex business reports from records stored on punched cards. Even though they lacked electronics and used spinning mechanical wheels to add up data, these machines could process more than two cards a second.

This plugboard for an IBM 403 implements tax deduction computation.
Board courtesy of Carl Claunch.

This plugboard for an IBM 403 implements tax deduction computation. Board courtesy of Carl Claunch.

In honor of April 151, I examine a plugboard that was used for tax preparation in the 1950s9 and explain the forgotten art of plugboard programming, showing how a tangle of wiring implemented a data processing algorithm. By mounting the plugboard on an accounting machine, a particular data processing task could be performed. Although the plugboard looks like spaghetti code made physical, tracing out the connections shows its function: it computed deductions by summing records across multiple fields, printed a report with subtotals and totals, and punched a smaller card deck with the subtotals.

Overview of punched card data processing

Punched cards were a key part of data processing from 1890 until the 1970s, used for accounting, inventory, payroll and many other tasks. Typically, each 80-column punched card held one record, with data stored in fixed fields on the card. The example below shows an example card with columns divided into fields such as date, vendor number, order number and amount. An accounting machine would process these cards: totaling the amounts, and generating a report with subtotals by account and department, as shown below.

Example of a punched card holding a 'unit record', and a report generated from these cards. The accounting machine can group records based on a field to produce subtotals, intermediate totals, and totals. From Manual of Operation.

Example of a punched card holding a 'unit record', and a report generated from these cards. The accounting machine can group records based on a field to produce subtotals, intermediate totals, and totals. From Manual of Operation.

Punched-card data processing was invented by Herman Hollerith for the 1890 US census, which used a simple tabulating machine that counted records indicated by holes in the cards.2 These machines steadily accumulated features, becoming complex "accounting machines" that could generate business reports.6 These machines became popular with businesses and by 1944, IBM had 10,000 tabulating and accounting machines in the field.3 In July 1948, IBM introduced the 402 Accounting Machine, which used the plugboard I'm examining. The 402 (and the similar 4035) were feature-rich machines that had 16 counters, multiple levels of subtotals, vertical spacing control to support forms, comparisons and conditional operations, and leading zero elimination.

IBM 403 accounting machine, with Type 82 card sorter at right.4 These machines are on display at the Computer History Museum.

IBM 403 accounting machine, with Type 82 card sorter at right.4 These machines are on display at the Computer History Museum.

The surprising thing about this history is that businesses were performing data processing with punched cards decades before the first computers, using machinery that was entirely electro-mechanical, not even using vacuum tubes. This equipment was built from components such as wire brushes to read holes in punch cards, relays to control the circuits, and mechanical counter wheels to add values. Even though these systems were technologically primitive, they revolutionized business data processing and paved the way for electronic business computers such as the popular IBM 1401.

Plugboard programming

The accounting machines were programmed by wiring up a plugboard for a specific task. Since each application used cards with fields in different positions, accounting machines needed a way to define each field. Different reports would be formatted with values in different locations on the page. Applications would need to total and subtotal different values. Before stored-program computing existed, a technique was needed to easily customize the system for a particular application. The result was wiring on control panel plugboards.

Closeup of the plugboard for an IBM 403. The accounting machine is "programmed" by plugging in wires to form connections.

Closeup of the plugboard for an IBM 403. The accounting machine is "programmed" by plugging in wires to form connections.

The photo above shows a closeup of the plugboard. The plugboard has a grid of holes (which are called hubs), with their functions labeled. By inserting a wire into the board, two hubs are connected, causing the accounting machine to perform a particular operation. The collection of wires specifies the operations that are performed on each card.

The back of the plugboard for an IBM 403 accounting machine.

The back of the plugboard for an IBM 403 accounting machine.

When a wire is inserted into the plugboard, the jack on the end of the wire sticks out the back of the plugboard, as shown above. When the plugboard is mounted in the accounting machine (below), these jacks make contact with a grid of connectors on the accounting machine, completing the desired circuits. (Note the "setup change" switches above the plugboard; these switches will be relevant later.)

A plugboard inserted into the side of an IBM 403 at the Computer History Museum. Note the control switches above the plugboard. These can be used to change what the plugboard does.

A plugboard inserted into the side of an IBM 403 at the Computer History Museum. Note the control switches above the plugboard. These can be used to change what the plugboard does.

Since the plugboard is removable, companies could easily switch plugboards to perform different tasks. (Rewiring a plugboard for each function would be much too time-consuming.) As a consequence, companies might have shelves full of plugboards for all the operations they performed; with plugboards, the "software" takes up considerable physical space. The photo below shows one company's collection of plugboards to perform different tasks.

Shelves full of plugboards for the IBM 402, courtesy of IBM 1401 restoration team.

Shelves full of plugboards for the IBM 402, courtesy of IBM 1401 restoration team.

The tax program

I closely examined the wiring of the tax plugboard to determine what it does. The first step was to trace out each wire to draw a schematic wiring diagram (below) that shows all the connections on the plugboard. If you compare the diagram with the plugboard photo at the start of the article, you can see that it shows the same wiring, but in a much easier to follow format.

A wiring diagram for an IBM 403 plugboard to compute tax deductions. (Click for full size.)

A wiring diagram for an IBM 403 plugboard to compute tax deductions. (Click for full size.)

I found that the program wired into the board reads cards and computes subtotals and totals from the cards. In more detail, each card has seven fields that are read. The first field is an identifier, and all cards with the same identifier are totaled together to give totals for each of five fields. My hypothesis is that this field is an employee id, and each card corresponds to one pay period.7 Summing the records for each employee id gives the employee's total deductions (or year-to-date deductions). The totaled five fields could be payroll deductions such as federal income tax, state tax, social security tax, Medicare tax and retirement contributions. After reading the cards for an employee, the accounting machine punches a new summary card with the employee's total deductions prints a line on the report. The per-employee totals are then summed together to give overall totals at the end.

Here's how the plugboard works, step by step. When an 80-column card is read, each digit is available in one of the reading hubs, labeled 1 through 80. By putting a wire in a hub, the digit is transmitted to another part of the machine. For instance, suppose there is a 6-digit number punched into columns 28 to 33 of the card and we want to total these numbers. This is done by connecting a wire from reading column 28 to the upper digit of the counter, a wire from column 29 to the second digit of the counter, and so forth, for 6 wires in total.

The wires transferring the field to counter 6C are the six red wires in the photo below. The 80 card columns are available in the two rows of hubs below the label "Third reading". The inputs to the counters are the four rows of hubs below the "Counter entry" labels. Other fields are wired to counters similarly.

The six red wires connect six columns read from the card (right) to the entry of counter 6C (left).

The six red wires connect six columns read from the card (right) to the entry of counter 6C (left).

Trying to figure out the wiring from the photo is difficult, so plugboard wiring is typically indicated in a diagram. The diagram below shows the wiring between the columns read (right) and the counter 6C (left). The six wires are compressed into one line on the diagram, using IBM's style of representing plugboards. The horizontal bars connected by a line indicate six parallel wires.

A diagram representing the connection between the card read (right) and the counter (left).

A diagram representing the connection between the card read (right) and the counter (left).

To print a total, a counter "exit" is wired to the desired printer columns. On the plugboard, the printer columns are labeled print entries: 43 "alphamerical print entry" positions that can print alphabetical or numerical characters, followed by 45 "numerical print entry" positions that only print numbers. The diagram below shows four wires from counter 4C to print columns 1 through 4 (yellow), and six wires from counter 6C (red) to print columns 35 through 40.

Wiring a counter to a "print exit" causes the counter value to be printed.

Wiring a counter to a "print exit" causes the counter value to be printed.

The accounting machine contains 16 decimal counters in all. Four of them are 8-digit counters, named 8A, 8B, 8C and 8D. Four are 6-digit counters (6A to 6D), four are 4-digit counters (4A to 4D), and four are 2-digit counters (2A to 2D). In addition, two counters can be joined together to form a larger counter. There are also connections between counters for subtotals. For instance, counter 8A accumulates a per-employee subtotal. These subtotals are added to counter 8B to form the final total.

Another important operation is to compare two cards to see if they have the same id (and should be counted together) or if they have different ids (so a subtotal should be printed and the counters reset). A comparison is done by wiring two fields to the two "comparing entry" rows. If the fields are different, the "comparing exit" will trigger a signal. Since we want to compare each card with the next card, we get one field from the "second reading" and one from the "third reading"; the card we are processing will be at the third reading stage while the card behind it will be at the second reading stage. Finally, the comparison output is wired to the "program start (minor)" hub. This causes the accounting machine to start an additional cycle to print the subtotals (i.e. minor totals) and reset the counters. (There are also "intermediate" and "major" program start hubs, which provide two additional levels of totals.)

Columns 1-4 of the cards are compared to determine if subtotals should be printed.

Columns 1-4 of the cards are compared to determine if subtotals should be printed.

On the diagram above, columns 1-4 from the second reading and from the third reading are wired to the comparing entry hubs. The four corresponding comparing exit hubs are wired together (gray) and connected to the minor (MI) program start hub (yellow wire to PRG START in upper right). The closeup of the plugboard below shows the wiring on the plugboard.

Columns 1-4 of the cards are compared to determine if subtotals should be printed.

Columns 1-4 of the cards are compared to determine if subtotals should be printed.

Another interesting feature of the plugboard is conditional behavior, using "selectors". Connections can be switched based on a different signal, allowing behavior to change based on a comparison, or a panel switch. This plugboard changes behavior based on the "setup change 1" panel switch, one of the switches on the accounting machine above the panel. (You can think of this as the plugboard version of command-line options.) According to the label on the plugboard (below), this switch selects "year to date". On the board, this switch enables processing of one field, as well as switching between the constant 2 and 5 for addition to counter 2B. (The reason for this constant is a mystery to me.)

The label on the plugboard shows it computes tax deductions.9 "S/P" is presumably "summary punch". The setup 1 switch selects "year to date".

The label on the plugboard shows it computes tax deductions.9 "S/P" is presumably "summary punch". The setup 1 switch selects "year to date".

The wiring on the right side of the plugboard controls the counter behavior, such as accumulating subtotals versus final totals. It also wires some of the counters together to form larger counters. For instance, counters 2C and 4D are combined to form a single 6-digit counter. 8 I won't explain the counter control wiring here; the manuals15 explain how it works.

"Summary punching" is another interesting feature of the accounting machine. This lets you take a large file of cards and punch a smaller summary file. For the tax plugboard, one summary card is punched for each employee, with the totals for that employee. Thus, a card file with one record for each employee's pay period is reduced to a much smaller file with one card for the employee's yearly totals. This smaller card file can then be used for further processing.

IBM 403 accounting machine connected to a 519 summary punch. Courtesy Columbia University Computing History.

IBM 403 accounting machine connected to a 519 summary punch. Courtesy Columbia University Computing History.

Summary punching is accomplished by connecting a summary punch machine (above right) to the accounting machine (left) through a thick cable. A hub on the plugboard is wired to enable summary punching, and another hub is wired to control when to punch a card. For the tax plugboard, a summary card is punched for each minor total with the wiring below. A separate plugboard on the summary punch machine controlled which columns were punched on the summary card.

Summary punch wiring on the IBM 403 plugboard.  The summary punch control pickup (SP Control PU on the left) is wired to punch a summary card on a minor total.  The summary punch switch (SP.SW) hubs are connected by the gray wire (lower left).

Summary punch wiring on the IBM 403 plugboard. The summary punch control pickup (SP Control PU on the left) is wired to punch a summary card on a minor total. The summary punch switch (SP.SW) hubs are connected by the gray wire (lower left).

Inside the 403 Accounting Machine

Its amazing how much functionality these accounting machines could provide without the benefit of electronics, purely through clever electromechanical systems. Inside the accounting machine is a maze of motors, rotating shafts, cams and clutches, making it seem more like a car than a computer—it even contained an oil pump! With all these mechanical parts a 403 accounting machine weighed over a ton (2515 pounds / 1143 kg).

Inside an IBM 403 Accounting Machine, front view. From the 402/403 Field Engineering Manual, fig. 5.

Inside an IBM 403 Accounting Machine, front view. From the 402/403 Field Engineering Manual, fig. 5.

On the plugboard, a wire is used to route a column of the card. How does a character on the card get sent across this wire? How does a counter perform addition? And how does the result get printed? The accounting machines use clever mechanisms, closely tied to the structure of a punched card, to perform these operations.

In modern terms, a character is encoded serially over a wire, by a single pulse whose timing depends on the position of the hole. These pulses start and stop the counters used to add values. These pulses also control the timing of the typebars that print the result. How these pulses are generated and how they electromechanically control the system will be described more below.

The 403's timing is based off the rotating shafts that drive the machine, rather than clock time. Each revolution of the shaft corresponds to a "card cycle", the reading and processing of one card. The fundamental timing unit is a rotation of 18°: this is the time between reading successive card holes, moving a typebar by one character, and rotation of a counter by one count. At 150 cards per minute, these values work out to approximately 400 milliseconds per card and 20 milliseconds per 18° step, remarkably fast for mechanical operations.

Reading cards

To understand the accounting machine, one must first understand how punched cards hold data. Punched cards hold 80 characters of data; each character is represented by the hole pattern in a column. The card below shows how numbers and the alphabet are punched; each character is printed at the top of the card with the corresponding punches in the column below. A digit is simply represented by a hole in the corresponding row, 0 through 9. (Note that numbers are stored in decimal, not binary.) To support alphanumeric data, two "zone" rows were added above the digit rows.10 A letter is represented by putting two holes in a column: a zone punch and a digit punch.11

An 80-column IBM punched card. Each column encodes a character (printed at top) by punching holes in the column. For a digit, a hole is punched in the row with the same number. A letter is encoded by adding a "zone punch" in one of the top three rows.

An 80-column IBM punched card. Each column encodes a character (printed at top) by punching holes in the column. For a digit, a hole is punched in the row with the same number. A letter is encoded by adding a "zone punch" in one of the top three rows.

You might expect the accounting machine to read cards a column at a time, so one character gets processed at a time. But instead, cards are read "sideways", starting at the bottom. All 80 columns are read in parallel, one row at a time, starting with row 9 and ending with row 0 and then the zone rows. The accounting machine uses sets of 80 wire brushes to read a card, one for each column. If there is a hole, the brush makes contact with the energized metal roller underneath the card, completing a circuit and generating a pulse. Thus, each column will have a pulse corresponding to its hole, with the 9 pulse first, followed by 8 and so forth, ending with 0. Thus, each character is encoded serially, and each plugboard wire carries one of these serial signals, but all columns are processed in parallel.

Printing

Typebars in an IBM 402 accounting machine. Courtesy Columbia University Computing History.

Typebars in an IBM 402 accounting machine. Courtesy Columbia University Computing History.

The accounting machine's printing mechanism consists of 88 typebars;12 each vertical bar holds all the characters that can be printed. The typebars move vertically to line up the proper characters and then hammers13 hit the typebars into an inked ribbon to print the selected characters. Thus, the characters in a line of text are printed simultaneously.

The wires in the plugboard control what gets printed by stopping each rising typebar at the right time to select the desired character. The motion of the typebars is carefully timed to match the reading of a card, so the "3" row (for instance) of a card is read at the same time that the "3" on the typebar moves into position. If the brush's hub is wired to a column's print hub, this signal energizes a print magnet, releasing a "stop pawl" which meshes with a tooth on the typebar, stopping it with the "3" character in position to print. If a "2" is read instead, the brush reads the hole one time unit later; the typebar will have risen one more position, causing a "2" to be printed.

The printing mechanism consists of a complex arrangement of mechanical parts: cams, pawls, slides, springs and clutches, in combination with electromagnets to activate these parts at the right time. The mechanism can print 100 lines per minute, so the parts are flying around rapidly and require exact timing. The typebars move one position for every 18° rotation of the driveshaft, keeping them synchronized with card reading.

Counters

The heart of the accounting machine is the electromechanical counters that sum the values. Each digit in a counter is represented by a wheel that rotates to perform addition. The position of the wheel indicates the digit. For instance, to add 27 to a counter, the tens digit wheel is rotated two positions and the unit wheel is rotated seven positions. Thus, to add the value in a card field, the wheels must rotate an amount corresponding to the number punched in the card. The wheel starts rotating when a hole is read, rotates one position as each additional row is read, and stops reading at row 0. Since row 9 is read first and row 0 is last, the result is the counter rotates the number of positions indicated by the hole.

An electromechanical counter from the IBM 403 accounting machine performs addition on two digits by rotating the counter wheels.

An electromechanical counter from the IBM 403 accounting machine performs addition on two digits by rotating the counter wheels.

The photo above shows a two-digit counter unit. The counter wheels are at the left. The start and stop coils cause the counter to start and stop rotating at the correct times by activating lever arms that control a clutch under the wheel. Carry is implemented by cams underneath the wheel that close electrical contacts. On the back of the board are the electrical contacts that read out the value stored in the counter; these are wired to the connector on the right.

Diagram of the electromechanical counter, indicating the key components. From the IBM 403 Field Manual.

Diagram of the electromechanical counter, indicating the key components. From the IBM 403 Field Manual.

The plugboard specifies which card columns are added to which counter digits. To add a field's value to a counter, a column's read brush is wired to the counter through the plugboard, so the card controls how much the counter rotates. This signal activates the counter's start coil, engaging the counter's clutch and starting the counter's rotation. At the 0 position, the stop coil disengages the clutch, stopping the counter. For instance, if the brush read a 7 from the card, the counter will rotate through seven positions before stopping, adding 7 to its value. If the brush read a 1, the counter will rotate by just one position. The reason this works is the synchronization between card movement and counter rotation; an 18° rotation corresponds to the card moving by one row as well as one count on the counter wheel. (A counter wheel has 20 positions spaced 18° apart. Counting by 10 rotates the wheel halfway.) Subtraction is performed by adding the complement.14

A carry from one position to the next is handled by a complex mechanism. You might expect that when one wheel rolls over from 9 to 0, it increments the higher wheel like an odometer, but that would be slow for multi-digit counters. (Keep in mind that the counters can add 150 numbers per minute, so they are spinning rapidly.) Instead, the counters use a mechanism similar to carry lookahead. If a wheel is at 9, an electrical contact closes, allowing a lower-order carry to be passed through to the higher wheel. If a wheel passes from 9 to 0, it closes a different electrical contact, generating a carry. After the "regular" addition, any necessary carries are generated in parallel and added in a single time step. Thus, something like 99999999+1 isn't delayed by a ripple carry; instead all digits get a carry in parallel.

Relays

The accounting machine is controlled by hundreds of relays, electromechanical switches that provide all the "control logic" for the system. The photo below shows the back of the accounting machine, filled with relays; more relays are on the end panel. To generate timing signals signals for the relays, switches were opened and closed by cams attached to the rotating shaft. Thus, everything in the system is timed from the rotating shaft.

The IBM 403 accounting machine is controlled by hundreds of relays, many of which are mounted in the back of the machine. Photo from the Field Engineering Manual, fig 81.

The IBM 403 accounting machine is controlled by hundreds of relays, many of which are mounted in the back of the machine. Photo from the Field Engineering Manual, fig 81.

Conclusions

Punched card data processing is almost forgotten now, but it ruled data processing for almost a century. Even before computers existed, businesses used punched cards and tabulators for accounting. IBM's accounting machines were able to perform surprisingly complex tasks even though they were built from electromechanical components that seem primitive today. Accounting machines and plugboard programming remained popular into the 1960s, when businesses gradually switched to stored-program business computers such as the IBM 1401. Even so, IBM continued marketing accounting machines until 1976. Incredibly, one company in Texas still uses an IBM 402 accounting machine for their accounting today (details), illustrating the amazing longevity of punched card technology.

I announce my latest blog posts on Twitter, so follow me at kenshirriff. I also have an RSS feed.

Thanks to Carl Claunch (one of the Xerox Alto restoration co-conspirators) for providing the plugboard and documentation.

Notes and references

  1. April 15 is traditionally tax day in the US, but if you don't have your taxes done yet, don't panic. In 2017, US tax day is April 18 due to the weekend and holiday. 

  2. To support addition, tabulators used a module called an "accumulator" with rotating dials to hold decimal numbers. This accumulator gave its name to the accumulator register still used in microprocessors today. For example, Intel's x86 processors have a register called EAX, the EXtended Accumulator. 

  3. The history of IBM's tabulating machines is described in IBM's Early Computers. Also see Columbia University's computing timeline

  4. Another part of the unit record system is the card sorter, which rapidly sorts cards on a field, putting them in the proper order to be processed by an accounting machine. I discuss IBM card sorters in detail here

  5. The 402 and 403 accounting machines were essentially the same except the 403 could print three-line addresses. In order to print three lines from one card, the 403 has three card reading stations instead of two. (That is, it read each card three times using three sets of 80 brushes). This feature is called MLP (multi-line printing) and is useful for printing addresses on invoices, for instance. An MLP card is indicated with a special punch: 8, 9 and (1, 2, 3 or 4) punched in a single column; the last digit controls the number of lines printed. 

  6. I wouldn't be surprised if these accounting machines were technically Turing-complete due to their support for conditional operations, although it's unclear how to represent the tape. Perhaps storage could be implemented by punching a new deck of cards on each cycle through the machine. Of course this would be impractical for any real use. 

  7. I suspect each card represents one employee pay stub and each field indicates a payroll deduction. However, there are alternative explanations for the plugboard. For instance, the id field could indicate a company division, and each card represents a subdivision. In this case, the accounting machine could be totaling the tax deductions for each division such as business expenses and depreciation. Or each card could represent one month. Since there are no variable names, it is speculation. 

  8. The table below summarizes the program implemented by the plugboard, showing the mapping between input fields on the card and output fields on the printer.

    Card columnsOutput columnsSubtotal counterTotal counter
    1-41-44C 
    34-385-108D4A/2D
    44-4511-188A8B
    61-6619-266A2C/4D
    67-7127-326B2A/4B
    28-3335-406C8C
    14-17 4D 
    from switch 2B 

    Columns 14-17 are summed but not printed. Presumably they are punched on the summary punch card. Columns 34-38 are only processed if the "setup change 1" switch is active. Counter 2B is controlled by the panel switch, adding either 2 or 5 each step. I can't figure out a reason for this; I assume the plugboard on the summary punch (which I don't have) does something useful with this value. 

  9. The tax plugboard I'm examining was labeled with embossing tape which dates the labeling of the board to post 1958. The board could originally be older, or it could have been used into the 1960s. 

  10. The row above "0" is called 11 or X, while the row above that is 12. For alphabetical characters, the "0" row is used as a zone instead of a digit. (This causes some complications in the accounting machine, such as a special mechanism to print a "numeric zero" versus a "zone zero".) 

  11. This punch card code evolved into EBCDIC (Extended Binary Coded Decimal Interchange Code), the encoding used by IBM computers in place of ASCII. Many of the strange characteristic of EBCDIC, such as the alphabet not being entirely sequential, are due to its roots in punched cards. 

  12. The accounting machine has typebars on the left that print alphanumerics and typebars on the right that just print digits. For alphanumeric printing, the digit signal moves the typebar in steps of four, while the zone signal moves the typebar 0 through 3 steps. Thus, an alphanumeric character can be printed. Typebars with special characters could also be installed, to print $, @, - or %. 

  13. You might expect that to print each line, all the hammers hit the typebars, but it's more complex than that. First, each hammer has a mechanical "hammerlock" control, which can enable the hammer, disable the hammer, or put the hammerlocked hammers under program control. Thus, part of the line may be printed or suppressed based on the data. In addition, the hammers also have mechanical "hammersplit" levers which when raised cause leading zeros in a field to be suppressed. This allows the value "000123" to be printed as "   123" for instance. 

  14. Subtraction uses 9's complement addition. That is, subtracting a digit n is done by adding 9-n. This is accomplished mechanically by starting the counter's rotation at position 9 and stopping when a hole is read. For example, if the hole is at position 7, the counter will increment by two positions. There are a few complications with 9's-complement subtraction. The answer is off by one, but an "end-around carry" adds 1 to yield the correct result. Negative numbers require special handling to be printed properly using the "net balance method" or the "balance selection" method; see the 403 manual if you care about the details. The numeric typebars include a "CR" symbol, which indicates negative numbers as a "credit". On a punch card, negative numbers are typically indicated with an X-punch (i.e. a zone punch in row 11) over the value. 

  15. IBM's accounting machine manuals are available on Bitsavers. The operation of the IBM accounting machines is discussed in detail in: IBM 402, 403 and 419 Accounting Machines: Manual of Operation. For a thorough discussion of how the machine works internally, see IBM 402, 403, 419 Field Engineering Manual of Instruction. For an overview of how plugboard wiring for IBM's works, see IBM Functional Wiring Principles