Ken Shirriff's blog: Search results for ibm 1401

Showing posts sorted by relevance for query ibm 1401. Sort by date Show all posts

12-minute Mandelbrot: fractals on a 50 year old IBM 1401 mainframe

When I found out that the Computer History Museum has a working IBM 1401 computer[1], I wondered if it could generate the Mandelbrot fractal. I wrote a fractal program in assembly language and the computer chugged away for 12 minutes to create the Mandelbrot image on its line printer. In the process I learned a bunch of interesting things about the IBM 1401, which I discuss in this article.

$The IBM 1401 at the Computer History Museum printing the Mandelbrot fractal on the 1403 printer.$

The IBM 1401 mainframe computer (left) at the Computer History Museum printing the Mandelbrot fractal on the 1403 printer (right). Note: this is a line printer, not a dot matrix printer.

The IBM 1401 computer was announced in 1959, and went on to become the best-selling computer of the mid-1960s, with more than 10,000 systems in use. The 1401 leased for $2500 a month[2] (about $20,000 in current dollars), a low price that let many more companies use computers. Even a medium-sized business could use the 1401 for payroll, accounting, inventory, order processing, invoicing, analysis, and many other tasks. The 1401 was called the Model-T of the computer industry due to its low price and great popularity.[3] Even for its time, IBM 1401 only had moderate performance, especially compared to a high-end business computer like the IBM 7080 (rental fee: $48,000 a month).[2] But the IBM 1401 became hugely popular because of its affordability, reliability, ease of use, high-quality printer and stylish appearance[4].

The 1401 was an early all-transistorized computer. These weren't silicon transistors, though, they were germanium transistors, the technology before silicon. The transistors and other components were mounted on circuit boards about the size of a playing card. These boards were called Standard Modular System (SMS) boards and each one provided a function such as a flip flop or simple logic functions. The IBM 1401 could contain thousands of SMS cards, depending on the features installed - the basic system had about 933 cards[5], while the system I used has 2881 SMS cards. (For more information on SMS cards, see my earlier article.)

SMS cards inside the IBM 1401. These cards are part of the tape drive control, amplifying signals read from tape.

The SMS cards plug into racks (which IBM confusingly calls "gates"), that fold out from the computer as shown below. The 1401 is designed for easy maintenance - to access a gate, you just grab the handle and it swings out from the computer, exposing the wires and boards for maintenance. At the bottom of the gate, wiring harnesses connect the gate to other parts of the computer.[6] There are 24 of these gates in total.

The IBM 1401 computer is built from thousands of SMS circuit cards. This open rack (called a gate) shows about 150 SMS cards.

Unusual features of the IBM 1401

It's interesting to look at old computers because they do things very differently. Some of the unusual features of the IBM 1401 are that it used decimal arithmetic and 6-bit characters, it had arbitrary-length words, and additional instructions were available for a rental fee.

The IBM 1401 is based on decimal arithmetic, not binary. Of course it uses 0's and 1's internally, but numbers are stored as digits using binary coded decimal (BCD). The number 123 is stored as three characters: '1', '2', and '3'. If you add 7 and 8, you get the digit 1 and the digit 5. Addresses are in decimal, so storage is in multiples of 1000, not 1024: the system with 16K of memory stores exactly 16,000 characters. All arithmetic is done in base-10. So if you divide two numbers, the IBM 1401 does base-10 long division, in hardware.

The IBM 1401 does not use bytes.[9] Instead, it uses 6-bit BCD storage. Every character is stored as a 4-bit BCD digit with two extra bits called "zone bits", named A and B.[7] The two extra zone bits allow upper-case letters (and a few special symbols) to be stored, as well as digits.[8] Using a byte as the unit of operation didn't become popular until later with the IBM System/360; in the early 1960s, computers often used strange word sizes such as 13, 17, 19, 22, 26, 33, 37, 41, 45, and 50 bit words.[9]

The photo below shows the core memory module from the IBM 1401, with 4,000 characters of memory. Each bit is stored in a tiny donut-shaped ferrite core with wires running through it. The core module is more complex than you'd expect, with 16 layers (frames) in total. Eight frames hold the 6 bits of data, plus the word mark bits (explained below) and parity bits. Six frames hold data from the card reader brushes and the print hammers, for data and error checking.[10] The remaining two frames are just used for wiring.

The 4000 character core memory module from the IBM 1401 computer requires a huge amount of wiring.

Probably the most unusual feature of the 1401 is that it uses variable-length words, with word marks indicating each word. You might expect that variable-length words would let you use words of perhaps 8, 16, and 32 bits. But the IBM 1401 permitted words of arbitrarily many characters, up to the total size of memory! For instance, an instruction could move a 47-character string, or add 11-digit numbers. (Personally, I think it's easier to think of it as variable-length fields, rather than variable-length words.)

The word mark itself is a bit that is set on a memory location to indicate the boundary of a word (i.e. field).[11] An instruction on the IBM 1401 processes data through memory sequentially until it hits a word mark. It's important to remember that word marks are not part of the characters, but more like metadata, so they remain as new data records are read in and processed.[11] The main motivation behind variable-length words was to save expensive core memory, since each field length can be fit to the exact size required.

Another interesting thing about the IBM 1401 is that many instructions were extra-price options. The "advanced programming" feature provided new instructions for moving records, storing registers, and using index registers; this required the installation of 105 new SMS cards and (coincidentally) cost $105 a month. Even the comparison instruction cost extra. Because the 1401 uses BCD, you can't just subtract two characters to compare them as you would on most processors. Instead, the 1401 uses a bunch of additional circuitry for comparison, about 37 SMS cards for which you pay $75 a month.[12] Renting the printer buffer feature for $375 a month added a separate core storage module, 267 more SMS cards, and two new instructions. The bit test instruction cost only $20 a month and additional card punch control instructions were $25 a month. If you bought one of these features, an IBM engineer would install the new cards and move some wires on the backplane to enable the feature. The wire-wrapped backplanes made it relatively easy to update the wiring in the field.

The 1401 could be expanded up to 16,000 characters of core memory storage: 4,000 characters in the 1401 itself, and 12,000 characters in a 1406 expansion box, about the size of a dishwasher. The 12K expansion sold for $55,100 (about $4.60 per character), or rented for $1,575 a month. (You can see why using memory efficiently was important.) Along with the expanded memory came additional instructions to manipulate the larger addresses.

The tiny magnetic cores providing storage inside the IBM 1401's 4,000 character memory. Wires pass through each core to read and write memory. You can see multiple layers of cores in this photo.

One feature that you'd expect a computer to have is a subroutine call instruction and a stack. This is something the 1401 didn't have. To call a subroutine on the IBM 1401, you jump to the start of the subroutine. The subroutine then stores the return address into a jump instruction at the end, actually modifying the code, so at the end of the subroutine it jumps back to the caller.[13] If you want recursion, you're on your own.

Some advanced features of the 1401

Compared to modern computers, the IBM 1401 is extremely slow and limited. But it's not as primitive as you might expect and it has several surprisingly advanced features.

One complex feature of the IBM 1401 is Editing, which is kind of like printf implemented in hardware. The Edit instruction takes a number such as 00123456789 and a format string. The computer removes leading zeros and inserts commas as needed, producing an output such as 1,234,567.89. With the optional Expanded Editing feature (just $20/month more), you can obtain floating asterisks (******1,234.56) or a floating dollar sign ($1,234.56), which is convenient for printing checks. Keep in mind that this formatting is not done with a subroutine; it is implemented entirely in hardware, with the formatting applied by discrete transistors.

Another advanced feature of the IBM 1401 is extensive checking for errors. With tens of thousands of components on thousands of boards, many things can go wrong. The 1401 catches malfunctions so they don't cause a catastrophe (such as printing million dollar payroll checks). First, the memory, internal data paths, instruction decode, and BCD conversion are all protected by parity and validity checks. The ALU uses qui-binary addition to detect arithmetic errors. The card reader reads each card twice and compares the results.[10] The 1401 verifies the printer's operation on each line. (The read, punch, and print checks use the additional core memory planes discussed earlier.) As a result, the 1401 turned out to be remarkably reliable.

Because the IBM 1401 has variable word length, it can perform arbitrary-precision arithmetic. For instance, it can multiply or divide thousand-digit numbers with a single instruction. Try doing that on your Intel processor! (I tried multiplying 1000-digit numbers on the 1401; it takes just under a minute.) Hardware multiply/divide is another extra-cost feature; to meet the 1401's price target, they made it an option with the relatively steep price of $325 per month. You do get a lot of circuitry for that price, though - about 246 additional SMS cards installed in two gates.[14] And remember, this is decimal multiplication and division, which is much more difficult to do in hardware than binary.

The 1401 I used is the Sterling model which it supports arithmetic on pounds/shillings/pence, which is a surprising thing to see implemented in hardware. (Up until 1971, British currency was expressed in pounds, shillings, and pence, with 12 pence in a shilling and 20 shillings in a pound. This makes even addition complicated, as tourists often discovered.) By supporting currency arithmetic in hardware, the 1401 made code faster and simpler.[15]

A maze of wire-wrapped wires on the back of a gate connects the circuits of the IBM 1401 computer. The wiring was installed by automated machinery, but the wiring could be modified by field engineers as needed.

Implementing the Mandelbrot in 1401 assembly language

Writing the Mandelbrot set code on the 1401 is a bit tricky since I did it in assembly language (called Autocoder). The hardest part was thinking about word marks. Another complication was the 1401 doesn't have native floating point arithmetic, so I used fixed point: I scaled each number by 10000, so I could represent 4 decimal places with an integer. The 1401 is designed for business applications, not scientific applications, so it's not well-suited for fractal generation. But it still got the job done.

The 1401 didn't need to be programmed in assembly language - it supports languages such as Fortran and COBOL - but I wanted the full 1401 experience. It does amaze me though that you can run a COBOL compiler on a machine with just 4,000 characters of memory. The Fortran compiler required a machine with 8,000 memory location; in order to fit, it ran in 63 separate phases.

The assembly language code for the Mandelbrot fractal is shown below. The first part of the code defines constants and variables with DCW. This is followed by three nested loops to loop over each row, each column, and the iterations for each pixel. Some of the instructions in the code are M (multiply), A (add), S (subtract), and C (compare). Comments start with asterisks.

               JOB  MANDELBROT
     *GENERATES A MANDELBROT SET ON THE 1401
     *KEN SHIRRIFF  HTTP://RIGHTO.COM
               CTL  6641
               ORG  087
     X1        DCW  001  *INDEX 1, COL COUNTER TO STORE PIXEL ON LINE
               ORG  333
     *
     *VALUES ARE FIXED POINT, I.E. SCALED BY 10000
     *Y RANGE (-1, 1). 60 LINES YIELDS INC OF 2/60*10000
     *
     YINC      DCW  333
     XINC      DCW  220          *STEP X BY .0220
     *
     *Y START IS -1, MOVED TO -333*30 FOR SYMMETRY
     *
     Y0        DCW  -09990       *PIXEL Y COORDINATE
     *
     *X START IS -2.5
     *
     X0INIT    DCW  -22000       *LEFT HAND X COORDINATE
     X0        DCW  00000        *PIXEL X COORDINATE
     ONE       DCW  001
     ZR        DCW  00000        *REAL PART OF Z
     ZI        DCW  00000        *IMAGINARY PART OF Z
     ZR2       DCW  00000000000  *ZR^2
     ZI2       DCW  00000000000  *ZI^2
     ZRZI      DCW  00000000000  *2 *ZR *ZI
     ZMAG      DCW  00000000000  *MAGNITUDE OF Z: ZR^2 + ZI^2
     TOOBIG    DCW  00400000000  *4 (SCALED BY 10000 TWICE)
     I         DCW  00           *ITERATION LOOP COUNTER
     ROW       DCW  01
     ROWS      DCW  60
     COLS      DCW  132
     MAX       DCW  24           *MAXIMUM NUMBER OF ITERATIONS
     *
     *ROW LOOP
     *X1 = 1  (COLUMN INDEX)
     *X0 = -2.2 (X COORDINATE)
     *
     START     LCA  ONE, X1     *ROW LOOP: INIT COL COUNT
               LCA  X0INIT, X0  *X0 = X0INIT
               CS   332         *CLEAR PRINT LINE
               CS               *CHAIN INSTRUCTION
     *
     *COLUMN LOOP
     *
     COLLP     LCA  @00@, I     *I = 0
               MCW  X0, ZR      *ZR = X0
               MCW  Y0, ZI      *ZI = Y0
     *
     *INNER LOOP:
     *ZR2 = ZR^2
     *ZI2 = ZI^2
     *IF ZR2+ZI2 > 4: BREAK
     *ZI = 2*ZR*ZI + Y0
     *ZR = ZR2 - ZI2 + X0
     *
     INLP      MCW  ZR, ZR2-6   *ZR2 =  ZR
               M    ZR, ZR2     *ZR2 *= ZR
               MCW  ZI, ZI2-6   *ZI2 =  ZI
               M    ZI, ZI2     *ZI2 *= ZI
               MCW  ZR2, ZMAG   *ZMAG = ZR^2
               A    ZI2, ZMAG   *ZMAG += ZI^2
               C    TOOBIG, ZMAG  *IF ZMAZ > 4: BREAK
               BH   BREAK
               MCW  ZI, ZRZI-6  *ZRZI = ZI
               M    ZR, ZRZI    *ZRZI = ZI*ZR
               A    ZRZI, ZRZI  *ZRZI = 2*ZI*ZR
               MCW  ZRZI-4, ZI  *ZI = ZRZI (/10000)
               MZ   ZRZI, ZI    *TRANSFER SIGN
               A    Y0, ZI      *ZI += Y0
               S    ZI2, ZR2    *ZR2 -= ZI2
               MCW  ZR2-4, ZR   *ZR = ZR2 (/10000)
               MZ   ZR2, ZR     *TRANSFER SIGN
               A    X0, ZR      *ZR += X0
     *
     *IF I++ != MAX: GOTO INLP
     *
               A    ONE, I      *I++
               C    MAX, I      *IF I != MAX THEN LOOP
               BU   INLP
               MCW  @X@, 200&X1  *STORE AN X INTO THE PRINT LINE
     BREAK     C    X1, COLS    *COL LOOP CONDITION
               A    ONE, X1
               A    XINC, X0    *X0 += 0.0227
               BU   COLLP
               W                *WRITE LINE
     *
     *Y0 += YINC
     *IF ROW++ != ROWS: GOTO ROWLP
     *
               C    ROW, ROWS   *ROW LOOP CONDITION
               A    ONE, ROW
               A    YINC, Y0    *Y0 += 0.0333
               BU   START
     FINIS     H    FINIS       HALT LOOP
               END  START

I compiled and ran the code with the ROPE compiler and simulator before using the real computer.[16] The cards were punched automatically by an IBM 029 keypunch controlled by a PC through a bunch of USB-controlled relays. The photo below shows the keypunch in operation. Each blank card drops down from the feeder in the upper right. The card is punched as it moves to the left. Punched cards are then flipped up and stacked in the upper left area (empty in this picture).

$An IBM 029 keypunch preparing a card deck that generates the Mandelbrot fractal.$

An IBM 029 keypunch preparing a card deck that generates the Mandelbrot fractal.

The resulting card deck is shown below, along with the output of execution. The program fits onto just 16 cards, but the card format is a bit unusual. The machine code for the Mandelbrot program is punched into the left half of the each card, with code such as M384417A395417. An interesting thing about the 1401 is the machine code is almost human-readable. M384417 means Move field from address 384 to address 417. A395417 means Add the number at address 395 to the number at address 417. The text on these cards is the actual machine code that gets executed, not the assembly code. Since the machine is character-based, not binary, there's no difference between the characters "428" and the address 428.

$The card deck to generate the Mandelbrot fractal on the IBM 1401 computer.$

The card deck to generate the Mandelbrot fractal on the IBM 1401 computer, along with the output. The white stripe through the fractal near the right is where a hammer in the printer malfunctioned.

If you look at the right half of the cards, there's something totally different going on, with text like L033540,515522,5259534. There's no operating system, so, incredibly, each card has code to copy its contents into the right place in memory (L instruction), add the word marks (, instruction), and load the next card. In other words, the right hand side of each card is a program that runs card-by-card to load into memory the program on the left hand side of the card deck, which is executed after the last card is loaded.[17]

To run the program, first you hit the "Power On" button on the IBM 1401 console. Relays clunk for a moment to power up the system and then the computer is ready to go (unlike modern computers that take so long to boot). You put the cards into the card reader and hit the "Load" button. The cards fly through the reader at the remarkable speed of 800 cards per minute so the Mandelbrot program loads in just over a second. The console starts flickering as the program runs, and every few seconds the line printer hammers out another line of the fractal. After 12 minutes of execution, the fractal is done. (Interestingly enough, the very first picture of a Mandelbrot set was printed on a line printer in 1978.[18])

The console of the IBM 1401 mainframe. The top half shows the data flow through the computer, from storage to the B and A registers and the logic unit. Each 6-bit value is displayed as 1248ABC, where A and B are zone bits and C is the check (parity) bit. On the right, "OP" shows the operation being executed. Below are knobs to manually access memory. On the left, the "Start Reset" button clears an error, such as the card read failures I would hit. At the bottom are the important buttons to turn the computer on and off. Note the Emergency Off handle that immediately cuts the power.

Conclusions

Writing a Mandelbrot program for the IBM 1401 was an interesting project. You think a bit differently about programming when using decimal numbers and keeping track of word marks. But I have to say that comparing the performance with a current machine - not to mention the storage capacity - makes me appreciate Moore's Law.

The Computer History Museum in Mountain View runs demonstrations of the IBM 1401 on Wednesdays and Saturdays. It's amazing that the restoration team was able to get this piece of history working, so if you're in the area you should definitely check it out. The schedule is here. Tell the guys running the demo that you heard about it from me and maybe they'll run my prime number program or Pi program. You probably wouldn't want to wait for the Mandelbrot to run.

Thanks to the Computer History Museum and the members of the 1401 restoration team, Robert Garner, Ed Thelen, Van Snyder, and especially Stan Paddock. The 1401 team's website (ibm-1401.info) has a ton of interesting information about the 1401 and its restoration.

Notes and references

[1] The Computer History Museum has two working 1401 computers: the "German 1401" and the "Connecticut 1401" (based on where they came from). I used the German 1401 since the Connecticut 1401 was undergoing card reader maintenance at the time.

[2] While the $2500 per month rental rate is quoted in many places, the price could climb to $10,000 a month for a full system with multiple tape drives. The price varied greatly depending on the 1401 model, the amount of memory, and the peripherals (tape drives, card reader, printer, disk drive). The minimal configuration (1401 Model A, 1402 card reader, and 1403 printer) went for $2,475 a month (or purchased for $125,600 - about $1 million accounting for inflation). A "recommended" configuration with 8K of memory, processor options, and another printer went for $4,610 a month. Tape drives boosted the price at $980 a month for the interface and $1100 a month for each 729 IV tape drive. A 4000 character memory expansion cost $575 per month.

Detailed information on 1961 computers including rental rates is available in an interesting survey of computers in 1961, the thousand-page A Third Survey of Domestic Electronic Digital Computing Systems", Report No. 1115, March 1961 (1401 page). The basic IBM rental price was for one 8-hour shift (176 hours a month). The computers included a time counter, and users were billed extra if they went over the allotted time. Customers often paid a higher rental fee so they could run 24/7.

[3] The comment that the 1401 became the Model-T of the computer industry is from the article IBM System/360, by IBM VP Bob Evans. One piece of trivia from that article is the IBM 1620 rented for $1600 a month, making it the first IBM system renting for a price less than its model number.

[4] The IBM 1401 has a very distinctive style, especially compared to earlier IBM computers (such as the 650 or 704) with a very utilitarian, industrial appearance. The sleek, modernist style of the 1401 isn't arbitrary, but the result of a detailed industrial design process. The book The Interface: IBM and the Transformation of Corporate Design has a very interesting discussion of the effort IBM put into industrial design. Edgar Kaufmann, Jr came up with important design ideas that were developed by Eliot Noyes. Some design concepts were recessed pedestals for a feeling of floating and lightness, the concealment of most of the circuitry, expressing the "inherent drama" of computers, the carefully controlled color scheme, and modern materials for the cabinets. The tape drives in particular were wildly successful at expressing the "inherent drama" of computing, to the point that spinning tape drives became a movie cliche (tvtropes: Computer Equals Tapedrive).

[5] The number of SMS cards in an IBM 1401 depends on the model, the options installed, engineering changes (i.e. fixes) applied to the system, and the amount of memory in use. I got the number 1206 by analyzing the SMS plug chart and counting 933 basic cards, 267 Sterling basic cards, 6 power supply cards, and 11 cards for storage support. This machine is the Sterling model, so it is slightly more complex than the regular model.

[6] The IBM 1401 has 32 "potential" gates: 16 on the front and another 16 on the back, but only 24 of these are gates with circuitry. The two potential panels in the upper left are taken up by the control panel, which swings out to reveal the core memory behind it. Four panels have power supplies behind them (although much of the power supply is inside the card reader, strangely). Two more spots are occupied by the surprisingly thick cables connecting the 1401 to peripherals. This leaves 24 swing-out gates; some may be unused, depending on the optional features installed.

[7] The zone bits are closely related to the zone punches in IBM punch cards. The top row of a punch card is the 12 (Y) zone, and the row beneath it is the 11 (X) zone. A number has one hole punched in the card row corresponding to the number (rows 0 through 9). A character usually has two holes punched: 1 through 9 for the BCD value, and a zone punch for the zone bits. The zone punch is card zone 11 for zone bit B set, card zone 12 for zone bits A and B, or card row 0 for zone bit A.

There are a few complications, though, that mess up this pattern. First, for characters outside the 0-9 range, two digit punches are used: 8 and the digit for the low three bits. (e.g. '#' is stored as bits 8, 2, and 1, so it is punched as 8 and 3.) Second, because card row 0 is used both for the digit 0 and as a zone punch, there is a conflict and the value 0 is treated as 10 in certain conditions (and punched as 8 and 2). Because a blank has no punches and is stored as 0 internally, the digit 0 is stored as 10. Different IBM systems treat these corner cases differently. Custom features were available for the 1401 to provide compatibility as needed.

[8] The zone bits are used for a few things in addition to letters. A zone bit is added to the low-order digit of a number to indicate the sign of the number. Memory addresses are expressed as three digits, which would allow access to 1000 locations; by using zone bits, the three digit address can reach 16,000 locations. The zone bits also track overflow in arithmetic operations.

[9] Originally byte referred to the group of bits used to encode a character, even if it wasn't 8 bits (see Planning a Computer System: Project Stretch, p40). Some examples of unusual word lengths: The RCA 601 supported 6, 8, 12, 16-bit, or variable-length words. SPEC used 13-bit words. The Hughes Airborne Computer used 17-bit words, while the Hughes D Pat used 19-bit words and the Hughes M 252 used 20-bit words. The RW 300 used 18-bit words, while the RW 400 used 26-bit words. The Packard Cell 250 used 22-bit words. UNIVAC 1101 used 24-bit words. ALWAC II used 32 bits plus sign (33 bits). COMPAC used 37 bits (36 + sign). AN/MJQ used 41-bit words. SEAC used 45 bits (44 plus sign). AN/FSQ 31 used 48 bit words. ORACLE used 50-bit words. The Rice University computer used 54-bit words. Details on these computers are in A Third Survey of Domestic Electronic Digital Computing Systems.

[10] The card reader reads each card twice and verifies that the hole count is the same for both reads. If the counts don't match, the card reader detects the error and stops. In more detail, each card is read "sideways", a row of 80 positions at a time. Two bits keep the status of each column. One bit is turned on if there is any hole. The other bit is toggled for each hole. (Thus, it's not exactly a count, simplifying the logic.) The process is reversed on the second read, so both bits will end up back at 0 for a correct read.

Since the next card is already getting read as the first card is getting verified, two sets of bits are needed, one for the first card and one for the second card. Thus, four planes of 80 bits each are used in total to verify card reads.

Each of the 240 brushes in the card reader has a separate wire that goes through a specific core in the 1401's core memory. Likewise, each of the 132 print hammers in the printer is wired directly to an individual core. Thus, there are thick cables containing hundreds of wires between the IBM 1401 and the card reader and the printer.

[11] There are several details of wordmarks I'll point out. The IBM 1401 is obviously big-endian, since that's how numbers are punched on cards. Since arithmetic operations need to start with the lowest-order digit, they start at the "end" of the number and work backwards through memory to the highest-order digit. The consequence is an instruction is given the address of the end of the field and progresses to lower addresses until it hits the word mark, which is at the beginning of the field. This seems backwards if you're a C programmer, where you start at the beginning of a string and go forwards until you hit the end.

Word marks are also used to indicate the start of each instruction. Instructions can be 1 to 8 characters long, and the presence of a word mark controls the length. Bootstrapping the word marks for the first instructions loaded into the computer requires some tricks.

[12] The comparison logic is more complex than you'd expect. Surprisingly, the comparison order doesn't match the binary order of characters. Also, comparisons aren't implemented with subtraction (like most processors). Instead, logic first determines if the characters are special characters or not - special characters are before regular characters (with some exceptions: for example, - is between I and J). Then a lot of AND-OR logic performs basically brute-force comparison by looking at various bit patterns. The results of a comparison can be seen on the control panel in the Logic box. The optional compare logic is shown on the Intermediate Level Diagrams (ILD), page 37.

[13] Self-modifying code, where the program changes its own instructions, was common in the past. A guide to IBM 1401 programming, 1961, has a whole chapter (6) on this, discussing how "we are able to operate on instruction in storage just as through they were data". Treating code as data wasn't done only by Lisp programmers. In fact, the book calls the ability of a program to modify itself "by all odds the most important single feature of the stored program concept." As well as subroutine returns, IBM 1401 programmers used self-modifying code for indexing, address computation, and complex conditional branching. On current machines, Self-modifying code is rare because it's harder to debug and messes up the instruction pipeline.

[14] For details on how the multiply and divide operations work internally, see the optional feature manual. This circuit has some complicated optimizations. For example, to speed up the repeated additions, it will add the doubled value instead if appropriate. But doubling a decimal value takes a fairly complicated circuit (unlike binary doubling, which is trivial). And there's error checking to make sure nothing goes wrong in the doubling.

[15] The Sterling circuitry to support £sd math is even more complicated because shillings and pence are stored in a compressed form. The obvious representation is a two-digit field for pence (0 to 11) and a two-digit field for shillings (0 to 19). But to save precious memory and storage space, the BSI standard and incompatible IBM standard use one-digit fields and special characters. A knob on the control panel selects which standard to use. The Sterling hardware must perform arithmetic on this compressed representation, as well as handling the non-decimal bases of shillings and pence.

This knob on the control panel of the IBM 1401 computer selects the storage mode for pence and shillings.

[16] If you want to write a program for the 1401, instructions on using the ROPE simulator are here. It's a simple IDE that lets you edit assembly code (which is called Autocoder), assemble it, and then run it on the simulator. Take a look at A guide to IBM 1401 Programming and Programming the 1401 if you want to understand how to program the 1401. The 1401 Reference Manual is also useful for understanding what the instructions do.

[17] Each card also has a four-digit sequence number in the last columns. This lets you re-sort the cards if you happen to drop the deck and scramble the program.

[18] The first picture of the Mandelbrot set appears in 1978 paper by Brooks and Matelski, prior to Mandelbrot's work. (Thanks to Robert Garner for pointing this out.) There's some controversy over who "really" discovered the Mandelbrot set. See Who Discovered the Mandelbrot Set? in Scientific American for a discussion.

Qui-binary arithmetic: how a 1960s IBM mainframe does math

The IBM 1401 computer uses an unusual technique called qui-binary arithmetic to perform arithmetic. In the early 1960s, the IBM 1401 was the most popular computer, used by many businesses for the low monthly price of $2500. For a business computer, error detection was critical: if a company sent out bad payroll checks because of a hardware fault, it would be catastrophic. By using qui-binary arithmetic, the IBM 1401 detects arithmetic errors.

If you've studied digital circuits, you've seen the standard binary adder circuits that add two numbers. But the IBM 1401 uses a totally different approach. Unlike modern computers, the IBM 1401 operates on decimal digits, not binary numbers, using BCD (binary-coded decimal). To add two numbers, digits are converted from BCD to qui-binary, added together with a special qui-binary adder, and then converted back to digits in BCD. This may seem pointlessly complex, but it allows easy error detection.

The photo below shows the IBM 1401 with one panel opened to show the addition/subtraction circuitry, made up of dozens of Standard Module System (SMS) cards. Each SMS card holds a simple circuit with a few germanium transistors (the computer predates silicon transistors). This article explains in detail how these circuits implement it.

The IBM 1401 mainframe with gate 01B3 opened. This gate contains the arithmetic circuitry, made up of many SMS cards.

What is qui-binary?

Qui-binary code is a way of representing a decimal digit with 7 bits. The number is split into a qui part (0, 2, 4, 6, or 8) and a binary part (0 or 1).[1] For example, 3 is split into 2+1, and 8 is split into 8+0. The qui part is labeled Q0, Q2, Q4, Q6, or Q8 and the binary part is B0 or B1. The number is then represented by seven bits: Q₈Q₆Q₄Q₂Q₀B₁B₀. The following table summarizes the qui-binary representation.

Digit	Qui	Binary	Bits: Q₈Q₆Q₄Q₂Q₀B₁B₀
0	Q0	B0	0000101
1	Q0	B1	0000110
2	Q2	B0	0001001
3	Q2	B1	0001010
4	Q4	B0	0010001
5	Q4	B1	0010010
6	Q6	B0	0100001
7	Q6	B1	0100010
8	Q8	B0	1000001
9	Q8	B1	1000010

The advantage of qui-binary is error detection, since it is straightforward to detect an invalid qui-binary number.[2] A valid qui-binary number has exactly one qui bit and exactly one binary bit. Any other qui-binary number is faulty. For instance, Q4 Q2 B0 is bad, as is Q8. A problem in any bit creates a bad qui-binary number and can be detected.

Overview of the 1401's qui-binary circuit

The IBM 1401's arithmetic unit operates on one digit at a time, adding them with a qui-binary adder.[3] The block diagram below[4] shows how the adder takes two binary-coded decimal digits, stored in the A and B temporary registers, and produces their sum. The digit from the A register enters on the left, and is translated to qui-binary by the translation circuit (labeled XLATOR). This qui-binary value goes through a translate/complement circuit which is used for subtraction. The digit in the B register enters on the right and is also converted to qui-binary. The binary bits (B0/B1) are added by the binary adder at the bottom. The quinary values are added with a special quinary adder. The adder output circuit combines the quinary bits with any carry, generating the qui-binary result. Finally, the translation circuit at the top converts the qui-binary result back to a BCD digit, sending the BCD value to core memory and to the console display lights.[5]

Overview of the arithmetic unit in the IBM 1401 mainframe.

The photo below shows the IBM 1401 console during an addition instruction. The numbers are displayed in binary-coded decimal; the qui-binary representation is entirely hidden from the programmer. At this point in the addition instruction, the digit 1 was read from address 423 into the B register, and is added to the digit 2 already in the A register. The result from the qui-binary adder is 3 (binary 2 + 1), which is stored back to memory.[6]

The IBM 1401 console, showing an addition operation.

BCD to qui-binary translation

To examine the addition/subtraction circuitry in more detail, we'll start with the logic that converts a BCD digit to qui-binary. The logic is implement with an AND-OR structure that is common in the 1401. Note that the logic gate symbols are different from modern symbols: an AND gate is represented as a triangle, and an OR gate is represented as a semi-circle. Each bit of the BCD digit, as well as the bit's complement, is provided as input. Each AND gate matches a specific bit pattern, and then the results are combined with an OR gate to generate an output.

The circuit in an IBM 1401 mainframe to translate a BCD digit into qui-binary code.

To see how this works, look at the AND gate at the bottom (labeled 8, 9). Tracing the wires to the inputs, this gate will be active if input 8 AND input not-4 AND input not-2 are set, i.e. if the input is binary 1000 or 1001. Thus, output Q8 will be set if the input digit is 8 or 9, just as required for the qui-binary code.

For a slightly more complicated case, the first AND gate matches binary 1010 (decimal 10), and the second AND gate matches binary 000x (decimal 0 or 1). Thus, Q0 will be set for inputs 0, 1, or 10. Likewise, Q2 is set for inputs 2, 3, or 11. The other Q outputs are simpler, computed with a single AND gate.[7]

The B0 and B1 outputs are simply wires from the not-1 and 1 inputs. If the input is even, B0 is set, and if the input is odd, B1 is set.

9's complement circuit

To perform subtraction, the IBM 1401 adds the 9's complement of the digit. The 9's complement is simply 9 minus the digit. The complement circuit below passes the qui-binary number through unchanged for addition or complemented for subtraction.[8] The complement input selects which mode to use; it is generated from the operation (addition or subtraction), and the signs of the input numbers.

To see how complementation works in qui-binary, consider 3 (Q2 B1). Its complement is 6 (Q6 B0). The general pattern for complementation is B0 and B1 get swapped. Q0 and Q8 are swapped, and Q2 and Q6 are swapped. Q4 is unchanged; for example, 4 (Q4 B0) is complemented to 5 (Q4 B1).[9]

The complement circuit from the IBM 1401 mainframe. This converts a digit to its 9's complement value.

Quinary adder

The circuit below adds the quinary parts of the two numbers and can be considered the "meat" of the adder. The qui part from the A register is on the left, the qui part from the B register is on the top, and the qui output is on the right. The outputs with "+c" indicate a carry if the result is 10 or more. The addition logic is implemented with a "brute force" matrix, connecting each pair of inputs to the appropriate output. An example is Q2 + Q6, shown in red. If these two inputs are set, the indicated AND gate will trigger the Q8 output.[10]

The quinary addition circuit in the IBM 1401 mainframe. This adds the quinary parts of two qui-binary digits. Highlighted in red is the addition of Q2 and Q6 to form Q8.

In the photo below, we can find the exact card in the IBM 1401 that performs this addition. The card in the upper left marked with a red asterisk computes the output Q8.[11]

The SMS cards in the IBM 1401 that perform arithmetic.

The circuitry in the IBM 1401 is simple enough that you can follow it all the way to the function of individual transistors.[12] The asterisk-marked card is a 3JMX SMS card containing 4 AND gates, and is shown below. Each of the round metal transistors corresponds to one AND gate for one of the sums that generates the output Q8. The top transistor is activated by inputs 8+0, the next for 0+8, the next 6+2, and the bottom one 2+6. Thus, the bottom transistor corresponds to the red AND gate in the schematic above.[13]

The SMS card of type 3JMX has four AND gates.

Qui-binary to BCD translation

The diagram below shows the remainder of the qui-binary adder, which combines the qui and binary parts of the output, converts the output back to BCD, and detects errors. I'll just give an overview here, with more explanation in the footnotes.[14] The qui-binary carry circuit, in the blue box, processes the carry signals from the adder circuit. The next circuit, in the green box, applies any carry from the B bits, incrementing the qui component if necessary. The translation circuit, in red, converts the qui-binary result to BCD, using AND-OR logic. It also generates the parity output used for error detection in memory. The final circuit, in purple, is the error detection circuit which verifies the qui-binary result is valid and halts the computer if there is a fault.

The circuitry in the IBM 1401 mainframe to convert a qui-binary sum to a BCD result.

The photo below shows the functions of the different cards in the arithmetic rack.[15] The cards in the left half perform arithmetic operations. Each function takes multiple cards, since a single SMS card has a small amount of circuitry. "Q8" indicates the card discussed earlier that computes Q8. The right half is taken up with clock and timing circuits, which generate the clock signals that control the 1401.

This rack of circuitry in the IBM 1401 contains arithmetic logic (left) and timing circuitry (right).

Conclusion

This article has discussed how the 1401 adds or subtracts a single digit. The complete addition/subtraction process in the 1401 is even more complex because the 1401 handles numbers of arbitrary length; the hardware loops over each digit to process the entire numbers.[16][17]

Studying old computers such as the IBM 1401 is interesting because they use unusual, forgotten techniques such as qui-binary arithmetic. While qui-binary arithmetic seems strange at first, its error-detection properties made it useful for the IBM 1401. Old computers are also worth studying because their circuitry can be thoroughly understood. After careful examination, you can see how arithmetic, for instance, works, down to the function of individual transistors.

Thanks to the 1401 restoration team and the Computer History Museum for their assistance with this article. The IBM 1401 is regularly demonstrated at the Computer History Museum, usually on Wednesdays and Saturdays (schedule), so check it out if you're in Silicon Valley.

Notes and references

[1] Qui-binary is the opposite of bi-quinay encoding used in abacuses and old computers such as the IBM 650. In bi-quinary, the bi part is 0 or 5, and the quinary part is 0, 1, 2, 3, or 4.

[2] You might wonder why IBM didn't just use parity instead of qui-binary numbers. While parity detects bit errors, it doesn't work well for detecting errors during addition. There's no easy way to figure out what the parity should be for a sum.

[3] The IBM 1401 has hardware to multiply and divide numbers of arbitrary length. The multiplication and division operations are based on repeated addition and subtraction, so they use the qui-binary addition circuit, along with qui-binary doublers.

[4] The logic diagrams are all from the 1401 Instructional Logic Diagrams (ILD). Pages 25 and 26 show the addition and subtraction logic if you want to see the diagrams in context.

[5] The IBM 1401 performs operations on memory locations and the A and B registers provide temporary storage for digits as they are read from core memory. They are not general-purpose registers as in most microprocessors.

[6] A few more details about the console display. The "C" bit at the top of each register is the check (parity) bit used for error detection. The 1401 uses odd parity, so if an even number of bits are set, the C bit is also set. The "M" bit at the bottom is the word mark, which indicates the end of a variable-length field. The machine opcode character is zone B + zone A + 1, which indicates the letter "A".

Unlike modern computers, the 1401 uses intuitive opcodes so "A" means add, "S" means subtract, "B" means branch and so forth. (This is the actual opcode in memory, not the assembly mnemonic.) In the lower right, the mode knob is set to "Single cycle process", which allowed me to step through the instruction to get this picture. Normally this knob is set to "Run" and the console flashes frantically as instructions are executed.

[7] One surprising feature of the BCD translator is that it accepts binary inputs from 0 to 15, not just "valid" inputs 0 to 9. Input 10 is treated as 0, since the 1401 stores the digit 0 as decimal 10 in core. Values 11 through 15 are treated as 3 through 7. Thus, every binary input results in a valid (but probably unexpected) qui-binary value. As a result, the 1401 can perform addition on non-decimal characters, but the results aren't very useful.

[8] The IBM 1401 uses 9's complements since it is a decimal machine, unlike modern binary computers which use 2's complements. For example, the complement of 1 is 8, and the complement of 4 is 5. To subtract a number, the 9's complement of each digit is added (along with a carry). An example of using complements for subtration is 432 - 145. The 9's complement of 145 is 854. 432 + 854 + 1 = 1287. Discarding the top digit yields the desired result 432 - 145 = 287. Complements are explained in more detail in Wikipedia.

[9] If you trace through the AND-OR logic in the complement circuit, you can see that each pair of AND gates and and OR gate forms a multiplexer, selecting one input or the other. For example for the B1 output: if complement is 0 AND B1 is 1, the output is 1. OR, if complement is 1 AND B0 is 1, the output is 1. In other words, the output matches the B1 input if complement is 0, and matches the B0 input if complement is 1. The box labeled I in the schematic is an inverter.

[10] The quinary adder is implemented using wired-OR logic. Instead of an explicit OR gate, the AND outputs are simply wired together to produce the OR output. While the quinary adder looks symmetrical and regular in the schematic, its implementation uses three different SMS cards: 3JMX and 4JMX AND/OR gates, and JGVW AND gates, depending on the number of AND gates feeding the output.

[11] One component of interest in the photo of SMS cards is the silver rectangle on the lower right card. This is the quartz crystal that generates timing for the 1401. The SMS card is type RK, and the crystal runs the 347.5kHz oscillator. Eight oscillator half-cycles make up the 11.5 microsecond cycle time of the 1401. At the top of the photo are the wiring bundles connecting these circuits to other parts of the computer.

[12] Due to the simplicity of the IBM 1401 compared to modern computers, it's possible to understand how the IBM 1401 works at every level all the way to quantum physics. I'll give an outline here. The gates in an SMS card use a simple form of logic called CTDL by IBM and DTL (Diode-Transistor Logic) by the rest of the world. The 3JMX card schematic shows that each input is connected through a diode to the output transistor. If any input is high, current flows through the diode and turns off the transistor. The result is an AND gate (with inverted inputs). IBM Transistor Component Circuits (page 108) explains this circuit in detail.

Going deeper, we can look inside the transistor. The board uses type 034 germanium alloy-junction transistors (details, details), very different from modern silicon-based planar transistors. These transistors consist of a germanium crystal base with indium beads fused on either side to form the emitter and collector. The regions of germanium-indium alloy form the "P" regions. In the photo, the germanium disk is in the small circular hole. Copper wires are connected to the indium beads. The photo below shows an IBM 083 transistor from the IBM 1401. This is the NPN version of the transistors in the 3JMX card. If you want a deeper understanding, look at bipolar junction transistor theory, which in turn is explained by quantum physics and solid-state device theory.

Inside a germanium alloy-junction transistor used in the IBM 1401 computer. This is an IBM 083 NPN transistor. Photo from IBM 1401 restoration team.

[13] You may wonder how 8=4+4 gets computed, since the card described doesn't handle that. The sum 4+4 is computed by the card just below the asterisk (a triple AND gate card of type JGVW). The other two AND gates in that card compute 6+6 and 8+8. To determine what each board in the IBM 1401 does, look at the Automated Logic Diagrams, page 34.32.14.2.

[14] The qui-binary carry logic happens in several phases. The qui parts are added, generating a carry if needed. The binary parts are added with a simple binary adder (not shown). A carry from the binary part shifts the qui part by 2. A carry out signal is also generated as needed. For instance, adding 3 + 5 is done by adding Q2 B1 + Q4 B1. This generates Q6 + B0 + B carry. The B carry increments the qui component to Q8, yielding the result Q8 B0 (i.e. 8).

The qui-binary to BCD translation circuit uses straightforward AND-OR logic, detecting the various combinations. Note that 0 is represented in the 1401 as binary 1010 (because binary 0000 indicates a blank), so the BCD output bits 8 and 2 are set for qui-binary value Q0 B0. The parity output is generate by combining the binary parity (even for B0; odd for B1) with the qui parity value. The qui even parity signal is set for Q0 or Q6, while the qui odd parity signal is set for Q2, Q4, Q8. Note that representing 0 as binary 1010 instead of 0000 doesn't affect the parity.

The error detection circuit uses AND-OR logic to detect bad qui-binary results. It detects a fault if no B bits are set or both B bits are set. Instead of testing every qui bit combination, it implements a short cut from the qui parity circuit. If the even qui parity signal and the odd qui parity signal are both set, this indicates multiple qui lines are set, triggering a fault. If neither qui parity signal is set, then no qui lines are set, also triggering a fault. The parity check misses a few qui combinations (such as Q0 and Q6 set), so these are tested separately. The result is that any invalid qui-binary result triggers a fault.

[15] The rack of cards shown is officially known as gate 01B3. The functions assigned to each card in the photo are approximate, because some cards are used by multiple functions. For exact information, see the plug list, which specifies the card type and function for every card in the 1401.

[16] One complication with the 1401's arithmetic instructions are numbers are stored as a positive value with a sign bit (on the last digit). This format makes printing of positive and negative numbers simpler, which is important for a business computer, but it makes arithmetic more complicated. First, the signs must be checked to determine if the numbers are being added or subtracted. Next, each digit is added or subtracted in sequence until the end of the number is reached. If the result is negative, the 1401 flips the result sign and converts the answer back to a positive value by making two additional digit-by-digit passes over the number. Modern computers use binary and handle negative numbers with two's complement, which makes subtraction much simpler. It takes 9 pages of documentation to explain the addition operation, complete with multiple flowcharts: see IBM 1401 Data Flow pages 24-32. (Keep in mind that these flowcharts are implemented in hardware, not with microcode or subroutines.)

[17] Arithmetic on the 1401 and the qui-binary adder are discussed in detail in 1401 Instruction Logic, pages 49-67. For the history leading up to qui-binary arithmetic, see this article by Carl Claunch.

Examining the core memory module inside a vintage IBM 1401 mainframe

The IBM 1401 mainframe computer was announced in 1959 and by the mid-1960s had become the best-selling computer, extremely popular with medium and large businesses because of its low cost. A key component of the 1401's success was its 4,000 character core memory, which stored data on tiny magnetized rings called cores.

The 4000-character core memory module from the IBM 1401 mainframe.

The core module is surprisingly complex, as can be seen above, with thousands of tiny cores mounted on red wires. The module consists of 16 frames stacked together and requires a large amount of wiring. The remainder of the article will dive into the details of this core module. (For an overview of the 1401, see my articles about Bitcoin mining and fractals on the 1401.)

The IBM 1401 mainframe from the 1960s. The 1403 line printer is to the right, and a 792 tape drive at the back.

The IBM 1401 mainframe (above) is about the size of two refrigerators. The core memory module in the 1401 can be accessed by swinging open the computer's front panel, as seen below. The console switches, lights, and wiring are on the left. The core module itself is in the center, mostly hidden behind the brown circuit board.

Opening the console panel (left) of the IBM 1401 mainframe shows the 4K core memory unit (center).

The diagram below illustrates how the character 'A' is stored in core memory. Each bit of data in memory is stored in a tiny ferrite ring or core. These cores can be magnetized in one of two directions, corresponding to a 0 or 1 bit. The cores are arranged into a grid of 4000 cores, called a plane. To select an address, an X wire and a Y wire are activated, selecting the cores where those two wires cross. Each plane stores one bit and planes are stacked up to store a character. You might expect 8 planes are used to store a byte, but the IBM 1401 predates bytes; it uses 6-bit characters based on BCD (binary-coded decimal). Each location also has a special metadata bit called the "word mark", indicating the start of a field or instruction. Adding the parity bit yields eight bits of storage at each address.

Diagram from the 1401 Reference Manual representing how the character 'A' is stored in core memory.

Because the IBM 1401 was a business computer, it uses decimal arithmetic rather than binary arithmetic; each character is a binary-coded decimal value, along with two extra "zone bits" for alphanumeric characters. Since the 1401 uses three-character addresses, you might expect that it could only access 1000 locations. The trick is the two zone bits of the hundreds character provided the thousands digit 0 to 3. A consequence is that addresses above 1000 turn into alphanumerics instead of digits; location 2345 is addressed as L45.

Properties of ferrite cores

The physical properties of ferrite cores are critical to the operation of the core memory, so it is important to understand them. First, if a wire through a core carries a strong current, the core will be magnetized according to the direction of the current (following the right-hand rule). Current in one direction will write a 1 to the core, while the opposite current will cause the opposite magnetization and write a 0 to the core.

Hysteresis is a key property of the cores: current must exceed a threshold to affect a core's magnetization. A small current will have no effect on the core, but a current above a threshold will cause the core to "snap" into the magnetized state aligned with the current.

Closeup of the ferrite cores from the IBM 1401 mainframe's 4K storage. Four wires run through each core: X select, inhibit, Y select, and sense.

The hysteresis property makes it possible to select a particular core. A "half-write" current is sent through the appropriate X select wire and a "half-write" current through the Y select wire. The single core with the selected X and Y wires will have enough current to change state, but the other cores will not have enough current, and will remain unchanged.

The final important property is that when a core switches its direction of magnetization, it induces a current in a sense wire through the core (kind of like a transformer). If the core already has the target state and doesn't change magnetization, no current is induced. This induced current is used to read the state of a core. A consequence is that reading a core erases it, and the desired value must be written back to the core.

Structure of a core plane

Each core plane has 4000 cores arranged as a 50x80 grid of cores. (The I/O planes are configured differently, and will be explained later.) To reduce interference, the ferrite cores are arranged in a "checkerboard" pattern with each core arranged diagonally in the opposite direction from its neighbors. Four wires pass through each core. The horizontal wires are the X select line and the inhibit line (used for writing). The vertical wires are the Y select line and the sense line (used for reading). The X and Y select lines go through all the planes, so all planes are accessed in parallel.

Core memory in the IBM 1401. Each plane of cores has 4000 cores in a 80x50 grid.

To read a core, the X and Y select lines magnetize the selected cores to the "0" direction. If the core was previously in the "1" state, the core's state change induces a current in the sense wire. If the core was already in the "0" state, no current is induced. Thus, the sense wire allows the bit stored in the core to be determined. The read process destroys the previous value of the core, leaving it in the 0 state. Each plane has a sense wire threaded through all the cores in the plane.

To write a core, current of the opposite polarity is sent through the X and Y select lines to magnetize the core into the 1 state. To keep the core in the 0 state, a current is sent through the plane's inhibit line. The inhibit wire runs through all the cores in a plane parallel to the X select lines. By running the reverse current through the inhibit wire, the X line's current is canceled out, and the core remains unchanged. The inhibit current is too low to flip a core by itself, so other cores are not zeroed out.

The diagram below shows the reverse-engineered wiring topology of an IBM 1401 core memory plane. Most of the core has been cut out of the diagram, as indicated by the dotted gray lines. The sides of the plane are labeled A through D, matching the 1401 documentation. The A and C sides have 56 pins, while the B and D sides of the plane have 104 pins. Not all the pins are connected.

The wiring topology of the IBM 1401's core memory plane.

The X select lines are in green and the Y select lines are in red. The select lines are generated in a complex way by matrix switches, so core addresses are not arranged sequentially. Each matrix switch takes two sets of input lines and activates an output line based on the input values. The 5x10 X matrix switch has 5 row inputs and 10 column inputs, producing 50 outputs, which are the X select lines. The 10 column inputs come from the units digit, and the 5 row inputs are the "even hundreds" digit. The 8x10 Y matrix switch has 8 row inputs and 10 column inputs, producing 80 outputs for the Y select lines. The 10 column inputs are from the tens digit and the 8 row inputs are a tricky combination of the thousands and "odd hundreds". This scheme may seem overly complicated, but it minimizes the hardware required for address decoding.

Each half of the core plane (0-1999 and 2000-3999) has a separate sense line loop, but they are usually wired together. The two sense lines are in blue and run in the Y direction. The sense lines are carefully arranged to avoid picking up interference. The lines cross over along the midpoint to cancel out noise from the Y select lines - the sense line runs in the opposite direction along half of each Y select line, so any induced signal will be canceled out. In addition, the sense lines are twisted as they exit the middle of the plane, to avoid picking up interference. (Many other core memory systems avoid interference by running the sense line diagonally, but the 1401 uses a rectangular layout.)

Each half of the plane has a separate inhibit line. The two inhibit lines are in brown and run next to the X select lines, which they inhibit. The two lines are normally driven separately to reduce noise, but have the same signal. Since the inhibit line switches direction each row, alternating X select lines are also driven in opposite directions.

The card reader/punch, the printer, and the I/O cores

One unusual feature of the core module is the eight special-purpose I/O frames: six core planes and two terminal frames. To understand the I/O cores, some background on the IBM 1401 is necessary. The 1401 was used in business applications such as accounting and payroll, so accuracy was extremely important. If a malfunction caused bad payroll checks to be printed, it would be a catastrophe. To catch problems, IBM put many types of validity checking into the 1401, making it much more reliable than competitors. The basic I/O devices for the 1401 were the card reader/punch and the line printer, separate units from the computer itself and the I/O cores detected problems with these devices.

The I/O planes are addressed exactly the same as the data planes. However, the I/O planes are very sparse, with only 297 cores rather than 4000 cores, so most locations have no storage as can be seen in the photo below. These planes are accessed by the I/O circuitry, and are invisible to the programmer.

Closeup of the IBM 1401's core memory. The row bit core planes are used for I/O and are sparsely populated.

The IBM 1401 uses 80-character punch cards. You might expect the card reader to read each character on the card in sequence and send the character to the computer, but that's not at all how it works. Instead, the card reader processes each card "sideways" for speed, using 80 metal brushes to read a row at a time. If a card has a hole in a position, the brush contacts a metal roller under the card, completing a circuit. The brushes are connected to the IBM 1401 by 80 wires, one for each brush. Each wire is connected directly to a "row-bit core" in the core memory module, setting the core if a hole was detected. There's no driver circuitry or memory addressing; it's literally a separate wire from each brush that is wrapped 5 times around a core. Let me emphasize how unusual this is: it's like having a separate wire from each key on your keyboard directly to a specific transistor in your memory chip.

The card reader/punch has three read stations: RD1 and RD2 for reading, and PCH for reading after punching. Since each read station has 80 brushes, 240 wires connect the brushes to the 240 row-bit cores. (As you might have guessed, the cables between the 1401 computer and the reader/punch are very thick.) As well as the row-bit cores, reading/punching uses core planes called XU, YU, XL, and YL to count the number of holes detected in each position. If the two read stations have different hole counts, the computer stops and reports a fault. Likewise, the count is checked after punching a card to make sure all holes were punched correctly.

The high-speed line printer uses 132 hammers to produce 132-column output. A chain with the 48 printable characters whizzes around horizontally. As each character on the chain passes a position where it should be printed, a hammer fires at the precise time, hitting the paper against the inked ribbon to print the character. The I/O cores are also used to detect problems in the printing process.

Printing uses several different core planes for multiple validity checks. Each of the 132 print hammers is wired directly to a "hammer-fire core" in the memory module. The XU core plane is used during printing for the print-compare check: a bit is set in the XU plane if a hammer should fire for the character position. These 132 bits are compared with the hammer-fire cores to verify that the correct hammers fired. Plane YL holds print-line complete cores that verify that every character position either printed a character or holds a non-printable character. Finally, to aid printer maintenance, plane YU records the location of any fault in print-error storage core.

Physical layout of the core module

The core module consists of 16 frames in a stack - 14 core planes and two terminal frames. The upper 8 frames hold the character data planes and the lower 8 frames are the I/O frames. The following table shows the usage of each frame. The terminal frames do not contain cores, but provide connections for the large wire bundles from the reader brushes and the print hammers.

1:	Bit 8
2:	Bit 4
3:	Bit 2
4:	Bit 1
5:	Bit A
6:	Bit B
7:	Parity
8:	Word Mark
9:	Terminals for frame 10
10:	Card reader brushes (RD2), punch brushes (PCH)
11:	Terminals for frame 12
12:	Card reader brushes (RD1), print hammers (PRT)
13:	XU (I/O)
14:	YU (I/O)
15:	XL (I/O)
16:	YL (I/O)

The picture below shows the large amount of wiring required by the core module. Frame 16 (YL) is at the left and frame 1 (bit 8) is at the right. The two matrix switches are on the front of the module: the 8x10 switch for the Y select lines is at top, and the 5x10 switch for the X select lines is at the bottom.

The core memory module from the IBM 1401 mainframe.

The yellow wires at the left and right connect the Y select lines on frame 16 and frame 1 to the 8x10 matrix switch. Two bundles of wires connect to I/O planes near the middle of the module. One connects the brushes in the card reader and the printer hammer drivers to terminals on frame 11. The other bundle connects read brushes and punch check brushes to terminals on frame 9. The horizontal wire bundle across the middle of the planes connects the inhibit lines of each plane.

The photo below provides another view, focusing on the data plane wiring. At front is frame 1, the core plane for data bit 8, with the gray cores visible on red wires. The other 15 frames are layered behind frame 1. The two matrix switches are on top. The 8x10 matrix switch is connected to the Y select lines on the top and the 5x10 matrix switch is connected to the X select lines on the left.

The core memory module from the IBM 1401 mainframe. The cores in one of the planes are visible, strung along red wires. At the top, two matrix decoder boards generate the 50 X select lines and 80 Y select lines, addressing one of 4000 storage locations. The X select lines are connected to the core planes by the yellow wires on the left side of the core module, while the Y select lines are connected on top.

The detailed block diagram below shows how the components are connected in the 1401's core memory system. This diagram shows the physical arrangement of the 16 frames in the core memory module, along with the driver circuitry. The inhibit drivers are at the upper left, feeding each core plane. The sense amplifiers are at the upper right. The 5x10 X matrix switch is in the lower left, and the 8x10 Y matrix switch is in the lower right. Note the read brushes, punch brushes, and print hammer drivers are wired directly into the core module through the terminal frames. The diagram also shows the timing of the read and write pulses, and how they have opposite polarity, writing 0 and 1 respectively.

Diagram of the core memory system in the IBM 1401 mainframe from ALD 42.41.11.2.

The matrix switches

Generating the X and Y select signals is a tricky problem. The drive signals must have a positive pulse of the right current and duration for reading, followed by a negative pulse for writing. In addition, the number of select lines is large (50 X and 80 Y), so hardware costs would be excessive if each line had its own driver circuitry.

The 1401 uses an interesting solution to drive the select signals. Matrix switches generate the select signals by using a set of ferrite cores. But instead of storage, these cores are used for their switching properties. As with storage, the matrix switch depends on the "coincident current" property, where two signals of sufficient current will cause a core to snap to the opposite magnetization. But instead of being used for storage, the cores in the matrix switch generate a drive signal.

The 5x10 matrix switch in the IBM 1401 mainframe. This board provides the drive signals for the core module.

The photo above shows the X matrix switch. The switch consists of 50 cores in a 5 by 10 grid, with 5 lines driving the rows and 10 lines driving the columns. Each core also has an output winding and a bias winding. When two input lines are triggered, the corresponding core flips state, generating a pulse on the output winding. When the input lines are released, the bias winding flips the core back to its original state, generating a negative pulse on the output winding. Thus, the desired one of the 50 outputs has a positive pulse followed by a negative pulse, which is just what the core module requires for read followed by write.

The photo below shows the wiring of the matrix switch cores. The bias wire (black) is wound through pairs of cores three times. Each horizontal input wire (red) is wound through pairs of cores about twelve times, as are the vertical input wires. Each core has an output wire wound diagonally about twelve times.

Closeup of the matrix switch used in the IBM core memory. Each ferrite ring drives one of the select lines in the core memory.

How core memory is mounted in the 1401

The following picture shows the core memory module mounted in its rack, along with the many SMS cards required by the core module. (IBM built computers from Standard Modular System cards, each about the size of a playing card and holding a few transistors and other components.) At the left are the driver cards and current source cards that drive the matrix switch boards, and the driver cards for the inhibit lines. The next column holds the address decode cards. The address lines plug into the empty sockets at the bottom. The next column holds the sense line pre-amplifier and amplifier cards. The core module itself is mounted with the matrix switch cards on top. At the far right are the sockets for the hundreds of wires from the brushes and print hammers.

Core memory module and associated circuit board from an IBM 1401 mainframe. Photo courtesy of Rob Storey.

The photo below shows the core module mounted inside the IBM 1401 mainframe, looking into the left end of the computer. The core module is behind the bundle of black and yellow wires, mostly address lines. The matrix switches are on the left. The colorful brush and hammer wires are connected via paddles underneath the core module. The SMS driver cards are above the core module, mostly behind a metal cover for airflow.

The core memory module inside the IBM 1401 mainframe. The module is in the lower right, with the driver cards above

The photo shows some other interesting features of the 1401. At the top of the computer is the time meter that records how much time the computer has been running. IBM usually leased the 1401 and if you used the computer more than 8 hours per day, they would charge you for the excess. (Unless, of course, you paid for the 24/7 lease.) In the upper right is the "convenience" outlet located inside the computer, a standard electrical outlet. Below the outlet is the wiring on the back of the front console. The computer didn't use a backplane; instead, many loose bundles of wires connected circuitry modules, as you can see at the bottom of the photo.

Conclusion

Core memory was the leading memory technology from the mid-1950s until it was replaced by semiconductor memory in the early 1970s. For its time, core memory provided dense, reliable, and inexpensive storage, but memory technology has improved incredibly since then. The 1401 had a 11.5 microsecond memory cycle time, compared to 5 nanoseconds for modern RAM. While the 1401 had 4000 characters of storage (expandable to 16K), modern computers have many gigabytes. Adding a 4K memory expansion to the 1401 cost $20,100 ($162,000 in current dollars). Now a 16 gigabyte memory costs under $100. But even though it is obsolete, core memory is still an interesting technology to examine.

Thanks to the members of the 1401 restoration team and the Computer History Museum for their assistance The IBM 1401 is demonstrated at the Computer History Museum on Wednesdays and Saturdays (subject to hardware problems) so check it out if you're in Silicon Valley (schedule).

References

The IBM 1401 core module is documented in detail in 1401 ALD 42 and 1401 Instructional Logic Diagrams. For more information on core memory, see Coincident Current Ferrite Core Memories and Magnetic Core Memory Systems.

Repairing a 1960s mainframe: Fixing the IBM 1401's core memory and power supply

A few weeks ago, I wanted to use one of the vintage IBM 1401 mainframe computers at the Computer History Museum, but the computer wasn't working.1 This article describes the multi-week repair process to get the computer working again.

The problem started when the machine was powered up at the same time someone shut down the main power, apparently causing some sort of destructive power transient. The computer's core memory completely stopped working, making the computer unusable. To fix this we had to delve into the depths of the computer's core memory circuitry and the power supplies.

The IBM 1401 computer. The card reader/punch is in the foreground. The 12K memory expansion box is partially visible to the right behind the 1401.

Debugging the core memory

The IBM 1401 was a popular business computer of the early 1960s. It had 4000 characters of internal core memory with additional 12000 characters in an external expansion box.2 Core memory was a popular form of storage in this era as it was relatively fast and inexpensive. Each bit is stored in a tiny magnetized ferrite ring called a core. (If you've ever heard of a "core dump", this is what the term originally referred to.) The photo below is a magnified view of the cores, along with the red wires used to select, read and write the cores.4 The cores are wired in an X-Y grid; to access a particular address, one of the X lines is pulsed and one of the Y lines is pulsed, selecting the core where they intersect.3

Detail of the core memory in the IBM 1401. Each toroidal ferrite core stores one bit.

In the 1401, there are 4000 cores in each grid, forming a core plane that stores 4000 bits. Planes are then stacked up, one for each bit in the word, to form the complete core module, as shown below.

The 4000 character core memory module from an IBM 1401 computer. Tiny ferrite cores are strung on the red wires.

To diagnose the memory problem, the team started probing the 1401 with an oscilloscope. They checked the signals that select the core module, the memory control signals, the incoming addresses, the clock signals and so forth, but everything looked okay.

The next step was to see if the X and Y select signals were being generated properly. These pulses are generated by two boards called "matrix switches", one for the X pulse and one for the Y pulse.5 Some address lines are decoded and fed into the X matrix switch, while the other address lines are decoded and fed into the Y matrix switch. The matrix switches then create pulses on the appropriate X and Y select lines to access the desired address in the core planes.

The photo below shows the core memory module and its supporting circuitry inside the 1401. The core memory module itself is at the bottom, with the two matrix switch boards mounted on it. Above it, three rows of circuit boards (each the size of a playing card) provide the electronics. The top row consists of inhibit drivers (used for writing memory) and the current source and current driver boards (providing current to the matrix switches). The middle row has 17 boards to decode the memory addresses. At the bottom 19 sense amplifier boards read the data signals from the cores. As you can see, core memory requires a lot of supporting electronics and wiring. Also note the heat sinks on most of these boards due to the high currents required by core memory.

Inside the IBM 1401 computer, showing the key components of the core memory system.

After some oscilloscope measurements, we found that one of the matrix switches wasn't generating pulses, which explained why the memory wasn't working. We started checking the signals going into the matrix switch and found one matrix switch input line showed some ringing, apparently enough to keep the matrix switch from functioning.

Since the CHM has two 1401 computers, we decided to swap cards with the good machine to track down the fault. First we tried swapping the thermal switch board (below). One problem with core memory is that the properties of ferrite cores change with temperature. Some computers avoid this problem by heating the core memory to a constant temperature in air (as in the IBM 1620 computer) or an oil bath (as in the IBM 7090). The 1401 on the other hand uses temperature-controlled switches to adjust the current based on the ambient temperature. We swapped the "AKB" thermal switch board (below) and the associated "AKC" resistor board, with no effect.

The core memory uses a thermal switch board to adjust the current through core memory as temperature changes. The switches open at 35°C, 29°C and 22°C. The type of the board (AKB) is stamped into the lower left of the board.

Next we tried swapping the "AQW" current source boards that control current through the matrix switches.6 We swapped these board and the 1401's memory started working. Replacing the original boards one at a time, we found the bad board, shown below.

The IBM 1401 has four "AQW" cards that generate currents for the core memory switches. This card had a faulty inductor (the upper green cylinder), preventing core memory from working.

I examined the bad board and tested its components with an multimeter. There were two 1.2mH inductors on the board (the large green cylinders). I measured 3 ohms across one and 3 megaohms across the other, indicating that the second inductor had failed. With an open inductor, the board would only provide half the current. This explained why the matrix switch wasn't generating pulses, and thus why the core memory didn't work.

I gave the bad inductor to Robert Baruch of Project 5474 for analysis. He found that the connection between the lead and the inductor wire was intermittent. He dissolved the inductor's package in acid and took photographs of the winding inside the inductor.7

The faulty inductor from the IBM 1401 showing the failed connection.

We looked in the spare board cabinet for an AQW board to replace the bad one and found several. However, the replacement boards were different from the original—they had one power transistor instead of two. (Compare the photo below with the photo of the failed card from the computer.)

The replacement AQW card had one transistor instead of two, but was supposedly compatible with the old board.

Despite misgivings from some team members, the bad AQW card was replaced with a one-transistor AQW card and we attempted to power the system back up. Relays clicked and fans spun, but the computer refused to power up. We put the old card back (after replacing the inductor), and the computer still wouldn't start. So now we had a bigger problem. Apparently something had gone wrong with the computer's power supplies so the debugging effort switched focus.

Diagnosing the power supply problem

The power supply system for the IBM 1401 is more complex than you might expect. Curiously, the main power supplies for the system are inside the card reader; a 1250W ferro-resonant transformer in the card reader regulates the line input AC to 130V AC, which is fed to the 1401 computer itself through a thick cable under the floor. Smaller power supplies inside the 1401 then produce the necessary voltages.

Since it was built before switching power supplies became popular, the IBM 1401 uses bulky linear power supplies. The photo below shows (left to right) the +30V, -6V, +6V and -12V supplies.8 In the lower left, under the +30V supply, you can see eight relays for power sequencing. The circuit board to the right of the relays is one of the "sense cards" that checks for proper voltages. Under the +6V supply is a small "+18V differential" supply for the core memory. Foreshadowing: these components will all be important later.9

Power supplies in the IBM 1401.

After measuring voltages on the multiple power supplies, the team concluded that the -6V power supply wasn't working right. This was a bit puzzling because the AQW card (the one we replaced) only uses +12 and +30 volts. Since it doesn't use -6 volts at all, I didn't see how it could mess up the -6 volt supply.

Inside the IBM 1401's -6V power supply.

The team removed the -6V supply and took it to the lab. In the photo above, you can see the heavy AC transformer and large electrolytic capacitors inside the power supply. Measuring the output transistors, they found one bad transistor and some weak transistors and decided to replace all six transistors. In the photo below, you can see the new transistors, mounted on the power supply's large heat sink. These are germanium power transistors; the whole computer is pre-silicon.

The -6V power supply from the IBM 1401 uses six power transistors on a large heat sink.

The -6V power supply tested okay in the lab with the new transistors, so it was installed back in the 1401. We hit the "Power On" button on the console and... it still didn't work. We still weren't getting -6V and the computer wouldn't power up.

In the next repair session, we tried to determine why the computer wasn't powering up. Recall the eight relays mentioned earlier; these relays provide AC power to the power supplies in sequence to ensure that the supplies start up in the right order. If there is a problem with a voltage, the next relay in the sequence won't close and the power-up process will be blocked. We looked at which relays were closing and which weren't, and measured the voltages from the various power supplies. Eventually we determined that about halfway through the power-up process, relay #1 was not closing when it should, stopping the power-up sequence.

Relay #1 was driven by the +30V supply and was activated by a "sense card" that checked the +6V supply. But the +30V and +6V supplies were powering up fine and the sense card was switching on properly. Thus, the problem seemed to be a failure with the relay itself. Just before we pulled out the relay for testing, someone found an updated schematic showing the relay didn't use the regular +30V supply but instead obtained its 30 volts through the "18V differential supply".11 And the schematic for the 18V differential supply had a pencilled-in fuse.10

Could the power problem be as simple as a burnt-out fuse? We opened up the 18V differential supply, and sure enough, there was a fuse and it was burnt out. After replacing the fuse, the system powered up fine and we were back in business.

The 18V differential power supply in the IBM 1401 provides 12 volts to the core memory. The fuse is under the large electrolytic filter capacitors.

With the computer operational, I could finally run my program. After a few bug fixes, my program used the computers's reader/punch to punch a card with a special hole pattern:

A punch card with "Merry Xmas" and a tree punched into it.

Happy holidays everyone!12

Conclusion

After all this debugging, what was the root cause of the problems? As far as we can tell, the original problem was the inductor failure and it's just a coincidence that the problem occurred after the power loss during system startup. The new AQW card must have caused the fuse to blow, although we don't have a smoking gun.13 The reason the -6V power supply wasn't showing any voltage is because it was sequenced by relay #1, which didn't close because of the fuse. The bad transistors in the -6V power supply problem were apparently a pre-existing and non-critical problem; the good transistors handled enough load to keep the power supply working. The moral from all this is that keeping an old computer running is challenging and takes a talented team.

Thanks to Robert Baruch for the inductor photos. Thanks to Carl Claunch for providing analysis. The Computer History Museum in Mountain View runs demonstrations of the IBM 1401 on Wednesdays and Saturdays so check it out if you're in the area; the demo schedule is here.

Follow me on Twitter or RSS to find out about my latest blog posts.

Notes and references

Although there are two IBM 1401 computers at the CHM, only one of them has the "column binary punch" feature that I needed. "Column binary" lets you punch arbitrary patterns on a punch card (to store binary) rather than being limited to the standard punch card character set of 64 characters. ↩
Note that the 1401 has 4000 characters of memory and not 4096 because it is a decimal machine. Also, the memory stores 6-bit characters plus a (metadata) word mark and not bytes. ↩
If you want to know more about the 1401's core memory, I've written in detail about core memory and described a core memory fix . ↩
The trick that makes core memory work is that the cores have extremely nonlinear magnetic characteristics. If you pass a current (call it I) through a wire through a core, the core will become magnetized in that direction. But if you pass a smaller current (I/2) through a wire, the core doesn't change magnetization at all. The result is that you can put cores on a grid of X and Y wires. If you put current I/2 through an X wire and current I/2 through a Y wire, the core at their intersection will get enough current to change state, while the rest of the cores will remain unchanged. Thus, individual cores can be selected. ↩
The matrix switch is another set of cores in a grid, but used to generate pulses rather than store data. The 1401's memory has 50 X lines and 80 Y lines (yielding 4000 addresses), so generating the X and Y pulses with transistors would require 50 + 80 expensive, high-current transistors. The X matrix switch has 5 row inputs and 10 column inputs, and 50 outputs—one from each core. The address is decoded to generate the current pulses for these 15 inputs. Thus, instead of using transistor circuits to decode and drive 50 lines, just 15 lines need to be decoded and driven, and the matrix switch generates the final 50 lines from these. The Y lines are similar, using a second matrix switch to drive the 80 Y lines. ↩
Each matrix switch has two current inputs (for the row select and the column select), so there are four current source boards and four current driver boards in total. ↩
Strangely, half the inductor is nicely wound while the winding in the other half is kind of a mess.

↩
The faulty inductor from the IBM 1401.
The 1401 has more power supplies that aren't visible in the picture. They are behind the power supplies in the photo and slide out from the side for maintenance. ↩
If you want to see the original schematics and diagrams of the 1401's power supplies, you can find them here. Core memory schematics are here. ↩
The pencilled-in fused on the schematic also had a note about an IBM "engineering change". In IBM lingo, an engineering change is a modification to the design to fix a problem. Thus, it appears the the 1401 originally didn't have the fuse, but it was added later. Perhaps we weren't the first installation to have this problem, and the fuse was added to prevent more serious damage. ↩
The 18V differential supply provides 12 volts. This seemed contradictory, but there's an explanation. The core memory circuitry is referenced to +30 volts. It needs a supply 18 volts lower, which is provided by the 18V differential supply. Thus, the voltage is +12V above ground. Unlike the regular +12V power supply, however, the differential power supply's output will move with any changes to the +30V supply, ensuring the difference is a steady 18 volts. ↩
The "Merry Xmas" card was inspired by a tweet from @rrragan. (I had also designed a card with a menorah, but unfortunately encountered keypunch problems and couldn't get it completed in time. Maybe next year.) Punch cards normally encode characters by punching up to three holes per column. Since this decorative card required many holes per column, I needed to use the 1401's column binary feature, which allows arbitrary binary data to be punched. I ended up punching the card upside down to simplify the program:

↩
Front of my "Merry Xmas" punch card.
After carefully examining the AQW boards, we determined that one- and two-transistor cards should be compatible. The two-transistor board had the two transistors in parallel, probably using earlier transistors that couldn't handle as much current. It's possible that the filter capacitor between +30V and ground was shorted in the replacement AQW board, blowing the fuse. ↩