Repairing a 1960s mainframe: Fixing the IBM 1401's core memory and power supply

A few weeks ago, I wanted to use one of the vintage IBM 1401 mainframe computers at the Computer History Museum, but the computer wasn't working.1 This article describes the multi-week repair process to get the computer working again.

The problem started when the machine was powered up at the same time someone shut down the main power, apparently causing some sort of destructive power transient. The computer's core memory completely stopped working, making the computer unusable. To fix this, we had to delve into the depths of the computer's core memory circuitry and the power supplies.

The IBM 1401 computer. The card reader/punch is in the foreground. The 12K memory expansion box is partially visible to the right behind the 1401.

Debugging the core memory

The IBM 1401 was a popular business computer of the early 1960s. It had 4000 characters of internal core memory with an additional 12000 characters in an external expansion box.2 Core memory was a popular form of storage in this era as it was relatively fast and inexpensive. Each bit is stored in a tiny magnetized ferrite ring called a core. (If you've ever heard of a "core dump", this is what the term originally referred to.) The photo below is a magnified view of the cores, along with the red wires used to select, read and write the cores.4 The cores are wired in an X-Y grid; to access a particular address, one of the X lines is pulsed and one of the Y lines is pulsed, selecting the core where they intersect.3
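The X-Y selection idea can be sketched in a few lines of Python. This is only an illustrative model: the 50 × 80 grid and the half-current scheme come from the notes at the end of this article, while the threshold value and the function name are made up for the example.

```python
# Illustrative model of coincident-current selection in one core plane.
# Grid size (50 X lines, 80 Y lines) is taken from the notes; the current
# units and the flip threshold are arbitrary assumptions for this sketch.

X_LINES = 50
Y_LINES = 80
HALF_CURRENT = 0.5      # each select line carries half the switching current
FLIP_THRESHOLD = 0.75   # hypothetical: a core flips only above this current


def pulse(x_line, y_line):
    """Return the (x, y) coordinates of every core that would change state."""
    flipped = []
    for x in range(X_LINES):
        for y in range(Y_LINES):
            current = 0.0
            if x == x_line:
                current += HALF_CURRENT
            if y == y_line:
                current += HALF_CURRENT
            # Half-selected cores (on only one pulsed line) stay below threshold.
            if current > FLIP_THRESHOLD:
                flipped.append((x, y))
    return flipped
```

Calling `pulse(12, 34)` returns only `(12, 34)`: the 128 other cores sitting on one of the two pulsed lines see half the current and are left alone.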

Detail of the core memory in the IBM 1401. Each toroidal ferrite core stores one bit.

In the 1401, there are 4000 cores in each grid, forming a core plane that stores 4000 bits. Planes are then stacked up, one for each bit in the word, to form the complete core module, as shown below.
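As a rough picture of how stacked planes hold a character, here is a small Python sketch. It assumes 50 × 80 planes and seven planes per stack (six data bits plus the word mark mentioned in the notes); a real module may carry further planes, and the address-to-grid mapping shown is invented for the example rather than taken from the 1401.

```python
# Sketch of a core stack: one plane per bit, same (x, y) position in each.
# Assumptions for illustration: 50 x 80 planes and seven planes per stack.
# The address-to-(x, y) mapping is hypothetical, not the 1401's wiring.

X_LINES, Y_LINES = 50, 80
NUM_PLANES = 7


def make_stack():
    """Create NUM_PLANES planes, each an X_LINES x Y_LINES grid of 0 bits."""
    return [[[0] * Y_LINES for _ in range(X_LINES)] for _ in range(NUM_PLANES)]


def locate(address):
    """Hypothetical mapping from an address (0-3999) to grid coordinates."""
    if not 0 <= address < X_LINES * Y_LINES:
        raise ValueError("address out of range")
    return address // Y_LINES, address % Y_LINES


def write(stack, address, bits):
    """Store one bit in each plane at the position for this address."""
    if len(bits) != NUM_PLANES:
        raise ValueError("need exactly one bit per plane")
    x, y = locate(address)
    for plane, bit in zip(stack, bits):
        plane[x][y] = bit


def read(stack, address):
    """Collect the bit at the same position from every plane."""
    x, y = locate(address)
    return [plane[x][y] for plane in stack]
```

The point of the layout is that a single X-Y selection addresses the same position in every plane at once, so all bits of a character are accessed together.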

The 4000 character core memory module from an IBM 1401 computer. Tiny ferrite cores are strung on the red wires.

To diagnose the memory problem, the team started probing the 1401 with an oscilloscope. They checked the signals that select the core module, the memory control signals, the incoming addresses, the clock signals and so forth, but everything looked okay.

The next step was to see if the X and Y select signals were being generated properly. These pulses are generated by two boards called "matrix switches", one for the X pulse and one for the Y pulse.5 Some address lines are decoded and fed into the X matrix switch, while the other address lines are decoded and fed into the Y matrix switch. The matrix switches then create pulses on the appropriate X and Y select lines to access the desired address in the core planes.
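The decoding job of the X matrix switch can be sketched as follows. The 5 × 10 arrangement with 50 outputs comes from the notes; which address bits feed which input is not modeled, so the index arithmetic and function names here are assumptions for illustration only.

```python
# Sketch of the X matrix switch as a 5 x 10 grid with one output per
# intersection (dimensions from the notes). The way a line number is split
# into a row and a column is an assumption, not the 1401's actual decoding.

ROWS, COLS = 5, 10


def decode(x_line):
    """Split an X line number (0-49) into the row and column input to drive."""
    if not 0 <= x_line < ROWS * COLS:
        raise ValueError("X line out of range")
    return divmod(x_line, COLS)


def switch_outputs(row, col):
    """Return the output lines that pulse when one row and one column are driven."""
    return [r * COLS + c
            for r in range(ROWS)
            for c in range(COLS)
            if r == row and c == col]
```

For every line number `x` from 0 to 49, `switch_outputs(*decode(x))` yields just `[x]`, so 15 driven inputs are enough to pick any one of the 50 X lines.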

The photo below shows the core memory module and its supporting circuitry inside the 1401. The core memory module itself is at the bottom, with the two matrix switch boards mounted on it. Above it, three rows of circuit boards (each the size of a playing card) provide the electronics. The top row consists of inhibit drivers (used for writing memory) and the current source and current driver boards (providing current to the matrix switches). The middle row has 17 boards to decode the memory addresses. In the bottom row, 19 sense amplifier boards read the data signals from the cores. As you can see, core memory requires a lot of supporting electronics and wiring. Also note the heat sinks on most of these boards due to the high currents required by core memory.

Inside the IBM 1401 computer, showing the key components of the core memory system.

After some oscilloscope measurements, we found that one of the matrix switches wasn't generating pulses, which explained why the memory wasn't working. We started checking the signals going into the matrix switch and found one matrix switch input line showed some ringing, apparently enough to keep the matrix switch from functioning.

Since the CHM has two 1401 computers, we decided to swap cards with the good machine to track down the fault. First we tried swapping the thermal switch board (below). One problem with core memory is that the properties of ferrite cores change with temperature. Some computers avoid this problem by heating the core memory to a constant temperature in air (as in the IBM 1620 computer) or an oil bath (as in the IBM 7090). The 1401 on the other hand uses temperature-controlled switches to adjust the current based on the ambient temperature. We swapped the "AKB" thermal switch board (below) and the associated "AKC" resistor board, with no effect.

The core memory uses a thermal switch board to adjust the current through core memory as temperature changes.  The switches open at 35°C, 29°C and 22°C.  The type of the board (AKB) is stamped into the lower left of the board.

Next we tried swapping the "AQW" current source boards that control current through the matrix switches.6 We swapped these boards and the 1401's memory started working. Replacing the original boards one at a time, we found the bad board, shown below.

The IBM 1401 has four "AQW" cards that generate currents for the core memory switches. This card had a faulty inductor (the upper green cylinder), preventing core memory from working.

I examined the bad board and tested its components with a multimeter. There were two 1.2mH inductors on the board (the large green cylinders). I measured 3 ohms across one and 3 megohms across the other, indicating that the second inductor had failed. With an open inductor, the board would only provide half the current. This explained why the matrix switch wasn't generating pulses, and thus why the core memory didn't work.

I gave the bad inductor to Robert Baruch of Project 5474 for analysis. He found that the connection between the lead and the inductor wire was intermittent. He dissolved the inductor's package in acid and took photographs of the winding inside the inductor.7

The faulty inductor from the IBM 1401 showing the failed connection.

We looked in the spare board cabinet for an AQW board to replace the bad one and found several. However, the replacement boards were different from the original—they had one power transistor instead of two. (Compare the photo below with the photo of the failed card from the computer.)

The replacement AQW card had one transistor instead of two, but was supposedly compatible with the old board.

Despite misgivings from some team members, the bad AQW card was replaced with a one-transistor AQW card and we attempted to power the system back up. Relays clicked and fans spun, but the computer refused to power up. We put the old card back (after replacing the inductor), and the computer still wouldn't start. So now we had a bigger problem. Apparently something had gone wrong with the computer's power supplies so the debugging effort switched focus.

Diagnosing the power supply problem

The power supply system for the IBM 1401 is more complex than you might expect. Curiously, the main power supplies for the system are inside the card reader; a 1250W ferro-resonant transformer in the card reader regulates the line input AC to 130V AC, which is fed to the 1401 computer itself through a thick cable under the floor. Smaller power supplies inside the 1401 then produce the necessary voltages.

Since it was built before switching power supplies became popular, the IBM 1401 uses bulky linear power supplies. The photo below shows (left to right) the +30V, -6V, +6V and -12V supplies.8 In the lower left, under the +30V supply, you can see eight relays for power sequencing. The circuit board to the right of the relays is one of the "sense cards" that checks for proper voltages. Under the +6V supply is a small "+18V differential" supply for the core memory. Foreshadowing: these components will all be important later.9

Power supplies in the IBM 1401.

After measuring voltages on the multiple power supplies, the team concluded that the -6V power supply wasn't working right. This was a bit puzzling because the AQW card (the one we replaced) only uses +12 and +30 volts. Since it doesn't use -6 volts at all, I didn't see how it could mess up the -6 volt supply.

Inside the IBM 1401's -6V power supply.

The team removed the -6V supply and took it to the lab. In the photo above, you can see the heavy AC transformer and large electrolytic capacitors inside the power supply. Measuring the output transistors, they found one bad transistor and some weak transistors and decided to replace all six transistors. In the photo below, you can see the new transistors, mounted on the power supply's large heat sink. These are germanium power transistors; the whole computer is pre-silicon.

The -6V power supply from the IBM 1401 uses six power transistors on a large heat sink.

The -6V power supply tested okay in the lab with the new transistors, so it was installed back in the 1401. We hit the "Power On" button on the console and... it still didn't work. We still weren't getting -6V and the computer wouldn't power up.

In the next repair session, we tried to determine why the computer wasn't powering up. Recall the eight relays mentioned earlier; these relays provide AC power to the power supplies in sequence to ensure that the supplies start up in the right order. If there is a problem with a voltage, the next relay in the sequence won't close and the power-up process will be blocked. We looked at which relays were closing and which weren't, and measured the voltages from the various power supplies. Eventually we determined that about halfway through the power-up process, relay #1 was not closing when it should, stopping the power-up sequence.
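The blocking behavior of relay sequencing can be illustrated with a toy Python model. The supply order, the 10% tolerance and the function name are hypothetical; only the idea that each stage must check out before the next relay closes comes from the description above.

```python
# Toy model of relay-based power sequencing. The supply order and the
# tolerance are hypothetical; only the principle (a bad voltage keeps the
# next relay from closing) is taken from the text.

def power_up(sequence, measured, tolerance=0.1):
    """Walk the sequence and stop at the first supply whose voltage is off.

    sequence: list of (name, nominal_volts) in power-up order
    measured: dict of name -> measured volts (a missing entry means 0 V)
    Returns (supplies_brought_up, name_of_blocking_supply_or_None).
    """
    up = []
    for name, nominal in sequence:
        actual = measured.get(name, 0.0)
        if abs(actual - nominal) > abs(nominal) * tolerance:
            return up, name  # sense check fails, so the next relay never closes
        up.append(name)
    return up, None
```

With a made-up order of +30V, +6V, -6V, -12V and a dead +6V stage, the function reports that only +30V came up and that +6V is the blocker. The later supplies then read as dead even though nothing is wrong with them, which is why watching which relays close is a useful way to localize the fault.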

Relay #1 was driven by the +30V supply and was activated by a "sense card" that checked the +6V supply. But the +30V and +6V supplies were powering up fine and the sense card was switching on properly. Thus, the problem seemed to be a failure with the relay itself. Just before we pulled out the relay for testing, someone found an updated schematic showing the relay didn't use the regular +30V supply but instead obtained its 30 volts through the "18V differential supply".11 And the schematic for the 18V differential supply had a pencilled-in fuse.10

Could the power problem be as simple as a burnt-out fuse? We opened up the 18V differential supply, and sure enough, there was a fuse and it was burnt out. After replacing the fuse, the system powered up fine and we were back in business.
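The name "18V differential" can look odd for a supply that delivers 12 volts. The notes explain that its output sits a fixed 18 volts below the +30V rail; the short Python sketch below just spells out that arithmetic, with function names chosen for illustration.

```python
# Arithmetic behind the "18V differential" supply, per the notes: its output
# sits a fixed 18 V below the +30 V rail. Function names are illustrative.

DIFFERENTIAL_VOLTS = 18.0


def differential_output(rail_30v):
    """Output relative to ground when the supply tracks the +30 V rail."""
    return rail_30v - DIFFERENTIAL_VOLTS


def gap_with_fixed_supply(rail_30v, fixed_12v=12.0):
    """Gap between the +30 V rail and an ordinary fixed +12 V supply."""
    return rail_30v - fixed_12v
```

If the +30V rail drifts up to 31 volts, the differential output follows to 13 volts and the gap stays at 18 volts, whereas a fixed +12V supply would leave a 19 volt gap.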

The 18V differential power supply in the IBM 1401 provides 12 volts to the core memory. The fuse is under the large electrolytic filter capacitors.

With the computer operational, I could finally run my program. After a few bug fixes, my program used the computer's reader/punch to punch a card with a special hole pattern:

A punch card with "Merry Xmas" and a tree punched into it.

Happy holidays everyone!12

Conclusion

After all this debugging, what was the root cause of the problems? As far as we can tell, the original problem was the inductor failure and it's just a coincidence that the problem occurred after the power loss during system startup. The new AQW card must have caused the fuse to blow, although we don't have a smoking gun.13 The -6V power supply wasn't showing any voltage because it was sequenced by relay #1, which didn't close because of the blown fuse. The bad transistors in the -6V power supply were apparently a pre-existing and non-critical problem; the good transistors handled enough load to keep the power supply working. The moral from all this is that keeping an old computer running is challenging and takes a talented team.

Thanks to Robert Baruch for the inductor photos. Thanks to Carl Claunch for providing analysis. The Computer History Museum in Mountain View runs demonstrations of the IBM 1401 on Wednesdays and Saturdays so check it out if you're in the area; the demo schedule is here.

Notes and references

  1. Although there are two IBM 1401 computers at the CHM, only one of them has the "column binary punch" feature that I needed. "Column binary" lets you punch arbitrary patterns on a punch card (to store binary) rather than being limited to the standard punch card character set of 64 characters. 

  2. Note that the 1401 has 4000 characters of memory and not 4096 because it is a decimal machine. Also, the memory stores 6-bit characters plus a (metadata) word mark and not bytes. 

  3. If you want to know more about the 1401's core memory, I've written in detail about core memory and described a core memory fix.

  4. The trick that makes core memory work is that the cores have extremely nonlinear magnetic characteristics. If you pass a current (call it I) through a wire through a core, the core will become magnetized in that direction. But if you pass a smaller current (I/2) through a wire, the core doesn't change magnetization at all. The result is that you can put cores on a grid of X and Y wires. If you put current I/2 through an X wire and current I/2 through a Y wire, the core at their intersection will get enough current to change state, while the rest of the cores will remain unchanged. Thus, individual cores can be selected. 

  5. The matrix switch is another set of cores in a grid, but used to generate pulses rather than store data. The 1401's memory has 50 X lines and 80 Y lines (yielding 4000 addresses), so generating the X and Y pulses with transistors would require 50 + 80 expensive, high-current transistors. The X matrix switch has 5 row inputs and 10 column inputs, and 50 outputs—one from each core. The address is decoded to generate the current pulses for these 15 inputs. Thus, instead of using transistor circuits to decode and drive 50 lines, just 15 lines need to be decoded and driven, and the matrix switch generates the final 50 lines from these. The Y lines are similar, using a second matrix switch to drive the 80 Y lines. 

  6. Each matrix switch has two current inputs (for the row select and the column select), so there are four current source boards and four current driver boards in total. 

  7. Strangely, half the inductor is nicely wound while the winding in the other half is kind of a mess.

    The faulty inductor from the IBM 1401.

  8. The 1401 has more power supplies that aren't visible in the picture. They are behind the power supplies in the photo and slide out from the side for maintenance. 

  9. If you want to see the original schematics and diagrams of the 1401's power supplies, you can find them here. Core memory schematics are here.

  10. The pencilled-in fuse on the schematic also had a note about an IBM "engineering change". In IBM lingo, an engineering change is a modification to the design to fix a problem. Thus, it appears that the 1401 originally didn't have the fuse, but it was added later. Perhaps we weren't the first installation to have this problem, and the fuse was added to prevent more serious damage. 

  11. The 18V differential supply provides 12 volts. This seemed contradictory, but there's an explanation. The core memory circuitry is referenced to +30 volts. It needs a supply 18 volts lower, which is provided by the 18V differential supply. Thus, the voltage is +12V above ground. Unlike the regular +12V power supply, however, the differential power supply's output will move with any changes to the +30V supply, ensuring the difference is a steady 18 volts. 

  12. The "Merry Xmas" card was inspired by a tweet from @rrragan. (I had also designed a card with a menorah, but unfortunately encountered keypunch problems and couldn't get it completed in time. Maybe next year.) Punch cards normally encode characters by punching up to three holes per column. Since this decorative card required many holes per column, I needed to use the 1401's column binary feature, which allows arbitrary binary data to be punched. I ended up punching the card upside down to simplify the program:

    Front of my "Merry Xmas" punch card.

  13. After carefully examining the AQW boards, we determined that one- and two-transistor cards should be compatible. The two-transistor board had the two transistors in parallel, probably using earlier transistors that couldn't handle as much current. It's possible that the filter capacitor between +30V and ground was shorted in the replacement AQW board, blowing the fuse. 
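The column binary punching mentioned in notes 1 and 12 can be pictured with a short Python sketch that turns a 12-row picture into a list of holes per column. The card geometry (80 columns, rows 12, 11, 0, 1 through 9 from the top) is standard; the function name and picture format are invented here, and this is not the program that ran on the 1401.

```python
# Sketch: turn a 12-row picture into per-column hole lists for a punch card.
# A card has 80 columns and 12 rows (12, 11, 0, 1 ... 9 from the top).
# How the actual 1401 program laid out its data is not shown in the article;
# this only illustrates the idea of arbitrary holes in each column.

ROW_NAMES = ["12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
COLUMNS = 80


def picture_to_columns(picture, hole="#"):
    """picture: 12 strings (top row first); returns 80 lists of row names."""
    if len(picture) != len(ROW_NAMES):
        raise ValueError("a card has exactly 12 rows")
    columns = []
    for col in range(COLUMNS):
        punched = [ROW_NAMES[row]
                   for row, line in enumerate(picture)
                   if col < len(line) and line[col] == hole]
        columns.append(punched)
    return columns
```

A decorative pattern easily produces columns with far more than the few holes a normal character uses, which is why the standard character set is not enough for a card like this.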

19 comments:

  1. This may be too obvious but I wondered if you knew of the existence of the sublime "IBM 1401 - A User's Manual" by Johan Johansson https://open.spotify.com/user/grahamrowe/playlist/6H1u9rbhwLOuA20DzTzfVo
    A beautiful elegy to his father. https://pitchfork.com/reviews/albums/9583-ibm-1401-a-users-manual/

  2. Brian: yes, I know of the Johansson album. It turns up every time I look for documentation and do a Google search for IBM 1401 manuals :-)

  3. Wow, brings back memories from an era bygone.. CDC CYBERs, DEC PDPs and VAX systems.

  4. My first computer, I programmed it in Autocoder.

  5. Thank you. This is a beautiful peek into the early days. Reading about an updated schematic with a penciled in fuse is a nice present on this Christmas morning.

  6. It's a shame you found the blown fuse last, might that have saved some trouble? Probably not. But swapping known good cards into a known bad system? You are far more daring than I am.

    This was almost my first computer, I worked decommissioning a 1401, porting Courseware programs into DEC Basic.

  7. Multnomah County (Portland, Oregon) school district had an IBM 1401 that the high schools could submit punched cards to, and receive line-printer output. Circa 1969-1972 or so. Each school got a key punch machine to go with the teletype connected to the HP 2000 Time Share Basic computer in the Math department.

    We got to see the computers on a field trip to the county building. I was more impressed with the HP computers than the IBM. The 1401 was strictly batch.

    Basically, the 1401 was to introduce us to Fortran programming, but in 1971 a Fortran simulator was written in Basic, and the keypunch was removed to the Business class, to teach keypunch skills.

  8. Nice job! As you mentioned the keypunch, though, couldn't you have just used the multi-punch feature on the 029 and auto-dup the master card? :-)

  9. I am in the midst of making a card reader using an arduino and some homemade parts and pieces. My Uncle who died about ten years ago left me an Argosy Airstream trailer, in it was some core memory from a univac computer and some punch cards from an old ibm computer. Amongst a few other odd and ends as well. I really enjoyed reading your article. My Uncle used to work on these computers and would have loved reading this. I keep his parts in a shadow box on the wall and am using some of the NOS punch cards for my build. I am going to have to punch it by hand though. Thanks for all the hard work. I love seeing this old stuff being saved.

  10. Thanks everyone for the interesting comments. Mike: I could have used multi-punch, but where's the challenge in that? Also, keypunches don't like duplicating cards with weird hole patterns and it can actually damage them (the mechanism to move the code plate for printing gets pushed in bad directions). SimonElse: let me know if you need some cards punched for your project.

  11. I was reading a report from 1972 that said the radar digitizer was causing RF interference to the long haul radio, when its doors were opened.

    Technicians were sent out to analyze the problem. Yep, when the doors were opened, the radio link suffered. Closer inspection found that the component causing the interference was the core memory of the digitizer.

    Probably doesn't meet FCC Part 15 specifications for unlicensed operation :-)

  12. Hi Ken,
    Your narrative brought back memories from Madras 1974! I worked for IBM then, and went to fix a problem on a 1401 at a customer location - Rats had gotten into the area above the memory stack and done their bit! Anyhow, I ended up un-soldering an entire core plane from the middle of the stack and replacing it with another one I had to un-solder from a spare (which had other bad frames), then replacing it and soldering all the connections. It worked, the customer and my field manager could not believe it!
    I recall this outfit was an ordnance factory; even the rat traps were painted in camouflage drab :)
    Hope to drop by sometime at the museum when I come by Silicon Valley and flip through the ILD's and ALDs's if possible.
    RK

  13. A friend and co-worker, Wayne Linder, came to help me troubleshoot an intermittent problem on a 1401 in the DC area many years ago. Wayne asked me to tap the back of the Tape control gate while he traced a circuit with a Tektronix Scope. I forgot that I still had my wedding band on and proceeded to short out something. It took us about 4 hours to fix that problem so that we could get back to the original bug. Ike Cabase.

  14. HaHa! That reminds me of the time when someone accidentally dropped a coin on the SMS card gate in a 7330 - As luck would have it, it happened to come to rest on the one card that happened to have the insulating coating on the land pattern on it - whew!

    ... and I know about the problems with the early TAU-9 tape control SMS cards -- It was 1967 or 1968 ?, my first 1401 account, and almost daily I had to troubleshoot intermittents. Invariably it was due to corrosion of the sharply bent through-hole leads of the 101 series transistors. The SMS card land patterns were originally designed for larger transistors and had greater lead spacings, and the 101's were smaller, so when these were mounted on the cards, the leads had to be formed into a shape that would allow proper mounting and soldering. This caused minute cracks in the nickel(?) plating on the lead wires where they were bent. Over time, (and possible storage under humid conditions) rust would form and weaken the leads to the point where they would crack with normal vibration. If you were lucky, this could be seen in the 'scope signals if you knew where to look but most often, we would have to whack the logic gates with a handful of 5081 punched cards to provide not-so-gentle encouragement to induce failure!

  15. Really brought back my old memories of our work horse 1401. Starting from mid 60s till almost early 80s (yes early 80s) 1401s were the most popular used computers in India. Worked a lot on these machines during that time.

    18 Volt differential power supply was adjusted to optimize the memory operation and the fuse was notorious for the powering up problems since there was no physical indication. All other power supplies had CBs which could be easily observed.

  16. Yes, bm! Do you recall "schmoo-ing" -- the process of optimizing the PS voltages to ensure that the average core would be magnetically biased correctly within the hysteresis curve. And it was so temperature dependent!

  17. ccdman (RK) You really helped me to recollect the expression "schmooing" which we used at that time. Your narration of incidence of rats reminded me of one of our installations where 1401 was located right above the canteen of that institute. It was practically annual affair to have rat droppings giving us intermittent 'process errors'. Once they also happened to invade 1311 disks which took almost 10hrs to diagnose the problem.

  18. In 1967 I was the Controller of Aerosol Techniques Inc (Eastern Division). I was responsible for the operation of an IBM 1401 like this. Our work on this machine was the subject of a Harvard Business School 'case' after we worked through all the data collection issues outside the technical operation of the computer itself. We used a truckload of punched cards every week. I often talk about the 4K of main memory, so very good to see pictures of this together with a great description of what the memory consisted of! Thanks for this trip down my memory lane. Peter Burgess

  19. Hello, I started to work at IBM (in Brazil) in 1974 and was immediately involved with the end of IBM /360 systems and starting with IBM /370 systems, as 3125, 3145, 3148 and such. I made all the basic training of the I/Os (card readers and punch units, printers 1403, tape drives - 2420 and 3420, Hard Disks 2319 and 3330). It was a fantastic adventure through a technology that for such time was way advanced, over what I knew of electronics - vacuum tubes and first transistors and starting the TTL 7400 family of IC's from Texas Instruments. Yes, it was a fascinating diving into a technology that only huge corporations as IBM had at that time. Traveled the world to make training for those machines at IBM plans. I was specially mind driven by the idea of making a whole 3145 processor training in 2 months, not only learning how to debug everything (and I mean "everything"), but also microcode loaded from a 8" floppy disk. Several problems on the processor hardware would be located analyzing the microcode failure, when it fails and why it fails. I really exploded my mind in knowledge at those years. It gave me great advantage above other people in my long future ahead, since I learn how to do things in the right way, technically speaking. My future in electronics (until today) is profoundly based on IBM quality and sound thinking, do it well, do it good, do it better than perfect. IBM was gifted by stating all their ideals and future objectives with one simple word, "THINK".
