As soon as we finished repairing a printer failure at the Computer History Museum, Murphy's law struck and the card reader started malfunctioning. The printer and card reader were attached to an IBM 1401, a business computer that was announced in 1959 and went on to become the best-selling computer of the mid-1960s. In that era, data records were punched onto 80-column punch cards and then loaded into the computer by the card reader, which read cards at the remarkable speed of 13 cards per second. This blog post describes how we debugged the card reader problem, eventually tracking down and replacing a faulty electromechanical relay inside the card reader.
The IBM 1402 card reader at the Computer History Museum. Cards are loaded into the hopper on the right. The front door of the card reader is open, revealing the relays and other circuitry.
The card reader malfunction started happening every time the "Non-Process Run-Out" mechanism (NPRO) was used. During normal use, if the card reader stopped in the middle of processing, unread cards could remain inside the reader. To remove them, the operator would press the "Non-Process Run-Out" switch, which would run the remaining cards through the reader without processing them. Normally, the "Reader Stop" light (below) would illuminate after performing an NPRO. The problem was the "Reader Check" light also came on, indicating an error in the card reader. Since there was no actual error and the light could be cleared simply by pressing the "Check Reset" button, this problem wasn't serious, but we still wanted to fix it.
Control panel of the 1402 card reader showing the "Reader Check" error. The Non-Process Run-Out switch is used to run cards out of the card reader without processing them.
To track down the problem, the 1401 restoration team started by probing the card reader error circuitry inside the computer with an oscilloscope. (The card reader itself is essentially electromechanical; the logic circuitry is all inside the computer.) The photo below shows the 1401 computer with a swing-out gate opened to access the circuitry. Finding a circuit inside the IBM 1401 is made possible by the binders full of documentation showing the location and wiring of all the computers circuitry. This documentation was computer generated (originally by a vacuum tube IBM 704 or 705 mainframe), so the diagrams were called Automated Logic Diagrams (ALD).
The IBM 1401 with gate 01B4 opened. The yellow wire-wrapped wiring connects the circuit boards that are plugged into the gate. The computer's console is visible on the front of the computer.
The reader check error condition is stored in a latch circuit; the latch is set when an error signal comes in, and cleared when you press the reset button. To find this circuit we turned to the ALD page that documented the card reader's error checking circuitry. The diagram below is a small part of this ALD page, showing the read check latch of interest: "READ CHK LAT". Each basic circuit (such as a logic gate) is drawn on the ALD as a box, with lines showing how they are connected. Inside the box, cryptic text indicates what the circuit does and where it is inside the computer. For example, the latch (lower two boxes) is constructed from a circuit card of type CQZV (an inverter) and a CHWW card (a NAND gate). The text inside the box also specifies the location of each card (slots D21 and D24 in gate 01B4), allowing us to find the cards in the computer. Note that the RD REL CHK signal comes from ALD page 126.96.36.199; this will be important later.
The read check latch circuit, excerpted from the ALD 188.8.131.52.
The schematic below shows the latch redrawn with modern symbols. If the RESET line goes low, it will force the inverted output high. Otherwise, the output will cycle around through the gates, latching the value. If any OR input is high (indicating an error), it will force the latch low. Note the use of wired-OR—instead of using an OR gate, signals are simply wired together so if any signal is high it will pull the line high. Because transistors were expensive when the IBM 1401 was built, IBM used tricks like wired-OR to reduce the transistor count. Unfortunately, the wired-OR made it harder to determine which error input was triggering the latch because the signals were all tied together.
The read check latch circuit redrawn with modern symbols.
Once we located the circuit cards for the latch, we used an oscilloscope to verify that the latch itself was operating properly.
Next, we needed to determine why it was receiving an error input.
After disconnecting wires to get around the wired-OR, we found that the error was not coming from the Read/punch error or the Feed error signal.
The RD REL CHK was the obvious suspect; this signal was part of the optional Read Punch Release feature1.
However, the team insisted that Read Punch Release wasn't installed in our 1401.
The source of this signal was ALD 184.108.40.206 and
our documentation didn't include ALD section 56, confirming that this feature wasn't present in our system.
Additional oscilloscope tracing showed a lot of noise on some of the signals from the card reader. This wasn't unexpected since the card reader is built from electromechanical parts: relays, cam switches, brushes, motors, solenoids and other components that generate noise and voltage spikes. I considered the possibility that a noise spike was triggering the latch, but the noise wasn't reaching that circuit.
The plug charts show the type of card in each position in the computer, and the function assigned to it. This is part of the plug chart for gate 01B4.
At this point, I was at a dead end, so I took another look at the RD REL CHK signal to see if maybe it did exist. The 1401's documentation includes "plug charts," diagrams that show what circuit card is plugged into each position in the computer. I looked at the plug chart for the card reader circuitry in gate 01B4 (swing-out gate, not logic gate). The plug chart (above) showed cards assigned to the mysterious ALD 220.127.116.11, such as the cards in slots A15-A17 and B15. (The plug chart also had numerous pencil updates, crossing out cards and adding new ones, which didn't give me a lot of confidence in its accuracy.) I looked inside the computer and found that these cards, the Read Punch Release cards generating RD REL CHK, were indeed installed in the computer. So somehow our computer did have this feature.
The gate in the 1401 holding the card reader circuitry. Note the cards in positions A15 and B15.
The problem was that even though these cards were present in the system, we didn't have the ALDs that included them, leaving us in the dark for debugging. I checked the second 1401 at the Computer History Museum; although it too had the cards for Read Punch Release, its documentation binders also mysteriously lacked the section 56 ALDs. Fortunately, back in 2006, the Australian Computer Museum Society sent us scans of the ALDs for their 1401 computer. I took a look and found that the Australian scans included the mysterious section 56. There was no guarantee that their 1401 had the same wiring as ours (since the design changed over time), but this was all I had to track down RD REL CHK.
Simplified excerpt of ALD 18.104.22.168 from an Australian 1401 computer.
According to the Australian ALD above, RD REL CHK was generated by the CGVV card in slot A17 (upper right box above). The oscilloscope trace below confirmed that this card was generating the RD REL CHK signal (yellow), but its input (RD BR IMP CB) (cyan) looked bad. Notice that the yellow line jumps up suddenly (as you'd expect from a logic signal), but the cyan line takes a long time to drop from the high level to the low level. Perhaps a weak transistor in a circuit was pulling the signal down slowly, or some other component had failed.
Oscilloscope trace showing the "circuit breaker" signal from the card reader (cyan) and the READ REL CHK error signal (yellow).
We looked into the RD BR IMP CB signal, short for "ReaD BRush IMPulse Circuit Breakers". In IBM terminology, a "circuit breaker" is a cam-operated switch, not a modern circuit breaker that trips when overloaded. The read brush circuit breakers generate timing pulses when the read brushes that detect holes in the punch card are aligned with a row of holes, telling the computer to read the hole pattern.
The NGXX integrator card contains resistor-capacitor filters. Unlike most cards, this one doesn't have any transistors. Photo courtesy of Randall Neff.
We looked up yet another ALD page to find the source of the strangely slow RD BR IMP CB signal. That signal originated in the card reader and then passed through an NGXX integrator card (above). Earlier I mentioned that the signals from the card reader were full of noise. This isn't a big problem inside the card reader since brief noise spikes won't affect relays. But once signals reach the computer, the noise must be eliminated. This is done in the 1401 by putting the signal through a resistor-capacitor low-pass filter, which IBM calls an "integrator". That card eliminates noise by making the signal change very slowly. In other words, although the signal on the oscilloscope looked strange, it was the expected behavior and not a problem. But why was there any signal there at all?
Part of IBM 1402 card reader schematic showing the cams (circles) that generate the CB read pulses and the relay that blocks the pulses during NPRO.
After some discussion, the team hypothesized that the pulses on RD BR IMP CB shouldn't be getting to the 1401 at all doing a Non-Process Run-Out, since cards aren't being read. The schematic4 for the card reader (above) shows the complex arrangement of cams and microswitches that generates the pulses. During an NPRO, relay #4 will be energized, opening the "READ STOP 4-4" relay contacts. This will stop the BRUSH IMP CB pulses from reaching the 1401.53 In other words, relay #4 should have blocked the pulses that we were seeing.
Frank King replacing a bad relay in the 1402 card reader. The relays are next to his right shoulder.
The card reader contains rows of relays; the reader's hardware is a generation older than the 1401 and it implements its basic control functions with relays rather than logic gates. Frank pulled out relay #4 and inspected it.6 The relay (below) has 6 sets of contacts, activated by an electromagnet coil (yellow) and held in position by a second coil. Springs help move the contacts to the correct positions. One of the springs appeared to be weak, preventing the relay from functioning properly.7 Frank put in a replacement relay and found that the card reader now performed Non-Process Run-Outs without any errors. We loaded a program from cards just to make sure the card reader still performed its main task, and that worked too. We had fixed the problem, just in time for lunch.
The faulty relay from the IBM 1402 card reader.
It is still a mystery why section 56 of the ALDs was missing from our documentation. As for the presence of the Read Punch Release feature on our 1401, that feature turns out to be standard on 1401 systems like the ones at the museum.8 I think the belief that our 1401 didn't include this feature resulted from confusion with the Punch Feed Read feature, which we don't have. (That feature allowed a card to be read and then punched with additional data as it passed through the card reader.)
The team that fixed this problem included Frank King, Alexey Toptygin, Ron Williams and Bill Flora. My previous blog post about fixing the 1402 card reader is here, tracking down an elusive problem with a misaligned cam.
I announce my latest blog posts on Twitter, so follow me at @kenshirriff for future articles. I also have an RSS feed. The Computer History Museum in Mountain View runs demonstrations of the IBM 1401 on Wednesdays and Saturdays so if you're in the area you should definitely check it out (schedule).