We've been archiving a bunch of old Xerox Alto disk packs from the 1970s. A few of them turned out to be password-protected, so I needed to figure out how to get around the password protection. I've developed a way to disable password protection, as well as a program to find the password instantly. (This attack is called XeroDay, based on a suggestion by msla.)
The Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. In the photo above, the Alto computer itself is in the lower cabinet. The Diablo disk drive (in 1970s orange, below the keyboard) takes a removable 14 inch disk pack that stores 2.5 megabytes of data. (A bunch of disk packs are visible behind the Alto and in the photo below.) I've been restoring a Xerox Alto, along with Marc Verdiell, Luca Severini and Carl Claunch. (The full set of Alto posts is here and Marc's videos are here.)
Now that we have the Alto running, one project is to archive a bunch of disks that have been sitting at Xerox PARC for decades, and find out if there are any interesting treasures among the disks. We can archive disks by running the Copydisk program on the Alto, and copying them over Ethernet to a PC server. (Ethernet was invented by Xerox for the Alto.) However, it's considerably faster to read the data directly off the disk drive, bypassing the computer. Carl created an FPGA-based disk controller (below) that connects to the Diablo disk drive, speeding up the archiving process.1
Before reading each disk, we open up the pack and carefully clean the surface. After storage for decades, these disks have some grime, dust, and the occasional bug (of the dead insect variety), so we need to clean them to reduce the chance of a head crash.
Most of the archived disks can be booted on the Alto or the ContrAlto simulator. But a few disks only booted to a password prompt (see below), and we couldn't use the disk without the password. So I decided to hack my way into the password-protected disks.
The Alto documentation discusses password protection, explaining how a password can be associated with a disk. It only promises "a modest level of security", which turns out to be true. It also says if you forget the password, "you will need an expert to get access to anything on your disk." But could I break in without finding an expert?
A bit about passwords
Storing passwords in plain text is a very bad idea, since anyone who can access the file can see the password.9 Most systems use a solution invented by Roger Needham in 1967. Instead of storing the password, you hash the password through a cryptographic one-way function and store the hash. When the user inputs a password, you hash it through the same function and compare the hashes. If they match, the passwords match. And if anyone sees the hash, there's no easy way to get the password back.
One problem with hashed passwords is if two users have the same hash, then you know they have the same password. A solution (invented in Unix) is to hash some random bytes (called salt) along with the password to yield the stored hash. Since different users will have different salt, the hashes will be different even if the passwords are the same. (Of course you need to store the salt along with the hash in order to check passwords.)2 Like Unix, the Alto used salted and hashed passwords.
The Alto's hash algorithm
The source code for the Alto's password algorithm reveals how the password hashing is implemented.3 The Alto uses four words of salt with the password (two words based on the password creation time and two words based on the user name). The password hash is 4 words (64 bits) long. The Alto's password hash algorithm is pretty simple:4
Hash c = -a*x*x + b*y
where a
is the time salt and b
is the user name salt. x
is a one-word value generated from the password string and y
is a two-word value from the password string, both generated by
concatenating characters from the password.5
Disabling the password
There's a way to disable the password on disk, gaining access to the file system.6
The Alto boots from disk by running the file sys.boot
; this file decides if a password is required for boot, based on the 9-word password vector stored inside the file.
The first word is a flag indicating if the disk is password protected.
If the password flag is set, the boot program requires a password before proceeding.
The next four words are the salt, and the final four words are the password hash itself.
The password protection can be disabled by clearing the flag word inside sys.boot
, which is the 128th word in the second block of sys.boot
.
The tricky part is finding where this word is on disk.
The file system stores a directory as the name of each file along with the disk address of the file's first block.
By reading the directory as raw data and interpreting it, we can find the location of sys.boot
.
In the Alto file system, each disk block has pointers to the previous and next block.
So once we've found the first block, we can follow the pointer to find the second block.
(Basically I re-implemented a subset of the Alto file system using the raw disk.)
Erasing the password flag in this block makes the disk bootable without the password.
After implementing this, I realized there's a short cut.
The writers of the disk bootstrap code didn't want to re-implement the file system either, so the Alto simply copies the first block of sys.boot
to the first disk sector.
This makes it trivial to find the file, without needing to scan the directory.
Once I have the first block, I can find the second block from the pointer, access the block, and clear the password flag, disabling the security.7
I made a simple Python program to update the disk image file and clear the password flag (link). After running this program, I was able to access the protected disks. The password-protected disks didn't have any super-secret treasures. However, one disk contained an implementation of APL in Mesa, which was interesting. (APL is a cool programming language known for its extremely terse notation and strange character set. Mesa is a high-level language developed at Xerox PARC; it influenced Java.) We haven't gotten Mesa running on the Alto yet, but this looks like an interesting thing to try out.
Brute-forcing the password
While I could access the disk simply by clearing the password flag, I wondered what the original passwords were. I implemented the password algorithm in C (link), so I could rapidly test passwords. My hope was that testing against a list of top 100,000 passwords would find the passwords, since I didn't expect much in the way of 1970s password security practices. Surprisingly, there were no hits so the passwords weren't common words. My next step was brute-forcing the password by generating all strings of length 1, 2, 3 and so forth.8 Finally at length 8, I cracked the first password with "AATFDAFD". The second disk had password "HGFIHD" and the third had "AAJMAKAY". Apparently random strings were popular passwords back then.
I was a bit suspicious when I saw that both 8-character passwords started with "AA" so I investigated a bit more. It turned out that using "AB" in place of "AA" also worked, as did starting with anything in "{A-Z}{A-G}".
Studying the algorithm more closely, I realized that when x
and y
get too long, the old character bits were just dropped. Thus, when you use a password longer than 6 characters, most of the bits from the first characters are lost. This is a pretty big flaw in the algorithm.
Finding the password with math
It takes half an hour or so to brute force the password; can we do better? Yes, by doing some algebra on the password formula yielding:
y = (c + a*x*x) / b
where x
and y
are generated from the password string, a
and b
are salt, and c
is the stored password hash.
Since x
is only 16 bits, we can easily try all the values, finding ones for which the division works.
When we find a solution for y
, we can recover the original password by chopping x
and y
into 7-bit characters.
Using this technique, the password can be recovered almost instantly.
I implemented this in Python here.
Conclusion
The Xerox Alto's disk passwords can be trivially bypassed, and the password can be easily recovered. The password algorithm has multiple flaws that make it weaker than you'd expect (dropping password characters) and easily reversed. Since the system is almost 45 years old, they had to keep the code small and they weren't facing modern threats. After all, Xerox only promised "modest" security with the passwords, and that's what it provided.
Notes and references
-
For more details on the archiving process, see Carl's post on archiving. He writes about the FPGA disk tool here. ↩
-
Salting passwords also protects against password attacks using precomputed rainbow tables, but that wasn't a concern back then. ↩
-
The password code can be viewed at Password.bcpl. The code is in BCPL, which is a predecessor of C. The syntax has some trivial but confusing differences; I'll explain the most important.
p!1
is just an array access, equivalent top[1]
in C.vec 4
allocates 4 words, essentiallymalloc(4)
.ps>>String.char↑i
is equivalent tops->String.char[i]
, accessing a character from the password structure.$a
is'a'
andrem
is%
. Square brackets in BCPL are blocks, like curly braces in C. Also note that the code has inline assembly for the vector addition and multiplication functions. ↩ -
The negation in the hash function is broken; only the top two words are negated. The Password.bcpl code points out this bug, but notes that they couldn't fix it without invalidating all the existing passwords. ↩
-
The
x
value is generated by concatenating the 1st, 4th, ... characters of the password, whiley
consists of the other characters. The concatenation is done using 7-bit characters, for some reason. ↩ -
I thought I could boot off an unprotected disk and then use the Neptune graphical file browser to view protected files in the second drive. However, they thought of this and Neptune also checks for a password. In the screenshot below, Neptune shows the contents of the left disk, but requires a password to show the contents of the right disk. ↩
The Neptune file browser checks for a password. -
A few more details about the Alto file system for reference. Each disk sector consists of a 2-word header, an 8-word label, and a 256-word data record, each with a checksum. (The Alto's file system (like the Alto) uses 16-bit little-endian words.) The file system structures are defined in include file AltoFileSys.D. The header contains the physical disk address of the sector (sector number, track number, head number), which is used to validate that the sector returned from the drive is the correct sector. The label contains file system information: pointers to the next and previous sectors in the file, the number of characters used and the file id. This information can also be used to recover from corruption using the scavenger program, which is similar to Unix's fsck. See the Operating System Reference Manual; section 4.2 describes disk files. ↩
-
The Alto's password algorithm converted all lower case letters to upper case, which reduced the search space. I'm not sure if special characters are allowed or not. ↩
-
The code that calls the password routine has comments about making sure the cleartext password never gets stored on disk, and the in-memory password buffers are zeroed out so they don't get swapped to disk. So it's clear that the writers of the password code were thinking things through. ↩