## August 14, 2022

I’ve recently been thinking about how to do passwords better. I’m not asking whether we ought to be replacing passwords with 2FA or hardware-based authentication or anything like that. The question is: if we have a system that requires a username/password combination as a method of authentication, how do we come up with a good password? My criteria are:

• no single point of failure
• no reliance on the cloud
• easy to remember

A couple of notes on the first two. I should be able to recover from losing all my stuff: that is, if my laptop and phone fall into the sea, I shouldn’t lose access to all my passwords. I should also be able to recover my password without using a third-party service or online system. I don’t want to be held hostage to the whims of some password manager service.

The space of possible passwords is enormous: you probably have 80 or so characters at least for each possible position. If your password is 5 characters long, that’s already about 3.3 billion possibilities. If your password is 10 characters long, that’s 10.7 quintillion possibilities. But the collection of “memorable” passwords only explores a tiny, tiny fraction of this space. Memorable things include birthdays and actual words, so “memorable” passwords will be primarily lowercase letters or numbers. Rather than 80 possibilities per character, that’s 36 possibilities (or 62 if we include uppercase letters as well). Even within the numbers, memorable birthdays are a tiny fraction: there are only 366 possible birthdays and only a handful of formats one might write them in. Even if you throw in the year as well, there are only a few hundred thousand possibilities to check. And how many possible combinations of letters are actually a word? So just thinking of a thing and remembering it isn’t a good way to create a password, because you’re exploring a tiny (and likely crowded) part of password-space.
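The arithmetic here is easy to check. A minimal sketch, assuming the round figure of 80 usable characters from above (the function name `password_space` is just for illustration):

```python
def password_space(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of exactly `length` characters."""
    return alphabet_size ** length


# Compare the full alphabet with the "memorable" subsets discussed above.
for alphabet_size in (80, 62, 36):
    for length in (5, 10):
        count = password_space(alphabet_size, length)
        print(f"{alphabet_size} symbols, length {length}: {count:,}")
```

At length 5, dropping from 80 symbols to 36 shrinks the space from 3,276,800,000 to 60,466,176, roughly a factor of 54.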

Here’s an alternative approach. Rather than using the memorable thing as a password, the memorable thing is the input to a process that generates a more random-looking point in password-space.

So, we have something memorable, a word or phrase, and we also remember some method that transforms this phrase into a random-looking (unguessable) sequence of characters that we can then use as a password. What properties does this method have to have?

• I need to be able to implement it quickly on any computer I happen to be on
• It needs to be deterministic so that whenever I do the method, I get the same result
• It needs to generate a password of the right length for whatever service I am trying to authenticate with
• It needs to use only characters permitted by the service
• It needs to fulfil the other requirements of the service (at least one uppercase letter or whatever)

One method that satisfies the second and (sort of) the first of these requirements is a standard off-the-shelf hashing algorithm. Hashing algorithms like MD5 and SHA256, for example, are deterministic, and pretty much any computing environment will have the tools to implement the algorithm. For example, I could generate the MD5 hash of my memorable phrase, and use this number, translated into hex as my password:

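import hashlib

phrase = "my memorable phrase"  # placeholder: substitute your own phrase
print(hashlib.md5(phrase.encode("utf-8")).hexdigest())  # 32 hex characters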

Rather than print, I could use a module like pyperclip to copy the password to my clipboard, ready to paste into the password box. Nice!

This solution is not, however, very good. There are a number of problems with it. One issue is that the result would be longer than many services permit their passwords to be. For example, the MD5 hash of “Password” is: dc647eb65e6711e155375218212b3964. Another issue is that, since it’s a number translated into hex, it only uses the ten digits and the letters a–f. So, again, we’re not exploring a huge area of password-space. And if the service you are using requires an uppercase letter or a punctuation character, then this approach isn’t going to work.
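Both complaints are easy to confirm. A quick check using only the standard library (the variable names are illustrative, and the figure of 80 symbols is the same rough assumption as before):

```python
import hashlib
import math

digest = hashlib.md5("Password".encode("utf-8")).hexdigest()

print(digest)                # should match the hex string quoted above
print(len(digest))           # 32: MD5 is 128 bits, 4 bits per hex character
print(sorted(set(digest)))   # only characters drawn from 0-9 and a-f

# Characters needed to carry the same 128 bits with 80 symbols per position.
print(math.ceil(128 / math.log2(80)))   # 21
```

So the hex string spends 32 characters on what an 80-symbol alphabet could carry in 21, which is why it is both too long and too narrow.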

But this gives us something of an idea of where to look. We want to take a variable-length phrase, and translate it into a fixed-length binary number in a deterministic way. Then we want to translate the binary number into a string of characters from some specified alphabet of symbols such that that string is seemingly random. Given our purposes, we needn’t be picky about the cryptographic properties of the hash function we use, and we aren’t too fussy about the statistical properties of the output string. The important thing is that the algorithm is simple enough that you can remember enough about it to reimplement it on a new computer if all your stuff falls in the sea.
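To make the problem statement concrete, here is one way the two steps could be sketched. This is not the solution I’m experimenting with, just an illustration under assumptions: SHA-256 as the hash, a made-up default alphabet, and a hypothetical function name `derive_password`. The number is converted to the target alphabet by repeated division, like writing it in base N.

```python
import hashlib
import string

# Assumed alphabet: letters, digits and a few punctuation marks.
# Swap in whatever the target service actually permits.
DEFAULT_ALPHABET = string.ascii_letters + string.digits + "!#$%&*+-=?@_"


def derive_password(phrase: str, length: int = 16,
                    alphabet: str = DEFAULT_ALPHABET) -> str:
    """Deterministically map `phrase` to `length` characters of `alphabet`."""
    base = len(alphabet)
    if length < 1 or base < 2:
        raise ValueError("need length >= 1 and at least two symbols")
    if base ** length > 2 ** 256:
        # A 256-bit digest cannot fill more positions than this.
        raise ValueError("length too large for a single SHA-256 digest")

    # Step 1: variable-length phrase -> fixed-length (256-bit) number.
    digest = hashlib.sha256(phrase.encode("utf-8")).digest()
    number = int.from_bytes(digest, "big")

    # Step 2: number -> string over the alphabet, one base-N digit at a time.
    chars = []
    for _ in range(length):
        number, index = divmod(number, base)
        chars.append(alphabet[index])
    return "".join(chars)


print(derive_password("my memorable phrase", length=12))
print(derive_password("my memorable phrase", length=8, alphabet=string.digits))
```

The sketch handles the length and permitted-character requirements, but it does not guarantee things like “at least one uppercase letter”, and the base conversion is very slightly biased towards some symbols, which is fine given how unfussy we are being about the statistics of the output.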

So that’s a problem statement. I’m still experimenting with my solution, but I imagine I’ll revisit this topic when I’ve found something I’m happy with.