THE PROBLEM: Web sites that require us to log in and authenticate our identities cannot be trusted to keep our passwords secret. This makes password reuse at multiple sites extremely unsafe: If one site leaks our name, eMail address and password to a hacker, that hacker can attempt to reuse those credentials to impersonate us on other sites.
@LulzSec: Reusing passwords is kind of like owning multiple houses and using the same key for each one. Don't expect people not to steal your shit. Fri 24 Jun 10:34 via web (received via TweetDeck)
What We Need:
We need to somehow arrange to use a different, strong, secure and complex password for each site that requires us to invent an identity so that we can reauthenticate our identity upon our subsequent return.
Security-conscious users know that passwords need to be complex and long to be safe. And GRC's Password Haystacks password padding approach offers one solution in this battle to construct secure and memorable passwords. But the trouble is, we need to create a potentially unlimited number of unique passwords. It's one thing to create and memorize and/or record a strong and unique password where we have to, for our most important sites, such as banking and eCommerce. But today we're asked to create passwords even for “throwaway” sites we may never return to, just to post some feedback in a forum or blog. And if we do return, what was that password we created the last time?
The problem has been so intractable and pervasive that many highly useful solutions, such as LastPass, KeePass, and SuperGenPass, have been created to lift some of the password management burden from overwhelmed users. But all of these solutions also have liabilities. In mid 2011, LastPass users had a scare when it was revealed that some of its user database may have escaped LastPass' control. It's convenient to have all of our authentication information stored “in the cloud” . . . but only so long as it is never stolen.
The other concern with cloud-based storage is availability. It's convenient as long as the service is available. Also in mid 2011, the United States FBI (Federal Bureau of Investigation) confiscated three racks worth of web servers, reportedly because they could not be bothered to determine which single server among them was believed to be violating the law. In the process, several score of unrelated web sites disappeared from the Internet. We would not be happy if the password manager we depend upon was among them.
Faced with these many, and growing, problems . . . a new solution was needed.
Immediately after finishing the work on the Password Haystacks password padding approach, I wanted to look in a different direction. My idea was to see whether I could design a secure cryptographic “paper cipher” requiring, for its use, no instrumentality, no technology, no computers, no software, no wires — only a simple piece of paper of some kind. A computer would certainly be required to design and print any instance of the Cipher. But once that was done, no computer would be required to use it.
This is the core of the idea I started with:
The idea of using a computer to encrypt a domain name to create a per-domain password is not new. That's the idea underlying SuperGenPass and others. Its obvious benefit is that instead of needing to record, store, or memorize random passwords that we invent per domain — with the potential problems that invites — we employ an algorithm of some sort to create — and recreate in the future — domain-name-based passwords. Then we don't need to record, store, or memorize them because we can simply recreate the same password from the same domain name any time it's needed. It's a great idea with no obvious drawbacks . . . except that all available solutions are “online” and suffer from the potential of worrisome privacy and security breach problems.
Here were my requirements:
What EXACTLY do we mean by “high security”?
The output encrypted passwords must be long enough to thwart brute force attacks. The Personal Paper Cipher (PPC) expands every case-insensitive input character (a-z) into a pair of pseudo-randomly chosen characters from a large and user-definable character set. Six input characters are therefore expanded into 12 output characters:
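To put a rough number on “long enough”, here is a minimal Python sketch of the brute-force search space for a 12-character output. The alphabet sizes are assumptions chosen for illustration (the PPC's character set is user-definable), and the figure is an upper bound that treats every output character as independent and uniformly chosen:

```python
import math


def search_space_bits(alphabet_size: int, length: int) -> float:
    """Upper bound on password entropy in bits: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


# Hypothetical character-set sizes; the real set is whatever the user configures.
for size in (26, 64, 90):
    bits = search_space_bits(size, 12)
    print(f"{size:>2} symbols, 12 characters: about {bits:.1f} bits")
```

Doubling the input-to-output expansion matters more than it might seem: each added output character multiplies the attacker's search space by the full size of the character set.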
The enciphered output depends upon ALL input characters. This is an important property for high security. A minimal change to the input must result in a maximum and unpredictable change in the enciphered output. In the two examples below, which were generated by the PPC system, only the first and the last characters were changed in the domain name “amazon”:
Every user's enciphered output is completely different from that of every other. When any user enciphers a domain name (even the same domain name) using their own Personal Paper Cipher, they will obtain a result completely different from that of any other user:
Disclosure of SOME domain names and enciphered passwords must NOT compromise the security of ANY other passwords. Even if an attacker knew that you were generating your passwords with the PPC system (and that is not obvious from its output), and even if that attacker were to somehow acquire some of your passwords generated by the PPC, it is imperative that so little about your Personal Cipher could be determined that none of your other passwords would be weakened. As we will see below, the Personal Paper Cipher achieves this by embedding a large amount of “entropy” (randomly determined data) into each instance of a user's Cipher.
Resistance to “computational” attack. Today's computer hobbyists (and attackers) have access to phenomenal computing power thanks to the awesome power built into modern PC graphics processing units (GPUs). The Personal Paper Cipher resists computational attacks by drawing upon a large “pool of entropy” that is unknown to attackers. Its design significantly obscures each Cipher instance's configuration details even when the operation of the PPC system itself is known.
Everything about the Cipher can be fully disclosed. The design of the Personal Paper Cipher is compliant with Kerckhoffs's Principle, which states that: “The security of a cryptosystem must not be dependent upon the nondisclosure of the algorithm; it should only depend upon the nondisclosure of the key.” Everything about the design and operation of the PPC is disclosed here. Nothing is kept secret. Yet attackers gain nothing that might help them to crack any user's password sets.
Here's How It Works
In order to support several of its important security features, such as its ability to have all characters of the input affect all characters of the output, the Personal Paper Cipher must have memory: Its future must be affected by its past. In computer jargon we would say that it must be “stateful” or be able to “save state”. It has a finite number of states. In fact, it has exactly 676. So the Personal Paper Cipher would be described as a finite state machine. Here's what that means:
Consider this grid containing lowercase alphabetic characters:
Even after studying it for some time it probably looks rather random. It actually IS very random (which is a good thing for us), but it was also very deliberately and carefully designed. It has exactly ONE very important property. Can you see what it is? If you have a theory, or if you give up, pick out just the 'a' characters and study their relationship.
Do you see what's so special? What's very special about the grid of characters above, is that EVERY character, not just the 'a's, appears exactly ONCE in every row and column of the grid. This special grid organization is called a Latin Square and it lies at the heart of the operation of the Personal Paper Cipher. In computer jargon we would say that the Personal Paper Cipher is a finite state machine driven by a Latin Square.
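The Latin Square property is easy to check mechanically. The Python sketch below builds one by shuffling the rows, columns and symbols of a simple cyclic square, then verifies that every character appears exactly once per row and column. This construction is an assumption made for illustration: it is not necessarily how the PPC generates its grids, and it does not sample uniformly from all possible Latin Squares.

```python
import random
import string


def make_latin_square(symbols, rng):
    """Shuffle the rows, columns and symbols of a cyclic Latin square."""
    n = len(symbols)
    rows, cols, syms = list(range(n)), list(range(n)), list(symbols)
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(syms)
    return [[syms[(r + c) % n] for c in cols] for r in rows]


def is_latin_square(grid):
    """True if every symbol appears exactly once in every row and column."""
    n = len(grid)
    symbols = set(grid[0])
    if len(symbols) != n or any(len(row) != n for row in grid):
        return False
    rows_ok = all(set(row) == symbols for row in grid)
    cols_ok = all({row[c] for row in grid} == symbols for c in range(n))
    return rows_ok and cols_ok


# A seeded random.Random keeps this demo repeatable. A grid meant for real
# use would want a strong randomness source such as random.SystemRandom().
grid = make_latin_square(string.ascii_lowercase, random.Random(2011))
print(is_latin_square(grid))
```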
Why do Latin Squares matter? . . . Because they allow us to do this:
Note that for illustration purposes we are using a reduced-size 11 by 11 Latin Square, containing only 11 of the 26 lowercase characters of the English alphabet. The actual Personal Paper Cipher uses a full-size 26 by 26 grid.
The key principle of the Personal Paper Cipher is that a 26x26 Latin Square, containing the 26 characters of the lowercase English alphabet, can be used to direct a unique path through the Square, where that path is determined by the Latin Square's specific configuration.
The Personal Paper Cipher employs two “Phases” for the encryption of a domain name into a secure domain-specific password: The first phase determines the starting point for the second phase by tracing the domain name's characters through the Grid, as shown to the left and (larger) above. Once the first phase has determined the starting point, the second phase emits the enciphered password characters.
As shown in the diagram, each step alternates between following a column or a row. Although you could start from any of the Grid's 26 columns, or any of the Grid's 26 rows, the most important consideration is consistency. So choose a method and stick to it, otherwise you will obtain completely different results from one time to the next. But, at the same time, this flexibility can come in very handy. If you should need to generate alternate passwords for the same domain (such as when a domain's password policies require that passwords are changed), a total of 26+26, or 52, are readily available simply by starting in a different row or column. The general rule for standard PPC operation is to start along the top row, locating the first character of the domain's name there and then finding the domain's second character in the column below.
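The phase 1 path can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PPC's own code: `GRID` is a plain cyclic Latin Square standing in for a real randomized grid, and `trace_path` is a hypothetical helper name. It follows the standard rule described above: find the first character along the top row, the next in the column below it, then alternate between rows and columns.

```python
import string

ALPHABET = string.ascii_lowercase

# Stand-in grid (assumption): a cyclic Latin square, where row r is the
# alphabet rotated left by r places. A real Cipher grid would be randomized.
GRID = [[ALPHABET[(r + c) % 26] for c in range(26)] for r in range(26)]


def trace_path(grid, text, start_row=0):
    """Follow text through the grid, alternating row search and column search.

    Returns the (row, col) where the path ends. Because each character occurs
    exactly once per row and per column, every step has exactly one target.
    """
    row, col = start_row, None
    along_row = True  # the standard rule starts along the top row
    for ch in text:
        if along_row:
            col = grid[row].index(ch)
        else:
            row = next(r for r in range(len(grid)) if grid[r][col] == ch)
        along_row = not along_row
    return row, col


print(trace_path(GRID, "grc"))                # standard start: top row
print(trace_path(GRID, "grc", start_row=5))   # an alternate starting row
```

The `start_row` parameter shows where alternate passwords for the same domain come from: a different starting row sends the same characters down a different path.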
The second phase of the encryption process is very similar to the first, with a few additions:
First, as shown in the simplified diagram above, the second phase path BEGINS where the first phase path ends. In other words, the path traced during the first phase is used to determine the starting point for the second phase. The row or column where the phase one path ends is the location where the first character of the domain name is found to begin the second phase's path.
In order to output encrypted characters, the grid is expanded to contain a large assortment of randomly chosen output characters:
In the sample grid above, the original Latin Square grid characters are colored blue with a green background to distinguish them from the grid's output characters and to keep them easy to follow while tracing a path. You will also notice that the red “output characters” surround each blue/green Latin Square character on each side.
As shown by the diagram below, the encrypted domain name is obtained by recording the (red) output characters to the left and right of each of the domain name's first six characters while following the Phase 2 path:
Thus “amazon” enciphers to “)rP-?JD0:/7t”
And “amazon” will encipher to “)rP-?JD0:/7t” every time you do it using the same Personal Paper Cipher grid as your guide.
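Both phases together can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the `OUTPUT_CHARS` set, the way the grid is built, the model of one left and one right output character per grid cell, and the helper names. The result will not match the sample grid above; the point is only that the same grid and the same name always give the same 12 characters.

```python
import random
import string

ALPHABET = string.ascii_lowercase
# Assumed output character set; the real one is user-definable.
OUTPUT_CHARS = string.ascii_letters + string.digits + "!#$%&()*+-/:;=?@"


def build_cipher(rng):
    """Return (letters, flanks): a Latin square plus a (left, right) pair
    of randomly chosen output characters for every cell."""
    n = len(ALPHABET)
    rows, cols = list(range(n)), list(range(n))
    rng.shuffle(rows)
    rng.shuffle(cols)
    letters = [[ALPHABET[(r + c) % n] for c in cols] for r in rows]
    flanks = [[(rng.choice(OUTPUT_CHARS), rng.choice(OUTPUT_CHARS))
               for _ in range(n)] for _ in range(n)]
    return letters, flanks


def step(letters, row, col, ch, along_row):
    """Move to ch within the current row, or within the current column."""
    if along_row:
        return row, letters[row].index(ch)
    return next(r for r in range(len(letters)) if letters[r][col] == ch), col


def encipher(letters, flanks, name):
    """Encipher up to six a-z characters of name into twice as many outputs."""
    name = name.lower()[:6]
    row, col, along_row = 0, 0, True
    for ch in name:                      # phase 1: find the starting point
        row, col = step(letters, row, col, ch, along_row)
        along_row = not along_row
    out = []
    for ch in name:                      # phase 2: continue and emit output
        row, col = step(letters, row, col, ch, along_row)
        along_row = not along_row
        left, right = flanks[row][col]
        out.append(left + right)
    return "".join(out)


letters, flanks = build_cipher(random.Random(2011))  # seeded for repeatability
password = encipher(letters, flanks, "amazon")
print(password, len(password))
```

This sketch accepts only the letters a-z and does nothing special when the character being sought is the one already under the current position.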
A three-point summary to set the stage:
So this brings us back to the topic of this point: The phase 1 path length. Now that we've seen what the phase 1 path tracing accomplishes — helping domain names with similar beginnings to encipher differently — you can decide how you want to handle it. The only requirement is that you establish a rule and stick with it. Tracing the entire path for the domain “allthingsconsidered.com” would be extremely tedious and error prone. And it's really unnecessary overkill.
The rule-of-thumb we recommend is to trace up to the top level domain separating dot (.), but never more than six characters. So the path for “grc.com” would be traced for “grc” only, and the path for “allthingsconsidered.com” would stop after “allthi” since that's almost certainly sufficient.
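The rule of thumb above is simple enough to state as a tiny Python function. The name `phase1_input` is hypothetical, and the sketch assumes the input is a bare domain such as “grc.com” with no subdomain in front of it:

```python
def phase1_input(domain: str, max_len: int = 6) -> str:
    """Characters to trace in phase 1: everything before the first dot,
    capped at max_len characters."""
    name = domain.lower().partition(".")[0]
    return name[:max_len]


print(phase1_input("grc.com"))                  # grc
print(phase1_input("allthingsconsidered.com"))  # allthi
```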
Finally, since the only reason for the initial cipher warm up is to prevent the same initial characters from always generating the same ...
Gang... (from GRC's newsgroups): In thinking this through while writing it up, I have changed my mind about the best way to handle the initialization phase. The point of the initialization is to find a variable but domain-based starting point. It has probably occurred to you guys that the process of tracing a path is akin to hashing, since all characters are mixed and together determine the result, and the process loses information and cannot be reversed. But, in the end, no matter how long a path we follow, we're only selecting one of 26 rows or 26 columns, depending upon the parity of the path's length. Following a longer path does have the advantage of removing any frequency distribution bias, so that's somewhat useful.
But I don't think that starting the trace at the beginning of the domain name makes the most sense. For example: If we were only to perform a two-character phase 1 warm up, then it would provide ineffective scrambling between any domains beginning with the same first two characters. It makes much more sense, I think, to pick one or more characters that are likely to be different from the characters at the beginning of the domain name.
Since any path hashing always distills to one of just 26 rows or columns, it's tempting to ignore path hashing and just choose the last character of the primary domain name (before the TLD separating dot) as the starting column for the enciphering phase. That's very clean and very simple. Much simpler in fact. But if the frequency distribution of last characters was highly skewed, as it likely is, then many columns would be underrepresented and others would be overrepresented. So perhaps use the final two characters of the domain name to obtain the enciphering starting point. The second to the last character, found along the top row, picks the column, and the last character picks the starting row... and the enciphering begins along that row.
So the starting ROW, one of the 26 possible, is dependent upon the final two characters of the domain name instead of the beginning characters, which would tend to be duplicative.
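Here is a short Python sketch of that revised starting rule. As before, the cyclic `GRID` is only a stand-in for a real randomized Latin Square, and `starting_row` is a hypothetical helper name. It assumes a bare domain whose name (before the dot) has at least two characters.

```python
import string

ALPHABET = string.ascii_lowercase

# Stand-in grid (assumption): row r is the alphabet rotated left by r places.
GRID = [[ALPHABET[(r + c) % 26] for c in range(26)] for r in range(26)]


def starting_row(grid, domain):
    """Pick the enciphering start row from the name's final two characters.

    The second-to-last character, found along the top row, picks a column;
    the last character, found in that column, picks the starting row.
    """
    name = domain.lower().partition(".")[0]
    second_last, last = name[-2], name[-1]
    col = grid[0].index(second_last)
    return next(r for r in range(len(grid)) if grid[r][col] == last)


print(starting_row(GRID, "grc.com"))
print(starting_row(GRID, "amazon.com"))
```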
(This improvement, which I like the more I think about it, will necessitate changing a bunch of the language and graphics above, but I didn't want to delay sharing this page with the newsgroup gang any longer. So I'm only documenting it here for the moment . . . but I will reflect those changes from here on below.)
Gibson Research Corporation is owned and operated by Steve Gibson. The contents of this page are Copyright (c) 2024 Gibson Research Corporation. SpinRite, ShieldsUP, NanoProbe, and any other indicated trademarks are registered trademarks of Gibson Research Corporation, Laguna Hills, CA, USA.
Last Edit: Jun 28, 2011 at 16:28