The security of users’ passwords is one of the most important aspects of developing your web application. Unfortunately, writing a good authentication script that stores your data safely is no piece of cake. It’s incredibly easy to get it wrong. The best way is not to store passwords at all, but… sometimes you have to. Let’s think about how to make it as safe as possible.
How does it work?
How does it work in theory? It seems really simple. When you register, you choose a login and a password. They are saved to the database. Then, when you log in, the server compares what you typed in the login box with the data from the database, and if they match, it lets you in.
It’s pretty simple, isn’t it? If you have ever learned a server-side programming language (like PHP, for instance), you surely know that a simple authentication script is one of the basic lessons, and it sometimes looks just like that. But I’ve got something to tell you – it’s wrong! Why?
It may seem a safe solution, but this approach has its cons. Think about this: what if you get hacked? What if your database gets hacked? I’ll help you. The attacker will see every login with its password next to it in plain text.
They hacked one password, the one to your DB, and they have access to all the accounts created in your service. “So what?”, you could say if your website has nothing to do with money or any other important data, but… take into account that it’s common for people to use the same password for different services. So the password used on your website could also be valid (often with the same login and email!) for Facebook, YouTube… you name it!
When I was starting my adventure with PHP, tutorials teaching authentication with this approach were very popular. If you think that has changed, do a quick Google search for examples. I can guarantee you that at least some of them will still teach you to store passwords like that. Don’t get fooled. If you want to write good scripts, always do wide research.
OK, this approach seems naive now, but if you think there aren’t any websites with such an authentication method, you’re wrong. How do you identify such services? By their ability to send you your password by email if you forget it. That means it has to be stored as plain text (or in some reversible form), or sending it would be impossible. That’s why, on a well-built site, the only way to restore your password is to set a new one. And don’t think that we are talking only about small personal websites. We also mean websites managed by big corporations, shops and so on. There is a nice service out there, Plain Text Offenders, which collects such websites.
But let’s get back to business. What can we do to strengthen our defense?
The first thought that comes to mind when you think about how to avoid storing passwords in plain text is to encrypt them. It’s a good idea, but it’s still not perfect. Before we start analyzing what’s wrong with this approach, let’s think about how to bring it to life.
So? How does it work?
Now user passwords get encrypted before they are stored in the database. When a user signs in, the script gets the encrypted password from the database, decrypts it and then compares it with the password the user typed. If they’re the same, the script lets the user in.
It seems more secure, because now when somebody just reads out your database, they see something like this:
Of course, if somebody has write access to your database, they can set up a new account with their own password, find their record in the database, copy their encrypted password and paste it into every account they want. Then, with their own password, they can sign in to every account, even yours – the one with admin permissions! It’s bad, but at least they don’t have the users’ passwords, so they can’t use them on other websites.
It has other cons, though. You see, encryption is a two-way function. It’s reversible, so you can always decrypt the string and get the original one if you have the key.
So it means that an attacker who gets access to your database can still get the original passwords. They only need the key to decrypt them. The other flaw is caused by the incompetence of users. In theory, everybody knows that passwords should be long, safe, original and hard to crack, but there are still individuals who love “123456”, “admin1” and so on. If your website is small, it’s not so scary, but the more users you have, the more of them will probably pick “123456”. I wonder if you already know what this means.
With the same key and no randomization, encrypting the same password always gives the same result, so you could end up with a database where encrypted passwords are repeated:
When the attacker sees something like this, it’s easy to conclude that the repeated passwords must be popular ones. And it would be the truth!
Just one more digression. If you don’t use a salt (a random sequence of bytes that is added to the password before it is digested; I’ll tell you more about this in the next chapter) and identical passwords look identical after encryption, another thing that can reveal them is password hints. What do I mean?
Let’s analyze the previous table. If several users have the same password, every one of their hints must fit it. So if we store hints alongside the user passwords, the attacker gets as many hints as there are accounts sharing that password. If we could see only one, as in the standard procedure of restoring a password, it would be hard to guess what it could be. It’s a movie about the mafia? There are tons of them… But if we also know that Marlon Brando played in it and Michael Corleone is one of the characters, then the difficulty drops to zero.
We already know that we’d like a safer system for our website. Could you believe that such a weak authentication system was used by, for instance, Adobe?
In 2013, Adobe’s databases got hacked. Reports revealed that more than 130 million user accounts were leaked. The whole database was quickly published online and became a target for a great number of hackers and… marketing companies. They received a great list of emails, and they knew a lot about this group of people just from the fact that they were Adobe clients. In the article we can read:
“The file holds Adobe IDs, email addresses, (encrypted) passwords, credit/debit card numbers, expiration dates, other PII (Personally Identifiable Information) and more.”
The worst thing here is encrypted passwords. We already know what the flaws are…
“The use of a symmetric cipher here, assuming we’re right, is an astonishing blunder, not least because it is both unnecessary and dangerous.
Anyone who computes, guesses or acquires the decryption key immediately gets access to all the passwords in the database.” ~ Paul Ducklin at Sophos [Naked Security]
So… is this a better approach? Yes. But it’s still not what we’re looking for.
So encrypting seems quite a good idea, but the encryption key makes it really unsafe. How can we get rid of the encryption key? By making our encryption one-way only. If we encrypt and never decrypt, we don’t need an encryption key. Such a one-way transformation is called hashing, and its result is called a hash or a digest.
You may wonder: “What if one of my users loses a password? Can I send it to her/him?” No, you can’t, but there are other solutions, like setting a new password. But let’s get back to the topic.
Again, how does it work?
So what has changed? We NEVER decrypt, so we can forget about the encryption key. Same as before, we apply the one-way function (we hash the password) before storing it in the database. What has changed is the sign-in step. This time, we don’t decrypt the password from the database (we don’t have any key, so we couldn’t do it anyway). Instead, we hash the typed password in the same way, with the same algorithm we used at registration. If both hashed strings (the typed password and the password from the DB) are the same, you’re in.
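The flow above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `users` dict stands in for a real database, the function names are made up for the example, and SHA-256 is picked arbitrarily as the one-way function. It is still unsalted, so it shares the weaknesses discussed next.

```python
import hashlib
import hmac

# Hypothetical in-memory "database": login -> hex digest of the password.
users = {}

def digest(password: str) -> str:
    """One-way transform: there is no key and no way to decrypt."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def register(login: str, password: str) -> None:
    # Only the digest is stored, never the password itself.
    users[login] = digest(password)

def sign_in(login: str, password: str) -> bool:
    stored = users.get(login)
    if stored is None:
        return False
    # Hash the typed password the same way and compare the two digests.
    # compare_digest compares in constant time to avoid timing leaks.
    return hmac.compare_digest(stored, digest(password))

register("alice", "correct horse")
print(sign_in("alice", "correct horse"))  # True
print(sign_in("alice", "wrong guess"))    # False
```

Note that `sign_in` never needs the original password from the database, only its digest.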
It’s not really complicated. But now we have a question: which password digesting algorithm should we choose? The most popular ones are MD5, SHA-1 and SHA-2 (the latter two from the SHA family). They are really easy to use. But now another question arises: if they are popular, are they really safe? Well…
If you used a popular public algorithm and you type the hash into Google, you’ll probably get a ton of results telling you which original value produces that digest. Just try “e00cf25ad42683b3df678c61f42c6bda”, for example. The first result will already give you the original value, “admin1”. There are also things called rainbow tables, which really speed up the cracking of password hashes. Rainbow tables are precomputed tables for reversing cryptographic hash functions. They’re used mainly for cracking password hashes and recovering plaintext passwords. In short, it’s a space/time trade-off: you use less computer processing time and more storage. It’s faster than a brute-force attack that calculates a hash on every attempt. It’s like a database of millions of already calculated hashes for specific passwords. Scary enough?
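To see why precomputation is so effective against unsalted hashes, here is a rough sketch of the simplest form of the idea: a plain lookup table from digest to password. Real rainbow tables are more compact (they store chains built from hash and reduction functions), so treat this as a simplified stand-in. The password list and the `crack` helper are made up for the example.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# A tiny precomputed table: digest -> original password.
# Real tables cover millions of candidates; this list is illustrative.
common_passwords = ["123456", "admin1", "password", "qwerty"]
lookup = {md5_hex(p): p for p in common_passwords}

def crack(leaked_digest: str):
    """Return the original password if its digest was precomputed, else None."""
    return lookup.get(leaked_digest)

leaked = md5_hex("admin1")               # what an attacker finds in the database
print(crack(leaked))                     # admin1
print(crack(md5_hex("x9!Lq-unseen")))    # None
```

The table is built once and then reused against every leaked database that uses the same unsalted algorithm.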
And the other flaw is, again, common passwords. “123456” and “123456” will still give the same result in the database.
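A short sketch of how an attacker could spot those repeats in a leaked table of unsalted digests. The logins and passwords are invented for the example, and SHA-256 is an arbitrary choice.

```python
import hashlib
from collections import Counter

def digest(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical leaked table: login -> unsalted digest.
leaked = {
    "anna": digest("123456"),
    "bob": digest("Tr0ub4dor&3"),
    "carl": digest("123456"),
    "dora": digest("123456"),
}

# Count how often each digest occurs; repeats point to popular passwords.
counts = Counter(leaked.values())
most_common_digest, occurrences = counts.most_common(1)[0]
shared_by = sorted(login for login, d in leaked.items() if d == most_common_digest)
print(occurrences, shared_by)  # 3 ['anna', 'carl', 'dora']
```

Cracking that one digest unlocks three accounts at once.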
So what do we have to do? We need to salt them.
Hashing and salting. Nobody has invented anything better than this, and it seems a really good approach for now. The salt is a random sequence of bytes that is added to the password before it is digested. This makes the digest different from what hashing the password alone would produce. It protects us really well against rainbow tables and precomputed dictionary attacks. There are two types of salting:
– Fixed salt is a single sequence of bytes used for every password. However, even if we keep our salt hidden, our system is still vulnerable to birthday attacks and doesn’t make full use of the idea of salting. A birthday attack is an attack that exploits the birthday paradox. In short, it says that if you have a large number of user password digests, the probability of generating a password whose digest collides with one of the digests in the table is higher than you would expect. The more users you have, the higher the probability, and the easier it becomes for an attacker to crack your passwords.
– Variable salt is generated separately for each password. This makes the digest different for the same password every time it is hashed. It’s safer, and we get rid of the common-password flaw I mentioned before.
The first option weakens when the attacker gets the fixed salt, because then they have the key to all the passwords. There are a few ways for an attacker to learn the salt. Brute-forcing the salt against one digest whose password is already known, the attacker’s own or somebody else’s, is enough to get it.
The second option is much safer, but since the salt is random, we have to store it unencrypted alongside the hashed password in the database. Why? Because we need it at login to repeat exactly the computation we did at sign-up. Still, every password has a different salt, so each one has to be attacked separately.
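A minimal sketch of the variable-salt option, assuming SHA-256 and a 16-byte salt from `os.urandom` (both are illustrative choices; the function names are made up for the example). The salt is returned together with the digest because both have to be stored.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)  # 16 random bytes per password
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def verify(password: str, salt: bytes, stored_digest: bytes) -> bool:
    # Reuse the stored salt to repeat the sign-up computation.
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)

salt_a, digest_a = hash_password("123456")
salt_b, digest_b = hash_password("123456")
print(digest_a != digest_b)                # True: same password, different digests
print(verify("123456", salt_a, digest_a))  # True
print(verify("654321", salt_a, digest_a))  # False
```

Two users who both pick “123456” no longer share a digest, so repeated values in the table stop giving anything away.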
The best option is to mix both of them: we generate a random salt and also combine it with the fixed salt. This way we’re much safer. The minimum recommended size of the salt is 8 bytes, and at least 8 bytes should be random.
This way every stored digest looks completely random, and you are not able to pull the password back out of it. Do you remember the rainbow tables flaw? It won’t work now! The salt added to the password ruins it, because what gets hashed is now a really long random string. The common passwords flaw? It won’t work either! The only thing the attacker can do is try every password they can come up with. A brute-force attack against a complicated hashing algorithm and a very long salt will need a lifetime to get your password.
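One possible way to mix a fixed and a random salt is sketched below. The `FIXED_SALT` value, the record layout and the simple concatenation order are all assumptions made for the example; in practice the fixed part would be kept in configuration outside the database.

```python
import hashlib
import hmac
import os

# Hypothetical fixed salt; in practice it would live in configuration
# outside the database, not in source code.
FIXED_SALT = b"example-fixed-salt-not-for-production"

def hash_password(password: str, random_salt: bytes) -> bytes:
    # Both salts are mixed in before hashing.
    data = FIXED_SALT + random_salt + password.encode("utf-8")
    return hashlib.sha256(data).digest()

def new_record(password: str) -> dict:
    # The random salt is stored next to the digest; the fixed salt is not.
    random_salt = os.urandom(16)
    return {"salt": random_salt, "digest": hash_password(password, random_salt)}

def verify(password: str, record: dict) -> bool:
    candidate = hash_password(password, record["salt"])
    return hmac.compare_digest(candidate, record["digest"])

record = new_record("s3cret!")
print(verify("s3cret!", record))  # True
print(verify("guess", record))    # False
```

An attacker who steals only the database gets the random salts but still lacks the fixed part.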
Oh! And one other thing: a limit on password guessing attempts. It’s a simple mechanism to implement, and it’s very important against online brute-force attacks.
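A rough sketch of such a limit, kept in memory for simplicity. The class name, the limit of five failures and the 15-minute window are assumptions for the example; a real service would persist the counters and probably also track IP addresses.

```python
import time

MAX_ATTEMPTS = 5            # assumed limit
LOCKOUT_SECONDS = 15 * 60   # assumed window

class AttemptLimiter:
    def __init__(self, max_attempts=MAX_ATTEMPTS, window=LOCKOUT_SECONDS,
                 clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window = window
        self.clock = clock
        self.failures = {}  # login -> list of failure timestamps

    def is_locked(self, login):
        # Keep only failures inside the window, then compare with the limit.
        now = self.clock()
        recent = [t for t in self.failures.get(login, []) if now - t < self.window]
        self.failures[login] = recent
        return len(recent) >= self.max_attempts

    def record_failure(self, login):
        self.failures.setdefault(login, []).append(self.clock())

    def record_success(self, login):
        # A successful login clears the failure history.
        self.failures.pop(login, None)

limiter = AttemptLimiter(max_attempts=3, window=60)
for _ in range(3):
    limiter.record_failure("alice")
print(limiter.is_locked("alice"))  # True
print(limiter.is_locked("bob"))    # False
```

The sign-in code would call `is_locked` before checking the password and refuse the attempt while the account is locked.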
The iteration count
We now have a really safe algorithm, but there is one more thing we can do. The iteration count refers to the number of times the hashing function is applied to its own results. What do I mean? When you generate your salt and combine it with a password, you apply the hash function (SHA-1, MD5 or something else). To make it safer, you can apply the hash function again to the already hashed result, and then do it again and again… and again. It makes every single guess more expensive to compute.
The minimum recommended number of iterations used to be 1,000, and it really should make you feel safer, although current guidance for iterated schemes such as PBKDF2 calls for far higher counts (hundreds of thousands). It’s not really demanding work for the server to do once per login, but for the attacker the added cost makes brute-forcing an even longer process than it would already be.
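Rather than writing the loop by hand, the sketch below uses PBKDF2 from Python’s standard library (`hashlib.pbkdf2_hmac`), which combines a salt with an iteration count. This swaps the article’s plain “hash the hash” loop for a standard construction built on the same idea. The iteration count here is an assumed value for illustration only.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed value for illustration; tune to your hardware

def hash_password(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    # PBKDF2 runs HMAC-SHA256 `iterations` times over the salted password.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

salt = os.urandom(16)
stored = hash_password("s3cret!", salt)
print(hmac.compare_digest(stored, hash_password("s3cret!", salt)))  # True
print(hmac.compare_digest(stored, hash_password("guess", salt)))    # False
```

Storing the iteration count next to the salt and digest makes it possible to raise it later without breaking existing accounts.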
We’ve found possibly the best way to protect our users’ privacy. Not perfect, but at this moment, the best. Remember, however, that in security, as in IT in general, everything changes really fast. While designing your authentication system, search for the newest tricks and follow reputable people from the industry. And never let your authentication system get old!
But… to be honest, the best way not to worry about password security is not to store passwords at all. If you can, use Facebook Login or Google Sign-In on your websites to allow users to log in with their other accounts. Then you don’t have to worry about it yourself.
- Add Google Sign-In to Your Website
- Integrating Google Sign-In into your web app
- Add Facebook Login to Your App or Website
- Login with facebook using PHP
We already know that weak passwords can harm not only individual users but even the whole database. If we’re skillful enough, we can take user incompetence into account and defend against it. However, the users themselves… are still easy targets with easy passwords. What should the perfect password be? I based my advice on the research of Matteo Dell’Amico, a researcher at Symantec Research, and Maurizio Filippone from the French research institute Eurecom. You can find it here.
1. The password must include upper and lowercase letters, and at least one numeric character
The latest research shows that it doesn’t really make passwords stronger… but if it helps even a little, don’t hesitate to use it.
2. Make the password as long as you can
It’s more effective. Try to make your password as long as possible. The longer it is, the harder it is to crack. Learn more about the newest “good password” techniques here.
3. Use symbols
The latest research shows that it’s more effective than using upper and lowercase letters.
4. Try to be the least predictable
Think of weird passwords, and don’t use dictionary words.
Get help – useful links
If you want to dig deeper into the subject, here are some useful links:
- How to encrypt user passwords
- Salt (cryptography)
- Safe Password Hashing
- How Rainbow Tables work
- Don’t Hash Secrets
- Why passwords should be hashed
- How to keep track of your passwords without going insane
- Serious Security: How to store your users’ passwords safely
- Password Storage Cheat Sheet
The best way to not store passwords in a bad way is to not store them at all.
Unfortunately, it’s not always possible. So if you really have to do this, pay attention to every detail. The password is the key to your service and sometimes to other people’s lives.
Remember that you have a duty to protect your users’ privacy. We have found what is probably the best way to protect it at the moment. But remember that the world is changing really fast. Don’t let user data be exposed by your incompetence. You owe them privacy.