Many people will read the headline and probably think: “No, not another piece of advice telling me to hash passwords.” But stop: you’ll learn a lot more here. Promise.
We will cover the “perfect” (nothing is absolutely perfect, and of course I would be more than happy for any suggestions for improvements in the comments 😉) way to handle password credentials, from the moment a user types them into a form on the client-side until the moment they are stored in the database. Furthermore, we will look into common errors that happen when developers store other credentials like tokens, secrets, etc.
Password Handling in 2018
Other Credentials (Token, Secrets, …)
Credits, License, References
Many developers know that they should hash passwords. Most know that they should use a per-password salt to mitigate rainbow table attacks (What is a rainbow table?). Most also know that they shouldn’t use SHA-*, but instead a KDF: a hashing function specifically designed for password hashing.
In this short “my best practices” we will cover the things mentioned above and a bit more. Firstly, we will permute the password on the client-side. Secondly, we will encrypt the final hash before writing it into our database, similar to how Dropbox does it. Following the “Dropbox way” of presenting your password protection, I generated a diagram that shows all the cryptographic layers. In the figure below, everything regarding encoding (e.g. base64, hex, etc.) is omitted.
Although I really like the onion-diagram above, I think for explanation purposes another figure, based on the flow through the system, is easier to understand.
Everything starts with our users entering the password into our website and submitting the login. Here comes the first layer most developers think is irrelevant: before we send the username and password over the wire, we perform a single SHA3-512 round on the plain-text password plus a unique name for our service, for example the domain. This means that if the user used the login system at auth.example.com, we would compute SHA3-512("plain-text-password-from-user" + "auth.example.com"). Why we add this public, well-known “salt” (equal for every user of the service) is explained later.
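A minimal sketch of this client-side step (in Python for illustration; in a real deployment it would of course run as JavaScript in the browser, and the function name is mine):

```python
import hashlib

def client_side_permute(password: str, service_domain: str) -> str:
    # One SHA3-512 round over the plain-text password concatenated with
    # a unique, public name for the service (here: its domain).
    return hashlib.sha3_512((password + service_domain).encode("utf-8")).hexdigest()

# The permuted value, not the plain-text password, is what gets submitted:
permuted = client_side_permute("correct horse battery staple", "auth.example.com")
```

The same password yields different permuted values for different domains, which is exactly the point of including the domain in the input.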
Right now most developers think: “Don’t we have https for keeping the password secure?” And that’s right. But keeping the password secure from eavesdropping, etc. was never the intention of this step. Instead, try to see this step as a conversion function: after this round, the permuted version (i.e. the hash) becomes the user’s so-to-say “actual” password, which gets submitted to the server-side.
Why should we do this?
Simple: it shows respect for the user’s password and that you are aware of the fact that, in most cases, it is not exclusive to your software. Additionally, we gain a few smaller security bonuses (castle approach, i.e. defense in depth): there is no way we could ever accidentally store the user’s plain-text password in our logging system, unlike GitHub and Twitter, which both admitted in May 2018 that they had found plain-text passwords in their logging systems. The user’s password would also be slightly protected in a MITM attack or on a compromised server. “Slightly”, because the strength of a single SHA3-512 hash depends purely on the user’s input, which is admittedly most times not very good. The last point is that client-side hashing is the only simple way to prove you are not “farming passwords” ✔️
Let’s think about the last statement and our assumption about users reusing their passwords: if every site now started to use the client-side hashing approach with a plain SHA3-512(password), the complete idea of respecting the user’s password privacy would be destroyed, as every service could use the hash against other services (just as with plain-text passwords today). Therefore this approach wouldn’t give us any enhancement if deployed widely. However, if every login system added a globally unique salt (e.g. the domain: SHA3-512(password + domain)), each website’s server-side would get a different “permuted” password, even if the user uses the same password for every service.
The only two drawbacks to this approach are that:
you cannot enforce password policies on your server-side. Although whether password policies make sense or not is a different question anyway; I’ll cover that in a later post.
in case you ever change your company’s domain, you need to either keep your old domain in the hashing code or do hashing scheme updates transparently during user authentication (e.g. you do the client-side hashing with both the old and the new domain, send both to the server, check whether the old-domain hash matches the database entry, and update it with the new-domain value)
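The transparent-update variant from the second drawback can be sketched like this (the domain names are hypothetical, and the payload shape is just one possible way to model it):

```python
import hashlib

OLD_DOMAIN = "auth.old-example.com"  # hypothetical previous domain
NEW_DOMAIN = "auth.example.com"      # hypothetical current domain

def permute(password: str, domain: str) -> str:
    # Same client-side permutation as before: SHA3-512(password + domain).
    return hashlib.sha3_512((password + domain).encode("utf-8")).hexdigest()

def login_payload(password: str) -> dict:
    # During the migration window the client computes both permutations.
    # The server verifies against the stored old-domain value and, on
    # success, replaces it with the new-domain value.
    return {"old": permute(password, OLD_DOMAIN),
            "new": permute(password, NEW_DOMAIN)}
```

Once all active users have logged in at least once, the old-domain branch can be dropped.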
OK, let’s continue with our password flow: the username and the permuted version of the password get transmitted over https (!!) to our server. Normally it is recommended to perform a single round of SHA3-512 now. This would be done to normalize the output to a fixed 64 bytes, because a few password hashing functions truncate after N bytes (for example, bcrypt truncates its input after 72 bytes or at a NUL byte), which reduces the entropy of the password. Other password hashing algorithms (PBKDF2) are vulnerable to DoS attacks if passwords can be arbitrarily long.
Because the client-side permutation of the password already acts as a normalization, this shouldn’t concern us, as long as we check whether the client-side provided string is a valid representation of a SHA3-512 hash. If it is, we pass it into our KDF; if not, we must abort, as we have received tampered, malicious input.
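That validation check is cheap; a minimal sketch, assuming the client sends the digest as lowercase hex:

```python
import re

# A SHA3-512 digest encoded as lowercase hex is exactly 128 characters.
SHA3_512_HEX = re.compile(r"\A[0-9a-f]{128}\Z")

def is_valid_client_hash(value: str) -> bool:
    return bool(SHA3_512_HEX.match(value))
```

Anything that fails this check never reaches the KDF.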
🔑 Speaking about KDFs: There are a few acceptable algorithms from which you can choose — namely: Argon2, bcrypt, scrypt, PBKDF2.
Argon2 won the password hashing competition in July 2015, out of 24 candidates. Since then nobody has found a real attack vector against it. Therefore most cryptographers believe that Argon2 is highly unlikely to fall victim to attacks that make it worse in practice than one of the others, and subsequently recommend using it. In Argon2 you can specify not only a single cost parameter, like in bcrypt, but three parameters: number of iterations, memory consumption and number of threads. Despite all of Argon2’s benefits, bcrypt has been out there since 1999: that’s close to 20 years without major vulnerabilities! Therefore it can be seen as much more battle-tested than Argon2. Also, not all cryptography libraries provide first-class Argon2 support. In these cases you should use bcrypt.
In 2009 scrypt was published, a bcrypt-like function which requires more RAM and is subsequently more resistant to hardware-accelerated attacks. Unfortunately, due to its massive memory requirements it’s very hard to scale and practically not usable for an authentication system. Lowering the memory usage is not feasible, as it then becomes, technically, weaker than bcrypt. Therefore its main usage is in places where spending hundreds of megabytes of memory and multiple seconds’ worth of CPU time for a single hash computation isn’t a problem (e.g. protecting the encryption key for your computer’s main hard disk).
The most widely deployed algorithm is probably PBKDF2, although it shouldn’t be your choice if you build a new application nowadays, except if you need FIPS-certification.
OWASP, a big online community that tries to increase web application security through freely-available articles, methodologies, documentation and tools, recommends the following in their “Password Storage Cheat Sheet”:
- Argon2 is the winner of the password hashing competition and should be considered as your first choice for new applications;
- PBKDF2 when FIPS certification or enterprise support on many platforms is required;
- scrypt where resisting any/all hardware accelerated attacks is necessary but support isn’t;
- bcrypt where PBKDF2 or scrypt support is not available.
The usage of KDFs is pretty self-explanatory: The credential-specific salt is loaded from the database and used together with the client-side provided hash to compute the KDF output.
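A sketch of that server-side step. I use PBKDF2-HMAC-SHA512 here only because it ships with Python’s standard library and keeps the example self-contained; per the recommendations above you would prefer Argon2 (or bcrypt) where your libraries support it, and the iteration count is an illustrative value:

```python
import hashlib
import os

def derive(client_hash: str, salt: bytes) -> bytes:
    # Per-credential salt + client-side provided hash go into the KDF.
    # Swap in Argon2 or bcrypt where first-class support exists.
    return hashlib.pbkdf2_hmac("sha512", client_hash.encode("utf-8"),
                               salt, 210_000)

salt = os.urandom(32)   # generated once per credential, stored next to the hash
stored = derive("…client-side hash…", salt)
```

On login, the salt is loaded from the database and the same derivation is repeated over the freshly submitted client-side hash.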
As you have seen in the diagram above, after the initial password permutation in the frontend, I maintain two different branches: the left one, which is a little bit more conservative, and the right one, which is a little bit more futuristic. Both are completely safe, and choosing between them mainly depends on one’s preferences.
🔒 After the KDF our password is computationally secure, i.e. implausible to recover. Note that nothing is impossible; “implausible” here refers to the computational hardness assumption: the hypothesis that a particular problem cannot be solved efficiently (where efficiently typically means “in polynomial time”).
Still, we perform a last step before persisting it into our database: we encrypt the hash using a symmetric encryption algorithm like AES256-GCM or ChaCha20-Poly1305, as this makes a database dump absolutely worthless for brute-force attacks. That’s a fact that can be inferred from thermodynamics:
“These numbers have nothing to do with the technology of the devices; they are the maximums that thermodynamics will allow. And they strongly imply that brute-force attacks against 256-bit keys will be infeasible until computers are built from something other than matter and occupy something other than space.”, Bruce Schneier
That holds as long as we manage to keep the key secure (and, of course, no significant vulnerabilities are found in the used algorithm and its implementation). The algorithms AES256-GCM and ChaCha20-Poly1305 are used because they provide AEAD (authenticated encryption with associated data).
In order to keep the key as secure as we can (without resorting to HSMs), we make use of HashiCorp’s Vault and its ability to do EaaS (Encryption-as-a-Service 😅). We send the output of the KDF to our Vault instance, get the encrypted hash back and store it inside our database. Next time the user wants to log in, we load the encrypted hash from the database, decrypt it with Vault and compare it with the hash generated for this authentication cycle. Don’t forget to do a constant-time comparison (i.e. be resistant against a timing side-channel attack). Some people will probably say it’s not important, as we only compare hashes. I would advise you to do it anyway, as it’s a good habit whenever you make comparison-related security decisions. For example, the very good supplementary cryptography package provided by the Go team also does it in its bcrypt implementation here.
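The constant-time comparison itself is a one-liner in most languages; a sketch using Python’s standard library:

```python
import hmac

def hashes_match(stored: bytes, computed: bytes) -> bool:
    # hmac.compare_digest takes time independent of where the inputs
    # first differ, closing the timing side channel a naive == would open.
    return hmac.compare_digest(stored, computed)
```

Most crypto libraries expose an equivalent helper; the important part is never to compare secrets with an early-exit equality check.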
Please let me know your thoughts about this way of handling user passwords in the comments. How do you handle it at the moment? Is something new to you or should be explained in more detail? Let me know!
Although most developers nowadays at least care a little bit about how they store their passwords, many don’t think about other credentials like tokens, secrets, etc. (admittedly, I didn’t either in the past). In order to fully understand why we should also protect those, let’s have a short look back at the reasons why I originally chose to protect our users’ passwords so much. If you think long enough about this question, you will most likely arrive at the following two assumptions:
If a database leak happens, we don’t want the attacker …
… to somehow regain access to our users’ plain-text passwords, because most users use the same password on multiple sites.
… to be able to use the leaked data to authenticate against our own service.
It’s the second point that most developers just don’t think of when they hand out registration tokens, account-recovery tokens, API tokens, etc. Yet these are mostly the same as passwords, as they can be used to authenticate against the service (sometimes they are even more powerful than a password, because they don’t enforce 2FA or MFA).
Let’s think about it in the scenario of account-recovery tokens:
The user requests the recovery token; the server-side generates it (hopefully from a CSPRNG) and encodes it into something readable (helpful tip for token readability: [*])
Stores it inside your database and then sends it to your user
The user is very security-educated and knows that they should write the token down twice on two pieces of paper and store both in distinct, fire-resistant places. (This is how I would recommend you store your TOTP secrets, recovery tokens, etc.)
Half a year later a hacker manages to get a complete dump of your account-recovery table. You and your team haven't noticed it yet, so you can’t just invalidate the tokens.
Even though the users haven’t done anything wrong regarding their account security, the hacker can enter each account-recovery code into your website, thereby impersonating the actual user and changing the email and password for each account.
Hopefully you can roll back all the changes from a backup, but earning back the trust of your users will definitely be a challenge.
[*] Advice: for encoded strings presented to the user, I personally prefer lowercase base32 strings split up into groups of 3 or 4 characters (nr6i vzbv h3so thfc). The reason is rather simple: it’s case-insensitive and easier to write down, and users are less likely to make an error compared to tokens like zF61bS1lwnO04eq3.
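Producing such a string takes only a few lines; a sketch with Python’s standard library (the function name and group size are my choices):

```python
import base64

def readable_token(raw: bytes, group: int = 4) -> str:
    # Lowercase base32, padding stripped, split into short groups so the
    # user can write it down and type it back in without case errors.
    encoded = base64.b32encode(raw).decode("ascii").rstrip("=").lower()
    return " ".join(encoded[i:i + group] for i in range(0, len(encoded), group))
```

For 20 bytes of input this yields eight groups of four characters, e.g. in the style of the sample above.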
Sites including Twitter, Google, etc. support the functionality of recovery tokens, and they can also show you the token if you failed to write it down in the first place 😲: they store it in plain text! An absolute no-go for credentials. I have no doubt that a company like Google is capable of securing those tokens, but I would love to hear the argument for why they aren’t hashed.
I don’t want to say too much on the implementation side, as most developers can figure it out on their own once they are aware of such problems. Very short: of course, also hash special tokens, API keys, etc., including a per-entry salt.
I personally handle it this way (for account-recovery tokens, i.e. something the user should write down physically):
Generate 20 bytes of cryptographic randomness (CSPRNG) for our account recovery token. Depending on the use-case the amount of bytes generated should be adjusted. (For example: when creating API-keys, there is nothing against using 64 bytes of randomness.)
Encode it (as described above). This results in a 32-character string that should look similar to this:
usru kbvj nmvg xly5 4qh3 jnk6 jd2n iadm
Generate an additional 32 random bytes and use them as the salt for hashing your token with either a KDF (if performance is secondary) or plain SHA3-512
Store the first 10 characters (6 bytes) of your token (usru kbvj nm) as the ID, together with the salt and the generated hash, here called the “secret”. From my experience: in an actual implementation I also, depending on the latency and performance requirements, apply symmetric encryption to the secret field (same process as for passwords).
Send the plain token to your user and afterwards erase it from memory
Your user can now write down the token. When they enter it later, your service takes the first 10 characters of the input and tries to find the entry with the same ID. If found, it takes the salt and the complete user input, hashes it and compares it to the secret. Again, please do a constant-time comparison (i.e. timing side-channel attack resistance), as it’s a good habit whenever you make comparison-related security decisions.
The End. Thanks for reading, Florian
A big thank you goes to
Lukas Kurz for proofreading the text for linguistic correctness and comprehensibility before I published it.
Tim Heckman for giving me technical feedback and advice on many of the article’s chapters before I published it.
I always have an open ear — email@example.com — just contact me!