SHA-1, SHA-2, SHA-256, SHA-384 – What does it all mean!!
If you have heard about “SHA” in its many forms, but are not totally sure what it’s an acronym for or why it’s important, we’re going to try to shine a little bit of light on that here today.
Before we can get to SHA itself though, we need to run through what a hash is, and then we’ll get into how SSL certificates use hashes to form digital signatures. These are critical concepts to understand before you’ll be able to follow what SHA-1 and SHA-2 are.
What is a Hash?
A hashing algorithm is a mathematical function that condenses data to a fixed size. So, for example, if we took the sentence…
“The Quick Brown Fox Jumps Over The Lazy Dog”
…and ran it through a specific hashing algorithm known as CRC32 we would get:
This result is known as a hash or a hash value. Sometimes hashing is referred to as one-way encryption.
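If you want to try this yourself, here is a minimal Python sketch that computes a CRC32 value with the standard library's zlib module. The helper name crc32_hex is our own invention, and we are assuming the sentence is encoded as UTF-8 with no trailing newline; a different encoding or a stray space would give a different value.

```python
import zlib

def crc32_hex(text: str) -> str:
    """Return the CRC32 checksum of text as 8 lowercase hex characters."""
    # Mask to 32 bits so the value is always an unsigned integer.
    checksum = zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF
    return f"{checksum:08x}"

sentence = "The Quick Brown Fox Jumps Over The Lazy Dog"
print(crc32_hex(sentence))

# Hashing the same input again always gives the same value.
print(crc32_hex(sentence) == crc32_hex(sentence))
```

However long the input is, the output is always 32 bits, written here as 8 hex characters.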
Hashes are convenient for situations where computers may want to identify, compare, or otherwise run calculations against files and strings of data. It is easier for the computer to first compute a hash and then compare the values than it would be to compare the original files.
One of the key properties of hashing algorithms is determinism. Any computer in the world that understands the hashing algorithm you have chosen can locally compute the hash of our example sentence and get the same answer.
Hashing algorithms are used in all sorts of ways – they are used for storing passwords, in computer vision, in databases, etc.
There are hundreds of hashing algorithms out there and they all have specific purposes – some are optimized for certain types of data, others are for speed, security, etc.
For the sake of today’s discussion, all we care about are the SHA algorithms. SHA stands for Secure Hash Algorithm – its name gives away its purpose – it’s for cryptographic security.
If you only take away one thing from this section, it should be: cryptographic hash algorithms produce irreversible and unique hashes. Irreversible meaning that if you only had the hash you couldn’t use that to figure out what the original piece of data was, therefore allowing the original data to remain secure and unknown. Unique meaning that, in practice, nobody should be able to find two different pieces of data that produce the same hash – the next section explains why this is so important.
Note: To make it easier to read and comprehend this article I am using an example data string and hashing algorithm that is significantly shorter than what would actually be used in practice. The hashes you have seen thus far are NOT SHA hashes of any type.
Now that we know what hashes are, we can explain how they are used in SSL Certificates.
The SSL/TLS protocol is used to enable secure transmission of data from one device to another across the internet. For succinctness, it seems SSL is often explained as “encryption.” But don’t forget that SSL also provides authentication. The SSL certificate file is tasked with providing the necessary information needed for authentication. Or put another way, SSL certificates bind a specific public key to an identity.
Remember that the SSL/TLS protocol facilitates a connection using asymmetric encryption. This means there are two encryption keys that each handle one half of the process: a public key for encryption, and a private key for decryption. Every SSL certificate contains a public key that can be used by the client to encrypt data, and the owner of said SSL certificate securely stores a private key on their server which they use to decrypt that data and make it readable.
Ultimately, the primary purpose of this asymmetric encryption is secure key exchange. Owing to the computing power asymmetric keys require, it’s more practical (and still safe) to use smaller symmetric keys for the actual communication portion of the connection. So the client generates a session key, then encrypts a copy of it and sends it to the server where it can be decrypted and used for communicating throughout the duration of the connection (or until it’s rotated out).
This is why authentication is incredibly important to making sure SSL/TLS actually provides meaningful security. Imagine if your computer had no reliable way to know who owned the encryption key you were using. Encrypting your session key with that public key would not be useful because you would not know who possessed the corresponding private key that decrypts it. After all, encrypting data is of little use if you are sending it directly to a man-in-the-middle attacker or a malicious party at the other end of the connection.
Digital signatures are an important part of how SSL certificates provide authentication. When a certificate is issued, it is digitally signed by the Certificate Authority (CA) you have chosen as your certificate provider (for example Sectigo, DigiCert, etc). This signature provides cryptographic proof that the CA signed the SSL certificate and that the certificate has not been modified or reproduced. More importantly, an authentic signature is cryptographic proof that the information contained in the certificate has been verified by a trusted third party.
Now let’s talk about how a digital signature is made, applied, affixed – you pick the terminology. The asymmetric keys we mentioned before are used again, but for the purpose of signing not encrypting. Mathematically, signing involves flipping around the way the data and keys are combined (We won’t go too far into the weeds on the specifics of how signatures are created because it gets complicated quickly. If you are interested in that, Joshua Davies has written a great post on how digital signatures work). To make it easier for computers to quickly, yet still securely, create and check these signatures, the CA first hashes the certificate file and signs the resulting hash. This is more efficient than signing the entire certificate.
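The hash-then-sign idea can be sketched in a few lines of Python. To be clear, this is a toy and not how a CA actually does it: the key is a tiny textbook RSA key we built from two well-known Mersenne primes, there is no padding scheme, and the names sign, verify and digest_as_int, as well as the example certificate bytes, are all made up for illustration.

```python
import hashlib

# Toy RSA key built from two well-known Mersenne primes. It is far too small
# and too structured for real use; real signing uses vetted libraries,
# much larger keys, and a padding scheme.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)

def digest_as_int(data: bytes) -> int:
    """Hash the data with SHA-256 and reduce it to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    """Sign the hash of the data (not the data itself) with the private key."""
    return pow(digest_as_int(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Recompute the hash and compare it with the one inside the signature."""
    return pow(signature, E, N) == digest_as_int(data)

# Hypothetical stand-in for the contents of a certificate file.
certificate = b"example.com public key and identity details"
signature = sign(certificate)

print(verify(certificate, signature))          # True
print(verify(certificate + b"!", signature))   # False: the hash no longer matches
```

Notice that the expensive private-key operation only ever touches the short hash, however large the signed file is.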
That digital signature then provides the needed proof that the certificate you have been given is the exact certificate issued by a trusted CA to the website in question. No tricks. No spoofing. No man-in-the-middle manipulation of the SSL/TLS certificate file.
Digital signatures are incredibly sensitive – any change to the file will cause the signature to change. If we took our example sentence from the previous section and made it entirely lowercase (“the quick brown fox jumps over the lazy dog”) the resulting hash would be entirely different. That means the resulting signature of that hash would also be different. Even changing one bit of a multi-thousand gigabyte document would result in an entirely different hash.
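You can see this sensitivity for yourself with a short Python sketch. Note that we are using SHA-256 from the standard hashlib module here rather than the CRC32 from the earlier example, and the helper name sha256_as_int is our own.

```python
import hashlib

def sha256_as_int(text: str) -> int:
    """Return the SHA-256 digest of text as one big integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")

original = "The Quick Brown Fox Jumps Over The Lazy Dog"
lowercase = original.lower()

print(hashlib.sha256(original.encode("utf-8")).hexdigest())
print(hashlib.sha256(lowercase.encode("utf-8")).hexdigest())

# XOR the two digests and count the 1 bits to see how many positions differ.
differing_bits = bin(sha256_as_int(original) ^ sha256_as_int(lowercase)).count("1")
print(f"{differing_bits} of 256 bits differ")
```

For a well-designed cryptographic hash you should expect roughly half of the bits to flip, even though only the capitalization of the input changed.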
This makes it practically impossible for an attacker to modify a legitimate certificate or create a fraudulent certificate that looks legitimate. A different hash means that the signature would no longer be valid, and your computer would know this when it’s authenticating the SSL certificate. If your computer encountered an invalid signature, it would trigger an error and entirely prevent a secure connection.
SHA-1 and SHA-2
Now that we have laid the foundation, we can get on to the star of the show.
As I said earlier, SHA stands for Secure Hash Algorithm. SHA-1 and SHA-2 are two different versions of that algorithm. They differ in both construction (how the resulting hash is created from the original data) and in the bit-length of the hash. You should think of SHA-2 as the successor to SHA-1, as it is an overall improvement.
Primarily, people focus on the bit-length as the important distinction. SHA-1 is a 160-bit hash. SHA-2 is actually a “family” of hashes and comes in a variety of lengths, the most popular being 256-bit.
The variety of SHA-2 hashes can lead to a bit of confusion, as websites and authors express them differently. If you see “SHA-2,” “SHA-256” or “SHA-256 bit,” those names are referring to the same thing. If you see “SHA-224,” “SHA-384,” or “SHA-512,” those are referring to the alternate bit-lengths of SHA-2. You may also see some sites being more explicit and writing out both the algorithm and bit-length, such as “SHA-2 384.” But that’s obnoxious like making people include your middle initial when you say your name.
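If the naming still feels murky, a quick Python sketch using the standard hashlib module shows the bit-length each name corresponds to. The dictionary name hash_bits is just ours for this example.

```python
import hashlib

# SHA-1 plus the common members of the SHA-2 family.
names = ("sha1", "sha224", "sha256", "sha384", "sha512")

# digest_size is reported in bytes, so multiply by 8 to get bits.
hash_bits = {name: hashlib.new(name).digest_size * 8 for name in names}

for name, bits in hash_bits.items():
    print(f"{name}: {bits}-bit hash")
```

This prints 160 bits for sha1 and, for the SHA-2 variants, the number that appears in each name.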
The SSL industry has picked SHA as its hashing algorithm for digital signatures
From 2011 to 2015, SHA-1 was the primary algorithm. A growing body of research showing the weaknesses of SHA-1 prompted a reevaluation. In fact, Google has even gone so far as to create a SHA-1 collision (when two pieces of disparate data create the same hash value) just to prove it could be done. So, from 2016 onward, SHA-2 is the new standard. If you are receiving an SSL/TLS certificate today it must be using that signature at a minimum.
Occasionally you will see certificates using SHA-2 384-bit. You will rarely see the 224-bit variety, which is not approved for use with publicly trusted certificates, or the 512-bit variety which is less widely supported by software.
SHA-2 will likely remain in use for at least five years. However, some unexpected attack against the algorithm could be discovered which would prompt an earlier transition.
Here is what a SHA-1 and a SHA-2 hash of our website’s SSL certificate look like:
So, yes. This is what all the fuss is about. It may not look like much – but digital signatures are incredibly important for ensuring the security of SSL/TLS.
A larger bit hash can provide more security because there are more possible combinations. Remember that one of the important functions of a cryptographic hashing algorithm is that it produces unique hashes. Again, if two different values or files can produce the same hash, you create what we call a collision.
The security of digital signatures can only be guaranteed as long as collisions do not occur. Collisions are extremely dangerous because they allow two files to produce the same signature, thus, when a computer checks the signature, it may appear to be valid even though that file was never actually signed.
How Many Hashes?
If a hashing algorithm is supposed to produce unique hashes for every possible input, just how many possible hashes are there?
A bit has two possible values: 0 and 1. The possible number of unique hashes can be expressed as the number of possible values raised to the number of bits. For SHA-256 there are 2^256 possible combinations.
So, 2^256 combinations. How many is that? Well, it’s a huge number. Seriously. It puts numbers like trillion and septillion to shame. It far exceeds the number of grains of sand in the world.
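To get a feel for the size of that number, Python can print it out in full, since its integers have no fixed size limit. The variable names below are ours, and we are taking a septillion to mean 10^24 (the short-scale definition).

```python
# Number of distinct 256-bit values: 2 possible values per bit, 256 bits.
total_hashes = 2**256
print(total_hashes)
print(len(str(total_hashes)), "digits")   # 78 digits

# How many septillions (10**24, short scale) fit into that number?
septillion = 10**24
print(total_hashes // septillion)
```

Even after dividing by a septillion, what is left is still a number more than fifty digits long.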
The larger the number of possible hashes, the smaller the chance that two values will create the same hash.
There are (technically) an infinite number of possible inputs, yet a limited number of outputs. So, eventually, every hashing algorithm, including a secure one, produces a collision. But we are mostly concerned with how easy it would be to do so. SHA-1 was deemed insecure because, due to both its size and construction, it was feasible to produce a collision.
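To show why size matters, here is a rough Python sketch that deliberately weakens SHA-256 by keeping only its first 16 bits and then hunts for a collision. The truncation, the function names and the choice of counting numbers as inputs are all our own assumptions for the demo; nothing here is an attack on the full algorithm.

```python
import hashlib

def truncated_hash(data: bytes, num_bytes: int = 2) -> bytes:
    """Keep only the first num_bytes of SHA-256 (2 bytes = 16 bits)."""
    return hashlib.sha256(data).digest()[:num_bytes]

def find_collision(num_bytes: int = 2):
    """Hash 0, 1, 2, ... until two different inputs share a truncated hash."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        tag = truncated_hash(message, num_bytes)
        if tag in seen:
            return seen[tag], message
        seen[tag] = message
        counter += 1

first, second = find_collision()
print(first, second, truncated_hash(first).hex())
```

With only 65,536 possible outputs a collision is guaranteed within 65,537 attempts, and usually turns up after a few hundred. At the full 256 bits the same brute-force search is hopeless, which is why attacks like the one on SHA-1 depend on weaknesses in the construction as well as the size.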
Note that a large bit-length does not automatically mean a hashing algorithm produces more secure hashes. The construction of the algorithm is also incredibly important – that’s why the SSL industry uses hashing algorithms specifically designed for cryptographic security.
The Move To SHA-2
In 2015 the SSL industry went through the “SHA-2 Transition.” It involved re-issuing thousands of existing certificates so that new files could be created and signed with SHA-2. It also involved major updates to the issuance software that publicly-trusted CAs operate (there are dozens of them). As expected, there were some hiccups.
The deadline for issuing new SSL certificates with SHA-1 hashes was December 31st, 2015. For the most part, the industry has stuck by that deadline. Since then, a few mistakes have been made, and a few special cases were granted.
But over the last three years SHA-1 certificates have almost entirely died out. Today, if you encounter a SHA-1 certificate, you will see an unmistakable warning. It’s been escalating. In Google Chrome, all SHA-1 certificates expiring in 2016 didn’t show the green padlock in secure connections, and instead displayed the same icon as an unsecured HTTP connection. You can click the icon to get more specific information about why it’s being displayed, in case there are other reasons unrelated to the signature.
Browsers treated SHA-1 signed certificates that expire in 2017 with a more intense warning. This is because the security of a signature is directly related to how long it’s valid.
Now, in 2018, Google just summarily executes the site owner and leaves his corpse displayed as a warning to others that might dare to commit the same sins.
Keeping Signatures Secure
As time progresses, attacks against cryptography will improve, and computer processing power will become cheaper. This makes a valid SHA-2 signature less secure in 2020 than it was in 2016. For this reason, the choice of algorithm will be much beefier than immediately necessary so that short term improvements do not result in a compromise of security. It is not unrealistic for a particular hashing algorithm to remain secure for a decade.
Industry experts and security researchers across the world are continually analyzing SHA-2 and other cryptographic hashing algorithms, so rest assured that current SSL certificates will have reliable and secure digital signatures for a while.
That does not mean that cryptographers will just sit around and wait until there is a problem. The successor to SHA-2, conveniently named SHA-3, has already been finalized. When it’s time to make another switch, the SSL industry may use SHA-3 as its next choice, or it may look to an entirely different algorithm.
It takes years to properly research and vet new cryptographic standards, and then develop software that supports them. Hopefully it is reassuring to know the industry is always at least one step ahead.
Every so often we like to re-Hash some of our best, older content in the hopes that our new readers may enjoy it. This article, which originally ran June 29, 2016, has been updated and revised by Patrick Nohe for 2018.