Unpacking the serial number fiasco playing out in the digital certificate industry
Last week, we discussed a developing situation where several major companies had to mass revoke millions of mis-issued digital certificates.
We gave kind of a cursory explanation of the issue itself at the time, but since then we’ve gotten several questions about what specifically the issue is, and also why it’s significant.
So today we’re going to delve a little bit deeper into the 63-bit serial number debacle and try to paint a clearer picture of what specifically happened and why this is (or isn’t) significant.
That means we’re going to kick the tires on hashing, talk a little about random number generators, and finish by going into what impact this has on the industry and, perhaps more importantly, what it says about the current climate between CAs and browsers.
Let’s hash it out.
Certificate Generation and Serial Numbers
Before we get into the crux of the serial number issue – random number generators (RNGs) – let’s start conceptually. Do you remember when we discussed hashing and salting last year? As a quick refresher, hashing is a mathematical process that maps data of any length to a fixed-length output. It’s a one-way function. Or rather, it’s computationally prohibitive to reverse a hash, so it is effectively a one-way function. Hashing algorithms are typically known for their use as checksums and the like, but they also form the basis for many RNGs.
Hashes, or hash values or digests (whatever you call them), should be unique in practice. Because the output has a fixed length, two different inputs that share a hash value must exist in theory, but finding such a pair should be computationally infeasible. When someone does find two inputs that produce identical hash values, it’s called a collision, and it effectively renders the algorithm broken.
Let’s look at a practical example using the now-outmoded SHA-1. Let’s go with a line from Hamlet and shoehorn some culture into this article.
The lady doth protest too much, methinks.
Notice it maps the sentence to a fixed-length output. Now let’s make a small tweak, and drop the period at the end of the sentence.
The lady doth protest too much, methinks
Totally different hash value. This is how it’s supposed to work.
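If you want to reproduce the comparison yourself, here is a minimal sketch using Python’s standard hashlib module. The helper name sha1_hex is ours, purely for illustration.

```python
import hashlib

def sha1_hex(text):
    """Return the SHA-1 digest of a string as 40 hexadecimal characters."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

with_period = sha1_hex("The lady doth protest too much, methinks.")
without_period = sha1_hex("The lady doth protest too much, methinks")

print(with_period)
print(without_period)
print(len(with_period), len(without_period))  # both are 40 hex characters (160 bits)
print(with_period != without_period)          # True: the digests differ
```

Both digests are the same length no matter how long the input is, and removing a single character changes the output completely.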
Google once spent millions of dollars and threw some of the brightest minds in technology and mathematics at a two-year project to create a SHA-1 collision – just to prove a point. That’s really only tangentially relevant to today’s discussion, but it’s interesting, nonetheless.
Anyway, when it comes to hashing, collision resistance is the name of the game. That’s exactly why MD5 and SHA-1 have been sunset in favor of SHA-2 – lack of collision resistance.
So how does salting come into this? Well, one way to make attacks on hashed data harder is to salt it. This is extremely common with passwords. Salting is done by affixing an additional value, called a salt, to the end of the data being hashed. Remember, even the tiniest change to the data being hashed will cause a completely different hash value to be produced, so an attacker who doesn’t know the salt ahead of time can’t work out the hash value in advance.
So if we wanted to salt Shakespeare, it’d look something like this:
The lady doth protest too much, methinks. SALTVALUE
Now, after salting, in order to create a collision, an attacker would also need to know or crack your salt value.
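Here is a rough sketch of that idea in Python. It assumes SHA-256 as the hash and a 16-byte random salt appended to the text; both choices, and the helper name salted_sha256, are ours for illustration.

```python
import hashlib
import secrets

def salted_sha256(text, salt):
    """Hash the text with the salt appended to the end, as described above."""
    return hashlib.sha256(text.encode("utf-8") + salt).hexdigest()

line = "The lady doth protest too much, methinks."
salt_a = secrets.token_bytes(16)  # 16 random bytes; the length is an arbitrary choice
salt_b = secrets.token_bytes(16)

print(salted_sha256(line, salt_a))
print(salted_sha256(line, salt_b))  # same text, different salt, different digest
print(salted_sha256(line, salt_a) == salted_sha256(line, salt_a))  # True: same inputs, same digest
```

For actual password storage, a purpose-built function such as PBKDF2 or scrypt is the usual recommendation; a bare salted hash is only meant to show the mechanism.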
Ok, so what does any of that have to do with serial numbers?
We’re getting there. First, let’s apply hashing to SSL/TLS certificates, though this is pretty much the same with all digital certificates. When you generate a certificate, you get a big unwieldy block of letters and numbers sandwiched between a “begin…” line and an “end certificate” line.
It looks like this:
Obviously, this isn’t what you see when you click a certificate’s details, but this is the certificate in its raw form: the certificate’s binary data, Base64-encoded as text. Hashing comes into play during the issuance process, where it serves as a kind of checksum. When you generate a certificate signing request (CSR), the server takes the input you’ve given, hashes it with SHA-2 and signs that hash with your private key. When you submit the CSR to the issuing certificate authority, it runs the same hash function over the information you’ve provided and checks the result against the signature in the CSR. That’s why even the tiniest change to your application/enrollment information requires you to start over: it causes a different hash value to be created.
Hashing also plays a role in the digital signature that will be affixed to the certificate by one of the signing CA’s private keys. When a certificate is signed, it’s hashed, then padded with some non-random bytes that denote the hash function used (ii…) and the hash value (hh…). Depending on the algorithm being used, the resulting sequence – represented as an integer – is then signed with the private key.
In the case of RSA, it’s exponentiated using the private key’s value. We’re kind of getting into the weeds here, but the signature will prove vital in authenticating the certificate.
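For the curious, the hash-pad-exponentiate sequence can be sketched in a few lines of Python. This assumes the padding scheme being described is PKCS#1 v1.5 with SHA-256, and it uses a toy key built from two publicly known Mersenne primes so that no key generation is needed. The key and the function names encode, sign and verify are illustrative assumptions, not anything a real CA would use.

```python
import hashlib

# DER prefix that identifies SHA-256 in a PKCS#1 v1.5 signature (the "ii..." bytes).
SHA256_PREFIX = bytes.fromhex("3031300d060960864801650304020105000420")

# Toy key built from two well-known Mersenne primes. Illustration only:
# a real CA key uses secret, randomly generated primes.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)
KEY_BYTES = (N.bit_length() + 7) // 8

def encode(message):
    """Hash the message, then pad: 00 01 FF..FF 00, prefix (ii...), hash (hh...)."""
    digest = hashlib.sha256(message).digest()
    tail = SHA256_PREFIX + digest
    padding = b"\xff" * (KEY_BYTES - len(tail) - 3)
    return b"\x00\x01" + padding + b"\x00" + tail

def sign(message):
    """Treat the padded block as an integer and exponentiate it with the private key."""
    return pow(int.from_bytes(encode(message), "big"), D, N)

def verify(message, signature):
    """Undo the exponentiation with the public key and compare padded blocks."""
    return pow(signature, E, N) == int.from_bytes(encode(message), "big")

body = b"example certificate contents"
signature = sign(body)
print(verify(body, signature))         # True
print(verify(body + b"!", signature))  # False: any change to the data breaks the match
```

The padding bytes are fixed and predictable, which is why the hash value does all of the work of tying the signature to one specific certificate.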
We still haven’t gotten to serial numbers…
Ok, ok. Let’s tie this all together. But first, quick question, what would it take to effectively spoof a digital certificate?
A lot? Yes, and that’s quite by design. To effectively spoof a certificate, you would need to be able to produce several hash collisions. First, you’d need to reproduce the hash value of the certificate itself. Then you would need to be able to spoof the signature, producing a collision so that the signature check returns a match.
Frankly, at their current state of evolution, it would require vast computational resources to accomplish that. Unless, of course, you could compromise the private key that was used to sign the certificate – which would substantially lower the level of difficulty.
Even though the risk of spoofing is currently remote, the consequences would be severe enough to warrant an additional safeguard. So, CAs generate a sufficiently random serial number alongside the certificate, also using SHA-2.
Now let’s circle back to salting. Because it’s relevant in two ways. First of all, during the generation of a serial number, CAs typically add what is called a nonce, a proprietary salt that is applied to an output with 64 bits of entropy and then hashed to produce the value you see in the certificate details.
Entropy is a measure of randomness. In this case, 64 bits of entropy means 2^64 possible values, which creates a probability of one in over 18 quintillion – a number so big it feels totally abstract – that you could guess the serial number. Even at a billion guesses per second, it would take centuries for a computer to try every possible value.
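To put a number on that, here is a quick back-of-the-envelope calculation in Python. The guess rate of one billion per second is an assumption picked purely for illustration; the answer scales directly with whatever rate you plug in.

```python
possible_values = 2**64
print(f"{possible_values:,}")  # 18,446,744,073,709,551,616

# Assumed guess rate, purely for illustration: one billion guesses per second.
guesses_per_second = 1_000_000_000
seconds_per_year = 365.25 * 24 * 60 * 60

years_to_try_all = possible_values / guesses_per_second / seconds_per_year
print(round(years_to_try_all))  # 585 years at the assumed rate
```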
We mentioned salting, though, because not only does serial number generation typically incorporate it, but in a very practical way serial numbers also serve as a salt: they add another layer of collision resistance to the certificate itself, making collision attacks harder to pull off. After all, that’s one more thing you have to be able to calculate.
Obviously, that rules out generating serial numbers sequentially, but there’s also a more practical, business-related reason why serial numbers can’t be doled out sequentially: for the price of one certificate per week/month, you could see how many certificates your competitors are signing.
Why 64 bits of entropy?
Remember earlier how we said that SHA-1 and MD5 were found to be vulnerable to some collision attacks? Not necessarily in a practical sense, but theoretically both can be exploited. MD5, specifically, was found to be vulnerable to collision attacks in 2005 when researchers were able to create a pair of X.509 digital certificates with identical hash values. Then, in 2008 a group of researchers announced they had created a rogue intermediate CA that could pass muster when checked by its MD5 hash.
Obviously, that was a major problem. So, the industry switched to SHA-1 until 2016, when it became theoretically vulnerable to collision attacks, too. The keyword there is theoretically.
Regardless, the baseline requirements were updated to reflect this. From section 7.1:
Effective September 30, 2016, CAs SHALL generate non-sequential Certificate serial numbers greater than zero (0) containing at least 64 bits of output from a CSPRNG.
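As a rough sketch of what that requirement asks for, here is one way to draw a serial number in Python. It assumes the operating system’s CSPRNG, reached through the standard secrets module, as the source of randomness. The function name new_serial_number is hypothetical, and this is not how any particular CA’s software does it.

```python
import secrets

def new_serial_number(random_bits=64):
    """Draw random_bits from the OS CSPRNG and reject zero so the result is positive."""
    while True:
        serial = secrets.randbits(random_bits)
        if serial > 0:
            return serial

serial = new_serial_number()
print(hex(serial))
```

Asking for more than 64 bits, for example new_serial_number(128), leaves headroom above the minimum, which matters once encoding details start eating into the bits.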
So how did these CAs run afoul of that requirement?
Well, it has to do with the requirement that all serial numbers must be positive integers. Every now and then the open-source CA software being used, EJBCA, would spit out a negative integer. When that happened, EJBCA, as these CAs had it configured, padded the value with a fixed output that makes the integer positive.
In layman’s terms, it was putting a zero at the beginning of the serial number. The serial number is a fixed length that cuts off at 64 bits, and if one of those bits is necessarily a zero, you’ve just lost one bit of entropy. And while that may seem trivial, there is a 9 quintillion+ difference between 2^63 and 2^64.
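The lost bit is easier to see in code. The sketch below is an illustration of the arithmetic, not EJBCA’s actual logic. It assumes the serial number is stored the way certificates store integers, as a signed big-endian value whose leading bit marks the sign; der_integer_bytes is a helper name we made up.

```python
import secrets

def der_integer_bytes(value):
    """Bytes needed to store a non-negative integer as a signed, big-endian value."""
    length = value.bit_length() // 8 + 1
    return value.to_bytes(length, "big")

raw = secrets.randbits(64)    # 64 bits straight from the CSPRNG
squeezed = raw & (2**63 - 1)  # force the top bit to zero so it fits in 8 signed bytes

print(len(der_integer_bytes(2**64 - 1)))  # 9: a full 64-bit value needs a leading zero byte
print(len(der_integer_bytes(2**63 - 1)))  # 8: the largest positive value that fits in 8 bytes
print(f"{2**64 - 2**63:,}")               # 9,223,372,036,854,775,808 fewer possible values
```

In other words, a positive serial number that has to fit in exactly 8 bytes can only ever carry 63 random bits.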
The real crux of the issue is with the random number generators that are at the heart of so many encryption-related functions – everything from key generation to serial numbers relies on the random generation of numbers. But the generator behind EJBCA, like so many of its contemporaries, isn’t truly random.
As we touched on last week, EJBCA relies on a cryptographically secure pseudo-random number generator (CSPRNG). The key word there is “pseudo.” As in, something resembling or imitating; or purporting to be something when it is genuinely not.
Don Davis, Bank of America’s VP of Security and Cryptography, wrote on the value of randomness in a whitepaper a couple of years ago:
Most of the fast secure PRNGs are built around cryptographic hash algorithms, like SHA-1 or SHA-256. These hash functions have tended to have short lifetimes of 6-10 years, from initial publication to the utter ruin of cryptanalysis. Other fast PRNGs have been built around ciphers, like NIST’s standard AES, but again, these ciphers were only designed to be hard to break; they’re not proven to be secure. Only two PRNGs have been proven to be secure. Both are appallingly slow (by 2000x or more), and even so, their security proofs must be qualified: these algorithms were “proven” to be hard to break, assuming that factoring larger numbers is hard, or else assuming that the discrete-logarithm problem is hard. There’s no proof, though, that those problems in turn, are actually intractable.
And a lot of that potentially goes out the window when quantum computers become viable.
As a result, we – as an industry – are going to continue to have this collision issue. And the more CAs that rely on the same pseudo-random number generators the more that risk will grow. We’re already seeing this play out with RSA key gen where the same seeds/vectors are used for far too many digital certificates.
That’s actually how you get encryption backdoors, too. At least functionally. The historical example would be the NSA’s reported backdoor in Dual_EC_DRBG, the random number generator that RSA Security shipped as a default in its products, but the general idea is that if you know the seeds used to initialize the RNG, you’ll have a much easier time cracking keys and compromising encryption.
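The seed problem is easy to demonstrate with a toy. The class below is a deliberately simplified hash-based generator, written for this article and not taken from any real library or standard: each output block is just SHA-256 of the seed plus a counter. Anyone who learns the seed can reproduce every “random” block it will ever emit.

```python
import hashlib

class ToyHashGenerator:
    """Toy hash-based generator: block i is SHA-256(seed + counter i). Not for real use."""

    def __init__(self, seed):
        self.seed = seed
        self.counter = 0

    def next_bytes(self):
        block = hashlib.sha256(self.seed + self.counter.to_bytes(8, "big")).digest()
        self.counter += 1
        return block

victim = ToyHashGenerator(b"seed the attacker learned")
attacker = ToyHashGenerator(b"seed the attacker learned")

victim_output = [victim.next_bytes() for _ in range(3)]
attacker_output = [attacker.next_bytes() for _ in range(3)]
print(victim_output == attacker_output)  # True: same seed, same "random" stream
```

Production CSPRNGs are far more careful than this, but they share the same property: the output is only as unpredictable as the seed material that goes in.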
Is there an alternative to CSPRNGs?
Yes, there is. They are called TRNGs, or True Random Number Generators, and frankly the explanation I’m about to give would be equally at home coming from an MIT-grad cryptographic expert or a freshman bio major on an acid trip.
True Random Number Generators use naturally occurring physical events to map bits and achieve true randomness. A few months ago we wrote about how some researchers at Penn State University had devised a key generation scheme that plotted the movement of cells against a fixed grid to generate keys.
That’s also similar to the way Cloudflare uses the lava lamps in its HQ’s lobby for key generation. While it’s challenging to really define the concept of true randomness in an academic sense (though it’s inherently easy to understand), the US National Institute of Standards and Technology (NIST) gave it the old college try.
One might approach it by studying infinite sequences of bits or samples and measure their properties, some statistical such as bias, others non-statistical such as lack of computable correlations. Three of the most frequently used characteristics of true randomness are i) unpredictability, which is a measure of the strong non-computability of the bits in the sequence; ii) uniform distribution of the bits in the sequence; iii) lack of patterns in the sequence. It is worth pointing out that iii) implies both i) and ii). However, i) does not imply ii) and similarly ii) does not guarantee i).
I was with them until we got to the propositional calculus at the end, so, trying not to get too distracted by that, the overarching idea is that there are truly random events in nature that can be used to generate random numbers and keys. We all remember the scene from Jurassic Park where Jeff Goldblum explains chaos theory by dripping some water on another scientist’s hand. TRNGs use this naturally occurring randomness as the basis of their number/key generation.
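NIST’s point that a uniform distribution does not guarantee unpredictability is actually simple to show. In the sketch below, a sequence that just alternates 0 and 1 passes a basic “are half the bits ones?” check perfectly, even though you can predict every bit of it. The comparison bits come from Python’s secrets module, which is an operating-system CSPRNG and not a TRNG; we use it here only as a convenient stand-in.

```python
import secrets

def fraction_of_ones(bits):
    """Share of 1 bits in a sequence; close to 0.5 means the bits look uniformly distributed."""
    return sum(bits) / len(bits)

alternating = [i % 2 for i in range(10_000)]            # 0, 1, 0, 1, ...
from_os = [secrets.randbits(1) for _ in range(10_000)]  # bits from the OS CSPRNG

print(fraction_of_ones(alternating))  # 0.5 exactly, yet every bit is predictable
print(fraction_of_ones(from_os))      # close to 0.5, varies from run to run
```

Passing a statistical check like this one is necessary but nowhere near sufficient, which is the argument for getting unpredictability from the physical source itself.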
Some common examples of the types of phenomena used are:
- Radioactive Decay
- Thermal Noise
- Quantum Fluctuations
- Air Turbulence
- Cell Movement
“The advantage of these naturally-random sources is that even if an attacker were able to know all of the system’s internal state variables with high precision, this still wouldn’t help him to predict the TRNG’s next random number,” writes Davis. “This unpredictability is guaranteed for some TRNGs by the mathematics underlying quantum mechanics; for others, the mathematics of nonlinear dynamics is equally airtight. In both areas of physics, the unpredictability is not only empirically verified, but is also well-understood mathematically.”
Doug Hill is the founder of RealRandom, which has created its own TRNG and is now looking to break into the crypto-wallet and encryption security sectors. (I can think of a few CAs that might want to give him a call.)
Random numbers are essential to securing computerized networks and there are multiple ways to generate cryptographically secure keys but a cryptographic system is only as strong as the source of randomness that it employs and to generate true randomness requires a naturally occurring physical phenomenon, which Real Random has been able to accomplish. We have built a hardware physical random bit generator that utilizes a mechanical process to create a chaotic state in a controlled environment to produce completely unpredictable bits, to enhance the strength of cryptographic keys.
Long-term, this is the direction the CA industry is going to need to head. A proactive approach to solving this issue and strengthening the entire ecosystem would be to work towards widespread implementation of TRNGs, but as we’re about to discuss, that doesn’t seem to be the tack everyone is taking.
Does any of this matter?
From a practical standpoint: no, not right now. But this whole situation also highlights a key stress point in the industry. This 64-bit requirement, while good in theory, was a safeguard against collision attacks that SHA-2 isn’t even known to be vulnerable to. So, as of now, it’s actually overkill. That is to say, there’s no threat presented by these serial numbers and the certificates will be long expired before there is one.
Besides, nine quintillion is still plenty collision resistant. The issue, distilled to its simplest, comes down to a simple misconfiguration.
Granted, the certificates are non-compliant, but some CAs are pushing back on the strict revocation requirements set forth by the CA/B Forum. It’s here we begin highlighting the stress point I just mentioned.
And that comes down to what a business disruption this is – and how that concern is perceived by other stakeholders. In some sense – and this is based not just on the response to this incident, but also on having kept close tabs on this industry for the past three years – there seems to be more of an interest in being punitive than in being proactive about addressing these kinds of issues.
At issue is the open secret that some at the CA/B Forum vehemently disagree with the notion that digital certificates should be sold – rather, they believe SSL certificates should all be DV, and all be free.
But while commercial business interests (and needs) might be anathema to some parties at the Forum, there’s a fine line between upholding a stern interpretation of the baseline requirements and engaging in outright activism.
To be clear, we work in a number of different capacities with several different Certificate Authorities. We have an intimate understanding of their pain points and their reasoning. But even more so, given our position in the market, we have a man-on-the-ground, clear-eyed perspective of the challenges faced by the average consumer – by SMBs, non-profits, enterprises, etc.
And the unfortunate truth is that while this industry is content to sit and split hairs on the entropy level of serial numbers, the average consumer doesn’t give a $#!%. They see things through the lens of how it will impact their business or website. And for some, this will be the second time in about two years that the digital certificate industry will have caused them an undue disruption over a technical issue that they neither understood, nor had the inclination to learn about.
And even if there is no direct impact, this was newsworthy. Apple, Google and GoDaddy are all huge companies. Revoking millions of certificates (and the word millions stands out) just sounds salacious.
It’s another black eye for the industry. But while Google can afford to take a hardline stance on this situation – owing to the fact its own mis-issued certificates were all in-house so there was no real disruption for its customers – commercial CAs like GoDaddy need to prioritize their customers’ needs.
The CAs aren’t perfect, but they’re at least cognizant of the greater implications of these kinds of debacles. This goes beyond any financial hit – which is the first place the CAs’ critics will go – and into the realm of faltering confidence in the technology itself.
Because right now, to the average consumer, digital certificates are approaching a point where they’re more of a pain in the butt than a benefit. No industry wants that.
Speaking to another industry professional recently – this was off the record so obviously it’s unattributed – the clash between strict adherence to the baseline requirements and the need to do what’s best for the customer couldn’t have been more starkly on display.
“I think we’ve done a lot for standardization and regulation – regulation of ourselves – and often it’s in the purity sense, and not in the consumer’s sense… When we [went through the transition from SHA-1] and it was browsers and CAs, and then all of the sudden payment processors and point of sale terminal people said, “we can’t move on! It’s not possible for us to update these devices and we can’t just go [replace them]. No, it’s a non-starter,” so we had to go ask for exclusions and the CA/B Forum doesn’t allow for exclusions, it generally only allows set rules, and [arbitrates] if you’re in violation or not – we said, ‘we’re breaking the rules! our customer can’t go on.’ But there were a couple of times where we said, ‘we’re going to support our customer in not being compliant’ – because you can’t break the internet.”
Obviously, that opinion isn’t shared by all at the CA/B Forum, but the debate over purity vs practicality is perfectly encapsulated by this serial number snafu, too.
And it shows no signs of abating.