This is one of a series of posts about trust, which may sound strange since in security we learn that we shouldn’t trust anyone or anything.
In a not so distant past, to work around browsers' limit on concurrent connections to the same domain, we started spreading content across subdomains under our control. Most of the time they were just aliases of our application/website domain.
Then, to speed up our applications, we started moving our applications'/websites' static content to third-party storage, accessible over HTTP(S). That is the day CDNs (Content Delivery Networks) entered our relationship with end users.
From that day on, we started trusting third parties to store our content and serve it directly to our users. The problem is that we have no guarantee that end users are receiving exactly what we put in that storage.
What if //some-third-party.tld/jquery-latest.js is not what we were expecting? How bad can things get if some-third-party.tld is hacked and jquery-latest.js is modified, perhaps to carry extra malicious source code?
As the Subresource Integrity working draft puts it: "Compromise of the third-party service should not automatically mean compromise of every site which includes its scripts. Content authors will have a mechanism by which they can specify expectations for content they load, meaning for example that they could load a specific script, and not any script that happens to have a particular URL."
Keeping things simple, as they should be, this working draft introduces a new integrity attribute on the <script> tag, which enables the developer to tell the browser to perform an integrity check before executing the script: execute the script only if its digest matches the one provided.
<script src="https://analytics-r-us.com/v1.0/include.js" integrity="sha256-SDfwewFAE...wefjijfE" crossorigin="anonymous"></script>
Hash digests have been used for years to check the integrity of downloaded files: the content provider gives you the file and publishes a digest on their server. After downloading, you compute the digest of the local file using the same cryptographic hash function (historically MD5, nowadays a SHA-2 family function such as SHA-256) and compare it with the published one. If they match, you are good to go; otherwise the downloaded file is deemed corrupted.
Subresource Integrity can also be used in scenarios where you are loading third-party mashups or widgets. Sure, sometimes that code will be very dynamic and, as such, not a good candidate for hashing. But if you are loading dynamic code from a partner website, you'd better trust them, right? That risk can perhaps be alleviated by splitting the mashup into its static and dynamic parts, and then reducing the dynamic part to JSON that you can validate on your side.
Now, consider another scenario, where client-side code uses this feature to verify that the scripts being loaded from its own server are valid. This may sound a bit silly at first: usually we don't trust the client, the server is usually trustworthy, and good channel encryption protects you from corrupted code being loaded. However, serious vulnerabilities have recently been discovered in OpenSSL, MitM (Man-in-the-Middle) attacks have proved to be still around, and Man-in-the-Browser (MitB) attacks keep surging on client devices. Using Subresource Integrity to check resources loaded from your own server adds an extra verification that those attacks would have to bypass. And since many MitM/MitB attacks are still quite simple, this is a worthwhile extra hurdle.
In conclusion, Subresource Integrity is a (not so) simple improvement that can make a huge difference in web application security. Other techniques, such as Content Security Policy (CSP), which we will address soon in a blog post, are available and should be part of any application's security policy.
Final thought about trust: should we trust (free) obfuscators?