Allow multiple integrity hashes on the integrity field #29192
willstranton wants to merge 1 commit into bazelbuild:master
This aligns the integrity field more closely with the web spec: https://www.w3.org/TR/sri-2/
Specifically, multiple hashes can now be provided.
Before - only one integrity hash could be provided:
> repository_ctx.download("url", integrity="sha256-1234")
After - multiple integrity hashes can now be provided:
> repository_ctx.download("url", integrity="sha256-1234 sha384-1234 sha512-1234")
Per the spec, only the strongest algorithm will be used. In the latter example, the sha512 hash would be used.
Fixes bazelbuild#15758
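To make the "strongest algorithm wins" rule concrete, here is a minimal Python sketch of the selection logic as the SRI spec describes it. It is not the code from this pull request; the names `parse_integrity`, `strongest_entries`, and `verify` are made up for illustration, and returning False when no entry is recognized is a choice of this sketch rather than something the PR specifies.

```python
import base64
import hashlib

# Strength order used by the SRI spec: later entries are stronger.
ALGORITHM_STRENGTH = ["sha256", "sha384", "sha512"]


def parse_integrity(integrity):
    """Split a space-separated integrity value into (algorithm, base64 digest) pairs."""
    entries = []
    for token in integrity.split():
        algorithm, sep, digest = token.partition("-")
        if not sep or algorithm not in ALGORITHM_STRENGTH:
            continue  # entries with an unrecognized algorithm are skipped
        digest = digest.split("?", 1)[0]  # drop an optional "?options" suffix
        entries.append((algorithm, digest))
    return entries


def strongest_entries(entries):
    """Keep only the entries that use the strongest algorithm present."""
    if not entries:
        return []
    best = max(ALGORITHM_STRENGTH.index(alg) for alg, _ in entries)
    return [(alg, d) for alg, d in entries if ALGORITHM_STRENGTH.index(alg) == best]


def verify(data, integrity):
    """Return True if data matches any entry of the strongest algorithm."""
    for algorithm, expected in strongest_entries(parse_integrity(integrity)):
        digest = hashlib.new(algorithm, data).digest()
        if base64.b64encode(digest).decode("ascii") == expected:
            return True
    return False
```

With a value such as "sha256-... sha384-... sha512-...", only the sha512 entry is compared in this sketch, so a wrong sha256 entry would go unnoticed while a wrong sha512 entry fails the check.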
I don't quite understand: why couldn't we only specify the hash with the strongest algorithm?
I'll copy from what the spec says here https://www.w3.org/TR/sri-2/#agility
I think the spec authors believe there will be a day when sha256 will not be sufficient and Bazel will want to migrate to another hash algorithm. It made me think about how git will be moving from SHA-1 to SHA-256 in git 3.0:
Sorry, I don't think I answered the question directly, and instead just copy-pasted the spec. Yes, you're right that it would make sense to only set the strongest algorithm. Ideally, that's what would happen, but in reality, I think it's hard to make a swift transition from one algorithm to another. I think that is why the spec writers included this future-proofing provision. Here's an imaginary scenario that the spec helps future-proof against:
Is this a fantastical scenario? Perhaps 😛. Regardless, this helps Bazel get closer to the integrity spec, which I think is still a good thing.
I don't think this argument applies to Bazel as much as it does to web browsers. Websites may need to support decade-old browsers that don't receive updates, but for Bazel, we only support the three latest major versions. If SHA-256 is broken, all rulesets can just migrate to SHA-512 without any backwards compatibility issues. If the entire current SHA family is broken, we would only have to add support for a new hash function to a couple of Bazel major versions via patch releases so that rulesets can adopt it without fear of breaking existing users. Do you see any (even unlikely) scenario in which the ability to add multiple hashes would actually help? I am worried that we would just cause churn, as well-meaning ruleset authors would probably feel incentivized to add a bunch of hashes just to be "on the safe side".
Isn't the scenario I added above what you are asking for? Or maybe you are discounting that scenario because the time window for a new hash algorithm to be backported to a patch release is much smaller (~2 months) than what browsers have to support (multiple years), AND you are assuming that getting everyone to upgrade to the latest major/minor/patch release will be quick too. But I don't think everyone upgrades quickly, and because of that, rule authors carry the extra burden of determining when they can safely use a new hash algorithm. Going back to that example, suppose SHA3 family hashes (SHA3-256/SHAKE-256) are added in Bazel 16, and are backported to Bazel 15.4 and Bazel 14.7 (the three latest major versions). When can a rule author start switching over to the new SHA3 family? With this pull request, assuming it gets into Bazel 10, they can start adding the new hashes whenever they want (they can supply the existing SHA256 hash and the newly introduced SHA3-256). Without it, they risk breaking users of their ruleset who are still on earlier Bazel versions, like 15.3. Is it safe to say that they'd wait another year after support was added before adding the new SHA3 hashes? I'm not a rule author, so that would be my general assumption.
This is hard to quantify. This is more of an opt-in, future-proofing security measure than one that is forced upon rule authors. So I want to point out that this would be self-imposed churn, as opposed to a scenario where Bazel forces the community to adopt a new standard (e.g., drops support for SHA256 integrity hashes).
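The migration scenario above (an existing SHA256 hash supplied next to a newly introduced one) could be sketched roughly as follows. Everything here is an assumption for illustration: the `sha3-256` prefix, the `OLD_BAZEL` and `NEW_BAZEL` algorithm tables, ranking SHA3-256 above SHA-256, and skipping entries with an unknown algorithm (which is how the SRI spec treats them; whether Bazel would behave the same way is not confirmed in this thread).

```python
import base64
import hashlib

# Hypothetical algorithm tables, weakest to strongest: (integrity prefix, hashlib name).
OLD_BAZEL = [("sha256", "sha256")]                           # release without SHA-3
NEW_BAZEL = [("sha256", "sha256"), ("sha3-256", "sha3_256")]  # release with SHA-3


def make_entry(prefix, hashlib_name, data):
    """Build one 'prefix-base64digest' integrity entry for data."""
    digest = hashlib.new(hashlib_name, data).digest()
    return prefix + "-" + base64.b64encode(digest).decode("ascii")


def check(supported, data, integrity):
    """Verify data against the strongest entry that this release understands."""
    best_rank, best_digests = -1, []
    for token in integrity.split():
        for rank, (prefix, _) in enumerate(supported):
            if token.startswith(prefix + "-"):
                digest = token[len(prefix) + 1:]
                if rank > best_rank:
                    best_rank, best_digests = rank, [digest]
                elif rank == best_rank:
                    best_digests.append(digest)
                break  # token matched a known algorithm; unknown tokens fall through
    if best_rank < 0:
        raise ValueError("no supported hash algorithm in integrity value")
    hashlib_name = supported[best_rank][1]
    actual = base64.b64encode(hashlib.new(hashlib_name, data).digest()).decode("ascii")
    return actual in best_digests
```

Under these assumptions, a value carrying both entries passes on both releases (the older one silently falls back to sha256), whereas a value carrying only the new entry raises an error on the older release. That difference is the window the comment above is concerned with.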
If SHA3 is added to Bazel 16 because folks are increasingly worried about the security guarantees of SHA2 (similar to what happened with SHA1), we will have years to let users migrate. Keep in mind that SHA1 is still considered to be resistant to preimage attacks (not collision attacks) today, which is what Bazel's use case relies on. Minor releases are expected to be fully compatible, so rulesets can assume that users update much more readily than they would to a new major version. Rulesets could also leverage bazel_features to gate the new algorithm if desired. If SHA3 is added to Bazel 16 because preimage attacks on SHA2 have become possible essentially overnight, the world will be in big trouble and integrity hashes on Bazel source archives will probably be far down the priority list. But even in that case we could still cut a Bazel minor release with just support for SHA3 for each major release immediately, and all rulesets could require that version immediately as well. What do you think of just adding support for SHA3, but not for multiple hashes? This would allow ruleset authors to adopt the new function soon without introducing new complexity or practices.
I don't have any further arguments to add. It's well within the Bazel maintainers' prerogative to reject a pull request if they don't believe the value added is commensurate with the added code/complexity. Though if that's the case - please remove the
I'll add in the original issue filer @uhthomas in case they have any additional comments/concerns.
In this discussion, we've been assuming SHA3 will be the next family of hash algorithms to use if SHA2 fails, but even that is in dispute. E.g., on the web side of things, they are conflicted over the inclusion of SHA3 - w3c/webcrypto#319 (comment). So I'm ambivalent on it as well. It does seem that YAGNI applies to both proposals here.
Description
This aligns the integrity field more closely with the web spec: https://www.w3.org/TR/sri-2/ Specifically, multiple hashes can now be provided.
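Before - only one integrity hash could be provided:
> repository_ctx.download("url", integrity="sha256-1234")
After - multiple integrity hashes can now be provided:
> repository_ctx.download("url", integrity="sha256-1234 sha384-1234 sha512-1234")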
Per the spec, only the strongest algorithm will be used. In the latter example, the sha512 hash would be used.
Motivation
Fixes #15758
Build API Changes
See #15758
Possibly? Some error messages are changed when there are bad integrity hashes.
Existing Bazel builds will still continue to work.
There is no migration plan.
Checklist
Release Notes
RELNOTES: The integrity field of repository_ctx.download() and download_and_extract() can now take multiple integrity hashes, separated by spaces.
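For anyone wanting to try the space-separated format, a value could be produced along these lines. This is only a sketch in plain Python; the helper name `integrity_for` is hypothetical and not part of Bazel or this pull request.

```python
import base64
import hashlib


def integrity_for(data, algorithms=("sha256", "sha384", "sha512")):
    """Return a space-separated integrity value with one entry per algorithm."""
    parts = []
    for algorithm in algorithms:
        digest = hashlib.new(algorithm, data).digest()
        # Each entry has the form "<algorithm>-<base64 of the raw digest>".
        parts.append(algorithm + "-" + base64.b64encode(digest).decode("ascii"))
    return " ".join(parts)
```

The resulting string could then be passed as the integrity argument, for example after reading a downloaded archive with `open(path, "rb").read()`.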