
Allow multiple integrity hashes on the integrity field #29192

Open
willstranton wants to merge 1 commit into bazelbuild:master from willstranton:integrity-multiple

Conversation

@willstranton
Contributor

Description

This aligns the integrity field more closely with the Subresource Integrity web spec (https://www.w3.org/TR/sri-2/). Specifically, multiple hashes can now be provided.

Before - only one integrity hash could be provided:

repository_ctx.download("url", integrity="sha256-1234")

After - multiple integrity hashes can now be provided:

repository_ctx.download("url", integrity="sha256-1234 sha384-1234 sha512-1234")

Per the spec, only the strongest algorithm will be used. In the latter example, the sha512 hash would be used.
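
The selection rule described above can be sketched in Python as follows. This is only an illustration of the spec's "strongest algorithm wins" behavior, not the code in this pull request: the function names, the priority table, and the choice to raise an error when no supported hash is present are all assumptions made for the example.

```python
import base64
import hashlib

# Assumed strength ordering for the SHA-2 algorithms: higher is stronger.
_PRIORITY = {"sha256": 0, "sha384": 1, "sha512": 2}


def parse_integrity(integrity):
    """Split a space-separated integrity string into (algorithm, digest) pairs."""
    entries = []
    for token in integrity.split():
        algorithm, sep, encoded = token.partition("-")
        if not sep or algorithm not in _PRIORITY:
            continue  # Skip malformed or unsupported entries in this sketch.
        encoded = encoded.split("?", 1)[0]  # Drop any trailing "?options" part.
        try:
            entries.append((algorithm, base64.b64decode(encoded)))
        except ValueError:
            continue  # Skip entries whose base64 payload is invalid.
    return entries


def strongest_entries(entries):
    """Keep only the entries that use the strongest algorithm present."""
    if not entries:
        return []
    best = max(_PRIORITY[algorithm] for algorithm, _ in entries)
    return [(a, d) for a, d in entries if _PRIORITY[a] == best]


def matches(data, integrity):
    """Return True if data matches any entry of the strongest algorithm."""
    candidates = strongest_entries(parse_integrity(integrity))
    if not candidates:
        # Assumption: a build tool would rather fail than skip verification.
        raise ValueError("no supported integrity hash found")
    return any(hashlib.new(a, data).digest() == d for a, d in candidates)
```

With this sketch, a wrong sha256 entry is never consulted when a sha512 entry is present, because only the strongest algorithm in the list is checked.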

Motivation

Fixes #15758

Build API Changes

  1. Has this been discussed in a design doc or issue? (Please link it)

See #15758

  2. Is the change backward compatible?

Possibly? Some error messages for bad integrity hashes have changed, but existing Bazel builds will continue to work.

  3. If it's a breaking change, what is the migration plan?

There is no migration plan.

Checklist

  • I have added tests for the new use cases (if any).
  • I have updated the documentation (if applicable).

Release Notes

RELNOTES: The integrity field of repository_ctx.download() and download_and_extract() can now take multiple integrity hashes, separated by spaces.
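
For anyone wanting to produce such a space-separated value, here is a minimal Python sketch. The helper name `integrity_for` and its default algorithm list are hypothetical and not part of Bazel; the sketch only assumes the `<algorithm>-<base64 digest>` format shown in the examples above.

```python
import base64
import hashlib


def integrity_for(data, algorithms=("sha256", "sha384", "sha512")):
    """Build a space-separated integrity string for the given bytes."""
    parts = []
    for algorithm in algorithms:
        digest = hashlib.new(algorithm, data).digest()
        # Each entry is "<algorithm>-<base64 of the raw digest>".
        parts.append(algorithm + "-" + base64.b64encode(digest).decode("ascii"))
    return " ".join(parts)
```

The resulting string could then be passed as the `integrity` argument, for example after reading the archive bytes from disk.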

@willstranton willstranton marked this pull request as ready for review April 2, 2026 03:08
@github-actions github-actions Bot added team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. awaiting-review PR is awaiting review from an assigned reviewer labels Apr 2, 2026
@meteorcloudy meteorcloudy requested a review from Wyverald April 8, 2026 15:20
@meteorcloudy
Member

I don't quite understand: why couldn't we only specify the hash with the strongest algorithm?

@willstranton
Contributor Author

> why couldn't we only specify the hash with the strongest algorithm?

I'll quote what the spec says at https://www.w3.org/TR/sri-2/#agility:

> Multiple sets of integrity metadata may be associated with a single resource in order to provide agility in the face of future cryptographic discoveries....
> ...
> When a hash function is determined to be insecure, user agents SHOULD deprecate and eventually remove support for integrity validation using the insecure hash function. User agents MAY check the validity of responses using a digest based on a deprecated function.
>
> To allow authors to switch to stronger hash functions without being held back by older user agents, validation using unsupported hash functions acts like no integrity value was provided (see the § 3.3.4 Do bytes match metadataList? algorithm below). Authors are encouraged to use strong hash functions, and to begin migrating to stronger hash functions as they become available.

I think the spec authors believe there will be a day when sha256 will not be sufficient and Bazel will want to migrate to another hash algorithm. It made me think about how git will be moving from SHA-1 to SHA-256 in git 3.0:

> The default hash function for new repositories will be changed from "sha1" to "sha256". SHA-1 has been deprecated by NIST in 2011 and is nowadays recommended against in FIPS 140-2 and similar certifications. Furthermore, there are practical attacks on SHA-1 that weaken its cryptographic properties:...

@willstranton
Contributor Author

> why couldn't we only specify the hash with the strongest algorithm?

Sorry, I don't think I answered the question directly, and instead just copy-pasted the spec. Yes, you're right that it would make sense to only set the strongest algorithm. Ideally, that's what would happen, but in reality, I think it's hard to make a swift transition from one algorithm to another. I think that is why the spec writers included this future-proofing provision.

Here's an imaginary scenario that the spec helps future-proof against:

In 20XX, the SHA2 family (SHA256, SHA384, SHA512) is deprecated, broken, or not in compliance with a newly written FIPS-140-5 standard. Instead, a SHA3 family hash needs to be used (SHA3-256, SHAKE-256). Bazel 14 doesn't support any SHA3 family hashes, but the upcoming Bazel 15 does. However, migrating to Bazel 15 is complicated because Bzlmod is being deprecated for Bzlmod2. Migration will be a year-long effort - in the meantime, a majority of the team will need to continue working with Bazel 14, while a small contingent starts using Bazel 15 to ship a niche government project that requires FIPS-140-5 compliance. Being able to set both a SHA256 (SHA2 family) and a SHAKE-256 (SHA3 family) integrity hash allows both groups to continue working while they navigate the Bazel 14 -> Bazel 15 transition.

Is this a fantastical scenario? Perhaps 😛. Regardless, this helps Bazel get closer to the integrity spec, which I think is still a good thing.

@fmeum
Collaborator

fmeum commented Apr 15, 2026

I don't think this argument applies to Bazel as much as it does to web browsers. Websites may need to support decade-old browsers that don't receive updates, but for Bazel, we only support the three latest major versions. If SHA-256 is broken, all rulesets can just migrate to SHA-512 without any backwards compatibility issues. If the entire current SHA family is broken, we would only have to add support for a new hash function to a couple of Bazel major versions via patch releases so that rulesets can adopt it without fear of breaking existing users.

Do you see any (even unlikely) scenario in which the ability to add multiple hashes would actually help? I am worried that we would just cause churn as well-meaning ruleset authors would probably feel incentivized to add a bunch of hashes just to be "on the safe side".

@willstranton
Contributor Author

> Do you see any (even unlikely) scenario in which the ability to add multiple hashes would actually help?

Isn't the scenario I added above what you are asking for? Or maybe you are discounting that scenario because the time window for a new hash algorithm to be backported to a patch release is much smaller (~2 months) than what browsers have to support (multiple years), AND you are assuming that getting everyone to upgrade to the latest major/minor/patch release will be quick too.

But I don't think it's the case that everyone upgrades quickly, and because of that, rule authors are left with the extra burden of determining when they can safely use a new hash algorithm.

Going back to that example, suppose SHA3 family hashes (SHA3-256/SHAKE-256) are added in Bazel 16, and are backported to Bazel 15.4 and Bazel 14.7 (the three latest major versions). When can a rule author start switching over to using the new SHA3 family? With this pull request, assuming it gets into Bazel 10, they can start adding the new hashes whenever they want (they can supply the existing SHA256 hash and the newly introduced SHA3-256). Without it, they risk breaking users of their ruleset who are still on earlier Bazel versions, like 15.3. Is it safe to say that they'd wait another year after support was added before adding the new SHA3 hashes? I'm not a rule author, so that would be my general assumption.
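
To make the scenario concrete, here is a rough Python sketch of two clients with different supported algorithm sets checking the same integrity string. Everything in it is hypothetical: the `sha3-256` token label is not defined by the SRI spec or by Bazel, the client tables and function names are invented for illustration, and skipping unknown entries follows the spec text quoted earlier rather than anything confirmed about this pull request's implementation.

```python
import base64
import hashlib

# (token label, hashlib name) pairs, ordered weakest to strongest.
# The "sha3-256" label is a hypothetical spelling used only for this sketch.
OLD_CLIENT = [("sha256", "sha256")]
NEW_CLIENT = [("sha256", "sha256"), ("sha3-256", "sha3_256")]


def entry(label, hashlib_name, data):
    """Format one integrity entry as "<label>-<base64 digest>"."""
    digest = hashlib.new(hashlib_name, data).digest()
    return label + "-" + base64.b64encode(digest).decode("ascii")


def verify(data, integrity, supported):
    """Check data against the strongest entry this client understands."""
    best_rank = -1
    results = []
    for token in integrity.split():
        for rank, (label, hashlib_name) in enumerate(supported):
            if not token.startswith(label + "-"):
                continue  # Entry uses a different (possibly unknown) algorithm.
            ok = token == entry(label, hashlib_name, data)
            if rank > best_rank:
                best_rank, results = rank, [ok]  # Stronger algorithm found.
            elif rank == best_rank:
                results.append(ok)
    if best_rank < 0:
        raise ValueError("no supported integrity hash in: " + integrity)
    return any(results)
```

In this sketch, an older client checks the sha256 entry and ignores the one it does not recognize, while a newer client checks only the sha3-256 entry, so one integrity string serves both.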

> I am worried that we would just cause churn as well-meaning ruleset authors would probably feel incentivized to add a bunch of hashes just to be "on the safe side"

This is hard to quantify. This is an opt-in, future-proofing security measure rather than one that is forced upon rule authors. So I want to point out that this would be self-imposed churn, as opposed to a scenario where Bazel forces the community to adopt a new standard (e.g., drops support for SHA256 integrity hashes).

@fmeum
Collaborator

fmeum commented Apr 15, 2026

> Going back to that example, suppose SHA3 family hashes (SHA3-256/SHAKE-256) are added in Bazel 16, and are backported to Bazel 15.4 and Bazel 14.7 (the three latest major versions). When can a rule author start switching over to using the new SHA3 family?

If SHA3 is added to Bazel 16 because folks are increasingly worried about the security guarantees of SHA2 (similar to what happened with SHA1), we will have years to let users migrate. Keep in mind that SHA1 is still considered to be resistant to preimage attacks (not collision attacks) today, which is what Bazel's use case relies on. Minor releases are expected to be fully compatible, so rulesets can assume that users update much more readily than they would to a new major version. Rulesets could also leverage bazel_features to gate the new algorithm if desired.

If SHA3 is added to Bazel 16 because preimage attacks on SHA2 have become possible essentially overnight, the world will be in big trouble and integrity hashes on Bazel source archives will probably be far down the priority list. But even in that case we could still immediately cut a Bazel minor release with just support for SHA3 for each major release, and all rulesets could require that version immediately as well.

What do you think of just adding support for SHA3, but not for multiple hashes? This would allow ruleset authors to adopt the new function soon without introducing new complexity or practices.

@willstranton
Contributor Author

I don't have any further arguments to add. It's well within the Bazel maintainers' prerogative to reject a pull request if they don't believe the value added is commensurate with the added code/complexity. Though if that's the case, please remove the help wanted and p2 labels on the original issue or just close the issue.

I'll add in the original issue filer @uhthomas in case they have any additional comments/concerns.

> What do you think of just adding support for SHA3, but not for multiple hashes?

In this discussion, we've been assuming SHA3 will be the next family of hash algorithms to use if SHA2 fails, but even that is in dispute. E.g., on the web side of things, they are conflicted over the inclusion of SHA3 - w3c/webcrypto#319 (comment). So I'm ambivalent on it as well.

It does seem that YAGNI applies to both proposals here.



Development

Successfully merging this pull request may close these issues.

Subresource Integrity should accept multiple checksums

3 participants