fix: prevent partial secret leak in sensitive_data redaction when secrets share substrings (#4609)#4628
Conversation
…rets share substrings (browser-use#4609)
|
TRIX Bot seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
There was a problem hiding this comment.
Pull request overview
Fixes a sensitive-data redaction edge case in the message manager where replacing shorter secrets before longer ones could leave fragments of longer secrets unredacted (when secrets share substrings).
Changes:
- Sort
sensitive_valuesby descending value length before performing replacements inMessageManager._filter_sensitive_data.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Sort by length descending to avoid partial leaks when one secret is a substring of another | ||
| for key, val in sorted(sensitive_values.items(), key=lambda x: len(x[1]), reverse=True): | ||
| value = value.replace(val, f'<secret>{key}</secret>') |
There was a problem hiding this comment.
Regression coverage: this change fixes substring-overlap leaks by sorting sensitive values by length, but there’s no test asserting that when one secret is a substring of another the longer secret is fully redacted (and no suffix fragment remains). Please add a test case for this scenario (e.g., short='abc', long='abc-SECRET') so the bug can’t regress.
| # Sort by length descending to avoid partial leaks when one secret is a substring of another | ||
| for key, val in sorted(sensitive_values.items(), key=lambda x: len(x[1]), reverse=True): | ||
| value = value.replace(val, f'<secret>{key}</secret>') |
There was a problem hiding this comment.
This PR fixes substring-order leaks in MessageManager._filter_sensitive_data, but the same replacement logic still exists in browser_use/agent/views.py (AgentHistory._filter_sensitive_data_from_string) without length-sorting and is used by AgentHistory.save_to_file(sensitive_data=...). That path can still partially leak secrets when values share substrings; consider applying the same ordering (or extracting a shared helper) so the fix is consistent across redaction entry points.
|
Thanks for the PR! Closing this in favor of #4660, which covers both redaction sites (this one fixes message_manager but misses the history serialization path in agent/views.py where secrets can also leak). |
Summary
The sensitive_data redaction in _filter_sensitive_data iterates over secrets in arbitrary order. When one secret is a substring of another (e.g. \�pi_key=abc123xyz\ and \key=abc123), the shorter secret may be replaced first, leaving a fragment of the longer secret in place.
Example:
Fix: Sort secrets by length descending (longest first) before applying replacements.
Files Changed
Closes #4609
Summary by cubic
Fix partial secret leaks in sensitive data redaction when secrets share substrings. Secret values are now sorted by length (longest first) before replacement to ensure longer secrets are fully masked.
Written for commit 78a93eb. Summary will update on new commits.