Skip to content

fix _flash_3_varlen_hub mask handling#14115

Open
zhtmike wants to merge 2 commits into
huggingface:mainfrom
zhtmike:fix_fa3_mask
Open

fix _flash_3_varlen_hub mask handling#14115
zhtmike wants to merge 2 commits into
huggingface:mainfrom
zhtmike:fix_fa3_mask

Conversation

@zhtmike

@zhtmike zhtmike commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

As discussed in #13809, we fixed the mask handling in separate PR.

This PR is to fix the non-contiguous mask handling in #14114

Fixes # (issue)
Fix #14114

Before submitting

Who can review?

@sayakpaul

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@github-actions github-actions Bot added models size/S PR with diff < 50 LOC labels Jul 3, 2026

@sayakpaul sayakpaul left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. How can we test it?

@zhtmike

zhtmike commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

Thanks. How can we test it?

Seem there is no such test for different backend, should we write a Mixin for different attention backend, like AttentionBackendTesterMixin?

@sayakpaul

Copy link
Copy Markdown
Member

Or maybe a standalone reproducer that passes on this PR branch and fails on main?

@zhtmike

zhtmike commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

Or maybe a standalone reproducer that passes on this PR branch and fails on main?

can be verified in #14114

@sayakpaul

Copy link
Copy Markdown
Member

I guess what I am more worried about is whether this PR is breaking existing functionalities and other supported pipelines.

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Hi @zhtmike, thanks for the PR! It does not appear to link an issue it fixes. If this PR addresses an existing issue, please add a closing keyword (e.g. Fixes #1234) to the PR description so the issue is linked. See the contribution guide for more details. If this PR intentionally does not fix a tracked issue, a maintainer can add the no-issue-needed label to silence this reminder.

@zhtmike

zhtmike commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

I guess what I am more worried about is whether this PR is breaking existing functionalities and other supported pipelines.

I agree with this. The attention backend lacks test coverage, so it is better to protect it with test cases.

The current fix is for the _flash_3_varlen_hub backend only. It can now correctly handle non-mask, contiguous mask, and non-contiguous mask cases. In my view, it should be safe.

I have also updated the PR content according to the GitHub Action review. Thanks.

@sayakpaul

Copy link
Copy Markdown
Member

Let's introduce a test suite for this and add that to concerned models then.

@github-actions github-actions Bot added tests size/M PR with diff < 200 LOC fixes-issue and removed size/S PR with diff < 50 LOC labels Jul 3, 2026
@zhtmike

zhtmike commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

Sorry, I just realized that there is already an AttentionBackendTesterMixin in the latest main branch. We don’t need to introduce a new suite.

I have added an extra TestQwenImageTransformerAttentionBackend to guard against the error. The local test has passed. (The main branch will this test will fail)

I have also tested with TestFluxTransformerAttentionBackend, it also passed.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

_flash_3_varlen_hub backend cannot handle non-contiguous mask

2 participants