fix _flash_3_varlen_hub mask handling#14115
Conversation
sayakpaul
left a comment
There was a problem hiding this comment.
Thanks. How can we test it?
Seem there is no such test for different backend, should we write a Mixin for different attention backend, like |
|
Or maybe a standalone reproducer that passes on this PR branch and fails on |
can be verified in #14114 |
|
I guess what I am more worried about is whether this PR is breaking existing functionalities and other supported pipelines. |
|
Hi @zhtmike, thanks for the PR! It does not appear to link an issue it fixes. If this PR addresses an existing issue, please add a closing keyword (e.g. |
I agree with this. The attention backend lacks test coverage, so it is better to protect it with test cases. The current fix is for the I have also updated the PR content according to the GitHub Action review. Thanks. |
|
Let's introduce a test suite for this and add that to concerned models then. |
|
Sorry, I just realized that there is already an I have added an extra I have also tested with Thanks! |
What does this PR do?
As discussed in #13809, we fixed the mask handling in separate PR.
This PR is to fix the non-contiguous mask handling in #14114
Fixes # (issue)
Fix #14114
Before submitting
.ai/review-rules.md?documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
@sayakpaul
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.