🚨Fix memory leaks caused by lru decorators in vision models #45922

Merged
yonigozlan merged 14 commits into huggingface:main from yonigozlan:fix-lru-memory-leaks
May 15, 2026

Conversation

@yonigozlan
Member

@yonigozlan yonigozlan commented May 12, 2026

The lru_cache decorators were keeping references to entire models, preventing memory from being freed when the models were deleted.

Fixes #45412

Breaking changes:

  • The text_embeds input in sam3/edgetam/sam3_lite_text is now expected to be the full text embeds, not just the pooler outputs, to be consistent with other models and the existing docs.
  • Renamed the attribute num_pos_feats to num_position_features in several models, in an effort to standardize attribute names across models.

@yonigozlan
Member Author

run-slow: beit, conditional_detr, d_fine, data2vec, deformable_detr, deimv2, detr, edgetam, edgetam_video, mask2former, maskformer, oneformer, pp_doclayout_v2, pp_doclayout_v3, rt_detr, rt_detr_v2, sam2, sam2_video, sam3, sam3_lite_text, sam3_tracker, sam3_tracker_video

@yonigozlan yonigozlan requested a review from molbap May 12, 2026 21:08
@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/beit", "models/conditional_detr", "models/d_fine", "models/data2vec", "models/deformable_detr", "models/deimv2", "models/detr", "models/edgetam", "models/edgetam_video", "models/mask2former", "models/maskformer", "models/oneformer", "models/pp_doclayout_v2", "models/pp_doclayout_v3", "models/rt_detr", "models/rt_detr_v2", "models/sam2", "models/sam2_video", "models/sam3", "models/sam3_lite_text", "models/sam3_tracker", "models/sam3_tracker_video"]
quantizations: []

@yonigozlan yonigozlan requested a review from vasqu May 12, 2026 21:08
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

Context Commit Description
RUN 024b266b workflow commit (merge commit)
PR 24b1cdde branch commit (from PR)
main 7ee56fc2 base commit (on main)

⚠️ Model CI failed to report results

The test failure analysis could not be completed. Please check the workflow run for details.

@yonigozlan
Member Author

run-slow: beit, conditional_detr, d_fine, data2vec, deformable_detr, deimv2, detr, edgetam, edgetam_video, mask2former, maskformer, oneformer, pp_doclayout_v2, pp_doclayout_v3, rt_detr, rt_detr_v2, sam2, sam2_video, sam3, sam3_lite_text, sam3_tracker, sam3_tracker_video

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/beit", "models/conditional_detr", "models/d_fine", "models/data2vec", "models/deformable_detr", "models/deimv2", "models/detr", "models/edgetam", "models/edgetam_video", "models/mask2former", "models/maskformer", "models/oneformer", "models/pp_doclayout_v2", "models/pp_doclayout_v3", "models/rt_detr", "models/rt_detr_v2", "models/sam2", "models/sam2_video", "models/sam3", "models/sam3_lite_text", "models/sam3_tracker", "models/sam3_tracker_video"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

Context Commit Description
RUN 14a5605c workflow commit (merge commit)
PR 344c4c3a branch commit (from PR)
main 7ee56fc2 base commit (on main)

⚠️ Model CI failed to report results

The test failure analysis could not be completed. Please check the workflow run for details.

Contributor

@vasqu vasqu left a comment


Just did a sanity check and went through a few files. I think this is fine, but I'd like your opinion on my comments; we currently have a few breaking spots, so we should be careful.



@compile_compatible_method_lru_cache(maxsize=10)
def _cached_generate_relative_position_index(window_size: tuple[int, int]) -> torch.Tensor:
Contributor


Just to be sure that I understood the root cause --> the lru cache saved a reference of self as it's a model's internal function?

Would it be possible to instead make these staticmethod? Might be a more natural move than all the private global fns

Anyways we do need to add 🚨 because it changes the behavior slightly and people might have relied on these internal functions

Member Author

@yonigozlan yonigozlan May 13, 2026


Just to be sure that I understood the root cause --> the lru cache saved a reference of self as it's a model's internal function?

Yes. lru_cache keys its cache on all arguments, including self, so the cache keeps a strong reference to the instance; as a general rule it's not good practice to apply it to instance methods. That was my mistake.

Would it be possible to instead make these staticmethod? Might be a more natural move than all the private global fns

That's a good point! I'll try to refactor with staticmethod instead, and make sure lru_cache doesn't cause issues when applied to static method (should be ok)
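
A minimal stdlib-only sketch of the root cause and the staticmethod fix (the class and method names below are illustrative, not the actual transformers code): lru_cache on an instance method stores `self` in its cache key, so the cache keeps the instance alive after deletion, while the staticmethod variant does not.

```python
import functools
import gc
import weakref

class LeakyModel:
    # lru_cache on an instance method: `self` becomes part of the cache key,
    # so the cache holds a strong reference to every instance it has seen
    @functools.lru_cache(maxsize=10)
    def embed(self, size):
        return list(range(size))

class SafeModel:
    # staticmethod variant: no `self` reaches the cache, so instances stay
    # collectable
    @staticmethod
    @functools.lru_cache(maxsize=10)
    def embed(size):
        return list(range(size))

leaky = LeakyModel()
leaky.embed(4)
leaky_ref = weakref.ref(leaky)
del leaky
gc.collect()
print(leaky_ref() is None)  # False: the lru cache still references the model

safe = SafeModel()
safe.embed(4)
safe_ref = weakref.ref(safe)
del safe
gc.collect()
print(safe_ref() is None)  # True: the model was freed
```

Calling `LeakyModel.embed.cache_clear()` would also release the reference, but moving the cache to a staticmethod (or a module-level function) avoids the problem entirely.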



@compile_compatible_method_lru_cache(maxsize=1)
def _cached_build_sine_position_embedding(*args, **kwargs) -> torch.Tensor:
Contributor


Ah ok, here we keep BC. We really need to decide what would be better, I guess, but it should be consistent then.

text_embeds = model.get_text_features(
input_ids=inputs_dict["input_ids"], attention_mask=inputs_dict["attention_mask"], return_dict=True
).pooler_output
)
Contributor


That is definitely breaking 👀 Any reason we could not keep the same behavior there?

Member Author

@yonigozlan yonigozlan May 13, 2026


I think it was already broken before this fix, since the docs were already describing the behavior this PR implements; it's a niche feature, which may be why it was never flagged. I'll add a 🚨
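
For illustration, a hedged sketch of the shape-level difference (the dummy output class below is a stand-in, not the real sam3 API): the pooler output is one vector per sequence, while the full text embeds keep the token dimension.

```python
# Stand-in for a text encoder output (illustrative only, not the sam3 API)
class DummyTextOutput:
    def __init__(self, batch=1, seq_len=5, hidden=8):
        # full token-level embeddings: (batch, seq_len, hidden)
        self.last_hidden_state = [
            [[0.0] * hidden for _ in range(seq_len)] for _ in range(batch)
        ]
        # pooled summary: (batch, hidden)
        self.pooler_output = [[0.0] * hidden for _ in range(batch)]

out = DummyTextOutput()
old_style = out.pooler_output      # what the tests used to pass as text_embeds
new_style = out.last_hidden_state  # what text_embeds now expects: full embeds
print(len(old_style[0]), len(new_style[0]))  # 8 5
```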

@yonigozlan
Member Author

Thanks for the review @vasqu! Indeed it might be better to use staticmethod; I'll make the changes and ping you when it's ready :)

@yonigozlan
Member Author

run-slow: beit, conditional_detr, d_fine, data2vec, deformable_detr, deimv2, detr, edgetam, edgetam_video, mask2former, maskformer, oneformer, pp_doclayout_v2, pp_doclayout_v3, rt_detr, rt_detr_v2, sam2, sam2_video, sam3, sam3_lite_text, sam3_tracker, sam3_tracker_video

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/beit", "models/conditional_detr", "models/d_fine", "models/data2vec", "models/deformable_detr", "models/deimv2", "models/detr", "models/edgetam", "models/edgetam_video", "models/mask2former", "models/maskformer", "models/oneformer", "models/pp_doclayout_v2", "models/pp_doclayout_v3", "models/rt_detr", "models/rt_detr_v2", "models/sam2", "models/sam2_video", "models/sam3", "models/sam3_lite_text", "models/sam3_tracker", "models/sam3_tracker_video"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

Context Commit Description
RUN 98f1299d workflow commit (merge commit)
PR 4c8136f3 branch commit (from PR)
main 98a25186 base commit (on main)

Model CI Report

29 new failed tests from this PR 😭

  • edgetam_video:
    tests/models/edgetam_video/test_modeling_edgetam_video.py::EdgeTamVideoModelIntegrationTest::test_inference_mask_generation_video_multi_objects_multi_points (✅ ⟹ ❌)
    tests/models/edgetam_video/test_modeling_edgetam_video.py::EdgeTamVideoModelIntegrationTest::test_inference_mask_generation_video_multi_points (✅ ⟹ ❌)
    tests/models/edgetam_video/test_modeling_edgetam_video.py::EdgeTamVideoModelIntegrationTest::test_inference_mask_generation_video_one_bb (✅ ⟹ ❌)
    tests/models/edgetam_video/test_modeling_edgetam_video.py::EdgeTamVideoModelIntegrationTest::test_inference_mask_generation_video_one_point (✅ ⟹ ❌)
    tests/models/edgetam_video/test_modeling_edgetam_video.py::EdgeTamVideoModelIntegrationTest::test_inference_mask_generation_video_one_point_one_bb (✅ ⟹ ❌)
    tests/models/edgetam_video/test_modeling_edgetam_video.py::EdgeTamVideoModelIntegrationTest::test_inference_mask_generation_video_one_point_propagate_in_video_directly (✅ ⟹ ❌)
    tests/models/edgetam_video/test_modeling_edgetam_video.py::EdgeTamVideoModelIntegrationTest::test_inference_propagate_on_streamed_video (✅ ⟹ ❌)
    tests/models/edgetam_video/test_modeling_edgetam_video.py::EdgeTamVideoModelIntegrationTest::test_inference_propagate_video_from_mask_input (✅ ⟹ ❌)
    tests/models/edgetam_video/test_modeling_edgetam_video.py::EdgeTamVideoModelIntegrationTest::test_inference_with_different_dtypes (✅ ⟹ ❌)

  • sam2_video:
    tests/models/sam2_video/test_modeling_sam2_video.py::Sam2VideoModelIntegrationTest::test_inference_mask_generation_video_batched_bb (✅ ⟹ ❌)
    tests/models/sam2_video/test_modeling_sam2_video.py::Sam2VideoModelIntegrationTest::test_inference_mask_generation_video_multi_objects_multi_points (✅ ⟹ ❌)
    tests/models/sam2_video/test_modeling_sam2_video.py::Sam2VideoModelIntegrationTest::test_inference_mask_generation_video_multi_points (✅ ⟹ ❌)
    tests/models/sam2_video/test_modeling_sam2_video.py::Sam2VideoModelIntegrationTest::test_inference_mask_generation_video_one_bb (✅ ⟹ ❌)
    tests/models/sam2_video/test_modeling_sam2_video.py::Sam2VideoModelIntegrationTest::test_inference_mask_generation_video_one_point (✅ ⟹ ❌)
    tests/models/sam2_video/test_modeling_sam2_video.py::Sam2VideoModelIntegrationTest::test_inference_mask_generation_video_one_point_one_bb (✅ ⟹ ❌)
    tests/models/sam2_video/test_modeling_sam2_video.py::Sam2VideoModelIntegrationTest::test_inference_mask_generation_video_one_point_propagate_in_video_directly (✅ ⟹ ❌)
    tests/models/sam2_video/test_modeling_sam2_video.py::Sam2VideoModelIntegrationTest::test_inference_propagate_on_streamed_video (✅ ⟹ ❌)
    tests/models/sam2_video/test_modeling_sam2_video.py::Sam2VideoModelIntegrationTest::test_inference_propagate_video_from_mask_input (✅ ⟹ ❌)
    tests/models/sam2_video/test_modeling_sam2_video.py::Sam2VideoModelIntegrationTest::test_inference_with_different_dtypes (✅ ⟹ ❌)

  • sam3_tracker_video:
    tests/models/sam3_tracker_video/test_modeling_sam3_tracker_video.py::Sam3TrackerVideoModelIntegrationTest::test_inference_mask_generation_video_batched_bb (✅ ⟹ ❌)
    tests/models/sam3_tracker_video/test_modeling_sam3_tracker_video.py::Sam3TrackerVideoModelIntegrationTest::test_inference_mask_generation_video_multi_objects_multi_points (✅ ⟹ ❌)
    tests/models/sam3_tracker_video/test_modeling_sam3_tracker_video.py::Sam3TrackerVideoModelIntegrationTest::test_inference_mask_generation_video_multi_points (✅ ⟹ ❌)
    tests/models/sam3_tracker_video/test_modeling_sam3_tracker_video.py::Sam3TrackerVideoModelIntegrationTest::test_inference_mask_generation_video_one_bb (✅ ⟹ ❌)
    tests/models/sam3_tracker_video/test_modeling_sam3_tracker_video.py::Sam3TrackerVideoModelIntegrationTest::test_inference_mask_generation_video_one_point (✅ ⟹ ❌)
    tests/models/sam3_tracker_video/test_modeling_sam3_tracker_video.py::Sam3TrackerVideoModelIntegrationTest::test_inference_mask_generation_video_one_point_one_bb (✅ ⟹ ❌)
    tests/models/sam3_tracker_video/test_modeling_sam3_tracker_video.py::Sam3TrackerVideoModelIntegrationTest::test_inference_mask_generation_video_one_point_propagate_in_video_directly (✅ ⟹ ❌)
    tests/models/sam3_tracker_video/test_modeling_sam3_tracker_video.py::Sam3TrackerVideoModelIntegrationTest::test_inference_propagate_on_streamed_video (✅ ⟹ ❌)
    tests/models/sam3_tracker_video/test_modeling_sam3_tracker_video.py::Sam3TrackerVideoModelIntegrationTest::test_inference_propagate_video_from_mask_input (✅ ⟹ ❌)
    tests/models/sam3_tracker_video/test_modeling_sam3_tracker_video.py::Sam3TrackerVideoModelIntegrationTest::test_inference_with_different_dtypes (✅ ⟹ ❌)

@yonigozlan
Member Author

run-slow: beit, conditional_detr, d_fine, data2vec, deformable_detr, deimv2, detr, edgetam, edgetam_video, mask2former, maskformer, oneformer, pp_doclayout_v2, pp_doclayout_v3, rt_detr, rt_detr_v2, sam2, sam2_video, sam3, sam3_lite_text, sam3_tracker, sam3_tracker_video

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/beit", "models/conditional_detr", "models/d_fine", "models/data2vec", "models/deformable_detr", "models/deimv2", "models/detr", "models/edgetam", "models/edgetam_video", "models/mask2former", "models/maskformer", "models/oneformer", "models/pp_doclayout_v2", "models/pp_doclayout_v3", "models/rt_detr", "models/rt_detr_v2", "models/sam2", "models/sam2_video", "models/sam3", "models/sam3_lite_text", "models/sam3_tracker", "models/sam3_tracker_video"]
quantizations: []

@yonigozlan yonigozlan changed the title Fix memory leaks caused by lru decorators in vision models 🚨Fix memory leaks caused by lru decorators in vision models May 15, 2026
@yonigozlan
Member Author

Waiting to see if all slow tests pass, but it should be good to go then @vasqu!

@vasqu
Contributor

vasqu commented May 15, 2026

Sounds good, keeping an eye here to review later today 🫡

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

Context Commit Description
RUN 7fbd5dec workflow commit (merge commit)
PR c307ba63 branch commit (from PR)
main 331f1007 base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

Contributor

@vasqu vasqu left a comment


Just a few smaller comments 🤗 Can you also add the potentially breaking stuff to the PR description, e.g. the sam3 pooler behavior and the small argument renames (pos features)?

Comment thread src/transformers/models/conditional_detr/modular_conditional_detr.py Outdated
Comment on lines +743 to +746
@staticmethod
@compile_compatible_method_lru_cache(maxsize=32)
def _cached_build_2d_sinusoidal_position_embedding(*args, **kwargs) -> torch.Tensor:
return build_2d_sinusoidal_position_embedding(*args, **kwargs)
Contributor


this wrapper structure is a bit awkward, could we move things here a bit to make it directly use build_2d_sinusoidal_position_embedding or is the inheritance too awkward?

Member Author


Not sure we can do that and still have it standardized everywhere it's used, because some models call build_2d_sinusoidal_position_embedding at init and copy the result into their weights, so they don't need to lru_cache it at all.
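
A minimal sketch of the wrapper pattern under discussion (the class name is hypothetical and the builder is a toy stand-in for the real one, which returns a torch.Tensor): forward-time callers go through the cached staticmethod, while init-time callers use the plain function directly and leave no cache entry behind.

```python
import functools

calls = 0

def build_2d_sinusoidal_position_embedding(height, width, embed_dim):
    # toy stand-in for the real builder; a counter shows which calls
    # actually run it
    global calls
    calls += 1
    return [[0.0] * embed_dim for _ in range(height * width)]

class DetrLikeModule:  # hypothetical name, not the actual class
    # forward-time callers go through the cached staticmethod wrapper
    @staticmethod
    @functools.lru_cache(maxsize=32)
    def _cached_build_2d_sinusoidal_position_embedding(height, width, embed_dim):
        return build_2d_sinusoidal_position_embedding(height, width, embed_dim)

# init-time users call the plain function and copy the result into their
# weights, so nothing is retained in the cache:
init_table = build_2d_sinusoidal_position_embedding(4, 4, 8)

m = DetrLikeModule()
m._cached_build_2d_sinusoidal_position_embedding(4, 4, 8)
m._cached_build_2d_sinusoidal_position_embedding(4, 4, 8)  # cache hit
print(calls)  # 2: one init call plus one cached-path miss
```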

self.num_feature_levels = 3
self.position_embedder = Mask2FormerSinePositionEmbedding(num_pos_feats=hidden_dim // 2, normalize=True)
self.position_embedder = Mask2FormerSinePositionEmbedding(
num_position_features=hidden_dim // 2, normalize=True
Contributor


this is slightly breaking because of the rename
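
A hedged sketch of why the rename breaks callers (the class below is a simplified stand-in, not the real Mask2FormerSinePositionEmbedding): any code still passing the old keyword now fails at construction.

```python
class SinePositionEmbedding:  # simplified stand-in for the renamed module
    def __init__(self, num_position_features=64, normalize=False):
        self.num_position_features = num_position_features
        self.normalize = normalize

# callers that used the old keyword now raise a TypeError:
try:
    SinePositionEmbedding(num_pos_feats=128)
    broke = False
except TypeError:
    broke = True
print(broke)  # True

# the new keyword works as before, just under the standardized name:
emb = SinePositionEmbedding(num_position_features=128)
print(emb.num_position_features)  # 128
```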

Comment thread src/transformers/models/maskformer/modular_maskformer.py Outdated
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: beit, conditional_detr, d_fine, dab_detr, data2vec, deformable_detr, deimv2, detr, edgetam, edgetam_video, got_ocr2, mask2former, maskformer, oneformer, pp_doclayout_v2, pp_doclayout_v3

@yonigozlan yonigozlan enabled auto-merge May 15, 2026 19:53
@yonigozlan yonigozlan added this pull request to the merge queue May 15, 2026
Merged via the queue into huggingface:main with commit 3ef2781 May 15, 2026
92 of 93 checks passed
@yonigozlan yonigozlan deleted the fix-lru-memory-leaks branch May 15, 2026 20:21


Development

Successfully merging this pull request may close these issues.

RT-DETR models do not release memory when deleted / garbage-collected

4 participants