
[RL] make sync weights conditional#4925

Draft
Datta0 wants to merge 3 commits into unslothai:main from Datta0:guard-sync-weights

Conversation

Collaborator

@Datta0 Datta0 commented Apr 8, 2026

When using shared weights, we were removing the sync_weights / reload_weights calls unconditionally.
This PR makes them conditional, so that GRPOTrainer can also be used with models that do not share weights.
When we detect weight sharing, we skip the sync/reload and use a LoRARequest instead.
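The guard described above might look roughly like the following. This is a hypothetical sketch, not unsloth's actual code: `shared_weights`, `sync_weights`, and `LoRARequest` are the names mentioned in the PR, but the surrounding classes and the adapter path are stand-ins for illustration.

```python
# Sketch of the conditional sync guard (assumed names, not unsloth's API).

class LoRARequest:
    """Stand-in for vLLM's LoRARequest container."""
    def __init__(self, name, lora_int_id, path):
        self.name = name
        self.lora_int_id = lora_int_id
        self.path = path

class Engine:
    """Minimal fake engine tracking how often weights were synced."""
    def __init__(self, shared_weights):
        self.shared_weights = shared_weights
        self.sync_count = 0

    def sync_weights(self):
        self.sync_count += 1

def maybe_sync(engine):
    """Skip sync/reload when weights are shared; hand back a LoRARequest instead."""
    if getattr(engine, "shared_weights", False):
        # Weights are shared with the inference engine: no copy needed,
        # generation should use the LoRA adapter directly.
        return LoRARequest("adapter", 1, "/tmp/adapter")
    engine.sync_weights()  # non-shared case: push trained weights as before
    return None

shared = Engine(shared_weights=True)
req = maybe_sync(shared)    # no sync performed; a LoRARequest is returned
plain = Engine(shared_weights=False)
maybe_sync(plain)           # weights are synced once
```

With this shape, the non-shared path behaves exactly as before the change, while the shared path avoids the redundant weight copy.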

Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a shared_weights attribute to the vLLM engine across Llama and Vision models to manage weight synchronization and reloading more granularly. Instead of unconditionally removing these calls, the code now uses conditional guards to skip them only when weights are shared. Additionally, it ensures lora_request is correctly passed during generation when weights are shared. The review feedback identifies several instances of redundant logic where multiple checks are used to verify the same state, suggesting simplifications to improve code clarity and consistency.
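The "redundant logic" the review points at is the common pattern of re-deriving the same state at several points in one function. A hypothetical before/after, using the `shared_weights` / `lora_request` names from this PR (the engine object and return values are invented for illustration):

```python
# Illustration of collapsing repeated shared-weights checks (assumed names).
from types import SimpleNamespace

def generate_before(engine, prompts, lora_request=None):
    # Redundant: the same condition is evaluated twice.
    if getattr(engine, "shared_weights", False) and lora_request is None:
        lora_request = engine.default_lora_request
    if getattr(engine, "shared_weights", False):
        return ("generate", prompts, lora_request)
    return ("generate", prompts, None)

def generate_after(engine, prompts, lora_request=None):
    # Simplified: compute the condition once, branch on the cached value.
    shared = getattr(engine, "shared_weights", False)
    if shared and lora_request is None:
        lora_request = engine.default_lora_request
    return ("generate", prompts, lora_request if shared else None)

engine = SimpleNamespace(shared_weights=True, default_lora_request="adapter-1")
# Both variants agree; the second states the condition only once.
```

Checking the condition once also makes it harder for the two branches to drift apart in later edits, which is presumably the consistency concern behind the review feedback.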

Comment thread unsloth/models/rl.py Outdated
Comment thread unsloth/models/rl_replacements.py Outdated
Comment thread unsloth/models/rl_replacements.py Outdated
Comment thread unsloth/models/rl_replacements.py
Comment thread unsloth/models/rl_replacements.py
