- University of Science and Technology of China
- Shen Zhen
Pinned
-
LLaMA-Factory
Public, forked from Daniel-bupt/LLaMA-Factory
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Python 1
-
verl-project/verl
Public
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
-
modelscope/ms-swift
Public
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …
-
OpenRLHF/OpenRLHF
Public
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)