Pull requests: huggingface/trl
Pull requests list
Add explicit response_schema override for GRPO
#5833
opened May 25, 2026 by
haimianxing
5 of 6 tasks
feat(grpo): tool response/observation training
#5832
opened May 25, 2026 by
LeonEricsson
Collaborator
•
Draft
1 of 8 tasks
feat(sft): configurable chunked-NLL chunk size
#5829
opened May 24, 2026 by
mrs83
4 of 8 tasks
[feat] add Dapo prompt level avg
#5828
opened May 24, 2026 by
DagaBhai
Contributor
3 of 8 tasks
Fix NaN loss when completions are fully truncated
#5826
opened May 24, 2026 by
matdou
Contributor
4 of 8 tasks
Align KTO with DPO: Simplify metrics from sum/count to direct averages
#5820
opened May 22, 2026 by
albertvillanova
Member
Add SFT + reward-model recipe + diagnostics for PPO TL;DR example
#5813
opened May 22, 2026 by
kohsheen1234
5 of 6 tasks
Warn when GRPOTrainer use_liger_kernel masks LoRA adapters on lm_head
#5808
opened May 21, 2026 by
adityasingh2400
5 tasks done
Make trl vllm-serve OpenAI-compatible (exploratory)
#5803
opened May 21, 2026 by
qgallouedec
Member
docs: document expandable_segments allocator config for memory tuning
#5794
opened May 20, 2026 by
akshansh47
4 of 8 tasks
Add compute_metrics support to GRPOTrainer
#5790
opened May 19, 2026 by
JulesRoussel2001
5 of 8 tasks
[GKD] Use vLLM for student generation
#5782
opened May 18, 2026 by
roycho96
Contributor
4 of 8 tasks
Continuous Batching support for AsyncGRPO
#5781
opened May 16, 2026 by
qgallouedec
Member
•
Draft
8 tasks
use eager attn for test_train_vlm_multi_image as a WA
#5774
opened May 15, 2026 by
kaixuanliu
Contributor
cleanup xpu cache memory after each test
#5771
opened May 15, 2026 by
kaixuanliu
Contributor
Memory-efficient PEFT/LoRA vLLM weight sync under DeepSpeed ZeRO-3
#5766
opened May 13, 2026 by
rak96
7 of 14 tasks
feat(grpo): replace deprecated use_transformers_paged with transformers continuous batching
#5765
opened May 13, 2026 by
sergiopaniego
Member
4 of 8 tasks
docs: set max_completion_length=1024 in GRPO quickstart examples
#5759
opened May 13, 2026 by
dhruvnigam93
5 of 8 tasks
Tighten old_per_token_logps recomputation check in GRPO
#5757
opened May 12, 2026 by
wengeezhang
5 of 8 tasks