Pull requests: huggingface/trl

Pull requests list

Add explicit response_schema override for GRPO
#5833 opened May 25, 2026 by haimianxing
5 of 6 tasks
feat(grpo): tool response/observation training
#5832 opened May 25, 2026 by LeonEricsson Collaborator Draft
1 of 8 tasks
[WIP] Add support for Audio
#5830 opened May 25, 2026 by qgallouedec Member
8 tasks
feat(sft): configurable chunked-NLL chunk size
#5829 opened May 24, 2026 by mrs83
4 of 8 tasks
[feat] add DAPO prompt-level avg
#5828 opened May 24, 2026 by DagaBhai Contributor
3 of 8 tasks
Fix NaN loss when completions are fully truncated
#5826 opened May 24, 2026 by matdou Contributor
4 of 8 tasks
Add Entropy Adaptive Fine Tuning
#5823 opened May 23, 2026 by electroglyph
3 of 6 tasks
MADPO
#5804 opened May 21, 2026 by qgallouedec Member Draft
Make trl vllm-serve OpenAI-compatible (exploratory)
#5803 opened May 21, 2026 by qgallouedec Member
Add trust_remote_code to trainer configs
#5802 opened May 20, 2026 by qgallouedec Member
docs: document expandable_segments allocator config for memory tuning
#5794 opened May 20, 2026 by akshansh47
4 of 8 tasks
[WIP] New parsing approach
#5791 opened May 19, 2026 by qgallouedec Member
8 tasks
Add compute_metrics support to GRPOTrainer
#5790 opened May 19, 2026 by JulesRoussel2001
5 of 8 tasks
[GKD] Use vLLM for student generation
#5782 opened May 18, 2026 by roycho96 Contributor
4 of 8 tasks
Continuous Batching support for AsyncGRPO
#5781 opened May 16, 2026 by qgallouedec Member Draft
8 tasks
use eager attn for test_train_vlm_multi_image as a WA
#5774 opened May 15, 2026 by kaixuanliu Contributor
cleanup xpu cache memory after each test
#5771 opened May 15, 2026 by kaixuanliu Contributor
Memory-efficient PEFT/LoRA vLLM weight sync under DeepSpeed ZeRO-3
#5766 opened May 13, 2026 by rak96
7 of 14 tasks
docs: set max_completion_length=1024 in GRPO quickstart examples
#5759 opened May 13, 2026 by dhruvnigam93
5 of 8 tasks
Tighten old_per_token_logps recomputation check in GRPO
#5757 opened May 12, 2026 by wengeezhang
5 of 8 tasks
async_grpo: don't return on queue.Empty
#5751 opened May 12, 2026 by AmineDiro Member