Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · NVlabs/QeRL

QeRL

[ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.

50.2/100
506 · Forks: 51

Similar Projects

ART

91

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.6, GPT-OSS, Llama, and more!

Python · 10.0K

Skywork-R1V

55

Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning.

Python · 3.2K

Awesome-LLM-Post-training

47

Awesome Reasoning LLM Tutorial/Survey/Guide

Python · 2.4K

AgentFlow

52

AgentFlow: In-the-Flow Agentic System Optimization

Python · 1.9K