⚠ Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.

Python · NVlabs/QeRL

QeRL

[ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.

45.8/100

★ 511 · Forks: 51


Similar Projects

ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.6, GPT-OSS, Llama, and more!

Python · ★ 10.5K

Awesome-LLM-Post-training

Awesome Reasoning LLM Tutorial/Survey/Guide

Python · ★ 2.5K

AgentFlow

AgentFlow: In-the-Flow Agentic System Optimization

Python · ★ 2.0K

auto-round

A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support and full compatibility with vLLM, SGLang, and Transformers.

Python · ★ 1.5K
