lsdefine/simple_GRPO (Python)
simple_GRPO
A very simple GRPO implementation for reproducing R1-like LLM thinking.
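For readers new to GRPO, the sketch below illustrates its central idea: each sampled answer's reward is normalized against the other answers sampled for the same prompt, so no separate value model is needed. This is an illustration only and is not taken from this repository; the function name `group_relative_advantages` and the `eps` parameter are hypothetical.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-4):
    """Normalize the rewards of one group of sampled answers.

    Hypothetical helper for illustration; not part of simple_GRPO.
    `rewards` holds one scalar reward per answer sampled for a single prompt.
    `eps` is an assumed small constant that avoids division by zero
    when every answer in the group receives the same reward.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population standard deviation of the group
    return [(r - mean) / (std + eps) for r in rewards]

# Four answers to one prompt: two rewarded, two not.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Answers that score above the group mean get a positive advantage and those below get a negative one; when every answer in a group scores the same, all advantages are zero and the group contributes no learning signal.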