⚠

Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.

Python · sail-sg/oat

oat

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

60.1/100

★ 667 · Forks: 63

Similar Projects

OpenJudge

OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards

Python · ★ 745

LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python · ★ 73.5K

MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, with support for incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.

Python · ★ 5.7K

alignment-handbook

Robust recipes to align language models with human and AI preferences

Python · ★ 5.6K
