Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · bytedance/Sa2VA

Sa2VA

Official Repo For Pixel-LLM Codebase

70.3/100
1.6K · Forks: 112

Similar Projects

cambrian

51

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python · 2.0K

JarvisArt

64

[NeurIPS' 2025] JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

Python · 772

4KAgent

56

[NeurIPS 2025] 4KAgent: Agentic Any Image to 4K Super-Resolution. An intelligent computer vision agent that can magically restore any image to perfect-4K!

Python · 753

MMMU

69

This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"

Python · 548