Stars
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
[CVPR 2026] EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing
Seoul World Model: Grounding World Simulation Models in a Real-World Metropolis
[WACV 2026] Official implementation of "Edge-Aware Image Manipulation via Diffusion Models with a Novel Structure-Preservation Loss"
[CVPR 2024 Highlight] VBench - We Evaluate Video Generation
[CVPR 2026] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes.
nanoRLHF: from-scratch journey into how LLMs and RLHF really work.
Official repository for K-EXAONE built by LG AI Research
A GitHub repository discussing how, within self-play methodologies, to add diversity by taking the model's behavior into account.
LAVIS - A One-stop Library for Language-Vision Intelligence
[ArXiv 2025] DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
Official repo for paper "IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning"
GUNETR_pplus: Gradient enhanced UNETR_pplus with tumor segmentation
GUNETR_pplus: Gradient enhanced UNETR_pplus with MSD liver segmentation
GUNETR_pplus: Gradient enhanced UNETR_pplus with MSD liver segmentation
[NeurIPS 2025] Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Implementing DeepSeek R1's GRPO algorithm from scratch
A very simple GRPO implementation for reproducing R1-like LLM thinking.
[CVPR 2025] MatAnyone: Stable Video Matting with Consistent Memory Propagation
Official inference repo for FLUX.2 models
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
Light Image Video Generation Inference Framework
[CVPR 2026] High-Quality Text-to-Video Generation with Alpha Channel
Generative Omnimatte (CVPR 2025)