- Ytech Kwai
- Beijing, China
Stars
[ICML'25] EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling.
PyTorch implementation of CLIP Maximum Mean Discrepancy (CMMD) for evaluating image generation models.
This repo contains the code for a 1D tokenizer and generator.
[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.
The best OSS video generation models, created by Genmo
Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen
Official inference repo for FLUX.1 models
[CVPR2024] Official implementation of SplattingAvatar.
[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
[CVPR '24] DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars
[SIGGRAPH '23] NeRSemble: Neural Radiance Field Reconstruction of Human Heads
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models, arXiv 2023 / CVPR 2024
[CVPR 2024] The official repo for "GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians"
Official PyTorch implementation of the paper "Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation" (NeurIPS 2024)
[ICLR 2025 Oral] On Scaling Up 3D Gaussian Splatting Training
Decoupled Video Instance Segmentation Framework, an improved version of DVIS
Decoupled Video Instance Segmentation Framework
[CVPR'24 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
Official implementation for "ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing"
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"