- Shanghai AI Laboratory
- China
- https://wyhsirius.github.io
- @yaohuiwang_yh
Stars
[CVPR 2025] The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
[ICLR 2026] Official repository of "InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models".
[ICLR 2026] Video-GPT via Next Clip Diffusion.
[CVPR 2025] Consistent and Controllable Image Animation with Motion Diffusion Models
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion (CVPR 2025)
A Next-Generation Training Engine Built for Ultra-Large MoE Models
[TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation.
[CVPR 2024] VideoBooth: Diffusion-based Video Generation with Image Prompts
[CVPR 2024 Highlight] VBench - We Evaluate Video Generation
[IJCV 2024] LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
[ICCV 2023] Latent Action Composition for Skeleton-based Action Segmentation
Training-Free Condition-Guided Text-to-Video Generation
An open-source tool-augmented conversational language model from Fudan University
Official PyTorch implementation of LongVideoGAN
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch
Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch
[CVPR 2022] StyleSwin: Transformer-based GAN for High-resolution Image Generation
A curated list of awesome 3d generation papers
Official PyTorch implementation of "Playable Environments: Video Manipulation in Space and Time", CVPR 2022
A curated list of resources on implicit neural representations.
[WACV 2021] "Guided Attentive Feature Fusion for Multispectral Pedestrian Detection"
Localize to Classify and Classify to Localize: Mutual Guidance in Object Detection
[BMVC 2021 Oral] Official implementation of our paper "A Unified Framework for Real-world Skeleton-based Action Recognition" on Toyota Smarthome/Penn Action/NTU-RGB+D/Posetics datasets