Stars
OSI: One-step Inversion Excels in Extracting Diffusion Watermarks
[NeurIPS'25] Improves CLIP's ability to capture visual detail by inverting the unCLIP generative model.
[NeurIPS 2025] The official repository of "Sekai: A Video Dataset towards World Exploration"
[TPAMI 2025] PyTorch implementation of the paper "StylizedGS: Controllable Stylization for 3D Gaussian Splatting"
Jodi: Unification of Visual Generation and Understanding via Joint Modeling
PreLAR: World Model Pre-training with Learnable Action Representation, ECCV 2024
This repository is the official implementation of the paper "Understanding Few-Shot Learning: Measuring Task Relatedness and Adaptation Difficulty via Attributes" in Neural Information Processing S…
This repository contains the reference source code for the paper ["Scalable Modular Network: A Framework for Adaptive Learning via Agreement Routing"](https://openreview.net/forum?id=pEKJl5sflp) in…
Official code for our WACV 2024 paper "Interpretable Object Recognition by Semantic Prototype Analysis"
Code for the WACV 2023 paper "Semantic Guided Latent Parts Embedding for Few-Shot Learning"
Breaking Boundary Between Pre-training and Fine-tuning with Hybrid Prompting for Knowledge-Based VQA
This is a custom node that collects the tools I use frequently.
Official implementation of the BMVC 2023 Oral paper "Describe Your Facial Expressions by Linking Image Encoders and Large Language Models"
State-of-the-art 2D and 3D Face Analysis Project
[WACV'25 Oral] Precise Integral in NeRFs: Overcoming the Approximation Errors of Numerical Quadrature
Implements commonly used datasets based on torch and torchvision.
Implement visual tokenizers with PyTorch.
⚡Batch Face Processing for Fast Modern Research, including face detection, face alignment, face reconstruction, head pose estimation, and face parsing
💡 Resources about Sign Language Processing (e.g., Sign Language Recognition / Translation / Production)
[ICLR 2025] Codebase for "CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation"
Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
Anonymous Github is a proxy server to support anonymous browsing of Github repositories for open-science code and data.
Codebase of ICCV 2023 paper "Hierarchical Contrastive Learning for Pattern-Generalizable Image Corruption Detection"
ICCV23 "Householder Projector for Unsupervised Latent Semantics Discovery"