Code for the paper "Learning to Predict Task Progress by Self-Supervised Video Alignment" by Gerard Donahue and Ehsan Elhamifar, published at CVPR 2024.

Python 16 2 Updated Jul 26, 2024

roboflow / rf-detr

[ICLR 2026] RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO, designed for fine-tuning.

Python 6,101 742 Updated Apr 2, 2026

ankitects / anki

Anki is a smart spaced repetition flashcard program

Rust 27,207 2,886 Updated Apr 2, 2026

xiaomi-research / q-frame

[ICCV 2025] Implementation of the paper "Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs"

Python 72 3 Updated Oct 25, 2025

shashankvkt / DoRA_ICLR24

This repo contains the official implementation of the ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video"

Python 95 12 Updated May 17, 2024

shyamal-b / ss-tad

End-to-End, Single-Stream Temporal Action Detection in Untrimmed Videos (Official Repo for SS-TAD)

Python 108 23 Updated Oct 12, 2017

amathislab / BehaveMAE

[ECCV 2024] "Elucidating the Hierarchical Nature of Behavior with Masked Autoencoders"

Python 28 4 Updated Nov 13, 2025

zlngan / ASQuery

Python 12 1 Updated Jan 26, 2025

hao-ai-lab / FastVideo

A unified inference and post-training framework for accelerated video generation.

Python 3,339 308 Updated Apr 1, 2026

LiUzHiAn / hf2vad

Python 142 31 Updated Apr 28, 2022

aleflabo / PREGO

The official PyTorch implementation of the IEEE/CVF Computer Vision and Pattern Recognition (CVPR) '24 paper PREGO: online mistake detection in PRocedural EGOcentric videos.

Python 32 4 Updated Jun 9, 2025

jyFengGoGo / InstructDet

Python 37 2 Updated Mar 22, 2024

AFeng-x / PixWizard

[ICLR 2025] A versatile image-to-image visual assistant, designed for image generation, manipulation, and translation based on free-form user instructions.

Python 210 2 Updated May 5, 2025

IDEA-Research / TAPTR

[ECCV 2024 & NeurIPS 2024 & ICLR 2026] Official implementation of the paper TAPTR & TAPTRv2 & TAPTRv3

277 15 Updated Feb 10, 2026

syp2ysy / VRP-SAM

[CVPR 2024] Official implementation of "VRP-SAM: SAM with Visual Reference Prompt"

Python 178 19 Updated Sep 27, 2024

timothybrooks / instruct-pix2pix

Python 6,882 587 Updated Mar 3, 2024

AlaaLab / InstructCV

[ICLR 2024] Official Codebase for "InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists"

Python 461 40 Updated Apr 27, 2024

baaivision / Painter

Painter & SegGPT Series: Vision Foundation Models from BAAI

Python 2,591 180 Updated Dec 6, 2024

boheumd / MA-LMM

[CVPR 2024] MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Python 349 30 Updated Jul 19, 2024

Finspire13 / DiffAct

Code for Diffusion Action Segmentation (ICCV 2023)

Python 73 13 Updated Aug 16, 2023

robert80203 / EgoPER_official

The official implementation of Error Detection in Egocentric Procedural Task Videos

Python 22 5 Updated Sep 20, 2025

zhenyingfang / Awesome-Temporal-Action-Detection-Temporal-Action-Proposal-Generation

Temporal Action Detection & Weakly Supervised Temporal Action Detection & Temporal Action Proposal Generation

575 43 Updated Apr 2, 2026
