💭
Research & development • TouchDesigner components 🧪 • real-time GenAI 🤖

Jupyter Notebook 126 10 Updated Dec 9, 2025

writing really fast kernels

Cuda 6 Updated Mar 19, 2026

This is a list of useful libraries and resources for CUDA development.

605 51 Updated Oct 8, 2017

Fast SAM 3D Body: Accelerating SAM 3D Body for Real-Time Full-Body Human Mesh Recovery

Python 138 7 Updated Mar 17, 2026

Faster Green Screen Keys — async multi-GPU inference engine for professional VFX pipelines

Python 18 2 Updated Mar 17, 2026

Speak Friend and Enter

Python 263 17 Updated Mar 2, 2026

pprofile + matplotlib = Python program profiled as an awesome heatmap!

Python 844 48 Updated Jul 4, 2023

A powerful set of Python debugging tools, based on PySnooper

Python 1,447 41 Updated Jan 11, 2026

Comprehensive GPU specifications database with 2,824 GPUs across NVIDIA, AMD, and Intel

68 12 Updated Jan 7, 2026

Adobe's reference implementation of the OpenPBR BSDF

C 356 21 Updated Mar 20, 2026

Tangle is a web app that lets users build and run Machine Learning pipelines without having to set up a development environment.

Python 206 14 Updated Mar 24, 2026

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…

Cuda 1,256 177 Updated Jul 29, 2023

[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.

Python 3,274 375 Updated Sep 7, 2025

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 8,444 770 Updated May 31, 2024

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 13,361 891 Updated Dec 17, 2024

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 4,393 374 Updated Oct 19, 2025

Example models using DeepSpeed

Python 6,809 1,116 Updated Mar 4, 2026

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 2,103 192 Updated Jun 30, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 41,886 4,763 Updated Mar 24, 2026

Simple real time visualisation of the execution of a Python program.

Python 1,838 122 Updated Nov 13, 2021

A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are continuously improving the project. Welcome to PR the works (p…

2,335 232 Updated Jan 29, 2026

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Python 1,626 201 Updated Jul 12, 2024

[ICLR2025] Accelerating Diffusion Transformers with Token-wise Feature Caching

Python 214 9 Updated Mar 14, 2025

AI agents running research on single-GPU nanochat training automatically

Python 53,005 7,380 Updated Mar 21, 2026

SD.Next Quantization Engine

Python 103 9 Updated Mar 13, 2026

Courses on building, compressing, evaluating, and deploying efficient AI models.

Jupyter Notebook 71 5 Updated Mar 23, 2026

Unbearably fast near-real-time pure-Python runtime-static type-checker.

Python 3,387 72 Updated Mar 21, 2026

Pruna is a model optimization framework built for developers, enabling you to deliver faster, more efficient models with minimal overhead.

Python 1,141 83 Updated Mar 24, 2026

Official Repository of the paper "Trajectory Consistency Distillation"

Python 363 13 Updated Apr 28, 2024