Starred repositories
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
A 32× longer context window than vanilla Transformers, and up to 4× longer than memory-efficient Transformers.
Context-parallel attention that accelerates DiT model inference with dynamic caching (https://wavespeed.ai/)
[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation
A collection of industry-classic and cutting-edge papers in the fields of recommendation, advertising, and search.
Tile primitives for speedy kernels
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
Survey: A collection of AWESOME papers and resources on the large language model (LLM) related recommender system topics.
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance…
PyTorch native quantization and sparsity for training and inference
Efficient Triton Kernels for LLM Training
Official code of "StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs".
<Foundations of Computer Vision> Book
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache).
Finetune VITS and MMS using HuggingFace's tools
Collection of leaked system prompts
[ICML 2025] Fourier Position Embedding: Enhancing Attention’s Periodic Extension for Length Generalization
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Chinese translation of *Designing Data-Intensive Applications* (DDIA), 1st and 2nd editions
[CVPR2025 Highlight] Video Generation Foundation Models: https://saiyan-world.github.io/goku/