huwade

wade huwade

9 followers · 24 following

taiwan
10:20 (UTC -12:00)
https://www.linkedin.com/in/wadehuang811/

Achievements

Starred repositories

spcl / QuaRot

Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models.

Python 496 67 Updated Nov 26, 2024

dingmyu / davit

[ECCV 2022] Code for paper "DaViT: Dual Attention Vision Transformer"

Python 374 34 Updated Feb 13, 2024

zhiqwang / yolort

yolort is a runtime stack for yolov5 on specialized accelerators such as tensorrt, libtorch, onnxruntime, tvm and ncnn.

Python 730 154 Updated Mar 26, 2026

udacity / cs344

Introduction to Parallel Programming class code

Cuda 1,347 1,144 Updated Jun 27, 2022

sysprog21 / compute-pi

Leibniz formula for π

C 13 107 Updated Oct 2, 2019

aryagxr / cuda

coding CUDA everyday!

Cuda 74 2 Updated Feb 5, 2026

facebookincubator / AITemplate

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,713 382 Updated Mar 16, 2026

Maharshi-Pandya / cudacodes

Learnings and programs related to CUDA

Cuda 435 20 Updated Jun 29, 2025

nunchaku-ai / deepcompressor

Model Compression Toolbox for Large Language Models and Diffusion Models

Python 766 89 Updated Aug 14, 2025

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda 29,266 3,452 Updated Jun 26, 2025

lucidrains / vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Python 24,987 3,478 Updated Feb 11, 2026

Comfy-Org / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 107,051 12,335 Updated Mar 26, 2026

CodedK / CUDA-by-Example-source-code-for-the-book-s-examples-

CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. The authors introduce each area of CUDA development through w…

C 474 149 Updated Jun 30, 2023

NVlabs / MambaVision

[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone

Python 2,084 131 Updated Mar 11, 2026

tspeterkim / flash-attention-minimal

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda 1,101 110 Updated Dec 30, 2024

taizilongxu / interview_python

Python interview questions

Shell 17,265 5,532 Updated Mar 5, 2025

KolosalAI / Kolosal

Kolosal AI is an OpenSource and Lightweight alternative to LM Studio to run LLMs 100% offline on your device.

C++ 442 30 Updated May 22, 2025

AutoMQ / automq

AutoMQ is a diskless Kafka® on S3. 10x Cost-Effective. No Cross-AZ Traffic Cost. Autoscale in seconds. Single-digit ms latency. Multi-AZ Availability.

Java 9,628 674 Updated Mar 24, 2026

nunchaku-ai / nunchaku

[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Python 3,746 233 Updated Mar 7, 2026

mit-han-lab / efficientvit

Efficient vision foundation models for high-resolution generation and perception.

Python 3,272 235 Updated Sep 5, 2025

mit-han-lab / mcunet

[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning

Python 671 106 Updated Mar 29, 2024

tc2230 / ocr-api-example

An OCR API service implementation using PaddleOCR and FastAPI

Python 1 Updated Mar 3, 2025

xlite-dev / LeetCUDA

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 10,024 1,006 Updated Mar 23, 2026

KarhouTam / cuda-kernels

Some common CUDA kernel implementations (Not the fastest).

Cuda 29 3 Updated Dec 5, 2025

state-spaces / mamba

Mamba SSM architecture

Python 17,736 1,660 Updated Mar 26, 2026

tensorzero / tensorzero

TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.

Rust 11,154 797 Updated Mar 26, 2026

njvisionpower / Safety-Helmet-Wearing-Dataset

Safety helmet wearing detection dataset, with pretrained model

Python 1,672 419 Updated Dec 17, 2019

PeterH0323 / Smart_Construction

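Based on YOLOv5: head, person, and helmet detection on construction sites. An object-detection system for recognizing safety helmets and restricted danger zones on construction sites, 🚀😆 with a very detailed tutorial on training YOLOv5 on your own dataset 🚀😆 Visual interface added in 2021.3 ❗❗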

Python 2,583 486 Updated Apr 11, 2024

GNOME / glib

Read-only mirror of https://gitlab.gnome.org/GNOME/glib

C 1,722 573 Updated Mar 26, 2026

marcoslucianops / DeepStream-Yolo

NVIDIA DeepStream SDK 8.0 / 7.1 / 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models

Python 1,979 455 Updated Jan 25, 2026

Starred topics

pose-estimation