View ptaxom's full-sized avatar


Efficient implementation of DeepSeek Ops (Blockwise FP8 GEMM, MoE, and MLA) for AMD Instinct MI300X

C++ 75 6 Updated Feb 11, 2026

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,948 2,061 Updated Mar 27, 2026

NVIDIA PhysX SDK

C++ 4,470 600 Updated Mar 18, 2026

Linux keyboard-operated window manager. Mirror of gitea instance listed below!

Rust 55 Updated Mar 12, 2026

High-performance automatic differentiation of LLVM and MLIR.

LLVM 1,570 158 Updated Mar 30, 2026
Cuda 132 16 Updated Mar 19, 2026

A micro Wayland compositor that can be used as a GStreamer plugin

Rust 57 17 Updated Feb 25, 2026
Python 93 8 Updated Nov 11, 2025

Command and Conquer: Generals - Zero Hour

C++ 4,537 1,625 Updated Feb 27, 2025

Realtime Voice Changer

Python 19,927 2,281 Updated Mar 21, 2026

GPU LBVH builder implemented in Vulkan and GLSL.

C++ 60 4 Updated Aug 27, 2023

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance…

Python 3,249 679 Updated Mar 30, 2026

NVIDIA curated collection of educational resources related to general purpose GPU programming.

Jupyter Notebook 1,419 250 Updated Mar 30, 2026

An implementation of physically based shading & image based lighting in D3D11, D3D12, Vulkan, and OpenGL 4.

C++ 1,496 117 Updated Oct 30, 2021

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,507 1,757 Updated Mar 30, 2026

CUDA Matrix Multiplication Optimization

Cuda 263 25 Updated Jul 19, 2024

A repo for numerical methods written in Rust with wrappers for Python

Rust 1 Updated Aug 4, 2024

A Reconfigurable RISC-V Core for Approximate Computing

Verilog 130 82 Updated May 30, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 12,840 2,332 Updated Mar 25, 2026

pnn is a Darknet-compatible neural network inference engine implemented in Rust.

Rust 17 1 Updated Jan 8, 2022

A personal knowledge management and sharing system for VSCode

TypeScript 16,979 755 Updated Mar 30, 2026

CUDA Core Compute Libraries

C++ 2,246 371 Updated Mar 30, 2026

A minimal GPU design in Verilog to learn how GPUs work from the ground up

SystemVerilog 12,066 1,104 Updated Aug 18, 2024

A modern C++ BVH construction and traversal library

C++ 1,139 115 Updated Jun 16, 2025

Build mindmaps with plain text

TypeScript 12,608 954 Updated Jun 12, 2025

A mini x86 Linux debugger for teaching purposes

C++ 650 107 Updated Aug 2, 2024

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

LLVM 37,655 16,717 Updated Mar 30, 2026

Pure DOOM - Single Header Doom Source Port

C++ 391 41 Updated Feb 7, 2026