alibabacloud (Beijing)

Stars
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
JobSet: a k8s native API for distributed ML training and HPC workloads
Algorithm powering the For You feed on X
Kubernetes-native AI serving platform for scalable model serving.
Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
Cost-efficient and pluggable Infrastructure components for GenAI inference
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
My learning notes for machine learning systems (MLSys).
verl: Volcano Engine Reinforcement Learning for LLMs
Train transformer language models with reinforcement learning.
ByteDance PyTorch Distributed for hyperscale training of LLMs and RL workloads
A throughput-oriented high-performance serving framework for LLMs
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
SGLang is a high-performance serving framework for large language models and multimodal models.
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
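The LoRA idea behind loralib is that a frozen weight matrix W is adapted by a learned low-rank update, y = x(W + (α/r)·BA), where B and A are small rank-r factors. A minimal pure-Python sketch of that computation (illustrative only; not the loralib API, which wraps PyTorch layers):

```python
# Sketch of the LoRA forward pass: frozen weight W plus a rank-r delta B @ A,
# scaled by alpha / r. Matrices are plain lists of rows for illustration.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(X, s):
    return [[s * v for v in row] for row in X]

def lora_forward(x, W, A, B, alpha, r):
    """y = x @ (W + (alpha / r) * B @ A): frozen path plus low-rank update.

    W: d_in x d_out (frozen), B: d_in x r, A: r x d_out (trainable).
    """
    delta = scale(matmul(B, A), alpha / r)  # rank-r update, same shape as W
    return matmul(x, add(W, delta))
```

In loralib, B is initialized to zero, so at the start of fine-tuning the adapted layer reproduces the frozen model exactly; only the small A and B factors are trained.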
A modular graph-based Retrieval-Augmented Generation (RAG) system
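GraphRAG builds on the general RAG pattern: retrieve the passages most relevant to a query, then assemble them into the generator's prompt. A toy sketch of that generic retrieve-then-prompt loop, using bag-of-words cosine similarity as a stand-in embedding (GraphRAG itself retrieves over an extracted knowledge graph, which this does not model):

```python
# Minimal RAG loop: rank documents by cosine similarity of word-count
# vectors, then build a context-grounded prompt. Toy retrieval only.
from collections import Counter
import math

def embed(text):
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    qv = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(qv, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, k=2):
    """Assemble retrieved context and the question into one prompt string."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

A real system would swap `embed` for a learned embedding model and the list scan for a vector index; the prompt-assembly step is the part that carries over unchanged.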
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models spanning text, vision, audio, and multimodal tasks, for both inference and training.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
A high-throughput and memory-efficient inference and serving engine for LLMs
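vLLM's memory efficiency comes largely from PagedAttention: each sequence's KV cache is stored in fixed-size physical blocks tracked by a per-sequence block table, so memory is allocated on demand rather than reserved for the maximum sequence length. A toy allocator sketch in that spirit (illustrative only; not vLLM's actual implementation):

```python
class PagedKVAllocator:
    """Toy block allocator in the spirit of PagedAttention: each sequence's
    KV cache lives in fixed-size blocks tracked by a per-sequence block
    table, and blocks are recycled when a sequence finishes."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free = list(range(num_blocks))  # free physical block ids
        self.tables = {}                     # seq_id -> list of block ids
        self.lengths = {}                    # seq_id -> tokens stored

    def append_token(self, seq_id):
        """Reserve cache space for one new token; grab a block on demand."""
        table = self.tables.setdefault(seq_id, [])
        n = self.lengths.get(seq_id, 0)
        if n % self.block_size == 0:         # current block full (or none yet)
            if not self.free:
                raise MemoryError("out of KV cache blocks")
            table.append(self.free.pop())
        self.lengths[seq_id] = n + 1

    def release(self, seq_id):
        """Return a finished sequence's blocks to the free pool."""
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)
```

The payoff is that internal fragmentation is bounded by one block per sequence, which is what lets a serving engine pack many concurrent requests into a fixed GPU memory budget.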
Fast and memory-efficient exact attention
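The core trick behind FlashAttention is that softmax attention can be computed block-by-block with a running maximum and running denominator (online softmax), so the full score matrix is never materialized, yet the result is exact. A pure-Python sketch for a single query vector, chunking over keys (the real kernel tiles both queries and keys in GPU SRAM):

```python
import math

def attention_naive(q, keys, values):
    """Standard softmax attention for one query vector (lists of floats)."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    dim = len(values[0])
    return [sum(e * v[j] for e, v in zip(exps, values)) / z for j in range(dim)]

def attention_online(q, keys, values, block=2):
    """Same result computed blockwise with a running max and running sum,
    rescaling previous partial results when a larger score appears."""
    m = float("-inf")             # running max of scores seen so far
    z = 0.0                       # running softmax denominator
    acc = [0.0] * len(values[0])  # running weighted sum of values
    for start in range(0, len(keys), block):
        ks = keys[start:start + block]
        vs = values[start:start + block]
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in ks]
        m_new = max(m, max(scores))
        corr = math.exp(m - m_new)        # rescale old partials to new max
        z *= corr
        acc = [a * corr for a in acc]
        for s, v in zip(scores, vs):
            w = math.exp(s - m_new)
            z += w
            acc = [a + w * vj for a, vj in zip(acc, v)]
        m = m_new
    return [a / z for a in acc]
```

Because each block only needs the scalars `m` and `z` plus the accumulator, memory per query is O(d) instead of O(sequence length), which is the source of the "memory-efficient exact" claim.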
Kubernetes community content
Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
The simplest, fastest repository for training/finetuning medium-sized GPTs.