- Sun Yat-Sen University
- Alibaba, Hangzhou, China
- https://www.zhihu.com/people/jian-xin-15-96
Stars
Official code of "StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs".
verl-agent is an extension of veRL designed for training LLM/VLM agents via RL; it is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training".
PyTorch implementation of "Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models", AAAI 2025
PyTorch implementation of "Sinkhorn Distance Minimization for Knowledge Distillation", COLING 2024 and TNNLS 2024
Unleashing the Power of Reinforcement Learning for Math and Code Reasoners
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Official Repo for Open-Reasoner-Zero
Scalable RL solution for advanced reasoning of language models
Efficient Triton Kernels for LLM Training
[ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
[ICLR 2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
Implementation of Sinkhorn algorithms in Torch (a PyTorch sketch of the core iteration follows this list).
Code for the paper "BiLD: Bi-directional Logits Difference Loss for Large Language Model Distillation"
An open-source toolkit for LLM distillation (a minimal KD loss sketch also follows this list)
LLM deployment project based on MNN. This project has been merged into MNN.
[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.
A curated list of neural network pruning resources.
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
A family of compressed models obtained via pruning and knowledge distillation
A flexible and efficient training framework for large-scale alignment tasks
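
Several of the repositories above (Sinkhorn distance minimization, multi-level optimal transport, the Torch Sinkhorn implementation, EMO, PLOT) center on entropy-regularized optimal transport. As a reference point, here is a minimal PyTorch sketch of the Sinkhorn iteration; the function name `sinkhorn` and the defaults `eps=0.1`, `n_iters=100` are illustrative choices, not the API of any repo listed above.

```python
import torch

def sinkhorn(cost, a, b, eps=0.1, n_iters=100):
    """Entropy-regularized optimal transport via Sinkhorn iterations.

    cost: (n, m) cost matrix; a: (n,) source marginal; b: (m,) target marginal.
    Returns the transport plan P and the Sinkhorn distance <P, cost>.
    Generic sketch, not taken from any repository above.
    """
    K = torch.exp(-cost / eps)               # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.t() @ u)                  # rescale columns to match b
        u = a / (K @ v)                      # rescale rows to match a
    P = u.unsqueeze(1) * K * v.unsqueeze(0)  # diag(u) @ K @ diag(v)
    return P, (P * cost).sum()

# Toy usage with uniform marginals over 4 source and 5 target points.
a = torch.full((4,), 1 / 4)
b = torch.full((5,), 1 / 5)
cost = torch.rand(4, 5)
P, dist = sinkhorn(cost, a, b)
```

Smaller `eps` approximates the unregularized transport cost more closely but converges more slowly and can underflow in the `exp`; log-domain updates are the usual remedy in practice.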
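
Likewise, most of the distillation repositories above start from, or contrast against, the classic logit-matching KD objective. The sketch below shows that baseline under the standard temperature-softened formulation; `kd_loss` and the temperature `T` are illustrative names, not identifiers from any repo in this list.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Logit-matching KD: KL divergence between temperature-softened
    teacher and student distributions. Hypothetical helper for illustration."""
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    # batchmean reduction plus T^2 scaling keeps gradient magnitudes
    # comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
```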