Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.5, GPT-OSS, Llama, and more!

Python 9,068 774 Updated Mar 24, 2026

lyogavin / airllm

AirLLM 70B inference with single 4GB GPU

Jupyter Notebook 14,383 1,445 Updated Mar 10, 2026

jarrodwatts / claude-hud

A Claude Code plugin that shows what's happening - context usage, active tools, running agents, and todo progress

JavaScript 12,590 521 Updated Mar 23, 2026

udlbook / udlbook

Understanding Deep Learning - Simon J.D. Prince

Jupyter Notebook 9,243 2,178 Updated Feb 24, 2026

QwenLM / Qwen3-Coder

Qwen3-Coder is the code version of Qwen3, the large language model series developed by the Qwen team.

Python 16,124 1,147 Updated Mar 24, 2026

hossein-amirkhani / VocabLevel

A simple but efficient method to approximately calculate a user's vocabulary level

Python 41 16 Updated Sep 6, 2018

Jinx1910 / ML_PAPERS

Machine Learning Paper Implementations

Python 4 Updated Dec 27, 2025

thunlp / KG-Infused-RAG

Official implementation for the paper "KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs"

Python 22 2 Updated Jan 18, 2026

thunlp / JustRL

[ICLR 2026 Blogpost Track Poster] JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Python 257 12 Updated Mar 11, 2026

kundushounak / kundushounak.github.io

HTML 1 Updated Dec 22, 2025

armchr / armchr

Armchr is a set of tools for AI coding agents.

JavaScript 24 3 Updated Feb 19, 2026

opendatalab / MinerU

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 57,097 4,721 Updated Mar 24, 2026

macwiatrak / Learn-RL

Learn Reinforcement Learning - A short repo of resources for studying reinforcement learning

3 1 Updated Aug 28, 2019

coolcoder001 / Machine-Learning-Blueprint

Forked from pnjoroge54/Machine-Learning-Blueprint

Jupyter Notebook 1 Updated Nov 6, 2025

berkeley-reclab / RecLab

Python 67 7 Updated Feb 16, 2023

google-research / recsim_ng

RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems

Jupyter Notebook 125 15 Updated Apr 26, 2022

sb-ai-lab / Sim4Rec

Simulator for training and evaluation of Recommender Systems

Jupyter Notebook 57 5 Updated Mar 24, 2025

unslothai / unsloth

Unsloth Studio is a web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally.

Python 58,000 4,892 Updated Mar 24, 2026

allenai / open-instruct

AllenAI's post-training codebase

Python 3,651 515 Updated Mar 24, 2026

LuizEdCard / backend_consolidated

Python 1 1 Updated Jun 2, 2025

The-AiEdge / RAG-Code-Solution

Python 4 6 Updated Oct 13, 2025

ritvikmath / Time-Series-Analysis

code and data for the time series analysis vids on my YouTube channel

Jupyter Notebook 769 698 Updated Apr 18, 2024

mitmath / 18S096

18.S096 three-week course at MIT

Jupyter Notebook 269 51 Updated Mar 25, 2023