YOLO-World-ONNX is a Python package for running inference on the YOLO-World open-vocabulary object detection model using ONNX models. It provides an easy-to-use interface for performing inference on im…

Python 16 2 Updated Feb 6, 2026

AILab-CVC / YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 6,281 590 Updated Feb 26, 2025

bubbliiiing / count-mAP-txt

An example, built on SSD, that generates the txt files used by the mAP plotting code. (The goal is to generate the txt files.)

Python 128 40 Updated Jan 31, 2021

meta-llama / llama-models

Utilities intended for use with Llama models.

Python 7,540 1,352 Updated Feb 11, 2026

deepseek-ai / DeepSeek-V3

Python 102,416 16,611 Updated Aug 28, 2025

Infrasys-AI / AISystem

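AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.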

Jupyter Notebook 16,539 2,345 Updated Sep 3, 2025

sgl-project / sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 25,230 5,061 Updated Mar 30, 2026

huggingface / text-generation-inference

Large Language Model Text Generation Inference

Python 10,818 1,262 Updated Mar 21, 2026

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 74,734 14,966 Updated Mar 30, 2026

meta-llama / llama

Inference code for Llama models

Python 59,277 9,827 Updated Jan 26, 2025

InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,738 675 Updated Mar 30, 2026

open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 6,811 752 Updated Mar 30, 2026

mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7,207 396 Updated Jul 11, 2024

lovemefan / paraformer-python

Paraformer (Chinese ASR) online ONNX runtime for Python

Python 54 6 Updated Mar 27, 2024

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.

Python 15,462 1,623 Updated Mar 17, 2026

triton-inference-server / triton_cli

Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inference Server.

Python 74 5 Updated Mar 10, 2026

triton-inference-server / perf_analyzer

Python 139 41 Updated Mar 13, 2026

gkamradt / LLMTest_NeedleInAHaystack

Doing simple retrieval from LLM models at various context lengths to measure accuracy

Jupyter Notebook 2,229 237 Updated Aug 17, 2024

google-coral / edgetpu

Coral issue tracker (and legacy Edge TPU API source)

C++ 475 128 Updated Oct 27, 2021

taehokim20 / LLMem

LLMem: GPU Memory Estimation for Fine-Tuning Pre-Trained LLMs

Python 29 3 Updated May 31, 2025

microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 13,376 895 Updated Dec 17, 2024

InternLM / xtuner

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 5,109 414 Updated Mar 30, 2026

axolotl-ai-cloud / axolotl

Go ahead and axolotl questions

Python 11,544 1,284 Updated Mar 30, 2026

Instruction-Tuning-with-GPT-4 / GPT-4-LLM

Instruction Tuning with GPT-4

HTML 4,335 311 Updated Jun 11, 2023

artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,862 871 Updated Jun 10, 2024

Fancy TS-toolchain

Stars

QwenLM / Qwen3-VL

nndeploy / nndeploy

alibaba / MNN

pytorch / executorch

Infrasys-AI / AIInfra

Ziad-Algrafi / yolo-world-onnx