  • Baidu
  • Beijing


Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 338,055 66,362 Updated Mar 27, 2026

[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

1,987 140 Updated Oct 11, 2025
Python 1,294 107 Updated Feb 12, 2026

Official Repo of paper "KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction". In the paper, we propose KnowCoder, the most powerful large language model so far for…

Python 114 12 Updated May 28, 2025

Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks

Python 191 16 Updated Sep 24, 2025

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 2,321 105 Updated Oct 29, 2025

NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.

Jupyter Notebook 6,535 1,087 Updated Mar 26, 2026

[ECCV2024] 🐙Octopus, an embodied vision-language model trained with RLEF, emerging superior in embodied visual planning and programming.

Python 297 20 Updated May 20, 2024

AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reasoning models.

Python 131 9 Updated Mar 18, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 69,144 8,433 Updated Mar 27, 2026

Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing

Python 72 4 Updated Jul 13, 2025

Official repo for GPTFUZZER : Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts

Python 576 81 Updated Feb 27, 2026

Repo of ACL 2025 Paper "Quantification of Large Language Model Distillation"

Python 100 9 Updated Mar 5, 2026

Towards Large Multimodal Models as Visual Foundation Agents

Python 259 10 Updated Apr 24, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 28,348 2,634 Updated Mar 25, 2026

✨✨Latest Advances on Multimodal Large Language Models

17,529 1,119 Updated Mar 20, 2026

[ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models

Jupyter Notebook 3,695 363 Updated Feb 6, 2024

Official repo with the MM-PlanLLM code, from the paper Show and Guide: Instructional-Plan Grounded Vision and Language Model.

Python 2 Updated Nov 12, 2024

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python 1,327 71 Updated Jan 27, 2026

✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,501 183 Updated Mar 28, 2025

Tarsier -- a family of large-scale video-language models designed to generate high-quality video descriptions, together with good general video understanding capability.

Python 532 29 Updated Aug 14, 2025

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 3,469 251 Updated Dec 3, 2024

OpenVLA: An open-source vision-language-action model for robotic manipulation.

Python 5,680 669 Updated Mar 23, 2025

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Python 958 978 Updated Jul 4, 2024

PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)

Jupyter Notebook 2,090 353 Updated Jul 14, 2024

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Python 3,285 202 Updated Oct 31, 2024
89 Updated Jan 25, 2024

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Python 6,618 735 Updated Mar 19, 2025

Easy-to-use and powerful NLP library with Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including Neural Search, Question Answering, Information Ex…

Python 2 1 Updated Dec 9, 2025