Compendium of over 50 benchmarks for evaluating AI agents, categorized into Function Calling & Tool Use, General Assistant & Reasoning, Coding & Software Engineering, and Computer Interaction.

114 10 Updated Oct 15, 2025

💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.

1,149 67 Updated Aug 17, 2025

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 106,928 12,320 Updated Mar 25, 2026

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

Python 128,648 18,187 Updated Mar 25, 2026

Build and run agents you can see, understand and trust.

Python 19,774 1,883 Updated Mar 24, 2026

Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling to thoroughly test your data and mo…

Python 3,998 290 Updated Dec 28, 2025

🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.

MDX 72,252 7,714 Updated Mar 11, 2026

Production-ready platform for agentic workflow development.

TypeScript 134,422 20,927 Updated Mar 25, 2026

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.

Python 843 63 Updated Jul 1, 2024

A tool for evaluating LLMs

TypeScript 428 42 Updated Mar 15, 2026

An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral]

SAS 402 41 Updated May 20, 2024

Build, run, manage agentic software at scale.

Python 38,927 5,160 Updated Mar 25, 2026

The agent engineering platform

Python 131,063 21,584 Updated Mar 25, 2026

Showing various ways to serve Keras-based Stable Diffusion

Jupyter Notebook 111 5 Updated Feb 28, 2023

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion

Jupyter Notebook 7,744 803 Updated Dec 8, 2022

Scrapes the Robinhood API to retrieve + store popularity and price data.

JavaScript 689 194 Updated Oct 31, 2024

Models and examples built with TensorFlow

Python 77,688 45,205 Updated Mar 17, 2026

Learning to Rank in TensorFlow

Python 2,775 479 Updated Mar 18, 2024

ClickModels is a small set of Python scripts for the user click models initially developed at Yandex. A Click Model is a probabilistic graphical model used to predict search engine click data from …

Python 239 71 Updated Jun 6, 2018

A PyTorch implementation of Paragraph Vectors (doc2vec)

Python 1 Updated Sep 20, 2017

A TensorFlow implementation of Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks

Python 188 84 Updated Aug 4, 2016