Stars
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.
A large-scale multilingual dataset for Information Retrieval. Thorough human annotations across 18 diverse languages.
Python wrapper and evaluation tools for the Approach Zero search engine core.
An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks
Train Dense Passage Retriever (DPR) with a single GPU
WSDM'22 Best Paper: Learning Discrete Representations via Constrained Clustering for Effective and Efficient Dense Retrieval
EMNLP'2021: Simple Entity-centric Questions Challenge Dense Retrievers https://arxiv.org/abs/2109.08535
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
A toolkit for building dense retrievers with deep language models.
A library for efficient similarity search and clustering of dense vectors.
TensorFlow code and pre-trained models for BERT
Compatibility module providing Lua-5.3-style APIs for Lua 5.2 and 5.1