- Oklahoma Medical Research Foundation
- Oklahoma City, OK
- @cbgiles
Starred repositories
An open infrastructure to democratize and decentralize the development of superintelligence for humanity.
Convert PDF to markdown + JSON quickly with high accuracy
NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.
A Model Context Protocol (MCP) server that provides LLMs with real-time access to scientific papers from arXiv and OpenAlex.
[COLM 2025] Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale
Implementation for OAgents: An Empirical Study of Building Effective Agents
The raw UserRL repo under construction
A self-hostable bookmark-everything app (links, notes and images) with AI-based automatic tagging and full text search
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Generate High-Quality Synthetics, Train, Measure, and Evaluate in a Single Pipeline
CRDT-based offline-first sync for SQLite. Syncs automatically with SQLite Cloud, PostgreSQL, and Supabase. No conflicts, no data loss, no backend to build. For offline-first apps and AI agents.
Scripts for harvesting from repositories using OAI-PMH
RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
A simple, extendable, and clean backtesting framework for portfolio allocation problems (and more).
Fast inference engine for Transformer models
Object-oriented handling of audio data, with GPU-powered augmentations, and more.
Build local voice agents with open-source models
An interactive explorer for single-cell transcriptomics data
A decentralized dataset generator and manipulator.