Map your codebase. Navigate by graph, not grep.
cartog gives your AI coding agent a pre-computed code graph — symbols, calls, imports, inheritance — so it queries structure in 1-2 calls instead of 6+. Everything runs locally: no API calls, no cloud, no data leaves your machine.
| Metric | grep/cat workflow | cartog |
|---|---|---|
| Tokens per query | ~1,700 | ~280 (83% fewer) |
| Recall (completeness) | 78% | 97% |
| Query latency | multi-step | 8-450 µs |
| Privacy | n/a | 100% local — no remote calls |
| Transitive analysis | impossible | `impact --depth 3` traces callers-of-callers |
Where cartog shines most: tracing call chains (88% token reduction; 35% recall for grep vs 100% for cartog), finding callers (95% reduction), and type references (93% reduction).
Measured across 13 scenarios, 5 languages (full benchmark suite).
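The transitive analysis in the table is, conceptually, a depth-limited walk over the reverse call graph. A minimal Python sketch of the idea (toy graph and names are illustrative, not cartog's internals):

```python
from collections import deque

def impact(callers: dict[str, set[str]], symbol: str, depth: int) -> set[str]:
    """Collect everything that transitively calls `symbol`, up to `depth` hops."""
    seen: set[str] = set()
    frontier = deque([(symbol, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue  # depth budget exhausted on this path
        for caller in callers.get(node, ()):  # direct callers of `node`
            if caller not in seen:
                seen.add(caller)
                frontier.append((caller, d + 1))
    return seen

# Toy reverse call graph: rotate_sessions -> refresh_token -> validate_token
callers = {
    "validate_token": {"refresh_token", "get_current_user"},
    "refresh_token": {"rotate_sessions"},
}
print(sorted(impact(callers, "validate_token", depth=3)))
# ['get_current_user', 'refresh_token', 'rotate_sessions']
```

Each hop adds one level of callers-of-callers; `--depth 3` corresponds to three hops.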
- Single binary, self-contained — `cargo install cartog` and you're done. No Docker, no config.
- 100% offline — tree-sitter parsing + SQLite storage + ONNX embeddings. Your code never leaves your machine, ever.
- Optional LSP precision — auto-detects language servers on PATH to boost edge resolution from ~25% to ~42-81%. Works without them, better with them.
- Smart search routing — keyword search (sub-ms, symbol names) and semantic search (natural language queries) work together. Run both in parallel when unsure.
- Live index — `cartog watch` auto re-indexes on file changes. Your agent always queries fresh data.
- MCP server — `cartog serve` exposes 12 tools over stdio. Plug into Claude Code, Cursor, Windsurf, Zed, or any MCP-compatible agent.
```sh
cargo install cartog
cd your-project

cartog index .                # build the graph (~95ms for 4k LOC, incremental)
cartog search validate        # find symbols by name (sub-millisecond)
cartog refs validate_token    # who calls/imports/references this?
cartog impact validate_token  # what breaks if I change this?
```

```sh
cartog rag setup              # download embedding + re-ranker models (~1.2GB, one-time)
cartog rag index .            # embed all symbols into sqlite-vec
cartog rag search "authentication token validation"  # natural language queries
```

Models are downloaded once to `~/.cache/cartog/models/` and run locally via ONNX Runtime. No API keys, no network calls at query time.
```sh
cargo install cartog                 # core (heuristic resolution only)
cargo install cartog --features lsp  # + LSP-based resolution (recommended)
```

The `lsp` feature adds ~50KB to the binary. It auto-detects language servers on PATH (rust-analyzer, pyright, typescript-language-server, gopls, ruby-lsp, solargraph) and uses them to resolve edges that heuristic matching can't. No extra config needed — if a server is on PATH, it's used automatically.
Download from GitHub Releases:
```sh
# macOS (Apple Silicon)
curl -L https://github.com/jrollin/cartog/releases/latest/download/cartog-aarch64-apple-darwin.tar.gz | tar xz
sudo mv cartog /usr/local/bin/

# macOS (Intel)
curl -L https://github.com/jrollin/cartog/releases/latest/download/cartog-x86_64-apple-darwin.tar.gz | tar xz
sudo mv cartog /usr/local/bin/

# Linux (x86_64)
curl -L https://github.com/jrollin/cartog/releases/latest/download/cartog-x86_64-unknown-linux-gnu.tar.gz | tar xz
sudo mv cartog /usr/local/bin/

# Linux (ARM64)
curl -L https://github.com/jrollin/cartog/releases/latest/download/cartog-aarch64-unknown-linux-gnu.tar.gz | tar xz
sudo mv cartog /usr/local/bin/

# Windows (x86_64) — download the .zip from the releases page
```

The database path is resolved automatically — no config needed for standard use:

- `--db` flag / `CARTOG_DB` env var — explicit override (highest priority)
- `.cartog.toml` at the git root — project-specific config
- Auto git-root detection — DB placed at the root of the git repository
- cwd fallback — `.cartog.db` in the current directory
```sh
# Override database location
cartog --db /tmp/myproject.db index .
CARTOG_DB=~/.local/share/cartog/proj.db cartog search foo

# --db is global — applies to all subcommands
cartog --db /tmp/x.db stats
```

`.cartog.toml` (optional, place at project root):

```toml
[database]
path = "~/.local/share/cartog/myproject.db"
```

Useful when indexing from a parent directory across multiple projects, or when storing the DB outside the repo. See docs/usage.md for details.
cartog offers two search modes that complement each other:
| Query type | Command | Speed | Best for |
|---|---|---|---|
| Symbol name / partial name | `cartog search parse` | sub-ms | You know the name: `validate_token`, `AuthService` |
| Natural language / concept | `cartog rag search "error handling"` | ~150-500ms | You know the behavior, not the name |
| Broad keyword, unsure | Run both in parallel | sub-ms + ~300ms | `auth`, `config` — catch names + semantics |
Narrowing pattern: `cartog search parse` returns 30 hits? Refine with `cartog rag search "parse JSON response body"` to pinpoint the right ones.
```sh
# Direct keyword search — fast, exact
cartog search validate_token
cartog search parse --kind function --limit 10

# Semantic search — natural language, conceptual
cartog rag search "database connection pooling"
cartog rag search "error handling" --kind function

# Both in parallel when unsure
cartog search auth & cartog rag search "authentication and authorization"
```

```sh
# Index
cartog index .                           # Build the graph (with LSP if available)
cartog index . --no-lsp                  # Fast heuristic-only (~1-4s)
cartog index . --force                   # Re-index all files

# Search
cartog search validate                   # Find symbols by partial name
cartog search validate --kind function   # Filter by kind
cartog rag search "token validation"     # Semantic search (natural language)

# Navigate
cartog outline src/auth/tokens.py        # File structure without reading it
cartog refs validate_token               # Who references this? (calls, imports, inherits, types)
cartog refs validate_token --kind calls  # Filter: only call sites
cartog callees authenticate              # What does this call?
cartog impact SessionManager --depth 3   # What breaks if I change this?
cartog hierarchy BaseService             # Inheritance tree
cartog deps src/routes/auth.py           # File-level imports
cartog stats                             # Index summary

# Watch (auto re-index on file changes)
cartog watch .                           # Watch for changes, re-index automatically
cartog watch . --rag                     # Also re-embed symbols (deferred)

# MCP Server
cartog serve                             # MCP server over stdio (12 tools)
cartog serve --watch                     # With background file watcher
cartog serve --watch --rag               # Watcher + deferred RAG embedding
```

All commands support `--json` for structured output.
Example outputs
```
$ cartog outline auth/tokens.py
from datetime import datetime, timedelta                          L3
from typing import Optional                                       L4
import hashlib                                                    L5
class TokenError                                                  L11-14
class ExpiredTokenError                                           L17-20
function generate_token(user: User, expires_in: int = 3600) -> str  L23-27
function validate_token(token: str) -> Optional[User]             L30-44
function lookup_session(token: str) -> Optional[Session]          L47-49
function refresh_token(old_token: str) -> str                     L52-56
function revoke_token(token: str) -> bool                         L59-65
```
```
$ cartog search validate
function validate_token    auth/tokens.py:30
function validate_session  auth/tokens.py:68
function validate_user     services/user.py:12
```
Results ranked: exact match > prefix > substring. Case-insensitive.
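That ranking can be sketched as a tiered, case-insensitive sort (an illustration of the documented ordering, not cartog's actual scorer):

```python
def rank(query: str, names: list[str]) -> list[str]:
    """Rank symbol names: exact match > prefix > substring, case-insensitive."""
    q = query.lower()

    def tier(name: str) -> int:
        n = name.lower()
        if n == q:          return 0  # exact
        if n.startswith(q): return 1  # prefix
        if q in n:          return 2  # substring
        return 3                      # no match, filtered out below

    hits = [n for n in names if tier(n) < 3]
    return sorted(hits, key=lambda n: (tier(n), n.lower()))

print(rank("validate", ["validate_token", "Validate", "revalidate_all"]))
# ['Validate', 'validate_token', 'revalidate_all']
```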
```
$ cartog impact validate_token --depth 3
calls  get_current_user  auth/service.py:40
calls  refresh_token     auth/tokens.py:54
calls  impersonate       auth/service.py:52
```

```
$ cartog refs UserService
imports     ./service     routes/auth.py:3
calls       login         routes/auth.py:15
inherits    AdminService  auth/service.py:47
references  process       routes/auth.py:22
```
```mermaid
graph LR
    A["Source files<br/>(py, ts, rs, go, rb, java)"] -->|tree-sitter| B["Symbols + Edges"]
    B -->|write| C[".cartog.db<br/>(SQLite)"]
    C -->|query| D["search / refs / impact<br/>outline / callees / hierarchy"]
    C -->|embed locally| E["ONNX embeddings<br/>(sqlite-vec)"]
    E -->|query| F["rag search<br/>(FTS5 + vector KNN + reranker)"]
```
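The `rag search` path merges a keyword ranking (FTS5) with a vector-KNN ranking. Reciprocal Rank Fusion, the merge the pipeline names, can be sketched like this (k = 60 is the conventional default; cartog's exact constant isn't stated here):

```python
def rrf_merge(keyword: list[str], vector: list[str], k: int = 60) -> list[str]:
    """Merge two ranked result lists with Reciprocal Rank Fusion (RRF)."""
    scores: dict[str, float] = {}
    for ranking in (keyword, vector):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fts5 = ["validate_token", "validate_session", "parse_token"]  # keyword hits
knn  = ["check_credentials", "validate_token"]                # semantic hits
print(rrf_merge(fts5, knn)[0])  # validate_token: it appears in both lists
```

Documents found by both searches accumulate two reciprocal-rank scores, so agreement between the keyword and semantic sides floats a result to the top.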
- Index — walks your project, parses each file with tree-sitter, extracts symbols (functions, classes, methods, imports, variables) and edges (calls, imports, inherits, raises, type references)
- Store — writes everything to a local `.cartog.db` SQLite file
- Resolve (heuristic) — links edges by name with scope-aware matching (same file > import path > same directory > unique project match)
- Resolve (LSP, optional) — for edges the heuristic couldn't resolve, sends `textDocument/definition` to language servers for compiler-grade precision. Results persist in the DB.
- Embed (optional) — generates vector embeddings locally with ONNX Runtime (`BAAI/bge-small-en-v1.5`), stored in sqlite-vec
- Query — instant lookups against the pre-computed graph, hybrid FTS5 + vector search with RRF merge and cross-encoder re-ranking
Re-indexing is incremental: git diff + SHA-256 skips unchanged files, and Merkle-tree diffing within changed files updates only modified symbols. cartog watch automates this on file changes.
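The file-level skip amounts to a content-hash cache; a simplified sketch (cartog additionally diffs at symbol granularity via Merkle trees):

```python
import hashlib
from pathlib import Path

def changed_files(paths: list[Path], cache: dict[str, str]) -> list[Path]:
    """Return only files whose content hash differs from the last run,
    updating the cache as a side effect."""
    changed = []
    for p in paths:
        digest = hashlib.sha256(p.read_bytes()).hexdigest()
        if cache.get(str(p)) != digest:  # new file or modified content
            cache[str(p)] = digest
            changed.append(p)
    return changed
```

On a re-index, unchanged files hash to the cached digest and are skipped; only the survivors are re-parsed.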
Everything runs on your machine. No API keys. No cloud endpoints. No telemetry. Your code stays local.
cartog runs as an MCP server, exposing 12 tools (10 core + 2 RAG) over stdio.
```sh
# Claude Code
claude mcp add cartog -- cartog serve

# With live re-indexing
claude mcp add cartog -- cartog serve --watch --rag

# Cursor — add to .cursor/mcp.json
# Windsurf — add to ~/.codeium/windsurf/mcp_config.json
# OpenCode — add to .opencode.json
# Zed — add to ~/.config/zed/settings.json
```

Common config (JSON):
```json
{
  "mcpServers": {
    "cartog": {
      "command": "cartog",
      "args": ["serve", "--watch", "--rag"]
    }
  }
}
```

See Usage — MCP Server for per-client installation details.
Install cartog as an Agent Skill for Claude Code, Cursor, Copilot, and other compatible agents:
```sh
npx skills add jrollin/cartog
```

Or install manually:

```sh
cp -r skills/cartog ~/.claude/skills/
```

The skill teaches your AI agent when and how to use cartog — including search routing (`rag search` as default, structural search for refs/callees/impact), refactoring workflows, and when to fall back to grep. See Agent Skill for details.
cartog is designed for air-gapped and privacy-conscious environments:
- Parsing: tree-sitter runs in-process, no external calls
- Storage: SQLite file in your project directory (`.cartog.db`)
- Embeddings: ONNX Runtime inference, models cached locally (`~/.cache/cartog/models/`)
- Re-ranking: cross-encoder runs locally via ONNX, no API
- MCP server: communicates over stdio only, no network sockets
- No telemetry, no analytics, no phone-home of any kind
Your code never leaves your machine. Not during indexing, not during search, not ever.
| Language | Extensions | Symbols | Edges |
|---|---|---|---|
| Python | .py, .pyi | functions, classes, methods, imports, variables | calls, imports, inherits, raises, type refs |
| TypeScript | .ts, .tsx | functions, classes, methods, imports, variables | calls, imports, inherits, type refs, new |
| JavaScript | .js, .jsx, .mjs, .cjs | functions, classes, methods, imports, variables | calls, imports, inherits, new |
| Rust | .rs | functions, structs, traits, impls, imports | calls, imports, inherits (trait impl), type refs |
| Go | .go | functions, structs, interfaces, imports | calls, imports, type refs |
| Ruby | .rb | functions, classes, modules, imports | calls, imports, inherits, raises, rescue types |
| Java | .java | classes, interfaces, enums, methods, imports, variables | calls, imports, inherits, raises, type refs, new |
Indexing: 69 files / 4k LOC in 95ms (Python fixture, release build). Incremental re-index skips unchanged files.
Query latency (criterion benchmarks on the same fixture):
| Query type | Latency |
|---|---|
| outline | 8-14 µs |
| hierarchy | 8-9 µs |
| deps | 25 µs |
| stats | 32 µs |
| search | 81-102 µs |
| callees | 177-180 µs |
| refs | 258-471 µs |
| impact (depth 3) | 2.7-17 ms |
cartog uses a two-tier resolution strategy. The heuristic pass runs instantly; LSP is optional and adds precision.
| Project type | Language | Heuristic only | With LSP | Time (LSP) |
|---|---|---|---|---|
| TS microservice (230 files) | TypeScript | 37% | 81% | 13s |
| Vue.js SPA (739 files) | Vue/TS/JS | 31% | 72% | 25s |
| Rust CLI (358 files) | Rust | 25% | 44% | 72s |
Remaining unresolved edges are mostly calls to external libraries (std, node_modules, crates) — definitions outside the project boundary.
When to use LSP: before a major refactoring, when refs or impact seem incomplete.
When to skip (--no-lsp): day-to-day exploration, post-change verification, watch mode.
- Two-tier resolution — fast heuristic pass (~1s) for daily use, optional LSP for precision refactoring. Results persist in SQLite — pay the LSP cost once.
- Self-contained — single binary, all dependencies compiled in. LSP is opt-in via language servers already on your PATH.
- Incremental — git diff + SHA-256 per file, Merkle-tree diff per symbol. Stable IDs survive line movements.
- Local-first — embedding models run via ONNX Runtime on your CPU. Slower than API calls, but your code stays private.
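One way to get IDs that survive line movements is to hash what a symbol is rather than where it sits; a sketch of the idea (cartog's actual ID scheme isn't specified here):

```python
import hashlib

def symbol_id(file: str, kind: str, qualified_name: str) -> str:
    """Derive an ID from identity, not position: moving the symbol
    to another line in the same file leaves the ID unchanged."""
    key = f"{file}\x00{kind}\x00{qualified_name}".encode()
    return hashlib.sha256(key).hexdigest()[:16]

before = symbol_id("auth/tokens.py", "function", "validate_token")  # defined at line 30
after  = symbol_id("auth/tokens.py", "function", "validate_token")  # moved to line 90
assert before == after
```

Line numbers then live alongside the ID as mutable metadata, so edges keep pointing at the right symbol across edits.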
MIT