Data Crawler & Automation Engineer building production scraping systems and open-source extraction tools.
From closed APIs → reverse-engineered protocols → pip-installable packages → data at scale.
| Area | What you can expect |
|---|---|
| API Reverse Engineering | Decode private protocols (protobuf, GraphQL, internal REST) → direct HTTP extraction, no browser needed, about 50x faster than browser-based scraping |
| Anti-Bot Evasion | Bypass Cloudflare, Shape Security, Imperva (Incapsula), DataDome using anti-detect browsers, TLS fingerprinting & ISP proxies |
| Scalable Data Pipelines | Async scraping at 100K+ records/week • proxy rotation • checkpoint resumption • structured output |
| Open-Source Tooling | Production-ready pip packages with full docs, streaming APIs, event systems & CLI interfaces |
| Government Data Collection | Automated collection across 23 US states: business registrations & professional licenses from government portals |

| Project | What it does | Link |
|---|---|---|
| GoogleMapsCollector | Reverse-engineers Google Maps' internal protobuf API — 100K+ records/week, no browser, no API key | Repo · pip install gmaps-extractor |
| MetaAdsCollector | Reverse-engineers Meta's private GraphQL API — full Ad Library extraction across all countries | Repo · pip install meta-ads-collector |
| google-maps-pb-decoder | Protobuf decoder for Google Maps' binary wire format — research & extraction toolkit | Repo |
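The binary wire format mentioned for google-maps-pb-decoder is standard protobuf encoding, which can be walked without a `.proto` schema. The block below is a minimal sketch of that idea, not code from the package: `read_varint` and `iter_fields` are hypothetical names, and nested messages are returned as raw bytes rather than decoded recursively.

```python
from typing import Iterator, Tuple, Union


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry payload
        if not byte & 0x80:               # high bit clear: last byte
            return result, pos
        shift += 7
        if shift >= 64:
            raise ValueError("varint too long")


def iter_fields(buf: bytes) -> Iterator[Tuple[int, int, Union[int, bytes]]]:
    """Yield (field_number, wire_type, value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        value: Union[int, bytes]
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit, returned as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited: string, bytes, or sub-message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # fixed 32-bit, returned as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("truncated field")
        yield field_number, wire_type, value


# Field 1 = varint 150, field 2 = the bytes b"testing"
sample = bytes.fromhex("089601120774657374696e67")
print(list(iter_fields(sample)))  # [(1, 0, 150), (2, 2, b'testing')]
```

A length-delimited value that itself parses cleanly with `iter_fields` is often a nested message, which is the usual heuristic when no schema is available.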

| Project | What it does | Link |
|---|---|---|
| generic-scraper-1 | LLM-powered structured extraction from any website — define fields, get data, no selectors needed | Repo · pip install scraper |
| linkedin-profile-extractor | LinkedIn profile extraction with anti-detection — experience, education, skills, full profiles | Repo |
| google-maps-scraper | Google Maps scraping via browser automation with stealth mode | Repo |

| Project | What it does | Link |
|---|---|---|
| gov_websites_collector | Collects business registrations & professional licenses from 23 US state government websites, using the Camoufox anti-detect browser and ISP proxies | Repo |

| Category | Tools |
|---|---|
| Languages & Core | |
| Scraping & Automation | |
| Reverse Engineering | |
| Backend & APIs | |
| Databases | |
| Infrastructure | |
- 🔓 API Reverse Engineering: Decode closed/private APIs (protobuf, GraphQL, internal REST) for direct, fast data extraction
- 🛡️ Anti-Bot Evasion: Defeat Cloudflare, Imperva, Shape Security, DataDome — TLS fingerprinting, anti-detect browsers, ISP proxies
- ⚡ Scalable Scraping Systems: Async pipelines, proxy rotation, checkpoint resumption — 100K+ records/week on a single machine
- 📦 Open-Source Tooling: Production-ready pip packages with streaming APIs, event systems, and full documentation
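
The async pipeline, proxy rotation, and checkpoint resumption listed above can be combined in a few lines of standard-library Python. The block below is a rough sketch under stated assumptions, not the code behind any of the listed projects: `run_pipeline`, `fake_fetch`, the proxy names, and the JSON checkpoint layout are all hypothetical, and `fake_fetch` stands in for a real HTTP call.

```python
import asyncio
import itertools
import json
from pathlib import Path
from typing import Awaitable, Callable, Iterable, List, Set


async def fake_fetch(task_id: str, proxy: str) -> dict:
    """Stand-in for a real HTTP request routed through `proxy` (hypothetical)."""
    await asyncio.sleep(0)
    return {"id": task_id, "proxy": proxy}


async def run_pipeline(
    tasks: Iterable[str],
    fetch: Callable[[str, str], Awaitable[dict]],
    proxies: List[str],
    checkpoint: Path,
    concurrency: int = 5,
) -> List[dict]:
    """Fetch every task not already recorded in the checkpoint file."""
    if not proxies:
        raise ValueError("at least one proxy is required")

    # Resume: load the IDs completed by an earlier run, if any.
    done: Set[str] = set()
    if checkpoint.exists():
        done = set(json.loads(checkpoint.read_text()))

    proxy_cycle = itertools.cycle(proxies)  # round-robin rotation
    sem = asyncio.Semaphore(concurrency)    # cap on in-flight requests
    results: List[dict] = []

    async def worker(task_id: str) -> None:
        proxy = next(proxy_cycle)
        async with sem:
            record = await fetch(task_id, proxy)
        results.append(record)
        done.add(task_id)
        # Persist progress after each record so a crash loses little work.
        checkpoint.write_text(json.dumps(sorted(done)))

    pending = [t for t in tasks if t not in done]
    await asyncio.gather(*(worker(t) for t in pending))
    return results


if __name__ == "__main__":
    cp = Path("checkpoint.json")
    out = asyncio.run(run_pipeline(["a", "b", "c"], fake_fetch, ["proxy-1", "proxy-2"], cp))
    print(len(out), "records fetched")
```

Rewriting the whole checkpoint after every record keeps the sketch short; at larger volumes an append-only log or a database table would likely be a better fit.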
Best contact: LinkedIn
If you find the work useful, a ⭐ helps more people discover it.