📡 Newscrux

AI-powered news aggregator with structured multilingual summaries and push notifications



What It Does

Newscrux monitors 13 AI/ML RSS feeds, filters articles by relevance using AI, extracts full article content when needed, generates structured summaries in your chosen language, and delivers them as rich push notifications to your phone via Pushover.

Every notification tells you what happened, why it matters, and one key detail — in English, Turkish, German, French, or Spanish.


Notification Examples

English (--lang=en):

Title: OpenAI announces enterprise agent toolkit

📰 TechCrunch AI

What happened: OpenAI released a new suite of tools for building
enterprise-grade autonomous agents, including improved function
calling, a persistent memory API, and a new orchestration layer.

Why it matters: This could significantly accelerate agent-based
automation in large organizations by reducing integration complexity.

💡 Initial access is being rolled out to select enterprise customers.

Turkish (--lang=tr):

Title: AGI'ye doğru ilerlemeyi ölçmek: Bilişsel bir çerçeve

📰 Google DeepMind

Ne oldu: Google DeepMind, yapay genel zeka (AGI) yolunda ilerlemeyi
değerlendirmek için bilişsel bilim temelli bir çerçeve yayınladı.
10 temel bilişsel yeteneği tanımlıyor ve AI sistemlerinin yeteneklerini
sınıflandırmaya yönelik bir taksonomi sunuyor.

Neden önemli: Bu çerçeve, AI sistemlerinin genel zeka yeteneklerini
bilişsel perspektiften değerlendirmek için ortak bir temel sağlayabilir.

💡 200.000 dolar ödüllü Kaggle hackathonu başlatıldı.

Features

  • 🌍 5 languages — English, Turkish, German, French, Spanish via --lang flag
  • 🧠 Structured summaries — What happened + Why it matters + Key detail, generated by DeepSeek
  • 📰 13 RSS sources — OpenAI, Google AI, DeepMind, TechCrunch, arXiv, and more
  • 🔍 AI relevance filtering — Only delivers news that matters; irrelevant articles are dropped before summarization
  • 📄 Hybrid content extraction — RSS snippet first, full-text scraping (via cheerio) when snippet is too short
  • Article state pipeline — discovered → enriched → summarized → sent, with persistence
  • 🔒 No data loss — Atomic queue writes, retry on transient failure, articles survive restarts
  • 📊 Operational metrics — Per-cycle stats logged (discovered, enriched, sent, failed, truncated)
  • 🏷️ Feed typing — High-priority official blogs (official_blog) bypass the relevance filter automatically
  • 🔁 Cross-source deduplication — Title similarity check prevents the same story from multiple sources
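
The cross-source deduplication idea can be sketched as follows. This is a minimal illustration, not the project's code: the README does not say which similarity measure Newscrux uses, so the Jaccard word-overlap score, the 0.8 cutoff, and the helper names `tokenize`, `titleSimilarity`, and `dedupeByTitle` are all assumptions.

```typescript
// Sketch only: the similarity measure (Jaccard over word sets), the cutoff,
// and all function names are assumptions, not taken from the Newscrux source.

/** Lowercase a title and return its unique ASCII word tokens. */
function tokenize(title: string): string[] {
  const words = title
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0);
  return words.filter((w, i) => words.indexOf(w) === i);
}

/** Jaccard similarity of two titles: shared words / all distinct words (0..1). */
function titleSimilarity(a: string, b: string): number {
  const ta = tokenize(a);
  const tb = tokenize(b);
  if (ta.length === 0 && tb.length === 0) return 1;
  const shared = ta.filter((w) => tb.indexOf(w) !== -1).length;
  return shared / (ta.length + tb.length - shared);
}

/** Keep the first article seen for each group of near-identical titles. */
function dedupeByTitle<T extends { title: string }>(items: T[], cutoff = 0.8): T[] {
  const kept: T[] = [];
  for (const item of items) {
    const isDuplicate = kept.some(
      (k) => titleSimilarity(k.title, item.title) >= cutoff
    );
    if (!isDuplicate) kept.push(item);
  }
  return kept;
}
```

A word-overlap score is cheap and needs no model call, which suits a step that runs before the relevance filter. It only splits on ASCII letters and digits, so it would need a different tokenizer for non-English feed titles.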

Quick Start

git clone https://github.com/alicankiraz1/newscrux.git
cd newscrux
npm install
cp .env.example .env        # Edit with your API keys
npm run build
npm start -- --lang=en      # or: tr, de, fr, es

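Prerequisites: Node.js with npm, an OpenRouter API key, and a Pushover user key and app token (the required entries under Environment Variables).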


Architecture

RSS Feeds (13 sources)
        │
        ▼
  Fetch + Parse
        │
        ▼
  Cross-source Dedup (title similarity)
        │
        ▼
  Discover → Queue (persistent JSON)
        │
        ├─ high priority (official_blog) ────────────────────┐
        │                                                     │
        ▼                                                     │
  Relevance Filter (AI scores 1-10)                          │
  Drop below threshold                                        │
        │                                                     │
        └─────────────────────────────────────────────────── ▼
                                                   Enrich (snippet or scrape)
                                                             │
                                                             ▼
                                                   Summarize (DeepSeek JSON)
                                                             │
                                                             ▼
                                                   Render Notification
                                                   (HTML, smart truncation)
                                                             │
                                                             ▼
                                                   Send via Pushover
                                                             │
                                                             ▼
                                                   Mark Sent in Queue
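
The branch in the diagram, where high-priority articles skip the relevance filter and everything else must meet the threshold, can be sketched like this. The type and function names (`QueuedArticle`, `passesRelevanceGate`, `nextState`) are hypothetical; only the state names, the priority values, and the 1-10 score range come from this README.

```typescript
// Sketch only: names and field layout are assumptions for illustration.

type ArticleState = "discovered" | "enriched" | "summarized" | "sent";
type Priority = "high" | "normal";

interface QueuedArticle {
  title: string;
  priority: Priority;
  state: ArticleState;
  relevanceScore?: number; // AI score in the range 1-10, if already scored
}

/** High-priority articles bypass the filter; others need score >= threshold. */
function passesRelevanceGate(article: QueuedArticle, threshold: number): boolean {
  if (article.priority === "high") return true;
  return article.relevanceScore !== undefined && article.relevanceScore >= threshold;
}

const PIPELINE: ArticleState[] = ["discovered", "enriched", "summarized", "sent"];

/** Return the state that follows `state`; "sent" is terminal and maps to itself. */
function nextState(state: ArticleState): ArticleState {
  const i = PIPELINE.indexOf(state);
  if (i === -1 || i === PIPELINE.length - 1) return state;
  return PIPELINE[i + 1];
}
```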

Supported Languages

| Code | Language | Notification labels |
| --- | --- | --- |
| en | English | "What happened:" / "Why it matters:" / "Read More" |
| tr | Turkish | "Ne oldu:" / "Neden önemli:" / "Devamını Oku" |
| de | German | "Was passiert ist:" / "Warum es wichtig ist:" / "Weiterlesen" |
| fr | French | "Ce qui s'est passé :" / "Pourquoi c'est important :" / "Lire la suite" |
| es | Spanish | "Qué pasó:" / "Por qué importa:" / "Leer más" |

Each language pack includes a full AI system prompt in that language, feed kind labels, and all notification UI strings. The AI model produces translated_title, what_happened, why_it_matters, and key_detail in the selected language.
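
Because the model returns these fields as JSON, a caller will usually want to validate the reply before rendering it. The sketch below shows one way to do that; `StructuredSummary` and `parseSummary` are hypothetical names, and the check covers only the four fields listed in the paragraph above.

```typescript
// Sketch only: field names come from the README; the validation approach
// and the function name are assumptions.

interface StructuredSummary {
  translated_title: string;
  what_happened: string;
  why_it_matters: string;
  key_detail: string;
}

/** Parse a raw model reply; return null if it is not JSON or a field is missing/empty. */
function parseSummary(raw: string): StructuredSummary | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const record = data as Record<string, unknown>;

  // Read one field, accepting only non-blank strings.
  const pick = (field: string): string | null => {
    const value = record[field];
    return typeof value === "string" && value.trim() !== "" ? value : null;
  };

  const translated_title = pick("translated_title");
  const what_happened = pick("what_happened");
  const why_it_matters = pick("why_it_matters");
  const key_detail = pick("key_detail");
  if (
    translated_title === null ||
    what_happened === null ||
    why_it_matters === null ||
    key_detail === null
  ) {
    return null;
  }
  return { translated_title, what_happened, why_it_matters, key_detail };
}
```

Returning `null` instead of throwing lets the caller treat a malformed reply like any other transient failure and leave the article in the queue for a later retry.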


Configuration

CLI Options

| Flag | Description | Default |
| --- | --- | --- |
| --lang <code>, -l <code> | Summary language: en, tr, de, fr, es | en |
| --help, -h | Show help message and exit | |
| --version, -v | Show version number and exit | |

Examples:

newscrux --lang=tr      # Start with Turkish summaries
newscrux -l de          # Start with German summaries
newscrux                # Start with English summaries (default)
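
The language flag handling shown above can be sketched as a small pure function. This is an illustration under assumptions, not the project's parser: `parseLang` and `isLang` are hypothetical names, and the choice to throw on an unsupported code is a guess at reasonable behavior.

```typescript
// Sketch only: mirrors the documented flag forms (--lang=<code>, --lang <code>,
// -l <code>) and the "en" default. Names and error behavior are assumptions.

const SUPPORTED_LANGS = ["en", "tr", "de", "fr", "es"] as const;
type Lang = (typeof SUPPORTED_LANGS)[number];

function isLang(value: string): value is Lang {
  return (SUPPORTED_LANGS as readonly string[]).indexOf(value) !== -1;
}

/** Resolve the summary language from argv-style arguments. */
function parseLang(args: string[]): Lang {
  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    let value: string | undefined = undefined;
    if (arg.indexOf("--lang=") === 0) {
      value = arg.slice("--lang=".length);
    } else if (arg === "--lang" || arg === "-l") {
      value = args[i + 1]; // undefined when the flag is the last argument
    }
    if (value !== undefined) {
      if (!isLang(value)) {
        throw new Error(`Unsupported language: ${value}`);
      }
      return value;
    }
  }
  return "en"; // documented default
}
```

In a Node.js entry point this would typically be called as `parseLang(process.argv.slice(2))`.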

Environment Variables (.env)

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OPENROUTER_API_KEY | Yes | | OpenRouter API key |
| PUSHOVER_USER_KEY | Yes | | Pushover user key |
| PUSHOVER_APP_TOKEN | Yes | | Pushover app token |
| OPENROUTER_MODEL | No | deepseek/deepseek-v3.2-speciale | AI model for summarization |
| POLL_INTERVAL_MINUTES | No | 15 | Minutes between feed polls |
| MAX_ARTICLES_PER_POLL | No | 10 | Max regular articles processed per cycle |
| ARXIV_MAX_PER_POLL | No | 15 | Max arXiv papers processed per cycle |
| RELEVANCE_THRESHOLD | No | 6 | Minimum AI relevance score (1–10) |
| LOG_LEVEL | No | info | Log verbosity: debug, info, warn, error |
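
As a rough sketch of how these variables and defaults might be read at startup, the function below takes an environment map and returns a typed config object. The names `AppConfig`, `Env`, and `loadConfig` are hypothetical; the variable names and default values are copied from the table above.

```typescript
// Sketch only: not the project's config loader. Variable names and defaults
// follow the table above; everything else is an assumption.

interface AppConfig {
  openRouterApiKey: string;
  pushoverUserKey: string;
  pushoverAppToken: string;
  model: string;
  pollIntervalMinutes: number;
  maxArticlesPerPoll: number;
  arxivMaxPerPoll: number;
  relevanceThreshold: number;
  logLevel: string;
}

type Env = Record<string, string | undefined>;

/** Read a required variable or fail with a clear message. */
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

/** Read an optional integer variable, falling back to the documented default. */
function intVar(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!isFinite(parsed) || Math.floor(parsed) !== parsed) {
    throw new Error(`${name} must be an integer, got "${raw}"`);
  }
  return parsed;
}

function loadConfig(env: Env): AppConfig {
  return {
    openRouterApiKey: requireVar(env, "OPENROUTER_API_KEY"),
    pushoverUserKey: requireVar(env, "PUSHOVER_USER_KEY"),
    pushoverAppToken: requireVar(env, "PUSHOVER_APP_TOKEN"),
    model: env["OPENROUTER_MODEL"] || "deepseek/deepseek-v3.2-speciale",
    pollIntervalMinutes: intVar(env, "POLL_INTERVAL_MINUTES", 15),
    maxArticlesPerPoll: intVar(env, "MAX_ARTICLES_PER_POLL", 10),
    arxivMaxPerPoll: intVar(env, "ARXIV_MAX_PER_POLL", 15),
    relevanceThreshold: intVar(env, "RELEVANCE_THRESHOLD", 6),
    logLevel: env["LOG_LEVEL"] || "info",
  };
}
```

Passing the environment in as an argument (for example `loadConfig(process.env)`) keeps the function easy to test without touching real API keys.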

RSS Sources

| Source | Type | Priority |
| --- | --- | --- |
| OpenAI News | official_blog | high (bypasses filter) |
| Google AI Blog | official_blog | high (bypasses filter) |
| Google DeepMind | official_blog | high (bypasses filter) |
| Hugging Face Blog | official_blog | normal |
| TechCrunch AI | media | normal |
| MIT Technology Review AI | media | normal |
| The Verge AI | media | normal |
| Ars Technica | media | normal |
| arXiv cs.CL | research | normal |
| arXiv cs.LG | research | normal |
| arXiv cs.AI | research | normal |
| Import AI | newsletter | normal |
| Ahead of AI | newsletter | normal |

To add or remove feeds, edit the feeds array in src/config.ts.
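
The actual shape of the feeds array is not shown in this README, so the following is only a guess at what an entry could look like. The field names (`name`, `url`, `kind`, `priority`), the `FeedConfig` type, the `addFeed` helper, and the example.com URLs are all hypothetical; the type and priority values are the ones used in the table above. Check src/config.ts for the real structure before editing.

```typescript
// Sketch only: a hypothetical feed entry shape. Consult src/config.ts for the
// real field names before copying any of this.

type FeedKind = "official_blog" | "media" | "research" | "newsletter";
type FeedPriority = "high" | "normal";

interface FeedConfig {
  name: string;
  url: string;
  kind: FeedKind;
  priority: FeedPriority;
}

const exampleFeeds: FeedConfig[] = [
  {
    name: "Example Lab Blog",
    url: "https://example.com/lab/feed.xml",
    kind: "official_blog",
    priority: "high",
  },
  {
    name: "Example Newsletter",
    url: "https://example.com/newsletter/rss",
    kind: "newsletter",
    priority: "normal",
  },
];

/** Return a new array with `feed` appended, rejecting a URL that is already present. */
function addFeed(feeds: FeedConfig[], feed: FeedConfig): FeedConfig[] {
  if (feeds.some((f) => f.url === feed.url)) {
    throw new Error(`Feed already configured: ${feed.url}`);
  }
  return feeds.concat([feed]);
}
```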


Deployment

Raspberry Pi / Linux server (systemd)

# 1. Clone and build
git clone https://github.com/alicankiraz1/newscrux.git ~/newscrux
cd ~/newscrux
npm install
cp .env.example .env
nano .env                                       # fill in your API keys
npm run build

# 2. Install and configure service
mkdir -p ~/.config/systemd/user                 # create the user unit directory if it does not exist
cp newscrux.service ~/.config/systemd/user/
nano ~/.config/systemd/user/newscrux.service    # adjust --lang flag if needed

# 3. Enable and start (user-level systemd)
systemctl --user daemon-reload
systemctl --user enable newscrux
systemctl --user start newscrux

# 4. View live logs
journalctl --user -u newscrux -f

Note: The service file uses %h (systemd home directory specifier) so paths are automatically resolved to your home directory. No root access needed.


How It Works

  1. Fetch — Polls all 13 RSS feeds every 15 minutes (configurable) using rss-parser
  2. Deduplicate — Cross-source title similarity check prevents the same story from appearing twice
  3. Discover — New articles are added to a persistent JSON queue (data/article-queue.json) with state discovered
  4. Filter — AI scores each article's relevance 1–10; articles below the threshold are dropped before any summarization cost is incurred. Sources marked high priority in the RSS Sources table bypass this step entirely.
  5. Enrich — Checks RSS snippet length; if shorter than 300 characters, scrapes the full article using cheerio. Content is capped at 3,000 characters for the summarizer.
  6. Summarize — Sends article content to DeepSeek (via OpenRouter) with a structured JSON prompt in the selected language. Output: translated_title, what_happened, why_it_matters, key_detail, source_type.
  7. Render — Builds the Pushover notification message with HTML formatting and smart truncation to stay within the 1,024-character limit.
  8. Send — POSTs the notification to the Pushover API. The article is only marked sent after a confirmed successful delivery.
  9. Retry — Articles that fail enrichment, summarization, or sending remain in the queue as failed and are retried on the next cycle.
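
The size rules in steps 5 and 7 can be sketched as three small helpers. The numeric limits (300, 3,000, and 1,024 characters) come from the steps above; the function names and the word-boundary truncation strategy are assumptions, since the README does not describe how its "smart truncation" works. This version treats the message as plain text and does not try to keep HTML tags balanced.

```typescript
// Sketch only: limits are from the README, the truncation strategy is assumed.

const SNIPPET_MIN_CHARS = 300; // shorter snippets trigger a full-article scrape
const CONTENT_MAX_CHARS = 3000; // cap on text handed to the summarizer
const PUSHOVER_MAX_CHARS = 1024; // notification message length limit

/** True when the RSS snippet is too short and the full article should be scraped. */
function needsScrape(snippet: string): boolean {
  return snippet.trim().length < SNIPPET_MIN_CHARS;
}

/** Cap article content at the summarizer limit. */
function capContent(text: string): string {
  return text.length > CONTENT_MAX_CHARS ? text.slice(0, CONTENT_MAX_CHARS) : text;
}

/**
 * Shorten a message to at most `limit` characters, cutting at the last space
 * before the limit and appending an ellipsis. Plain text only: HTML tags
 * that straddle the cut are not repaired.
 */
function truncateMessage(message: string, limit: number = PUSHOVER_MAX_CHARS): string {
  if (message.length <= limit) return message;
  const ellipsis = "…";
  const hard = message.slice(0, limit - ellipsis.length);
  const lastSpace = hard.lastIndexOf(" ");
  const cut = lastSpace > 0 ? hard.slice(0, lastSpace) : hard;
  return cut + ellipsis;
}
```

Cutting at a word boundary avoids ending a notification mid-word; a production renderer that emits HTML would also need to close any tags left open by the cut.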

Contributing

See CONTRIBUTING.md for how to add languages, submit fixes, or suggest features.


Author

Alican Kiraz



License

MIT — see LICENSE
