Social Sentinel
January 7, 2026
Tech/AI news aggregator and real-time social media dashboard. Pulls from 60+ RSS sources with sentiment analysis, AI-generated briefings, live pulse charts, and trend tracking.
What is it?
I built this as a real-time tech and AI news dashboard that pulls from RSS feeds, subreddits, and Twitter/X, then pushes everything live to the browser. I wanted one place where I could see fresh updates, sentiment signals, and AI-written summaries without manually checking ten different sites.
The system combines scraping, background collection, sentiment analysis, briefing generation, and live delivery. That mix is what made it fun. It was not just a news page. It was a streaming data pipeline with a frontend on top.
Twitter without an API key
This was one of the messier parts technically. Twitter’s official API is restrictive and expensive, but the website still has to talk to its own backend somehow. So I used a library that authenticates with stored cookies and hits the same GraphQL endpoints the browser uses.
The annoying part is that those internals are not stable. Query IDs change, frontend bundles change, rate limits kick in, and scraping can break silently if you are not careful. So I built the collector to re-discover the current GraphQL identifiers, slow itself down between requests, and back off when it starts seeing HTTP 429 (Too Many Requests) responses. It is not pretty, but it works.
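To make the pacing and back-off idea concrete, here is a minimal sketch using only the standard library. It is not the collector's actual code: the names `backoff_delay` and `fetch_with_backoff`, the delay values, and the shape of the `fetch` callable (returning a status code and a body) are all assumptions made for illustration.

```python
import random
import time
from typing import Callable, Optional, Tuple


def backoff_delay(attempt: int, base: float = 2.0, cap: float = 300.0,
                  retry_after: Optional[float] = None) -> float:
    """Seconds to wait before retrying; `attempt` is 0-based."""
    if retry_after is not None:
        # If the server suggested a wait, respect it (bounded by the cap).
        return min(cap, max(0.0, retry_after))
    # Exponential growth with "full jitter": pick a random point below the
    # ceiling so several workers do not retry in lockstep.
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)


def fetch_with_backoff(fetch: Callable[[], Tuple[int, str]],
                       max_attempts: int = 5, pace: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Call `fetch` (hypothetical: returns (status, body)) until it succeeds."""
    for attempt in range(max_attempts):
        status, body = fetch()
        if status == 200:
            sleep(pace)  # fixed gap so the next request is not immediate
            return body
        if status == 429:
            sleep(backoff_delay(attempt))  # rate limited: wait, then retry
            continue
        raise RuntimeError(f"unexpected status {status}")
    raise RuntimeError("still rate limited after retries")
```

Passing `sleep` in as a parameter keeps the function easy to exercise without actually waiting. A real scraper would also need to handle network errors and expired cookies, which this sketch leaves out.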
The SQLite to PostgreSQL migration
I started with SQLite because it was the fastest way to get moving. That was fine at first, but once I deployed on Cloud Run, the limits became obvious. The local filesystem is ephemeral, and that makes SQLite a bad long-term fit for data I actually care about keeping.
So I migrated the project to PostgreSQL on Cloud SQL. I wrote migration scripts, rebuilt the schema cleanly, and moved the app over to a setup that made sense for a deployed service. It was a good reminder that a tool can be correct for v1 and still be the wrong tool for production.
Data collection and the API key rotation
The collection layer runs across multiple sources. RSS and Reddit pulls are much simpler than Twitter, so I let those feed a shared pipeline that deduplicates articles using a stable hash of the URL. Once I have fresh items, I can enrich them instead of storing raw noise.
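Deduplicating by URL hash could look roughly like the sketch below. The normalization rules (lowercasing the host, dropping `utm_` parameters, trimming a trailing slash) and the names `normalize_url`, `url_key`, and `dedupe` are my assumptions here, not a description of the exact pipeline. The one detail worth calling out is the use of `hashlib` rather than Python's built-in `hash()`, which is salted per process for strings and so is not stable across restarts.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking-parameter prefixes to ignore when comparing URLs.
TRACKING_PREFIXES = ("utm_",)


def normalize_url(url: str) -> str:
    """Reduce a URL to a canonical form so trivial variants compare equal."""
    parts = urlsplit(url.strip())
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not k.lower().startswith(TRACKING_PREFIXES)
    ]
    path = parts.path.rstrip("/") or "/"
    # Sort the remaining query parameters and drop the fragment entirely.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path,
         urlencode(sorted(query)), "")
    )


def url_key(url: str) -> str:
    """Stable hex digest of the normalized URL, usable as a unique key."""
    return hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()


def dedupe(items, seen):
    """Return only items whose URL key is new; `seen` is updated in place."""
    fresh = []
    for item in items:
        key = url_key(item["url"])
        if key in seen:
            continue
        seen.add(key)
        fresh.append(item)
    return fresh
```

In a deployed service the `seen` set would more likely be a unique column in the database than an in-memory set, so that duplicates are still caught after a restart.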
For the LLM layer, I rotated across multiple Gemini keys because repeated summary generation runs into per-key rate limits quickly. That rotation kept the pipeline moving without turning one key into a single point of failure. Once new content is ready, the backend pushes it over WebSockets so the browser updates immediately instead of polling.
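A key rotation scheme along these lines can be sketched as a small round-robin class with a cooldown. Everything here is illustrative: `KeyRotator`, its method names, and the 60-second cooldown are assumptions, and the sketch deliberately makes no calls to the Gemini API itself.

```python
import time
from typing import Callable, Dict, List


class KeyRotator:
    """Round-robin over API keys, skipping any that were recently rate limited."""

    def __init__(self, keys: List[str], cooldown: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if not keys:
            raise ValueError("at least one key is required")
        self._keys = list(keys)
        self._cooldown = cooldown
        self._clock = clock
        self._blocked_until: Dict[str, float] = {}
        self._next = 0  # index of the key to try first on the next call

    def acquire(self) -> str:
        """Return the next usable key, or raise if every key is cooling down."""
        now = self._clock()
        for offset in range(len(self._keys)):
            idx = (self._next + offset) % len(self._keys)
            key = self._keys[idx]
            if self._blocked_until.get(key, float("-inf")) <= now:
                self._next = (idx + 1) % len(self._keys)
                return key
        raise RuntimeError("all keys are cooling down")

    def mark_rate_limited(self, key: str) -> None:
        """Take a key out of rotation for `cooldown` seconds."""
        self._blocked_until[key] = self._clock() + self._cooldown
```

The caller would wrap each LLM request with `acquire()` and call `mark_rate_limited(key)` when the provider rejects that request for quota reasons, so the next request moves on to a different key.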
Deployment
The deployed version runs on Cloud Run with more memory than most of my other projects because scraping state, sentiment processing, and LLM context all compete for RAM. The app does not need much CPU at any one moment, but it holds a lot of state in memory over time.
The frontend lives separately on Netlify, which kept the UI deployment lightweight while the backend handled the heavier collection work. I liked that split because it kept the hosting decisions aligned with what each layer actually needed.
Key takeaways
- Twitter scraping without API: cookie auth, GraphQL endpoint discovery, runtime patching for frontend changes
- SQLite to PostgreSQL migration: when to upgrade, migration scripts, Cloud Run persistence constraints
- Multiple Gemini API key rotation for rate limit management
- feedparser for RSS/Atom: handling malformed feeds, deduplication by URL hash
- WebSocket fan-out in FastAPI: connection management, push on new data
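The WebSocket fan-out item is easiest to see in code. Below is a minimal sketch of a connection manager, kept framework-free so it runs on its own: it accepts any object with an async `send_text` method, which FastAPI's `WebSocket` provides. The class and method names are hypothetical, not taken from the project.

```python
import asyncio
import json
from typing import Protocol, Set


class Sender(Protocol):
    """Anything that can receive a text frame (e.g. a FastAPI WebSocket)."""

    async def send_text(self, data: str) -> None: ...


class ConnectionManager:
    """Tracks open connections and pushes each new item to all of them."""

    def __init__(self) -> None:
        self._clients: Set[Sender] = set()

    def connect(self, client: Sender) -> None:
        self._clients.add(client)

    def disconnect(self, client: Sender) -> None:
        self._clients.discard(client)

    async def broadcast(self, payload: dict) -> int:
        """Send `payload` as JSON to every client; return how many succeeded."""
        message = json.dumps(payload)
        clients = list(self._clients)  # snapshot: the set may change below
        results = await asyncio.gather(
            *(client.send_text(message) for client in clients),
            return_exceptions=True,  # one dead socket must not stop the rest
        )
        delivered = 0
        for client, result in zip(clients, results):
            if isinstance(result, Exception):
                self.disconnect(client)  # drop connections that failed
            else:
                delivered += 1
        return delivered
```

In a FastAPI WebSocket endpoint, the handler would typically `await websocket.accept()`, register the socket with `connect`, keep the connection open with a receive loop, and call `disconnect` when `WebSocketDisconnect` is raised. The collector then calls `broadcast` whenever it stores a new item.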