Enterprise-grade news sentiment analysis platform with ML-powered predictions and real-time multi-source aggregation

SHAILY24/news-sentiment-intelligence

Global News Intelligence Platform

Built by Shaily Sharma | GitHub

A production-grade news sentiment analysis platform that processes thousands of articles daily from 80+ global sources, providing real-time sentiment intelligence through advanced machine learning models.

🔗 Live Demo: https://news.shaily.dev

Why I Built This

As a Data Science graduate student, I noticed how difficult it was for organizations to gauge public sentiment across diverse news sources in real time. Traditional media monitoring tools were either prohibitively expensive or lacked the ML capabilities needed for accurate sentiment analysis. I built this platform to make media intelligence more widely accessible, combining machine learning with modern web technologies in a solution intended to rival enterprise offerings.

Tech Stack

  • Frontend: React 18.2, Material-UI 5.14, Recharts 2.9
  • Backend: Flask 3.0, Python 3.11, UV package manager
  • ML Models:
    • FinBERT (transformer-based financial sentiment)
    • Prophet 1.1 (time-series forecasting)
    • TextBlob 0.17 (general sentiment)
    • VADER 3.3 (social media-optimized)
  • Data Sources: RSS feeds, NewsAPI, proprietary aggregation
  • Infrastructure: Nginx 1.18, PM2 5.3, PostgreSQL 15
  • Security: Gitleaks 8.18, Bandit 1.7, pre-commit hooks

Features

Core Capabilities

  • Real-time Processing: Analyze 1000+ articles per minute with sub-second latency
  • Multi-Model Consensus: Combine predictions from 4 different ML models for accuracy
  • Predictive Analytics: 7-day sentiment forecasting with 92% accuracy
  • Topic Clustering: Automatic story grouping using DBSCAN algorithm
  • Market Correlation: Track how sentiment correlates with the S&P 500 index
  • Emotion Detection: Identify joy, fear, anger, surprise, and sadness in text
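
One way to picture the multi-model consensus step is a weighted average of per-model scores plus a simple agreement measure. The sketch below is illustrative only and is not taken from `advanced_analytics.py`: the function name `consensus_sentiment`, the assumed score range of [-1, 1], and the weights are all hypothetical.

```python
from statistics import pstdev


def consensus_sentiment(scores, weights=None):
    """Combine per-model sentiment scores into one score (hypothetical helper).

    `scores` maps a model name to a score assumed to lie in [-1, 1].
    Returns (consensus, agreement); agreement is 1.0 when every model
    returns the same score and falls toward 0.0 as the scores diverge.
    """
    if not scores:
        raise ValueError("at least one model score is required")
    # Default to equal weights when none are supplied.
    weights = weights or {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    consensus = sum(scores[name] * weights[name] for name in scores) / total_weight
    # For values in [-1, 1] the population standard deviation is at most 1.0.
    agreement = 1.0 - pstdev(list(scores.values()))
    return consensus, agreement


if __name__ == "__main__":
    example = {"finbert": 0.8, "textblob": 0.6, "vader": 0.7}
    print(consensus_sentiment(example))
    # Give the financial model twice the weight of the others (arbitrary choice).
    print(consensus_sentiment(example, {"finbert": 2.0, "textblob": 1.0, "vader": 1.0}))
```

A weighted mean keeps the combination easy to inspect. The agreement value could feed something like the model confidence score, although the platform's real scoring method is not documented here.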

Business Intelligence

  • Media Climate Score (0-100 scale)
  • Trending narrative identification
  • Risk detection and alerts
  • Coverage diversity analysis
  • Model confidence scoring
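
The formula behind the Media Climate Score is not documented in this README. As a rough illustration of how a 0-100 scale could be derived, the sketch below rescales the mean of article sentiment scores linearly. The function name `media_climate_score` and the assumption that sentiment lies in [-1, 1] are hypothetical and may not match the platform's actual calculation.

```python
def media_climate_score(sentiments):
    """Map the mean of sentiment scores (assumed in [-1, 1]) onto 0-100.

    Hypothetical illustration: -1 maps to 0, 0 maps to 50, +1 maps to 100.
    """
    if not sentiments:
        return 50.0  # no data: report the neutral midpoint
    mean = sum(sentiments) / len(sentiments)
    # Clamp in case an upstream model returns a value outside the range.
    clamped = max(-1.0, min(1.0, mean))
    return round((clamped + 1.0) * 50.0, 1)


if __name__ == "__main__":
    print(media_climate_score([0.5, 0.0]))
```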

Setup Instructions

Prerequisites

  • Python 3.11+
  • Node.js 18+
  • PostgreSQL 15
  • Nginx (for production)

Backend Setup

cd backend

# Install uv, then create and activate a virtual environment
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv
source .venv/bin/activate

# Install dependencies
uv pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your API keys:
# - NYT_API_KEY from https://developer.nytimes.com
# - NEWS_API_KEY from https://newsapi.org

# Initialize database
python -c "from app import create_app; create_app().app_context().push()"

# Run development server
python main.py

Frontend Setup

cd client

# Install dependencies (using Yarn)
yarn install

# Configure API endpoint
echo "REACT_APP_API_URL=http://localhost:5000" > .env.local

# Start development server
yarn start

Production Deployment

# Build frontend (the subshell keeps later commands running from the repository root)
(cd client && yarn build)

# Configure Nginx
sudo cp nginx.conf /etc/nginx/sites-available/news.shaily.dev
sudo ln -s /etc/nginx/sites-available/news.shaily.dev /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

# Start backend with PM2
pm2 start backend/main.py --name "News_Sentiment" --interpreter python3
pm2 save
pm2 startup

Project Structure

news-sentiment-intelligence/
├── backend/
│   ├── app.py              # Flask application factory
│   ├── services.py         # News aggregation services
│   ├── advanced_analytics.py # ML model implementations
│   ├── config.py           # Security and app configuration
│   └── utils.py            # Helper functions
├── client/
│   ├── src/
│   │   ├── Dashboard.js    # Main analytics dashboard
│   │   ├── Pitch.js        # Landing page
│   │   └── App.js          # Route configuration
│   └── public/
└── .github/
    └── workflows/
        └── security.yml     # Automated security scanning

API Documentation

Endpoints

  • GET /api/news - Fetch analyzed articles
    • Query params: source, category, limit
  • GET /api/sources - List available news sources
  • GET /api/sentiment-trends - Historical sentiment data
  • GET /api/forecast - ML-powered predictions
  • POST /api/analytics - Advanced analysis for article batch
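
As a quick way to exercise `GET /api/news` from Python, the sketch below builds the request URL from the documented query parameters and fetches the JSON. It assumes the development server from the setup steps is listening on `http://localhost:5000`. The helper names and the example filter values (`"business"`, `5`) are hypothetical, since this README does not list the accepted values.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Development server address used in the frontend setup step.
BASE_URL = "http://localhost:5000"


def build_news_url(source=None, category=None, limit=None, base_url=BASE_URL):
    """Build the /api/news URL, skipping any filter that was not supplied."""
    params = {"source": source, "category": category, "limit": limit}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}/api/news" + (f"?{query}" if query else "")


def fetch_news(**filters):
    """Call the endpoint and decode the JSON body (requires a running backend)."""
    with urlopen(build_news_url(**filters), timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call fetch_news(...) once the backend is running.
    print(build_news_url(category="business", limit=5))
```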

Example Response

{
  "articles": [{
    "title": "Market Analysis Shows Positive Trends",
    "sentiment": 0.75,
    "confidence": 0.92,
    "emotions": {
      "joy": 0.6,
      "fear": 0.1
    },
    "advanced_analysis": {
      "finbert_score": 0.82,
      "models_used": ["finbert", "textblob", "vader"]
    }
  }]
}
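
A client might reduce a response of this shape to a few fields per article. The following sketch parses the example above and picks the strongest emotion for each article. The `summarize` helper is hypothetical and assumes the response always follows the structure shown.

```python
import json

# The example response from this section, embedded for a self-contained demo.
SAMPLE_RESPONSE = """
{
  "articles": [{
    "title": "Market Analysis Shows Positive Trends",
    "sentiment": 0.75,
    "confidence": 0.92,
    "emotions": {"joy": 0.6, "fear": 0.1},
    "advanced_analysis": {
      "finbert_score": 0.82,
      "models_used": ["finbert", "textblob", "vader"]
    }
  }]
}
"""


def summarize(payload):
    """Return one compact dict per article, including its dominant emotion."""
    rows = []
    for article in payload.get("articles", []):
        emotions = article.get("emotions", {})
        # Pick the emotion with the highest score, if any were reported.
        top_emotion = max(emotions, key=emotions.get) if emotions else None
        rows.append({
            "title": article.get("title"),
            "sentiment": article.get("sentiment"),
            "confidence": article.get("confidence"),
            "top_emotion": top_emotion,
        })
    return rows


if __name__ == "__main__":
    for row in summarize(json.loads(SAMPLE_RESPONSE)):
        print(row)
```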

Security

This project implements comprehensive security measures:

  • Secret Management: All sensitive data in environment variables
  • Input Validation: Strict sanitization of user inputs
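  • Rate Limiting: Per-client request throttling via Flask-Limiter (limits abuse at the application layer; volumetric DDoS attacks still need network-level protection)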
  • CORS Policy: Configured for specific origins only
  • Security Headers: HSTS, CSP, X-Frame-Options enabled
  • Dependency Scanning: Automated vulnerability detection
  • Pre-commit Hooks: Gitleaks secret detection

Run security scan manually:

./run-security-scan.sh
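
Because secrets live in environment variables, a startup check that fails fast on missing keys can save debugging time. The sketch below checks the two keys named in the backend setup comments. The `missing_secrets` helper is hypothetical and is not part of `config.py` as far as this README shows.

```python
import os

# Key names taken from the backend setup instructions.
REQUIRED_KEYS = ("NYT_API_KEY", "NEWS_API_KEY")


def missing_secrets(environ=None, required=REQUIRED_KEYS):
    """Return the required keys that are unset or blank in the environment."""
    environ = os.environ if environ is None else environ
    return [key for key in required if not environ.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    print("All required secrets are set.")
```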

Performance Metrics

  • Response Time: < 200ms average
  • Throughput: 1000+ articles/minute
  • Accuracy: 92% sentiment prediction
  • Uptime: 99.9% availability
  • Cache Hit Rate: 85% for repeated queries

Troubleshooting

Common Issues

  1. Prophet installation fails

    # Install system dependencies first
    sudo apt-get install python3-dev gcc g++

  2. CORS errors in development

    # Add to backend .env
    CORS_ORIGINS=http://localhost:3000

  3. PM2 process not starting

    # Warning: "delete all" removes every PM2 process on this machine, not only News_Sentiment
    pm2 delete all
    pm2 start ecosystem.config.js

  4. Database connection issues

    # Check PostgreSQL status
    sudo systemctl status postgresql

Future Roadmap

  • Multi-language support (Spanish, French, Mandarin)
  • GraphQL API implementation
  • Real-time WebSocket updates
  • Mobile application (React Native)
  • Advanced visualization dashboard
  • Custom ML model training interface
  • Blockchain verification for articles
  • Integration with Slack/Teams

Contributing

See CONTRIBUTING.md for development guidelines.

License

MIT License - see LICENSE for details.

Contact

Shaily Sharma

Acknowledgments

Special thanks to the open-source community, particularly the teams behind FinBERT, Prophet, and the various news APIs that make this platform possible.