LinuxReport - Multi-Platform News Aggregation

LinuxReport is a simple, fast news aggregation platform built with Python and Flask. It is designed as a modern drudgereport.com clone that automatically aggregates and curates news across multiple categories, updates around the clock, and uses AI-powered headline generation.

This project is free and open source software released under the GNU Lesser General Public License v3.0 (LGPL v3).

DeepWiki provides excellent analysis of the codebase, including visual dependency graphs.

🌐 Live Sites

Category	URL	Focus
Linux	linuxreport.net	Linux news, open source, tech
COVID	covidreport.org	Health, pandemic updates
AI	aireport.keithcu.com	Artificial intelligence, ML
Solar/PV	pvreport.org	Solar energy, renewable tech
Techno	news.thedetroitilove.com	Detroit techno music
Space	news.spaceelevatorwiki.com	Space exploration

✨ Key Features

🚀 High Performance: Thread pools and Apache process pools for scalability
🤖 AI-Powered Headlines: Automatic headline curation using 30+ LLM models via OpenRouter.ai
🎯 Multi-Platform: Support for multiple news categories in one codebase
🌙 Dark Mode: User-customizable themes and font sizes
📱 Mobile Responsive: Optimized for all devices
⚡ Advanced Caching: Multi-layer caching system for optimal performance
🌐 CDN Support: s3cmd integration with long cache expiration headers for optimal image delivery
🔒 Secure: Rate limiting, admin authentication, input validation
🛠️ Configurable: Easy RSS feed management and customization

🧠 AI-Powered Intelligence

Headlines are generated through OpenRouter.ai. For each run, the system picks a model at random from over 30 free models, including:

Llama 4
Qwen
Mistral variants

If a model fails, it automatically falls back to Mistral Small for reliability. See the model selection logic for implementation details.
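
The random-selection-with-fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual model selection code: the model identifiers, the `generate_headlines` function, and the `call_model` callable (standing in for a real OpenRouter.ai request) are all assumptions made for the example.

```python
import random
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical model identifiers, for illustration only.
FREE_MODELS = ["example/llama-4", "example/qwen", "example/mistral-variant"]
FALLBACK_MODEL = "example/mistral-small"


def generate_headlines(
    prompt: str,
    call_model: Callable[[str, str], str],
    models: Sequence[str] = FREE_MODELS,
    fallback: str = FALLBACK_MODEL,
    rng: Optional[random.Random] = None,
) -> Tuple[str, str]:
    """Return (model_used, response), retrying once with the fallback model."""
    rng = rng or random.Random()
    model = rng.choice(list(models))
    try:
        return model, call_model(model, prompt)
    except Exception:
        # Any failure from the randomly chosen model triggers the fallback.
        return fallback, call_model(fallback, prompt)
```

Passing the request function in as `call_model` keeps the selection logic testable without network access.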

🚀 Quick Start

# Clone the repository
git clone https://github.com/KeithCu/LinuxReport
cd LinuxReport

# Install dependencies
pip install -r requirements.txt

# Configure (see Configuration section below)
cp config.yaml.example config.yaml
# Edit config.yaml with your settings

# Run development server
python -m flask run

🏗️ Architecture Overview

LinuxReport uses a layered architecture designed for performance and scalability:

Core Technologies

Backend: Python 3.x + Flask with extensions (Login, Limiter, Assets, Mobility)
Database: SQLite via Diskcache for high-performance persistent storage
Caching: Multi-layer system (disk, memory, file-based)
Frontend: Responsive HTML/CSS/JS with automatic bundling and minification
Scraping: BeautifulSoup4 + Selenium with Tor support for complex sites
Images: Automatic WebP conversion and optimization

Performance Features

The system achieves high performance through:

Thread Pools: Concurrent RSS feed processing
Multi-layer Caching: Disk, memory, and file-based caching strategies
CDN Integration: s3cmd synchronization with long cache expiration headers for static assets
Asset Optimization: Automatic JavaScript bundling and CSS minification
Smart Deduplication: Article deduplication across feeds and time periods
Rate Limiting: Intelligent request throttling and IP blocking
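
As a rough illustration of two of the items above, the sketch below fetches feeds concurrently with a thread pool and then removes duplicate articles. It is a simplified example under stated assumptions: `fetch_all`, `dedupe`, and the `fetch` callable are hypothetical names, and matching on a normalized title is only one possible deduplication rule, not necessarily the one the project uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def fetch_all(urls: Iterable[str], fetch: Callable[[str], List[dict]],
              max_workers: int = 8) -> Dict[str, List[dict]]:
    """Fetch every feed concurrently; a failing feed yields an empty list."""
    def safe_fetch(url: str) -> List[dict]:
        try:
            return fetch(url)
        except Exception:
            return []

    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result.
        return dict(zip(urls, pool.map(safe_fetch, urls)))


def dedupe(articles: Iterable[dict]) -> List[dict]:
    """Keep the first article seen for each normalized title."""
    seen = set()
    unique = []
    for article in articles:
        # Lowercase and collapse whitespace so trivial variations match.
        key = " ".join(article["title"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique
```

Threads suit this workload because feed fetching is dominated by network waits rather than CPU time.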

📋 Configuration

Required Setup

Edit config.yaml (copy from config.yaml.example if needed):

# IMPORTANT: Change default password for security!
admin:
  password: "YOUR_SECURE_PASSWORD_HERE"
  secret_key: "your-super-secret-key-change-this-in-production"

# Configure your domains
settings:
  allowed_domains:
    - "https://yourdomain.com"
    - "https://www.yourdomain.com"

Configure Report Types: Edit *_report_settings.py files to customize RSS feeds and appearance for each report type.
Production Deployment: Use the included httpd-vhosts-sample.conf for Apache configuration.
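
Because the sample configuration ships with placeholder secrets, a small startup check can catch a deployment that never changed them. The sketch below is an assumption-laden example rather than part of the project: `find_config_problems` is a hypothetical helper, and it expects the already-parsed contents of config.yaml as a plain dict (for instance from a YAML loader).

```python
from typing import List

# Placeholder values that appear in this README's sample configuration.
PLACEHOLDER_PASSWORDS = {"YOUR_SECURE_PASSWORD_HERE", "CHANGE_THIS_DEFAULT_PASSWORD"}
PLACEHOLDER_SECRET = "your-super-secret-key-change-this-in-production"


def find_config_problems(config: dict) -> List[str]:
    """Return human-readable problems found in a parsed config.yaml dict."""
    admin = config.get("admin", {})
    settings = config.get("settings", {})
    problems = []

    password = admin.get("password", "")
    if not password or password in PLACEHOLDER_PASSWORDS:
        problems.append("admin.password is missing or still a placeholder")

    secret = admin.get("secret_key", "")
    if not secret or secret == PLACEHOLDER_SECRET:
        problems.append("admin.secret_key is missing or still a placeholder")

    # Assumption: allowed domains are full origins with a scheme, as in the sample.
    for domain in settings.get("allowed_domains", []):
        if not domain.startswith(("http://", "https://")):
            problems.append(f"allowed domain lacks a scheme: {domain}")

    return problems
```

An empty list means none of these particular checks failed; it does not prove the configuration is secure.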

Adding New Report Types

To add a new report category:

Create {type}_report_settings.py with RSS feeds and configuration
Add HTML template {type}reportabove.html for custom headlines
Add logos and assets to static/images/
Configure automatic updates in systemd (optional)

🔧 Development

Project Structure

LinuxReport/
├── app.py                    # Flask application setup and configuration
├── routes.py                 # Main routing and request handling
├── shared.py                 # Shared utilities and constants
├── models.py                 # Data models and configurations
├── workers.py                # Background feed processing
├── auto_update.py            # AI headline generation
├── caching.py                # Multi-layer caching system
├── *_report_settings.py      # Report-specific configurations
├── templates/                # Jinja2 templates + modular JavaScript
├── static/                   # CSS, images, compiled assets
├── tests/                    # Test suite
└── config.yaml               # Configuration file

Key Features for Developers

Modular JavaScript: Source files in templates/ auto-bundle to static/
Hot Reload: Development mode with unminified assets for debugging
Type Safety: Type hints throughout the codebase
Comprehensive Caching: See Caching.md for detailed documentation
Test Suite: pytest-based testing in tests/ directory

📖 Documentation

agents.md: Comprehensive guide for AI agents and developers
Caching.md: Detailed caching system documentation
ROADMAP.md: Future development plans
Scaling.md: Performance optimization strategies

🔒 Security

Admin Mode Protection

Admin functionality is protected by authentication:

# config.yaml
admin:
  password: "CHANGE_THIS_DEFAULT_PASSWORD"

⚠️ IMPORTANT: Change the default password immediately after installation!

Security Features

Rate Limiting: Configurable per-endpoint throttling
Input Validation: Secure file uploads and form processing
CORS Protection: Configurable domain allowlists
Security Headers: XSS protection, content type validation
IP Blocking: Persistent banned IP storage
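
To make the rate limiting and IP blocking ideas concrete, here is a minimal sliding-window limiter. It is only a sketch: the project relies on a Flask limiter extension, whereas `SlidingWindowLimiter` is a hypothetical stand-alone class, and its ban list lives in memory rather than in persistent storage.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Set


class SlidingWindowLimiter:
    """Allow at most `limit` requests per IP within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)
        self.banned: Set[str] = set()

    def ban(self, ip: str) -> None:
        """Block an IP outright (in memory only in this sketch)."""
        self.banned.add(ip)

    def allow(self, ip: str) -> bool:
        if ip in self.banned:
            return False
        now = self.clock()
        hits = self.hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

The injectable `clock` exists so the window logic can be exercised in tests without sleeping.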

🚀 Production Deployment

Apache Configuration

Use the included httpd-vhosts-sample.conf:

<VirtualHost *:443>
    ServerName yourdomain.com
    WSGIDaemonProcess linuxreport python-path=/path/to/LinuxReport
    WSGIProcessGroup linuxreport
    WSGIScriptAlias / /path/to/LinuxReport/wsgi.py
    # SSL and other configurations...
</VirtualHost>

Systemd Services

For automatic headline updates:

# Copy service files
sudo cp update-headlines.service /etc/systemd/system/
sudo cp update-headlines.timer /etc/systemd/system/

# Reload systemd so it sees the new unit files, then enable and start the timer
sudo systemctl daemon-reload
sudo systemctl enable update-headlines.timer
sudo systemctl start update-headlines.timer

🤝 Contributing

We welcome contributions! Please:

Fork the repository
Create a feature branch
Run tests: pytest tests/
Submit a pull request

Feel free to request new RSS feeds or suggest improvements.

📈 Performance

LinuxReport demonstrates that Python can be incredibly fast when properly architected. The system typically starts returning pages after executing fewer than 10 lines of Python code, dispelling myths about Python's performance.

Key performance metrics:

Ultra-fast response times: Averaged 0.01 seconds over a 4-hour production period (on AMD EPYC, standard Python without PyPy)
Zero-read performance: Multi-layer caching (page, sitebox) eliminates most database reads despite constant background feed updates
Concurrent processing of 20+ RSS feeds
Automatic scaling via Apache process pools
Intelligent caching reduces redundant processing by 95%+

The architecture achieves this performance through smart cache layering that serves most requests from memory while background workers continuously update feeds, proving that well-designed caching can deliver bare-metal speeds without requiring specialized hardware or runtime optimizations.

Multi-Process Scalability: LinuxReport elegantly sidesteps Python's GIL limitations by using multiple Apache processes with intelligent cache invalidation. Each process maintains its own memory cache but uses fast SQLite queries to detect when feeds have changed (checking last_render_time only when page cache expires). This eliminates the need for complex message queues, Redis, or inter-process communication while maintaining perfect cache consistency across all processes.
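
The invalidation scheme described above can be sketched as a per-process cache that consults SQLite only when its local entry expires. This is an illustrative example under assumptions: `PageCache`, the `feeds` table, and its `key` column are hypothetical; only the `last_render_time` name comes from the description above.

```python
import sqlite3
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """Per-process page cache validated against a shared SQLite timestamp."""

    def __init__(self, db: sqlite3.Connection, render: Callable[[str], str],
                 ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.db = db
        self.render = render
        self.ttl = ttl
        self.clock = clock
        # page key -> (html, last_render_time seen, local expiry time)
        self.pages: Dict[str, Tuple[str, float, float]] = {}

    def _last_render_time(self, key: str) -> float:
        row = self.db.execute(
            "SELECT last_render_time FROM feeds WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else 0.0

    def get(self, key: str) -> str:
        now = self.clock()
        cached = self.pages.get(key)
        if cached and now < cached[2]:
            return cached[0]  # still fresh: no database read at all
        stamp = self._last_render_time(key)
        if cached and cached[1] == stamp:
            html = cached[0]  # expired locally, but the feeds are unchanged
        else:
            html = self.render(key)  # feeds changed (or first request)
        self.pages[key] = (html, stamp, now + self.ttl)
        return html
```

In a multi-process deployment each process would hold its own `PageCache` and its own connection to the same database file, so the timestamp is the only state the processes share.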

🔧 FastAPI vs Flask (Historical Context)

While FastAPI is a modern, high-performance framework with excellent async support, this project intentionally uses Flask for several reasons:

Why Flask Works Best Here

Simplicity: Flask's synchronous model matches the project's needs perfectly
Maturity: Battle-tested with vast ecosystem and community support
Performance: Current thread pool + caching implementation achieves excellent performance
Development Speed: Flask's simplicity enables rapid iteration and maintenance

FastAPI Considerations

FastAPI offers benefits like automatic API documentation and modern async support, but these are less relevant because:

The site primarily serves HTML pages rather than JSON APIs
Current synchronous code already performs excellently
Existing thread pool implementation handles I/O efficiently
The effort to migrate wouldn't justify the benefits for this use case

If considering a FastAPI migration, you would need to:

Rewrite core application logic
Modify Apache configuration
Restructure the caching system
Update all dependencies and extensions

📄 License

This project is free and open source software released under the GNU Lesser General Public License v3.0 (LGPL v3). See the LICENSE file for complete details.

CDN and Static Asset Delivery

LinuxReport includes sophisticated CDN support for optimal performance:

s3cmd Integration: Automated synchronization of static images to object storage
Long Cache Headers: HTTP expiration headers set to instruct clients to cache images for extended periods
Bandwidth Optimization: Significantly reduces server bandwidth usage and improves global load times
Edge Delivery: Static assets served from CDN edge locations closest to users

The CDN configuration is easily managed through config.yaml and automatically handles cache-busting when needed.
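
Long cache lifetimes and cache-busting work together: a file can be cached for a very long time as long as its URL changes whenever its content does. The sketch below shows one common way to do this and is an assumption, not the project's implementation: `versioned_url` and `long_cache_headers` are hypothetical helpers, and the query-string hash is just one possible cache-busting scheme.

```python
import hashlib

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def versioned_url(cdn_base: str, path: str, content: bytes) -> str:
    """Append a short content hash so a changed file gets a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    return f"{cdn_base.rstrip('/')}/{path.lstrip('/')}?v={digest}"


def long_cache_headers(max_age: int = ONE_YEAR_SECONDS) -> dict:
    """Headers telling clients and CDNs to keep the asset for a long time."""
    return {"Cache-Control": f"public, max-age={max_age}, immutable"}
```

With this approach an updated image is picked up immediately because its URL differs, while unchanged images keep being served from cache.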

Built with ❤️ for the free and open source community
