A comprehensive, MongoDB-first Streamlit web application for analyzing and predicting climate change impacts on global agriculture. Built to showcase the power of MongoDB aggregation pipelines for processing large-scale datasets without loading data into memory.
AgriIntel is a data-driven platform that helps researchers, policymakers, and farmers understand and adapt to climate change impacts on agriculture through:
- 🔥 MongoDB-First Architecture: All data operations use MongoDB queries and aggregations (30+ pipelines)
- 📊 Big Data Processing: Handle millions of records without memory issues
- 🤖 Advanced Analytics: Exploratory analysis, risk assessment, and ML forecasting
- 📈 Predictive Modeling: ML-powered yield forecasting through 2050
- 🎨 Beautiful UI: Professional design with interactive visualizations
- 💾 Full CRUD: Complete database management with MongoDB operations
- ⚡ High Performance: Indexed queries, aggregation pipelines, zero data loading
Traditional approach (loading all data to pandas):

```python
df = pd.read_csv('huge_file.csv')                # 500MB+ in memory ❌
grouped = df.groupby('Country')['Yield'].mean()  # Slow, not scalable ❌
```

Our approach (MongoDB aggregation):

```python
pipeline = [
    {'$group': {'_id': '$Country', 'avg_yield': {'$avg': '$Crop_Yield_MT_per_HA'}}},
    {'$sort': {'avg_yield': -1}}
]
results = handler.aggregate('climate_agriculture_data', pipeline)  # Fast, scalable ✅
```

Benefits:
- ✅ Process billions of records
- ✅ Millisecond query performance
- ✅ Zero memory overhead
- ✅ Production-ready scalability
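The `handler.aggregate()` call above comes from the project's connection layer in `db_connection.py`. A minimal sketch of what such a handler can look like, assuming the class name `MongoDBHandler` and the default connection settings shown later in this README (the real module may differ):

```python
# Sketch only: a thin wrapper of the kind db_connection.py provides.
# The class name and defaults below are assumptions for illustration.
from pymongo import MongoClient


class MongoDBHandler:
    def __init__(self, uri="mongodb://localhost:27017/", db_name="agri_intel"):
        self.client = MongoClient(uri)
        self.db = self.client[db_name]

    def aggregate(self, collection_name, pipeline):
        # The pipeline runs inside MongoDB; only the (small) result set
        # is returned to Python, so raw records never load into memory.
        return list(self.db[collection_name].aggregate(pipeline))
```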
```
project/
│
├── app.py                           # Main launcher with navigation
├── db_connection.py                 # MongoDB connection handler
│
├── pages/
│   ├── 1_📊_EDA.py                  # Exploratory Data Analysis
│   ├── 2_🌪️_Extreme_Weather.py      # Extreme Weather Risk Analysis
│   ├── 3_📈_Forecasting.py          # Time-series & ML Prediction
│   ├── 4_🔬_Correlation_Lab.py      # Correlation Analysis
│   ├── 5_🧠_Adaptation.py           # Strategy Simulation
│   ├── 6_🤖_Farmer_Assistant.py     # Yield Prediction Assistant
│   ├── 7_🗂️_Admin_Panel.py          # MongoDB CRUD & Uploads
│   └── 8_💾_MongoDB_Analytics.py    # Advanced MongoDB Operations Hub ⭐
│
├── models/
│   └── (trained models saved here)
│
├── data/
│   └── climate_change_impact_on_agriculture_2024.csv
│
├── requirements.txt
└── README.md
```
- Streamlit - Web framework
- Plotly - Interactive visualizations
- Folium - Geographic mapping
- streamlit-option-menu - Enhanced navigation
- Python 3.8+
- Pandas & NumPy - Data processing
- scikit-learn - Machine learning
- Prophet - Time-series forecasting
- PyMongo - MongoDB driver
- MongoDB - NoSQL database for climate data storage
- Python 3.8 or higher
- MongoDB installed and running locally
- Download from: https://www.mongodb.com/try/download/community
- Or install via package manager:
```bash
# macOS
brew tap mongodb/brew
brew install mongodb-community

# Ubuntu
sudo apt-get install mongodb

# Windows: download the installer from the MongoDB website
```
```bash
# If using git
git clone <repository-url>
cd agri-intel

# Or extract downloaded zip file
unzip agri-intel.zip
cd agri-intel
```

```bash
# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
```

```bash
pip install -r requirements.txt
```

```bash
# Start MongoDB service
# On macOS:
brew services start mongodb-community
# On Ubuntu:
sudo systemctl start mongod
# On Windows: start the MongoDB service from the Services panel
# Or run: mongod --dbpath <path-to-data-directory>
```

Verify MongoDB is running:

```bash
# Connect using MongoDB Compass or the mongo shell
mongosh
# Should connect without errors
```
- Place your CSV file in the `data/` directory: `data/climate_change_impact_on_agriculture_2024.csv`
- Required CSV columns (minimum): `Country`, `Year`, `Crop_Type`, `Crop_Yield_MT_per_HA`, `Average_Temperature_C`, `Total_Rainfall_mm`, `CO2_Emissions_MT`
- Optional columns for enhanced features: `Fertilizer_Use_KG_per_HA`, `Pesticide_Use_KG_per_HA`, `Irrigation_Access_Pct`, `Extreme_Weather_Events`, `Precipitation_Anomaly_mm`
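Optionally, you can sanity-check the file for the minimum columns before uploading (a small pandas sketch; not required by the app):

```python
# Quick check (optional): confirm the CSV contains the minimum columns
# listed above before uploading it through the Admin Panel.
import pandas as pd

REQUIRED_COLUMNS = [
    'Country', 'Year', 'Crop_Type', 'Crop_Yield_MT_per_HA',
    'Average_Temperature_C', 'Total_Rainfall_mm', 'CO2_Emissions_MT',
]

df = pd.read_csv('data/climate_change_impact_on_agriculture_2024.csv')
missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
print('Missing columns:', missing if missing else 'none')
```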
```bash
streamlit run app.py
```

The application will open in your browser at http://localhost:8501
- Navigate to Admin Panel (page 7)
- Upload your CSV file to MongoDB
- Wait for data import to complete
- Explore other modules - data is now available across all pages
🏠 Home:
- Project overview and mission
- Key performance indicators (KPIs)
- Global metrics visualization
- Quick navigation to all modules
📊 EDA (Exploratory Data Analysis):
- Filters: Country, crop, year range
- Visualizations: Trends, distributions, geographic maps
- MongoDB Aggregations:
  - `get_time_series_aggregation()` - Yearly trends via `$group`
  - `get_country_rankings()` - Top countries via aggregation
  - `get_filtered_statistics()` - Real-time stats via `$avg`, `$stdDevPop`
  - Nested grouping for complex regional analysis
- Export: Download filtered data as CSV
Key MongoDB Pipeline Example:
```python
pipeline = [
    {'$match': {'Country': 'India'}},
    {'$group': {
        '_id': '$Year',
        'avg_yield': {'$avg': '$Crop_Yield_MT_per_HA'},
        'count': {'$sum': 1}
    }},
    {'$sort': {'_id': 1}}
]
```

🌪️ Extreme Weather Risk Analysis:
- Risk Index Calculation: Weighted climate anomaly scores
- Geographic Risk Maps: Identify vulnerable regions
- Trend Analysis: Risk evolution over time
- MongoDB Aggregations:
  - `get_extreme_weather_analysis()` - Nested `$group` for variance calculation
  - Temperature/rainfall anomaly buckets via `$bucket`
  - Multi-stage pipelines with `$stdDevPop`
- Automated Insights: Top affected regions
Key MongoDB Pipeline Example:
```python
pipeline = [
    {'$group': {
        '_id': {'country': '$Country', 'year': '$Year'},
        'avg_temp': {'$avg': '$Average_Temperature_C'}
    }},
    {'$group': {
        '_id': '$_id.country',
        'temp_variance': {'$stdDevPop': '$avg_temp'},
        'years_tracked': {'$sum': 1}
    }},
    {'$sort': {'temp_variance': -1}}
]
```

📈 Forecasting:
- ML Model: Random Forest Regressor
- Predictions: Yield forecasts through 2050
- Confidence Intervals: 95% prediction bands
- Scenario Analysis: Test climate change scenarios
- MongoDB: Load training data via aggregation, save predictions
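A minimal sketch of that flow, assuming the `MongoDBHandler` sketch from earlier; the features and hyperparameters here are illustrative, not the app's exact model:

```python
# Sketch: train a yield model on country-year aggregates pulled from MongoDB.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from db_connection import MongoDBHandler  # assumed class name

handler = MongoDBHandler()
pipeline = [
    {'$group': {
        '_id': {'country': '$Country', 'year': '$Year'},
        'avg_yield': {'$avg': '$Crop_Yield_MT_per_HA'},
        'avg_temp': {'$avg': '$Average_Temperature_C'},
        'avg_rain': {'$avg': '$Total_Rainfall_mm'},
    }},
]
rows = handler.aggregate('climate_agriculture_data', pipeline)
df = pd.DataFrame(
    [{'year': r['_id']['year'], 'avg_temp': r['avg_temp'],
      'avg_rain': r['avg_rain'], 'avg_yield': r['avg_yield']} for r in rows]
)

model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(df[['year', 'avg_temp', 'avg_rain']], df['avg_yield'])

# Score a hypothetical 2050 scenario (illustrative values)
scenario = pd.DataFrame([[2050, 29.5, 820.0]],
                        columns=['year', 'avg_temp', 'avg_rain'])
print(model.predict(scenario))
```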
🔬 Correlation Lab:
- Pairwise Correlation: Pearson/Spearman methods
- Correlation Matrix: Full heatmap visualization
- Multiple Variables: Compare predictors
- MongoDB: Efficient data sampling and filtering
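One way the sampling side can work is a server-side `$sample` followed by a local correlation matrix (a sketch; the column selection and the `MongoDBHandler` name are assumptions):

```python
# Sketch: sample documents server-side, then correlate locally on the sample.
import pandas as pd
from db_connection import MongoDBHandler  # assumed class name

handler = MongoDBHandler()
pipeline = [
    {'$sample': {'size': 5000}},       # random server-side sample
    {'$project': {
        '_id': 0,
        'Crop_Yield_MT_per_HA': 1,
        'Average_Temperature_C': 1,
        'Total_Rainfall_mm': 1,
        'CO2_Emissions_MT': 1,
    }},
]
sample = pd.DataFrame(handler.aggregate('climate_agriculture_data', pipeline))
print(sample.corr(method='pearson'))   # or method='spearman'
```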
🧠 Adaptation (Strategy Simulation):
- Climate Scenarios: IPCC RCP 2.6, 4.5, 8.5
- Adaptation Measures: Irrigation, fertilizer, technology
- Comparison: With/without adaptation analysis
- Economic Impact: Revenue projections
- MongoDB: Baseline statistics via aggregation
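The baseline statistics can come from a single `$group` over the whole collection, which the simulator can then adjust per scenario (a sketch under the same `MongoDBHandler` assumption):

```python
# Sketch: dataset-wide baselines that a scenario simulation can adjust.
from db_connection import MongoDBHandler  # assumed class name

handler = MongoDBHandler()
pipeline = [
    {'$group': {
        '_id': None,                                        # whole collection
        'baseline_yield': {'$avg': '$Crop_Yield_MT_per_HA'},
        'baseline_temp': {'$avg': '$Average_Temperature_C'},
        'baseline_rainfall': {'$avg': '$Total_Rainfall_mm'},
        'yield_stddev': {'$stdDevPop': '$Crop_Yield_MT_per_HA'},
    }},
]
baseline = handler.aggregate('climate_agriculture_data', pipeline)[0]
print(baseline)
```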
🤖 Farmer Assistant:
- Input Form: Region, crop, climate conditions
- AI Predictions: Expected yield and risk category
- Personalized Advice: Adaptive strategies based on inputs
- Export Reports: Download prediction reports
- MongoDB: Train ML model from aggregated data
💾 MongoDB Analytics Hub:
- Query Builder: Build and execute MongoDB queries (example below)
  - Simple match, range queries, complex conditions, custom JSON
- Aggregation Pipelines: 10+ pre-built pipelines
  - Yearly trends, country rankings, crop comparisons
  - Climate impact bucketing, multi-stage operations
- Statistical Analysis: Compute stats in MongoDB
  - Descriptive statistics, variance analysis, percentiles
- Geospatial Queries: Country-level aggregations
- Performance Metrics: Query benchmarking and optimization
- Custom Pipelines: Execute your own JSON pipelines
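A sketch of the kind of range query the Query Builder issues (referenced in the list above); the collection and field names match the dataset, while `MongoDBHandler` and its `db` attribute are assumptions carried over from the earlier sketch:

```python
# Sketch: a match + range query with a field projection.
from db_connection import MongoDBHandler  # assumed class name

handler = MongoDBHandler()
cursor = handler.db['climate_agriculture_data'].find(
    {'Country': 'India', 'Year': {'$gte': 2010, '$lte': 2020}},
    {'_id': 0, 'Year': 1, 'Crop_Type': 1, 'Crop_Yield_MT_per_HA': 1},
)
for doc in cursor:
    print(doc)
```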
Featured Pipelines:
- Yearly yield trends with `$group` and `$avg`
- Country performance with `$addToSet` and `$size`
- Crop comparison with `$stdDevPop` and coefficient of variation
- Climate bucketing with `$bucket` (sketched below)
- Complex multi-stage with `$match`, `$addFields`, `$switch`
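As referenced in the climate-bucketing item above, a `$bucket` pipeline of that kind might look like this (the temperature boundaries are illustrative):

```python
# Sketch: bucket records into temperature bands and summarize yield per band.
from db_connection import MongoDBHandler  # assumed class name

handler = MongoDBHandler()
pipeline = [
    {'$bucket': {
        'groupBy': '$Average_Temperature_C',
        'boundaries': [0, 10, 20, 30, 40],   # °C bands (illustrative)
        'default': 'out_of_range',
        'output': {
            'avg_yield': {'$avg': '$Crop_Yield_MT_per_HA'},
            'count': {'$sum': 1},
        },
    }},
]
temperature_bands = handler.aggregate('climate_agriculture_data', pipeline)
```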
🗂️ Admin Panel:
- Upload CSV: Bulk insert via `insert_dataframe()`
- View Collections: Browse data with `query_to_dataframe()`
- Search & Filter: MongoDB regex and range queries
- Delete Records: Safe deletion with `delete_documents()`
- Statistics: Collection stats via `get_collection_stats()`
- Index Management: Create indexes for performance
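Hypothetical usage of the helpers named above; the actual signatures live in `db_connection.py` and may differ from this sketch:

```python
# Sketch: how the Admin Panel helpers listed above might be called.
import pandas as pd
from db_connection import MongoDBHandler  # assumed class name

handler = MongoDBHandler()

# Bulk insert a CSV
df = pd.read_csv('data/climate_change_impact_on_agriculture_2024.csv')
handler.insert_dataframe('climate_agriculture_data', df)

# Browse a filtered slice as a DataFrame
recent = handler.query_to_dataframe('climate_agriculture_data',
                                    {'Year': {'$gte': 2015}})

# Safe, targeted deletion (illustrative filter)
handler.delete_documents('climate_agriculture_data', {'Country': 'ExampleCountry'})

# Collection statistics
stats = handler.get_collection_stats('climate_agriculture_data')
```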
Edit `db_connection.py` to change MongoDB settings:
```python
# Default connection
uri = "mongodb://localhost:27017/"
db_name = "agri_intel"

# Custom connection
uri = "mongodb://username:password@host:port/"
db_name = "your_database_name"
```

For large datasets (>100K records):
- Limit data loading in `load_climate_data()`: `df = load_climate_data(limit=50000)  # Adjust as needed`
- Create indexes in Admin Panel for faster queries (see the sketch below)
- Use filters in EDA to reduce data processing
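Indexes can also be created directly with PyMongo outside the Admin Panel (a sketch; the field choices mirror the filters used throughout the app):

```python
# Sketch: create indexes that support the app's most common filters.
from pymongo import MongoClient

coll = MongoClient('mongodb://localhost:27017/')['agri_intel']['climate_agriculture_data']
coll.create_index([('Country', 1), ('Year', 1)])   # compound: country + year-range queries
coll.create_index('Crop_Type')                     # single-field filter
```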
Troubleshooting:

Error: `ServerSelectionTimeoutError`
- Solution: Ensure MongoDB is running: `brew services list` or `sudo systemctl status mongod`
- Check connection URI in `db_connection.py`
Error: `ModuleNotFoundError`
- Solution: Reinstall requirements: `pip install -r requirements.txt --upgrade`
Issue: App is slow with large dataset
- Solution:
- Reduce data limit in cache functions
- Create MongoDB indexes (Admin Panel)
- Use date range filters in EDA
Error: Upload to MongoDB fails
- Solution:
- Check CSV format and encoding (UTF-8 recommended)
- Ensure column names match expected format
- Verify MongoDB connection is active
- Upload Data → Admin Panel
- Explore Patterns → EDA module
- Assess Risks → Extreme Weather module
- Make Predictions → Forecasting module
- Export Results → Download CSVs and reports
- Get Prediction → Farmer Assistant
- Input Farm Details → Climate and soil conditions
- Review Advice → Adaptive strategies
- Download Report → Save recommendations
- Identify Vulnerable Regions → Extreme Weather
- Forecast Future Impacts → Forecasting
- Analyze Correlations → EDA
- Export Insights → Generate reports
- Local Storage: All data stored in local MongoDB instance
- No Cloud Upload: Data never leaves your machine
- Backup Recommended: Regular MongoDB backups for production use
Modules planned for future versions:
- Correlation Lab: Interactive correlation matrix builder
- Adaptation Strategy Simulator: Test "what-if" scenarios
- Real-time Data: Integration with live weather APIs
- Multi-user Support: Authentication and role-based access
- Mobile App: React Native companion app
This project is created for educational and research purposes. Feel free to modify and extend for your needs.
Contributions welcome! Areas for improvement:
- Additional ML models (LSTM, XGBoost)
- Enhanced visualizations
- Real-time data integration
- Performance optimizations
- Documentation improvements
For issues or questions:
- Check Troubleshooting section
- Review MongoDB and Streamlit documentation
- Create an issue in the repository
- Streamlit - Amazing framework for data apps
- MongoDB - Flexible database solution
- Plotly - Beautiful interactive charts
- scikit-learn - Powerful ML library
```bash
# Start app
streamlit run app.py

# Start MongoDB
brew services start mongodb-community   # macOS
sudo systemctl start mongod             # Linux

# Update dependencies
pip install -r requirements.txt --upgrade

# Clear Streamlit cache
# Use the "Clear cache" button in the app, or restart the app
```

Key files:
- `app.py` - Main entry point
- `db_connection.py` - Database utilities
- `pages/` - Individual modules
- `requirements.txt` - Dependencies
Built with ❤️ for Climate-AI Research 2025
🌱 Data-driven insights for sustainable agriculture in a changing climate