Skip to content

CodeWithDks/Salary-Predictor

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

💰 Advanced Salary Prediction Web App

🚀 AI-Powered Salary Insights for the Modern Workforce

Python Streamlit Scikit-learn License

🎯 91.8% Accuracy | 📊 Advanced ML Model | 🌐 Interactive Web Interface


🌟 Overview

Welcome to the Advanced Salary Prediction Web App - a cutting-edge machine learning application that predicts salaries with remarkable 91.8% accuracy! Built using state-of-the-art Gradient Boosting algorithms, this app provides data-driven salary insights for professionals across various industries.

✨ Key Highlights

  • 🧠 Advanced ML Model: Gradient Boosting with optimized hyperparameters
  • 📈 High Accuracy: 91.8% prediction accuracy on test data
  • 🎨 Interactive UI: Beautiful Streamlit-powered web interface
  • 📊 Rich Visualizations: Comprehensive data analysis and insights
  • 🔍 Feature Analysis: Detailed breakdown of salary-influencing factors
  • 💡 Smart Predictions: Confidence intervals and key factor explanations

🎯 Model Performance

Our advanced machine learning model delivers exceptional results:

🏆 Best Model: Gradient Boosting Regressor
📊 Test Accuracy: 91.8%
📉 Generalization Gap: 0.0451
🎯 RMSE: $11,434
📍 MAE: $9,238

📈 Performance Metrics

  • Training R²: 0.9636
  • Test R²: 0.9185
  • Low Overfitting: Excellent generalization capability
  • Robust Predictions: Consistent across different salary ranges

🚀 Quick Start

Prerequisites

1️⃣ Clone the Repository

git clone https://github.com/yourusername/salary-predictor.git
cd salary-predictor

2️⃣ Create Virtual Environment

# Create virtual environment
python -m venv venv

# Activate virtual environment
# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Launch the Web App

streamlit run app.py

🎉 That's it! The app will open in your browser at http://localhost:8501


📂 Project Structure

salary-predictor/
├── 📊 data/
│   └── Salary_Data.csv              # Training dataset
├── 🧠 models/
│   ├── salary_prediction_model.joblib    # Trained ML model
│   ├── model_metadata.json              # Model configuration
│   └── visualization_data.json          # Visualization data
├── 📈 visualizations/                    # Generated plots and charts
├── 🌐 app.py                            # Main Streamlit application
├── 🔧 salary_model_api.py               # Model API and utilities
├── 📋 requirements.txt                   # Python dependencies
├── 📖 README.md                         # This file
└── 🎯 .gitignore                        # Git ignore file

💻 Features

🔮 Salary Prediction

  • Real-time Predictions: Get instant salary estimates
  • Confidence Intervals: Understand prediction reliability
  • Range Estimates: Min-max salary ranges for better planning

📊 Data Insights

  • Feature Importance: See what factors matter most
  • Correlation Analysis: Understand relationships between variables
  • Interactive Visualizations: Explore data through beautiful charts

🎯 Key Factors Analyzed

  • Years of Experience (30.8% impact)
  • Age (22.4% impact)
  • Education Level (10.4% impact)
  • Job Title
  • Gender
  • Industry Specifics

📊 Sample Predictions

👨‍💻 Senior Software Engineer

  • Predicted Salary: $152,670
  • Confidence Range: $144,506 - $160,833
  • Key Factors: Master's degree, 8+ years experience

👩‍🔬 Junior Data Scientist

  • Predicted Salary: $82,826
  • Confidence Range: $74,663 - $90,990
  • Key Factors: High-demand role, growth potential

👨‍💼 Experienced Manager

  • Predicted Salary: $217,888
  • Confidence Range: $209,725 - $226,052
  • Key Factors: Leadership role, extensive experience

🛠️ Technical Details

Machine Learning Pipeline

  1. Data Preprocessing: Feature engineering and cleaning
  2. Model Selection: Comprehensive algorithm comparison
  3. Hyperparameter Tuning: Grid search optimization
  4. Validation: Cross-validation and performance testing
  5. Deployment: Streamlit web interface

Model Architecture

GradientBoostingRegressor(
    learning_rate=0.1,
    max_depth=5,
    min_samples_split=20,
    n_estimators=150,
    subsample=0.8
)

Dataset Statistics

  • Average Salary: $135,375
  • Salary Range: $67,247 - $234,803
  • Sample Size: Comprehensive professional dataset
  • Features: Age, Experience, Education, Job Title, Gender

📋 Requirements

pandas>=1.5.3
numpy>=1.23.5
scikit-learn>=1.2.2
scipy>=1.10.1
matplotlib>=3.6.3
seaborn>=0.12.2
plotly>=5.14.1
joblib>=1.2.0
streamlit>=1.22.0

🎨 Screenshots & Demos

🖥️ Main Interface

Beautiful, intuitive web interface for salary predictions

📊 Data Visualizations

Interactive charts showing salary distributions and correlations

🔍 Prediction Results

Detailed prediction breakdown with confidence intervals


🚀 Deployment Options

Local Development

streamlit run app.py

Streamlit Cloud

  1. Push to GitHub
  2. Connect to Streamlit Cloud
  3. Deploy with one click

Docker Deployment

FROM python:3.11-slim
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
EXPOSE 8501
CMD ["streamlit", "run", "app.py"]

🤝 Contributing

We welcome contributions! Here's how you can help:

  1. 🍴 Fork the repository
  2. 🌿 Create a feature branch (git checkout -b feature/AmazingFeature)
  3. 💾 Commit your changes (git commit -m 'Add some AmazingFeature')
  4. 📤 Push to the branch (git push origin feature/AmazingFeature)
  5. 🔄 Open a Pull Request

Ideas for Contribution

  • 🎨 UI/UX improvements
  • 📊 Additional visualization features
  • 🧠 Model performance enhancements
  • 📱 Mobile responsiveness
  • 🌐 Multi-language support

📈 Future Enhancements

  • 🔄 Real-time data updates
  • 🌍 Geographic salary variations
  • 📱 Mobile app development
  • 🤖 Advanced AI features
  • 📊 Industry-specific models
  • 🔐 User authentication system

🐛 Troubleshooting

Common Issues

Issue: ModuleNotFoundError

# Solution: Ensure virtual environment is activated
pip install -r requirements.txt

Issue: Port already in use

# Solution: Use different port
streamlit run app.py --server.port 8502

Issue: Model file not found

# Solution: Ensure models directory exists
mkdir models
# Re-run training script if needed

📞 Support & Contact


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • 📚 Scikit-learn team for the amazing ML library
  • 🎨 Streamlit for the beautiful web framework
  • 📊 Plotly for interactive visualizations
  • 🤝 Open Source Community for inspiration and support

🌟 Star this repository if you found it helpful! 🌟

Made with ❤️ and lots of ☕

⬆ Back to Top

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages