Economics Undergraduate @ UCalgary | Aspiring Data Scientist | Future Business Analyst
My name is Bozhao Wang. I am a data science enthusiast with expertise in machine learning, predictive modelling, and statistical analysis. With a background in economics and data science, I specialize in using machine learning algorithms to extract insights from complex datasets and support data-driven decision-making. I am actively seeking opportunities in data science, business analytics, and related fields where I can apply my quantitative and analytical skills to real-world challenges.
Programming Languages:
- Python
Machine Learning & Statistical Modeling:
- Classification & Regression (Logistic Regression, Random Forest, XGBoost)
- Hyperparameter Tuning (GridSearchCV, RandomizedSearchCV)
- Handling Imbalanced Data (SMOTE)
- Model Evaluation (AUC, Precision, Recall, Confusion Matrix)
Data Handling & Visualization:
- Exploratory Data Analysis (EDA)
- Data cleaning and preprocessing
- Data visualization
- Dashboard development (Power BI, Excel)
Tools & Platforms:
- GitHub
- Jupyter Notebooks
- Microsoft Excel
- Power BI
- Credit Risk Prediction Using Supervised Machine Learning Model
Built and compared supervised learning models to predict credit card default using imbalanced financial data. Applied SMOTE oversampling and model evaluation metrics to enhance predictive performance and support risk assessment strategies.
- Urban System Revenue Prediction with XGBoost (DSMLC Competition)
Applied XGBoost regression modelling to predict municipal revenue in urban systems using infrastructure investment data. Feature engineering, log transformation, and model tuning achieved high predictive accuracy.
- Quantify Energy Risk Case Competition 2025
Built classification models (Logistic Regression, Random Forest, XGBoost) to predict high-loss CAT events. Created an interactive Power BI dashboard with parametric triggers and strategic recommendations for renewable expansion.
- Demographic Trends and Housing Analysis in Calgary (Capstone Project)
Conducted regression analysis on Calgary's housing supply and population growth using historical census and building permit data. Identified key factors influencing demographic shifts and housing demands to inform urban planning strategies.
- Detecting COVID-19 Health Misinformation Targeting Older Adults
Developed and compared TF-IDF + Logistic Regression and fine-tuned BERT classifiers on COVID-19 tweets, evaluated cross-platform robustness on senior-focused Reddit posts, applied SHAP for interpretability, and used LDA topic modelling to uncover key misinformation themes.
- Currently learning NLP and LLMs.
- Participating in national competitions:
  - National Mental Health Datathon 2025
  - First DREAM Target 2035 Drug Discovery Challenge
- Advancing my skills in machine learning and Python
- Open to hackathons, case competitions, and interdisciplinary collaborations
- Open to collaboration and internships in Data Science, Business Analytics, or Applied Research.
- How to reach me: [email protected]