Welcome to the Data Science: Time Series Analysis repository! This repository contains studies, projects, and hands-on experiments developed during the Data Science course at Alura. The primary focus is on data analysis, predictive modeling, and applied statistics, with a special emphasis on time series analysis.
- Data Cleaning and Transformation: Techniques to handle messy data, including normalization, scaling, and encoding.
- Handling Missing Values and Outliers: Strategies to impute missing data and detect/remove outliers for robust analysis.
- Statistical Metrics and Visualizations: Using descriptive statistics and visual tools to uncover insights.
- Identifying Patterns and Trends: Analyzing data to detect seasonality, trends, and anomalies.
- Formulating and Validating Hypotheses: Designing experiments and testing assumptions.
- Applying Statistical Tests: Using tests like t-tests, ANOVA, and chi-square to validate hypotheses.
- Understanding Temporal Patterns: Decomposing time series into trend, seasonality, and residuals.
- Building Predictive Models: Leveraging tools like Prophet and ARIMA for forecasting.
- Evaluating Model Performance: Metrics such as MAE, RMSE, and MAPE to assess accuracy.
- Handling Outliers in Forecasting: Advanced methods to mitigate the impact of outliers on predictions.
- Interactive Visualization: Using libraries like Plotly and Dash for dynamic and interpretable visualizations.
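
The evaluation metrics named in the list above (MAE, RMSE, and MAPE) can be computed in a few lines. The block below is a minimal sketch, not code taken from the course notebooks: the function name `forecast_errors` and the sample values are hypothetical, and it assumes no actual value is zero, since MAPE is undefined in that case.

```python
import numpy as np


def forecast_errors(y_true, y_pred):
    """Return MAE, RMSE, and MAPE (in percent) for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    mae = np.mean(np.abs(errors))           # average absolute error
    rmse = np.sqrt(np.mean(errors ** 2))    # penalizes large errors more heavily
    # MAPE divides by the actual values; this sketch assumes none of them is zero.
    mape = np.mean(np.abs(errors / y_true)) * 100
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape}


# Hypothetical values, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
metrics = forecast_errors(actual, predicted)
print(metrics)
```

MAE and RMSE are expressed in the units of the series, while MAPE is a percentage, which makes it easier to compare across series of different scales.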
- Programming Language: Python
- Libraries:
  - Data Manipulation: Pandas, NumPy
  - Visualization: Matplotlib, Seaborn, Plotly
  - Statistical Analysis: Statsmodels, SciPy
  - Machine Learning: Scikit-learn, Prophet
- Google Colab: For interactive coding and visualization.
- Python 3.8 or higher installed.
- Basic understanding of Python, statistics, and machine learning.
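
To make the idea of decomposing a time series into trend, seasonality, and residuals concrete, here is a minimal sketch of a classical additive decomposition using only Pandas and NumPy. The data is synthetic and every variable name is illustrative. In practice, Statsmodels offers `seasonal_decompose` for this task; the manual version is shown only to expose the steps, and it assumes an odd seasonal period so that the moving average centers cleanly.

```python
import numpy as np
import pandas as pd

# Synthetic daily series (illustrative only): linear trend plus a weekly pattern.
period = 7
n_cycles = 8
index = pd.date_range("2024-01-01", periods=n_cycles * period, freq="D")
weekly_pattern = np.array([3.0, 1.0, 0.0, -1.0, -2.0, -2.0, 1.0])  # sums to zero
trend_true = 0.5 * np.arange(len(index))
series = pd.Series(trend_true + np.tile(weekly_pattern, n_cycles), index=index)

# Trend: centered moving average over one full period.
# The first and last (period // 2) points have no full window and become NaN.
trend = series.rolling(window=period, center=True).mean()

# Seasonality: average detrended value for each position in the cycle,
# shifted so the seasonal component averages to zero.
detrended = series - trend
position = np.arange(len(series)) % period
seasonal_means = detrended.groupby(position).mean()
seasonal_means -= seasonal_means.mean()
seasonal = pd.Series(seasonal_means.loc[position].to_numpy(), index=index)

# Residual: whatever trend and seasonality leave unexplained.
residual = series - trend - seasonal
print(residual.dropna().abs().max())
```

Because the synthetic series contains no noise, the residual is essentially zero wherever the trend is defined. On real data, the residual is where anomalies and outliers tend to show up.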