This project analyzes the UN General Debate Corpus from 1970 to 2023. It includes exploratory data analysis (EDA), predictive modeling, and data visualizations that aim to uncover insights from political speeches and their connection to global challenges.
Figure 1. United Nations General Debate Corpus 1946-2023 (Dataset)
Start by cloning the repository to your local machine:
git clone https://github.com/danilotpnta/UN-General-Debate-Analysis-SDGs.git
cd UN-General-Debate-Analysis-SDGs
Create the Conda environment using the provided environment.yml file. This installs all the necessary dependencies, including Python 3.9, JupyterLab, and various data analysis and visualization libraries.
conda env create -f environment.yml
Once the environment is created, activate it with the following command:
conda activate debates_analysis
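To confirm that the activated environment uses the interpreter the project expects, a quick check along these lines may help. It is a minimal sketch: the helper name `is_expected_python` is hypothetical and not part of the repository; only the Python 3.9 requirement comes from this README.

```python
import sys


def is_expected_python(version_info=sys.version_info, expected=(3, 9)):
    """Return True when the major.minor version matches `expected`.

    `expected` defaults to (3, 9), the version this README says
    environment.yml installs. The helper itself is illustrative only.
    """
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    status = "OK" if is_expected_python() else "unexpected version"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

If the check reports an unexpected version, the environment is probably not active in the current shell; running the activation command again usually resolves it.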
The project includes a script that downloads files from the Dataverse repository. Run it to fetch the raw data needed for the analysis; the data is saved in the data/raw/ directory.
python utils/dataverse_downloader.py
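After the download finishes, it can be useful to confirm that files actually landed in data/raw/. The snippet below is a minimal sketch of such a check, run from the repository root. The function name `count_downloaded_files` is hypothetical, and nothing is assumed about the file names or formats the downloader produces.

```python
from pathlib import Path


def count_downloaded_files(raw_dir="data/raw"):
    """Count regular files under `raw_dir`, including subdirectories.

    Returns 0 when the directory does not exist, which suggests the
    download step has not been run yet (or was run from another folder).
    """
    root = Path(raw_dir)
    if not root.is_dir():
        return 0
    return sum(1 for path in root.rglob("*") if path.is_file())


if __name__ == "__main__":
    n_files = count_downloaded_files()
    print(f"{n_files} file(s) found under data/raw/")
```

A count of zero most likely means the script was not run from the repository root or the download did not complete.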
To run the notebook, launch JupyterLab or Jupyter Notebook:
jupyter lab
This opens a new tab in your browser, where you can navigate to the notebook.ipynb file.
Open the notebook.ipynb file in Jupyter and run the cells in order. The notebook guides you through the exploratory and predictive analysis of the UN General Debate dataset.