Astronaut Analysis

This analysis is based on publicly available astronaut data from Wikidata. It investigates the time humans have spent in space and the age distribution of the astronauts.

[Plots: Total Time Humans in Space, Total Time Females in Space, Total Time Males in Space, Age Distribution Box Plot, Age Distribution Histogram]

The repository is organized as follows:

  • data: Contains the astronaut data set retrieved from Wikidata
  • code: Contains the astronaut analysis script
  • results: Contains the resulting analysis plots

Astronaut Data

The data set has been generated using the following SPARQL query [1] (retrieval date: 2018-10-25).

You can also analyze a recent version of the astronaut data by replacing the data set and re-running the analysis script:

  • Run the SPARQL query
  • Download the resulting data formatted as JSON
  • Replace the file data/astronauts.json
  • Run the analysis script
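The replacement step above can be guarded by a small check. The following is a minimal sketch and not part of the repository: the helper name `replace_dataset` and the `.bak` backup naming are hypothetical. It only assumes that the download is a non-empty JSON document, since the exact structure expected by the analysis script is not described here.

```python
import json
import shutil
from pathlib import Path


def replace_dataset(downloaded, target="data/astronauts.json"):
    """Validate a downloaded JSON file, then copy it over the target data set.

    Hypothetical helper: it checks only that the file parses as JSON and is
    not empty. It does not verify the fields the analysis script relies on.
    """
    downloaded = Path(downloaded)
    target = Path(target)

    # Fail early if the download is not valid JSON or contains no data.
    with downloaded.open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not data:
        raise ValueError(f"{downloaded} contains no data")

    # Keep a copy of the previous data set (assumed backup naming).
    if target.exists():
        shutil.copy2(target, target.with_name(target.name + ".bak"))

    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(downloaded, target)
    return target
```

Calling `replace_dataset("query.json")` from the repository root would then overwrite data/astronauts.json, where `query.json` stands for wherever the download was saved.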

Astronaut Analysis Script

The script requires Python >= 3.8 and uses the libraries pandas (BSD 3-Clause License) as well as matplotlib (Matplotlib License).

The script has been successfully tested on Windows 10 and Linux with Python 3.8.
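To confirm these requirements before running the analysis, a check along the following lines may help. It is a sketch under stated assumptions rather than something shipped with the repository: the function name `check_environment` is hypothetical, and it only tests that the libraries can be found, not which versions are installed.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 8), libraries=("pandas", "matplotlib")):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []

    # Compare the running interpreter against the required minimum version.
    if sys.version_info < min_version:
        required = ".".join(str(part) for part in min_version)
        problems.append(f"Python >= {required} required")

    # find_spec returns None when a top-level package cannot be located.
    for name in libraries:
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing library: {name}")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```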

Installation

Please clone this repository and install the required dependencies as follows:

git clone ...
cd astronaut-analysis/code
pip install -r requirements.txt

Usage

You can run the script as follows:

python astronauts-analysis.py

The script processes the astronaut data set and stores the plots in the same directory. Existing result plots are overwritten.

Testing

The test.sh script performs some basic checks to support maintaining the analysis script:

  • It installs the required packages.
  • It runs the flake8 linter to find programming mistakes and code style issues.
  • It runs the analysis script and checks that the expected plots are produced.
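The last check above, verifying that the expected plots exist, could look roughly like this in Python. This is an illustrative sketch, not the contents of test.sh: the function name `missing_plots` is hypothetical, and the plot file names must be supplied by the caller because they are not listed here.

```python
from pathlib import Path


def missing_plots(directory, expected_names):
    """Return the expected plot files that are absent or empty in `directory`."""
    directory = Path(directory)
    missing = []
    for name in expected_names:
        plot = directory / name
        # An empty file usually indicates that plotting failed part-way.
        if not plot.is_file() or plot.stat().st_size == 0:
            missing.append(name)
    return missing
```

A caller would pass the directory the script writes to and the real plot names, and treat a non-empty result as a failed check.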

The test.sh script runs as part of the GitLab build pipeline to find errors introduced by new commits.

License

Please see the file LICENSE.md for further information about how the content is licensed.