Collaboration/preprocessing at master · Biswajee/Collaboration


Data preprocessing for visualizations module

This repository contains the files used to generate data.json for use in the visualizations module.

pantheon.tsv

This dataset is taken from Harvard Dataverse. It is a manually verified dataset of individuals who have transcended linguistic, temporal, and geographic boundaries. The Pantheon 1.0 dataset includes the 11,341 biographies present in more than 25 languages in Wikipedia and is enriched with:
- manually verified demographic information (place and date of birth, gender),
- a taxonomy of occupations classifying each biography at three levels of aggregation, and
- two measures of global popularity: the number of languages in which a biography is present in Wikipedia (L), and the Historical Popularity Index (HPI), a metric that combines information on L, time since birth, and page views (2008-2013).

prepare_json.ipynb - Jupyter notebook that generates the JSON file from the dataset.

data.json - the JSON file to be used in the visualizations module.

Sample entries in data.json

[
    {
      "category_type": "POLITICIAN",
      "name": "Abraham Lincoln",
      "id": 307,
      "description": "UNITED STATES"
    },
    {
      "category_type": "PHILOSOPHER",
      "name": "Aristotle",
      "id": 308,
      "description": "Greece"
    }
]
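
For orientation, here is a minimal sketch of how records in this shape could be produced from a tab-separated file. It is not the code in prepare_json.ipynb. The column names `en_curid`, `name`, `occupation`, and `countryName` are assumptions about the header of pantheon.tsv, and the helper names `tsv_to_records` and `convert_file` are hypothetical. Adjust them to match the actual file and notebook.

```python
import csv
import io
import json

# Assumed column names in pantheon.tsv (not confirmed by this readme).
ID_COL = "en_curid"
NAME_COL = "name"
CATEGORY_COL = "occupation"
DESCRIPTION_COL = "countryName"


def tsv_to_records(tsv_file):
    """Convert tab-separated rows into the record shape shown in the sample entries."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    records = []
    for row in reader:
        records.append({
            "category_type": row[CATEGORY_COL],
            "name": row[NAME_COL],
            "id": int(row[ID_COL]),
            "description": row[DESCRIPTION_COL],
        })
    return records


def convert_file(src_path="pantheon.tsv", dst_path="data.json"):
    """Hypothetical helper: read the TSV at src_path and write a JSON list to dst_path."""
    with open(src_path, newline="", encoding="utf-8") as src:
        records = tsv_to_records(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(records, dst, indent=2, ensure_ascii=False)
    return records


# In-memory demonstration: one illustrative row in the assumed column layout,
# mirroring the first sample entry above.
sample = io.StringIO(
    "en_curid\tname\toccupation\tcountryName\n"
    "307\tAbraham Lincoln\tPOLITICIAN\tUNITED STATES\n"
)
demo_records = tsv_to_records(sample)
print(json.dumps(demo_records, indent=2))
```

Calling `convert_file()` from the directory that holds pantheon.tsv would write data.json next to it, provided the assumed column names match the real header.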


Dataset URL: https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/28201/VEG34D&version=1.0
