Scrapper prepares `organisation.data.xml.csv` from publishers' organisation XML files and `publishers.data.scrapping.csv` from publisher information in the IATI Registry.
For each organisation record, the script checks (see `OrganisationCollection>checkAndUpdate`):
- whether the organisation-list part of the identifier is valid, based on org-id.guide
- whether the organisation identifier is present in the IATI organisation codelist
- if the identifier already exists, the metadata is updated when it has changed
- if the name already exists, that organisation is ignored and the initially saved identifier is kept
- otherwise, the data is added to the CSV list for importing into the database
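The checks above can be sketched roughly as follows. This is a hypothetical illustration, not the actual `checkAndUpdate` implementation: the function name, the in-memory dict, and the prefix-splitting rule are all assumptions.

```python
def check_and_update(organisations, identifier, name, metadata, valid_prefixes):
    """Hypothetical sketch of the checkAndUpdate flow described above."""
    # 1. Validate the organisation-list part of the identifier
    #    (assumed here to be everything before the last "-") against org-id.guide.
    prefix = identifier.rsplit("-", 1)[0]
    if prefix not in valid_prefixes:
        return "invalid-prefix"
    # 2. If the identifier already exists, update the metadata when it changed.
    if identifier in organisations:
        if organisations[identifier]["metadata"] != metadata:
            organisations[identifier]["metadata"] = metadata
            return "updated"
        return "unchanged"
    # 3. If the name already exists, ignore this organisation and keep the
    #    identifier that was saved first.
    if any(org["name"] == name for org in organisations.values()):
        return "duplicate-name"
    # 4. Otherwise add the record for export to the CSV list.
    organisations[identifier] = {"name": name, "metadata": metadata}
    return "added"
```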
- Sources are in `src/cleanup`
- Run `python initial_cleanup.py` to clean up the organisation data
It reads `data/organisation.data.xml.csv` and `data/publishers.data.scrapping.csv` and generates `out/organisations-clean.csv`, which contains the valid organisation information.
`organisations-clean.csv` is cleaned up manually if needed.
- Sources are in `src/dump`
- Copy `config.py.bak` to `config.py`
- Create a Postgres database and update `config.py` with its credentials
- Run `python dump.py`, which reads `organisations-clean.csv` and dumps the data into the database you have just created
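The dump step can be sketched as turning each CSV row into a parameterised `INSERT`. The table name `organisations` and its columns are assumptions; in `dump.py` these statements would be executed through a Postgres driver such as psycopg2, but the sketch below only builds them, so it stays self-contained.

```python
import csv
import io

# Assumed table and columns -- not necessarily the real schema.
INSERT_SQL = "INSERT INTO organisations (identifier, name) VALUES (%s, %s)"

def insert_statements(csv_text):
    """Yield (sql, params) pairs for each row of organisations-clean.csv.

    A real dump script would pass each pair to cursor.execute(sql, params)
    inside a transaction against the database configured in config.py.
    """
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield INSERT_SQL, (row["identifier"], row["name"])
```

Using parameterised statements (rather than string formatting) lets the driver handle quoting and escaping of the organisation names.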