UWNETLAB/tidyextractors

tidyextractors

Overview

tidyextractors makes extracting data from supported sources as painless as possible, delivering a populated Pandas DataFrame in just a few lines of code. tidyextractors was inspired by `Hadley Wickham's (2014) paper <http://vita.had.co.nz/papers/tidy-data.html>`_, which introduces "tidy data" as a conceptual framework for data preparation.

For more information, including code examples, an API reference, and general documentation, see the `tidyextractors documentation <http://tidyextractors.readthedocs.io/en/latest/>`_.
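To make the "tidy data" idea concrete, here is a minimal sketch in plain Python. It does not use tidyextractors, and the record layout and the ``tidy_sends`` helper are hypothetical names chosen for illustration. It shows one common tidying step: turning a list-valued field into one row per observation, so that every column holds a single variable.

```python
# Hypothetical input: each email record carries a list-valued "recipients" field.
emails = [
    {"id": 1, "sender": "ann@example.org",
     "recipients": ["bob@example.org", "cy@example.org"]},
    {"id": 2, "sender": "bob@example.org",
     "recipients": ["ann@example.org"]},
]


def tidy_sends(records):
    """Return one row per (email, recipient) pair."""
    rows = []
    for rec in records:
        for recipient in rec["recipients"]:
            rows.append({
                "id": rec["id"],
                "sender": rec["sender"],
                "recipient": recipient,
            })
    return rows


sends = tidy_sends(emails)
for row in sends:
    print(row)
```

A list of flat dictionaries like ``sends`` can be passed directly to ``pandas.DataFrame(sends)`` if Pandas is installed.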

Features

  • Extracts data with minimal effort.
  • Creates readable code that requires minimal explanation.
  • Exports Pandas DataFrames to maximize compatibility with the Python data science ecosystem.

Currently Implemented Data Sources

  • `Local Git Repositories <http://tidyextractors.readthedocs.io/en/latest/git_overview.html>`_
  • `Twitter User Data (including Tweets) using the Twitter API <http://tidyextractors.readthedocs.io/en/latest/twitter_overview.html>`_
  • `Emails stored in the Mbox file format <http://tidyextractors.readthedocs.io/en/latest/mbox_overview.html>`_
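As a rough illustration of what extracting tidy rows from one of these sources involves, the sketch below reads an Mbox file using only Python's standard ``mailbox`` module. It is not the tidyextractors implementation or API; the ``mbox_to_rows`` helper, the chosen columns, and the throwaway demo file are assumptions made for this example. For the library's actual interface, consult the linked documentation.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def mbox_to_rows(path):
    """Read an mbox file into a list of flat dicts, one per message."""
    rows = []
    box = mailbox.mbox(path)
    try:
        for msg in box:
            rows.append({
                "from": msg["From"],
                "to": msg["To"],
                "subject": msg["Subject"],
            })
    finally:
        box.close()
    return rows


# Build a tiny throwaway mbox so the sketch runs without external data.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo.mbox")

demo = EmailMessage()
demo["From"] = "ann@example.org"
demo["To"] = "bob@example.org"
demo["Subject"] = "Hello"
demo.set_content("Hi Bob")

box = mailbox.mbox(path)
box.add(demo)
box.flush()
box.close()

rows = mbox_to_rows(path)
print(rows)
```

Each dictionary is one observation (a message) and each key is one variable, which is the shape a DataFrame constructor expects.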

Installing

Run ``pip3 install tidyextractors``.

About

A collection of tools for extracting tidy data.
