Home
Jesús Alberto Martínez Mendoza edited this page Mar 22, 2020 · 1 revision
    This project is built with Django 3.0 and uses the following libraries:
- beautifulsoup4: Library for extracting PDF links from the government website.
- camelot-py: Powerful tool for extracting tables from PDFs and exporting them to CSV.
- pandas: Auxiliary library for handling CSV data easily.
- requests: Library for making HTTP requests.
All the libraries are listed in the requirements.txt file and can be installed with the command pip install -r requirements.txt. It's recommended to use a virtual environment when installing new libraries.
The data is extracted from the Mexican Government's Daily Technical Report.
All the data mining code is in the file scripts/fetch_data.py. It contains the functions that scrape the website, download the PDFs, parse them and store the results in CSV format.
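As an illustration of the scraping step, here is a minimal sketch of how PDF links could be collected from a page with beautifulsoup4. It is not taken from scripts/fetch_data.py: the function name `extract_pdf_links`, the sample HTML and the example.org URLs are all hypothetical, and the real script would fetch the page with requests instead of using an inline string.

```python
from typing import List
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_pdf_links(html: str, base_url: str) -> List[str]:
    """Return the absolute URL of every link in `html` that points to a PDF."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        # Compare case-insensitively so ".PDF" is matched as well.
        if href.lower().endswith(".pdf"):
            # Resolve relative links against the page URL.
            links.append(urljoin(base_url, href))
    return links


# Hypothetical page content standing in for the downloaded HTML.
sample_html = """
<ul>
  <li><a href="/docs/confirmed.pdf">Confirmed</a></li>
  <li><a href="https://example.org/files/suspected.PDF">Suspected</a></li>
  <li><a href="/about.html">About</a></li>
</ul>
"""

pdf_links = extract_pdf_links(sample_html, "https://example.org/reports/")
print(pdf_links)
```

Each returned URL could then be downloaded and handed to camelot-py for table extraction.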
The script can be run using Django Extensions:
python3 manage.py runscript fetch_data -v2
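For context, the runscript command from Django Extensions imports the named module from a scripts package and calls its module-level `run()` function. The skeleton below only sketches that entry point; the body is a placeholder and does not reproduce what scripts/fetch_data.py actually does.

```python
# Skeleton of a script module that `manage.py runscript` can execute.
# The body is a placeholder, not the project's real implementation.


def run(*args):
    """Entry point called by runscript; `args` holds any optional script arguments."""
    print("fetch_data started with script args:", args)
    # The real script would scrape, download, parse and store the data here.


if __name__ == "__main__":
    # Allows a quick manual check outside of Django.
    run()
```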
When the script finishes, it generates 2 files with the confirmed and suspected cases.
Example: 2020.03.21_confirmed_cases.csv and 2020.03.21_suspected_cases.csv
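The naming pattern in the example above can be reproduced with a small helper. This is a sketch inferred only from the two example file names; the helper name `output_filenames` is hypothetical, and the actual script may build the names differently.

```python
from datetime import date
from typing import Tuple


def output_filenames(report_date: date) -> Tuple[str, str]:
    """Build the confirmed and suspected CSV file names for a report date."""
    # Dates are formatted as YYYY.MM.DD, matching the example file names.
    prefix = report_date.strftime("%Y.%m.%d")
    return (f"{prefix}_confirmed_cases.csv", f"{prefix}_suspected_cases.csv")


confirmed_name, suspected_name = output_filenames(date(2020, 3, 21))
print(confirmed_name)
print(suspected_name)
```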