sharof20/HousePrice

House Price Project

The House Price Python project comprises a sequence of tasks, from collecting apartment ads for Ulaanbaatar, Mongolia, posted on www.unegui.mn, the most popular ad portal, to building a machine learning model that estimates apartment prices.

The project runs on the workflow management system pytask.

How to Get Started:

To get started, follow these steps:

  1. Clone the repository to have a local copy of the project on your machine.

  2. Create a Conda environment specific to this project and activate it using the following commands:

    $ conda env create -f environment.yml
    $ conda activate houseprice

    This installs the packages listed in environment.yml and keeps the project's dependencies isolated from the rest of your system.

  3. Install Chromedriver (optional): This step is not required to run the project with its default settings, so you can skip ahead. By default, the data collection task, which scrapes the website through Chrome, is skipped because it takes about 10 hours; the remaining tasks use the data that we have already collected. Install Chromedriver only if you want to collect the data from scratch. Go here to download and install Chromedriver on your machine, and make sure its version is compatible with your Chrome browser.

    To run the web scraping task, set NO_LONG_RUNNING_TASKS to False in config.py under the src folder.
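The Chromedriver compatibility check in step 3 can be sketched as follows. This is a minimal illustration, not part of the project: it assumes that "compatible" means the major version numbers match, and the helper names `major_version` and `versions_compatible` and the sample version strings are hypothetical.

```python
import re


def major_version(version_output: str) -> int:
    """Extract the major version from text such as 'ChromeDriver 114.0.5735.90'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_compatible(chrome_output: str, driver_output: str) -> bool:
    """Assumption: the browser and the driver are compatible when their major versions match."""
    return major_version(chrome_output) == major_version(driver_output)


# Sample strings for illustration only; use the output of your own installation.
chrome = "Google Chrome 114.0.5735.198"
driver = "ChromeDriver 114.0.5735.90"
print(versions_compatible(chrome, driver))
```

The driver string is the kind of text typically printed by `chromedriver --version`; how to query the browser version depends on your operating system.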

Usage:

Running the project

Once you have set up the environment and installed the necessary packages, you can build the project by running the pytask command in the root folder of the project:

$ pytask

The project results are written to the bld folder. The project contains the following task modules, each of which consists of several tasks:

  • data_collection - Data collection via web scraping (skipped by default)
  • data_management - Clean raw data
  • analysis - Visualize the cleaned data
  • model - Run machine learning model on the data
  • paper - Short summary of the project and its results
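How the NO_LONG_RUNNING_TASKS flag affects which of the modules above run can be pictured with the sketch below. It is illustrative only and does not reproduce the project's config.py or the way pytask discovers and skips tasks; `TASK_MODULES`, `LONG_RUNNING`, and `modules_to_run` are hypothetical names.

```python
# Illustrative only: mirrors the flag described in this README, not the real config.py.
NO_LONG_RUNNING_TASKS = True

# Task modules in the order listed above.
TASK_MODULES = ["data_collection", "data_management", "analysis", "model", "paper"]

# Assumption: data collection is the only long-running module.
LONG_RUNNING = {"data_collection"}


def modules_to_run(no_long_running_tasks: bool = NO_LONG_RUNNING_TASKS) -> list[str]:
    """Return the task modules that would run for a given flag value."""
    return [
        module
        for module in TASK_MODULES
        if not (no_long_running_tasks and module in LONG_RUNNING)
    ]


print(modules_to_run(True))   # default: data_collection is left out
print(modules_to_run(False))  # all modules, including web scraping
```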

Tests

We have implemented various tests in the tests directory. The tests are based on the testing framework pytest. You can run them as follows:

$ pytest
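For readers new to pytest, a test is a function whose name starts with `test_` and that uses plain `assert` statements. The sketch below shows the general shape only; `parse_price` is a hypothetical helper written for this example and is not taken from the project's code or its tests directory.

```python
import math


def parse_price(raw: str) -> float:
    """Hypothetical helper: convert a string such as '1,250,000' to a float.

    Blank input yields NaN so that missing prices can be filtered out later.
    """
    cleaned = raw.replace(",", "").strip()
    return float(cleaned) if cleaned else math.nan


def test_parse_price_removes_thousands_separators():
    assert parse_price(" 1,250,000 ") == 1250000.0


def test_parse_price_returns_nan_for_blank_input():
    assert math.isnan(parse_price("   "))
```

If saved in a file whose name starts with `test_`, these functions would be collected and run by the pytest command shown above.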

Credits

This project was created with cookiecutter and the econ-project-templates. We appreciate the contributions of the open-source community in creating and maintaining these tools.

About

This project involves creating a model that can accurately estimate the prices of houses.
