
Commit 9e4724c

Modify README
1 parent 551a506 commit 9e4724c

2 files changed: +28 / -7 lines


README.md

Lines changed: 28 additions & 7 deletions
@@ -21,7 +21,9 @@ This repository provides a modular template for building recommender systems in

 ### 📦 Dataset

-As an example, this template uses the [ContentWise Impressions](https://github.com/ContentWise/contentwise-impressions) dataset, which contains real-world implicit feedback data.
+As an example, this template uses the [ContentWise Impressions](https://github.com/ContentWise/contentwise-impressions) dataset, a collection of implicit interactions and impressions of movies and TV series from an Over-The-Top media service that delivers its content over the Internet. ***In the preprocessing phase the dataset is limited to movies only.***
+
+Exploratory data analysis can be found in [contentwise_eda.ipynb](notebooks/contentwise_eda.ipynb).

 ### 🚀 Use Cases
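To make the movies-only restriction in the new dataset description concrete, here is a minimal sketch of that kind of filtering, assuming a pandas DataFrame of raw interactions with hypothetical `user_id`, `item_id` and `item_type` columns; the actual ContentWise column names and the template's preprocessing code may differ.

```python
import pandas as pd

# Hypothetical raw interactions; the real ContentWise schema may differ.
interactions = pd.DataFrame(
    {
        "user_id": [1, 1, 2, 3],
        "item_id": [10, 11, 10, 12],
        "item_type": ["movie", "tv_series", "movie", "movie"],
    }
)

# Keep only movie interactions, mirroring the movies-only restriction above.
movies_only = interactions[interactions["item_type"] == "movie"].reset_index(drop=True)
print(movies_only)
```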

@@ -44,12 +46,12 @@ To make use of this repository, follow these steps:

 2. **Set up external services**
 - Configure your connection to a ClearML server for experiment tracking.
-- (Optional) Set up access to AWS S3 if you want to use remote storage for data or models.
+- (Optional) Set up access to AWS S3 if you want to use remote storage for data and/or models.
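For the ClearML experiment tracking mentioned in the step above, a minimal sketch of how a run can be registered with the `clearml` Python package; the project and task names below are placeholders rather than this template's actual configuration.

```python
from clearml import Task

# Credentials are picked up from clearml.conf / the CLEARML_* environment variables.
# Project and task names here are placeholders, not this template's defaults.
task = Task.init(project_name="recsys-template", task_name="example-run")

logger = task.get_logger()
# Report a dummy metric so the run shows up in the ClearML web UI.
logger.report_scalar(title="metrics", series="dummy", value=0.5, iteration=0)

task.close()
```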

 ## Configuration and installation

-Prepare environment variables in .env (see .env.example):
+Prepare environment variables related to ClearML and AWS in .env (see .env.example):
 ```
 CLEARML_CONFIG_FILE=clearml.conf
 CLEARML_WEB_HOST=<your-clearml-web-host>
@@ -67,22 +69,41 @@ Install with pip:
 pip install . # Add flag -e to install in editable mode
 ```

-Using docker compose:
+(Optional) Using docker compose:
 ```bash
 docker compose up -d # Run container based on docker-compose.yml
 ```

-Using plain docker:
+(Optional) Using plain docker:
 ```bash
 docker build -t ds-image . # Build image defined in Dockerfile
 docker run -dit --gpus all --name ds-container ds-image # Run container based on that image
 ```

-## Quick start
+## Run pipeline steps
+
+### 1. Data preparation

 ```bash
 python steps/process_data.py
-python steps/compute_baseline.py
+```
+
+After running this script, the following datasets are generated:
+- `train.parquet` - behavioral data about movie consumption for training (implicit feedback)
+- `validation.parquet` - behavioral data for validation
+- `user_mapper.parquet` - user name to user index mapper
+- `item_mapper.parquet` - item name to item index mapper
+- `last_user_histories.parquet` - histories of the last *n* consumed items per user, computed on the train data
+
+![alt text](static/process_data.png)
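To give a feel for how the generated artifacts fit together, a minimal sketch that loads them with pandas; the `data/` output directory used here is a placeholder and may not match the template's actual paths.

```python
from pathlib import Path

import pandas as pd

# Placeholder output directory; the template's real output location may differ.
data_dir = Path("data")

train = pd.read_parquet(data_dir / "train.parquet")              # implicit feedback for training
validation = pd.read_parquet(data_dir / "validation.parquet")    # held-out interactions
user_mapper = pd.read_parquet(data_dir / "user_mapper.parquet")  # user name -> user index
item_mapper = pd.read_parquet(data_dir / "item_mapper.parquet")  # item name -> item index
histories = pd.read_parquet(data_dir / "last_user_histories.parquet")  # last n items per user

print(train.shape, validation.shape)
print(user_mapper.head())
```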
+
+### 2. Baseline evaluation
+
+```bash
+python steps/evaluate_baselines.py
+```
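As an illustration of what a simple baseline for implicit feedback can look like, here is a sketch of a most-popular recommender; it is not necessarily one of the baselines implemented in `steps/evaluate_baselines.py`, and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical training interactions; real column names may differ.
train = pd.DataFrame(
    {
        "user_idx": [0, 0, 1, 2, 2, 2],
        "item_idx": [5, 7, 5, 5, 8, 7],
    }
)

# Most-popular baseline: rank items by how many interactions they received
# and recommend the same top-k list to every user.
popularity = train["item_idx"].value_counts()
top_k = popularity.head(3).index.tolist()
print("Recommend to every user:", top_k)  # [5, 7, 8]
```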
+
+```bash
 python steps/train.py
 python steps/infer.py
 ```

static/process_data.png (117 KB)
