Conversation

demoncoder-crypto

Description

Related Issues

References

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • I have signed the commits, e.g. git commit -s -m "your commit message".
  • This PR is being made to staging branch AND NOT TO main branch.


Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@@ -0,0 +1,10 @@
{
Collaborator

not sure if there was an error in the submission of the notebook, but I can't see it

# Licensed under the MIT License.

import numpy as np
import warnings # Added for R-Precision warning
Collaborator

this file is still from the other PR, it needs to be removed

Comment on lines +1 to +33
# Copyright (c) Recommenders contributors.
# Licensed under the MIT License.

import os
import numpy as np
import pandas as pd
import torch

from recommenders.utils.constants import (
    DEFAULT_USER_COL,
    DEFAULT_ITEM_COL,
    DEFAULT_RATING_COL,
    DEFAULT_PREDICTION_COL,
    DEFAULT_K,
)

def predict_rating(
    model,
    test_df,
    col_user=DEFAULT_USER_COL,
    col_item=DEFAULT_ITEM_COL,
    col_rating=DEFAULT_RATING_COL,
    col_prediction=DEFAULT_PREDICTION_COL,
    batch_size=1024,
):
    """Predict ratings for user-item pairs in test data.
    Args:
        model (NNEmbeddingRanker): Trained embedding ranker model.
        test_df (pandas.DataFrame): Test dataframe containing user-item pairs.
        col_user (str): User column name.
        col_item (str): Item column name.
        col_rating (str): Rating column name.
Collaborator

I think all this can become the notebook.

Some thoughts for the notebook:

  • Here is a good example of a useful notebook: https://github.com/recommenders-team/recommenders/blob/main/examples/02_model_collaborative_filtering/lightgcn_deep_dive.ipynb. It explains both the math behind the model and an implementation.
  • Think of the notebook as a way to showcase how to use the ranker and what the ranker is about.
  • The objective of the notebook is that it needs to be useful. That's the most important metric.
  • Ideally, a person could go to this notebook, add their data, run it, and understand how to use it.
  • For the notebook, you can just showcase the content of these functions directly inside the notebook.
  • Something very important is that we follow the principle of explicit is better than implicit. For example, we don't do something like for metric_name, metric_func in metrics.items(), because it adds a layer of complexity. Instead we show the metrics explicitly: rmse(true, pred), precision_at_k(true, pred, params), etc. Anyone can see at a glance what is going on.
  • Feel free to come to our Monday meeting if you want to understand better how we do the notebooks @demoncoder-crypto
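
The "explicit is better than implicit" point above can be sketched roughly as follows. This is only an illustration: the rmse and precision_at_k functions below are simplified, hypothetical stand-ins with made-up signatures, not the implementations shipped in the recommenders library, and the data is invented.

```python
import math

# Hypothetical stand-in metrics for illustration only; these are NOT the
# implementations from the recommenders library.
def rmse(true, pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))

def precision_at_k(relevant, ranked, k):
    """Fraction of the top-k ranked items that are relevant."""
    top_k = ranked[:k]
    return sum(1 for item in top_k if item in relevant) / k

# Made-up toy data.
true_ratings = [4.0, 3.0, 5.0]
pred_ratings = [3.5, 3.0, 4.0]
relevant_items = {"a", "c"}
ranked_items = ["a", "b", "c", "d"]

# Implicit style (avoided in notebooks): the reader has to trace a dict
# and a loop to find out which metrics actually run.
# for metric_name, metric_func in metrics.items(): ...

# Explicit style: each metric call is visible on its own line.
eval_rmse = rmse(true_ratings, pred_ratings)
eval_precision = precision_at_k(relevant_items, ranked_items, k=2)

print(f"RMSE: {eval_rmse:.4f}")
print(f"Precision@2: {eval_precision:.4f}")
```

The explicit version is a few lines longer, but a reader skimming the notebook sees exactly which metrics are computed and with which arguments.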

self.col_rating = col_rating
self.col_prediction = col_prediction
self.threshold = threshold
self.rating_pred_raw = rating_pred # Store raw predictions before processing
Collaborator

rating_pred is already stored in self.rating_pred, and self.rating_pred_raw is not used in this class. Any reason you want to store it?

introducing serendipity into music recommendation, WSDM 2012
Eugene Yan, Serendipity: Accuracy’s unpopular best friend in Recommender Systems,
Eugene Yan, Serendipity's unpopular best friend in Recommender Systems,
Collaborator

all_pairs = []
for user in valid_users:
    for item in all_items:
        all_pairs.append((user, item))
Collaborator

all_pairs = [(u, i) for u in valid_users for i in all_items]
or

from itertools import product
all_pairs = list(product(valid_users, all_items))

# Filter out seen pairs
result_df = result_df[~result_df.apply(lambda row: (row[col_user], row[col_item]) in seen_pairs, axis=1)]

# Get top-k recommendations for each user
Collaborator

Can you reuse predict_rating if generating_recommendation uses the same logic under the hood, and then sort and cut to top_k at the end?
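
One possible shape for the "sort and cut" step suggested above, as a rough sketch rather than the PR's actual code. It assumes the candidate pairs have already been scored (for example by predict_rating) into a dataframe. The helper name top_k_unseen, the column names, and the toy data are all hypothetical. It also swaps the row-wise apply filter from the diff for a pandas anti-join, which is an alternative technique and not something the PR contains.

```python
import pandas as pd

def top_k_unseen(scored_df, train_df, col_user="userID", col_item="itemID",
                 col_prediction="prediction", k=10):
    """Drop user-item pairs seen in training, then keep the k best per user.

    scored_df is assumed to hold one predicted score per candidate
    user-item pair (hypothetical input, e.g. from a scoring function).
    """
    seen = train_df[[col_user, col_item]].drop_duplicates()
    # Anti-join: keep only rows of scored_df that have no match in `seen`.
    merged = scored_df.merge(
        seen, on=[col_user, col_item], how="left", indicator=True
    )
    unseen = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
    # Sort by score within each user and cut to the top k.
    return (
        unseen.sort_values([col_user, col_prediction], ascending=[True, False])
        .groupby(col_user, sort=False)
        .head(k)
        .reset_index(drop=True)
    )

# Made-up toy data: two users, three candidate items each.
scored = pd.DataFrame({
    "userID": [1, 1, 1, 2, 2, 2],
    "itemID": [10, 11, 12, 10, 11, 12],
    "prediction": [0.9, 0.8, 0.1, 0.2, 0.7, 0.6],
})
train = pd.DataFrame({"userID": [1, 2], "itemID": [10, 11]})

recs = top_k_unseen(scored, train, k=1)
print(recs)
```

With this split, the scoring logic lives in one place and the recommendation function only filters, sorts, and truncates.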

# Calculate metrics
results = {}
for metric_name, metric_func in metrics.items():
    # Different metrics may have different required parameters
Collaborator

if the only difference is k, you may:

results[metric_name] = metric_func(
    test_df,
    predictions_df,
    col_user=col_user,
    col_item=col_item,
    col_rating=col_rating,
    col_prediction=col_prediction,
    k=k if 'k' in metric_func.__code__.co_varnames else None,
 )
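
A caveat on the snippet above: __code__.co_varnames lists local variables as well as parameters, and passing k=None to a function that has no k parameter raises a TypeError. A hedged alternative is to inspect the signature and forward k only when it is accepted. The helper call_metric and the two toy metrics below are hypothetical names used purely for illustration.

```python
import inspect

def call_metric(metric_func, *args, k=None, **kwargs):
    """Call metric_func, forwarding k only if its signature accepts it."""
    params = inspect.signature(metric_func).parameters
    accepts_k = "k" in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_k and k is not None:
        kwargs["k"] = k
    return metric_func(*args, **kwargs)

# Hypothetical metrics used only to exercise the helper.
def metric_with_k(true, pred, k=10):
    return ("with_k", k)

def metric_without_k(true, pred):
    return ("without_k", None)

result_with_k = call_metric(metric_with_k, [1], [1], k=5)
result_without_k = call_metric(metric_without_k, [1], [1], k=5)
print(result_with_k, result_without_k)
```

Here k reaches metric_with_k and is silently dropped for metric_without_k, so neither call fails on an unexpected keyword.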

@miguelgfierro
Collaborator

@setuc FYI

@miguelgfierro
Collaborator

@demoncoder-crypto how is this work going?
@setuc can support

@miguelgfierro
Collaborator

@jmarrietar do you think you would be able to take over this work? It is very similar to embdotbias

@jmarrietar
Contributor

Hi @miguelgfierro. I'll be on a tight schedule for the following months, but I can take a look when I free up a little bit 😄.

4 participants