@orionw (Collaborator) commented Jul 7, 2024

To stay within GitHub's 1 GB Git LFS bandwidth quota, this PR removes the files that use Git LFS:

  • Move the videos to a separate HF space that we can just download from. I assume they won't change much, so it's okay to treat them as resources to download.
  • Replace the pickle files with jsonl files (all binary files have to go through Git LFS).

@Muennighoff @isaac-chung Can I remove the index_*/passages.*.pt files from Git LFS? I assume not yet, and that we can remove them once the other indexes are ready, so just lmk. Those are the last Git LFS files.



# download the videos
from huggingface_hub import hf_hub_url
Collaborator Author

At runtime we download the videos from Hugging Face. There's no need to keep them in this repo, as I assume they are static and people won't be iterating on them.
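
As a rough illustration of the download-at-runtime idea, here is a minimal sketch. The repo id, the `ensure_video` helper, and the injectable `downloader` argument are all hypothetical and not taken from this PR (the diff itself imports `hf_hub_url`); the sketch uses `hf_hub_download` from `huggingface_hub` with `repo_type="space"`, on the assumption that the videos live in a Space.

```python
from pathlib import Path
from typing import Callable, Optional

# Placeholder: the actual Space name is not given in this thread.
VIDEO_REPO_ID = "your-org/your-video-space"


def hf_download(filename: str, local_dir: Path) -> str:
    """Download one file from the (assumed) Hugging Face Space; return its local path."""
    # Imported lazily so this module still imports where huggingface_hub is absent.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=VIDEO_REPO_ID,
        filename=filename,
        repo_type="space",
        local_dir=str(local_dir),
    )


def ensure_video(
    filename: str,
    local_dir: Path,
    downloader: Optional[Callable[[str, Path], str]] = None,
) -> Path:
    """Return the local path of a video, downloading it only if it is missing."""
    target = local_dir / filename
    if target.exists():
        return target  # already fetched on an earlier run
    local_dir.mkdir(parents=True, exist_ok=True)
    fetch = downloader or hf_download
    return Path(fetch(filename, local_dir))
```

Since the videos are assumed to be static, checking for an existing local copy before downloading keeps repeated app starts from re-fetching them.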


- with open(elo_rating_pkl, "rb") as fin:
-     elo_rating_results = pickle.load(fin)
+ elo_rating_results = load_results(elo_rating_folder)
Collaborator Author

Results are now saved and loaded as folders, which does make the commits quite long since each dataframe is a separate file... sorry!
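
To make the folder-based layout concrete, here is a minimal sketch of a one-file-per-table round trip in JSONL. It is not the PR's `load_results`; the function names `save_results_folder` and `load_results_folder` are hypothetical, and each table is assumed to be a plain list of row dicts rather than a dataframe so the example only needs the standard library.

```python
import json
from pathlib import Path


def save_results_folder(results: dict, folder: Path) -> None:
    """Write each table (a list of row dicts) to <folder>/<name>.jsonl."""
    folder.mkdir(parents=True, exist_ok=True)
    for name, rows in results.items():
        with open(folder / f"{name}.jsonl", "w", encoding="utf-8") as fout:
            for row in rows:
                # One JSON object per line keeps the file text-based and diffable.
                fout.write(json.dumps(row) + "\n")


def load_results_folder(folder: Path) -> dict:
    """Rebuild the {name: rows} mapping from every .jsonl file in the folder."""
    results = {}
    for path in sorted(folder.glob("*.jsonl")):
        with open(path, encoding="utf-8") as fin:
            results[path.stem] = [json.loads(line) for line in fin if line.strip()]
    return results
```

With pandas dataframes, the analogous calls would be `df.to_json(path, orient="records", lines=True)` and `pd.read_json(path, lines=True)`. Because the files are plain text, they no longer need Git LFS.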

mteb
plotly
umap-learn
kaleido
Collaborator Author

Apparently kaleido is needed to write Plotly plots to file. We could skip saving the plots to file, but they were saved in the pickle files, so I thought we might as well write them to file for now.
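
For reference, a minimal sketch of writing a figure to an image file. The `write_plot` helper and its arguments are hypothetical; the only real API assumed is Plotly's `fig.write_image(path)`, which performs static export and relies on kaleido being installed.

```python
from pathlib import Path


def write_plot(fig, out_dir: Path, name: str, fmt: str = "png") -> Path:
    """Write a figure to <out_dir>/<name>.<fmt> and return the path.

    `fig` is assumed to expose Plotly's `write_image(path)` method,
    which needs the kaleido package for static image export.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{name}.{fmt}"
    fig.write_image(str(target))
    return target
```

With a real figure this would be called as, for example, `write_plot(plotly.graph_objects.Figure(), Path("elo_results_TASK/anony"), "average_win_rate_bar")`.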

@isaac-chung (Contributor) left a comment

Generally it looks good! Just wondering if we can simplify the structure a bit.

|-- elo_results_TASK
  |-- anony
-   |-- average_win_rate_bar
-      |-- default.png
+   |-- average_win_rate_bar.png
  |-- full
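
If results have already been written in the nested layout, the flattening suggested above could be sketched roughly as follows. The `flatten_default_files` helper is hypothetical, and it assumes a directory should only be collapsed when its single entry is a file named `default.<ext>`.

```python
from pathlib import Path


def flatten_default_files(root: Path) -> list:
    """Move <dir>/default.<ext> up to <dir>.<ext> when it is the directory's only entry."""
    moved = []
    # Materialise the listing first, since directories are removed while looping.
    for sub in sorted(p for p in root.rglob("*") if p.is_dir()):
        entries = list(sub.iterdir())
        if len(entries) == 1 and entries[0].is_file() and entries[0].stem == "default":
            target = sub.with_name(sub.name + entries[0].suffix)
            entries[0].rename(target)
            sub.rmdir()  # the directory is empty once its only file has moved
            moved.append(target)
    return moved
```

Directories holding more than one file are left untouched, so only the redundant single-file level is removed.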

Comment on lines 3 to +4
mkdir -p results

mkdir -p results/latest
Contributor

We probably don't need mkdir -p results anymore, since the new mkdir -p results/latest line creates the parent directory too?

@isaac-chung mentioned this pull request Jul 7, 2024