Dataset and pretrained model weights (w/o leakage)

@malteos released this 07 Mar 10:40
· 7 commits to main since this release
  • Pretrained model weights: config.json, pytorch_model.bin (also available on Hugging Face as malteos/scincl-wol)
  • Tokenizer: see the w/ leakage release
  • Triples (query, positive, negative) and paper metadata: train_triples.csv.gz, train_metadata.jsonl.gz
  • Corpus and query papers: s2orc_paper_ids.seed_0.json, query_s2orc_paper_ids.seed_0.json
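
The release does not document the file layouts, so the following is only a minimal sketch of how the data files listed above might be read with the Python standard library. It assumes that train_triples.csv.gz is a gzip-compressed CSV, that train_metadata.jsonl.gz holds one JSON object per line, and that the two paper ID files are plain JSON. The helper names (`iter_csv_gz`, `iter_jsonl_gz`, `load_json`) are hypothetical, and no column or field names are assumed.

```python
import csv
import gzip
import json
from typing import Any, Dict, Iterator, List


def iter_csv_gz(path: str) -> Iterator[List[str]]:
    """Yield each row of a gzip-compressed CSV file as a list of strings.

    Assumption: the file is UTF-8 encoded. Whether the first row is a
    header is not stated in the release, so rows are returned as-is.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as f:
        yield from csv.reader(f)


def iter_jsonl_gz(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of a .jsonl.gz file."""
    with gzip.open(path, mode="rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def load_json(path: str) -> Any:
    """Load a plain (uncompressed) JSON file, e.g. a paper ID file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

With the files downloaded to the working directory, a first look at the data could be `next(iter_csv_gz("train_triples.csv.gz"))` or `load_json("s2orc_paper_ids.seed_0.json")`. The weights would typically be loaded separately through the Hugging Face `transformers` library using the `malteos/scincl-wol` model ID, which needs a network download and is therefore not shown here.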