jordimas/MLSUM-Catalan

MLSUM-Catalan

A Catalan summarization corpus built following the approach of MLSUM (https://github.com/recitalAI/MLSUM).

The original content is from Vilaweb, licensed under Creative Commons Attribution-NonCommercial-NoDerivs, which allows sharing.

Files:

  • URLs used: urls/train.ca.txt.urls
  • Text and summaries: processed/ca_train.txt (2678 entries)

The text and summaries are in the same format as the MLSUM corpus (tab-separated).
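
As a rough illustration of reading a tab-separated file like this one, the sketch below splits each non-empty line on tabs and yields the fields. The helper name `read_tsv_rows` is hypothetical and not part of this repository, and the sketch assumes UTF-8 encoding. It makes no assumption about how many columns there are or which column holds the text and which the summary; check the file before relying on a particular order.

```python
from typing import Iterator, List


def read_tsv_rows(path: str) -> Iterator[List[str]]:
    """Yield each non-empty line of a tab-separated file as a list of fields.

    Hypothetical helper: the encoding (UTF-8) is an assumption, and the
    meaning of each column is left to the caller to verify.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\r\n")  # drop the line terminator only
            if line:  # skip blank lines
                yield line.split("\t")
```

Calling `list(read_tsv_rows("processed/ca_train.txt"))` from the repository root would load every entry into memory; iterating over the generator directly avoids that for larger files.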
