Instruction tuning dataset for Danish #210
KennethEnevoldsen
started this conversation in
Missing pieces for Danish NLP
Replies: 1 comment
The repo for creating the dataset is at https://github.com/kasperjunge/skolegpt-instruct-dataset and will likely be moved to the GitHub organisation of Københavns Professionshøjskole.
Statement of Need
Datasets for instruction tuning are important for improving the quality of Danish LLMs.
Current status
As far as I know, @kasperjunge and the DanskGPT project are working on an instruction-tuning dataset for Danish by translating OpenOrca. Models trained on this dataset would be problematic to use in a commercial setting, due to the potential legal implications of the dataset having been generated using OpenAI.