Distilled model for high-throughput Danish NLP model #200
KennethEnevoldsen
started this conversation in
Missing pieces for Danish NLP
Replies: 1 comment 1 reply
What about quantization? Is that an undesirable approach for optimizing
processing speed, in your view?
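For context, the core idea behind the quantization suggestion can be shown with a minimal sketch of symmetric post-training int8 quantization. The weight list below is illustrative, not taken from any actual model; a real pipeline would use e.g. `torch.quantization` or ONNX Runtime rather than hand-rolled code.

```python
# Minimal sketch of symmetric per-tensor int8 quantization.
# A plain Python list stands in for a model's float32 weight tensor.

def quantize_int8(weights):
    """Map floats to signed 8-bit integers via a single shared scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to approximate float weights."""
    return [v * scale for v in q]

weights = [0.42, -1.27, 0.003, 0.9]   # illustrative values
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)

# Rounding error is bounded by half the quantization step (= scale / 2).
max_err = max(abs(a - b) for a, b in zip(weights, approx))
```

The throughput gain comes from doing the matrix multiplies in int8 instead of float32; the sketch only shows the accuracy side of the trade-off, i.e. how small the reconstruction error stays.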
On Fri, 8 Dec 2023 at 16:52, Kenneth Enevoldsen wrote:
Statement of need
It is important to have highly performant models for Danish NLP, as they
are useful for processing large amounts of text on limited compute budgets.
Current status
Not in development
Approach
- My guess is that distilling a larger model such as
dfm-encoder-large-v1 would be the best option, though it might be better
to simply train a small model from scratch.
- We have not looked into what the best distillation approach is.
If you wish to take on the project feel free to start a discussion here.
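The distillation idea mentioned above can be sketched with the standard response-based objective (Hinton et al.): the student is trained to match the teacher's temperature-softened output distribution. This is a minimal illustration in pure Python, with made-up logits and the usual T² scaling factor omitted; it is not taken from any dfm-encoder-large-v1 run.

```python
# Minimal sketch of a response-based knowledge-distillation loss:
# KL divergence between temperature-softened teacher and student outputs.
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened probability distribution over logits."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on the softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [3.2, 1.1, -0.5]            # illustrative teacher logits
aligned_student = [3.2, 1.1, -0.5]    # identical outputs -> zero loss
drifted_student = [0.1, 2.0, 1.0]     # mismatched outputs -> positive loss

loss_same = distillation_loss(teacher, aligned_student)
loss_diff = distillation_loss(teacher, drifted_student)
```

In practice this term is combined with the ordinary task loss on hard labels, and the temperature controls how much of the teacher's "dark knowledge" about non-target classes is transferred.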