Replies: 1 comment
-
I am keeping only the ggml f16 models around, because there is still a lot of work and research being done on quantization, and from f16 it is easy to generate new quantized files. Also note that if you clone an HF repo with git-lfs, it will duplicate the data. Otherwise, storage is very cheap nowadays. I'm actually uploading some files to a CDN so I can access them over FTP and HTTP.
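Some back-of-the-envelope math on the keep-only-f16 approach. The bytes-per-parameter figures here are rough assumptions (~2 bytes/param for f16, roughly 0.56 bytes/param for a 4-bit format like q4_0 including block overhead), just to show the trade-off:

```python
# Rough storage math: keep only the f16 ggml file vs. keeping every quantized variant.
# Bytes-per-parameter values are assumptions, not exact format sizes.
PARAMS_7B = 7e9  # parameter count of a 7B model

def size_gb(bytes_per_param: float, n_params: float = PARAMS_7B) -> float:
    """Approximate file size in GB for a given encoding density."""
    return bytes_per_param * n_params / 1e9

f16 = size_gb(2.0)    # ~14 GB for the f16 master copy
q4 = size_gb(0.56)    # ~3.9 GB per 4-bit quantized variant

keep_f16_only = f16               # regenerate quants on demand
keep_everything = f16 + 3 * q4    # f16 plus three quantized variants on disk

print(f"f16 only: {keep_f16_only:.1f} GB; f16 + 3 quants: {keep_everything:.1f} GB")
```

So per 7B model you pay roughly one f16 copy either way; the quantized variants are the part that is cheap to throw away and regenerate.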
-
I love this project; my one issue is disk space and the sheer number of conversions.
For example, the workflow from start to finish with a LLaMA-derived model, when done "properly", is painful, with the StableLM or OpenAssistant XORs and more model variants appearing:
a) download model weights in the original format
b) convert to hugging face / transformer format
c) apply xor/delta
d) convert to ggml
And I did not even mention quantizing.
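The steps above can be sketched as a command list. All script names and paths here are assumptions (adjust to wherever your transformers and llama.cpp checkouts live); by default nothing is executed, the function just returns the planned commands:

```python
# Sketch of the pipeline (a)-(d) plus quantization, as a dry-run command list.
# Script names, paths, and model sizes are hypothetical -- adapt to your setup.
import subprocess

PIPELINE = [
    # a) download the original weights (not shown; source depends on the model)
    # b) convert the original checkpoint to HuggingFace/transformers format
    ["python", "convert_llama_weights_to_hf.py",
     "--input_dir", "llama-original", "--model_size", "7B",
     "--output_dir", "llama-hf"],
    # c) apply the release's XOR/delta against the HF base (model-specific script)
    ["python", "apply_delta.py", "--base", "llama-hf", "--target", "model-hf"],
    # d) convert the HF checkpoint to ggml f16
    ["python", "convert.py", "model-hf", "--outtype", "f16",
     "--outfile", "ggml-model-f16.bin"],
    # and finally quantize from the f16 master copy
    ["./quantize", "ggml-model-f16.bin", "ggml-model-q4_0.bin", "q4_0"],
]

def run(dry_run: bool = True) -> list[str]:
    """Return the planned commands; actually execute them when dry_run=False."""
    cmds = [" ".join(step) for step in PIPELINE]
    if not dry_run:
        for step in PIPELINE:
            subprocess.run(step, check=True)
    return cmds
```

Note that each intermediate (original, HF, f16 ggml, quantized) is a full copy of the weights on disk, which is exactly where the space goes.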
If HF file format(s) are missing important things, could what's missing be contributed there?
Or can we hope that everybody will switch to GGML?
Just this weekend I filled up 900 GB of disk space just playing around with new model versions and file formats.
Any creative ideas other than deleting and redownloading/regenerating?