
[Feature] Add support for RAG specialized model PleIAs_Pleias-Nano #3343

Open
ThiloteE opened this issue Dec 21, 2024 · 3 comments
Labels: enhancement (New feature or request)

Comments

ThiloteE (Collaborator) commented Dec 21, 2024

Feature Request

Add support for https://huggingface.co/PleIAs/Pleias-Nano

ThiloteE added the enhancement label on Dec 21, 2024
manyoso (Collaborator) commented Dec 22, 2024

Looking at the model, it seems to have a specialized chat template for RAG that we'd need to adapt. Fortunately, the new Jinja system will make this considerably easier.
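Roughly speaking, a RAG-style template wraps the retrieved sources and the user query in dedicated special tokens instead of plain user/assistant turns. Something along these lines, where the token names and the `sources` variable are illustrative placeholders and not the model's confirmed format:

```jinja
{#- Hypothetical sketch only: the special tokens and the `sources` variable
    are illustrative placeholders, not Pleias-Nano's confirmed format. -#}
{%- set query = (messages | last)['content'] -%}
{{ bos_token }}<|query_start|>{{ query }}<|query_end|>
{%- for source in sources %}
<|source_start|>{{ loop.index }}. {{ source }}<|source_end|>
{%- endfor %}
<|answer_start|>
```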

The truth is we would need the community to step up and do the coding to get this supported in the short term. The coding would be a combination of C++/QML and Jinja.

I am happy to help and can offer advice, guidance, and even screen shares or tutorials for a dedicated member of the community who wants to step up and implement it.

If no one from the community steps up, it will be a medium- to long-term effort and will have to be triaged for priority.

ThiloteE (Collaborator, Author) commented Dec 22, 2024

Thank you for your offer. The benchmarks sound really good (the model seems to be the best in its parameter class, and it is small enough to be really fast in terms of t/s), and the licensing is fantastic, but it only supports a 2k context window. I can fiddle with Jinja, but C++/QML would be a first for me; you would need to help me a lot.

If it is difficult to implement, let's postpone and wait for a version that supports a longer context. Meanwhile, users can use Llama-3.2-3B or Qwen 3B. IMHO, fixing the Jinja stuff is more important right now.

manyoso (Collaborator) commented Dec 22, 2024

The Jinja stuff can be fixed with some combination of four possible approaches:

  1. Make jinja2cpp more compatible with the Python Jinja parser
  2. Add more built-in compat templates to swap in for the ones detected in sideloaded GGUFs
  3. Add more curated models to our model list
  4. Educate the user base on how to modify and amend the Jinja templates (a sketch of what such a template looks like follows below)

#1 is obviously the best solution, but perhaps also the one that takes the most time. #2 is a good stopgap, but there will always be models missing. #3 we should do anyway. #4 is also a great idea, as modified templates will be necessary to support all the internal tools we plan on adding.
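To illustrate #2 and #4: a compat or user-modified template is just a short piece of Jinja that gets swapped in for the one shipped in the GGUF metadata. For example, a generic ChatML-style template (illustrative only, not tied to this particular model) looks like this:

```jinja
{#- Illustrative ChatML-style replacement template. `messages` and
    `add_generation_prompt` are the standard chat-template inputs;
    the <|im_start|>/<|im_end|> tokens follow the ChatML convention. -#}
{%- for message in messages %}
<|im_start|>{{ message['role'] }}
{{ message['content'] }}<|im_end|>
{%- endfor %}
{%- if add_generation_prompt %}
<|im_start|>assistant
{%- endif %}
```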

Keep in mind that the majority of users use the curated models. Those who sideload are a minority and are also the ones who should be most capable of #4. Hopefully, now with the new reasoning capability, people will sympathize with the reason we made this change; it unlocks a whole lot of future features and functionality.
