Bash vs shell code container #417

Open
beliashou opened this issue Jan 23, 2025 · 4 comments
Comments

@beliashou

Good day!

All of my LLMs label their code blocks as bash, not shell, when generating responses, and gptme doesn't execute the commands. What am I doing wrong? How can I fix it?

@ErikBjare
Owner

ErikBjare commented Jan 23, 2025

You are probably using a model that is bad at instruction following. If it cannot follow the simple system instructions, I wouldn't expect the commands to be very correct either.

You can try the --tool-format tool (might not be supported for your provider) or --tool-format xml to try other tool calling formats, but you should probably just use a better model (Claude is best, GPT-4o is good, Llama 405B should also work ok).

You can see some basic eval results for different models here: https://gptme.org/docs/evals.html
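
To illustrate why the code block label matters, here is a minimal Python sketch. It assumes a hypothetical runner that only executes fenced blocks whose language tag is in an allowed set. This is not gptme's actual implementation; `extract_runnable_blocks`, `STRICT_TAGS`, and `LENIENT_TAGS` are made-up names for illustration only.

```python
import re

# Build the fence marker programmatically so this example nests cleanly.
FENCE = "`" * 3

# Hypothetical tag sets: a strict runner accepts only "shell",
# a lenient one also accepts common aliases.
STRICT_TAGS = {"shell"}
LENIENT_TAGS = {"shell", "bash", "sh"}

# Matches a fenced block: opening fence, language tag, body, closing fence.
FENCE_RE = re.compile(FENCE + r"(\w+)\n(.*?)" + FENCE, re.DOTALL)


def extract_runnable_blocks(message: str, allowed_tags: set[str]) -> list[str]:
    """Return the bodies of fenced blocks whose language tag is allowed."""
    return [
        body.strip()
        for tag, body in FENCE_RE.findall(message)
        if tag.lower() in allowed_tags
    ]


# A model reply that labels its block "bash" instead of "shell".
reply = f"Run this:\n{FENCE}bash\necho hello\n{FENCE}\n"

strict = extract_runnable_blocks(reply, STRICT_TAGS)    # block is skipped
lenient = extract_runnable_blocks(reply, LENIENT_TAGS)  # block is picked up
```

Under these assumptions, the strict runner finds nothing to execute in the reply, which matches the symptom reported above, while the lenient one extracts the command.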

@beliashou
Author

I'm trying to use Ollama with deepseek-r1. It's not bad. Where can I see the system prompt for the shell plugin?

@ErikBjare
Owner

ErikBjare commented Jan 24, 2025

You can see the system prompt by running gptme with --show-hidden, or by reading the docs: https://gptme.org/docs/prompts.html

DeepSeek R1 distillations in Ollama have been very disappointing for me when it comes to tool use and instruction following. I'm going to try a 70B distillation now that it's available on OpenRouter.

@ErikBjare
Owner

I can confirm that https://openrouter.ai/deepseek/deepseek-r1-distill-llama-70b works pretty well.
