Description
Please describe the feature you want
Increase the default completion timeout to 1 minute and document the configuration for it.
Additional context
I was trying to use qwen2.5-coder:14b under Ollama, but every time I tried to trigger a completion it wouldn't work. I could see that GPU memory was being used, and the "Information > System" tab in the web UI showed the model being loaded successfully. I then tried the 7b model, and completion worked fine. After that, I realized that loading the 14b model was being cancelled exactly 30 seconds after it started. I searched through Tabby's code to see where to configure that and found that it is set via the [server] section of config.toml, but this is not covered in the official documentation.
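For anyone hitting the same problem, here is a rough sketch of the kind of override I mean. The key name `completion_timeout` and its unit (seconds) are assumptions based on my reading of the code, not something the documentation confirms, so please check them against the Tabby version you are running.

```toml
# config.toml (sketch only; key name and unit are assumptions, not documented)
[server]
# Assumed to be in seconds. The observed default behaviour was a cancel after 30 seconds;
# 60 matches the 1 minute default proposed in this request.
completion_timeout = 60
```

A larger value should give bigger models such as the 14b one enough time to finish loading before the first completion request is cancelled.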
Please reply with a 👍 if you want this feature.