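I deployed deepseek-coder-33b locally on two RTX A6000 GPUs (48 GB each) and started the server with:
python -m vllm.entrypoints.openai.api_server --model /home/superadmin/coder33b --trust-remote-code --tensor-parallel-size=2 --served-model-name=deepseek-coder
After startup, I call the /v1/completions endpoint to get inference results, but generation is extremely slow, averaging 15 tokens/s, as shown in the screenshot.
Has anyone else run into this? Any advice on how to fix it would be appreciated.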
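For anyone trying to reproduce the number, here is a minimal sketch of one way to measure tokens/s against the /v1/completions endpoint. It is not the script used for the measurement above. The helper names `tokens_per_second` and `measure_completion`, the address `http://localhost:8000`, the prompt, and `max_tokens=256` are all assumptions for illustration. It relies on the `usage.completion_tokens` field that the OpenAI-compatible response includes.

```python
import json
import time
import urllib.request


def tokens_per_second(completion_tokens: int, elapsed_seconds: float) -> float:
    """Return generated tokens per second for a single request."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completion_tokens / elapsed_seconds


def measure_completion(base_url: str, prompt: str, max_tokens: int = 256) -> float:
    """Send one non-streaming request to /v1/completions and return tokens/s."""
    payload = json.dumps({
        "model": "deepseek-coder",  # matches --served-model-name in the launch command
        "prompt": prompt,
        "max_tokens": max_tokens,   # assumed value, adjust as needed
    }).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    elapsed = time.perf_counter() - start
    # Wall-clock time includes prompt processing, not only decoding.
    return tokens_per_second(body["usage"]["completion_tokens"], elapsed)


if __name__ == "__main__":
    # Assumed address: adjust host and port to match your server.
    rate = measure_completion("http://localhost:8000", "def quicksort(arr):")
    print(f"{rate:.1f} tokens/s")
```

Note that this times a single sequential request, so it reflects per-request speed. Because vLLM batches concurrent requests, total throughput across several simultaneous requests can be higher than this figure.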
+1