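With TF 1.15, I am running distributed training with 1 chief, 1 ps, and 4 workers. During training, CPU usage on the ps keeps climbing, while CPU usage on the chief and workers later drops and the number of batches trained per second falls as well. What is causing this?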
Please use perf to check which functions account for most of the ps process's CPU time, then analyze whether that consumption is reasonable.
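One way to follow this suggestion is to sample the ps process for a fixed window and then print a per-symbol summary. The sketch below is only an illustration: it assumes Linux `perf` is installed on the ps host and that you have permission to attach to the process. The helper names, the 30-second window, and the `ps_perf.data` file name are placeholders rather than anything from this project.

```python
import subprocess


def perf_record_cmd(pid, seconds=30, out="ps_perf.data"):
    """Build a `perf record` command that samples one process with call graphs.

    `-p` attaches to an existing PID, `-g` records call graphs, and the
    trailing `sleep` bounds how long perf keeps sampling.
    """
    return ["perf", "record", "-g", "-p", str(pid), "-o", out,
            "--", "sleep", str(seconds)]


def perf_report_cmd(data="ps_perf.data"):
    """Build a `perf report` command that prints a text summary by symbol."""
    return ["perf", "report", "-i", data, "--stdio", "--sort", "symbol"]


def profile_ps(pid, seconds=30, out="ps_perf.data"):
    """Sample the ps process, then return the plain-text report."""
    subprocess.run(perf_record_cmd(pid, seconds, out), check=True)
    result = subprocess.run(perf_report_cmd(out), check=True,
                            capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    import sys
    # Usage (hypothetical): python profile_ps.py <pid-of-ps-process>
    print(profile_ps(int(sys.argv[1])))
```

Taking one sample early in training and another after throughput has dropped, then comparing which symbols gained share, should show where the extra ps CPU time is going.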
CPU usage of the ps during training
CPU usage of the chief during training (the workers look similar)
Batches trained per second during training