Is your feature request related to a problem? Please describe.
When I quantize a 200B model on 8xA100, the entire process is very slow, taking about 24 hours.
Describe the solution you'd like
If there were a way to quantize the model across multiple nodes, the process could be accelerated.
Hi @shuxiaobo, most compression algorithms have to run layer by layer, so that each layer's quantization error can be taken into account by the layers that follow it. We have discussed data-parallel quantization, which would leverage several GPUs for each layer; it is highly desired but will require some time and thought to implement, so it is not being actively developed at the moment.
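To make the layer-by-layer dependency concrete, here is a minimal sketch of sequential quantization. This is not llm-compressor's actual implementation: `fake_quantize` and `quantize_sequentially` are hypothetical helpers, and simple round-to-nearest stands in for a real recipe such as GPTQ. The key point is that each layer's calibration inputs depend on the already-quantized layers before it.

```python
import torch

def fake_quantize(weight: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Illustrative per-tensor, symmetric round-to-nearest quantization.
    Real recipes (e.g. GPTQ) solve a layer-wise reconstruction problem instead."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = weight.abs().max() / qmax
    return torch.round(weight / scale).clamp(-qmax, qmax) * scale

def quantize_sequentially(layers, calib_inputs):
    """Quantize layers one at a time, re-running the calibration batch through
    each quantized layer so that its quantization error is reflected in the
    activations the *next* layer calibrates against. This data dependency is
    what makes the process hard to parallelize across layers (and nodes)."""
    x = calib_inputs
    for layer in layers:
        layer.weight.data = fake_quantize(layer.weight.data)  # quantize in place
        with torch.no_grad():
            x = layer(x)  # activations now carry this layer's quantization error
    return layers

# Toy usage: a stack of linear layers and a random calibration batch.
layers = [torch.nn.Linear(64, 64) for _ in range(4)]
quantize_sequentially(layers, torch.randn(8, 64))
```

Data parallelism within a single layer (splitting the per-layer work across GPUs) is possible because it does not break this chain, which is why that is the direction we have discussed.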
Which recipes are you trying with the 200B parameter model?