Description
==============================
Versions of relevant libraries
[pip3] flake8==7.1.1
[pip3] numpy==1.26.4
[pip3] nvidia-cublas-cu12==12.6.4.1
[pip3] nvidia-cuda-cupti-cu12==12.6.80
[pip3] nvidia-cuda-nvrtc-cu12==12.6.77
[pip3] nvidia-cuda-runtime-cu12==12.6.77
[pip3] nvidia-cudnn-cu12==9.5.1.17
[pip3] nvidia-cufft-cu12==11.3.0.4
[pip3] nvidia-cufile-cu12==1.11.1.6
[pip3] nvidia-curand-cu12==10.3.7.77
[pip3] nvidia-cusolver-cu12==11.7.1.2
[pip3] nvidia-cusparse-cu12==12.5.4.2
[pip3] nvidia-cusparselt-cu12==0.6.3
[pip3] nvidia-ml-py==12.570.86
[pip3] nvidia-nccl-cu12==2.26.2
[pip3] nvidia-nvjitlink-cu12==12.6.85
[pip3] nvidia-nvtx-cu12==12.6.77
[pip3] pynvml==12.0.0
[pip3] pyzmq==26.2.0
[pip3] torch==2.7.0
[pip3] torchaudio==2.7.0
[pip3] torchvision==0.22.0
[pip3] transformers==4.53.0
[pip3] transformers-stream-generator==0.0.5
[pip3] triton==3.3.0
[conda] numpy 1.26.4 pypi_0 pypi
[conda] nvidia-cublas-cu12 12.6.4.1 pypi_0 pypi
[conda] nvidia-cuda-cupti-cu12 12.6.80 pypi_0 pypi
[conda] nvidia-cuda-nvrtc-cu12 12.6.77 pypi_0 pypi
[conda] nvidia-cuda-runtime-cu12 12.6.77 pypi_0 pypi
[conda] nvidia-cudnn-cu12 9.5.1.17 pypi_0 pypi
[conda] nvidia-cufft-cu12 11.3.0.4 pypi_0 pypi
[conda] nvidia-cufile-cu12 1.11.1.6 pypi_0 pypi
[conda] nvidia-curand-cu12 10.3.7.77 pypi_0 pypi
[conda] nvidia-cusolver-cu12 11.7.1.2 pypi_0 pypi
[conda] nvidia-cusparse-cu12 12.5.4.2 pypi_0 pypi
[conda] nvidia-cusparselt-cu12 0.6.3 pypi_0 pypi
[conda] nvidia-ml-py 12.570.86 pypi_0 pypi
[conda] nvidia-nccl-cu12 2.26.2 pypi_0 pypi
[conda] nvidia-nvjitlink-cu12 12.6.85 pypi_0 pypi
[conda] nvidia-nvtx-cu12 12.6.77 pypi_0 pypi
[conda] pynvml 12.0.0 pypi_0 pypi
[conda] pyzmq 26.2.0 pypi_0 pypi
[conda] torch 2.7.0 pypi_0 pypi
[conda] torchaudio 2.7.0 pypi_0 pypi
[conda] torchvision 0.22.0 pypi_0 pypi
[conda] transformers 4.53.0 pypi_0 pypi
[conda] transformers-stream-generator 0.0.5 pypi_0 pypi
[conda] triton 3.3.0 pypi_0 pypi
🐛 Describe the bug
The model was pruned with 2:4 structured sparsity and is now being fine-tuned. Launching the fine-tuning in parallel with torchrun:
torchrun --nproc_per_node=4 --nnodes=1 --node_rank=0 --master_addr=localhost --master_port=29500 llama7b_sparse_w4a16.py
fails with the following error:
[rank0]: Traceback (most recent call last):
[rank0]: File "/opt/conda/lib/python3.10/site-packages/llmcompressor/modifiers/pruning/utils/pytorch/layer_mask.py", line 60, in add_mask
[rank0]: parameterized_layer.layer.get_buffer(mask_name)
[rank0]: File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 863, in get_buffer
[rank0]: raise AttributeError(
[rank0]: AttributeError: Linear has no attribute `mask`
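For context, the failure mode can be reproduced outside the training script: nn.Module.get_buffer raises AttributeError whenever the named buffer was never registered on that module, which suggests the pruning modifier's mask buffer is not being registered on (or is lost from) the DDP-wrapped replicas under torchrun. A minimal sketch of the mechanism (the "mask" name mirrors layer_mask.py; the register_buffer call is illustrative, not llmcompressor's actual code path):

import torch.nn as nn

layer = nn.Linear(16, 16)

# get_buffer raises AttributeError for a buffer that was never registered,
# matching the traceback above.
try:
    layer.get_buffer("mask")
except AttributeError as err:
    print(err)  # Linear has no attribute `mask`

# Once the buffer is registered on the module, the same lookup succeeds.
layer.register_buffer("mask", (layer.weight != 0).to(layer.weight.dtype))
print(layer.get_buffer("mask").shape)  # torch.Size([16, 16])

When run single-process the script works, so the buffer registration appears to happen only on the unwrapped model, not on the modules that add_mask sees under distributed launch.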