Segmentation fault when exporting Qwen3-0.6B with the w8a8 quantization format #16013

@cxn-selfie

Description

When the quantization format was changed to w8a8, a segmentation fault occurred during the conversion to PTE.
Quantization configuration:
class Qwen3_0_6BQuantRecipe(StaticLLMQuantRecipe):
    # default_quant_dtype = QuantDtype.use_16a4w
    default_quant_dtype = QuantDtype.use_8a8w

    def __init__(self, verbose: bool = False):
        super().__init__()

        self.recipe = (
            QuantRecipe(
                self.default_quant_dtype,
                False,
                act_observer=MinMaxObserver,
                granularity=QuantGranularity.PER_TENSOR,
                verbose=verbose,
            )
            .add_node_target(
                {
                    torch.ops.aten.conv2d.default,
                },
                self.default_quant_dtype,
                False,
                act_observer=MinMaxObserver,
                granularity=QuantGranularity.PER_CHANNEL,
            )
            .add_regex(
                {
                    r"layers\..*\.feed_forward\.w2_conv",
                },
                self.default_quant_dtype,
                False,
                act_observer=MinMaxObserver,
                granularity=QuantGranularity.PER_CHANNEL,
            )
        )

Error output:
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_matmul_default_895, aten.matmul.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_1147, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_view_copy_default_1931, aten.view_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_permute_copy_default_3332, aten.permute_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_1004, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_permute_copy_default_3333, aten.permute_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_view_copy_default_1932, aten.view_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_add_tensor_1174, aten.add.Tensor
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_rms_norm_default_727, aten.rms_norm.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_view_copy_default_1933, aten.view_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_permute_copy_default_3334, aten.permute_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_1005, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_1006, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_sigmoid_default_27, aten.sigmoid.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_mul_tensor_2743, aten.mul.Tensor
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_mul_tensor_2744, aten.mul.Tensor
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_1007, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_permute_copy_default_3335, aten.permute_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_view_copy_default_1934, aten.view_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_add_tensor_1175, aten.add.Tensor
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_rms_norm_default_728, aten.rms_norm.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_unsqueeze_copy_default, aten.unsqueeze_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_permute_copy_default_3336, aten.permute_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_1008, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_permute_copy_default_3337, aten.permute_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten__to_copy_default_1589, aten._to_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_squeeze_copy_dims, aten.squeeze_copy.dims

====== DDR bandwidth summary ======
spill_bytes=0
fill_bytes=0
write_total_bytes=2447360
read_total_bytes=672777216

[ERROR] [Qnn ExecuTorch]: grdep_clone_op.cc:340::ERROR:failed to clone op(0x18FDB00000AA27)

Segmentation fault (core dumped)
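
In case it helps triage, here is a minimal sketch for summarizing a log like the one above. It only assumes the `Visiting: <node>, <target>` line format shown in the output. The function name `summarize_visits` and the `sample` string are hypothetical and not part of ExecuTorch. Since the error is printed after the DDR bandwidth summary, the last visited node is not necessarily the op that failed to clone; the summary is just a starting point for narrowing things down.

```python
import re
from collections import Counter

# Matches lines such as:
# INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_1147, aten.cat.default
VISIT_RE = re.compile(r"qnn_preprocess:Visiting: (?P<node>[^,\s]+), (?P<target>\S+)")


def summarize_visits(log_text):
    """Return (last_visited, op_counts) parsed from 'Visiting:' log lines.

    last_visited is a (node_name, target) tuple, or None if nothing matched.
    op_counts maps each target (e.g. 'aten.convolution.default') to its count.
    """
    visits = [(m.group("node"), m.group("target")) for m in VISIT_RE.finditer(log_text)]
    last_visited = visits[-1] if visits else None
    op_counts = Counter(target for _, target in visits)
    return last_visited, op_counts


# Hypothetical sample: a few lines copied from the log in this issue.
sample = """\
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_1008, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten__to_copy_default_1589, aten._to_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_squeeze_copy_dims, aten.squeeze_copy.dims
[ERROR] [Qnn ExecuTorch]: grdep_clone_op.cc:340::ERROR:failed to clone op(0x18FDB00000AA27)
"""

last_visited, op_counts = summarize_visits(sample)
print("last visited:", last_visited)
for target, count in op_counts.most_common():
    print(f"{count:5d}  {target}")
```

Running this over the full export log for both the 16a4w and 8a8w configurations and comparing the op counts could show whether the two runs lower a different set of ops before the failure.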

cc @cccclai @winskuo-quic @shewu-quic @haowhsu-quic @DannyYuyang-quic @cbilgin

Labels

module: qnn (Issues related to Qualcomm's QNN delegate and code under backends/qualcomm/)
partner: qualcomm (For backend delegation, kernels, demo, etc. from the 3rd-party partner, Qualcomm)