Description
Name and Version
llama.cpp b5780 running in an Ubuntu 24.04 Docker container. The host system is Debian 12.
Clone: git clone --depth=1 --branch b5780 https://github.com/ggml-org/llama.cpp llama.cpp-b5780
CMake: cmake .. -DCMAKE_BUILD_TYPE=Debug -DGGML_OPENCL=ON -DGGML_OPENCL_USE_ADRENO_KERNELS=OFF
(The build type does not affect the issue; it reproduces with both Debug and Release builds.)
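For reference, the full build sequence I used (a sketch; the build directory name is inferred from the paths in the logs below, and the final `cmake --build` step is the standard one):

```sh
git clone --depth=1 --branch b5780 https://github.com/ggml-org/llama.cpp llama.cpp-b5780
cd llama.cpp-b5780
mkdir build-opencl-debug && cd build-opencl-debug
cmake .. -DCMAKE_BUILD_TYPE=Debug -DGGML_OPENCL=ON -DGGML_OPENCL_USE_ADRENO_KERNELS=OFF
cmake --build . -j"$(nproc)"
```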
/models/llama.cpp-b5780/build-opencl-debug# bin/llama-cli --version
ggml_opencl: selected platform: 'Intel(R) OpenCL Graphics'
ggml_opencl: device: 'Intel(R) UHD Graphics (OpenCL 3.0 NEO )'
ggml_opencl: OpenCL driver: 23.43.027642
ggml_opencl: vector subgroup broadcast support: false
ggml_opencl: device FP16 support: true
ggml_opencl: mem base addr align: 128
ggml_opencl: max mem alloc size: 4095 MB
ggml_opencl: SVM coarse grain buffer support: true
ggml_opencl: SVM fine grain buffer support: false
ggml_opencl: SVM fine grain system support: false
ggml_opencl: SVM atomics support: false
ggml_opencl: flattening quantized weights representation as struct of arrays (GGML_OPENCL_SOA_Q)
ggml_opencl: loading OpenCL kernels............................................
ggml_opencl: default device: 'Intel(R) UHD Graphics (OpenCL 3.0 NEO )'
register_backend: registered backend OpenCL (1 devices)
register_device: registered device GPUOpenCL (Intel(R) UHD Graphics)
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Intel(R) Core(TM) i3-N305)
load_backend: failed to find ggml_backend_init in /models/llama.cpp-b5780/build-opencl-debug/bin/libggml-opencl.so
load_backend: failed to find ggml_backend_init in /models/llama.cpp-b5780/build-opencl-debug/bin/libggml-cpu.so
version: 1 (caf5681)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
Operating systems
Linux
GGML backends
CPU, OpenCL
P.S. The "GGML backends" section in the issue template does not list "OpenCL" as an option; consider adding it?
Hardware
Intel Core i3-N305, with its UHD Graphics iGPU.
The system has 32GB of RAM, and the affected model is confirmed to run correctly on the CPU backend, as well as on the OpenCL backend with -ngl 0. The issue occurs when offloading at least one layer to the iGPU on the OpenCL backend.
Models
Qwen3-30B-A3B Q4_0 from unsloth. The SHA256 is confirmed to match what Hugging Face reports.
Also tested on a "pure" variant of this model, generated by ./llama-quantize --allow-requantize --pure Qwen3-30B-A3B-Q4_0.gguf Qwen3-30B-A3B-Q4_0-pure.gguf Q4_0; the issue persists.
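For anyone re-verifying, the checksum step amounts to the following (the expected hash is the one shown on the model's Hugging Face file page):

```sh
sha256sum /models/Qwen3-30B-A3B-Q4_0.gguf   # compare against the value listed on Hugging Face
```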
Problem description & steps to reproduce
I'm trying to run an LLM on my homelab, and thought Qwen3-30B-A3B would be a good choice. I tested it with a CPU build of llama.cpp and it works fine; however, it saturates the CPU, and I wanted to offload the model to the (otherwise unused) iGPU. I tried the ipex-llm and SYCL backends, but they really don't like MoE models, so after two weeks of frustration I gave up and started trying other backends. Albeit experimental, the OpenCL backend works well for me, except that when the prompt is long enough, llama-server silently crashes with SIGFPE. After much trial and error and debugging, I've found that the crash occurs only when the iGPU is actually used, and only when the prompt is longer than the ubatch size (-ub).
To reproduce the issue, compile llama.cpp with the CMake flags given above and run: bin/llama-cli -no-cnv -ngl 99 --model /models/Qwen3-30B-A3B-Q4_0.gguf -p "$(python3 -c 'print(" ".join("a" * 513))')". Instead of generating output normally, llama-cli crashes with "Floating point exception (core dumped)"; a GDB log of the crash is appended below. Reduce the prompt to <= 512 tokens (by modifying the Python one-liner) or set -ub to a value larger than 512, and the crash is gone.
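For clarity, the three invocations I compared (same build and model throughout; any -ub value above 512 avoids the crash, 1024 is just an example):

```sh
# Crashes with SIGFPE: 513-token prompt exceeds the default ubatch size of 512
bin/llama-cli -no-cnv -ngl 99 --model /models/Qwen3-30B-A3B-Q4_0.gguf \
  -p "$(python3 -c 'print(" ".join("a" * 513))')"

# No crash: prompt fits in a single ubatch
bin/llama-cli -no-cnv -ngl 99 --model /models/Qwen3-30B-A3B-Q4_0.gguf \
  -p "$(python3 -c 'print(" ".join("a" * 512))')"

# No crash: ubatch size raised above the prompt length
bin/llama-cli -no-cnv -ngl 99 -ub 1024 --model /models/Qwen3-30B-A3B-Q4_0.gguf \
  -p "$(python3 -c 'print(" ".join("a" * 513))')"
```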
First Bad Commit
I didn't have time to do a thorough bisect, but since qwen3moe support was only merged on April 8th, I suspect that every llama.cpp version since then has had the issue.
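If someone has the hardware to bisect, the procedure would be roughly as follows (the good commit is a hypothetical placeholder; it would have to be a tree from just before the qwen3moe merge, and each step means rebuilding with the CMake flags above and re-running the 513-token repro):

```sh
git bisect start
git bisect bad b5780
git bisect good <commit-before-qwen3moe-merge>   # hypothetical placeholder
# at each step: rebuild, run the 513-token repro,
# then `git bisect bad` on SIGFPE or `git bisect good` otherwise
```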
Relevant log output
/models/llama.cpp-b5780/build-opencl-debug# gdb bin/llama-cli
GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from bin/llama-cli...
(gdb) run -v -no-cnv -ngl 99 --model /models/Qwen3-30B-A3B-Q4_0.gguf -p "$(python3 -c 'print(" ".join("a" * 513))')"
Starting program: /models/llama.cpp-b5780/build-opencl-debug/bin/llama-cli -v -no-cnv -ngl 99 --model /models/Qwen3-30B-A3B-Q4_0.gguf -p "$(python3 -c 'print(" ".join("a" * 513))')"
warning: Error disabling address space randomization: Operation not permitted
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/liblber.so.2
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libbrotlidec.so.1
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libbrotlicommon.so.1
[New Thread 0x7fc1cab026c0 (LWP 4931)]
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libtinfo.so.6
ggml_opencl: selected platform: 'Intel(R) OpenCL Graphics'
ggml_opencl: device: 'Intel(R) UHD Graphics (OpenCL 3.0 NEO )'
ggml_opencl: OpenCL driver: 23.43.027642
ggml_opencl: vector subgroup broadcast support: false
ggml_opencl: device FP16 support: true
ggml_opencl: mem base addr align: 128
ggml_opencl: max mem alloc size: 4095 MB
ggml_opencl: SVM coarse grain buffer support: true
ggml_opencl: SVM fine grain buffer support: false
ggml_opencl: SVM fine grain system support: false
ggml_opencl: SVM atomics support: false
ggml_opencl: flattening quantized weights representation as struct of arrays (GGML_OPENCL_SOA_Q)
ggml_opencl: loading OpenCL kernels............................................
ggml_opencl: default device: 'Intel(R) UHD Graphics (OpenCL 3.0 NEO )'
register_backend: registered backend OpenCL (1 devices)
register_device: registered device GPUOpenCL (Intel(R) UHD Graphics)
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Intel(R) Core(TM) i3-N305)
load_backend: failed to find ggml_backend_init in /models/llama.cpp-b5780/build-opencl-debug/bin/libggml-opencl.so
load_backend: failed to find ggml_backend_init in /models/llama.cpp-b5780/build-opencl-debug/bin/libggml-cpu.so
[New Thread 0x7fc1c98f96c0 (LWP 4932)]
build: 1 (caf5681) with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu (debug)
main: llama backend init
main: load the model and apply lora adapter, if any
llama_model_load_from_file_impl: using device GPUOpenCL (Intel(R) UHD Graphics) - 0 MiB free
llama_model_loader: loaded meta data with 35 key-value pairs and 579 tensors from /models/Qwen3-30B-A3B-Q4_0.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3moe
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Qwen3-30B-A3B
llama_model_loader: - kv 3: general.basename str = Qwen3-30B-A3B
llama_model_loader: - kv 4: general.quantized_by str = Unsloth
llama_model_loader: - kv 5: general.size_label str = 30B-A3B
llama_model_loader: - kv 6: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 7: qwen3moe.block_count u32 = 48
llama_model_loader: - kv 8: qwen3moe.context_length u32 = 40960
llama_model_loader: - kv 9: qwen3moe.embedding_length u32 = 2048
llama_model_loader: - kv 10: qwen3moe.feed_forward_length u32 = 6144
llama_model_loader: - kv 11: qwen3moe.attention.head_count u32 = 32
llama_model_loader: - kv 12: qwen3moe.attention.head_count_kv u32 = 4
llama_model_loader: - kv 13: qwen3moe.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 14: qwen3moe.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 15: qwen3moe.expert_used_count u32 = 8
llama_model_loader: - kv 16: qwen3moe.attention.key_length u32 = 128
llama_model_loader: - kv 17: qwen3moe.attention.value_length u32 = 128
llama_model_loader: - kv 18: qwen3moe.expert_count u32 = 128
llama_model_loader: - kv 19: qwen3moe.expert_feed_forward_length u32 = 768
llama_model_loader: - kv 20: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 21: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 22: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 23: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 24: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 25: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 26: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 27: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 28: tokenizer.chat_template str = {%- if tools %}\n {{- '<|im_start|>...
llama_model_loader: - kv 29: general.quantization_version u32 = 2
llama_model_loader: - kv 30: general.file_type u32 = 2
llama_model_loader: - kv 31: quantize.imatrix.file str = Qwen3-30B-A3B-GGUF/imatrix_unsloth.dat
llama_model_loader: - kv 32: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-30B-A3B.txt
llama_model_loader: - kv 33: quantize.imatrix.entries_count i32 = 384
llama_model_loader: - kv 34: quantize.imatrix.chunks_count i32 = 685
llama_model_loader: - type f32: 241 tensors
llama_model_loader: - type q4_0: 331 tensors
llama_model_loader: - type q4_1: 6 tensors
llama_model_loader: - type q6_K: 1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_0
print_info: file size = 16.18 GiB (4.55 BPW)
init_tokenizer: initializing tokenizer for type 2
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: special tokens cache size = 26
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3moe
print_info: vocab_only = 0
print_info: n_ctx_train = 40960
print_info: n_embd = 2048
print_info: n_layer = 48
print_info: n_head = 32
print_info: n_head_kv = 4
print_info: n_rot = 128
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 128
print_info: n_embd_head_v = 128
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 6144
print_info: n_expert = 128
print_info: n_expert_used = 8
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 1000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 40960
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 0
print_info: ssm_d_inner = 0
print_info: ssm_d_state = 0
print_info: ssm_dt_rank = 0
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 30B.A3B
print_info: model params = 30.53 B
print_info: general.name = Qwen3-30B-A3B
print_info: n_ff_exp = 768
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = true)
load_tensors: layer 0 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 1 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 2 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 3 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 4 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 5 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 6 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 7 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 8 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 9 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 10 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 11 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 12 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 13 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 14 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 15 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 16 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 17 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 18 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 19 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 20 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 21 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 22 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 23 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 24 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 25 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 26 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 27 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 28 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 29 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 30 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 31 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 32 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 33 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 34 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 35 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 36 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 37 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 38 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 39 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 40 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 41 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 42 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 43 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 44 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 45 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 46 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 47 assigned to device GPUOpenCL, is_swa = 0
load_tensors: layer 48 assigned to device GPUOpenCL, is_swa = 0
load_tensors: tensor 'token_embd.weight' (q4_0) (and 6 others) cannot be used with preferred buffer type CPU_REPACK, using CPU instead
load_tensors: offloading 48 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: OpenCL model buffer size = 15682.23 MiB
load_tensors: CPU_Mapped model buffer size = 2032.76 MiB
....................................................................................................
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 4096
llama_context: n_ctx_per_seq = 4096
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = 0
llama_context: freq_base = 1000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_per_seq (4096) < n_ctx_train (40960) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: CPU output buffer size = 0.58 MiB
create_memory: n_ctx = 4096 (padded)
llama_kv_cache_unified: layer 0: dev = GPUOpenCL
llama_kv_cache_unified: layer 1: dev = GPUOpenCL
llama_kv_cache_unified: layer 2: dev = GPUOpenCL
llama_kv_cache_unified: layer 3: dev = GPUOpenCL
llama_kv_cache_unified: layer 4: dev = GPUOpenCL
llama_kv_cache_unified: layer 5: dev = GPUOpenCL
llama_kv_cache_unified: layer 6: dev = GPUOpenCL
llama_kv_cache_unified: layer 7: dev = GPUOpenCL
llama_kv_cache_unified: layer 8: dev = GPUOpenCL
llama_kv_cache_unified: layer 9: dev = GPUOpenCL
llama_kv_cache_unified: layer 10: dev = GPUOpenCL
llama_kv_cache_unified: layer 11: dev = GPUOpenCL
llama_kv_cache_unified: layer 12: dev = GPUOpenCL
llama_kv_cache_unified: layer 13: dev = GPUOpenCL
llama_kv_cache_unified: layer 14: dev = GPUOpenCL
llama_kv_cache_unified: layer 15: dev = GPUOpenCL
llama_kv_cache_unified: layer 16: dev = GPUOpenCL
llama_kv_cache_unified: layer 17: dev = GPUOpenCL
llama_kv_cache_unified: layer 18: dev = GPUOpenCL
llama_kv_cache_unified: layer 19: dev = GPUOpenCL
llama_kv_cache_unified: layer 20: dev = GPUOpenCL
llama_kv_cache_unified: layer 21: dev = GPUOpenCL
llama_kv_cache_unified: layer 22: dev = GPUOpenCL
llama_kv_cache_unified: layer 23: dev = GPUOpenCL
llama_kv_cache_unified: layer 24: dev = GPUOpenCL
llama_kv_cache_unified: layer 25: dev = GPUOpenCL
llama_kv_cache_unified: layer 26: dev = GPUOpenCL
llama_kv_cache_unified: layer 27: dev = GPUOpenCL
llama_kv_cache_unified: layer 28: dev = GPUOpenCL
llama_kv_cache_unified: layer 29: dev = GPUOpenCL
llama_kv_cache_unified: layer 30: dev = GPUOpenCL
llama_kv_cache_unified: layer 31: dev = GPUOpenCL
llama_kv_cache_unified: layer 32: dev = GPUOpenCL
llama_kv_cache_unified: layer 33: dev = GPUOpenCL
llama_kv_cache_unified: layer 34: dev = GPUOpenCL
llama_kv_cache_unified: layer 35: dev = GPUOpenCL
llama_kv_cache_unified: layer 36: dev = GPUOpenCL
llama_kv_cache_unified: layer 37: dev = GPUOpenCL
llama_kv_cache_unified: layer 38: dev = GPUOpenCL
llama_kv_cache_unified: layer 39: dev = GPUOpenCL
llama_kv_cache_unified: layer 40: dev = GPUOpenCL
llama_kv_cache_unified: layer 41: dev = GPUOpenCL
llama_kv_cache_unified: layer 42: dev = GPUOpenCL
llama_kv_cache_unified: layer 43: dev = GPUOpenCL
llama_kv_cache_unified: layer 44: dev = GPUOpenCL
llama_kv_cache_unified: layer 45: dev = GPUOpenCL
llama_kv_cache_unified: layer 46: dev = GPUOpenCL
llama_kv_cache_unified: layer 47: dev = GPUOpenCL
llama_kv_cache_unified: OpenCL KV buffer size = 384.00 MiB
llama_kv_cache_unified: size = 384.00 MiB ( 4096 cells, 48 layers, 1 seqs), K (f16): 192.00 MiB, V (f16): 192.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
llama_context: max_nodes = 65536
llama_context: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 0
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
ggml_gallocr_reserve_n: reallocating OpenCL buffer from size 0.00 MiB to 316.75 MiB
ggml_gallocr_reserve_n: reallocating CPU buffer from size 0.00 MiB to 76.01 MiB
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
llama_context: OpenCL compute buffer size = 316.75 MiB
llama_context: CPU compute buffer size = 76.01 MiB
llama_context: graph nodes = 3174
llama_context: graph splits = 98
clear_adapter_lora: call
common_init_from_params: setting dry_penalty_last_n to ctx_size = 4096
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
set_warmup: value = 1
ggml_backend_sched_alloc_splits: failed to allocate graph, reserving (backend_ids_changed = 1)
[New Thread 0x7fc16320b6c0 (LWP 4933)]
[New Thread 0x7fc162a0a6c0 (LWP 4934)]
[New Thread 0x7fc1622096c0 (LWP 4935)]
[New Thread 0x7fc161a086c0 (LWP 4936)]
[New Thread 0x7fc1612076c0 (LWP 4937)]
[New Thread 0x7fc160a066c0 (LWP 4938)]
[New Thread 0x7fc1602056c0 (LWP 4939)]
set_warmup: value = 0
main: llama threadpool init, n_threads = 8
attach_threadpool: call
system_info: n_threads = 8 (n_threads_batch = 8) / 8 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 |
n_ctx: 4096, add_bos: 0
tokenize the prompt
prompt: "a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a"
tokens: [ 'a':64, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264,' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, 'a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264,' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, 'a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264,' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, 'a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264,' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, 'a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' 
a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264,' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, 'a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264,' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264 ]
recalculate the cached logits (check): embd_inp.size() 513, n_matching_session_tokens 0, embd_inp.size() 513, session_tokens.size() 0
sampler seed: 3686199137
sampler params:
repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
dry_multiplier = 0.000, dry_base = 1.750, dry_allowed_length = 2, dry_penalty_last_n = 4096
top_k = 40, top_p = 0.950, min_p = 0.050, xtc_probability = 0.000, xtc_threshold = 0.100, typical_p = 1.000, top_n_sigma = -1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampler chain: logits -> logit-bias -> penalties -> dry -> top-n-sigma -> top-k -> typical -> top-p -> min-p -> xtc -> temp-ext -> dist
generate: n_ctx = 4096, n_batch = 2048, n_predict = -1, n_keep = 0
embd_inp.size(): 513, n_consumed: 0
a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a aa a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a aa a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a aa a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a aa a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a aa a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a aeval: [ 'a':64, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' 
a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264, ' a':264 ]
ggml_backend_sched_alloc_splits: failed to allocate graph, reserving (backend_ids_changed = 1)
Thread 1 "llama-cli" received signal SIGFPE, Arithmetic exception.
0x00007fc1ce2fa8c0 in ggml_cl_soft_max (backend=0x55b71c6b2de0, src0=0x7fc1966f2820, src1=0x0, dst=0x7fc1966f2990)
at /models/llama.cpp-b5780/ggml/src/ggml-opencl/ggml-opencl.cpp:5727
5727 const int n_head = nrows_x/nrows_y;
(gdb) bt
#0 0x00007fc1ce2fa8c0 in ggml_cl_soft_max (backend=0x55b71c6b2de0, src0=0x7fc1966f2820, src1=0x0, dst=0x7fc1966f2990)
at /models/llama.cpp-b5780/ggml/src/ggml-opencl/ggml-opencl.cpp:5727
#1 0x00007fc1ce30067c in ggml_cl_compute_forward (backend=0x55b71c6b2de0, tensor=0x7fc1966f2990) at /models/llama.cpp-b5780/ggml/src/ggml-opencl/ggml-opencl.cpp:6368
#2 0x00007fc1ce2db477 in ggml_backend_opencl_graph_compute (backend=0x55b71c6b2de0, cgraph=0x55b71c6365a8)
at /models/llama.cpp-b5780/ggml/src/ggml-opencl/ggml-opencl.cpp:2170
#3 0x00007fc1cd9b864a in ggml_backend_graph_compute_async (backend=0x55b71c6b2de0, cgraph=0x55b71c6365a8) at /models/llama.cpp-b5780/ggml/src/ggml-backend.cpp:334
#4 0x00007fc1cd9bc84e in ggml_backend_sched_compute_splits (sched=0x55b71c6bea80) at /models/llama.cpp-b5780/ggml/src/ggml-backend.cpp:1405
#5 0x00007fc1cd9bd4ee in ggml_backend_sched_graph_compute_async (sched=0x55b71c6bea80, graph=0x7fc196354030) at /models/llama.cpp-b5780/ggml/src/ggml-backend.cpp:1597
#6 0x00007fc1cdfbcc63 in llama_context::graph_compute (this=0x55b71c5fa2b0, gf=0x7fc196354030, batched=true) at /models/llama.cpp-b5780/src/llama-context.cpp:1375
#7 0x00007fc1cdfb9a6e in llama_context::process_ubatch (this=0x55b71c5fa2b0, ubatch=..., gtype=LLM_GRAPH_TYPE_DECODER, mctx=0x55b71d17abf0,
ret=@0x7fff1daf5618: GGML_STATUS_FAILED) at /models/llama.cpp-b5780/src/llama-context.cpp:712
#8 0x00007fc1cdfbaef9 in llama_context::decode (this=0x55b71c5fa2b0, batch_inp=...) at /models/llama.cpp-b5780/src/llama-context.cpp:1012
#9 0x00007fc1cdfc1af5 in llama_decode (ctx=0x55b71c5fa2b0, batch=...) at /models/llama.cpp-b5780/src/llama-context.cpp:2775
#10 0x000055b715a3433d in main (argc=9, argv=0x7fff1daf7998) at /models/llama.cpp-b5780/tools/main/main.cpp:671
(gdb) info locals
backend_ctx = 0x55b71b405c00
extra0 = 0x55b71d12ff60
extrad = 0x55b71d12ff40
extra1 = 0x0
offset0 = 0
offsetd = 0
offset1 = 0
ne00 = 128
ne01 = 0
ne02 = 1
ne03 = 1
scale = 1
max_bias = 0
nrows_x = 0
nrows_y = 0
n_head = 0
n_head_log2 = 0
m0 = 0
m1 = -5460992
use_f16 = false
nth = 0
kernel = 0x55b71ed6a980
global_work_size = {4096, 0, 1}
local_work_size = {32, 1, 1}
(gdb) info args
backend = 0x55b71c6b2de0
src0 = 0x7fc1966f2820
src1 = 0x0
dst = 0x7fc1966f2990
(gdb) print *src0
$1 = {type = GGML_TYPE_F32, buffer = 0x55b71d126ab0, ne = {128, 0, 1, 1}, nb = {4, 512, 0, 0}, op = GGML_OP_MUL_MAT, op_params = {0 <repeats 16 times>}, flags = 0, src = {
0x55b71ed579e0, 0x7fc1966f26b0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, view_src = 0x0, view_offs = 0, data = 0x80,
name = "ffn_moe_logits-47", '\000' <repeats 46 times>, extra = 0x55b71d12ff60, padding = "\000\000\000\000\000\000\000"}
(gdb)
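For what it's worth: the locals above show that src0 ("ffn_moe_logits-47") arrives at ggml_cl_soft_max with ne = {128, 0, 1, 1}, so nrows_x and nrows_y are both 0 and the integer division at ggml-opencl.cpp:5727 raises SIGFPE. A defensive check along these lines would turn the silent crash into a diagnosable abort (a sketch only; the actual bug is presumably whatever produces the zero-row tensor upstream):

```cpp
// ggml_cl_soft_max, around ggml-opencl.cpp:5727 (sketch of a guard, not the real fix)
const int nrows_x = ggml_nrows(src0);
const int nrows_y = src0->ne[1];  // 0 in the crashing case, per the GDB locals above
GGML_ASSERT(nrows_y > 0 && "ggml_cl_soft_max: src0 has zero rows");
const int n_head = nrows_x / nrows_y;
```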