
Eval bug: b7574 error loading model: read error: Bad address #18473

@thomas-0816

Description

Name and Version

./llama-cli --version
load_backend: loaded RPC backend from /media/veracrypt1/code/llama-b7574/libggml-rpc.so
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Unknown (RADV GFX1103_R1) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: none
load_backend: loaded Vulkan backend from /media/veracrypt1/code/llama-b7574/libggml-vulkan.so
load_backend: loaded CPU backend from /media/veracrypt1/code/llama-b7574/libggml-cpu-zen4.so
version: 7574 (5b1248c)
built with GNU 11.4.0 for Linux x86_64

Operating systems

Linux

GGML backends

Vulkan

Hardware

AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics

Models

unsloth/gpt-oss-20b-GGUF:F16

Problem description & steps to reproduce

./llama-server -hf unsloth/gpt-oss-20b-GGUF:F16 --jinja -ngl 99 --threads -1 --parallel 4 --ctx-size 16384 --temp 1.0 --top-p 1.0 --top-k 0 --no-mmap --kv-unified --n_predict 4096 --chat-template-kwargs '{"reasoning_effort": "low"}'

Using b7574 (Ubuntu x64 Vulkan), model loading fails with:

load_tensors:      Vulkan0 model buffer size = 12036.67 MiB
load_tensors:  Vulkan_Host model buffer size =  1104.61 MiB
llama_model_load: error loading model: read error: Bad address

(full output in "Relevant log output" below)

Using b7472 (Ubuntu x64 Vulkan), there is no such error.
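
Since the failure happens during load_tensors, the server-specific flags shouldn't matter. A plain llama-cli run with only the load-relevant options (a sketch, untested, but it should exercise the same load path; -ngl and --no-mmap are the flags that plausibly matter) ought to reproduce it:

./llama-cli -hf unsloth/gpt-oss-20b-GGUF:F16 -ngl 99 --no-mmap -n 1 -p hi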

I'm using Ubuntu 22.04.5 LTS.
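
For context on the message itself: "Bad address" is the strerror() text for errno EFAULT, which read() sets when the destination buffer it is handed lies outside the process's accessible address space. With --no-mmap, tensor data is read() directly into host memory allocated by the backend, so a bad or unmapped host pointer behind the Vulkan buffer would be consistent with this error. A minimal standalone illustration of how the message arises (not llama.cpp code, just the errno mechanics):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd = open("/etc/hostname", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    /* Deliberately invalid destination pointer: the kernel cannot
       write here, so read() fails with errno == EFAULT. */
    char *bad_buf = (char *)0x1;
    if (read(fd, bad_buf, 64) < 0)
        perror("read");   /* prints: read: Bad address */
    close(fd);
    return 0;
}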

First Bad Commit

b7574 fails
b7502 fails
b7501 succeeds
b7472 succeeds

So b7501 is the last good release and b7502 the first bad one; the regression was introduced somewhere in the b7501..b7502 range (bisect sketch below).
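
Assuming the release tags map to commits in the llama.cpp repository, a git bisect over that range should pin the exact commit (a sketch, not run here; the cached model path is taken from the log above):

git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
git bisect start b7502 b7501    # <first-bad> <last-good>
git bisect run sh -c '
    cmake -B build -DGGML_VULKAN=ON &&
    cmake --build build -j --target llama-cli || exit 125   # 125 = skip unbuildable commits
    ./build/bin/llama-cli -m /home/tb/.cache/llama.cpp/unsloth_gpt-oss-20b-GGUF_gpt-oss-20b-F16.gguf \
        -ngl 99 --no-mmap -n 1 -p hi
'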

Relevant log output

load_tensors: loading model tensors, this can take a while... (mmap = false)
load_tensors: offloading output layer to GPU
load_tensors: offloading 23 repeating layers to GPU
load_tensors: offloaded 25/25 layers to GPU
load_tensors:      Vulkan0 model buffer size = 12036.67 MiB
load_tensors:  Vulkan_Host model buffer size =  1104.61 MiB
llama_model_load: error loading model: read error: Bad address
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/home/tb/.cache/llama.cpp/unsloth_gpt-oss-20b-GGUF_gpt-oss-20b-F16.gguf'
srv    load_model: failed to load model, '/home/tb/.cache/llama.cpp/unsloth_gpt-oss-20b-GGUF_gpt-oss-20b-F16.gguf'
srv    operator(): operator(): cleaning up before exit...
main: exiting due to model loading error


Labels

AMD GPU, Vulkan, bug-unconfirmed
