webui crashes after sending a prompt
Describe the bug

The webui crashes as soon as a prompt is sent to the model. Loading the model succeeds; the backend aborts on the first generation step (see the sketch below and the logs).

Reproduction

./start_linux.sh
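A minimal way to hit the crash from the command line, as a sketch only: it assumes start_linux.sh forwards extra flags to server.py and that the standard --model and --loader options are available in this install.

# Hypothetical minimal repro (flags assumed to be forwarded by the launcher):
./start_linux.sh --model Phi-3-mini-4k-instruct-fp16.gguf --loader llama.cpp
# then send any message in the Chat tab; the backend aborts with the log below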
Screenshot

No response

Logs
❯ ./start_linux.sh
14:01:15-573868 INFO Starting Text generation web UI
Running on local URL: http://127.0.0.1:7860
14:01:45-565977 INFO Loading "Phi-3-mini-4k-instruct-fp16.gguf"
14:01:45-598096 INFO llama.cpp weights detected: "models/Phi-3-mini-4k-instruct-fp16.gguf"
llama_model_loader: loaded meta data with 24 key-value pairs and 291 tensors from models/Phi-3-mini-4k-instruct-fp16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv  0: general.architecture str = llama
llama_model_loader: - kv  1: general.name str = LLaMA v2
llama_model_loader: - kv  2: llama.vocab_size u32 = 32064
llama_model_loader: - kv  3: llama.context_length u32 = 4096
llama_model_loader: - kv  4: llama.embedding_length u32 = 3072
llama_model_loader: - kv  5: llama.block_count u32 = 32
llama_model_loader: - kv  6: llama.feed_forward_length u32 = 8192
llama_model_loader: - kv  7: llama.rope.dimension_count u32 = 96
llama_model_loader: - kv  8: llama.attention.head_count u32 = 32
llama_model_loader: - kv  9: llama.attention.head_count_kv u32 = 32
llama_model_loader: - kv 10: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 11: llama.rope.freq_base f32 = 10000.000000
llama_model_loader: - kv 12: general.file_type u32 = 1
llama_model_loader: - kv 13: tokenizer.ggml.model str = llama
llama_model_loader: - kv 14: tokenizer.ggml.tokens arr[str,32064] = ["<unk>", "<s>", "</s>", "<0x00>", "<...
llama_model_loader: - kv 15: tokenizer.ggml.scores arr[f32,32064] = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,32064] = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 1
llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 32000
llama_model_loader: - kv 19: tokenizer.ggml.unknown_token_id u32 = 0
llama_model_loader: - kv 20: tokenizer.ggml.padding_token_id u32 = 32000
llama_model_loader: - kv 21: tokenizer.ggml.add_bos_token bool = true
llama_model_loader: - kv 22: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 23: tokenizer.chat_template str = {{ bos_token }}{% for message in mess...
llama_model_loader: - type f32: 65 tensors
llama_model_loader: - type f16: 226 tensors
llm_load_vocab: control-looking token: '<|end|>' was not control-type; this is probably a bug in the model. its type will be overridden
llm_load_vocab: control-looking token: '<|endoftext|>' was not control-type; this is probably a bug in the model. its type will be overridden
llm_load_vocab: special tokens cache size = 67
llm_load_vocab: token to piece cache size = 0.1691 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = llama
llm_load_print_meta: vocab type = SPM
llm_load_print_meta: n_vocab = 32064
llm_load_print_meta: n_merges = 0
llm_load_print_meta: vocab_only = 0
llm_load_print_meta: n_ctx_train = 4096
llm_load_print_meta: n_embd = 3072
llm_load_print_meta: n_layer = 32
llm_load_print_meta: n_head = 32
llm_load_print_meta: n_head_kv = 32
llm_load_print_meta: n_rot = 96
llm_load_print_meta: n_swa = 0
llm_load_print_meta: n_embd_head_k = 96
llm_load_print_meta: n_embd_head_v = 96
llm_load_print_meta: n_gqa = 1
llm_load_print_meta: n_embd_k_gqa = 3072
llm_load_print_meta: n_embd_v_gqa = 3072
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-05
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 8192
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 0
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 10000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 4096
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: ssm_dt_b_c_rms = 0
llm_load_print_meta: model type = 7B
llm_load_print_meta: model ftype = F16
llm_load_print_meta: model params = 3.82 B
llm_load_print_meta: model size = 7.12 GiB (16.00 BPW)
llm_load_print_meta: general.name = LLaMA v2
llm_load_print_meta: BOS token = 1 '<s>'
llm_load_print_meta: EOS token = 32000 '<|endoftext|>'
llm_load_print_meta: UNK token = 0 '<unk>'
llm_load_print_meta: PAD token = 32000 '<|endoftext|>'
llm_load_print_meta: LF token = 13 '<0x0A>'
llm_load_print_meta: EOT token = 32007 '<|end|>'
llm_load_print_meta: EOG token = 32000 '<|endoftext|>'
llm_load_print_meta: EOG token = 32007 '<|end|>'
llm_load_print_meta: max token length = 48
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 6900 XT, compute capability 10.3, VMM: no
llm_load_tensors: ggml ctx size = 0.27 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors: ROCm0 buffer size = 7100.64 MiB
llm_load_tensors: CPU buffer size = 187.88 MiB
.................................................................................................
llama_new_context_with_model: n_ctx = 4096
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: ROCm0 KV buffer size = 1536.00 MiB
llama_new_context_with_model: KV self size = 1536.00 MiB, K (f16): 768.00 MiB, V (f16): 768.00 MiB
llama_new_context_with_model: ROCm_Host output buffer size = 0.12 MiB
llama_new_context_with_model: ROCm0 compute buffer size = 288.00 MiB
llama_new_context_with_model: ROCm_Host compute buffer size = 14.01 MiB
llama_new_context_with_model: graph nodes = 1030
llama_new_context_with_model: graph splits = 2
AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | RISCV_VECT = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
Model metadata: {'tokenizer.chat_template': "{{ bos_token }}{% for message in messages %}{{'<|' + message['role'] + '|>' + '\n' + message['content'] + '<|end|>\n' }}{% endfor %}{% if add_generation_prompt %}{{ '<|assistant|>\n' }}{% else %}{{ eos_token }}{% endif %}", 'tokenizer.ggml.add_eos_token': 'false', 'tokenizer.ggml.padding_token_id': '32000', 'tokenizer.ggml.unknown_token_id': '0', 'tokenizer.ggml.eos_token_id': '32000', 'tokenizer.ggml.model': 'llama', 'general.architecture': 'llama', 'llama.rope.freq_base': '10000.000000', 'llama.context_length': '4096', 'general.name': 'LLaMA v2', 'llama.vocab_size': '32064', 'general.file_type': '1', 'tokenizer.ggml.add_bos_token': 'true', 'llama.embedding_length': '3072', 'llama.feed_forward_length': '8192', 'llama.attention.layer_norm_rms_epsilon': '0.000010', 'llama.rope.dimension_count': '96', 'tokenizer.ggml.bos_token_id': '1', 'llama.attention.head_count': '32', 'llama.block_count': '32', 'llama.attention.head_count_kv': '32'}
Available chat formats from metadata: chat_template.default
Using gguf chat template: {{ bos_token }}{% for message in messages %}{{'<|' + message['role'] + '|>' + ' ' + message['content'] + '<|end|> ' }}{% endfor %}{% if add_generation_prompt %}{{ '<|assistant|> ' }}{% else %}{{ eos_token }}{% endif %}
Using chat eos_token: <|endoftext|>
Using chat bos_token: <s>
14:01:47-732817 INFO Loaded "Phi-3-mini-4k-instruct-fp16.gguf" in 2.17 seconds.
14:01:47-733607 INFO LOADER: "llama.cpp"
14:01:47-734197 INFO TRUNCATION LENGTH: 4096
14:01:47-734685 INFO INSTRUCTION TEMPLATE: "Custom (obtained from model metadata)"
ggml_cuda_compute_forward: RMS_NORM failed
CUDA error: invalid device function
  current device: 0, in function ggml_cuda_compute_forward at /home/runner/work/llama-cpp-python-cuBLAS-wheels/llama-cpp-python-cuBLAS-wheels/vendor/llama.cpp/ggml/src/ggml-cuda.cu:2368
  err
/home/runner/work/llama-cpp-python-cuBLAS-wheels/llama-cpp-python-cuBLAS-wheels/vendor/llama.cpp/ggml/src/ggml-cuda.cu:106: CUDA error
[New LWP 12809]
[New LWP 12789]
[New LWP 12788]
[New LWP 12751]
[New LWP 12716]
[New LWP 12715]
[New LWP 12714]
[New LWP 12713]
[New LWP 12712]
[New LWP 12711]
[New LWP 12710]
[New LWP 12709]
[New LWP 12708]
[New LWP 12707]
[New LWP 12706]
[New LWP 12705]
[New LWP 12704]
[New LWP 12703]
[New LWP 12702]
[New LWP 12701]
[New LWP 12700]
[New LWP 12699]
This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007fe8071e5c13 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6
#0  0x00007fe8071e5c13 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6
#1  0x0000000000645275 in pysleep (timeout=<optimized out>) at /usr/local/src/conda/python-3.11.10/Modules/timemodule.c:2159
warning: 2159 /usr/local/src/conda/python-3.11.10/Modules/timemodule.c: No such file or directory
#2  time_sleep (self=<optimized out>, timeout_obj=<optimized out>) at /usr/local/src/conda/python-3.11.10/Modules/timemodule.c:383
383 in /usr/local/src/conda/python-3.11.10/Modules/timemodule.c
#3  0x0000000000511e46 in _PyEval_EvalFrameDefault (tstate=tstate@entry=0x8a7a38 <_PyRuntime+166328>, frame=<optimized out>, frame@entry=0x7fe8073fa020, throwflag=throwflag@entry=0) at /usr/local/src/conda/python-3.11.10/Python/ceval.c:5020
warning: 5020 /usr/local/src/conda/python-3.11.10/Python/ceval.c: No such file or directory
#4  0x00000000005cc1ea in _PyEval_EvalFrame (throwflag=0, frame=0x7fe8073fa020, tstate=0x8a7a38 <_PyRuntime+166328>) at /usr/local/src/conda/python-3.11.10/Include/internal/pycore_ceval.h:73
warning: 73 /usr/local/src/conda/python-3.11.10/Include/internal/pycore_ceval.h: No such file or directory
#5  _PyEval_Vector (tstate=tstate@entry=0x8a7a38 <_PyRuntime+166328>, func=func@entry=0x7fe8070987c0, locals=locals@entry=0x7fe8070f24c0, args=args@entry=0x0, argcount=argcount@entry=0, kwnames=kwnames@entry=0x0) at /usr/local/src/conda/python-3.11.10/Python/ceval.c:6434
warning: 6434 /usr/local/src/conda/python-3.11.10/Python/ceval.c: No such file or directory
#6  0x00000000005cb8bf in PyEval_EvalCode (co=co@entry=0xbac8130, globals=globals@entry=0x7fe8070f24c0, locals=locals@entry=0x7fe8070f24c0) at /usr/local/src/conda/python-3.11.10/Python/ceval.c:1148
1148 in /usr/local/src/conda/python-3.11.10/Python/ceval.c
#7  0x00000000005ec9e7 in run_eval_code_obj (tstate=tstate@entry=0x8a7a38 <_PyRuntime+166328>, co=co@entry=0xbac8130, globals=globals@entry=0x7fe8070f24c0, locals=locals@entry=0x7fe8070f24c0) at /usr/local/src/conda/python-3.11.10/Python/pythonrun.c:1741
warning: 1741 /usr/local/src/conda/python-3.11.10/Python/pythonrun.c: No such file or directory
#8  0x00000000005e8580 in run_mod (mod=mod@entry=0xbae9900, filename=filename@entry=0x7fe80702d300, globals=globals@entry=0x7fe8070f24c0, locals=locals@entry=0x7fe8070f24c0, flags=flags@entry=0x7fff954f7af8, arena=arena@entry=0x7fe80701b630) at /usr/local/src/conda/python-3.11.10/Python/pythonrun.c:1762
1762 in /usr/local/src/conda/python-3.11.10/Python/pythonrun.c
#9  0x00000000005fd4d2 in pyrun_file (fp=fp@entry=0xba23080, filename=filename@entry=0x7fe80702d300, start=start@entry=257, globals=globals@entry=0x7fe8070f24c0, locals=locals@entry=0x7fe8070f24c0, closeit=closeit@entry=1, flags=0x7fff954f7af8) at /usr/local/src/conda/python-3.11.10/Python/pythonrun.c:1657
1657 in /usr/local/src/conda/python-3.11.10/Python/pythonrun.c
#10 0x00000000005fc89f in _PyRun_SimpleFileObject (fp=0xba23080, filename=0x7fe80702d300, closeit=1, flags=0x7fff954f7af8) at /usr/local/src/conda/python-3.11.10/Python/pythonrun.c:440
440 in /usr/local/src/conda/python-3.11.10/Python/pythonrun.c
#11 0x00000000005fc5c3 in _PyRun_AnyFileObject (fp=0xba23080, filename=filename@entry=0x7fe80702d300, closeit=closeit@entry=1, flags=flags@entry=0x7fff954f7af8) at /usr/local/src/conda/python-3.11.10/Python/pythonrun.c:79
79 in /usr/local/src/conda/python-3.11.10/Python/pythonrun.c
#12 0x00000000005f723e in pymain_run_file_obj (skip_source_first_line=0, filename=0x7fe80702d300, program_name=0x7fe8070f26b0) at /usr/local/src/conda/python-3.11.10/Modules/main.c:360
warning: 360 /usr/local/src/conda/python-3.11.10/Modules/main.c: No such file or directory
#13 pymain_run_file (config=0x88da80 <_PyRuntime+59904>) at /usr/local/src/conda/python-3.11.10/Modules/main.c:379
379 in /usr/local/src/conda/python-3.11.10/Modules/main.c
#14 pymain_run_python (exitcode=0x7fff954f7af0) at /usr/local/src/conda/python-3.11.10/Modules/main.c:605
605 in /usr/local/src/conda/python-3.11.10/Modules/main.c
#15 Py_RunMain () at /usr/local/src/conda/python-3.11.10/Modules/main.c:684
684 in /usr/local/src/conda/python-3.11.10/Modules/main.c
#16 0x00000000005bbf89 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at /usr/local/src/conda/python-3.11.10/Modules/main.c:738
738 in /usr/local/src/conda/python-3.11.10/Modules/main.c
#17 0x00007fe80712c088 in __libc_start_call_main () from /lib64/libc.so.6
#18 0x00007fe80712c14b in __libc_start_main_impl () from /lib64/libc.so.6
#19 0x00000000005bbdd3 in _start ()
[Inferior 1 (process 12681) detached]
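The failing call is the first GPU op after generation starts: ggml_cuda_compute_forward: RMS_NORM failed with "CUDA error: invalid device function" on device 0, coming from the prebuilt llama-cpp-python-cuBLAS-wheels build. My assumption (not verified) is that the wheel's GPU kernels were not compiled for this card's architecture (gfx1030). For reference, the GPU target and the installed wheel can be checked with the commands below; the package name may differ inside the webui's bundled environment.

# Check the gfx target reported by ROCm and the installed llama-cpp-python build
rocminfo | grep -i "gfx"
pip show llama_cpp_python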
System Info

fedora 6.11.5-200.fc40.x86_64 Fedora 40
CPU: AMD Ryzen 7 5800X (16) @ 4.85 GHz
GPU: AMD Radeon RX 6900 XT [Discrete]

rocminfo
========
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
Runtime Ext Version: 1.4
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: YES

==========
HSA Agents
==========
*******
Agent 1
*******
  Name: AMD Ryzen 7 5800X 8-Core Processor
  Uuid: CPU-XX
  Marketing Name: AMD Ryzen 7 5800X 8-Core Processor
  Vendor Name: CPU
  Feature: None specified
  Profile: FULL_PROFILE
  Float Round Mode: NEAR
  Max Queue Number: 0(0x0)
  Queue Min Size: 0(0x0)
  Queue Max Size: 0(0x0)
  Queue Type: MULTI
  Node: 0
  Device Type: CPU
  Cache Info:
    L1: 32768(0x8000) KB
  Chip ID: 0(0x0)
  ASIC Revision: 0(0x0)
  Cacheline Size: 64(0x40)
  Max Clock Freq. (MHz): 4851
  BDFID: 0
  Internal Node ID: 0
  Compute Unit: 16
  SIMDs per CU: 0
  Shader Engines: 0
  Shader Arrs. per Eng.: 0
  WatchPts on Addr. Ranges: 1
  Features: None
  Pool Info:
    Pool 1
      Segment: GLOBAL; FLAGS: FINE GRAINED
      Size: 40969404(0x27124bc) KB
      Allocatable: TRUE
      Alloc Granule: 4KB
      Alloc Recommended Granule: 4KB
      Alloc Alignment: 4KB
      Accessible by all: TRUE
    Pool 2
      Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size: 40969404(0x27124bc) KB
      Allocatable: TRUE
      Alloc Granule: 4KB
      Alloc Recommended Granule: 4KB
      Alloc Alignment: 4KB
      Accessible by all: TRUE
    Pool 3
      Segment: GLOBAL; FLAGS: COARSE GRAINED
      Size: 40969404(0x27124bc) KB
      Allocatable: TRUE
      Alloc Granule: 4KB
      Alloc Recommended Granule: 4KB
      Alloc Alignment: 4KB
      Accessible by all: TRUE
  ISA Info:
*******
Agent 2
*******
  Name: gfx1030
  Uuid: GPU-762c9ecf002e0002
  Marketing Name: AMD Radeon RX 6900 XT
  Vendor Name: AMD
  Feature: KERNEL_DISPATCH
  Profile: BASE_PROFILE
  Float Round Mode: NEAR
  Max Queue Number: 128(0x80)
  Queue Min Size: 64(0x40)
  Queue Max Size: 131072(0x20000)
  Queue Type: MULTI
  Node: 1
  Device Type: GPU
  Cache Info:
    L1: 16(0x10) KB
    L2: 4096(0x1000) KB
    L3: 131072(0x20000) KB
  Chip ID: 29615(0x73af)
  ASIC Revision: 1(0x1)
  Cacheline Size: 128(0x80)
  Max Clock Freq. (MHz): 2720
  BDFID: 2816
  Internal Node ID: 1
  Compute Unit: 80
  SIMDs per CU: 2
  Shader Engines: 4
  Shader Arrs. per Eng.: 2
  WatchPts on Addr. Ranges: 4
  Coherent Host Access: FALSE
  Features: KERNEL_DISPATCH
  Fast F16 Operation: TRUE
  Wavefront Size: 32(0x20)
  Workgroup Max Size: 1024(0x400)
  Workgroup Max Size per Dimension:
    x 1024(0x400)
    y 1024(0x400)
    z 1024(0x400)
  Max Waves Per CU: 32(0x20)
  Max Work-item Per CU: 1024(0x400)
  Grid Max Size: 4294967295(0xffffffff)
  Grid Max Size per Dimension:
    x 4294967295(0xffffffff)
    y 4294967295(0xffffffff)
    z 4294967295(0xffffffff)
  Max fbarriers/Workgrp: 32
  Packet Processor uCode:: 120
  SDMA engine uCode:: 83
  IOMMU Support:: None
  Pool Info:
    Pool 1
      Segment: GLOBAL; FLAGS: COARSE GRAINED
      Size: 16760832(0xffc000) KB
      Allocatable: TRUE
      Alloc Granule: 4KB
      Alloc Recommended Granule: 2048KB
      Alloc Alignment: 4KB
      Accessible by all: FALSE
    Pool 2
      Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size: 16760832(0xffc000) KB
      Allocatable: TRUE
      Alloc Granule: 4KB
      Alloc Recommended Granule: 2048KB
      Alloc Alignment: 4KB
      Accessible by all: FALSE
    Pool 3
      Segment: GROUP
      Size: 64(0x40) KB
      Allocatable: FALSE
      Alloc Granule: 0KB
      Alloc Recommended Granule: 0KB
      Alloc Alignment: 0KB
      Accessible by all: FALSE
  ISA Info:
    ISA 1
      Name: amdgcn-amd-amdhsa--gfx1030
      Machine Models: HSA_MACHINE_MODEL_LARGE
      Profiles: HSA_PROFILE_BASE
      Default Rounding Mode: NEAR
      Default Rounding Mode: NEAR
      Fast f16: TRUE
      Workgroup Max Size: 1024(0x400)
      Workgroup Max Size per Dimension:
        x 1024(0x400)
        y 1024(0x400)
        z 1024(0x400)
      Grid Max Size: 4294967295(0xffffffff)
      Grid Max Size per Dimension:
        x 4294967295(0xffffffff)
        y 4294967295(0xffffffff)
        z 4294967295(0xffffffff)
      FBarrier Max Size: 32
*** Done ***

rocm-clinfo
===========
Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.1 AMD-APP (3614.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback

  Platform Name: AMD Accelerated Parallel Processing
Number of devices: 1
  Device Type: CL_DEVICE_TYPE_GPU
  Vendor ID: 1002h
  Board name: AMD Radeon RX 6900 XT
  Device Topology: PCI[ B#11, D#0, F#0 ]
  Max compute units: 40
  Max work items dimensions: 3
  Max work items[0]: 1024
  Max work items[1]: 1024
  Max work items[2]: 1024
  Max work group size: 256
  Preferred vector width char: 4
  Preferred vector width short: 2
  Preferred vector width int: 1
  Preferred vector width long: 1
  Preferred vector width float: 1
  Preferred vector width double: 1
  Native vector width char: 4
  Native vector width short: 2
  Native vector width int: 1
  Native vector width long: 1
  Native vector width float: 1
  Native vector width double: 1
  Max clock frequency: 2720Mhz
  Address bits: 64
  Max memory allocation: 14588628168
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 8
  Max image 2D width: 16384
  Max image 2D height: 16384
  Max image 3D width: 16384
  Max image 3D height: 16384
  Max image 3D depth: 8192
  Max samplers within kernel: 16
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 1024
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: Read/Write
  Cache line size: 128
  Cache size: 16384
  Global memory size: 17163091968
  Constant buffer size: 14588628168
  Max number of constant args: 8
  Local memory type: Scratchpad
  Local memory size: 65536
  Max pipe arguments: 16
  Max pipe active reservations: 16
  Max pipe packet size: 1703726280
  Max global variable size: 14588628168
  Max global variable preferred total size: 17163091968
  Max read/write image args: 64
  Max on device events: 1024
  Queue on device max size: 8388608
  Max on device queues: 1
  Queue on device preferred size: 262144
  SVM capabilities:
    Coarse grain buffer: Yes
    Fine grain buffer: Yes
    Fine grain system: No
    Atomics: No
  Preferred platform atomic alignment: 0
  Preferred global atomic alignment: 0
  Preferred local atomic alignment: 0
  Kernel Preferred work group size multiple: 32
  Error correction support: 0
  Unified memory for Host and Device: 0
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue on Host properties:
    Out-of-Order: No
    Profiling: Yes
  Queue on Device properties:
    Out-of-Order: Yes
    Profiling: Yes
  Platform ID: 0x7efe81d1c7c8
  Name: gfx1030
  Vendor: Advanced Micro Devices, Inc.
  Device OpenCL C version: OpenCL C 2.0
  Driver version: 3614.0 (HSA1.1,LC)
  Profile: FULL_PROFILE
  Version: OpenCL 2.0
  Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
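In case it helps triage, one possible workaround (untested here, and an assumption on my part) would be replacing the prebuilt wheel with a source build of llama-cpp-python targeted at gfx1030. The sketch below assumes a recent llama.cpp; older versions used -DLLAMA_HIPBLAS=on instead of -DGGML_HIPBLAS=on, and the package name inside the webui's environment may differ.

# Untested sketch: rebuild llama-cpp-python against ROCm/HIP for gfx1030 inside the
# webui's Python environment (open it with cmd_linux.sh first)
pip uninstall -y llama_cpp_python
CMAKE_ARGS="-DGGML_HIPBLAS=on -DAMDGPU_TARGETS=gfx1030" pip install --no-cache-dir --force-reinstall llama-cpp-python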