Attaching to llama-gpt-llama-gpt-api-cuda-ggml-1, llama-gpt-llama-gpt-ui-1
llama-gpt-llama-gpt-ui-1 | [INFO wait] --------------------------------------------------------
llama-gpt-llama-gpt-ui-1 | [INFO wait] docker-compose-wait 2.12.1
llama-gpt-llama-gpt-ui-1 | [INFO wait] ---------------------------
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] Starting with configuration:
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Hosts to be waiting for: [llama-gpt-api-cuda-ggml:8000]
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Paths to be waiting for: []
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Timeout before failure: 3600 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - TCP connection timeout before retry: 5 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Sleeping time before checking for hosts/paths availability: 0 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Sleeping time once all hosts/paths are available: 0 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Sleeping time between retries: 1 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] --------------------------------------------------------
llama-gpt-llama-gpt-ui-1 | [INFO wait] Checking availability of host [llama-gpt-api-cuda-ggml:8000]
llama-gpt-llama-gpt-api-cuda-ggml-1 |
llama-gpt-llama-gpt-api-cuda-ggml-1 | ==========
llama-gpt-llama-gpt-api-cuda-ggml-1 | == CUDA ==
llama-gpt-llama-gpt-api-cuda-ggml-1 | ==========
llama-gpt-llama-gpt-api-cuda-ggml-1 |
llama-gpt-llama-gpt-api-cuda-ggml-1 | CUDA Version 12.1.1
llama-gpt-llama-gpt-api-cuda-ggml-1 |
llama-gpt-llama-gpt-api-cuda-ggml-1 | Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
llama-gpt-llama-gpt-api-cuda-ggml-1 |
llama-gpt-llama-gpt-api-cuda-ggml-1 | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
llama-gpt-llama-gpt-api-cuda-ggml-1 | By pulling and using the container, you accept the terms and conditions of this license:
llama-gpt-llama-gpt-api-cuda-ggml-1 | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
llama-gpt-llama-gpt-api-cuda-ggml-1 |
llama-gpt-llama-gpt-api-cuda-ggml-1 | A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
llama-gpt-llama-gpt-api-cuda-ggml-1 |
llama-gpt-llama-gpt-api-cuda-ggml-1 | /models/llama-2-7b-chat.bin model found.
llama-gpt-llama-gpt-api-cuda-ggml-1 | make: *** No rule to make target 'build'. Stop.
llama-gpt-llama-gpt-api-cuda-ggml-1 | Initializing server with:
llama-gpt-llama-gpt-api-cuda-ggml-1 | Batch size: 2096
llama-gpt-llama-gpt-api-cuda-ggml-1 | Number of CPU threads: 12
llama-gpt-llama-gpt-api-cuda-ggml-1 | Number of GPU layers: 10
llama-gpt-llama-gpt-api-cuda-ggml-1 | Context window: 4096
llama-gpt-llama-gpt-ui-1 | [INFO wait] Host [llama-gpt-api-cuda-ggml:8000] not yet available...
llama-gpt-llama-gpt-api-cuda-ggml-1 exited with code 0
llama-gpt-llama-gpt-api-cuda-ggml-1 |
llama-gpt-llama-gpt-api-cuda-ggml-1 | /models/llama-2-7b-chat.bin model found.
llama-gpt-llama-gpt-api-cuda-ggml-1 | make: *** No rule to make target 'build'. Stop.
llama-gpt-llama-gpt-api-cuda-ggml-1 | Initializing server with:
llama-gpt-llama-gpt-api-cuda-ggml-1 | Batch size: 2096
llama-gpt-llama-gpt-api-cuda-ggml-1 | Number of CPU threads: 12
llama-gpt-llama-gpt-api-cuda-ggml-1 | Number of GPU layers: 10
llama-gpt-llama-gpt-api-cuda-ggml-1 | Context window: 4096
llama-gpt-llama-gpt-api-cuda-ggml-1 exited with code 0
llama-gpt-llama-gpt-api-cuda-ggml-1 |
llama-gpt-llama-gpt-api-cuda-ggml-1 | /models/llama-2-7b-chat.bin model found.
llama-gpt-llama-gpt-api-cuda-ggml-1 | make: *** No rule to make target 'build'. Stop.
llama-gpt-llama-gpt-api-cuda-ggml-1 | Initializing server with:
llama-gpt-llama-gpt-api-cuda-ggml-1 | Batch size: 2096
llama-gpt-llama-gpt-api-cuda-ggml-1 | Number of CPU threads: 12
llama-gpt-llama-gpt-api-cuda-ggml-1 | Number of GPU layers: 10
llama-gpt-llama-gpt-api-cuda-ggml-1 | Context window: 4096
llama-gpt-llama-gpt-ui-1 | [INFO wait] Host [llama-gpt-api-cuda-ggml:8000] not yet available...
llama-gpt-llama-gpt-api-cuda-ggml-1 exited with code 132
llama-gpt-llama-gpt-api-cuda-ggml-1 |
llama-gpt-llama-gpt-api-cuda-ggml-1 | /models/llama-2-7b-chat.bin model found.
llama-gpt-llama-gpt-api-cuda-ggml-1 | make: *** No rule to make target 'build'. Stop.
llama-gpt-llama-gpt-api-cuda-ggml-1 | Initializing server with:
llama-gpt-llama-gpt-api-cuda-ggml-1 | Batch size: 2096
llama-gpt-llama-gpt-api-cuda-ggml-1 | Number of CPU threads: 12
llama-gpt-llama-gpt-api-cuda-ggml-1 | Number of GPU layers: 10
llama-gpt-llama-gpt-api-cuda-ggml-1 | Context window: 4096
llama-gpt-llama-gpt-api-cuda-ggml-1 exited with code 132
Running on Ubuntu Server 22.04.3. It looks like the CUDA container image doesn't include the Makefile that the corresponding run.sh invokes, so every start fails with make: *** No rule to make target 'build'. The script then continues to the server init anyway, and the later exits with code 132 (128 + SIGILL) suggest the server binary may also be crashing on an unsupported CPU instruction, though the missing Makefile is the first error in each restart cycle.
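For anyone triaging, one way to confirm this from the host is to open a throwaway shell in the API image and check for the Makefile that run.sh invokes. A minimal sketch, assuming the compose service name llama-gpt-api-cuda-ggml (taken from the container name in the log above) and that run.sh calls make from the image's working directory:

    # Start a one-off shell in the API service's image (service name assumed
    # from the container name in the log) and look for a Makefile with a
    # 'build' target in the working directory.
    docker compose run --rm --entrypoint /bin/sh llama-gpt-api-cuda-ggml -c \
      'pwd; ls -l Makefile 2>/dev/null && grep -n "^build:" Makefile || echo "no Makefile / no build target in $PWD"'

If the Makefile is genuinely absent from the image, that matches the make error above; the log also shows run.sh continuing past the failed make step to "Initializing server with:" each time before the container exits.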