
Unable to build CUDA-enabled image, missing Makefile #129

@VirtualDisk

Description

Attaching to llama-gpt-llama-gpt-api-cuda-ggml-1, llama-gpt-llama-gpt-ui-1
llama-gpt-llama-gpt-ui-1             | [INFO  wait] --------------------------------------------------------
llama-gpt-llama-gpt-ui-1             | [INFO  wait]  docker-compose-wait 2.12.1
llama-gpt-llama-gpt-ui-1             | [INFO  wait] ---------------------------
llama-gpt-llama-gpt-ui-1             | [DEBUG wait] Starting with configuration:
llama-gpt-llama-gpt-ui-1             | [DEBUG wait]  - Hosts to be waiting for: [llama-gpt-api-cuda-ggml:8000]
llama-gpt-llama-gpt-ui-1             | [DEBUG wait]  - Paths to be waiting for: []
llama-gpt-llama-gpt-ui-1             | [DEBUG wait]  - Timeout before failure: 3600 seconds
llama-gpt-llama-gpt-ui-1             | [DEBUG wait]  - TCP connection timeout before retry: 5 seconds
llama-gpt-llama-gpt-ui-1             | [DEBUG wait]  - Sleeping time before checking for hosts/paths availability: 0 seconds
llama-gpt-llama-gpt-ui-1             | [DEBUG wait]  - Sleeping time once all hosts/paths are available: 0 seconds
llama-gpt-llama-gpt-ui-1             | [DEBUG wait]  - Sleeping time between retries: 1 seconds
llama-gpt-llama-gpt-ui-1             | [DEBUG wait] --------------------------------------------------------
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Checking availability of host [llama-gpt-api-cuda-ggml:8000]
llama-gpt-llama-gpt-api-cuda-ggml-1  |
llama-gpt-llama-gpt-api-cuda-ggml-1  | ==========
llama-gpt-llama-gpt-api-cuda-ggml-1  | == CUDA ==
llama-gpt-llama-gpt-api-cuda-ggml-1  | ==========
llama-gpt-llama-gpt-api-cuda-ggml-1  |
llama-gpt-llama-gpt-api-cuda-ggml-1  | CUDA Version 12.1.1
llama-gpt-llama-gpt-api-cuda-ggml-1  |
llama-gpt-llama-gpt-api-cuda-ggml-1  | Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
llama-gpt-llama-gpt-api-cuda-ggml-1  |
llama-gpt-llama-gpt-api-cuda-ggml-1  | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
llama-gpt-llama-gpt-api-cuda-ggml-1  | By pulling and using the container, you accept the terms and conditions of this license:
llama-gpt-llama-gpt-api-cuda-ggml-1  | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
llama-gpt-llama-gpt-api-cuda-ggml-1  |
llama-gpt-llama-gpt-api-cuda-ggml-1  | A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
llama-gpt-llama-gpt-api-cuda-ggml-1  |
llama-gpt-llama-gpt-api-cuda-ggml-1  | /models/llama-2-7b-chat.bin model found.
llama-gpt-llama-gpt-api-cuda-ggml-1  | make: *** No rule to make target 'build'.  Stop.
llama-gpt-llama-gpt-api-cuda-ggml-1  | Initializing server with:
llama-gpt-llama-gpt-api-cuda-ggml-1  | Batch size: 2096
llama-gpt-llama-gpt-api-cuda-ggml-1  | Number of CPU threads: 12
llama-gpt-llama-gpt-api-cuda-ggml-1  | Number of GPU layers: 10
llama-gpt-llama-gpt-api-cuda-ggml-1  | Context window: 4096
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-ggml:8000] not yet available...
llama-gpt-llama-gpt-api-cuda-ggml-1 exited with code 0
llama-gpt-llama-gpt-api-cuda-ggml-1  |
llama-gpt-llama-gpt-api-cuda-ggml-1  | /models/llama-2-7b-chat.bin model found.
llama-gpt-llama-gpt-api-cuda-ggml-1  | make: *** No rule to make target 'build'.  Stop.
llama-gpt-llama-gpt-api-cuda-ggml-1  | Initializing server with:
llama-gpt-llama-gpt-api-cuda-ggml-1  | Batch size: 2096
llama-gpt-llama-gpt-api-cuda-ggml-1  | Number of CPU threads: 12
llama-gpt-llama-gpt-api-cuda-ggml-1  | Number of GPU layers: 10
llama-gpt-llama-gpt-api-cuda-ggml-1  | Context window: 4096
llama-gpt-llama-gpt-api-cuda-ggml-1 exited with code 0
llama-gpt-llama-gpt-api-cuda-ggml-1  |
llama-gpt-llama-gpt-api-cuda-ggml-1  | /models/llama-2-7b-chat.bin model found.
llama-gpt-llama-gpt-api-cuda-ggml-1  | make: *** No rule to make target 'build'.  Stop.
llama-gpt-llama-gpt-api-cuda-ggml-1  | Initializing server with:
llama-gpt-llama-gpt-api-cuda-ggml-1  | Batch size: 2096
llama-gpt-llama-gpt-api-cuda-ggml-1  | Number of CPU threads: 12
llama-gpt-llama-gpt-api-cuda-ggml-1  | Number of GPU layers: 10
llama-gpt-llama-gpt-api-cuda-ggml-1  | Context window: 4096
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-ggml:8000] not yet available...
llama-gpt-llama-gpt-api-cuda-ggml-1 exited with code 132
llama-gpt-llama-gpt-api-cuda-ggml-1  |
llama-gpt-llama-gpt-api-cuda-ggml-1  | /models/llama-2-7b-chat.bin model found.
llama-gpt-llama-gpt-api-cuda-ggml-1  | make: *** No rule to make target 'build'.  Stop.
llama-gpt-llama-gpt-api-cuda-ggml-1  | Initializing server with:
llama-gpt-llama-gpt-api-cuda-ggml-1  | Batch size: 2096
llama-gpt-llama-gpt-api-cuda-ggml-1  | Number of CPU threads: 12
llama-gpt-llama-gpt-api-cuda-ggml-1  | Number of GPU layers: 10
llama-gpt-llama-gpt-api-cuda-ggml-1  | Context window: 4096
llama-gpt-llama-gpt-api-cuda-ggml-1 exited with code 132

Running on Ubuntu Server 22.04.3. It looks like the CUDA container image doesn't include the Makefile that its run.sh expects: every restart fails with "make: *** No rule to make target 'build'" before the server comes up. (The later exits with code 132 may be a separate problem, since 132 corresponds to SIGILL, i.e. an illegal CPU instruction, but the make failure appears first on every restart.)
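For anyone wanting to confirm the same diagnosis, a quick check is to open a shell in the API service in place of its normal entrypoint and look for the Makefile. This is a minimal sketch: the service name comes from the compose log above, but the working directory and the location of run.sh inside the image are assumptions, so adjust the paths for your build:

```sh
# Open a shell in the CUDA API service instead of its normal entrypoint.
# Service name taken from the compose log above; the location of run.sh
# inside the image is an assumption -- adjust the path for your build.
docker compose run --rm --entrypoint /bin/sh llama-gpt-api-cuda-ggml -c '
  pwd                     # confirm the directory run.sh starts in
  ls -la Makefile         # expect "No such file or directory" here
  grep -n "make" run.sh   # see exactly which make target run.sh invokes
'
```

If the make step only rebuilds what the CUDA Dockerfile already compiled at image-build time, a stub Makefile with an empty build target would presumably let the server start, but that's a guess; the real fix would be to ship the Makefile in the CUDA image (or drop the make call from run.sh).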
