[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
pre-commit-ci[bot] committed Nov 8, 2024
1 parent f315a09 commit d6be0fd
Showing 4 changed files with 21 additions and 6 deletions.
17 changes: 14 additions & 3 deletions comps/llms/text-generation/tgi/llama_stack/README.md
@@ -16,11 +16,15 @@ export LLM_MODEL_ID="meta-llama/Llama-3.1-8B-Instruct" # change to your llama model
export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
export LLAMA_STACK_ENDPOINT="http://${your_ip}:5000"
```

Insert `TGI_LLM_ENDPOINT` into the Llama Stack configuration YAML. You can use the `envsubst` command, or simply replace `${TGI_LLM_ENDPOINT}` with the actual value manually.

```bash
envsubst < ./dependency/llama_stack_run_template.yaml > ./dependency/llama_stack_run.yaml
```
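
If you prefer the manual route, a `sed` one-liner gives the same result (a sketch, assuming the template only contains the `${TGI_LLM_ENDPOINT}` placeholder to substitute):

```bash
# Substitute the placeholder with the exported value (alternative to envsubst)
sed "s|\${TGI_LLM_ENDPOINT}|${TGI_LLM_ENDPOINT}|g" \
  ./dependency/llama_stack_run_template.yaml > ./dependency/llama_stack_run.yaml
```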

Make sure you end up with a `llama_stack_run.yaml` file in which the inference provider points to the correct TGI server endpoint, e.g.

```bash
inference:
- provider_id: tgi0
@@ -40,9 +44,11 @@ pip install -r requirements.txt
```

### 2.2 Start TGI Service

First we start a TGI endpoint for your LLM model on Gaudi.

```bash
volume="./data"
volume="./data"
docker run -p 8008:80 \
--name tgi_service \
-v $volume:/data \
@@ -63,7 +69,9 @@ docker run -p 8008:80 \
```
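
Model download and warm-up can take a while on first start, so it helps to wait for the endpoint before moving on (a minimal sketch; TGI exposes a `/health` route that returns 200 once the model is loaded):

```bash
# Poll the TGI health route until the server reports ready
until curl -sf "http://${your_ip}:8008/health" > /dev/null; do
  echo "Waiting for TGI to become ready..."
  sleep 10
done
```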

### 2.3 Start Llama Stack Server

Then we start the Llama Stack server on top of the TGI endpoint.

```bash
docker run \
--name llamastack-service \
@@ -74,9 +82,11 @@ docker run \
```
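
If the server does not come up, the container logs are the first place to look:

```bash
# Follow the Llama Stack container logs (container name from the command above)
docker logs -f llamastack-service
```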

### 2.4 Start Microservice with Python Script

```bash
python llm.py
```

## 🚀3. Start Microservice with Docker (Option 2)

If you start the LLM microservice with Docker, the `docker_compose_llm.yaml` file will automatically start both the TGI and Llama Stack services.
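
A typical way to bring everything up with that file is shown below (a sketch; adjust the path to wherever `docker_compose_llm.yaml` lives in your checkout, and use `docker-compose` if you are on the older standalone binary):

```bash
# Bring up the services defined in the compose file, in detached mode
docker compose -f docker_compose_llm.yaml up -d
```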
@@ -119,8 +129,8 @@ curl http://${your_ip}:9000/v1/health_check\

### 4.2 Consume the Services


Verify the TGI Service

```bash
curl http://${your_ip}:8008/generate \
-X POST \
@@ -129,6 +139,7 @@ curl http://${your_ip}:8008/generate \
```

Verify the Llama Stack Service

```bash
curl http://${your_ip}:5000/inference/chat_completion \
-H "Content-Type: application/json" \
@@ -156,4 +167,4 @@ curl http://${your_ip}:9000/v1/chat/completions \
-X POST \
-d '{"query":"What is Deep Learning?","max_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
-H 'Content-Type: application/json'
```
```
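
The request above asks for a streamed response; to receive a single JSON payload instead, the same call can be made with `"streaming":false` (an assumption based on how other OPEA LLM microservices handle this flag; verify against your version):

```bash
# Non-streaming variant of the chat completion request (assumed to be supported)
curl http://${your_ip}:9000/v1/chat/completions \
  -X POST \
  -d '{"query":"What is Deep Learning?","max_tokens":17,"streaming":false}' \
  -H 'Content-Type: application/json'
```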
3 changes: 3 additions & 0 deletions comps/llms/text-generation/tgi/llama_stack/dependency/llama_stack_run_template.yaml
@@ -1,3 +1,6 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

version: '2'
built_at: '2024-10-08T17:40:45.325529'
image_name: local
3 changes: 2 additions & 1 deletion comps/llms/text-generation/tgi/llama_stack/llm.py
@@ -20,6 +20,7 @@
logger = CustomLogger("llm_tgi_llama_stack")
logflag = os.getenv("LOGFLAG", False)


@register_microservice(
name="opea_service@llm_tgi_llama_stack",
service_type=ServiceType.LLM,
@@ -70,4 +71,4 @@ async def stream_generator():


if __name__ == "__main__":
opea_microservices["opea_service@llm_tgi_llama_stack"].start()
opea_microservices["opea_service@llm_tgi_llama_stack"].start()
4 changes: 2 additions & 2 deletions comps/llms/text-generation/tgi/llama_stack/requirements.txt
@@ -3,12 +3,12 @@ docarray[full]
fastapi
httpx
huggingface_hub
llama-stack
llama-stack-client
opentelemetry-api
opentelemetry-exporter-otlp
opentelemetry-sdk
prometheus-fastapi-instrumentator
shortuuid
transformers
uvicorn
llama-stack-client
llama-stack
