Ansible Chatbot (llama) Stack

An Ansible Chatbot (llama) Stack custom distribution (container image type).

Build/Run overview:

    flowchart TB
    %% Nodes
        CHATBOT_STACK([fa:fa-layer-group ansible-chatbot-stack-base:x.y.z])
        AAP_CHATBOT_STACK([fa:fa-layer-group ansible-chatbot-stack:x.y.z])
        AAP_CHATBOT([fa:fa-comment Ansible Chatbot Service])
        CHATBOT_BUILD_CONFIG{{fa:fa-wrench ansible-chatbot-build.yaml}}
        CHATBOT_RUN_CONFIG{{fa:fa-wrench ansible-chatbot-run.yaml}}
        AAP_CHATBOT_DOCKERFILE{{fa:fa-wrench Containerfile}}
        Lightspeed_Providers("fa:fa-code-branch lightspeed-providers")
        PYPI("fa:fa-database PyPI")

    %% Edge connections between nodes
        CHATBOT_STACK -- Consumes --> PYPI
        Lightspeed_Providers -- Publishes --> PYPI
        CHATBOT_STACK -- Built from --> CHATBOT_BUILD_CONFIG
        AAP_CHATBOT_STACK -- Built from --> AAP_CHATBOT_DOCKERFILE
        AAP_CHATBOT_STACK -- Inherits from --> CHATBOT_STACK
        AAP_CHATBOT -- Uses --> CHATBOT_RUN_CONFIG
        AAP_CHATBOT_STACK -- Runtime --> AAP_CHATBOT

Build

Setup for Ansible Chatbot Stack


Note: a temporary lightspeed stack providers package is currently in use; it will no longer be needed once the lightspeed external providers are published on PyPI.

  • Install llama-stack on the host machine, if it is not already present.
  • External providers' YAML manifests must be present under providers.d/ in your host's llama-stack directory (a hypothetical manifest sketch follows the setup command below).
  • External providers' Python libraries must be on the container's Python library path, but also on the host machine's Python library path; this is a workaround for an upstream hack.
  • Vector DB and embedding model files are copied from the latest aap-rag-content image to ./vector_db and ./embeddings_model, respectively.

    make setup
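
For reference, external provider manifests follow the llama-stack provider-spec layout. The sketch below is hypothetical: the path, module, and config_class names are assumptions for illustration only; the real manifests ship with the lightspeed providers package.

    # providers.d/remote/tool_runtime/lightspeed.yaml -- hypothetical path and names
    adapter:
      adapter_type: lightspeed
      pip_packages: ["lightspeed-stack-providers"]                    # assumed package name
      config_class: lightspeed_providers.config.LightspeedToolConfig  # hypothetical
      module: lightspeed_providers.tool_runtime                       # hypothetical
    api_dependencies: []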

Building the Ansible Chatbot Stack


Builds the image ansible-chatbot-stack-base:$PYPI_VERSION.

    make build
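
To verify that the base image was produced before customizing it, list it with your container engine; podman is assumed here (substitute docker if that is what make uses on your host):

    podman images --filter reference=ansible-chatbot-stack-base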

Customizing the Ansible Chatbot Stack


Builds the image ansible-chatbot-stack:$ANSIBLE_CHATBOT_VERSION.

    export ANSIBLE_CHATBOT_VERSION=0.0.1
    make build-custom

Run

Set ANSIBLE_CHATBOT_VERSION and the inference parameters below to match your environment.

    export ANSIBLE_CHATBOT_VERSION=0.0.1
    export ANSIBLE_CHATBOT_VLLM_URL=<YOUR_MODEL_SERVING_URL>
    export ANSIBLE_CHATBOT_VLLM_API_TOKEN=<YOUR_MODEL_SERVING_API_TOKEN>
    export ANSIBLE_CHATBOT_INFERENCE_MODEL=<YOUR_INFERENCE_MODEL>
    export ANSIBLE_CHATBOT_INFERENCE_MODEL_FILTER=<YOUR_INFERENCE_MODEL_TOOLS_FILTERING>
    export AAP_GATEWAY_TOKEN=<YOUR_AAP_GATEWAY_TOKEN>
    make run
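
Once the container is up, a quick smoke test against the REST API confirms the stack is serving; this assumes llama-stack's default port 8321 is published to the host (adjust if your setup maps it differently):

    curl -s http://localhost:8321/v1/models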

Deploy into a k8s cluster

Adjust the configuration in kustomization.yaml as needed, then render the deployment manifests:

    kubectl kustomize . > my-chatbot-stack-deploy.yaml
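
As an illustration of the kind of configuration kustomization.yaml carries, an image-tag override might look like the sketch below; the resource file and image name are hypothetical and must match what the repository actually ships:

    # kustomization.yaml -- hypothetical sketch
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    resources:
      - deployment.yaml                  # hypothetical manifest list
    images:
      - name: ansible-chatbot-stack      # hypothetical image name to retag
        newTag: "0.0.1"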

Deploy the service

    kubectl apply -f my-chatbot-stack-deploy.yaml
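
To check the rollout, the usual kubectl commands apply; the deployment name below is hypothetical and depends on what the kustomization actually generates:

    kubectl get deployments,pods,services
    kubectl logs deployment/ansible-chatbot-stack    # hypothetical deployment name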

Appendix - Host clean-up

If you need to rebuild the images, clean up the host first:

    make clean

Appendix - Testing with the CLI client

    > llama-stack-client configure ...

    > llama-stack-client models list
    ┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃ model_type             ┃ identifier                                       ┃ provider_resource_id                             ┃ metadata                                                       ┃ provider_id                                ┃
    ┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
    │ llm                    │ granite-3.3-8b-instruct                          │ granite-3.3-8b-instruct                          │                                                                │ rhosai_vllm_dev                            │
    ├────────────────────────┼──────────────────────────────────────────────────┼──────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────┼────────────────────────────────────────────┤
    │ embedding              │ all-MiniLM-L6-v2                                 │ all-MiniLM-L6-v2                                 │ {'embedding_dimension': 384.0}                                 │ inline_sentence-transformer                │
    └────────────────────────┴──────────────────────────────────────────────────┴──────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────┴────────────────────────────────────────────┘

    > llama-stack-client providers list
    ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃ API          ┃ Provider ID                  ┃ Provider Type                        ┃
    ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
    │ inference    │ rhosai_vllm_dev              │ remote::vllm                         │
    │ inference    │ inline_sentence-transformer  │ inline::sentence-transformers        │
    │ vector_io    │ aap_faiss                    │ inline::faiss                        │
    │ safety       │ llama-guard                  │ inline::llama-guard                  │
    │ safety       │ lightspeed_question_validity │ inline::lightspeed_question_validity │
    │ agents       │ meta-reference               │ inline::meta-reference               │
    │ datasetio    │ localfs                      │ inline::localfs                      │
    │ telemetry    │ meta-reference               │ inline::meta-reference               │
    │ tool_runtime │ rag-runtime-0                │ inline::rag-runtime                  │
    │ tool_runtime │ model-context-protocol-1     │ remote::model-context-protocol       │
    │ tool_runtime │ lightspeed                   │ remote::lightspeed                   │
    └──────────────┴──────────────────────────────┴──────────────────────────────────────┘

    > llama-stack-client vector_dbs list
    ┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃ identifier           ┃ provider_id ┃ provider_resource_id ┃ vector_db_type ┃ params                            ┃
    ┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
    │ aap-product-docs-2_5 │ aap_faiss   │ aap-product-docs-2_5 │                │ embedding_dimension: 384          │
    │                      │             │                      │                │ embedding_model: all-MiniLM-L6-v2 │
    │                      │             │                      │                │ type: vector_db                   │
    │                      │             │                      │                │                                   │
    └──────────────────────┴─────────────┴──────────────────────┴────────────────┴───────────────────────────────────┘

    > llama-stack-client inference chat-completion --message "tell me about Ansible Lightspeed"
    ...
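
The same call can be made directly against the REST API; the sketch below assumes the default port 8321 and the llama-stack /v1/inference/chat-completion endpoint (verify the path against your stack version), reusing the model identifier from the listing above:

    curl -s http://localhost:8321/v1/inference/chat-completion \
      -H 'Content-Type: application/json' \
      -d '{"model_id": "granite-3.3-8b-instruct", "messages": [{"role": "user", "content": "tell me about Ansible Lightspeed"}]}'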

Appendix - Obtain a container shell

    # Obtain a container shell for the Ansible Chatbot Stack.
    make shell
