Skip to content

Releases: opea-project/GenAIInfra

Generative AI Infrastructure v1.0 Release Notes

20 Sep 09:30
ca4fd83
Compare
Choose a tag to compare

OPEA Release Notes v1.0

What’s New in OPEA v1.0

  • Highlights

    • Improve the RAG performance through microservice optimizations (e.g., Hugging Face TGI, vLLM) and megaservice tuning
    • Provide the experimental LLM model training support, includes full fine-tuning and parameter-efficient fine-tuning (PEFT)
    • Improve RAG with Knowledge Graph based on Neo4j
    • Improve VisualQnA and provide multi-modality RAG support
    • Faster microservice launch through removal of some dispatch overhead
    • Enable Gateway with guardrail, and integrate nginx with CORS protection and data preparation
    • Enable HorizontalPodAutoscaler (HPA) for better resource management
    • Define the metrics of RAG performance and enable accuracy evaluation for more GenAI examples
    • Further improvement on documentation and developer experience
  • Other features

    • Enable OpenAI compatible format on applicable microservices
    • Support microservice launch from ModelScope to address China ecosystem need
    • Support Red Hat OpenShift Container Platform (RHOCP)
    • Refactor the code and CI/CD pipeline to provide better support for contributors
    • Improve Docker versioning to avoid the potential conflict
    • Enhance GenAI Microservice Connector (GMC), including improvements such as router performance optimizations and other updates
    • Introduce Memory Bandwidth Exporter that integrates with Kubernetes Node Resource Interface
  • Learn more about OPEA at

  • Release Documentation:

Details

GenAIExamples
  • Deployment

    • Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
    • K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
    • Update mount path in xeon k8s(2a6af64)
    • Add Nginx - k8s manifest in CodeTrans(6a679ba)
    • Add Nginx - docker in CodeTrans(cc84847)
    • watch more docker compose files changes(4b0bc26)
    • Add chatQnA UI manifest(758d236)
    • Revert the LLM model for kubernetes GMS(f5f1e32)
    • [ChatQnA] Update retrieval & dataprep manifests(6730b24)
    • [ChatQnA]Update manifests(3563f5d)
    • [ChatQnA] Update benchmarking manifests(36fb9a9)
    • [ChatQnA] udate OOB & Tuned manifests(ac34860)
    • Add nginx and UI to the ChatQnA manifest(05f9828)
    • [ChatQnA] Update OOB with wrapper manifests.(933c3d3)
    • [Translation] Support manifests and nginx(1e13031)
    • update V1.0 benchmark manifest (e5affb9)
    • update image name(e2a74f7)
    • K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
    • Change megaservice path in line with new file structure(5ab27b6)
    • Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
    • Add chatQnA UI manifest(758d236)
    • Yaml: add comments to specify gaudi device ids.(63406dc)
    • add tgi bf16 setup on CPU k8s.(ba17031)
  • Documentation

    • [ChatQnA] Update README for ModelScope(aebc23f)
    • Update README.md(4bd7841)
    • [ChatQnA] Update README for without Rerank Pipeline(6b617d6)
    • [ChatQnA] Update Benchmark README for w/o rerank(4a51874)
    • Fix readme for nv gpu(43b2ae5)
    • [ChatQnA] Update Benchmark README to Fix Input Length(55d287d)
    • Refine ChatQnA README for TGI(afc3341)
    • Add default model for VisualQnA README(07baa8f)
    • Update readme for manifests of some examples(adb157f)
    • doc: use markdown table in supported_examples(9cf1d88)
    • doc: remove invalid code block language(c6d811a)
    • add AudioQnA readme with supported model(f4f4da2)
    • add more code owners(7f89797)
    • doc: fix headings(7a0fca7)
    • [Codegen] Refine readme to prompt users on how to change the model.(814164d)
    • Update README.md and remove some open-source details(2ef83fc)
    • Add issue template(84a781a)
    • doc: fix headings and indenting(67394b8)
    • Add default model in readme for FaqGen and DocSum(d487093)
    • Change docs of kubernetes for curl commands in README(4133757)
    • Update v0.9 RAG release data(947936e)
    • Explain Default Model in ChatQnA and CodeTrans READMEs(2a2ff45)
    • Update docker images list.(a8244c4)
    • refactor the network port setting for AWS(bc81770)
    • Add validate microservice details link(bd811bd)
    • [ChatQnA] Add Nginx in Docker Compose and README(6c36448
    • [Doc] Update CodeGen and Translation READMEs(a09395e)
    • [Doc] Refine READMEs(372d78c)
    • Remove marketing materials(d85ec09)
    • doc PR to main instead of of v1.0r(dc94026)
    • Update README.md for Multiplatforms(b205dc7)
    • Refine the quick start of ChatQnA(3b70fb0)
    • Update supported_examples(96d5cd9)
    • [Doc] doc improvement(e0b3b57)
    • Fix README issues(bceacdc)
    • doc: fix broken image reference and markdown(d422929)
    • doc: give document meaningful title(a3fa0d6)
    • doc: fix incorrefine readme for reorg(d2bab99)
    • doc: fix incorrect path to png image files (d97882e)
    • update doc according to comments(f990f79)
    • doc: fix headings and indenting(67394b8)
    • Update README.md(4bd7841)
    • refine readme for reorg(d2bab99)
    • Update README with new examples(2d28beb)
    • README: fix broken links(ff6f841)
    • Update v0.9 RAG release data([947936e](https://github....
Read more

Generative AI Infrastructure v0.9 Release Notes

27 Aug 03:11
Compare
Choose a tag to compare

OPEA Release Notes v0.9

What’s New in OPEA v0.9

  • Broaden functionality

    • Provide telemetry functionalities for metrics and tracing using Prometheus, Grafana, and Jaeger
    • Initialize two Agent examples: AgentQnA and DocIndexRetriever
    • Support for authentication and authorization
    • Add Nginx Component to strengthen backend security
    • Provide Toxicity Detection Microservice
    • Support the experimental Fine-tuning microservice
  • Enhancement

    • Align the Microservice format with the standards of OpenAI (Chat Completions, Fine-tuning... etc)
    • Enhance the performance benchmarking and evaluation for GenAI Examples, ex: TGI, resource allocation, ...etc
    • Enable support for launching container images as a non-root user
    • Use Llama-Guard-2-8B as default Guardrails model and bge-large-zh-v1.5 as default embedding model, mistral-7b-grok as default CodeTrans model
    • Add ProductivitySuite to provide access management and maintains user context
  • Deployment

    • Support Red Hat OpenShift Container Platform (RHOCP)
    • GenAI Microservices Connector (GMC) successfully tested on Nvidia GPUs
    • Add Kubernetes support for AudioQnA and VisualQnA examples
  • OPEA Docker Hub: https://hub.docker.com/u/opea

  • GitHub IO: https://opea-project.github.io/latest/index.html

  • Thanks for the external contribution from Sharan Shirodkar, Aishwarya Ramasethu
    , Michal Nicpon and Jacob Mansdorfer

Details

GenAIExamples
  • ChatQnA

    • Update port in set_env.sh(040d2b7)
    • Fix minor issue in ChatQnA Gaudi docker README(a5ed223)
    • update chatqna dataprep-redis port(02a1536)
    • Add support for .md file in file upload in the chatqna-ui(7a67298)
    • Added the ChatQnA delete feature, and updated the corresponding README(09a3196)
    • fixed ISSUE-528(45cf553)
    • Fix vLLM and vLLM-on-Ray UT bug(cfcac3f)
    • set OLLAMA_MODEL env to docker container(c297155)
    • Update guardrail docker file path(06c4484)
    • remove ray serve(c71bc68)
    • Refine docker_compose for dataprep param settings(3913c7b)
    • fix chatqna guardrails(db2d2bd)
    • Support ChatQnA pipeline without rerank microservice(a54ffd2)
    • Update the number of microservice replicas for OPEA v0.9(e6b4fff)
    • Update set_env.sh(9657f7b)
    • add env for chatqna vllm(f78aa9e)
  • Deployment

    • update manifests for v0.9(ba78b4c)
    • Update K8S manifest for ChatQnA/CodeGen/CodeTrans/DocSum(01c1b75)
    • Update benchmark manifest to fix errors(4fd3517)
    • Update env for manifest(4fa37e7)
    • update manifests for v0.9(08f57fa)
    • Add AudioQnA example via GMC(c86cf85)
    • add k8s support for audioqna(0a6bad0)
    • Update mainifest for FaqGen(80e3e2a)
    • Add kubernetes support for VisualQnA(4f7fc39)
    • Add dataprep microservice to chatQnA example and the e2e test(1c23d87)
  • Documentation

    • [doc] Update README.md(c73e4e0)
    • doc fix: Update README.md to remove specific dicscription of paragraph-1(5a9c109)
    • doc: fix markdown in docker_image_list.md(9277fe6)
    • doc: fix markdown in Translation/README.md(d645305)
    • doc: fix markdown in SearchQnA/README.md(c461b60)
    • doc: fix FaqGen/README.md markdown(704ec92)
    • doc: fix markdown in DocSum/README.md(83712b9)
    • doc: fix markdown in CodeTrans/README.md(076bca3)
    • doc: fix CodeGen/README.md markdown(33f8329)
    • doc: fix markdown in ChatQnA/README.md(015a2b1)
    • doc: fix headings in markdown files(21fab71)
    • doc: missed an H1 in the middle of a doc(4259240)
    • doc: remove use of HTML for table in README(e81e0e5)
    • Update ChatQnA readme with OpenShift instructions(ed48371)
    • Convert HTML to markdown format.(14621f8)
    • Fix typo {your_ip} to {host_ip}(ad8ca88)
    • README fix typo(abc02e1)
    • fix script issues in MD file(acdd712)
    • Minor documentation improvements in the CodeGen README(17b9676)
    • Refine Main README(08eb269)
    • [Doc]Add a micro/mega service WorkFlow for DocSum(343d614)
    • Update README for k8s deployment(fbb81b6)
  • Other examples

    • Clean deprecated VisualQnA code(87617e7)
    • Using TGI official release docker image for intel cpu(b2771ad)
    • Add VisualQnA UI(923cf69)
    • fix container name(5ac77f7)
    • Add VisualQnA docker for both Gaudi and Xeon using TGI serving(2390920)
    • Remove LangSmith from Examples(88eeb0d)
    • Modify the language variable to match language highlight.(f08d411)
    • Remove deprecated folder.(7dd9952)
    • update env for manifest(4fa37e7)
    • AgentQnA example(67df280)
    • fix tgi xeon tag(6674832)
    • Add new DocIndexRetriever example(566cf93)
    • Add env params for chatqna xeon test(5d3950)
    • ProductivitySuite Combo Application with REACT UI and Keycloak Authen(947cbe3)
    • change codegen tgi model(06cb308)
    • change searchqna prompt(acbaaf8)
    • minor fix mismatched hf token(ac324a9)
    • fix translation gaudi env(4f3be23)
    • Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml (c25063f)
  • CI/CD/UT

    • update deploy_gmc logical in cd workflow(c016d82)
    • fix ghcr.io/huggingface/text-generation-inference tag(503a1a9)
    • Add GMC e2e in CD workflow(f45e4c6)
    • Fix CI test changed file detect issue([5...
Read more

Generative AI Infrastructure v0.8 Release Notes

29 Jul 02:21
39c7f46
Compare
Choose a tag to compare

OPEA Release Notes v0.8

What’s New in OPEA v0.8

  • Broaden functionality

    • Support frequently asked questions (FAQs) generation GenAI example
    • Expand the support of LLMs such as Llama3.1 and Qwen2 and support LVMs such as llava
    • Enable end-to-end performance and accuracy benchmarking
    • Support the experimental Agent microservice
    • Support LLM serving on Ray
  • Multi-platform support

    • Release the Docker images of GenAI components under OPEA dockerhub and support the deployment with Docker
    • Support cloud-native deployment through Kubernetes manifests and GenAI Microservices Connector (GMC)
    • Enable the experimental authentication and authorization support using JWT tokens
    • Validate ChatQnA on multiple platforms such as Xeon, Gaudi, AIPC, Nvidia, and AWS
  • OPEA Docker Hub: https://hub.docker.com/u/opea

Details

GenAIExamples
  • ChatQnA

    • Add ChatQnA instructions for AIPC(26d4ff)
    • Adapt Vllm response format (034541)
    • Update tgi version(5f52a1)
    • Update README.md(f9312b)
    • Udpate ChatQnA docker compose for Dataprep Update(335362)
    • [Doc] Add valid micro-service details(e878dc)
    • Updates for running ChatQnA + Conversational UI on Gaudi(89ddec)
    • Fix win PC issues(ba6541)
    • [Doc]Add ChatQnA Flow Chart(97da49)
    • Add guardrails in the ChatQnA pipeline(955159)
    • Fix a minor bug for chatqna in docker-compose(b46ae8)
    • Support vLLM/vLLM-on-Ray/Ray Serve for ChatQnA(631d84)
    • Added ChatQnA example using Qdrant retriever(c74564)
    • Update TEI version v1.5 for better performance(f4b4ac)
    • Update ChatQnA upload feature(598484)
    • Add auto truncate for embedding and rerank(8b6094)
  • Deployment

    • Add Kubernetes manifest files for deploying DocSum(831463)
    • Update Kubernetes manifest files for CodeGen(2f9397)
    • Add Kubernetes manifest files for deploying CodeTrans(c9548d)
    • Updated READMEs for kubernetes example pipelines(c37d9c)
    • Update all examples yaml files of GMC in GenAIExample(290a74)
    • Doc: fix minor issue in GMC doc(d99461)
    • README for installing 4 worklods using helm chart(6e797f)
    • Update Kubernetes manifest files for deploying ChatQnA(665c46)
    • Add new example of SearchQnA for GenAIExample(21b7d1)
    • Add new example of Translation for GenAIExample(d0b028)
  • Other examples

    • Update reranking microservice dockerfile path (d7a5b7)
    • Update tgi-gaudi version(3505bd)
    • Refine README of Examples(f73267)
    • Update READMEs(8ad7f3)
    • [CodeGen] Add codegen flowchart(377dd2)
    • Update audioqna image name(615f0d)
    • Add auto-truncate to gaudi tei (8d4209)
    • Update visualQnA chinese version(497895)
    • Fix Typo for Translation Example(95c13d)
    • FAQGen Megaservice(8c4a25)
    • Code-gen-react-ui(1b48e5)
    • Added doc sum react-ui(edf0d1)
  • CI/UT

    • Frontend failed with unknown timeout issue (7ebe78)
    • Adding Chatqna Benchmark Test(11a56e)
    • Expand tgi connect timeout(ee0dcb)
    • Optimize gmc manifest e2e tests(15fc6f)
    • Add docker compose yaml print for test(bb4230)
    • Refactor translation ci test (b7975e)
    • Refactor searchqna ci test(ecf333)
    • Translate UT for UI(284d85)
    • Enhancement the codetrans e2e test(450efc)
    • Allow gmc e2e workflow to get secrets(f45f50)
    • Add checkout ref in gmc e2e workflow(62ae64)
    • SearchQnA UT(268d58)
GenAIComps
  • Cores

    • Support https for microservice(2d6772)
    • Enlarge megaservice request timeout for supporting high concurrency(876ca5)
    • Add dynamic DAG(f2995a)
  • LLM

    • Optional vllm microservice container build(963755)
    • Refine vllm instruction(6e2c28)
    • Introduce 'entrypoint.sh' for some Containers(9ecc5c)
    • Support llamaindex for retrieval microservice and remove langchain(61795f)
    • Update tgi with text-generation-inference:2.1.0(f23694)
    • Fix requirements(f4b029)
    • Add vLLM on Ray microservice(ec3b2e)
    • Update code/readme/UT for Ray Serve and VLLM([dd939c](https://gith...
Read more

Generative AI Infrastructure v0.7 Release Notes

28 Jun 16:53
ac6c247
Compare
Choose a tag to compare

GenAIInfra

  • GMC

    • Enable gmc e2e for manifests changes and some minor fix (758432)
    • GMC: make "namespace" field of each resource in the CR optional (7073ac)
    • ChatQnA demo yaml files integration between GMC and Oneclick (020899)
    • Add gmc e2e (595185)
    • Add docker build and push target for GMC (04d7f2)
    • GMC: overwrite config map template before GMC resources are deployed (ce9190)
    • GMC: replace the service and deployment name if GMC has defined (eec845)
    • Add gmc guide (6bb8a3)
    • GMC: adopt separate e2e for gaudi and xeon (c5075b)
    • Update readme and user guide for GMC (2d17c9)
    • GMC: add Codetrans example (aed70d)
    • Enable GMC e2e on Gaudi (d204a7)
  • HelmChart

    • Helm chart: Add default minimal pod security (8fcf0a)
    • Support e2e test for chatqna helm chart (2f317d)
    • Add helm charts for deploy ChatQnA (20dce6)
    • Reorg of helm charts (d332c2)
  • Others

    • Add DocSum llm service manifests (9ab8de)
    • Enable golang e2e test in CI (bc9aba)
    • Add e2e test for docsum example (89aa5a)
    • Add docsum example on both xeon and gaudi node (c88817)

Others

Generative AI Infrastructure v0.6 Release Notes

01 Jun 09:43
bc9c2bd
Compare
Choose a tag to compare

GenAIInfra

  • Add Helm Charts redis-vector-db, TEI, TGI and CodeGen for deploying GenAIExamples on Kubernetes
  • Add Manifests for deploying GenAIExamples CodeGen, ChatQnA and Docsum on Kubernetes
  • Add Manifests for deploying GenAIExamples CodeGen, ChatQnA and Docsum on Docker Compose

Others