Releases: opea-project/GenAIInfra
Releases · opea-project/GenAIInfra
Generative AI Infrastructure v1.0 Release Notes
OPEA Release Notes v1.0
What’s New in OPEA v1.0
-
Highlights
- Improve the RAG performance through microservice optimizations (e.g., Hugging Face TGI, vLLM) and megaservice tuning
- Provide the experimental LLM model training support, includes full fine-tuning and parameter-efficient fine-tuning (PEFT)
- Improve RAG with Knowledge Graph based on Neo4j
- Improve VisualQnA and provide multi-modality RAG support
- Faster microservice launch through removal of some dispatch overhead
- Enable Gateway with guardrail, and integrate nginx with CORS protection and data preparation
- Enable HorizontalPodAutoscaler (HPA) for better resource management
- Define the metrics of RAG performance and enable accuracy evaluation for more GenAI examples
- Further improvement on documentation and developer experience
-
Other features
- Enable OpenAI compatible format on applicable microservices
- Support microservice launch from ModelScope to address China ecosystem need
- Support Red Hat OpenShift Container Platform (RHOCP)
- Refactor the code and CI/CD pipeline to provide better support for contributors
- Improve Docker versioning to avoid the potential conflict
- Enhance GenAI Microservice Connector (GMC), including improvements such as router performance optimizations and other updates
- Introduce Memory Bandwidth Exporter that integrates with Kubernetes Node Resource Interface
-
Learn more about OPEA at
- Getting Started: https://opea-project.github.io/latest/index.html
- Github: https://github.com/opea-project
- Docker Hub: https://hub.docker.com/u/opea
-
Release Documentation:
- Landing Page: https://opea.dev/
- Release Notes: https://github.com/opea-project/docs/tree/main/release_notes
Details
GenAIExamples
-
Deployment
- Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
- K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
- Update mount path in xeon k8s(2a6af64)
- Add Nginx - k8s manifest in CodeTrans(6a679ba)
- Add Nginx - docker in CodeTrans(cc84847)
- watch more docker compose files changes(4b0bc26)
- Add chatQnA UI manifest(758d236)
- Revert the LLM model for kubernetes GMS(f5f1e32)
- [ChatQnA] Update retrieval & dataprep manifests(6730b24)
- [ChatQnA]Update manifests(3563f5d)
- [ChatQnA] Update benchmarking manifests(36fb9a9)
- [ChatQnA] udate OOB & Tuned manifests(ac34860)
- Add nginx and UI to the ChatQnA manifest(05f9828)
- [ChatQnA] Update OOB with wrapper manifests.(933c3d3)
- [Translation] Support manifests and nginx(1e13031)
- update V1.0 benchmark manifest (e5affb9)
- update image name(e2a74f7)
- K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
- Change megaservice path in line with new file structure(5ab27b6)
- Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
- Add chatQnA UI manifest(758d236)
- Yaml: add comments to specify gaudi device ids.(63406dc)
- add tgi bf16 setup on CPU k8s.(ba17031)
-
Documentation
- [ChatQnA] Update README for ModelScope(aebc23f)
- Update README.md(4bd7841)
- [ChatQnA] Update README for without Rerank Pipeline(6b617d6)
- [ChatQnA] Update Benchmark README for w/o rerank(4a51874)
- Fix readme for nv gpu(43b2ae5)
- [ChatQnA] Update Benchmark README to Fix Input Length(55d287d)
- Refine ChatQnA README for TGI(afc3341)
- Add default model for VisualQnA README(07baa8f)
- Update readme for manifests of some examples(adb157f)
- doc: use markdown table in supported_examples(9cf1d88)
- doc: remove invalid code block language(c6d811a)
- add AudioQnA readme with supported model(f4f4da2)
- add more code owners(7f89797)
- doc: fix headings(7a0fca7)
- [Codegen] Refine readme to prompt users on how to change the model.(814164d)
- Update README.md and remove some open-source details(2ef83fc)
- Add issue template(84a781a)
- doc: fix headings and indenting(67394b8)
- Add default model in readme for FaqGen and DocSum(d487093)
- Change docs of kubernetes for curl commands in README(4133757)
- Update v0.9 RAG release data(947936e)
- Explain Default Model in ChatQnA and CodeTrans READMEs(2a2ff45)
- Update docker images list.(a8244c4)
- refactor the network port setting for AWS(bc81770)
- Add validate microservice details link(bd811bd)
- [ChatQnA] Add Nginx in Docker Compose and README(6c36448
- [Doc] Update CodeGen and Translation READMEs(a09395e)
- [Doc] Refine READMEs(372d78c)
- Remove marketing materials(d85ec09)
- doc PR to main instead of of v1.0r(dc94026)
- Update README.md for Multiplatforms(b205dc7)
- Refine the quick start of ChatQnA(3b70fb0)
- Update supported_examples(96d5cd9)
- [Doc] doc improvement(e0b3b57)
- Fix README issues(bceacdc)
- doc: fix broken image reference and markdown(d422929)
- doc: give document meaningful title(a3fa0d6)
- doc: fix incorrefine readme for reorg(d2bab99)
- doc: fix incorrect path to png image files (d97882e)
- update doc according to comments(f990f79)
- doc: fix headings and indenting(67394b8)
- Update README.md(4bd7841)
- refine readme for reorg(d2bab99)
- Update README with new examples(2d28beb)
- README: fix broken links(ff6f841)
- Update v0.9 RAG release data([947936e](https://github....
Generative AI Infrastructure v0.9 Release Notes
OPEA Release Notes v0.9
What’s New in OPEA v0.9
-
Broaden functionality
- Provide telemetry functionalities for metrics and tracing using Prometheus, Grafana, and Jaeger
- Initialize two Agent examples: AgentQnA and DocIndexRetriever
- Support for authentication and authorization
- Add Nginx Component to strengthen backend security
- Provide Toxicity Detection Microservice
- Support the experimental Fine-tuning microservice
-
Enhancement
- Align the Microservice format with the standards of OpenAI (Chat Completions, Fine-tuning... etc)
- Enhance the performance benchmarking and evaluation for GenAI Examples, ex: TGI, resource allocation, ...etc
- Enable support for launching container images as a non-root user
- Use Llama-Guard-2-8B as default Guardrails model and bge-large-zh-v1.5 as default embedding model, mistral-7b-grok as default CodeTrans model
- Add ProductivitySuite to provide access management and maintains user context
-
Deployment
- Support Red Hat OpenShift Container Platform (RHOCP)
- GenAI Microservices Connector (GMC) successfully tested on Nvidia GPUs
- Add Kubernetes support for AudioQnA and VisualQnA examples
-
OPEA Docker Hub: https://hub.docker.com/u/opea
-
Thanks for the external contribution from Sharan Shirodkar, Aishwarya Ramasethu
, Michal Nicpon and Jacob Mansdorfer
Details
GenAIExamples
-
ChatQnA
- Update port in set_env.sh(040d2b7)
- Fix minor issue in ChatQnA Gaudi docker README(a5ed223)
- update chatqna dataprep-redis port(02a1536)
- Add support for .md file in file upload in the chatqna-ui(7a67298)
- Added the ChatQnA delete feature, and updated the corresponding README(09a3196)
- fixed ISSUE-528(45cf553)
- Fix vLLM and vLLM-on-Ray UT bug(cfcac3f)
- set OLLAMA_MODEL env to docker container(c297155)
- Update guardrail docker file path(06c4484)
- remove ray serve(c71bc68)
- Refine docker_compose for dataprep param settings(3913c7b)
- fix chatqna guardrails(db2d2bd)
- Support ChatQnA pipeline without rerank microservice(a54ffd2)
- Update the number of microservice replicas for OPEA v0.9(e6b4fff)
- Update set_env.sh(9657f7b)
- add env for chatqna vllm(f78aa9e)
-
Deployment
- update manifests for v0.9(ba78b4c)
- Update K8S manifest for ChatQnA/CodeGen/CodeTrans/DocSum(01c1b75)
- Update benchmark manifest to fix errors(4fd3517)
- Update env for manifest(4fa37e7)
- update manifests for v0.9(08f57fa)
- Add AudioQnA example via GMC(c86cf85)
- add k8s support for audioqna(0a6bad0)
- Update mainifest for FaqGen(80e3e2a)
- Add kubernetes support for VisualQnA(4f7fc39)
- Add dataprep microservice to chatQnA example and the e2e test(1c23d87)
-
Documentation
- [doc] Update README.md(c73e4e0)
- doc fix: Update README.md to remove specific dicscription of paragraph-1(5a9c109)
- doc: fix markdown in docker_image_list.md(9277fe6)
- doc: fix markdown in Translation/README.md(d645305)
- doc: fix markdown in SearchQnA/README.md(c461b60)
- doc: fix FaqGen/README.md markdown(704ec92)
- doc: fix markdown in DocSum/README.md(83712b9)
- doc: fix markdown in CodeTrans/README.md(076bca3)
- doc: fix CodeGen/README.md markdown(33f8329)
- doc: fix markdown in ChatQnA/README.md(015a2b1)
- doc: fix headings in markdown files(21fab71)
- doc: missed an H1 in the middle of a doc(4259240)
- doc: remove use of HTML for table in README(e81e0e5)
- Update ChatQnA readme with OpenShift instructions(ed48371)
- Convert HTML to markdown format.(14621f8)
- Fix typo {your_ip} to {host_ip}(ad8ca88)
- README fix typo(abc02e1)
- fix script issues in MD file(acdd712)
- Minor documentation improvements in the CodeGen README(17b9676)
- Refine Main README(08eb269)
- [Doc]Add a micro/mega service WorkFlow for DocSum(343d614)
- Update README for k8s deployment(fbb81b6)
-
Other examples
- Clean deprecated VisualQnA code(87617e7)
- Using TGI official release docker image for intel cpu(b2771ad)
- Add VisualQnA UI(923cf69)
- fix container name(5ac77f7)
- Add VisualQnA docker for both Gaudi and Xeon using TGI serving(2390920)
- Remove LangSmith from Examples(88eeb0d)
- Modify the language variable to match language highlight.(f08d411)
- Remove deprecated folder.(7dd9952)
- update env for manifest(4fa37e7)
- AgentQnA example(67df280)
- fix tgi xeon tag(6674832)
- Add new DocIndexRetriever example(566cf93)
- Add env params for chatqna xeon test(5d3950)
- ProductivitySuite Combo Application with REACT UI and Keycloak Authen(947cbe3)
- change codegen tgi model(06cb308)
- change searchqna prompt(acbaaf8)
- minor fix mismatched hf token(ac324a9)
- fix translation gaudi env(4f3be23)
- Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml (c25063f)
-
CI/CD/UT
Generative AI Infrastructure v0.8 Release Notes
OPEA Release Notes v0.8
What’s New in OPEA v0.8
-
Broaden functionality
- Support frequently asked questions (FAQs) generation GenAI example
- Expand the support of LLMs such as Llama3.1 and Qwen2 and support LVMs such as llava
- Enable end-to-end performance and accuracy benchmarking
- Support the experimental Agent microservice
- Support LLM serving on Ray
-
Multi-platform support
- Release the Docker images of GenAI components under OPEA dockerhub and support the deployment with Docker
- Support cloud-native deployment through Kubernetes manifests and GenAI Microservices Connector (GMC)
- Enable the experimental authentication and authorization support using JWT tokens
- Validate ChatQnA on multiple platforms such as Xeon, Gaudi, AIPC, Nvidia, and AWS
-
OPEA Docker Hub: https://hub.docker.com/u/opea
Details
GenAIExamples
-
ChatQnA
- Add ChatQnA instructions for AIPC(26d4ff)
- Adapt Vllm response format (034541)
- Update tgi version(5f52a1)
- Update README.md(f9312b)
- Udpate ChatQnA docker compose for Dataprep Update(335362)
- [Doc] Add valid micro-service details(e878dc)
- Updates for running ChatQnA + Conversational UI on Gaudi(89ddec)
- Fix win PC issues(ba6541)
- [Doc]Add ChatQnA Flow Chart(97da49)
- Add guardrails in the ChatQnA pipeline(955159)
- Fix a minor bug for chatqna in docker-compose(b46ae8)
- Support vLLM/vLLM-on-Ray/Ray Serve for ChatQnA(631d84)
- Added ChatQnA example using Qdrant retriever(c74564)
- Update TEI version v1.5 for better performance(f4b4ac)
- Update ChatQnA upload feature(598484)
- Add auto truncate for embedding and rerank(8b6094)
-
Deployment
- Add Kubernetes manifest files for deploying DocSum(831463)
- Update Kubernetes manifest files for CodeGen(2f9397)
- Add Kubernetes manifest files for deploying CodeTrans(c9548d)
- Updated READMEs for kubernetes example pipelines(c37d9c)
- Update all examples yaml files of GMC in GenAIExample(290a74)
- Doc: fix minor issue in GMC doc(d99461)
- README for installing 4 worklods using helm chart(6e797f)
- Update Kubernetes manifest files for deploying ChatQnA(665c46)
- Add new example of SearchQnA for GenAIExample(21b7d1)
- Add new example of Translation for GenAIExample(d0b028)
-
Other examples
- Update reranking microservice dockerfile path (d7a5b7)
- Update tgi-gaudi version(3505bd)
- Refine README of Examples(f73267)
- Update READMEs(8ad7f3)
- [CodeGen] Add codegen flowchart(377dd2)
- Update audioqna image name(615f0d)
- Add auto-truncate to gaudi tei (8d4209)
- Update visualQnA chinese version(497895)
- Fix Typo for Translation Example(95c13d)
- FAQGen Megaservice(8c4a25)
- Code-gen-react-ui(1b48e5)
- Added doc sum react-ui(edf0d1)
-
CI/UT
- Frontend failed with unknown timeout issue (7ebe78)
- Adding Chatqna Benchmark Test(11a56e)
- Expand tgi connect timeout(ee0dcb)
- Optimize gmc manifest e2e tests(15fc6f)
- Add docker compose yaml print for test(bb4230)
- Refactor translation ci test (b7975e)
- Refactor searchqna ci test(ecf333)
- Translate UT for UI(284d85)
- Enhancement the codetrans e2e test(450efc)
- Allow gmc e2e workflow to get secrets(f45f50)
- Add checkout ref in gmc e2e workflow(62ae64)
- SearchQnA UT(268d58)
GenAIComps
-
Cores
-
LLM
- Optional vllm microservice container build(963755)
- Refine vllm instruction(6e2c28)
- Introduce 'entrypoint.sh' for some Containers(9ecc5c)
- Support llamaindex for retrieval microservice and remove langchain(61795f)
- Update tgi with text-generation-inference:2.1.0(f23694)
- Fix requirements(f4b029)
- Add vLLM on Ray microservice(ec3b2e)
- Update code/readme/UT for Ray Serve and VLLM([dd939c](https://gith...
Generative AI Infrastructure v0.7 Release Notes
GenAIInfra
-
GMC
- Enable gmc e2e for manifests changes and some minor fix (758432)
- GMC: make "namespace" field of each resource in the CR optional (7073ac)
- ChatQnA demo yaml files integration between GMC and Oneclick (020899)
- Add gmc e2e (595185)
- Add docker build and push target for GMC (04d7f2)
- GMC: overwrite config map template before GMC resources are deployed (ce9190)
- GMC: replace the service and deployment name if GMC has defined (eec845)
- Add gmc guide (6bb8a3)
- GMC: adopt separate e2e for gaudi and xeon (c5075b)
- Update readme and user guide for GMC (2d17c9)
- GMC: add Codetrans example (aed70d)
- Enable GMC e2e on Gaudi (d204a7)
-
HelmChart
-
Others
Others
Generative AI Infrastructure v0.6 Release Notes
GenAIInfra
- Add Helm Charts redis-vector-db, TEI, TGI and CodeGen for deploying GenAIExamples on Kubernetes
- Add Manifests for deploying GenAIExamples CodeGen, ChatQnA and Docsum on Kubernetes
- Add Manifests for deploying GenAIExamples CodeGen, ChatQnA and Docsum on Docker Compose