
Project ideas for 2025

Adrian Boguszewski edited this page Mar 21, 2025 · 7 revisions

Below you can find all project ideas for the current year.

1. Gesture Control with OpenVINO

Short description: The MediaPipe Gesture Recognizer task lets you recognize hand gestures in real-time and provides the recognized hand gesture results along with the landmarks of the detected hands. You can use this task to recognize specific hand gestures from a user and invoke application features that correspond to those gestures. This project asks you to port MediaPipe Gesture Recognizer to OpenVINO and create a local gesture control system on a monitor. You can also customize the models in this pipeline according to task requirements.

Expected outcomes: A gesture control system over the monitor

Skills required/preferred: Python, C++, Computer Vision

Mentors: Ethan Yang, Zhuo Wu

Size of project: 175 hours

Difficulty: Medium

2. OpenVINO Messenger AI-Assistant for an AI PC

Short description: People use messengers daily not just for communication, but also for reading news and gathering information on a variety of topics by subscribing to channels. For any popular messenger, implement a desktop AI assistant for an AI PC that reads messages from a specified time interval and uses retrieval-augmented generation (RAG) to supplement a local large language model (LLM) with this private data, providing useful information such as a daily digest. Users should be able to interact with the OpenVINO Messenger AI Assistant to ask questions related to any discussions extracted from the messenger.
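The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not a reference design: the message records, the hand-written 3-dimensional "embeddings", and the names `retrieve` and `build_prompt` are all hypothetical, and a real application would obtain vectors from an embedding model and pass the prompt to a local LLM.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, messages, start, end, k=2):
    """Return the k messages with timestamps in [start, end] closest to the query."""
    in_window = [m for m in messages if start <= m["ts"] <= end]
    ranked = sorted(in_window, key=lambda m: cosine(query_vec, m["vec"]), reverse=True)
    return ranked[:k]

def build_prompt(question, retrieved):
    """Put the retrieved messages in front of the question for the local LLM."""
    context = "\n".join(f"- {m['text']}" for m in retrieved)
    return f"Use only the messages below to answer.\n{context}\nQuestion: {question}"

# Toy data: hypothetical messages with made-up embedding vectors.
MESSAGES = [
    {"ts": 10, "text": "The release ships next week", "vec": [0.9, 0.1, 0.0]},
    {"ts": 20, "text": "Lunch at noon?", "vec": [0.0, 0.2, 0.9]},
    {"ts": 30, "text": "Release notes draft is ready", "vec": [0.8, 0.3, 0.1]},
    {"ts": 99, "text": "Old release discussion", "vec": [1.0, 0.0, 0.0]},
]

hits = retrieve([1.0, 0.0, 0.0], MESSAGES, start=0, end=50, k=2)
prompt = build_prompt("When is the release?", hits)
```

Filtering by time interval before ranking keeps the digest limited to the period the user asked about, even when older messages are a closer semantic match.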

Expected outcomes:

  • A standalone desktop application capable of retrieving messages from popular messaging platforms, such as by using API access.
  • The project incorporates OpenVINO, using a local large language model (LLM) and the retrieval-augmented generation (RAG) technique, running on the AI PC's integrated GPU.
  • The application features a user interface that allows interaction with the local LLM to generate valuable output.

Skills required/preferred: Python or C++, LLMs, RAG, UI/Qt

Mentors: Dmitriy Pastushenkov, Ethan Yang

Size of project: 350 hours

Difficulty: Medium

3. Desktop Chat-Bot Application

Short description: Neural language models can run locally without an internet connection. You will write your own cross-platform desktop chatbot application using OpenVINO and Electron (or an equivalent). The chatbot may be general-purpose or tailored to your needs (subject to the NLP model).

Expected outcomes:

  • Desktop Chat-Bot application works without the internet
  • Project uses an NLP model
  • Project uses OpenVINO in Electron environment
  • Medium/OpenVINO blogs

Skills required/preferred: JavaScript, Electron, OpenVINO GenAI, Natural Language Processing

Mentors: Alicja Miłoszewska, Kirill Suvorov

Size of project: 175 hours

Difficulty: Medium

4. Continuous Face-Detection with Automatic Device Switching on AI PCs using OpenVINO AUTO feature

Short description: AI PCs incorporate multiple devices/inference engines for different machine-learning applications. Based on performance, latency, or power consumption requirements, an application may choose to use an NPU, a GPU, or a CPU for inference tasks. Usually, an application uses a single engine/device for the entire lifetime of the process/inference, and the machine learning model used by the application is compiled for only one device. However, it is important for the application to be able to switch between inference devices at runtime based on user preference, application behavior, and load/stress on the device currently in use. Through this project, we want to build a face-detection application that runs continuously on the AI PC while switching between inference devices at runtime, based on user recommendations or on an evaluation of the stress on the current engine. Inference should not end or pause while switching devices, and switching should not lead to BSODs, system hangs, or device crashes that cause other applications to fail.
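One possible shape for the switching decision is sketched below. It is only an illustration of the policy, not part of OpenVINO: the function name `choose_device`, the thresholds, and the idea of passing in a utilization number per device are assumptions, and how load is measured is left open. OpenVINO's AUTO device performs its own selection from a priority list (for example a device string such as `AUTO:NPU,GPU,CPU`); a policy like this would sit on top of it in the application.

```python
def choose_device(current, loads, preferred=None, high=0.85, margin=0.20):
    """Pick the device for the next inference request.

    loads maps a device name to its utilization in [0, 1]. The thresholds
    `high` and `margin` are illustrative defaults, not recommended values.
    """
    # An explicit user choice wins whenever that device is available.
    if preferred in loads:
        return preferred
    # Stay on the current device while it is not under stress.
    current_load = loads.get(current, 1.0)
    if current_load < high:
        return current
    # Otherwise move to the least-loaded device, but only if it is clearly
    # better, so the application does not flap between devices.
    best = min(loads, key=loads.get)
    if loads[best] + margin <= current_load:
        return best
    return current
```

The margin acts as hysteresis: a stressed device is left only when another device is noticeably less busy, which avoids repeated recompilation or hand-over when two devices carry similar load.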

Expected outcomes:

  1. Implement a low-latency face-detection application that runs on multiple devices/engines within AI PCs
  2. Use the OpenVINO AUTO feature to demonstrate runtime switching between devices
  3. Create a GUI that prompts the user to change the device at runtime based on user preference
  4. Analyze the device load and recommend that the user switch to the most appropriate device to continue inference

Skills required/preferred: Python or C++, Basic ML knowledge

Mentors: Shivam Basia, Aishwarye Omer

Size of project: 175 hours

Difficulty: Easy

5. OpenVINO AI PC Model Training Kit

Short description: Many students and developers still train ML models for Kaggle competitions and other hackathons on their laptops or on free resources (Google Colab, Kaggle compute). Building scalable training extensions for OpenVINO is an effective way to attract developers to the OpenVINO framework. As AI PCs rapidly evolve, developers should not need to invest in expensive resources to train their models; an AI PC's capabilities should be sufficient for these use cases. In addition, many companies predict their financial goals with time-series forecasting or tree-based models trained on private datasets. Enabling training on a local AI PC will open the door to many such use cases.

Similar to popular toolkits like CUDA, the OpenVINO stack can be used to enable training on Intel XPUs by optimizing implementations of the operations commonly used to train machine learning models (tree-based and graph-based regressors/classifiers) and by extending capabilities for deep learning models, such as convolutions, matrix multiplications, and other tensor operations.

Currently, a pre-trained model (mostly trained on a GPU) is converted to OpenVINO IR files with the OpenVINO Converter (OVC) and then quantized using NNCF. Inference is then implemented with the quantized model, which takes significant time and code changes when moving an existing application to that model.

So, we want to create an OpenVINO training wrapper on top of PyTorch/TensorFlow/scikit-learn that uses OpenVINO directly as a backend to optimize training on more efficient hardware. The model will be trained on the AI PC XPU. With minimal code changes and less training time, the application can then be moved to the OpenVINO framework, reducing latency and increasing throughput.

Expected outcomes:

  • OpenVINO training wrapper on top of PyTorch/TensorFlow/scikit-learn
  • Enable support for better hyperparameter tuning while training on AI PC through OpenVINO
  • A model trained with AI PC using the wrapper and OpenVINO as a backend framework
  • Enable inference with the above model using the OpenVINO benchmark app on an AI PC

Skills required/preferred: Python or C++, DL Model Training

Mentors: Shivam Basia, Aishwarye Omer

Size of project: 350 hours

Difficulty: Medium

6. Improve OpenVINO training extensions classification component by introducing DinoV2 architecture with DoRA

Short description: OpenVINO training extensions (OTX) provide a diverse set of image classification models, but they lack manually refined solutions. Most of the models in OTX are imported from timm or torchvision as is. Also, OTX currently fine-tunes all weights of the imported models, which is sub-optimal for small datasets. This project aims to add to OTX a fast-to-train and accurate classification model (or a family of models) that covers the scenario of fine-tuning on a small or medium dataset. The idea is to pick a compact transformer architecture and explore the accuracy/training time trade-off when using DoRA and linear classification head fine-tuning. DinoV2 is an example of a transformer model to try in that scenario, but the scope is not necessarily limited to it (one can consider VLM image decoders, like SmolVLM and others).
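For readers new to DoRA, the weight reparameterization it relies on can be written out directly. The sketch below is a plain-Python illustration using nested lists rather than tensors; the function names are hypothetical and nothing here reflects how OTX would implement it. The adapted weight is m * (W0 + B A) / ||W0 + B A||_c, where W0 is the frozen pretrained weight, B and A are the low-rank factors, m is a trainable per-column magnitude vector, and the norm is taken per column.

```python
import math

def matmul(a, b):
    """Multiply matrix a (n x r) by matrix b (r x k), both as nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def column_norms(w):
    """Euclidean norm of every column of w."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]

def dora_weight(w0, b, a, m):
    """Return m * (W0 + B A) / ||W0 + B A||_c with the norm taken per column."""
    delta = matmul(b, a)
    v = [[w0[i][j] + delta[i][j] for j in range(len(w0[0]))] for i in range(len(w0))]
    norms = column_norms(v)
    return [[m[j] * v[i][j] / norms[j] for j in range(len(v[0]))] for i in range(len(v))]

# Frozen 2 x 2 weight, rank-1 adapter. With B = 0 and m = ||W0||_c the
# adapted weight equals W0, which is the usual starting point for training.
W0 = [[3.0, 0.0], [4.0, 2.0]]
B0 = [[0.0], [0.0]]
A0 = [[1.0, 1.0]]
M0 = column_norms(W0)
W_start = dora_weight(W0, B0, A0, M0)
```

Only `m`, `B`, and `A` would be trained, so the number of trainable parameters grows with the adapter rank rather than with the full weight size, which is the source of the training-time savings the project wants to measure.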

Expected outcomes: A training pipeline that applies DoRA and linear classification head fine-tuning is added to OTX and surpasses traditional whole-model fine-tuning on small/medium datasets.

Skills required/preferred: ML basics, Python

Mentors: Vladislav Sovrasov, Kirill Prokofiev

Size of project: 175 hours

Difficulty: Medium

7. Enhancing Anomalib with SOTA Anomaly Detection Benchmarks, Models and Evaluation Metrics

Short description: This project aims to enhance Anomalib by implementing state-of-the-art anomaly detection models from both one-class and multi-class learning paradigms such as GLASS, GLAD and Dinomaly, while expanding the framework's benchmark capabilities across different domains. The work includes integrating recent advances in anomaly detection literature from various learning approaches, adding new benchmark datasets from industrial and medical domains, and implementing comprehensive evaluation metrics (e.g., mIoU-max, averaged mAD). These enhancements will make Anomalib a more comprehensive framework for real-world anomaly detection applications across different domains and use cases.

Expected outcomes:

  • Implementation of SOTA anomaly detection models such as GLASS, GLAD and Dinomaly
  • Integration of new benchmark datasets from industrial and medical domains
  • Implementation of new evaluation metrics such as mIoU-max, averaged mAD
  • Comprehensive documentation and examples
  • Benchmark results and performance comparisons
  • Optimization for deployment readiness

Skills required/preferred:

  • Strong Python programming skills
  • Experience with PyTorch, PyTorch Lightning and Anomalib (optional) frameworks
  • Understanding of anomaly detection principles and SOTA methods
  • Experience with testing and benchmarking ML models
  • Good software engineering practices

Mentors: Ashwin Vaidya, Dick Ameln, Samet Akcay

Size of project: 350 hours

Difficulty: Medium to hard

8. Refining Zero-Shot Object Segmentation by Combining Vision Foundation Models

Short description: Visual Prompting is an advanced computer vision technique that enables object identification in images using reference examples, eliminating the need for labeled training data (zero-shot learning). This approach leverages powerful foundational models such as DINOv2 for feature extraction and Segment Anything Model (SAM) for precise object segmentation. The process involves matching object features across images to locate instances of the target object in unseen data. SAM is then used to generate segmentation masks. However, these masks often contain false positives or incomplete segmentations. To improve accuracy, filtering and merging techniques must be applied. Existing solutions are often dataset-specific, limiting their generalizability. The goal of this project is to develop a more effective and generalizable approach for refining segmentation masks across diverse datasets. The student will experiment with existing methods, evaluate different refinement strategies, and explore novel techniques to improve segmentation robustness.
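As a point of reference for the filtering and merging step, a very simple baseline is sketched below. It is a generic illustration and not an existing method from any of the tools named above: masks are represented as sets of pixel coordinates, and the thresholds and the function name `refine` are assumptions.

```python
def iou(a, b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def refine(masks, min_score=0.5, merge_iou=0.5):
    """Drop low-confidence masks, then merge masks that overlap strongly.

    masks is a list of (score, set_of_pixels) pairs. Returns a list of
    pixel sets, one per merged object.
    """
    # Filtering: discard likely false positives and process the rest
    # from most to least confident.
    kept = sorted((m for m in masks if m[0] >= min_score),
                  key=lambda m: m[0], reverse=True)
    merged = []
    for _, pixels in kept:
        for i, existing in enumerate(merged):
            # Merging: treat strongly overlapping masks as one object.
            if iou(existing, pixels) >= merge_iou:
                merged[i] = existing | pixels
                break
        else:
            merged.append(set(pixels))
    return merged
```

Fixed thresholds like these are exactly what tends to be dataset-specific, so a baseline of this kind is mainly useful as something to benchmark more general refinement strategies against.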

Expected outcomes:

  • A robust pipeline for refining object masks that generalize across datasets.
  • A benchmarking framework to evaluate different segmentation refinement strategies.
  • Potential integration with existing open-source vision repositories.

Skills required/preferred:

  • Proficiency in Python and experience with deep learning frameworks like PyTorch.
  • Strong understanding of deep learning, computer vision, and self-supervised learning.
  • Familiarity with foundational vision models like DINOv2, SAM, and CLIP (optional but beneficial).

Mentors: Daan Krol, Klaas Dijkstra, Samet Akcay

Size of project: 350 hours (175 hours possible)

Difficulty: Medium

9. Interactive Multimodal Data Explorer: Leveraging Foundation Models for Dataset Exploration and Cleaning with Datumaro

Short description: Large language-vision models and other multimodal models (CLIP, LLaVA, PaLM, GPT-4V) generate embeddings for their included modalities which are aligned during training. These embeddings have proven valuable for downstream tasks like detection and classification. This project aims to enhance Datumaro, an efficient dataset management library, by building interactive visualization tools for exploring these joint embeddings. Users will be able to navigate (pan, zoom, filter) the embedding space to gain insights into dataset structure and relationships between modalities. The project will leverage either modern web frameworks (React/Vue.js) or ML-specific frameworks (Streamlit/Gradio) to create an intuitive interface for data exploration, with additional functionality for basic annotation operations to identify and tag noisy or corrupt data. The Datumaro toolkit will handle core dataset operations while OTX will be used for feature computation.

Expected outcomes:

  • Interactive web application integrated with Datumaro for dataset visualization and management
  • Real-time exploration of joint embedding spaces through 3D visualization
  • Integration with foundation models for embedding generation
  • Basic annotation interface for data cleaning and tagging
  • Documentation and example workflows within the Datumaro ecosystem

Skills required/preferred:

  • Python programming with experience in ML/DL concepts
  • Web development skills (React/Vue.js OR Streamlit/Gradio OR HTML/JavaScript/D3)
  • Experience with data visualization libraries (D3.js, Plotly, etc.)
  • Understanding of embedding spaces and dimensional reduction
  • Familiarity with dataset management concepts
  • Interest in learning Datumaro's architecture and capabilities

Mentors: Laurens Hogeweg, Samet Akcay

Size of project: 350 hours

Difficulty: Medium

10. Fine-tuning Vision Language Models (VLMs) for Object Detection and Hierarchical Classification using the OpenVINO Ecosystem

Short description: Vision Language Models (VLMs) are foundational AI models capable of understanding and processing both visual and textual data. This project aims to fine-tune VLMs for object detection and hierarchical classification tasks, leveraging Intel Geti and the OpenVINO ecosystem. The project will explore different training methods, inference optimizations, dataset integrations, and if time permits, Explainable AI (XAI) enhancements. By enabling fine-tuning of VLMs on specific domains, this project will improve their adaptability and provide a structured approach to training VLMs with optimized memory usage and accuracy trade-offs. It will also contribute to dataset integration into Datumaro, making it easier for researchers to use public datasets, enhance OpenVINO inference optimizations for faster and more efficient VLM deployments, and extend XAI support for VLMs, improving transparency and interpretability.

Expected outcomes:

  • A survey of existing fine-tuning methods for VLMs for object detection and hierarchical classification.
  • Benchmarking of various fine-tuning approaches, including LoRA, QLoRA, and targeted-layer training, integrated into the OpenVINO Training Extensions (OTX)
  • Integrated dataset formats into OpenVINO's Datumaro repo for seamless VLM dataset management.
  • Optimized inference pipeline leveraging OpenVINO.
  • XAI support for VLMs within the OpenVINO ecosystem, contributed to the OpenVINO XAI Toolkit

Skills required/preferred:

  • Python programming with experience in deep learning (PyTorch and Lightning)
  • Experience with object detection and hierarchical classification.
  • Experience with Vision Language Models is preferred
  • Familiarity with quantization techniques
  • Interest in learning and contributing to Datumaro, OTX, XAI and OpenVINO capabilities

Mentors: Rajesh Gangireddy, Laurens Hogeweg, Samet Akcay

Size of project: 350 hours

Difficulty: Medium

11. Leveraging Large Foundation Models (LFMs) for Automated Annotation and Edge-Deployable Model Training with Human-In-Loop

Short description: The annotation process for large-scale datasets, particularly for tasks such as classification, object detection, and segmentation, is time-consuming and labor-intensive. Large Foundation Models (LFMs) offer powerful capabilities in generating annotations automatically, significantly reducing human effort. However, these models are often too large and computationally expensive for real-time edge deployment. This project aims to develop a framework that leverages LFMs for annotation while progressively distilling their knowledge into smaller, edge-deployable models. The system will also incorporate uncertainty estimation and active learning to ensure high-quality labels with minimal human intervention.

Example flow: The first step involves leveraging LFMs to generate initial annotations for large datasets for a specific task (classification, object detection, or segmentation). To enhance reliability, predictions from LFMs can be combined with additional weak supervision sources or human in loop. Active learning techniques will be incorporated to prioritize uncertain or highly informative samples for human verification, ensuring efficient use of annotation effort. The system will track human corrections and use them to improve future annotation reliability by iteratively refining the uncertainty estimation models. Additionally, as the annotation loop progresses, the smaller model will be fine-tuned on corrected annotations, gradually improving its accuracy with minimal human intervention. Over time, the smaller model can take over the annotation process, reducing dependency on expensive LFMs while maintaining high-quality labels.
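The step that routes uncertain samples to a human can be illustrated with entropy-based selection. This is a minimal sketch under simple assumptions: the large model returns class probabilities per sample, uncertainty is measured with Shannon entropy, and a fixed review budget is used. The function name `split_for_review` and the sample ids are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def split_for_review(predictions, budget):
    """Send the `budget` most uncertain samples to a human, auto-label the rest.

    predictions maps a sample id to the class probabilities produced by the
    large model. Returns (ids_to_review, auto_labels).
    """
    ranked = sorted(predictions, key=lambda s: entropy(predictions[s]), reverse=True)
    to_review = ranked[:budget]
    # For confident samples, accept the most probable class as the label.
    auto_labels = {
        s: max(range(len(predictions[s])), key=predictions[s].__getitem__)
        for s in ranked[budget:]
    }
    return to_review, auto_labels

preds = {
    "img_a": [0.98, 0.01, 0.01],
    "img_b": [0.40, 0.35, 0.25],
    "img_c": [0.10, 0.80, 0.10],
}
review, auto = split_for_review(preds, budget=1)
```

In the loop described above, the human-corrected labels for the reviewed samples and the auto-accepted labels would both feed the fine-tuning of the smaller model, and the corrections could be logged to recalibrate the uncertainty estimate over time.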

Expected outcomes:

  • A scalable framework that automates annotation using LFMs while minimizing human effort.
  • A HITL framework where uncertain annotations are flagged for human verification. A simple UI can be useful for this; for experimentation, however, the annotations can also be corrected using ground truth (no UI required).
  • A robust uncertainty estimation and active learning pipeline to improve annotation quality.
  • Training pipelines for edge-deployable models in an active learning environment.
  • A benchmarking study comparing manual annotation, LFM-only annotation, and the proposed AL-HITL approach in terms of accuracy, annotation speed, and human effort reduction.
  • Open-source repo, including example scripts (and possibly datasets) annotated using the system.

Skills required/preferred:

  • Experience with deep learning frameworks such as PyTorch and Transformers.
  • Familiarity with Large Foundation Models (LFMs) and Active Learning techniques.
  • Understanding of uncertainty estimation in AI models
  • (Optional) Experience in developing simple web-based UI applications for interactive dataset annotation (e.g., using Gradio or Streamlit), leveraging existing work where applicable.

Mentors: Rajesh Gangireddy, Samet Akcay

Size of project: 350 hours

Difficulty: Medium

12. Enable and optimize popular Keras Hub GenAI/LLM pipelines for the OpenVINO backend in Keras 3 workflow

Short description: Implement support for missing operations in Keras 3 for the OpenVINO backend in order to run the required GenAI/LLM pipelines. Optimize the enabled pipelines to reach the best performance metrics and outperform other backends (PyTorch, TensorFlow, JAX).

Expected outcomes: Since the Keras 3.8 release, a preliminary version of the OpenVINO backend has been available for Keras 3 workflows. The goal of this project is to select the 2-3 most popular GenAI/LLM pipelines from Keras Hub, then enable and optimize them for the OpenVINO backend. Some required operations may be missing, so you will need to implement support for them in Keras 3 for the OpenVINO backend in order to run the required GenAI/LLM pipelines. Optimize the enabled pipelines to reach the best performance metrics and outperform other backends (PyTorch, TensorFlow, JAX).

Skills required/preferred: Strong familiarity with AI frameworks and tensor operations, and with GenAI/LLM optimization approaches such as KV-cache and LoRA adapters

Mentors: Roman Kazantsev, Maxim Vafin, Anastasia Popova, Andrei Kochin

Size of project: 175 hours

Difficulty: Medium to hard

13. Implement XLA plugin to run OpenVINO applications on any XLA supported devices (NVIDIA GPUs, FPGA, TPUs)

Short description: Implement a new OpenVINO plugin for which openvino.compile converts OpenVINO IR into an XLA representation (using the HLO or MLIR dialect) and whose inference request runs the compiled blob on any XLA backend (GPUs, FPGAs, TPUs). This feature will make it possible to run inference on CUDA GPUs, TPUs, and FPGA devices using the OpenVINO API.

Expected outcomes: A new OpenVINO plugin named "XLA" that can run inference on basic CNN and transformer models on NVIDIA GPUs, Google TPUs, etc.

Skills required/preferred: Strong familiarity with AI frameworks and tensor operations; understanding of and experience with the HLO/MLIR dialects and the XLA C++ API.

Mentors: Roman Kazantsev, Maxim Vafin, Anastasia Popova, Andrei Kochin

Size of project: 350 hours

Difficulty: Hard

14. Accelerating Inference of NNCF-Compressed LLMs with Triton

Short description: Large Language Models (LLMs) require significant computational resources (memory, compute, and power) for efficient inference. The Neural Network Compression Framework (NNCF) enables model optimization via quantization and weight compression techniques, reducing memory and compute requirements. However, running inference on these compressed models efficiently across CPU, GPU, and other hardware demands optimized execution kernels. Triton can help here because it allows writing a kernel once and achieving portable and efficient execution on multiple hardware platforms. This project aims to accelerate the inference performance of NNCF-compressed LLMs by leveraging "low-bit matmul" Triton kernels and by providing the capability to customize kernels for researching new compression algorithms. Available open-source implementations of "low-bit matmul" Triton kernels, such as GemLite, Tinygemm, Marlin, and Triton AutoGPTQ, will be considered. Taking these open-source implementations into account, the "low-bit matmul" Triton kernels will be implemented to support the NNCF weight compression types: INT8, INT4, NF4, FP4, and dynamic INT8 group quantization. It will also be necessary to implement torch.compile compatibility, as one of the solutions is to use custom_op for calling the kernels.
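The arithmetic a "low-bit matmul" kernel has to reproduce can be shown with a small reference in plain Python. This is not a Triton kernel and is not NNCF's exact scheme: it assumes symmetric 4-bit quantization with one scale per group and two values packed per byte, and all function names are hypothetical. A reference like this is mostly useful for checking a real kernel's output.

```python
def quantize_group(weights):
    """Symmetric 4-bit quantization of one group: (scale, ints in [-8, 7])."""
    scale = max(abs(w) for w in weights) / 7 or 1.0  # avoid a zero scale
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return scale, q

def pack(q):
    """Pack pairs of signed 4-bit values into bytes, low nibble first."""
    return bytes((q[i] & 0xF) | ((q[i + 1] & 0xF) << 4) for i in range(0, len(q), 2))

def unpack(data):
    """Inverse of pack: recover signed 4-bit values from bytes."""
    out = []
    for byte in data:
        for nibble in (byte & 0xF, byte >> 4):
            out.append(nibble - 16 if nibble >= 8 else nibble)
    return out

def quantize_row(row, group_size):
    """Quantize one weight row group by group (row length must be even)."""
    scales, q = [], []
    for start in range(0, len(row), group_size):
        scale, group = quantize_group(row[start:start + group_size])
        scales.append(scale)
        q.extend(group)
    return pack(q), scales

def lowbit_dot(x, packed, scales, group_size):
    """Dot product of activations x with a packed 4-bit weight row."""
    q = unpack(packed)
    return sum(x[i] * q[i] * scales[i // group_size] for i in range(len(x)))

row = [7.0, -3.0, 0.0, 1.0, 14.0, -6.0, 2.0, -14.0]
packed, scales = quantize_row(row, group_size=4)
result = lowbit_dot([1.0] * 8, packed, scales, group_size=4)
```

An optimized kernel would fuse the unpacking and scaling into the matrix multiplication instead of materializing the dequantized weights, but its result should agree with this kind of reference up to floating-point tolerance.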

Expected outcomes:

  • Efficient inference of NNCF-compressed LLMs using Triton.
  • Performance benchmarks demonstrating acceleration improvements.
  • Pull request with the implementation of "low-bit matmul" Triton kernels in NNCF.

Skills required/preferred:

  • Python
  • Understanding LLMs
  • Understanding model optimization techniques (NNCF)
  • Experience writing Triton or CUDA kernels (optional)
  • Experience with PyTorch, torch.compile
  • Performance Profiling

Mentors: Alexander Suslov, Alexander Dokuchaev

Size of project: 350 hours

Difficulty: Medium to hard

15. OpenVINO GenAI provider for LangChain

Short description: LangChain is a framework for developing applications powered by language models. This project aims to create an OpenVINO GenAI extension for LangChain, enabling users to use OpenVINO as a backend for running language models. The contributor will need to learn how to develop extensions for LangChain and understand the OpenVINO GenAI API.

The project involves binding these components together to allow seamless integration of OpenVINO with LangChain, providing users with optimized performance for language model inference, as well as clear documentation for the exposed API. The final package should be published on PyPI under langchain-openvino.


Expected outcomes: A fully functional OpenVINO GenAI extension for LangChain, allowing users to run language models using OpenVINO as the backend. It should also be delivered in the form of a Python wheel.

Skills required/preferred: Python, LLMs

Mentors: Przemyslaw Wysocki, Anastasia Kuporosova

Size of project: 175 hours

Difficulty: Medium

16. Traffic Intersection Monitoring System with OpenVINO

Short description: The Traffic Intersection Monitoring System leverages object detection algorithms to identify vehicles, bicycles, and pedestrians in real-time. This system provides statistics on traffic and pedestrian flow at intersections and can trigger alarms for traffic violations such as running red lights. The project involves implementing object detection algorithms with OpenVINO, developing post-processing algorithms for traffic violation detection, and creating a GUI using Qt.
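The post-processing for one violation type, running a red light, can be sketched as a check on tracked object positions. This is an illustration under strong simplifying assumptions: a single horizontal stop line, vehicles moving away from the camera, and per-frame traffic light states supplied from elsewhere. The data layout and the name `red_light_violations` are hypothetical.

```python
def red_light_violations(tracks, light_states, stop_line_y):
    """Report tracks that cross the stop line while the light is red.

    tracks maps a track id to a list of (frame, y_center) pairs sorted by
    frame. light_states maps a frame number to "red", "yellow" or "green".
    Image y grows downward, so a vehicle driving away from the camera
    crosses the line when y goes from >= stop_line_y to < stop_line_y.
    """
    violations = []
    for track_id, points in tracks.items():
        for (_, y_prev), (frame, y_cur) in zip(points, points[1:]):
            crossed = y_prev >= stop_line_y > y_cur
            if crossed and light_states.get(frame) == "red":
                violations.append((track_id, frame))
                break  # report each track at most once
    return violations

tracks = {
    1: [(0, 120), (1, 105), (2, 95)],   # crosses at frame 2
    2: [(0, 130), (1, 110), (2, 101)],  # stops before the line
    3: [(3, 102), (4, 90)],             # crosses at frame 4
}
lights = {0: "red", 1: "red", 2: "red", 3: "green", 4: "green"}
found = red_light_violations(tracks, lights, stop_line_y=100)
```

A real intersection would need one stop line per lane and direction, and the track positions would come from the object detector combined with a tracker.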

Expected outcomes: A fully functional traffic intersection monitoring system.

Skills required/preferred: Python, C++, Qt, Computer Vision

Mentors: Dan Qiu, Yiwei Lee

Size of project: 175 hours

Difficulty: Medium

17. Optimized Serving endpoint for classic machine learning model training and inference

Short description: Classic machine learning algorithms play an important role in AI applications. The goal of this project is to create a serving endpoint inside the OpenVINO Model Server that performs machine learning model training based on the datasets and algorithm type received in the request. The training would be performed inside a custom Python script employing scikit-learn with oneDAL performance optimization. The response to the training request would be the model in ONNX format.

The trained model would be stored in a versioned model structure and exported for inference, also using the OpenVINO Model Server, which supports ONNX models.

The advantages of the project would be:

  • Delegating training and inference for ML models, which makes the client application simpler and protects the model
  • Performance acceleration on CPU and option to delegate execution to Intel GPU

Expected outcomes: Serving endpoint for training ML models. Inference serving deployment of trained ML models. Examples of how to use and evaluate the endpoints.

Skills required/preferred: Python, scikit-learn, classical machine learning algorithms

Mentors: Dariusz Trawinski, Milosz Zeglarski

Size of project: 175 hours

Difficulty: Medium

18. Multimodal embeddings based on OVMS

Short description: Multimodal embedding models can semantically compare text with other information types such as images, video, and audio. They play an important role in AI applications for searching, sorting, classification, and many other functions.

The idea of this project is to simplify and optimize the calculation of embeddings for various data types by exposing dedicated REST API endpoints in the OpenVINO Model Server. The proposal is to combine the benefits of the transformers library for data preprocessing with OpenVINO Runtime for inference execution. The scope includes creating a MediaPipe graph with the following nodes:

  • Python-based preprocessing steps that transform arbitrary data types such as images, audio, and video into the formats accepted by the AI models
  • Inference execution with the model exported from the Hugging Face Hub

Such an endpoint would combine the flexibility of accepting multiple data formats with execution efficiency.

To demonstrate this functionality, the project includes creating a sample application for searching images based on a user prompt. That would include populating the vector DB with embeddings calculated for the images in a local folder and a retrieval operation that compares image embeddings with text embeddings (already supported in OpenVINO Model Server).

Expected outcomes: Image, audio, and video embedding serving endpoints implemented in OpenVINO Model Server as a graph with Python preprocessing and OpenVINO inference execution. An application performing image search based on a user prompt.

Skills required/preferred: Python, LangChain, transformers

Mentors: Dariusz Trawinski, Damian Kalinowski

Size of project: 175 hours

Difficulty: Medium

19. Object tracking in MediaPipe (MP) with OpenVINO inference

Short description: Tracking objects in a video stream is an important use case. It combines an object detection model with a tracking algorithm that analyzes a whole sequence of images. The current state-of-the-art algorithm is ByteTrack.

The goal of the project is to implement the ByteTrack algorithm as a MediaPipe graph that could delegate inference execution to the OpenVINO inference calculator. This graph could be deployed in the OpenVINO Model Server for serving. A sample application using the KServe API would send a stream of images and receive information about the objects tracked in the stream.
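The core idea of ByteTrack, associating tracks first with high-score detections and then with the remaining low-score ones, can be sketched as follows. This is a deliberately simplified illustration: it uses greedy IoU matching on the last known box, whereas ByteTrack itself uses Kalman-filter motion prediction and optimal assignment. The thresholds and function names are assumptions, and a MediaPipe calculator would implement this logic in C++.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    inter = max(0.0, w) * max(0.0, h)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def greedy_match(tracks, boxes, min_iou):
    """Greedily pair track ids with box indices by descending IoU."""
    pairs = sorted(((box_iou(box, det), tid, i)
                    for tid, box in tracks.items()
                    for i, det in enumerate(boxes)), reverse=True)
    matches, used = {}, set()
    for score, tid, i in pairs:
        if score < min_iou:
            break
        if tid not in matches and i not in used:
            matches[tid] = i
            used.add(i)
    return matches

def associate(tracks, detections, high=0.6, low=0.1, min_iou=0.3):
    """Two-stage association. detections is a list of (box, score) pairs.

    Returns (updated_tracks, new_boxes): matched tracks with their new box,
    and unmatched high-score boxes that should start new tracks.
    """
    strong = [b for b, s in detections if s >= high]
    weak = [b for b, s in detections if low <= s < high]
    # Stage 1: match all tracks against high-score detections.
    first = greedy_match(tracks, strong, min_iou)
    updated = {tid: strong[i] for tid, i in first.items()}
    # Stage 2: give unmatched tracks a second chance with low-score boxes,
    # which often belong to partially occluded objects.
    leftover = {tid: box for tid, box in tracks.items() if tid not in first}
    second = greedy_match(leftover, weak, min_iou)
    updated.update({tid: weak[i] for tid, i in second.items()})
    new_boxes = [b for i, b in enumerate(strong) if i not in first.values()]
    return updated, new_boxes
```

Low-score detections are never used to start new tracks in this sketch; they only keep existing tracks alive, which is what lets the approach retain objects through occlusion without adding many false positives.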

Expected outcomes: MediaPipe graphs with the calculator implementation for the ByteTrack algorithm with YOLO models.

Skills required/preferred: C++ (for writing the calculator), Python (for writing the client), MediaPipe

Mentors: Dariusz Trawinski, Damian Kalinowski

Size of project: 175 hours

Difficulty: Medium

20. Demonstrating integration of open-webui with OpenVINO Model Server

Short description: Open-WebUI is a very popular component that provides a user interface to generative models. It supports use cases related to text generation, RAG, image generation, and many more. It also supports integration with remote execution servings compatible with standard APIs like OpenAI for chat completions and image generation.  

The goal of this project is to integrate open-webui with OpenVINO Model Server. It would include instructions for deploying the serving with a set of models and configuring open-webui to delegate generation to the serving endpoints.

Expected outcomes: Recipe for deploying open-webui with an instance of OpenVINO Model Server, including chat, RAG, and image generation. A report on the usability experience and a gap analysis.

Skills required/preferred: Python, LLMs

Mentors: Dariusz Trawinski, Milosz Zeglarski

Size of project: 90 hours

Difficulty: Easy to medium

21. Improving OpenVINO Test Drive RAG

Short description: OpenVINO Test Drive allows users to quickly and easily test a variety of GenAI models, such as image generation, Whisper, and text generation. Although text generation models do a good job of answering questions, they often lack crucial domain-specific knowledge. RAG (retrieval-augmented generation) aims to solve this issue by allowing the user to supply this knowledge to the model. OpenVINO Test Drive has a basic RAG implementation. However, the quality of the output can be greatly improved by preprocessing documents. This project will explore and develop a more sophisticated RAG implementation to be integrated into OpenVINO Test Drive.

Expected outcomes: Implementation of a more sophisticated RAG into OpenVINO Test Drive.

Skills required/preferred: Dart, Flutter, LangChain, OpenVINO

Mentors: Ronald Hecker, Arend Jan Kramer

Size of project: 175 hours

Difficulty: Medium
