Turn management based on VAP

sagatake edited this page Apr 4, 2025 · 9 revisions

CAUTION: UNDER CONSTRUCTION!!

A turn management module based on the voice activity projection (VAP) model.

This module takes into account signals from both the user and the agent.

Currently, there are two modes:

  • audio VAP
  • audio and face embed VAP

Installation

Please follow the common installation instructions.

Download models

  1. Download and uncompress the dlib face detection models from here and here, then place them into bin/Common/Data/TurnManagement/dlib_models
  2. Download all VAP models in model/VAP from here and place them into bin/Common/Data/TurnManagement/models/
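
To confirm that both downloads landed in the right place, a small check like the following may help. It is only a sketch: the two directory paths are taken from the steps above, while the function name `missing_model_dirs` and the assumption that the script runs from the Greta root directory are illustrative and not part of Greta.

```python
from pathlib import Path

# Directories named in the download steps, relative to the Greta root.
MODEL_DIRS = (
    "bin/Common/Data/TurnManagement/dlib_models",
    "bin/Common/Data/TurnManagement/models",
)


def missing_model_dirs(greta_root="."):
    """Return the expected model directories that are absent or empty.

    Hypothetical helper: it only checks that each directory exists and
    contains at least one entry, not that the files are the right ones.
    """
    root = Path(greta_root)
    missing = []
    for rel in MODEL_DIRS:
        path = root / rel
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Assumes the current working directory is the Greta root.
    for rel in missing_model_dirs("."):
        print(f"missing or empty: {rel}")
```

If the script prints nothing, both directories exist and are non-empty.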

Usage

  1. In Modular.jar, add the Microphone module from [Add -> Input -> Microphone]
  2. In Modular.jar, add the TurnManagement module from [Add -> Input -> Dialogue -> TurnManagement]
  3. Create the following connections in Modular.jar: Feedback -> TurnManagement, TurnManagement -> BehaviorPlanner
  4. While you are speaking and Greta is not talking, the module randomly picks one XML file from bin/Examples/DemoEN/backchannel, following several rules written in /bin/Common/Data/TurnManagement/turnManager.py
  5. Some of these rules are based on the previous backchannels module
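
The behaviour in step 4 can be pictured with the sketch below. It is not the code in turnManager.py, which applies additional rules not described on this page. It only illustrates the stated condition (user speaking, Greta silent) and the random choice of an XML file. The function name `pick_backchannel` and its parameters are hypothetical.

```python
import random
from pathlib import Path

# Directory named in step 4, relative to the Greta root.
BACKCHANNEL_DIR = Path("bin/Examples/DemoEN/backchannel")


def pick_backchannel(user_speaking, agent_speaking,
                     directory=BACKCHANNEL_DIR, rng=random):
    """Return a randomly chosen backchannel XML path, or None.

    Illustrative only: a backchannel is considered when the user is
    speaking and the agent is not. The real module applies further rules.
    """
    if not user_speaking or agent_speaking:
        return None
    # Sort so the candidate order is stable before the random draw.
    candidates = sorted(Path(directory).glob("*.xml"))
    if not candidates:
        return None
    return rng.choice(candidates)


if __name__ == "__main__":
    # Assumes the current working directory is the Greta root.
    print(pick_backchannel(user_speaking=True, agent_speaking=False))
```

Passing `rng=random.Random(0)` (or any seeded generator) makes the choice reproducible, which can be handy when testing.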

Note

License for pretrained models

  • The pre-trained CPC model, located at encoders/cpc/60k_epoch4-d0f474de.pt, comes from the original CPC project; please follow its license. Refer to the original repository (https://github.com/facebookresearch/CPC_audio) for details.
  • The pre-trained FormerDFER model, located at encoders/FormerDFER/DFER_encoder_weight_only.pt, is a simplified version of the original pre-trained model (model_set_1.pt) from the Former-DFER project, with the temporal transformer and linear layer removed. Please follow its license. Refer to the original repository (https://github.com/zengqunzhao/Former-DFER) for details.

Screenshot
