[Feature] AvatarChatbot: Allow user to choose avatar images and model type between runs, instead of just the audio question #1055

Open
ctao456 opened this issue Nov 1, 2024 · 0 comments

Priority

P4-Low

OS type

Ubuntu

Hardware type

Gaudi2

Running nodes

Single Node

Description

The input to the Wav2Lip FastAPI server (wav2lip_server.py) is currently a POST request that contains only the "audio" field, a base64 string holding the audio question.

The avatar image/video path and the model type, by contrast, are passed through the environment variables $FACE and $INFERENCE_MODE, respectively. They are set in entrypoint.sh before docker run creates the Wav2Lip FastAPI container, so once the container is running they are fixed values inside it.
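
For reference, a minimal sketch of the current pattern, assuming the server reads the environment variables once at startup; names, defaults, and the endpoint path are illustrative, not the exact code in wav2lip_server.py:

```python
# Sketch of the current behaviour: the avatar path and model type are read from
# environment variables once at startup, so they cannot change between requests.
import os

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Fixed for the lifetime of the container; set via entrypoint.sh before docker run.
FACE = os.environ.get("FACE", "assets/avatar1.jpg")           # avatar image/video path
INFERENCE_MODE = os.environ.get("INFERENCE_MODE", "wav2lip")  # model type


def animate(audio_b64: str, face: str, inference_mode: str) -> dict:
    # Stand-in for the existing animate() in wav2lip_server.py.
    return {"wav2lip_result": "..."}


class AudioRequest(BaseModel):
    audio: str  # base64 string for the audio question


@app.post("/v1/wav2lip")
def generate(request: AudioRequest):
    # Only the audio comes from the request; face and mode are container-level.
    return animate(request.audio, FACE, INFERENCE_MODE)
```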

We need a smarter way to merge these settings into the FastAPI POST request. We should create a new datatype for the animation microservice in docarray.py, use it to carry the additional fields "face" and "inference_model" alongside the audio, and pass those values to the animate function in wav2lip_server.py (see the sketch below).
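
One possible shape for the new datatype and the per-request flow, as a sketch only: the class name AnimationDoc, the defaults, and the endpoint path are placeholders, and the real definition would sit alongside the existing types in docarray.py.

```python
# Hypothetical datatype for the animation microservice; the field names follow the
# issue text, everything else (class name, defaults, endpoint path) is a placeholder.
from docarray import BaseDoc
from fastapi import FastAPI

app = FastAPI()


class AnimationDoc(BaseDoc):
    audio: str                        # base64 string for the audio question (existing field)
    face: str = "assets/avatar1.jpg"  # avatar image/video path, replacing $FACE
    inference_model: str = "wav2lip"  # model type, replacing $INFERENCE_MODE


def animate(audio_b64: str, face: str, inference_model: str) -> dict:
    # Stand-in for the existing animate() in wav2lip_server.py.
    return {"wav2lip_result": "..."}


@app.post("/v1/wav2lip")
def generate(doc: AnimationDoc):
    # animate() now receives per-request values instead of container-level
    # environment variables, so avatar and model can change between runs.
    return animate(doc.audio, doc.face, doc.inference_model)
```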

We also need to make corresponding changes in the UI, so the user can choose from preview avatar images/videos and preset model types.

The outcome of this change is that the user can change the avatar and the inference model type, not just the audio question, between one run and the next.
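
With that schema in place, a client or the UI could switch avatar and model between runs without restarting the container; the endpoint URL, file names, and model values below are illustrative only.

```python
# Illustrative client calls; URL, avatar paths, and model names are placeholders.
import base64

import requests

with open("question.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode("utf-8")

# First run: default avatar and model.
requests.post(
    "http://localhost:7860/v1/wav2lip",
    json={"audio": audio_b64, "face": "assets/avatar1.jpg", "inference_model": "wav2lip"},
)

# Second run: a different avatar and model type, chosen from the UI presets.
requests.post(
    "http://localhost:7860/v1/wav2lip",
    json={"audio": audio_b64, "face": "assets/avatar2.mp4", "inference_model": "wav2lip_gfpgan"},
)
```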

@louie-tsai louie-tsai self-assigned this Nov 2, 2024