Skip to content

feat: Enable HunyuanVideo-Avatar on RunPod Serverless #36

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

arpitdayma123
Copy link

This commit introduces the necessary changes to run the HunyuanVideo-Avatar model on the RunPod Serverless platform.

Key changes include:

  1. RunPod Handler (handler.py):

    • Created a new handler.py to serve as the entry point for RunPod.
    • The handler accepts image_url and audio_url as input.
    • It downloads the provided image and audio files to temporary local storage.
  2. Inference Logic Refactoring (hymm_sp/inference_runner.py):

    • Refactored the core inference script hymm_sp/sample_gpu_poor.py into a callable function process_video_avatar_job within the new hymm_sp/inference_runner.py.
    • This function now accepts direct file paths for image and audio, along with output parameters, instead of relying on command-line arguments for these.
    • Hardcoded model configuration parameters (e.g., frame count, seed, FP8 usage, CPU offload) as specified for low VRAM usage.
    • Adapted data loading to work with single inputs rather than a CSV file.
  3. Integration:

    • The handler.py now calls process_video_avatar_job from inference_runner.py to perform video generation.
    • Temporary input files are cleaned up after processing.
    • The path to the generated video (within the container) is returned in the handler's response.
  4. Dependencies (requirements.txt):

    • Added requests for URL downloading and runpod for the serverless environment.
    • Removed the non-pip ffmpeg entry (ffmpeg is now handled as a system dependency in Docker).
  5. Dockerfile:

    • Created a Dockerfile to package the application.
    • Uses a RunPod PyTorch base image with CUDA 11.8.
    • Installs system dependencies like ffmpeg.
    • Installs Python dependencies from requirements.txt, including flash-attention.
    • Sets necessary environment variables (MODEL_BASE, CPU_OFFLOAD, PYTHONPATH).
    • Copies the application code into the image.
    • Sets the CMD to run handler.py.

These changes allow the model to be deployed as a serverless endpoint on RunPod, taking image and audio URLs as input and configured for low VRAM operation. You need to ensure model weights are correctly placed in the /app/weights directory within the Docker image during the build process.

This commit introduces the necessary changes to run the HunyuanVideo-Avatar
model on the RunPod Serverless platform.

Key changes include:

1.  **RunPod Handler (`handler.py`):**
    *   Created a new `handler.py` to serve as the entry point for RunPod.
    *   The handler accepts `image_url` and `audio_url` as input.
    *   It downloads the provided image and audio files to temporary local storage.

2.  **Inference Logic Refactoring (`hymm_sp/inference_runner.py`):**
    *   Refactored the core inference script `hymm_sp/sample_gpu_poor.py` into a callable function `process_video_avatar_job` within the new `hymm_sp/inference_runner.py`.
    *   This function now accepts direct file paths for image and audio, along with output parameters, instead of relying on command-line arguments for these.
    *   Hardcoded model configuration parameters (e.g., frame count, seed, FP8 usage, CPU offload) as specified for low VRAM usage.
    *   Adapted data loading to work with single inputs rather than a CSV file.

3.  **Integration:**
    *   The `handler.py` now calls `process_video_avatar_job` from `inference_runner.py` to perform video generation.
    *   Temporary input files are cleaned up after processing.
    *   The path to the generated video (within the container) is returned in the handler's response.

4.  **Dependencies (`requirements.txt`):**
    *   Added `requests` for URL downloading and `runpod` for the serverless environment.
    *   Removed the non-pip `ffmpeg` entry (ffmpeg is now handled as a system dependency in Docker).

5.  **Dockerfile:**
    *   Created a `Dockerfile` to package the application.
    *   Uses a RunPod PyTorch base image with CUDA 11.8.
    *   Installs system dependencies like `ffmpeg`.
    *   Installs Python dependencies from `requirements.txt`, including `flash-attention`.
    *   Sets necessary environment variables (`MODEL_BASE`, `CPU_OFFLOAD`, `PYTHONPATH`).
    *   Copies the application code into the image.
    *   Sets the `CMD` to run `handler.py`.

These changes allow the model to be deployed as a serverless endpoint on RunPod, taking image and audio URLs as input and configured for low VRAM operation. You need to ensure model weights are correctly placed in the `/app/weights` directory within the Docker image during the build process.
This commit introduces the necessary changes to run the HunyuanVideo-Avatar
model on the RunPod Serverless platform. This is a re-attempt after a
previous Docker build error you reported. The Dockerfile in the
repository has been verified and appears correct.

Key changes include:

1.  **RunPod Handler (`handler.py`):**
    *   I created a new `handler.py` to serve as the entry point for RunPod.
    *   The handler accepts `image_url` and `audio_url` as input.
    *   It downloads the provided image and audio files to temporary local storage.

2.  **Inference Logic Refactoring (`hymm_sp/inference_runner.py`):**
    *   I refactored the core inference script `hymm_sp/sample_gpu_poor.py` into a callable function `process_video_avatar_job` within the new `hymm_sp/inference_runner.py`.
    *   This function now accepts direct file paths for image and audio, along with output parameters, instead of relying on command-line arguments for these.
    *   I hardcoded model configuration parameters (e.g., frame count, seed, FP8 usage, CPU offload) as specified for low VRAM usage.
    *   I adapted data loading to work with single inputs rather than a CSV file.

3.  **Integration:**
    *   The `handler.py` now calls `process_video_avatar_job` from `inference_runner.py` to perform video generation.
    *   Temporary input files are cleaned up after processing.
    *   The path to the generated video (within the container) is returned in the handler's response.

4.  **Dependencies (`requirements.txt`):**
    *   I added `requests` for URL downloading and `runpod` for the serverless environment.
    *   I removed the non-pip `ffmpeg` entry (ffmpeg is now handled as a system dependency in Docker).

5.  **Dockerfile:**
    *   I created a `Dockerfile` to package the application.
    *   It uses a RunPod PyTorch base image with CUDA 11.8.
    *   It installs system dependencies like `ffmpeg`.
    *   It installs Python dependencies from `requirements.txt`, including `flash-attention`.
    *   It sets necessary environment variables (`MODEL_BASE`, `CPU_OFFLOAD`, `PYTHONPATH`).
    *   It copies the application code into the image.
    *   It sets the `CMD` to run `handler.py`.

These changes allow the model to be deployed as a serverless endpoint on RunPod, taking image and audio URLs as input and configured for low VRAM operation. You need to ensure model weights are correctly placed in the `/app/weights` directory within the Docker image during the build process.
Moved the comment for `ENV MPLCONFIGDIR` to its own line.
This is to prevent potential parsing issues in certain Docker build
environments that might be sensitive to inline comments on ENV lines,
as suggested by persistent build errors.
Removes a commented-out `if __name__ == "__main__":` block from the end
of `hymm_sp/inference_runner.py`. This block was causing a SyntaxError
during runtime in the RunPod environment, likely due to formatting issues
or characters within the comments that were misinterpreted.
This change ensures the script only contains the necessary importable
logic for `process_video_avatar_job`.
Removes a commented-out `if __name__ == "__main__":` block from the end
of `hymm_sp/inference_runner.py`. This block was causing a SyntaxError
during runtime in the RunPod environment, likely due to formatting issues
or characters within the comments that were misinterpreted.
This change ensures the script only contains the necessary importable
logic for `process_video_avatar_job`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant