
MultiDeviceModel problem #121

Open
@Chebart

Description

environment:

PCIe environment
transformers: 4.51.0
torch: 2.6.0
LLM-TPU: f5502b5 2025.03.29
tpu-mlir: 5eba2a0e2 2025.03.20
driver: release version 0.5.2, release date 20250411-003300

path:

/workspace/LLM-TPU/template/parallel_demo

command:

python3 pipeline.py --devid 4,5,6,7 --dir_path /workspace/LLM-TPU/inference_bmodels/Llama-3.2-3B-Instruct-w4-4dev

question:

Hello, I'm using the SC7-224T card with SOPHON SDK version 24.04.01 and libsophon version 0.5.2, and I've encountered an issue with the chat.cpp module in the template/parallel_demo directory. Previously, models configured to run on 1, 2, 4, and 8 chips all worked as expected. Now, however, only the 1-chip and 2-chip models load; the 4-chip and 8-chip configurations, which used to run without any issues, fail to load. You can download the full debug trace using this link. Do you have any idea what might be causing this or how I can fix it?
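
Since the 1- and 2-chip models still load, one way to narrow this down is to step through the device counts on the same chips and see exactly where loading breaks. Below is a minimal isolation sketch, not part of the repo; the `*-1dev` and `*-2dev` bmodel paths are guesses based on the 4-dev path above, so substitute the directories of your working single- and dual-chip models:

```python
# Re-run pipeline.py with increasing device counts to pinpoint where loading
# starts to fail. Assumptions: the 1-dev and 2-dev bmodels sit next to the
# 4-dev one, and pipeline.py exits non-zero (or crashes) on a load failure.
import subprocess

BMODEL_ROOT = "/workspace/LLM-TPU/inference_bmodels"
configs = {
    "4":       f"{BMODEL_ROOT}/Llama-3.2-3B-Instruct-w4-1dev",  # assumed path
    "4,5":     f"{BMODEL_ROOT}/Llama-3.2-3B-Instruct-w4-2dev",  # assumed path
    "4,5,6,7": f"{BMODEL_ROOT}/Llama-3.2-3B-Instruct-w4-4dev",  # from the report
}

for devids, dir_path in configs.items():
    cmd = ["python3", "pipeline.py", "--devid", devids, "--dir_path", dir_path]
    try:
        # The demo is interactive, so a run that reaches the chat prompt will
        # block; hitting the timeout therefore suggests the model loaded.
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
        status = "exited OK" if result.returncode == 0 else f"FAILED (rc={result.returncode})"
    except subprocess.TimeoutExpired:
        status = "loaded (blocked at interactive prompt)"
    print(f"devid={devids}: {status}")
```

If the 2-chip run succeeds on devices 4,5 but the 4-chip run fails, the problem is more likely in the 4-dev bmodel or the cross-chip setup than in any individual device; it may also be worth checking that the driver (0.5.2) matches the tpu-mlir and LLM-TPU revisions the 4-dev bmodel was compiled with.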
