[Question] How was the llama fp8 mlir generated? #11

@rednoah91
Hi @sogartar

Could you explain how this fp8 MLIR was generated, in which only the matmuls are fp8 while the other ops stay in f32?
https://sharkpublic.blob.core.windows.net/sharkpublic/dan/fp8_prefill.mlir
I cannot find any documentation describing how this MLIR was generated. Maybe you can point me to the doc, if one exists :)

Thanks
Hong-Rong
