How to debug with IREE when a neural network contains large weights? #20418
Answered by ScottTodd
FlintWangacc asked this question in Q&A
Recently I have been debugging the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B model on IREE. Its weights are about 15 GB, so when I try to dump the IR generated by IREE it takes a lot of disk space and a lot of time. Can we separate the code from the weights? Or is there another way to do this?
I found it is possible to use iree-turbine to make the neural network weights external.
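For anyone following along, here is a minimal sketch of that iree-turbine flow. The toy model, shapes, and file names are placeholders, and the aot function names are my reading of the iree.turbine.aot API, so verify them against the current iree-turbine docs:

```python
import torch
from iree.turbine import aot

# Placeholder module standing in for DeepSeek-R1-Distill-Qwen-1.5B.
model = torch.nn.Linear(4096, 4096)
example_input = torch.randn(1, 4096)

# Mark the module's parameters as externally provided instead of inlining
# them into the exported IR as dense constants.
aot.externalize_module_parameters(model)

# Write the actual weights to an IREE parameter archive (.irpa).
aot.save_module_parameters("model_params.irpa", model)

# Export the program itself; the MLIR stays small because the weights are
# now external parameter references rather than multi-gigabyte constants.
exported = aot.export(model, example_input)
exported.save_mlir("model.mlir")
```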
Answer from ScottTodd (Mar 31, 2025):

Yep, we recommend separating model parameters (weights) from program code. Here are our docs on using parameter files:
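As a rough sketch of that workflow (the file names, the parameter scope `model`, and the entry function are placeholders; the runtime tools take `--parameters=<scope>=<file>` as described in the parameters docs):

```shell
# Compile the weight-free program; the weights live in model_params.irpa.
iree-compile model.mlir --iree-hal-target-backends=llvm-cpu -o model.vmfb

# Inspect what the parameter archive contains.
iree-dump-parameters --parameters=model=model_params.irpa

# Run, supplying the weights separately from the compiled module.
iree-run-module --module=model.vmfb \
  --parameters=model=model_params.irpa \
  --function=main \
  --input=1x4096xf32=0
```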
Also, if you are dumping IR, see the IR printing options at https://mlir.llvm.org/docs/PassManagement/#ir-printing together with https://mlir.llvm.org/getting_started/Debugging/. You can do something like --mlir-print-ir-after-all --mlir-elide-elementsattrs-if-larger=64 to dump all IR while eliding (redacting) constants with more than 64 elements. IREE adds more options on top of those, like --dump-compilation-phases-to=${PATH}, which prints after high-level phases in t…
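Putting those flags together on an iree-compile invocation might look like this (paths and the target backend are placeholders; the per-pass IR is printed to stderr, so redirect it to a file):

```shell
# Dump IR after every pass, eliding any elements attribute with more than
# 64 elements so the multi-gigabyte weights never reach the log.
iree-compile model.mlir --iree-hal-target-backends=llvm-cpu -o model.vmfb \
  --mlir-print-ir-after-all \
  --mlir-elide-elementsattrs-if-larger=64 \
  2> ir_after_all.mlir

# Coarser-grained alternative: one IR snapshot per major compilation phase.
iree-compile model.mlir --iree-hal-target-backends=llvm-cpu -o model.vmfb \
  --dump-compilation-phases-to=/tmp/phases
```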