Issue with Trace Option Causing TypeError in mLoRA Training #268

Open
@EricLabile

I get the following error when running mLoRA training with the --trace option:

/u/.conda/envs/mlora/lib/python3.12/site-packages/bitsandbytes/autograd/functions.py:322: UserWarning: MatMul8bitLt: inputs will be cast from torch.float32 to float16 during quantization
  warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
Traceback (most recent call last):
  File "/mLoRA/mlora_train.py", line 68, in <module>
    executor.execute()
  File "/mLoRA/mlora/executor/executor.py", line 110, in execute
    output = self.model.forward(data.model_data())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mLoRA/mlora/model/llm/model_llama.py", line 174, in forward
    data = seq_layer.forward(data)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/mLoRA/mlora/model/llm/model_llama.py", line 138, in forward
    return forward_func_dictmodule_name
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mLoRA/mlora/model/llm/model_llama.py", line 108, in decoder_forward
    set_backward_tracepoint(output.grad_fn, "b_checkpoint")
  File "/mLoRA/mlora/profiler/profiler.py", line 139, in set_backward_tracepoint
    if TRACEPOINT_KEY in grad_fn.metadata():
                         ^^^^^^^^^^^^^^^^^^
TypeError: 'dict' object is not callable
Generating '/tmp/nsys-report-4fe1.qdstrm'

I executed the command:

nsys profile -w true -t cuda,nvtx -s none -o test_report -f true -x true python mlora_train.py --base_model TinyLlama/TinyLlama-1.1B-Chat-v0.4 --device "cuda:0" --config /projects/bcrn/mLoRA/demo/lora/lora_case_1.yaml --trace
or simply appended --trace to my usual training command.

Could you please help me understand why this error is occurring? And could you help me with using trace? Thanks!
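
A possible reading of the traceback above: the TypeError is raised because grad_fn.metadata is already a dict in this environment (PyTorch exposes autograd node metadata as a property), so calling it with () fails. The sketch below is not mLoRA's actual code. It shows a hypothetical helper, read_metadata, that accepts either form; FakeGradFn is a stand-in for a real autograd node, and the value of TRACEPOINT_KEY is invented for illustration.

```python
from typing import Any, Dict

# Hypothetical key value; the real TRACEPOINT_KEY is defined in mlora/profiler/profiler.py.
TRACEPOINT_KEY = "tracepoint"


def read_metadata(grad_fn: Any) -> Dict[str, Any]:
    """Return the metadata dict whether `metadata` is a property or a method."""
    meta = grad_fn.metadata
    return meta() if callable(meta) else meta


def has_tracepoint(grad_fn: Any) -> bool:
    """True if a tracepoint is already recorded on this node."""
    return grad_fn is not None and TRACEPOINT_KEY in read_metadata(grad_fn)


class FakeGradFn:
    """Stand-in for an autograd node whose `metadata` is a plain dict."""

    def __init__(self) -> None:
        self.metadata: Dict[str, Any] = {}


node = FakeGradFn()
print(has_tracepoint(node))  # False
node.metadata[TRACEPOINT_KEY] = "b_checkpoint"
print(has_tracepoint(node))  # True
```

If this reading is right, changing grad_fn.metadata() to grad_fn.metadata at profiler.py line 139 may be enough to get past the error, though that is untested here.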
