Which file do I need to change to solve this issue? I am working with Video-LLaVA, but I think this is an issue across all LLaVA models. Could the maintainers suggest where I should look to resolve this channel-mismatch error?
Here is the error message:
Adding LoRA adapters...
total data 10988
Formatting inputs...Skip in lazy mode
0%| | 0/86 [00:00<?, ?it/s]Traceback (most recent call last):
File "/home/hmbadal/AQA/ABC/Video-LLaVA/videollava/train/train_mem.py", line 13, in <module>
train()
File "/home/hmbadal/AQA/ABC/Video-LLaVA/videollava/train/train.py", line 1078, in train
trainer.train()
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/transformers/trainer.py", line 1539, in train
return inner_training_loop(
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/transformers/trainer.py", line 1809, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/transformers/trainer.py", line 2654, in training_step
loss = self.compute_loss(model, inputs)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/transformers/trainer.py", line 2679, in compute_loss
outputs = model(**inputs)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 171, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 181, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 89, in parallel_apply
output.reraise()
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/_utils.py", line 644, in reraise
raise exception
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 64, in _worker
output = module(*input, **kwargs)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/peft/peft_model.py", line 922, in forward
return self.base_model(
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hmbadal/AQA/ABC/Video-LLaVA/videollava/model/language_model/llava_llama.py", line 79, in forward
) = self.prepare_inputs_labels_for_multimodal(
File "/home/hmbadal/AQA/ABC/Video-LLaVA/videollava/model/llava_arch.py", line 207, in prepare_inputs_labels_for_multimodal
video_features_minibatch = self.encode_videos(videos_minibatch) # fake list [mini_b, t, l, c]
File "/home/hmbadal/AQA/ABC/Video-LLaVA/videollava/model/llava_arch.py", line 144, in encode_videos
video_features = self.get_model().get_video_tower()(videos) # [mini_b, t, n, c]
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/hmbadal/AQA/ABC/Video-LLaVA/videollava/model/multimodal_encoder/languagebind/__init__.py", line 227, in forward
video_forward_outs = self.video_tower(videos.to(device=self.device, dtype=self.dtype), output_hidden_states=True)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hmbadal/AQA/ABC/Video-LLaVA/videollava/model/multimodal_encoder/languagebind/video/modeling_video.py", line 646, in forward
hidden_states = self.embeddings(pixel_values)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 195, in forward
patch_embeds = self.patch_embedding(pixel_values) # shape = [*, width, grid, grid]
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/hmbadal/anaconda3/envs/badalbhai/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input[1024, 1, 224, 224] to have 3 channels, but got 1 channels instead
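For what it's worth, a minimal sketch of what the final error says: the CLIP patch embedding is a `Conv2d` with weight `[1024, 3, 14, 14]`, i.e. it expects 3-channel (RGB) frames, but the tensor reaching it is `[1024, 1, 224, 224]`, i.e. single-channel. The snippet below reproduces the mismatch and shows one possible workaround (replicating the channel); the tensor names are hypothetical, and this is an assumption about the cause, not a confirmed fix.

```python
import torch

# CLIP-style patch embedding, as in transformers' CLIPVisionEmbeddings:
# a Conv2d that expects 3 input channels.
patch_embedding = torch.nn.Conv2d(
    in_channels=3, out_channels=1024, kernel_size=14, stride=14, bias=False
)

# Hypothetical batch of grayscale frames, shaped like the error message.
gray_frames = torch.randn(1024, 1, 224, 224)

# patch_embedding(gray_frames)  # -> RuntimeError: expected 3 channels, got 1

# Possible workaround, assuming the frames really are grayscale:
# replicate the single channel to 3 before the video tower sees the tensor.
rgb_frames = gray_frames.expand(-1, 3, -1, -1)  # view, no copy; or .repeat(1, 3, 1, 1)
patch_embeds = patch_embedding(rgb_frames)
print(patch_embeds.shape)  # torch.Size([1024, 1024, 16, 16])
```

If your source videos are actually RGB, the channel expand above is only a band-aid; the root cause is more likely in the video decoding/preprocessing (or in a reshape that collapses the channel dimension into the batch), so the places the traceback points at are worth checking: the data loading in `videollava/train/train.py` and the video path in `videollava/model/multimodal_encoder/languagebind/`.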