Environment issue #179
Comments
Could you try pip3 install -r requirements.txt directly?
I tried it, but it leads to a Python version conflict: #177. Which Python version do you use?
Fix: #180
With the new requirements from #180 I still had the same issue. However, switching to CUDA 12.4 solved the problem! Previously I had CUDA 11.8. I saw you added the CUDA version to the Readme, thanks!
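For anyone landing here with the same error, a quick check along these lines might help confirm whether the installed PyTorch and fbgemm_gpu builds work together before starting training. This is only a sketch: the helper name `fbgemm_env_report` is made up for illustration, and it assumes that importing `fbgemm_gpu` is what registers the `torch.ops.fbgemm` operators.

```python
import importlib
import importlib.util


def fbgemm_env_report():
    """Collect version info relevant to a PyTorch / fbgemm_gpu mismatch.

    Hypothetical helper for illustration; not part of the repository.
    """
    report = {"torch": None, "torch_cuda": None, "fbgemm_gpu": None, "cumsum_op": False}

    # Bail out early if PyTorch is missing or cannot be imported.
    if importlib.util.find_spec("torch") is None:
        return report
    try:
        import torch
    except Exception as exc:
        report["torch"] = f"import failed: {exc}"
        return report

    report["torch"] = torch.__version__
    report["torch_cuda"] = torch.version.cuda  # None for CPU-only builds

    # Importing fbgemm_gpu is assumed to load its shared library and register ops.
    try:
        fbgemm_gpu = importlib.import_module("fbgemm_gpu")
        report["fbgemm_gpu"] = getattr(fbgemm_gpu, "__version__", "unknown")
    except Exception as exc:
        report["fbgemm_gpu"] = f"import failed: {exc}"
        return report

    # The traceback in this issue fails on exactly this attribute lookup.
    try:
        torch.ops.fbgemm.asynchronous_complete_cumsum
        report["cumsum_op"] = True
    except (AttributeError, RuntimeError):
        report["cumsum_op"] = False
    return report


if __name__ == "__main__":
    for key, value in fbgemm_env_report().items():
        print(f"{key}: {value}")
```

If `cumsum_op` comes back `False` while both packages import, the operator library probably did not load, which would match the warnings in the log.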
Can you please share more details about the environment? I was trying to run training based on your instructions, but it produced an error (pasted below). I am running the code on Ubuntu 22 with CUDA 11.8 and Python 3.11. I installed PyTorch 2.6 and then installed the remaining dependencies with:
pip3 install gin-config absl-py scikit-learn scipy matplotlib numpy apex hypothesis pandas fbgemm_gpu iopath tensorboard
Error message:
`WARNING:root:Could not the library 'fbgemm_gpu_py.so': /home/mateusz.marzec/.pyenv/versions/3.11.9/envs/generative-recommenders/lib/python3.11/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. This may be expected depending on the FBGEMM_GPU variant.
WARNING:root:Could not the library 'fbgemm_gpu_py.so': /home/mateusz.marzec/.pyenv/versions/3.11.9/envs/generative-recommenders/lib/python3.11/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. This may be expected depending on the FBGEMM_GPU variant.
Initialize _item_emb.weight as truncated normal: torch.Size([695763, 64]) params
Skipping init for _embedding_module._item_emb.weight
Initialize _input_features_preproc._pos_emb.weight as xavier normal: torch.Size([61, 64]) params
Skipping init for _hstu._attention_layers.0._uvqk
Skipping init for _hstu._attention_layers.0._rel_attn_bias._ts_w
Skipping init for _hstu._attention_layers.0._rel_attn_bias._pos_w
Skipping init for _hstu._attention_layers.0._o.weight
Skipping init for _hstu._attention_layers.0._o.bias
Skipping init for _hstu._attention_layers.1._uvqk
Skipping init for _hstu._attention_layers.1._rel_attn_bias._ts_w
Skipping init for _hstu._attention_layers.1._rel_attn_bias._pos_w
Skipping init for _hstu._attention_layers.1._o.weight
Skipping init for _hstu._attention_layers.1._o.bias
Skipping init for _hstu._attention_layers.2._uvqk
Skipping init for _hstu._attention_layers.2._rel_attn_bias._ts_w
Skipping init for _hstu._attention_layers.2._rel_attn_bias._pos_w
Skipping init for _hstu._attention_layers.2._o.weight
Skipping init for _hstu._attention_layers.2._o.bias
Skipping init for _hstu._attention_layers.3._uvqk
Skipping init for _hstu._attention_layers.3._rel_attn_bias._ts_w
Skipping init for _hstu._attention_layers.3._rel_attn_bias._pos_w
Skipping init for _hstu._attention_layers.3._o.weight
Skipping init for _hstu._attention_layers.3._o.bias
WARNING:root:Could not the library 'fbgemm_gpu_py.so': /home/mateusz.marzec/.pyenv/versions/3.11.9/envs/generative-recommenders/lib/python3.11/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. This may be expected depending on the FBGEMM_GPU variant.
WARNING:root:Could not the library 'fbgemm_gpu_py.so': /home/mateusz.marzec/.pyenv/versions/3.11.9/envs/generative-recommenders/lib/python3.11/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. This may be expected depending on the FBGEMM_GPU variant.
WARNING:root:Could not the library 'fbgemm_gpu_py.so': /home/mateusz.marzec/.pyenv/versions/3.11.9/envs/generative-recommenders/lib/python3.11/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. This may be expected depending on the FBGEMM_GPU variant.
WARNING:root:Could not the library 'fbgemm_gpu_py.so': /home/mateusz.marzec/.pyenv/versions/3.11.9/envs/generative-recommenders/lib/python3.11/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. This may be expected depending on the FBGEMM_GPU variant.
WARNING:root:Could not the library 'fbgemm_gpu_py.so': /home/mateusz.marzec/.pyenv/versions/3.11.9/envs/generative-recommenders/lib/python3.11/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. This may be expected depending on the FBGEMM_GPU variant.
WARNING:root:Could not the library 'fbgemm_gpu_py.so': /home/mateusz.marzec/.pyenv/versions/3.11.9/envs/generative-recommenders/lib/python3.11/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. This may be expected depending on the FBGEMM_GPU variant.
WARNING:root:Could not the library 'fbgemm_gpu_py.so': /home/mateusz.marzec/.pyenv/versions/3.11.9/envs/generative-recommenders/lib/python3.11/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. This may be expected depending on the FBGEMM_GPU variant.
WARNING:root:Could not the library 'fbgemm_gpu_py.so': /home/mateusz.marzec/.pyenv/versions/3.11.9/envs/generative-recommenders/lib/python3.11/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. This may be expected depending on the FBGEMM_GPU variant.
Traceback (most recent call last):
File "/home/mateusz.marzec/custom/generative-recommenders/main.py", line 85, in <module>
main()
File "/home/mateusz.marzec/custom/generative-recommenders/main.py", line 81, in main
app.run(_main)
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
^^^^^^^^^^
File "/home/mateusz.marzec/custom/generative-recommenders/main.py", line 72, in _main
mp.spawn(
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/torch/multiprocessing/spawn.py", line 282, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method="spawn")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/torch/multiprocessing/spawn.py", line 238, in start_processes
while not context.join():
^^^^^^^^^^^^^^
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/torch/multiprocessing/spawn.py", line 189, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/torch/multiprocessing/spawn.py", line 76, in _wrap
fn(i, *args)
File "/home/mateusz.marzec/custom/generative-recommenders/main.py", line 65, in mp_train_fn
train_fn(rank, world_size, master_port)
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/gin/config.py", line 1605, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/gin/config.py", line 1582, in gin_wrapper
return fn(*new_args, **new_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/mateusz.marzec/custom/generative-recommenders/generative_recommenders/trainer/train.py", line 333, in train_fn
eval_dict = eval_metrics_v2_from_tensors(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/mateusz.marzec/custom/generative-recommenders/generative_recommenders/data/eval.py", line 103, in eval_metrics_v2_from_tensors
shared_input_embeddings = model.encode(
^^^^^^^^^^^^^
File "/home/mateusz.marzec/custom/generative-recommenders/generative_recommenders/modeling/sequential/hstu.py", line 799, in encode
return self._encode(
^^^^^^^^^^^^^
File "/home/mateusz.marzec/custom/generative-recommenders/generative_recommenders/modeling/sequential/hstu.py", line 760, in _encode
encoded_seq_embeddings, cache_states = self.generate_user_embeddings(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/mateusz.marzec/custom/generative-recommenders/generative_recommenders/modeling/sequential/hstu.py", line 696, in generate_user_embeddings
x_offsets=torch.ops.fbgemm.asynchronous_complete_cumsum(past_lengths),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/mateusz.marzec/.pyenv/versions/generative-recommenders/lib/python3.11/site-packages/torch/_ops.py", line 1170, in __getattr__
raise AttributeError(
AttributeError: '_OpNamespace' 'fbgemm' object has no attribute 'asynchronous_complete_cumsum'
In call to configurable 'train_fn' (<function train_fn at 0x7fccf13e82c0>)`
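
A note on reading the log above: the repeated warning names an undefined symbol in the `at` namespace (ATen, which ships with PyTorch), so one plausible reading is that `fbgemm_gpu_py.so` was built against a different PyTorch build than the one installed. That is an interpretation, not something confirmed in this thread. The sketch below pulls the distinct undefined symbols out of such a log and demangles the simplest ones. Both function names are hypothetical, and the demangler deliberately handles only plain nested names with builtin parameter types; for anything else a real tool such as `c++filt` is the better choice.

```python
import re

# Single-letter builtin type codes from the Itanium C++ ABI (small subset).
_BUILTINS = {
    "v": "void", "b": "bool", "c": "char", "i": "int", "j": "unsigned int",
    "l": "long", "m": "unsigned long", "f": "float", "d": "double",
}


def undefined_symbols(log_text):
    """Return distinct undefined symbols from the log, in first-seen order."""
    seen = []
    for symbol in re.findall(r"undefined symbol: (\w+)", log_text):
        if symbol not in seen:
            seen.append(symbol)
    return seen


def demangle_simple(symbol):
    """Demangle names of the form _ZN<len><id>...E<builtin params>.

    Returns None for anything more complex (templates, substitutions, etc.).
    """
    if not symbol.startswith("_ZN"):
        return None
    pos, parts = 3, []
    # Each component is a decimal length followed by that many characters.
    while pos < len(symbol) and symbol[pos] != "E":
        match = re.match(r"\d+", symbol[pos:])
        if match is None:
            return None
        length = int(match.group())
        pos += match.end()
        parts.append(symbol[pos:pos + length])
        pos += length
    if pos >= len(symbol) or not parts:
        return None
    params = symbol[pos + 1:]
    if not params or any(code not in _BUILTINS for code in params):
        return None
    args = "" if params == "v" else ", ".join(_BUILTINS[code] for code in params)
    return "::".join(parts) + f"({args})"


if __name__ == "__main__":
    sample = (
        "WARNING:root:Could not the library 'fbgemm_gpu_py.so': "
        "undefined symbol: _ZN2at23SavedTensorDefaultHooks11set_tracingEb. "
        "This may be expected depending on the FBGEMM_GPU variant."
    )
    for symbol in undefined_symbols(sample):
        print(symbol, "->", demangle_simple(symbol))
```

Applied to the warning in this issue, the symbol reads as `at::SavedTensorDefaultHooks::set_tracing(bool)`.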