Description
I am new to AI and am trying to run the llama2 model locally using pyllama.
I tried different options, but nothing seems to work. I downloaded llama using https://github.com/facebookresearch/llama.
Here is what I tried (see below for installed packages):
$ torchrun --nproc_per_node 1 example.py --ckpt_dir ../codellama/CodeLlama-7b/ --tokenizer_path ../codellama/CodeLlama-7b/tokenizer.model
Traceback (most recent call last):
File "/home/xxxxx/pyllama/example.py", line 80, in <module>
fire.Fire(main)
File "/home/xxxxx/miniconda3/envs/llama2/lib/python3.11/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
..
File "/home/xxxxx/miniconda3/envs/llama2/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py", line 1268, in _new_process_group_helper
raise RuntimeError("Distributed package doesn't have NCCL built in")
RuntimeError: Distributed package doesn't have NCCL built in
[2024-01-01 20:58:30,998] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 1814953) of binary: /home/xxxxx/miniconda3/envs/llama2/bin/python
Traceback (most recent call last):
..
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
The command below seems to start, but I never get any response:
$ KV_CACHE_IN_GPU=0 python inference.py --ckpt_dir ../codellama/CodeLlama-7b/ --tokenizer_path ../codellama/CodeLlama-7b/tokenizer.model
.. <after waiting for several seconds, I typed in the following prompt and pressed Enter> ..
Prompt:['I believe in ']
<no response whatsoever>
I tried both the CUDA and non-CUDA PyTorch packages from https://pytorch.org/get-started/locally/, for example:
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
Both give the same NCCL error from torchrun and no output from inference.py.
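A quick check of what the installed PyTorch build actually supports might help narrow this down. The snippet below is only a minimal sketch: `pick_backend` and `describe_environment` are hypothetical helper names (not part of pyllama or llama), and it relies only on `torch.cuda.is_available()`, `torch.distributed.is_available()`, and `torch.distributed.is_nccl_available()`. Treating gloo as the fallback is an assumption about what a CPU-only setup would need; example.py may still request NCCL explicitly.

```python
"""Report whether the installed PyTorch build can use CUDA and NCCL."""


def pick_backend(cuda_available: bool, nccl_available: bool) -> str:
    """Return the torch.distributed backend name that is likely usable.

    NCCL needs both a CUDA device and a PyTorch build compiled with NCCL;
    gloo is the backend that also works on CPU-only machines.
    """
    return "nccl" if (cuda_available and nccl_available) else "gloo"


def describe_environment() -> dict:
    """Collect CUDA/NCCL facts about the current Python environment."""
    try:
        import torch
        import torch.distributed as dist
    except ImportError:
        # PyTorch is not installed in this environment at all.
        return {"torch_installed": False}

    cuda = torch.cuda.is_available()
    # is_nccl_available() is only meaningful if the distributed package exists.
    nccl = dist.is_available() and dist.is_nccl_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": cuda,
        "nccl_available": nccl,
        "suggested_backend": pick_backend(cuda, nccl),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this inside the same conda environment as torchrun shows whether the "Distributed package doesn't have NCCL built in" error comes from the PyTorch build itself or from CUDA not being usable on this machine.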
I am on an HP workstation running Ubuntu 23.04 (Lunar Lobster):
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU W3565 @ 3.20GHz
CPU family: 6
Model: 26
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: 5
$ sudo lshw -numeric -C display
..
*-display
description: VGA compatible controller
product: G94GL [Quadro FX 1800] [10DE:638]
vendor: NVIDIA Corporation [10DE]
...