Hi,
I followed every instruction in the README, except that my installed torch version is 1.13.1. I have CUDA version 12.2, and I installed torch with this command: `conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia`
Then, when I run `run_epr.sh`, I get the error below. The full output is very long, so I am only pasting part of it; I would be happy to provide more detail if needed.
(icl) [conda] [lh599@corfu:icl-ceil]$ /research/cbim/medical/lh599/research/ruijiang/Dong/demonstration_selection/icl-ceil/scripts/run_epr.sh
bm25_retriever.py:94: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path="configs", config_name="bm25_retriever")
[2024-07-29 11:21:17,148][__main__][INFO] - {'output_file': 'output/epr/qnli/EleutherAI/gpt-neo-2.7B/retrieved.json', 'num_candidates': 50, 'num_ice': 1, 'task_name': 'qnli', 'query_field': 'a', 'dataset_split': 'train', 'ds_size': 44000, 'index_reader': {'_target_': 'src.dataset_readers.index_dsr.IndexDatasetReader', 'task_name': '${task_name}', 'model_name': 'bert-base-uncased', 'field': 'a', 'dataset_split': 'train', 'dataset_path': 'index_data/qnli/index_dataset.json', 'ds_size': None}}
Downloading and preparing dataset None/ax to /research/cbim/vast/lh599/.cache/huggingface/datasets/parquet/ax-2087346d15759eef/0.0.0/14a00e99c0d15a23649d0db8944380ac81082d4b021f398733dd84f3a6c569a7...
Downloading data files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 347.79it/s]
Extracting data files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 33.06it/s]
Generating train split: 0 examples [00:00, ? examples/s]
[2024-07-29 11:21:19,127][datasets.packaged_modules.parquet.parquet][ERROR] - Failed to read file '/research/cbim/vast/lh599/.cache/huggingface/datasets/downloads/34b0e169567d6cc580fb18f6593c0f0d3d4ade7ca85314eaa32071da0c48b253' with error <class 'ValueError'>: Couldn't cast
sentence: string
label: int64
idx: int32
-- schema metadata --
huggingface: '{"info": {"features": {"sentence": {"dtype": "string", "_ty' + 136
to
{'premise': Value(dtype='string', id=None), 'hypothesis': Value(dtype='string', id=None), 'label': ClassLabel(names=['entailment', 'neutral', 'contradiction'], id=None), 'idx': Value(dtype='int32', id=None)}
because column names don't match
Error executing job with overrides: ['output_file=output/epr/qnli/EleutherAI/gpt-neo-2.7B/retrieved.json', 'num_candidates=50', 'num_ice=1', 'task_name=qnli', 'index_reader.dataset_path=index_data/qnli/index_dataset.json', 'dataset_split=train', 'ds_size=44000', 'query_field=a', 'index_reader.field=a']
Traceback (most recent call last):
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/datasets/builder.py", line 1879, in _prepare_split_single
for _, table in generator:
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/datasets/packaged_modules/parquet/parquet.py", line 82, in _generate_tables
yield f"{file_idx}_{batch_idx}", self._cast_table(pa_table)
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/datasets/packaged_modules/parquet/parquet.py", line 61, in _cast_table
pa_table = table_cast(pa_table, self.info.features.arrow_schema)
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/datasets/table.py", line 2324, in table_cast
return cast_table_to_schema(table, schema)
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/datasets/table.py", line 2282, in cast_table_to_schema
raise ValueError(f"Couldn't cast\n{table.schema}\nto\n{features}\nbecause column names don't match")
ValueError: Couldn't cast
sentence: string
label: int64
idx: int32
-- schema metadata --
huggingface: '{"info": {"features": {"sentence": {"dtype": "string", "_ty' + 136
to
{'premise': Value(dtype='string', id=None), 'hypothesis': Value(dtype='string', id=None), 'label': ClassLabel(names=['entailment', 'neutral', 'contradiction'], id=None), 'idx': Value(dtype='int32', id=None)}
because column names don't match
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 92, in _call_target
return _target_(*args, **kwargs)
File "/research/cbim/medical/lh599/research/ruijiang/Dong/demonstration_selection/icl-ceil/src/dataset_readers/index_dsr.py", line 33, in __init__
super().__init__(**kwargs)
File "/research/cbim/medical/lh599/research/ruijiang/Dong/demonstration_selection/icl-ceil/src/dataset_readers/base_dsr.py", line 40, in __init__
self.init_dataset(task_name, field, dataset_path, dataset_split, ds_size)
File "/research/cbim/medical/lh599/research/ruijiang/Dong/demonstration_selection/icl-ceil/src/dataset_readers/base_dsr.py", line 46, in init_dataset
ds_size=ds_size)
File "/research/cbim/medical/lh599/research/ruijiang/Dong/demonstration_selection/icl-ceil/src/dataset_readers/dataset_wrappers/__init__.py", line 5, in get_dataset_wrapper
return importlib.import_module('src.dataset_readers.dataset_wrappers.{}'.format(name)).DatasetWrapper(**kwargs)
File "/research/cbim/medical/lh599/research/ruijiang/Dong/demonstration_selection/icl-ceil/src/dataset_readers/dataset_wrappers/base_dsw.py", line 25, in __init__
self.dataset = load_dataset(self.hf_dataset, self.hf_dataset_name)
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/datasets/load.py", line 1815, in load_dataset
storage_options=storage_options,
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/datasets/builder.py", line 913, in download_and_prepare
**download_and_prepare_kwargs,
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/datasets/builder.py", line 1004, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/datasets/builder.py", line 1768, in _prepare_split
gen_kwargs=gen_kwargs, job_id=job_id, **_prepare_split_args
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/datasets/builder.py", line 1912, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.builder.DatasetGenerationError: An error occurred while generating the dataset
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "bm25_retriever.py", line 102, in <module>
main()
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/main.py", line 99, in decorated_main
config_name=config_name,
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/utils.py", line 401, in _run_hydra
overrides=overrides,
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/utils.py", line 458, in _run_app
lambda: hydra.run(
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
raise ex
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
return func()
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/utils.py", line 461, in <lambda>
overrides=overrides,
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "bm25_retriever.py", line 98, in main
find(cfg)
File "bm25_retriever.py", line 67, in find
knn_finder = BM25Finder(cfg)
File "bm25_retriever.py", line 23, in __init__
self.index_dataset = hu.instantiate(cfg.index_reader).dataset_wrapper
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 227, in instantiate
config, *args, recursive=_recursive_, convert=_convert_, partial=_partial_
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 347, in instantiate_node
return _call_target(_target_, partial, args, kwargs, full_key)
File "/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 97, in _call_target
raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error in call to target 'src.dataset_readers.index_dsr.IndexDatasetReader':
DatasetGenerationError('An error occurred while generating the dataset')
full_key: index_reader
[11:21:20] WARNING The following values were not passed to `accelerate launch` and had defaults used instead: launch.py:913
`--num_machines` was set to a value of `1`
`--mixed_precision` was set to a value of `'no'`
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
/research/cbim/medical/lh599/research/ruijiang/Dong/demonstration_selection/icl-ceil/inferencer.py:158: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path="configs", config_name="inferencer")
scorer.py:108: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path="configs", config_name="scorer")
/research/cbim/medical/lh599/research/ruijiang/miniconda/envs/icl/lib/python3.7/site-packages/hydra/_internal/defaults_list.py:251: UserWarning: In 'scorer': Defaults list is missing `_self_`. See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/default_composition_order for more information
warnings.warn(msg, UserWarning)
[2024-07-29 11:21:23,372][__main__][INFO] - {'model_config': {'model_type': 'hf', 'model': {'_target_': 'transformers.AutoModelForCausalLM.from_pretrained', 'pretrained_model_name_or_path': '${model_name}'}, 'generation_kwargs': {'temperature': 0, 'max_new_tokens': 300}}, 'model_name': 'EleutherAI/gpt-neo-2.7B', 'task_name': 'qnli', 'output_file': 'output/epr/qnli/EleutherAI/gpt-neo-2.7B/scored.json', 'batch_size': 8, 'dataset_reader': {'_target_': 'src.dataset_readers.scoring_dsr.ScorerDatasetReader', 'dataset_path': 'output/epr/qnli/EleutherAI/gpt-neo-2.7B/retrieved.json', 'dataset_split': None, 'ds_size': None, 'task_name': '${task_name}', 'model_name': '${model_name}', 'n_tokens': 1600, 'field': 'gen_a', 'index_reader': '${index_reader}'}, 'index_reader': {'_target_': 'src.dataset_readers.index_dsr.IndexDatasetReader', 'task_name': '${task_name}', 'model_name': '${model_name}', 'field': 'qa', 'dataset_path': 'index_data/qnli/index_dataset.json', 'dataset_split': None, 'ds_size': None}}
Downloading and preparing dataset None/ax to /research/cbim/vast/lh599/.cache/huggingface/datasets/parquet/ax-2087346d15759eef/0.0.0/14a00e99c0d15a23649d0db8944380ac81082d4b021f398733dd84f3a6c569a7...
Downloading data files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 353.69it/s]
Extracting data files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 32.58it/s]
Generating train split: 0 examples [00:00, ? examples/s]
[2024-07-29 11:21:25,090][datasets.packaged_modules.parquet.parquet][ERROR] - Failed to read file '/research/cbim/vast/lh599/.cache/huggingface/datasets/downloads/34b0e169567d6cc580fb18f6593c0f0d3d4ade7ca85314eaa32071da0c48b253' with error <class 'ValueError'>: Couldn't cast
sentence: string
label: int64
idx: int32
-- schema metadata --
huggingface: '{"info": {"features": {"sentence": {"dtype": "string", "_ty' + 136
to
{'premise': Value(dtype='string', id=None), 'hypothesis': Value(dtype='string', id=None), 'label': ClassLabel(names=['entailment', 'neutral', 'contradiction'], id=None), 'idx': Value(dtype='int32', id=None)}
because column names don't match
Error executing job with overrides: ['task_name=qnli', 'output_file=output/epr/qnli/EleutherAI/gpt-neo-2.7B/scored.json', 'batch_size=8', 'model_name=EleutherAI/gpt-neo-2.7B', 'dataset_reader.dataset_path=output/epr/qnli/EleutherAI/gpt-neo-2.7B/retrieved.json', 'dataset_reader.n_tokens=1600', 'index_reader.dataset_path=index_data/qnli/index_dataset.json']
Can you please take a look?
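For what it's worth, the immediate cause visible in the traceback is a column-name mismatch during `table_cast`: the cached parquet file has columns `sentence`/`label`/`idx`, while the builder expects the `premise`/`hypothesis`/`label`/`idx` schema shown in the error. A minimal sketch of the comparison that fails (column names are copied from the log above; this is an illustration of the check, not the actual `datasets` internals):

```python
# Column names found in the cached parquet file (copied from the error log)
actual = {"sentence", "label", "idx"}

# Column names the dataset builder expects (the target schema in the error log)
expected = {"premise", "hypothesis", "label", "idx"}

# datasets raises "Couldn't cast ... because column names don't match"
# whenever these two sets differ:
if actual != expected:
    missing = expected - actual      # columns the cached file lacks
    unexpected = actual - expected   # columns the target schema does not have
    print(f"missing={sorted(missing)}, unexpected={sorted(unexpected)}")
```

This suggests the cached download under `~/.cache/huggingface/datasets` does not correspond to the dataset the reader is trying to build, so the cache for this dataset may need to be cleared and re-downloaded.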