Description
The error is raised by this line:
if jaccard_similarity(train_json[index]["pre_skeleton"], target["pre_skeleton"]) < self.threshold:
The call stack and exception are as follows:
File "/home/jiangshan/code/DAIL-SQL/prompt/ExampleSelectorTemplate.py", line 353, in get_examples
if jaccard_similarity(train_json[index]["pre_skeleton"], target["pre_skeleton"]) < self.threshold:
File "/home/jiangshan/code/DAIL-SQL/prompt/PromptICLTemplate.py", line 51, in format
examples = self.get_examples(target, self.NUM_EXAMPLE * scope_factor, cross_domain=cross_domain)
File "/home/jiangshan/code/DAIL-SQL/generate_question.py", line 87, in <module>
question_format = prompt.format(target=question_json,
KeyError: 'pre_skeleton'
index = 2039
train_json[index]:
{'db_id': 'party_people', 'query': 'SELECT count(*) FROM region', 'query_toks': ['SELECT', 'count', '(', '*', ')', 'FROM', 'region'], 'query_toks_no_value': ['select', 'count', '(', '*', ')', 'from', 'region'], 'question': 'How many regions do we have?', 'question_toks': ['How', 'many', 'regions', 'do', 'we', 'have', '?'], 'sql': {'from': {...}, 'select': [...], 'where': [...], 'groupBy': [...], 'having': [...], 'orderBy': [...], 'limit': None, 'intersect': None, 'union': None, 'except': None}, 'tables': [{...}, {...}, {...}, {...}], 'query_skeleton': 'select count ( _ ) from _', 'path_db': '/home/jiangshan/data/datasets/llm/spider/database/party_people/party_people.sqlite', 'sc_link': {'q_col_match': {...}, 'q_tab_match': {...}}, 'cv_link': {'num_date_match': {}, 'cell_match': {}}, 'question_for_copying': ['how', 'many', 'regions', 'do', 'we', 'have', '?'], 'column_to_table': {'0': None, '1': 0, '2': 0, '3': 0, '4': 0, '5': 0, '6': 0, '7': 1, '8': 1, '9': 1, '10': 1, '11': 1, '12': 1, '13': 2, '14': 2, '15': 2, '16': 2, '17': 3, '18': 3, ...}, 'table_names_original': ['region', 'party', 'member', 'party_events'], 'question_pattern': 'how many _ do we have ?', 'pre_skeleton': 'select count ( _ ) from _'}
The target is:
{'db_id': 'concert_singer', 'query': 'SELECT count(*) FROM singer', 'query_toks': ['SELECT', 'count', '(', '*', ')', 'FROM', 'singer'], 'query_toks_no_value': ['select', 'count', '(', '*', ')', 'from', 'singer'], 'question': 'How many singers do we have?', 'question_toks': ['How', 'many', 'singers', 'do', 'we', 'have', '?'], 'sql': {'from': {...}, 'select': [...], 'where': [...], 'groupBy': [...], 'having': [...], 'orderBy': [...], 'limit': None, 'intersect': None, 'union': None, 'except': None}, 'tables': [{...}, {...}, {...}, {...}], 'query_skeleton': 'select count ( _ ) from _', 'path_db': '/home/jiangshan/data/datasets/llm/spider/database/concert_singer/concert_singer.sqlite', 'sc_link': {'q_col_match': {...}, 'q_tab_match': {...}}, 'cv_link': {'num_date_match': {}, 'cell_match': {}}, 'question_for_copying': ['how', 'many', 'singers', 'do', 'we', 'have', '?'], 'column_to_table': {'0': None, '1': 0, '2': 0, '3': 0, '4': 0, '5': 0, '6': 0, '7': 0, '8': 1, '9': 1, '10': 1, '11': 1, '12': 1, '13': 1, '14': 1, '15': 2, '16': 2, '17': 2, '18': 2, ...}, 'table_names_original': ['stadium', 'singer', 'concert', 'singer_in_concert'], 'question_pattern': 'how many _ do we have ?'}
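Comparing the two records, train_json[index] ends with a 'pre_skeleton' key while the target has none, which is consistent with the KeyError: 'pre_skeleton' being raised on the target side of the comparison. The sketch below is a minimal way to check this on trimmed copies of the two records. It is not the project's code: missing_keys is a hypothetical helper, and jaccard_similarity here is a token-level stand-in that I am assuming behaves roughly like the helper called in ExampleSelectorTemplate.py.

```python
def jaccard_similarity(skeleton_a: str, skeleton_b: str) -> float:
    """Token-level Jaccard similarity (assumed stand-in for the project's helper)."""
    tokens_a, tokens_b = set(skeleton_a.split()), set(skeleton_b.split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 1.0


def missing_keys(record: dict, required=("pre_skeleton",)) -> list:
    """Hypothetical helper: list the required keys that a record lacks."""
    return [key for key in required if key not in record]


# Trimmed copies of the two records shown above (only the relevant fields).
train_example = {
    "db_id": "party_people",
    "query_skeleton": "select count ( _ ) from _",
    "pre_skeleton": "select count ( _ ) from _",
}
target = {
    "db_id": "concert_singer",
    "query_skeleton": "select count ( _ ) from _",
}

print(missing_keys(train_example))  # no keys missing
print(missing_keys(target))         # 'pre_skeleton' is missing

# Only compare when both sides carry the key, instead of indexing blindly.
if not missing_keys(target) and not missing_keys(train_example):
    score = jaccard_similarity(train_example["pre_skeleton"], target["pre_skeleton"])
    print(score)
else:
    print("target has no 'pre_skeleton'; the selector cannot compare skeletons")
```

Running a check like this over the whole test set before calling prompt.format would show whether every target lacks the key or only some of them.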
The debug args are:
"args": [
    "--data_type", "spider",
    "--split", "test",
    "--tokenizer", "/home/schinta/data/model/llm/pre_train/THUDM/chatglm3-6b",
    "--max_seq_len", "4096",
    "--selector_type", "EUCDISMASKPRESKLSIMTHR",
    "--prompt_repr", "SQL",
    "--k_shot", "9",
    "--example_type", "QA"
]
The output of pip list is shown below:
Package Version
accelerate 0.28.0
aiofiles 23.2.1
aiohttp 3.8.4
aiosignal 1.3.1
altair 5.2.0
annotated-types 0.6.0
annoy 1.17.1
anyio 3.7.0
async-timeout 4.0.2
attrs 23.1.0
bpemb 0.3.5
certifi 2024.2.2
charset-normalizer 3.1.0
click 8.1.7
cmake 3.26.3
contourpy 1.1.1
corenlp-protobuf 3.8.0
cpm-kernels 1.0.11
cycler 0.12.1
dataclasses-json 0.5.7
distro 1.9.0
exceptiongroup 1.1.1
ffmpy 0.3.2
filelock 3.12.0
fonttools 4.50.0
frozenlist 1.3.3
fsspec 2023.5.0
gensim 4.3.2
greenlet 2.0.2
h11 0.14.0
httpcore 1.0.4
httpx 0.27.0
huggingface-hub 0.23.0
idna 3.7
importlib_resources 6.4.0
Jinja2 3.1.2
joblib 1.2.0
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
latex2mathml 3.77.0
lit 16.0.6
Markdown 3.6
markdown-it-py 3.0.0
MarkupSafe 2.1.2
marshmallow 3.19.0
marshmallow-enum 1.5.1
matplotlib 3.7.5
mdtex2html 1.3.0
mdurl 0.1.2
mpmath 1.3.0
multidict 6.0.4
mypy-extensions 1.0.0
nemoguardrails 0.3.0
networkx 3.1
nltk 3.8.1
numexpr 2.8.4
numpy 1.24.4
nvidia-cublas-cu11 11.10.3.66
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu11 8.5.0.96
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu11 10.9.0.58
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu11 10.2.10.91
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu11 11.7.4.91
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu11 2.14.3
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu11 11.7.91
nvidia-nvtx-cu12 12.1.105
openai 1.30.1
openapi-schema-pydantic 1.2.4
orjson 3.9.15
packaging 24.0
pandas 2.0.3
pillow 10.2.0
pip 24.0
pkgutil_resolve_name 1.3.10
protobuf 3.20.3
psutil 5.9.8
pydantic 2.7.1
pydantic_core 2.18.2
pydub 0.25.1
Pygments 2.17.2
pyparsing 3.1.2
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0
referencing 0.34.0
regex 2023.5.5
requests 2.31.0
rfc3986 1.5.0
rich 13.7.1
rpds-py 0.18.0
ruff 0.3.4
safetensors 0.3.1
scikit-learn 1.2.2
scipy 1.10.1
semantic-version 2.10.0
sentence-transformers 2.2.2
sentencepiece 0.1.99
setuptools 65.5.1
shellingham 1.5.4
simpleeval 0.9.13
six 1.16.0
smart-open 7.0.4
sniffio 1.3.0
sql_metadata 2.11.0
SQLAlchemy 2.0.17
sqlparse 0.5.0
stanford-corenlp 3.9.2
sympy 1.12
threadpoolctl 3.1.0
tokenizers 0.13.3
tomlkit 0.12.0
toolz 0.12.1
torch 2.3.0
torchtext 0.18.0
torchvision 0.18.0
tqdm 4.65.0
transformers 4.27.1
triton 2.3.0
typer 0.10.0
typing_extensions 4.11.0
typing-inspect 0.9.0
tzdata 2024.1
urllib3 2.2.1
uvicorn 0.22.0
websockets 11.0.3
wheel 0.43.0
wrapt 1.16.0
yarl 1.9.2
zipp 3.18.1