
something went wrong: Response does not contain codes! #27

Open
TorAllex opened this issue Jun 26, 2024 · 18 comments

Comments

@TorAllex

got prompt
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:44<00:00, 22.17s/it]
Some weights of the model checkpoint at /Users/alex/ComfyUI/models/omost were not used when initializing LlamaForCausalLM: ['model.layers.13.mlp.up_proj.weight.quant_map', 'model.layers.26.self_attn.q_proj.weight.nested_absmax', 'model.layers.10.self_attn.o_proj.weight.nested_quant_map',
- This IS expected if you are initializing LlamaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
'PreTrainedTokenizerFast' object has no attribute 'apply_chat_template'
!!! Exception during processing!!! Response does not contain codes!
Traceback (most recent call last):
  File "/Users/alex/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/Users/alex/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/Users/alex/ComfyUI/execution.py", line 65, in map_node_over_list
    results.append(getattr(obj, func)(**input_data_all))
  File "/Users/alex/ComfyUI/custom_nodes/comfyui_LLM_party/tools/omost.py", line 46, in notify
    canvas = omost_canvas.from_bot_response(text[0])
  File "/Users/alex/ComfyUI/custom_nodes/comfyui_LLM_party/lib_omost/canvas.py", line 134, in from_bot_response
    assert matched, 'Response does not contain codes!'
AssertionError: Response does not contain codes!
@heshengtao
Owner

When you use Omost, this error is reported if the output program is truncated. You can increase the max length to prevent the LLM output from being cut off. If you are not using an Omost-specific model but another model instead, it is very likely that the other model cannot output code in the expected format, and this error will also be reported. If your problem is still not solved, please attach a full screenshot of your workflow.
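
For reference, a minimal sketch of what the failing assertion is checking (the actual lib_omost/canvas.py may differ): the bot reply must contain a complete fenced Python code block, so a truncated reply has no closing fence, produces no match, and trips the assert.

    import re

    def extract_code(response: str) -> str:
        # Look for a complete ```python ... ``` block in the LLM reply.
        matched = re.search(r"```python(.*?)```", response, flags=re.DOTALL)
        # A truncated reply is missing the closing fence, so there is no match.
        assert matched, "Response does not contain codes!"
        return matched.group(1).strip()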

@TorAllex
Author

Yes, I'm trying to use Omost. I just loaded your "start_with_OMOST.json" workflow, clicked "Queue Prompt", and got the error.
Of course, this could be a hardware problem, since I use a Mac (Intel) with an AMD graphics card.
Can you please explain where the "increase the maximum length" option is? I'm a beginner and I'm not sure where to find it.

@heshengtao
Owner

Screenshot 2024-06-27 171938
The LLM_local node has a max_length parameter (2048 in the screenshot). Please check the show text node connected to the LLM_local node: if the output text is not the complete code, the output was truncated. The complete code will start with ```python and end with ```.

@TorAllex
Author

I tried doubling the maximum length up to 32768, but it didn't fix the error.
The show text node message is: 'PreTrainedTokenizerFast' object has no attribute 'apply_chat_template'.

@heshengtao
Owner

path_in_launcher_configuration\python_embeded\python.exe -m pip install --upgrade transformers

Here, path_in_launcher_configuration\python_embeded\python.exe is the path of ComfyUI's Python interpreter.
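
A quick way to confirm the upgrade worked (just a suggested check, not part of the repo): apply_chat_template only exists in fairly recent transformers releases, so the attribute should be present on the tokenizer after upgrading.

    import transformers
    from transformers import AutoTokenizer

    print(transformers.__version__)
    # Path taken from the traceback above; adjust to your own model folder.
    tok = AutoTokenizer.from_pretrained("/Users/alex/ComfyUI/models/omost")
    print(hasattr(tok, "apply_chat_template"))  # should print True after the upgrade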

@TorAllex
Author

I'll try upgrading transformers; I need to see how to do that on macOS.

@TorAllex
Author

OK, now the error looks different: "gpu not found".
Does the MPS device not work with it?

@heshengtao
Owner

If you choose auto, it will automatically look for CUDA first; if there is no CUDA, it falls back to MPS; if that is not found either, it uses the CPU. The "GPU not found" error message does not come from my code; it is probably the transformers code or the model itself that requires CUDA. Some int4-quantized models do have poor compatibility with MPS devices. I'm a little helpless about this problem of yours; AMD graphics cards really aren't well suited for AI workloads.
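
Roughly, the "auto" behavior described above amounts to something like this (illustrative sketch only, not the node's exact code):

    import torch

    def pick_device() -> str:
        if torch.cuda.is_available():
            return "cuda"   # prefer CUDA when present
        if torch.backends.mps.is_available():
            return "mps"    # Apple Metal backend
        return "cpu"        # last resort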

@TorAllex
Author

TorAllex commented Jun 28, 2024

Screenshot 2024-06-28 at 22:53:28

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.
!!! Exception during processing!!! No GPU found. A GPU is needed for quantization.
Traceback (most recent call last):
  File "/Users/alex/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/Users/alex/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/Users/alex/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/Users/alex/ComfyUI/custom_nodes/comfyui_LLM_party/llm.py", line 937, in chatbot
    self.model = AutoModelForCausalLM.from_pretrained(
  File "/Users/alex/miniconda3/envs/fooocus/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/Users/alex/miniconda3/envs/fooocus/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3279, in from_pretrained
    hf_quantizer.validate_environment(
  File "/Users/alex/miniconda3/envs/fooocus/lib/python3.10/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 62, in validate_environment
    raise RuntimeError("No GPU found. A GPU is needed for quantization.")
RuntimeError: No GPU found. A GPU is needed for quantization.

Prompt executed in 0.39 seconds
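
For context, a minimal sketch of the load path that fails here (not the node's exact code): passing a BitsAndBytesConfig to from_pretrained makes transformers validate the environment for bitsandbytes, which raises exactly this error when no CUDA GPU is found.

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    model_path = "/Users/alex/ComfyUI/models/omost"  # path from the traceback

    if torch.cuda.is_available():
        # 4-bit loading via bitsandbytes only works with a CUDA device.
        quant_config = BitsAndBytesConfig(load_in_4bit=True)
        model = AutoModelForCausalLM.from_pretrained(
            model_path, quantization_config=quant_config, device_map="auto"
        )
    else:
        # Without CUDA the 4-bit path is unavailable; loading unquantized avoids
        # this particular check, though a checkpoint saved already 4-bit quantized
        # may still need bitsandbytes and enough RAM for the full weights.
        model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)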

@heshengtao
Owner

Try this.
transformers>=4.41.1
bitsandbytes==0.43.1
accelerate==0.30.1
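
For example (using whichever Python interpreter runs ComfyUI):

    python -m pip install "transformers>=4.41.1" "bitsandbytes==0.43.1" "accelerate==0.30.1"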

@TorAllex
Author

I have Python 3.10.14 installed, and bitsandbytes 0.42.0, which is the latest available for me.
transformers and accelerate are OK.
Should I upgrade Python?

@heshengtao
Owner

Please match the versions of the three libraries I gave; there is a high probability that doing so will solve this problem. bitsandbytes 0.42.0 is not quite right, please adjust to bitsandbytes==0.43.1 and accelerate==0.30.1.

@TorAllex
Author

alex@iMac ComfyUI % pip show bitsandbytes         
Name: bitsandbytes
Version: 0.42.0
-------------------------------------------------------
alex@iMac ComfyUI % pip show accelerate  
Name: accelerate
Version: 0.30.1
-------------------------------------------------------
alex@iMac ComfyUI % pip show transformers
Name: transformers
Version: 4.41.1
-------------------------------------------------------
alex@iMac ComfyUI % pip install bitsandbytes==0.43.1
ERROR: Could not find a version that satisfies the requirement bitsandbytes==0.43.1 (from versions: 0.31.8, 0.32.0, 0.32.1, 0.32.2, 0.32.3, 0.33.0, 0.33.1, 0.34.0, 0.35.0, 0.35.1, 0.35.2, 0.35.3, 0.35.4, 0.36.0, 0.36.0.post1, 0.36.0.post2, 0.37.0, 0.37.1, 0.37.2, 0.38.0, 0.38.0.post1, 0.38.0.post2, 0.38.1, 0.39.0, 0.39.1, 0.40.0, 0.40.0.post1, 0.40.0.post2, 0.40.0.post3, 0.40.0.post4, 0.40.1, 0.40.1.post1, 0.40.2, 0.41.0, 0.41.1, 0.41.2, 0.41.2.post1, 0.41.2.post2, 0.41.3, 0.41.3.post1, 0.41.3.post2, 0.42.0)
ERROR: No matching distribution found for bitsandbytes==0.43.1

@heshengtao
Owner

You could try updating Python, but unfortunately bitsandbytes requires CUDA support. I doubt you can use this int4 model, because bitsandbytes is the foundation for using this quantized model.

@TorAllex
Author

Maybe it can work with a ROCm (Vulkan) device?

@TorAllex
Author

I started playing with Python versions and now I've broken everything.
ERROR: llama_cpp_python-0.2.79-AVX2-macosx_13_0_x86_64.whl is not a valid wheel filename.
In install.py the code calls response = get("https://api.github.com/repos/abetlen/llama-cpp-python/releases/latest"), but that release has no files with AVX2 or AVX in the name.
I installed it manually with pip install --upgrade --no-cache-dir llama-cpp-python, but install.py ignores it.

@heshengtao
Owner

In my code, if llama-cpp-python or llama_cpp is correctly installed, the step that automatically installs llama-cpp-python is skipped. The "ERROR: llama_cpp_python-0.2.79-AVX2-macosx_13_0_x86_64.whl is not a valid wheel filename." message means the installer failed to recognize your MPS device. Please check whether llama-cpp-python is really present in your environment.

    imported = package_is_installed("llama-cpp-python") or package_is_installed("llama_cpp")
    if imported:
        # If it is already installed, do nothing
        pass

You can see from this part of my code that if llama_cpp_python is already in the environment, the installation routine is not executed.
As far as I know, bitsandbytes is a library that only runs on CUDA, so no matter how you configure the environment, you won't be able to use a model quantized with bitsandbytes.

@TorAllex
Author

TorAllex commented Jul 1, 2024

OK, we just need to wait for support: bitsandbytes-foundation/bitsandbytes#252 (comment)
