Training error (Updated) #468
ilikespace1 asked this question in Q&A (unanswered)
-
I had the "ValueError: fp16 mixed precision requires a GPU" issue before too, and after manually updating "train_util.py", I am getting the same error as you are now.
-
Hello, have you solved it yet?
-
Should I just conclude that kohya_ss doesn't work on AMD GPUs without ROCm (so, for Windows users)?
-
Update: I updated some more files and I'm back to getting the "ValueError: fp16 mixed precision requires a GPU" error. I also tried training with no mixed precision and "float" as my save precision, and then I get this error instead (full output below):
File "C:\AI\kohya_ss\venv\lib\site-packages\xformers\ops.py", line 726, in op
raise NotImplementedError(f"No operator found for this attention: {self}")
NotImplementedError: No operator found for this attention: AttentionOpDispatch(dtype=torch.float32, device=device(type='cpu'), k=40, has_dropout=False, attn_bias_type=<class 'NoneType'>, kv_len=7680, q_len=7680, kv=40, batch_size=2, num_heads=8)
steps: 0%| | 0/1600 [00:06<?, ?it/s]
Is it having trouble detecting and using my hardware? I can run Stable Diffusion just fine.
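One check I can run to see whether PyTorch inside the kohya_ss venv detects the GPU at all (this is just a generic diagnostic sketch, not something from the log; the venv path is the one shown in the traceback):

# Run with the kohya_ss venv's python (C:\AI\kohya_ss\venv\Scripts\python.exe).
# If is_available() prints False, accelerate falls back to the CPU, which would
# explain both the "fp16 mixed precision requires a GPU" error and the CPU-only
# xformers dispatch in the traceback below.
import torch

print(torch.__version__)          # a "+cpu" suffix means the installed wheel has no CUDA support
print(torch.cuda.is_available())  # must be True for fp16 / xformers training on the GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))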
Things I've tried:
So to reiterate, I currently get the "ValueError: fp16 mixed precision requires a GPU" error when I use fp16 for both mixed precision and save precision. This is the full output I get when I use no mixed precision and "float" as my save precision:
Folder 100_AW: 32 images found
Folder 100_AW: 3200 steps
max_train_steps = 1600
stop_text_encoder_training = 0
lr_warmup_steps = 0
accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" --train_data_dir="C:/AI/Process lora/AW/image" --resolution=768,768 --output_dir="C:/AI/Process lora/AW/model" --logging_dir="C:/AI/Process lora/AW/log" --network_alpha="128" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=128 --output_name="AW Model" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="constant" --train_batch_size="2" --max_train_steps="1600" --save_every_n_epochs="1" --mixed_precision="no" --save_precision="float" --seed="1234" --caption_extension=".txt" --cache_latents --optimizer_type="AdamW" --max_data_loader_n_workers="1" --clip_skip=2 --bucket_reso_steps=64 --xformers --bucket_no_upscale
prepare tokenizer
Use DreamBooth method.
prepare images.
found directory C:\AI\Process lora\AW\image\100_AW contains 32 image files
3200 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 2
resolution: (768, 768)
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 1024
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "C:\AI\Process lora\AW\image\100_AW"
image_count: 32
num_repeats: 100
shuffle_caption: False
keep_tokens: 0
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: AW
caption_extension: .txt
[Dataset 0]
loading image sizes.
100%|█████████████████████████████████████████████████████████████████████████████████| 32/32 [00:00<00:00, 197.35it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (384, 384), count: 100
bucket 1: resolution (576, 832), count: 100
bucket 2: resolution (576, 896), count: 100
bucket 3: resolution (576, 1024), count: 600
bucket 4: resolution (640, 832), count: 400
bucket 5: resolution (640, 896), count: 100
bucket 6: resolution (704, 704), count: 500
bucket 7: resolution (704, 768), count: 100
bucket 8: resolution (768, 768), count: 100
bucket 9: resolution (832, 640), count: 100
bucket 10: resolution (896, 512), count: 100
bucket 11: resolution (960, 512), count: 800
bucket 12: resolution (960, 576), count: 100
mean ar error (without repeats): 0.03709647043484138
prepare accelerator
Using accelerator 0.15.0 or above.
load Diffusers pretrained models
safety_checker\model.safetensors not found
Fetching 19 files: 100%|███████████████████████████████████████████████████████████████████████| 19/19 [00:00<?, ?it/s]
C:\AI\kohya_ss\venv\lib\site-packages\transformers\models\clip\feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
warnings.warn(
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
Replace CrossAttention.forward to use xformers
[Dataset 0]
caching latents.
100%|██████████████████████████████████████████████████████████████████████████████████| 32/32 [07:03<00:00, 13.24s/it]
import network module: networks.lora
create LoRA network. base dim (rank): 128, alpha: 128.0
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.
use AdamW optimizer | {}
running training / 学習開始
num train images * repeats / 学習画像の数×繰り返し回数: 3200
num reg images / 正則化画像の数: 0
num batches per epoch / 1epochのバッチ数: 1600
num epochs / epoch数: 1
batch size per device / バッチサイズ: 2
gradient accumulation steps / 勾配を合計するステップ数 = 1
total optimization steps / 学習ステップ数: 1600
steps: 0%| | 0/1600 [00:00<?, ?it/s]epoch 1/1
Traceback (most recent call last):
File "C:\AI\kohya_ss\train_network.py", line 699, in
train(args)
File "C:\AI\kohya_ss\train_network.py", line 538, in train
noise_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
File "C:\AI\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\AI\kohya_ss\venv\lib\site-packages\diffusers\models\unet_2d_condition.py", line 381, in forward
sample, res_samples = downsample_block(
File "C:\AI\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\AI\kohya_ss\venv\lib\site-packages\diffusers\models\unet_2d_blocks.py", line 612, in forward
hidden_states = attn(hidden_states, encoder_hidden_states=encoder_hidden_states).sample
File "C:\AI\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\AI\kohya_ss\venv\lib\site-packages\diffusers\models\attention.py", line 216, in forward
hidden_states = block(hidden_states, context=encoder_hidden_states, timestep=timestep)
File "C:\AI\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\AI\kohya_ss\venv\lib\site-packages\diffusers\models\attention.py", line 484, in forward
hidden_states = self.attn1(norm_hidden_states) + hidden_states
File "C:\AI\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\AI\kohya_ss\library\train_util.py", line 1790, in forward_xformers
out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None) # 最適なのを選んでくれる
File "C:\AI\kohya_ss\venv\lib\site-packages\xformers\ops.py", line 858, in memory_efficient_attention
).op
File "C:\AI\kohya_ss\venv\lib\site-packages\xformers\ops.py", line 726, in op
raise NotImplementedError(f"No operator found for this attention: {self}")
NotImplementedError: No operator found for this attention: AttentionOpDispatch(dtype=torch.float32, device=device(type='cpu'), k=40, has_dropout=False, attn_bias_type=<class 'NoneType'>, kv_len=7680, q_len=7680, kv=40, batch_size=2, num_heads=8)
steps: 0%| | 0/1600 [00:06<?, ?it/s]
Traceback (most recent call last):
File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Program Files\Python310\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "C:\AI\kohya_ss\venv\Scripts\accelerate.exe_main.py", line 7, in
File "C:\AI\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "C:\AI\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
simple_launcher(args)
File "C:\AI\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\AI\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=C:/AI/Process lora/AW/image', '--resolution=768,768', '--output_dir=C:/AI/Process lora/AW/model', '--logging_dir=C:/AI/Process lora/AW/log', '--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=128', '--output_name=AW Model', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=2', '--max_train_steps=1600', '--save_every_n_epochs=1', '--mixed_precision=no', '--save_precision=float', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.
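For what it's worth, the NotImplementedError above is what xformers raises whenever memory_efficient_attention is asked to run on CPU tensors; it has no CPU operator, so a dispatch with dtype=torch.float32 and device=cpu can never succeed. A minimal sketch that reproduces the same failure (the shapes here are made up for illustration; only the device and dtype matter):

# Assumed shapes, just to show the dispatch failure: with tensors on the CPU,
# xformers has no memory-efficient attention kernel to pick, so this raises the
# same "No operator found for this attention" error as train_util.py's
# forward_xformers does in the traceback above.
import torch
import xformers.ops

q = torch.randn(16, 7680, 40, device="cpu", dtype=torch.float32)
k = torch.randn(16, 7680, 40, device="cpu", dtype=torch.float32)
v = torch.randn(16, 7680, 40, device="cpu", dtype=torch.float32)
out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None)  # raises NotImplementedError

So the real question is why the training process ends up on the CPU at all, which circles back to whether torch.cuda.is_available() returns True inside that venv.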