Removed restrictions on image folder naming and parenting #1591

RobertMtx · 2023-10-10T12:22:53Z

The current naming requirements for image folders make it very cumbersome to manage huge image repositories. Here are a couple of suggestions to consider when you get time to make changes.

1. Why not allow a single text file with a specific extension inside each image folder, and use that file's name to carry the properties currently encoded in the folder name, such as the number of repeats or class names? Or better yet, allow providing these properties inside the text file itself, and give it a name like "_properties.txt". In either case, it could be implemented as a secondary method, so the current folder-name method still works when the file is missing.
2. Forcing users to place every image folder inside a single image root directory is a nightmare for large image sets, especially since code to locate images in an arbitrarily deep directory structure is very easy to write. Even in cases where the folder name is used to specify properties, it would be easy to extract them from the last branch of the folder path. I don't understand why this limitation is in place. Users should be able to place images in branches, such as MainImageFolder\Props\Sports\BasketBall\001.jpg, etc. The current requirements force them to create a gigantic directory of folders named things like 1_props_sports_basketball in order to have any sort of management over it. But finding and managing images in such a folder, which might contain thousands of sub-folders, is a headache to say the least.
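To make the two suggestions above concrete, here is a minimal Python sketch of how they could work together. It is not how the tool currently behaves. The `_properties.txt` name comes from the suggestion itself, but its `key = value` format, the `repeats` and `class_tokens` keys, the default of 1 repeat, and the function names are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the trainer accepts.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}

# Existing convention: "<repeats>_<class tokens>", e.g. "1_props_sports_basketball".
LEAF_PATTERN = re.compile(r"^(\d+)_(.+)$")


def read_properties(folder: Path) -> dict:
    """Return {'repeats': int, 'class_tokens': str} for one image folder.

    Order of precedence (hypothetical):
      1. a "_properties.txt" file with "key = value" lines,
      2. the "<repeats>_<name>" pattern in the last path component,
      3. defaults: 1 repeat, folder name as class tokens.
    """
    props_file = folder / "_properties.txt"
    if props_file.is_file():
        props = {}
        for line in props_file.read_text(encoding="utf-8").splitlines():
            if "=" in line and not line.lstrip().startswith("#"):
                key, value = line.split("=", 1)
                props[key.strip()] = value.strip()
        return {
            "repeats": int(props.get("repeats", 1)),
            "class_tokens": props.get("class_tokens", folder.name),
        }
    match = LEAF_PATTERN.match(folder.name)
    if match:
        return {"repeats": int(match.group(1)), "class_tokens": match.group(2)}
    return {"repeats": 1, "class_tokens": folder.name}


def find_image_folders(root: Path) -> dict:
    """Map every folder under root that directly contains images to its properties."""
    found = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTS:
            folder = path.parent
            if folder not in found:
                found[folder] = read_properties(folder)
    return found
```

Calling `find_image_folders(Path("MainImageFolder"))` would pick up `MainImageFolder\Props\Sports\BasketBall` without requiring it to sit directly under the root, and folders that still use the `1_name` convention keep working through the fallback.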

I appreciate the hard work put into this tool. Look forward to the next release!

madrooky · 2023-11-10T13:06:35Z

I am not certain about the captions; that is something I am currently figuring out. But I prefer the usual system with one caption .txt file for each image. I think it still works that way, but the README's description is pretty confusing...

But you don't need to put all images in one single root folder. For each training session/concept you can select one parent folder, and that folder can be deep inside your folder structure. It is still a bit cumbersome to manage it that way, but at least it is possible to have a training folder for each subcategory you might have in your data set.

aa956 · 2024-01-08T10:23:47Z

As far as I know, you can pass a dataset config in a TOML file to the kohya-ss scripts for any image directory structure, at least in the original scripts: https://github.com/kohya-ss/sd-scripts

There's just no GUI support for this, so the workaround is to configure everything apart from the dataset in the GUI, print the train command, and add the dataset config option manually, something like --dataset_config dataset.toml.
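If you do this often, the manual step can be scripted. The sketch below is only an illustration: the function name and the example command are made up, and only the `--dataset_config` option comes from the scripts. It uses POSIX-style quoting via `shlex`, so a command copied from a Windows console may need its quoting adjusted by hand.

```python
import shlex


def add_dataset_config(train_command: str, config_path: str) -> str:
    """Append --dataset_config to a printed train command, unless it is already there."""
    args = shlex.split(train_command)
    already_set = any(
        arg == "--dataset_config" or arg.startswith("--dataset_config=")
        for arg in args
    )
    if already_set:
        return train_command
    args += ["--dataset_config", config_path]
    return shlex.join(args)


# Hypothetical command as printed by the GUI; the real one will have many more options.
printed = "accelerate launch train_network.py --output_dir out"
print(add_dataset_config(printed, "dataset.toml"))
```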

English documentation is here
https://github.com/darkstorm2150/sd-scripts/blob/main/docs/config_README-en.md

An example dataset config file from the documentation:

[general]
shuffle_caption = true
caption_extension = '.txt'
keep_tokens = 1

# This is a DreamBooth-style dataset
[[datasets]]
resolution = 512
batch_size = 4
keep_tokens = 2

  [[datasets.subsets]]
  image_dir = 'C:\hoge'
  class_tokens = 'hoge girl'
  # This subset has keep_tokens = 2 (using the value of the parent datasets)

  [[datasets.subsets]]
  image_dir = 'C:\fuga'
  class_tokens = 'fuga boy'
  keep_tokens = 3

  [[datasets.subsets]]
  is_reg = true
  image_dir = 'C:\reg'
  class_tokens = 'human'
  keep_tokens = 1

# This is a fine-tuning-style dataset
[[datasets]]
resolution = [768, 768]
batch_size = 2

  [[datasets.subsets]]
  image_dir = 'C:\piyo'
  metadata_file = 'C:\piyo\piyo_md.json'
  # This subset has keep_tokens = 1 (using the general value)
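For a deeply nested image tree, a file like the one above could be generated rather than written by hand. The following is a rough Python sketch, not part of the scripts. It only emits keys that appear in the documentation example (`caption_extension`, `resolution`, `batch_size`, `image_dir`, `class_tokens`). Deriving `class_tokens` from the path components is an assumption made for illustration, as are the function name and the list of image extensions.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the trainer accepts.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def build_dataset_toml(root: Path, resolution: int = 512, batch_size: int = 4) -> str:
    """Build a DreamBooth-style dataset config with one subset per image folder."""
    lines = [
        "[general]",
        "caption_extension = '.txt'",
        "",
        "[[datasets]]",
        f"resolution = {resolution}",
        f"batch_size = {batch_size}",
    ]
    # Every folder, at any depth, that directly contains at least one image.
    folders = sorted(
        {p.parent for p in root.rglob("*")
         if p.is_file() and p.suffix.lower() in IMAGE_EXTS}
    )
    for folder in folders:
        if "'" in str(folder):
            # TOML literal strings ('...') cannot contain a single quote.
            raise ValueError(f"path contains a single quote: {folder}")
        # Hypothetical rule: Props/Sports/BasketBall -> "props sports basketball".
        parts = folder.relative_to(root).parts or (root.name,)
        class_tokens = " ".join(parts).lower()
        lines += [
            "",
            "  [[datasets.subsets]]",
            f"  image_dir = '{folder}'",
            f"  class_tokens = '{class_tokens}'",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Writes dataset.toml for a tree such as MainImageFolder\Props\Sports\BasketBall.
    config_text = build_dataset_toml(Path("MainImageFolder"))
    Path("dataset.toml").write_text(config_text, encoding="utf-8")
```

Paths are written as single-quoted TOML literal strings, as in the example above, so Windows backslashes need no escaping.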
