More coherent image validation #5650

Open
BartChris opened this issue May 4, 2023 · 7 comments · May be fixed by #6505

Comments

@BartChris
Collaborator

BartChris commented May 4, 2023

Is your feature request related to a problem? Please describe.
Image validation in Kitodo should be implemented more coherently and extended. What seems to be implemented so far are mostly checks, run as part of the metadata validation, for structural elements without media and for media that are not linked to any structural element:

results.add(checkForStructuresWithoutMedia(workpiece, translations));
results.add(checkForUnlinkedMedia(workpiece, translations));

But as far as I can see, no real image validation is implemented.

I ran into several problems.

  1. The checkbox "Bilder validieren" ("validate images"), an attribute of a task in the workflow editor, has no effect. The image validation logic in the WorkflowControllerService does not check that attribute when the task is closed. Instead, it checks whether the task is of type typeImagesWrite, which is confusing.

if (task.isTypeImagesWrite()) {
    ImageHelper mih = new ImageHelper();
    URI imageFolder = ServiceManager.getProcessService().getImagesOriginDirectory(false, task.getProcess());
    if (!mih.checkIfImagesValid(task.getProcess().getTitle(), imageFolder)) {
        throw new DataException("Error on image validation!");
    }
}

  2. This validity check has a very narrow definition of what counts as valid (see also "Image / file name 'validation' is hardcoded to specific rules and is not used for custom configurations", #5007). The validity of images probably has to cover more than that. For example:

    a. Is there any check for the actual file types? It currently seems possible to put TIFFs into a folder that is marked as a JPEG folder. I stumbled over this when I misconfigured my test system and set the folder with the originals to JPEG instead of TIFF. The system let me put TIFF files into the folder. I only noticed that something was wrong when I used the "upload media" feature, uploaded TIFFs, and no derivatives were created.
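A file type check of the kind asked about above could look at the first bytes of each file instead of trusting the folder configuration or the file extension. The following is only a minimal sketch under stated assumptions: `FileTypeSniffer`, `ImageType`, and `matchesFolderType` are hypothetical names and not part of Kitodo.Production, and only classic TIFF and JPEG signatures are recognized.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of Kitodo.Production. */
final class FileTypeSniffer {

    enum ImageType { TIFF, JPEG, UNKNOWN }

    /** Classifies content by its leading signature ("magic") bytes. */
    static ImageType detect(byte[] header) {
        if (header.length >= 4) {
            // Classic TIFF: "II" + 42 (little endian) or "MM" + 42 (big endian).
            boolean littleEndian = header[0] == 'I' && header[1] == 'I'
                    && header[2] == 42 && header[3] == 0;
            boolean bigEndian = header[0] == 'M' && header[1] == 'M'
                    && header[2] == 0 && header[3] == 42;
            if (littleEndian || bigEndian) {
                return ImageType.TIFF;
            }
        }
        // JPEG: start-of-image marker FF D8 followed by another marker (FF).
        if (header.length >= 3 && (header[0] & 0xFF) == 0xFF
                && (header[1] & 0xFF) == 0xD8 && (header[2] & 0xFF) == 0xFF) {
            return ImageType.JPEG;
        }
        return ImageType.UNKNOWN;
    }

    /** Reads at most four bytes of the file and classifies them. */
    static ImageType detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(4));
        }
    }

    /** True if the file content matches the type the folder is configured for. */
    static boolean matchesFolderType(Path file, ImageType expected) throws IOException {
        return detect(file) == expected;
    }
}
```

With such a check, a TIFF placed in a folder configured as JPEG would be reported when the file arrives, rather than surfacing later as missing derivatives.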

@solth
Member

solth commented May 4, 2023

Related to #5415 and #4262

@maria-federbusch

Detailed requirements (discussed with @BartChris in Marburg):

  1. Turn validation on and off in configproperties.
  2. Make the validation conditions (format(s), naming, order) configurable in configproperties, preferably per document type, since audio recordings with other file formats, for example, may soon appear.
     • One possible configuration (requirements of the Staatsbibliothek zu Berlin for text material): TIFF; 8-digit, purely numeric, zero-padded file names; numerical order.
  3. Validation process
    a. Validation of the format?
    b. Validation of the naming?
    c. Validation of the order? Is the file sequence complete and sorted correctly?
  4. Definition of feedback
    If the content deviates from one of the specifications, staff will not be able to open the process in the metadata editor, and a specific error message is issued that forces manual correction.
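The naming and order checks from the validation process above can be sketched as follows, using the Staatsbibliothek zu Berlin example (8-digit, zero-padded names in numerical order). This is an illustration only: `FileNameSequenceCheck` is a hypothetical class, and the lowercase `.tif` extension and the assumption that numbering starts at 00000001 are not stated in the requirements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical check for illustration; not part of Kitodo.Production. */
final class FileNameSequenceCheck {

    // Assumption: eight ASCII digits followed by a lowercase ".tif" extension.
    private static final Pattern EIGHT_DIGITS_TIF = Pattern.compile("\\d{8}\\.tif");

    /**
     * Checks the file names in the order given.
     * Returns human-readable problems; an empty list means the folder is valid.
     */
    static List<String> validate(List<String> fileNames) {
        List<String> problems = new ArrayList<>();
        int expected = 1; // assumption: numbering starts at 00000001
        for (String name : fileNames) {
            if (!EIGHT_DIGITS_TIF.matcher(name).matches()) {
                problems.add("Unexpected file name: " + name);
                continue;
            }
            int number = Integer.parseInt(name.substring(0, 8));
            if (number != expected) {
                problems.add("Expected " + String.format("%08d", expected)
                        + ".tif but found " + name);
            }
            // Continue from the number actually found, so that a single gap
            // is reported once and not for every following file.
            expected = number + 1;
        }
        return problems;
    }
}
```

Returning a list of problems instead of a boolean fits the feedback requirement: each entry can be shown to staff as a specific error message when the process cannot be opened in the metadata editor.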

@BartChris
Collaborator Author

BartChris commented Nov 21, 2024

Sounds very good to me. The only thing I am wondering is whether the best place for the validation settings is kitodo_config.properties or whether we should extend the project settings screen.

image

Extending the project setting screen is probably more (conceptual) work, especially when the settings are generally applied instance wide.
Opinions welcome @maria-federbusch @solth @matthias-ronge

@maria-federbusch

Good idea! Maybe this was a misunderstanding on my part.

@solth
Member

solth commented Nov 21, 2024

> Extending the project setting screen is probably more (conceptual) work, especially when the settings are generally applied instance wide.
> Opinions welcome @maria-federbusch @solth @matthias-ronge

I would prefer to make this behavior configurable via the frontend. In my opinion it's always desirable to configure options via the frontend without the need to have access to the server filesystem (I think the additional work is a good investment).

@henning-gerhardt
Collaborator

henning-gerhardt commented Nov 21, 2024

@BartChris: do you mean the folder settings of the project? If so, I would prefer this, even though this feature would not be used at SLUB. @solth suggested this while I was writing this answer :)

At SLUB we have had strict image validation (not only of the file name schema, but also a long-term preservation check and a file checksum check) implemented outside of Kitodo.Production since 2.x. I did not want this inside Kitodo.Production, because the rules were extended or changed over time, and switching to the latest release is a lot more complicated than doing the validation completely outside of Kitodo.Production. Kitodo.Production starts the validation task by executing a script, and if the validation succeeds, the validation task is closed via the ActiveMQ API. If the validation fails, a few people are informed by mail about the cause of the validation error.

Edit: I sent the answer too early :-(

I forgot: I don't know whether all possible validation rules should be applied to all institutions, as every institution has different rules. This makes a solution inside Kitodo.Production very complex. Even if you can choose between different validation rules, there will be rules that are not covered and need to be checked in another way. But as long as these rules are all optional, everyone can choose which ones to use.
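The idea that every rule is optional and each institution enables only the ones it needs could be sketched like this. Everything here is hypothetical: `OptionalRuleSet`, the rule ids, and the example predicates are placeholders chosen for illustration and do not reflect the draft in #6505.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical registry of optional rules; not part of Kitodo.Production. */
final class OptionalRuleSet {

    // Rule id -> check on a file name; insertion order is kept for stable output.
    private final Map<String, Predicate<String>> rules = new LinkedHashMap<>();

    /** Registers a rule under an id; returns this set to allow chaining. */
    OptionalRuleSet register(String id, Predicate<String> check) {
        rules.put(id, check);
        return this;
    }

    /** Applies only the enabled rules and returns the ids of those that fail. */
    List<String> violations(String fileName, Set<String> enabledRuleIds) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Predicate<String>> rule : rules.entrySet()) {
            if (enabledRuleIds.contains(rule.getKey()) && !rule.getValue().test(fileName)) {
                failed.add(rule.getKey());
            }
        }
        return failed;
    }
}
```

An institution that only cares about the extension would enable a single id, for example `Set.of("tiffExtension")`, while another could additionally enable a naming rule. Rules that nobody enables never run, and checks that do not fit this model can still be done outside Kitodo.Production, as in the script-based setup described above.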

@thomaslow thomaslow self-assigned this Feb 14, 2025
@thomaslow thomaslow linked a pull request Apr 10, 2025 that will close this issue
11 tasks
@thomaslow
Collaborator

I added a first draft of an implementation in pull request #6505 including a demonstration video. You can configure rules to validate the image content of folders (file type, image size, color space, etc.).
