kaladay changed the title from "IRIIIF interaction with Cantaloupe is problematic." to "IRIIIF interaction with Cantaloupe is problematic, consider caching changes." on Sep 20, 2024
I just wanted to add that one of the things I'm wrestling with right now is figuring out which files we should be storing in Fedora and, ultimately, which files Cantaloupe and irIIIFService should be interacting with. There is a belief from DiSC that we are, and have been, taking their files and creating appropriate Intermediate Files from them. This is tricky, as it's hard to understand the thinking that went into creating the original preservation file, but if we are going to consider making irIIIFService convert TIFFs, maybe we should be doing this before they even come into the repository, through our processing and ingest practices.
(It also shows that using JP2 for source images is even better than TIFF because then we can utilize potentially faster processors like Kakadu and OpenJPEG.)
This is created following investigations of:
There are several competing ideas on how to address the performance and configuration problems. This story lists them so that they can be groomed and broken down further into quests, features, bugs, etc.
Long Term Caching vs Redis Caching or Other Caching
The caching system in place includes Redis.
The Redis data is derived on the fly, as needed.
The Redis cache has a relatively short expiration (for example, 30 days).
Rather than focus on caching in general, we might want to focus on long-term-caching.
This is really just making a copy of a file and storing it on disk indefinitely, until the source file changes. Only when the source file changes should the cached file be deleted or re-created.
The long-term-caching would be ideal for well-known or common files.
Unusual requests, such as rotating an image by 2 degrees, would not fall under this use case.
Exactly when to do this, and when not to, is still to be determined and is not described here.
Other caching, including Redis, may still have a place, but the caching situation as a whole needs to be reconsidered.
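The "copy on disk until the source changes" idea can be sketched roughly as follows. This is only a minimal illustration in Python, not how IRIIIF is implemented: the function names, the `.fingerprint` marker file, and the use of file size plus modification time as the change detector are all assumptions made for the example.

```python
import shutil
from pathlib import Path


def source_fingerprint(source: Path) -> str:
    """Cheap change detector (an assumption): size plus modification time."""
    stat = source.stat()
    return f"{stat.st_size}-{stat.st_mtime_ns}"


def long_term_cached(source: Path, cache_dir: Path) -> Path:
    """Return a cached copy of `source`, re-creating it only if the source changed."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Keyed by file name only; a real cache would need a collision-free key.
    cached = cache_dir / source.name
    marker = cache_dir / (source.name + ".fingerprint")

    current = source_fingerprint(source)
    if cached.exists() and marker.exists() and marker.read_text() == current:
        return cached  # source unchanged: the cached file never expires

    # Source is new or has changed: re-create the cached file.
    # A real service might convert the file here instead of copying it.
    shutil.copyfile(source, cached)
    marker.write_text(current)
    return cached
```

Unlike the Redis cache, nothing here expires on a timer; the only invalidation event is a change to the source file.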
Caching Special TIFF via IRIIIF
It seems that IRIIIF could be used to provide long-term-caching of problematic TIFF files.
There are two major types of TIFF files that are relevant here: Strip Based and Tile Based.
(There are other states that may be significant but are not directly relevant here, such as seekable, sequential, and pyramidal TIFFs.)
Most TIFF encoders produce Strip Based TIFFs, but those become slower to read as the file gets larger.
Having IRIIIF convert Strip Based TIFFs into Tile Based TIFFs in a long-term-cache should result in a faster load and processing time for/by Cantaloupe.
This should help address major performance problems.
This long-term-cache can then be used by the Cantaloupe server to create its short-term-cache of images.
This long-term-cache would also allow us to avoid modifying the source image in any way.
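Before converting anything, IRIIIF would need to tell the two layouts apart. The sketch below is one possible way to do that in Python, by reading the tags of the first image directory: TileWidth (tag 322) indicates a Tile Based file and StripOffsets (tag 273) a Strip Based one, per the TIFF 6.0 specification. The function name is hypothetical, and the sketch assumes a classic (non-BigTIFF) file and only inspects the first image.

```python
import struct

TILE_WIDTH = 322     # TIFF 6.0 tag: TileWidth
STRIP_OFFSETS = 273  # TIFF 6.0 tag: StripOffsets


def tiff_layout(data: bytes) -> str:
    """Classify the first image of a classic TIFF as 'tiled', 'striped', or 'unknown'."""
    if data[:2] == b"II":
        order = "<"  # little-endian
    elif data[:2] == b"MM":
        order = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")

    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic 42) is handled in this sketch")

    # The IFD starts with a 2-byte entry count, followed by 12-byte entries
    # whose first 2 bytes are the tag number.
    (count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    tags = set()
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        (tag,) = struct.unpack(order + "H", data[start:start + 2])
        tags.add(tag)

    if TILE_WIDTH in tags:
        return "tiled"
    if STRIP_OFFSETS in tags:
        return "striped"
    return "unknown"
```

Only files reported as `striped` would need a converted copy in the long-term-cache; the conversion itself would be done by an imaging tool or library and is not shown here.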
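We can also consider creating a Pyramidal TIFF in the long-term-cache.
A Pyramidal TIFF stores the image at several commonly used resolutions to allow for more efficient zooming.
From the Cantaloupe documentation:
"For efficient deep zooming, TIFF images need to be pyramidal, and each level of the pyramid must be tiled."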
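To give a feel for what a pyramid contains, here is a small Python sketch that lists the resolutions a Pyramidal TIFF might hold. It assumes the common convention of halving each dimension per level until the image fits within a single tile; the function name and the 256-pixel default tile size are assumptions for illustration, not settings taken from Cantaloupe or IRIIIF.

```python
def pyramid_levels(width: int, height: int, tile_size: int = 256) -> list[tuple[int, int]]:
    """List (width, height) for each pyramid level, from full size downwards."""
    levels = [(width, height)]
    while max(width, height) > tile_size:
        # Halve each dimension, rounding up so that no pixel row or column is lost.
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        levels.append((width, height))
    return levels
```

For a 4000 x 3000 source this gives five levels, ending at 250 x 188, so a zoomed-out request can be served from a small level instead of decoding the full-size image.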
Stop Managing DSpace
We are not using DSpace 7.
DSpace 7 now has its own IIIF server.
Why should IRIIIF manage DSpace now?
Is there any reason to do this?
It may or may not be a good idea to simplify IRIIIF by removing the DSpace functionality and letting DSpace provide its IIIF support directly.
This would save on network traffic and complexity.
I would imagine that this would improve performance and increase maintainability.
This would either require clients to interact directly with DSpace's IIIF functionality, or Ingress could be used to mask the server.
Pass DSpace Through
Rather than removing the DSpace functionality entirely, IRIIIF could just act as a pass-through where appropriate.
Anything that IRIIIF needs to handle can be handled.
Anything that can go straight through to DSpace can just be routed.
The remote servers therefore only need to talk to IRIIIF and do not have to choose between DSpace and IRIIIF.
No masking via Ingress is needed.
This allows for some simplification and some improvement in maintainability, but not as much as completely pulling DSpace out.
This still incurs unnecessary network traffic due to the routing of the data (passing through) and is sub-optimal compared to going directly to DSpace.
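The pass-through decision itself could be as small as the following Python sketch. Everything in it is hypothetical: the `/dspace/` path prefix, the DSpace base URL, and the `route` function are placeholders for illustration, and the real rules for what IRIIIF must handle itself would have to come from the groomed requirements.

```python
# Hypothetical values, for illustration only; the real deployment will differ.
DSPACE_IIIF_BASE = "https://dspace.example.org/iiif"
DSPACE_PREFIX = "/dspace/"


def route(path: str) -> tuple[str, str]:
    """Decide whether IRIIIF handles a request itself or passes it through to DSpace."""
    if path.startswith(DSPACE_PREFIX):
        # Nothing for IRIIIF to add: forward the remainder of the path unchanged.
        return ("forward", DSPACE_IIIF_BASE + "/" + path[len(DSPACE_PREFIX):])
    # Everything else still needs IRIIIF's own handling.
    return ("handle", path)
```

The remote servers only ever see IRIIIF's address; the extra hop on every forwarded request is the network cost described above.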
Metadata into XMP Rather than EXIF, et al.
The Cantaloupe documentation regarding the metadata shows that only XMP is well supported.
We could use long-term-caching to cache this metadata if we convert any metadata that is not in XMP into XMP.
Anything already in XMP could either be passed through or cached along with the converted data.
Metadata into XMP Rather than EXIF, et al., but not in IRIIIF
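Cantaloupe itself documents ways to modify XMP data.
It could be that Cantaloupe also provides ways to do the conversion into XMP.
As noted in the documentation, Cantaloupe also supports caching of this XMP metadata:
"After metadata is initially read from a source image, it may be cached, in which case subsequent requests will read it from the cache rather than the source image. In this case, changes to the source image's metadata will not be reflected in the application until the cached metadata becomes invalid and is re-read. If you need to change a source image's metadata, you should manually purge any cached content relating to it afterwards."
Cantaloupe needs to be investigated further to determine whether it can convert metadata into XMP for us.
If so, having Cantaloupe do this would save a lot of programming and maintenance effort.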
Use Delegates Instead
Cantaloupe supports the use of delegates.
These can be written in languages like Ruby or Java.
It may be that some or all of the functionality provided by IRIIIF could be removed and instead implemented as a delegate directly in Cantaloupe.