-
-
Notifications
You must be signed in to change notification settings - Fork 120
IIPServer: Pass Through Mode – Fix Colourspace Interpretation Issue #279 #282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
IIPServer: Pass Through Mode – Fix Colourspace Interpretation Issue #279 #282
Conversation
This commit fixes the colourspace interpretation issue, present since pass-though was introduced last year; the colorspace interpretation issue manifests in particular for Adobe style JPEGs that are missing their APP14 marker segment which is used to denote the colorspace being used so that decoders can correctly interpret and decode the colourspace information in the image to correctly render the colors. When the APP14 marker is missing, but required based on the format of the JPEG, many decoders, particularly ones that follow colour management best practices are susceptible to rendering the colour information incorrectly, and often this results in the images being rendered with a magenta/pink and green hue! While pass-through functionality significantly improves throughput speed, by avoiding the decompression/recompression of tiles, it can result in necessary metadata being left out of the tile, leading to rendering issues. Tiles along the sides and edges of an image, are usually not affected by the colourspace rendering issue because they are incomplete tiles, and thus need to go through the regular Compressor() pathway in code which includes fully assembling the JPEG including all its metadata. This issue was found last year and a fix was implemented at that time in a local development branch, but other work commitments precluded finalizing it for a pull request. With further improvements to IIPServer including the addition of the new `Compressor::injectMetadata()` method, we now have a suitable location into which this fix can be applied. I have updated the `JPEGCompressor` class’ `injectMetadata()` method with the fix, which involves creating a `jpeg_decompress_struct` and a `jpeg_create_decompress` step to copy the already encoded coefficients from the provided `RawTile` into the output, and then adding in the necessary metadata. The decompression step does not actually cause the JPEG data to be decompressed, we simply copy the already compressed data from the input to the output, which is incredibly fast, and because we are constructing the output JPEG in a standards compliant way all the necessary metadata is being carried over from source image to destination image, including if needed, the APP14 Adobe marker. The code while slightly longer (mostly due to code comments) should also handle a range of other edge cases without needing special handling as we are just relying on the libjpeg or libturbojpeg library methods. This fix has been compared against the previous implementation of the `insertMetadata()` method and for JPEG tiles that were previously affected by this issue, the updated version of this method resolves the issue and tiles render correctly, including in Safari and Preview on macOS as well as within other colour managed software such as Photoshop.
|
Thank you very much for submitting this fix! Unfortunately, this systematically crashes (on Ubuntu 25.04) whenever I try to output a tile using passthrough mode. I haven't yet been able to track down the root cause, but I suspect it may be related to having a compress and decompress process running at the same time. In any case Valgrind gives me this (plus a lot of other similar errors):
There's also an issue I see with relying too heavily on libjpeg's internal functions. The One way to speed this up may, in fact, be to exploit what you've shown in the hex dump here: #279 (comment). It looks to me that the result of all this extra processing is simply to inject the APP14 marker at the beginning of the JPEG bitstream. If that's really the only difference, maybe we can do this without having to touch the coefficients at all. By the way, have you, in fact, tested the output JPEGs you get in Photoshop or Preview? |
Hi @ruven, thank you for your feedback, and as always for your time and dedication to this project. I am sorry to hear of the issues you encountered, as the fix had worked well for me. It is a little embarrassing that the fix doesn't seem to be working as intended on your system. I would have never thought about submitting this for review if I had encountered any issues. I have not encountered any crashes, errors or performance issues with this code in place, but of course such things are always possible, especially if there are build environment or configuration differences between systems. If it is helpful, I built and tested this on macOS Sonoma 14.7.4 using Xcode 15.3 on a system running on an Apple Silicon M2 chip, based on the ARMv8.6-A instruction set architecture (little-endian), so there could be a range of differences between the development systems, dependency versions, and tool chains that may have led to the different outcomes at runtime. Xcode defaulted to using Apple's version of Clang and the GNU C++20 dialect of C++. Regarding the library dependencies for IIPServer, I linked the build with the following library dependency versions:
I tested every iteration of the fix as I worked on it, and checked every output image produced in a range of applications including Photoshop to ensure that the image rendered correctly. I also reviewed the embedded metadata of the generated tiles. I ran IIPServer both directly via Xcode, so it would have caught and reported any runtime exceptions if they had occurred, as well as through the generated executable directly on the command line. I made sure that For most of the testing I called IIPServer via its FCGI interface, specifying the source PTIFF and tile parameters using an IIIF Image API URL. For all of the FCGI tests, I provided the
The example image I referenced in the related issue #279 and in this ticket, is part of the Silvia Sleigh Archives at the Getty Research Institute, as featured on this object page: https://www.getty.edu/research/collections/object/1NGCTP The IIIF image API endpoint for the image is as follows: To make testing straightforward, I used a Python script to assemble and perform the FCGI request, and captured and saved the generated image binary output to disk for review: #!/usr/bin/env python3
import sys
import os
# Install the `fastcgipy` client, and import it here
from fastcgipy import FCGIClient
# Define the QUERY_STRING in IIIF format for the desired input image and output derivative parameters:
qs = "IIIF=/tmp/iipimage/29bdbf6d-dbc9-495a-907c-e2827a441b94.tif/1024,1024,1024,1024/256,/0/default.jpg"
# Define the FCGI request parameters
request_params: dict[str, str] = {
"SERVER_PROTOCOL": "HTTP/1.1",
"HTTP_HOST": "http://localhost",
"REQUEST_METHOD": "GET",
"CONTENT_LENGTH": "0",
"REQUEST_URI": "",
"QUERY_STRING": qs,
}
# Define path to the socket that IIPServer is using (as specified to IIPServer via its `--bind` argument)
socket: str = "/tmp/iipimage/iipimage.socket"
# Ensure that IIPServer is running by checking that the socket file exists:
if not os.path.exists(socket):
raise RuntimeError(f"The IIPImage server socket file, '{socket}', does not exist!")
# Instantiate an instance of FCGIClient, using the socket to communicate with IIPServer
client = FCGIClient(socket)
# Make an FCGI call to the server, capturing the response
response = client.__call__(request_params)
assert isinstance(response, bytes)
# Extract the response headers and body
(headers, body) = response.split(b"\r\n\r\n")
# Print the headers and body for review purposes/debugging
print("FCGI Headers:")
print(headers)
print()
print("FCGI Body:")
print(body)
print()
# Save the response body (the target derivative) to disk for review
with open(os.path.expanduser("/tmp/iipimage/tile.jpg"), "wb+") as file:
file.write(body) As for the missing Adobe APP 14 marker segment, that is certainly one of the most important pieces of this puzzle, and I had also thought about just adding special handling to the Furthermore, from what I have seen, an Adobe APP 14 header isn't needed for every input JPEG, only JPEGs that have originally been compressed using the Adobe flavor of JPEG. While this is apparently a very common flavor of JPEG, I don't believe every source JPEG or source PTIFF using JPEG encoding will be an Adobe flavor JPEG, so the addition of that header when it is not needed may cause issues by confusing the decoder. As such, this furthered my preference for allowing the JPEG library to make its own determinations and perform any special handling it deems appropriate. However, this is only a workable solution if the performance and stability can be ensured across all test and deployment environments. As you noted the issue you experienced may be due to the possibility of the JPEG decode and encode steps overlapping, I wonder if the order of operations could be sequenced differently to solve this, or if the coefficients and intrinsic metadata can be temporarily cached into memory with the "decode" step being concluded before the JPEG encode step begins so that there is no overlap? I think a solution that utilizes the JPEG library has its advantages, primarily as it reduces the amount of special handling that needs to be managed directly within IIPServer, but given the issues you encountered, this proposed solution needs to be evaluated further to find a robust solution that both solves the issue and works perfectly every time. Perhaps special handling for APP 14 could work in this case if code was added elsewhere in IIPServer to check if the input JPEG is an Adobe flavor JPEG or not (presumably it would be sufficient to check if the source JPEG tile has an APP 14 marker segment). If an Adobe JPEG was detected, perhaps it will be sufficient to add the APP 14 marker segment to the output. I will do some experimentation to try this out. I will covert this to a draft for now, and will update it for a further review when I have had opportunity to work through the code again and to ensure it runs performantly and without issue on Ubuntu 25.04+ and as well as it did on macOS Sonoma. If you have any additional feedback in the interim, please let me know. I would be particularly interested in knowing what type of input image file(s) you were using to do the testing, how you were calling IIPServer (in debug mode via |
Sorry about the delay in getting back to you. And certainly no need to be embarrassed! Subtle bugs can sometimes crop up when doing cross-platform development. Your efforts to fix this are very much appreciated and these kinds of contributions are a vital and an often under-appreciated part of open source development. So thank you! So, you say you've tested this on Mac / ARM. That could well explain why this crashes on my Linux setup but works on yours. A possibly similar memory issue came up recently, but with FreeBSD (#278). This was also very tricky to track down! Unfortunately, I don't have a Mac / ARM machine to test on. But I presume your server infrastructure is Linux-based, so does this work on Linux. Or are your servers all Mac?
The dialect shouldn't, in theory, make a difference, but I note that gcc defaults to C++17 (iipsrv itself requires only C++11 or greater). Even if we resolve the crash issue, the performance issue still needs to be resolved. The whole point of introducing pass-through mode was to make it as fast as possible. I agree that handing things off to libjpeg would be ideal, but libjpeg / libjpeg-turbo weren't really designed for this kind of fast transcoding. So I think it's OK for iipsrv to hack the JPEG bitstream directly if necessary. Looking more closely at your code, in my testing it looks like the slowest function call is to
Something that works is just checking
I tested using FCGI using the image from the original bug report as well as some of my own encoded as JPEG-RGB. And I used the IIP API to specify a single tile from the image - the API used shouldn't make a difference as an equivalent IIIF request will map to an identical lower-level function call within iipsrv. Anyway, I'll keep looking into this ... |
Firstly, thank you Ruven and contributors for creating, maintaining and improving IIPServer over the years. This software is an incredibly useful resource for so many groups and organizations, especially in the cultural heritage community.
This commit fixes the colourspace interpretation issue described in #279, present since pass-though was introduced last year; the colorspace interpretation issue manifests in particular for Adobe style JPEGs that are missing their APP14 marker segment which is used to denote the colorspace being used so that decoders can correctly interpret and decode the colourspace information in the image to correctly render the colors.
When the APP14 marker is missing, but required based on the format of the JPEG, i.e. an Adobe JPEG, many decoders, particularly ones that follow colour management best practices are susceptible to rendering the colourspace information incorrectly, and this often results in the images being rendered with a magenta/pink and green hue!
While pass-through functionality significantly improves throughput speed by avoiding the decompression/recompression of tiles, it can result in necessary metadata being left out of the tile output image, leading to rendering issues in software that strictly adheres to colour management such as Safari, Preview and Photoshop.
This proposed fix also avoids decompression/recompression like the original pass-through functionality by copying the already compressed/encoded coefficients from the source JPEG image tile to the destination JPEG image tile, but provides an opportunity to ensure all required metadata is added to the output image.
Tiles along the sides and edges of an image, are usually not affected by the colourspace rendering issue because they are incomplete tiles, and thus are generally processed through the regular
Compressor::Compress()
pathway in code which includes fully assembling the JPEG including all its metadata. This difference originally led to the discovery that tiles which had been extracted from the source image (in our case a JPEG-compressed Pyramidal TIFF) and fed through the pass-through were missing some of their required metadata including the APP14 marker.While this issue was originally found last year, and a prototype fix was implemented at the time in a local development branch, other work commitments precluded finalizing the fix for a pull request until now. With further improvements to IIPServer including the addition of the new
Compressor::injectMetadata()
method, we now have a suitable location into which this fix can be applied.I have updated the
JPEGCompressor
class’injectMetadata()
method with the fix, which involves creating ajpeg_decompress_struct
and ajpeg_create_decompress
step to copy the already encoded coefficients from the providedrawtile.data
into the output, and then adding in all the critical metadata – the critical metadata is determined by thelibjpeg
orlibturbojpeg
library (whichever is being used) rather than by any special handling and this ensures that any special fields are applied as necessary and automatically.Crucially, the decompression step does not actually cause the JPEG data to be decompressed, we simply copy the already compressed JPEG data from the input to the output, which is incredibly fast, and because we are constructing the output JPEG in a standards compliant way using the library functions, all the necessary metadata is being carried over from source image to destination image, including if needed, the APP14 Adobe marker.
The code while slightly longer (mostly due to code comments) should also handle a range of other metadata edge cases without needing any special handling as we are just relying on the
libjpeg
orlibturbojpeg
library methods to copy the data they deem critical from the source to the destination.This fix has been compared against the previous implementation of the
insertMetadata()
method, and for JPEG tiles that were previously affected by this issue, the updated version of this method resolves the issue and tiles render correctly with correct colourspace information, including in Safari and Preview on macOS as well as within other colour managed software such as Photoshop.