Skip to content

Conversation

@fintelia
Copy link
Contributor

@fintelia fintelia commented Dec 2, 2025

This attempts to simplify the hooks logic/handling inside ImageReader by having it always operate on file extensions. As a result, specifying a built-in format now still attempts to dispatch to hooks if there's one registered for the format.

For formats with multiple extensions, we always default to the first listed format in ImageFormat::extensions_str. This has been adjusted to be .jpg, .tiff, and .pnm respectively.

Internally, the format is tracked as a Cow<'static, OsStr> which lets us avoid allocations when provided built-in formats.

@fintelia
Copy link
Contributor Author

fintelia commented Dec 2, 2025

I suppose if we make this change we might also consider changing guess_format to return a file extension

@197g
Copy link
Member

197g commented Dec 2, 2025

It's a little bit awkward for formats with more than one potential extension, choosing the first normative one is quite confusing as to which hook is being called precisely. If we have the first entry be equivalent to the format then why don't way say that is the format to avoid the complications of a involving a very open type like OsStr, which is also intertwined with ffi and not as fundamental as, say, str, especially for non-std use cases? The upside over the enum seems more like a trade-off, and I'm not sure it is a positive one to outside users.

Hopefully no one would register different decoders for .jpg and .jpeg and then be upset at the result...

I believe we rather fix that by letting you register hooks for ImageFormat with semantics of that hooking all our known file extensions plus an explicit request for that format. That would avoid the side step through a normative extension for users. Of course we should have const fn ImageFormat::extension_str() for the registration in either case to codify our statement.

To argue a new type, fewer meaningless representation holes, i.e. possible endings that all represent unknown files, and more possible specific methods come to mind. The PR demonstrates something important though, that we both think of known extensions to be treated as known formats in hooks and decoder selection. For sake of a sketch let's say we name that ImageKind (obv. confusing but TBD).

impl ImageKind {
    fn from_extension(_: &OsStr) -> Self {} // not a Result
    fn format(&self) -> Option<ImageFormat>;
    fn extension(&self) -> &OsStr; // the constructed extension or the normative one
}

impl From<ImageFormat> for ImageKind {}

// Probably we should not have PartialEq? Compare extension or not?
// ImageKind::from_extension("png") == ImageKind::from_extension("PNG")
// ImageKind::from_extension("jpg") == ImageKind::from_extension("jpeg")

So, more or less the current internal but when we recognize the extension we have both an extension and the corresponding kind. Then we have no problem returning that from guess_format and taking it as an argument for hook registration.

As a result, specifying a built-in format now still attempts to dispatch to hooks if there's one registered for the format.

I think that re-opens the discussion around the use cases for hooks a bit. The extension hooks have priority over any builtin so we do not break it with updates. But specifying a builtin format as the target of a hook implies we do have support; but the feature may not be enabled. What should we make of the situation, does the same intention apply? Or are the use cases a fallback or as specialization for cases such as pam (part of pnm)? That is the opposite priority order.

@fintelia
Copy link
Contributor Author

fintelia commented Dec 2, 2025

Yeah, that line of thinking was why I initially treated built-in formats and extensions separately.

But now I'm wondering if it matters? The only formats this would apply to is JPEG, TIFF and PNM. There's no difference for non-built-in formats and overriding the built-in formats is already an edge case. Plus newer formats generally don't create both 3 and 4 letter extensions or create a bunch of different extensions for different subtypes, so even if we add more formats in the future, I don't expect they'd have multiple extensions.

On top of that, when given ImageFormat::{Jpeg, Tiff}, we essentially have to either pick .jpg and .tiff or the reverse in order to dispatch to decoding hooks. So the results shouldn't be surprising for users even if it isn't immediately obvious which way we picked. And presumably anyone overriding the hook would want "foo.tif" and "foo.tiff" to be handled by the same decoder so it isn't even clear they'd care.

The last case is PNM. There the different extensions do have subtly different meaning. But guess-format hooks run before our built-in format detection logic so you'd be free to register your own detection to pick the individual extensions if you wanted. And .pnm is a valid extension for any of them so using it when given ImageFormat::Pnm isn't unreasonable either.

@197g
Copy link
Member

197g commented Dec 2, 2025

Maybe we best look at other systems for handling this. Compare the web stack, i.e. everything works through explicit media types. So we'd register hooks and guesses against a media type and translate extensions as well as ImageFormat into media types for everything. Furthermore we would have syntax for attribute filtering built-in.

@197g
Copy link
Member

197g commented Dec 2, 2025

Addendum: For extensions that do not match any known media type, translate them to an 'opaque' media type similar to opaque host that only matches that extension itself.

@RunDevelopment
Copy link
Contributor

This might be a stupid question, because I don't have image's entire API in my head, but what would the downsides of having a single unified type for both builtin formats and plugin formats be?

As I see it, both builtin and plugin formats have to do the same things:

  • Detection based on extension or signature.
  • Create an ImageDecoder given a reader.
  • Create an ImageEncoder given a writer.
  • Metadata (main extension, mime, can read/write).

Currently, builtin and plugin formats use different code paths to do the same thing (except for encoders+metadata, cause plugins don't support that right now), so why not have a single system that can do both? Or in other words, why couldn't we use the hook system (or something similar) for builtin formats as well?

Having a unified system seems a lot easier to me and would eliminate the current API limitations for plugin formats.

(To be specific, I'm not saying we should use the hook system exactly as is right now. I'm suggesting having a single registry for all formats (builtin and plugin).)

@197g
Copy link
Member

197g commented Dec 6, 2025

The builtin ImageFormat type is const-constructible and exhaustive in the sense of having an all() iterator. It also reflects the status of feature flags dynamically. We probably want to keep the enumeration in some form as a type but the name could be more tailored to that purpose.

@RunDevelopment
Copy link
Contributor

I think all of that would still be possible. In my mind, the global registry would just be a static Vec<FormatSpecification> (plus some lock around it) where the format specification defines all metadata + functionality. ImageFormat would just be an index into the vec.

  • Builtin formats can have constants because we can create the registry in a way such that builtin formats always get a certain index.
  • Feature flags are reflected by FormatSpec. (E.g. if the ico feature is disabled, we still define the format spec for ICO with its metadata + magic bytes and so on; it just wouldn't have a way to create encoders/decoders.)
  • Iterating all formats would just be iterating the registry.

As I see it, we wouldn't lose any functionality.

@fintelia
Copy link
Contributor Author

fintelia commented Dec 6, 2025

Formats added via hooks always operate on a BufReader<Box<dyn Read + Seek>> objects, while the built-in decoders get the actual reader implementation passed by the user. We had to do it that way to make it feasible to implement the hooks interface. But it does mean that if you already have the entire image file loaded into memory, then you get a (small) amount of overhead if you go through the hooks interface rather than directly via a built-in decoder

@RunDevelopment
Copy link
Contributor

RunDevelopment commented Dec 6, 2025

You're right, that's a difference I haven't thought of. Would that be a problem, though? As I see it, there are some upsides and some downsides. (Might have missed some stuff.)

Cons:

  • memory overhead
  • (very light) virtual function call overhead.

Pros:

  • a unified interface
  • users don't have to handle buffering anymore
  • each decoder would only have one generic instance generated (except through direct usage of decoders).

This tradeoff sounds reasonable to me, especially for more high-level interfaces like ImageReader. What do you think?

The same problem/tradeoff will likely also apply to encoders with a unified interface, since we might want to amortize the cost of virtual function calls via buffering there too.

@fintelia
Copy link
Contributor Author

fintelia commented Dec 6, 2025

I don't think it is worth it. Static dispatch for the built-in formats has been working well, and once we decide on what the external facing API should be, we should be able to hide the distinction from downstream users. Adding a bit of overhead (I measured 1-2% for PNG and I suspect TIFF could be worse) to all our high-level APIs just doesn't seem that appealing to make the internal implementation slightly simpler.

@RunDevelopment
Copy link
Contributor

The 1-2% is from decoding from an in-memory buffer, right? If so, then I would say that it's good news that the (probably) worst-case scenario got <2% slower through dynamic dispatch. I suspect that the overhead for file IO would be hard to even measure.

That said, we could use static dispatch for builtin decoders/encoders. I just wanted everything to function the same way, but this isn't a hard requirement or limitation. So performance wouldn't be a blocker for a unified system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants