Gaussian blur #2496

awxkee · 2025-06-18T16:08:41Z

Closes #986

Benchmarks

My work laptop has 10-15% measurement noise, so these numbers are only meant to show the magnitude of the difference. I also include a comparison to libblur, because it really is in a completely different execution class.

fast blur: sigma 3.0    time:   [22.242 ms 22.345 ms 22.460 ms]
                        change: [-6.7313% -5.5870% -4.4410%] (p = 0.00 < 0.05)

fast blur: sigma 7.0    time:   [22.404 ms 22.427 ms 22.453 ms]
                        change: [-9.3688% -8.5638% -7.7808%] (p = 0.00 < 0.05)

fast blur: sigma 50.0   time:   [23.251 ms 23.279 ms 23.313 ms]

gaussian blur: sigma 3.0
                        time:   [8.8081 ms 8.8672 ms 8.9305 ms]

gaussian blur: sigma 7.0
                        time:   [20.795 ms 20.959 ms 21.141 ms]

gaussian blur: sigma 50.0
                        time:   [174.65 ms 175.70 ms 176.80 ms]

libblur gaussian blur: sigma 3.0
                        time:   [1.5107 ms 1.5178 ms 1.5262 ms]

libblur gaussian blur: sigma 7.0
                        time:   [3.2371 ms 3.2688 ms 3.3020 ms]

libblur gaussian blur: sigma 50.0
                        time:   [25.141 ms 25.244 ms 25.373 ms]

libblur fast_blur exact alternative: sigma 3.0
                        time:   [2.9289 ms 2.9360 ms 2.9436 ms]

libblur fast_blur exact alternative: sigma 7.0
                        time:   [3.1053 ms 3.1424 ms 3.1801 ms]

libblur fast_blur exact alternative: sigma 50.0
                        time:   [3.1044 ms 3.1450 ms 3.1875 ms]

# Conflicts: # src/imageops/sample.rs # src/images/dynimage.rs

# Conflicts: # benches/blur.rs # src/imageops/sample.rs # src/images/dynimage.rs

src/imageops/filter_1d.rs

Shnatsel · 2025-06-18T23:28:21Z

Thank you!

I wonder, are the benchmark numbers for libblur from single-threaded or multi-threaded execution?

awxkee · 2025-06-18T23:32:41Z

Single-threaded, as far as I can tell.

fintelia · 2025-06-19T00:09:08Z

src/images/dynimage.rs

@@ -854,8 +855,8 @@ impl DynamicImage {
    /// This method typically assumes that the input is scene-linear light.
    /// If it is not, color distortion may occur.
    #[must_use]
-    pub fn blur(&self, sigma: f32) -> DynamicImage {
-        dynamic_map!(*self, ref p => imageops::blur(p, sigma))
+    pub fn blur(&self, kernel_size: usize, sigma: f32) -> DynamicImage {


We'll need to make a decision on whether we want to do this API change. If so, this PR cannot be merged until we start working on the next major release.

Do other libraries generally require that you specify the kernel size alongside the blur amount?

A pure analytical Gaussian filter often assumes that you want full control over it. libblur and OpenCV require a kernel size and also support asymmetry.

See here.

I'm OK with removing the kernel size and computing the correct kernel size from sigma if you think that's a better fit. But I'm not OK with returning to the old behaviour, which is wrong.

If it needs to wait for something, I'd rather go ahead and close this PR. I don't have any plans to maintain rotting PRs.

An easy-to-use wrapper that only requires the user to specify the sigma and computes the kernel size by itself would be nice. So it'd be two functions, e.g. blur() and blur_advanced().
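A minimal sketch of that two-function split, reduced to kernel construction only. The names `kernel_size_for_sigma`, `blur_kernel_simple` and `blur_kernel_advanced` are hypothetical, and the "cover roughly ±3 sigma" rule is an assumption for illustration, not the formula this PR uses.

```rust
/// Hypothetical helper: pick an odd kernel size covering about +/- 3 sigma.
/// The 3-sigma rule is an assumption of this sketch.
fn kernel_size_for_sigma(sigma: f32) -> usize {
    let radius = (3.0 * sigma).ceil().max(1.0) as usize;
    2 * radius + 1
}

/// Samples a Gaussian at integer offsets and normalizes the weights to sum to 1.
fn gaussian_kernel(size: usize, sigma: f32) -> Vec<f32> {
    assert!(size % 2 == 1, "kernel size must be odd");
    assert!(sigma.is_finite() && sigma > 0.0, "sigma must be positive and finite");
    let center = (size / 2) as f32;
    let mut weights: Vec<f32> = (0..size)
        .map(|i| {
            let x = i as f32 - center;
            (-(x * x) / (2.0 * sigma * sigma)).exp()
        })
        .collect();
    let sum: f32 = weights.iter().sum();
    for w in &mut weights {
        *w /= sum;
    }
    weights
}

/// Hypothetical "advanced" entry point: the caller controls both parameters.
fn blur_kernel_advanced(kernel_size: usize, sigma: f32) -> Vec<f32> {
    gaussian_kernel(kernel_size, sigma)
}

/// Hypothetical "simple" entry point: only sigma is given, the size is derived.
fn blur_kernel_simple(sigma: f32) -> Vec<f32> {
    blur_kernel_advanced(kernel_size_for_sigma(sigma), sigma)
}
```

With this shape the simple function is a thin wrapper, so both entry points share one code path and only differ in who chooses the kernel size.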

We're already preparing a major version bump on the main branch so I'd do the change now. However, another idea is given by your question regarding kernel sizes. Small kernel sizes are faster (just see our cutoff point for ring queue in this). Considering most of our users we should at least suggest the common choices.

What if we were to introduce a 'BlurKernel' type that wraps those choices and gives a few constructors / static constants of common cases? Then indeed blur_{by,with}(BlurKernel) may make sense with blur(f32) yielding a somewhat common default.

I was actually confused by the sigma vs kernel size distinction.

I just want a high-level API that I can call that roughly matches what I'd get from "blur radius" in GIMP, but I also recognize that there are use cases for more direct control of the parameters. So I'd prefer an easy-to-use version as blur() and a more advanced API for people who need it.

I did build such a parameter builder in libblur. It might still be overkill, but I'd expect anyone plugging in 70K lines of SIMD code to at least do a bit of investigation into how to use the API.

For a more general-purpose implementation, I agree that a GIMP-style "blur radius" is preferable.

I'm in favor of a single argument that behaves like blur radius. That's what I've always assumed that sigma did

What if we were to introduce a 'BlurKernel' type that wraps those choices and gives a few constructors / static constants of common cases? Then indeed blur_{by,with}(BlurKernel) may make sense with blur(f32) yielding a somewhat common default.

I think providing something like image.blur(3) is enough for general-purpose use.
Adding methods like image.blur3, image.blur5, image.blur7 gives a strong impression of API bloat.
It might make some sense if we were doing something truly interesting in these implementations, just to mark that they may yield results different from what you'd expect. But for now all implementations are the same: even when you hit the ring queue path it gives you the same result, just computed in a different way.

The argument radius/sigma makes perfect sense to me, too. So agreed, the simple function might as well accept a simple integer argument. I'm still not entirely sure we should remove the interaction entirely for an advanced function, though. Deriving sigma from the radius doesn't make much sense either. For small values (3, 5, 7) the influence of normalization is quite high, which people will have different preferences for.
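To make the normalization point concrete, here is a small sketch that measures how much of a sampled Gaussian's weight a truncated kernel keeps before normalization. The helper names are hypothetical, and using a kernel about 8 sigma wide as the "untruncated" reference is an assumption of the sketch.

```rust
/// Sum of unnormalized Gaussian samples at integer offsets for an odd `size`.
fn raw_gaussian_sum(size: usize, sigma: f32) -> f32 {
    assert!(size % 2 == 1, "kernel size must be odd");
    let center = (size / 2) as f32;
    (0..size)
        .map(|i| {
            let x = i as f32 - center;
            (-(x * x) / (2.0 * sigma * sigma)).exp()
        })
        .sum()
}

/// Fraction of the weight kept by a `size`-tap kernel, relative to a wide
/// reference kernel (about +/- 8 sigma, an assumption of this sketch).
fn retained_mass(size: usize, sigma: f32) -> f32 {
    let wide = 2 * (8.0 * sigma).ceil() as usize + 1;
    raw_gaussian_sum(size, sigma) / raw_gaussian_sum(wide.max(size), sigma)
}

/// Center weight after normalization; it grows as more of the tail is cut off.
fn center_weight(size: usize, sigma: f32) -> f32 {
    1.0 / raw_gaussian_sum(size, sigma)
}
```

For sigma = 1.0, a 3-tap kernel keeps noticeably less of the weight than a 7-tap one, so normalization scales its remaining taps up more strongly. That is the effect people may have different preferences about.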

But also importantly, ImageMagick has it as an option (https://imagemagick.org/script/command-line-options.php#blur). I'm thinking we could have a struct and dispatch internally:

    struct BlurKernel {
        size: (u32, u32),
        sigma: (f32, f32),
    }

    impl BlurKernel {
        // Corresponds to calling the simple function with (3).
        pub const THREE: … // Due to float-const this must be filled manually..

        /// The isotropic case.
        pub fn from_radius(sz: u32) -> Self { … }

        /// Document the (1.0, 1.0) default and what anisotropy refers to.
        pub fn with_sigma(self, (x, y): (f32, f32)) -> Self { … }
    }

I think that is straightforward enough, but please do argue if you think this is API bloat. The anisotropy case in particular, I think there's enough reason to support it if it is one additional parameter to an intermediate/expert-level function call.
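A compilable sketch of that shape, for discussion only. The concrete values are assumptions: `THREE` is taken to be a 3x3 kernel with the (1.0, 1.0) default sigma, and `from_radius` is assumed to use the same default. The accessors and `is_isotropic` are hypothetical additions so the example can be exercised.

```rust
/// Hypothetical parameter type following the shape proposed above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BlurKernel {
    size: (u32, u32),
    sigma: (f32, f32),
}

impl BlurKernel {
    /// Assumed preset: 3x3 kernel with the (1.0, 1.0) default sigma.
    pub const THREE: BlurKernel = BlurKernel {
        size: (3, 3),
        sigma: (1.0, 1.0),
    };

    /// The isotropic case: same size on both axes, default sigma (assumed).
    pub fn from_radius(sz: u32) -> Self {
        BlurKernel {
            size: (sz, sz),
            sigma: (1.0, 1.0),
        }
    }

    /// Overrides sigma per axis; different x and y values give an anisotropic blur.
    pub fn with_sigma(self, (x, y): (f32, f32)) -> Self {
        BlurKernel {
            sigma: (x, y),
            ..self
        }
    }

    pub fn size(&self) -> (u32, u32) {
        self.size
    }

    pub fn sigma(&self) -> (f32, f32) {
        self.sigma
    }

    /// True when both axes use the same size and sigma.
    pub fn is_isotropic(&self) -> bool {
        self.size.0 == self.size.1 && self.sigma.0 == self.sigma.1
    }
}
```

A call site would then read something like `BlurKernel::from_radius(5).with_sigma((2.0, 0.5))`, which keeps anisotropy to one extra builder call on the expert path.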

197g

Do you want fixed-point implementations for u8 or u16 instead of f32? Or is pure f32 convolution all that is needed?

You matched the style as discussed in the codec case, and to me that is neat enough.

But also this seems to be an implementation choice, rather than an API choice. At least when we cast back to the underlying storage type I'd expect that what you're referring to as fixed point is actually exact with regards to rounding ("as-if floating point"). Then it should not matter to the user and the choice should be as fast as possible. However, that is then also a discussion point for the future instead as we can dispatch on those specific types (I::Pixel: 'static allows TypeId).
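As a rough illustration of what "fixed point that behaves as if it were floating point" could look like, here is a sketch of a 1D convolution with Q0.15 weights next to an f32 reference. It uses u8 samples for brevity, clamps at the edges, and folds the quantization residue into the center tap. All of those are assumptions of this sketch, not a description of the PR's implementation.

```rust
const Q: u32 = 15;

/// Quantizes normalized f32 weights to Q0.15 so they sum to exactly 1 << 15.
/// Folding the rounding residue into the center tap is an assumption here.
fn quantize_q15(weights: &[f32]) -> Vec<i32> {
    assert!(!weights.is_empty(), "kernel must not be empty");
    let one = 1i32 << Q;
    let mut q: Vec<i32> = weights
        .iter()
        .map(|&w| (w * one as f32).round() as i32)
        .collect();
    let residue = one - q.iter().sum::<i32>();
    let center = q.len() / 2;
    q[center] += residue;
    q
}

/// Fixed-point 1D convolution over u8 samples with clamped edges.
fn convolve_q15(src: &[u8], weights: &[i32]) -> Vec<u8> {
    let half = (weights.len() / 2) as isize;
    let last = src.len() as isize - 1;
    (0..src.len() as isize)
        .map(|i| {
            let acc: i32 = weights
                .iter()
                .enumerate()
                .map(|(k, &w)| {
                    let idx = (i + k as isize - half).clamp(0, last) as usize;
                    w * src[idx] as i32
                })
                .sum();
            // Round to nearest, then drop the 15 fractional bits.
            ((acc + (1 << (Q - 1))) >> Q).clamp(0, 255) as u8
        })
        .collect()
}

/// Floating-point reference with the same edge handling.
fn convolve_f32(src: &[u8], weights: &[f32]) -> Vec<u8> {
    let half = (weights.len() / 2) as isize;
    let last = src.len() as isize - 1;
    (0..src.len() as isize)
        .map(|i| {
            let acc: f32 = weights
                .iter()
                .enumerate()
                .map(|(k, &w)| {
                    let idx = (i + k as isize - half).clamp(0, last) as usize;
                    w * src[idx] as f32
                })
                .sum();
            acc.round().clamp(0.0, 255.0) as u8
        })
        .collect()
}
```

Because the quantized weights sum to exactly 1 << 15, a constant input passes through unchanged. Whether the two paths agree bit for bit on arbitrary input depends on the kernel, which is the question of exactness raised above.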

If special cases are wanted, do you expect them to be bit-exact?

Not sure. It doesn't seem necessary but appealing for the special casing of types / fixed-point. Less so for special casing for kernel sizes but also see discussion on those below.

197g · 2025-06-19T13:35:49Z

src/imageops/filter_1d.rs

+
+    let mut start_ky = column_kernel_len / 2 + 1;
+
+    start_ky %= column_kernel_len;


This one is odd. It just catches the case of column_kernel_len == 1 but that rather seems like a very special case on its own which doesn't need column buffers at all to be honest.

Currently, the implementation assumes that anisotropy is an acceptable condition, and there is no special implementation case that handles the column as identity and the row as convolution. So the implementation assumes that column_kernel_len == 1 and row_kernel_len == 5 is just fine.

I could technically drop any possibility of anisotropy if you see that as a better fit.

I was just confused by the way those lines are written. It suggests that column_kernel_len == 1 is an even more special case than anisotropy itself, since no other case requires the modulo operation and a length of 1 does not really require any intermediate buffers.

That said, a comment is a fine resolution to this to me. There are bigger fish to fry.
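For reference, the arithmetic behind those two lines can be isolated in a tiny function. This is only a restatement of the reviewed expression, wrapped in a hypothetical `start_ky` function, with the non-zero length requirement added as an explicit assertion.

```rust
/// Mirrors the two lines under review: `len / 2 + 1`, wrapped into `0..len`.
fn start_ky(column_kernel_len: usize) -> usize {
    assert!(column_kernel_len > 0, "kernel length must be non-zero");
    (column_kernel_len / 2 + 1) % column_kernel_len
}
```

For odd lengths of 3 and above, `len / 2 + 1` is already smaller than `len`, so the modulo changes nothing. Only for a length of 1 does the unwrapped value (1) fall outside the valid range and get wrapped back to 0.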

src/imageops/filter_1d.rs

197g · 2025-06-19T13:40:06Z

src/imageops/filter_1d.rs

+    if scanned_row_kernel.is_empty() || scanned_column_kernel.is_empty() {
+        for (dst, src) in destination.iter_mut().zip(image.iter()) {
+            *dst = *src;
+        }
+        return Ok(());
+    }


An empty kernel is an error for convolution, the no-op case is a [1.0] kernel. So this is a leniency contract, right? Should be documented in the function signature.

There is a mistake here: this is a no-op only when both kernels are [1.0]. Currently, the implementation assumes that anisotropy is an acceptable condition.
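A small sketch of the distinction being discussed, assuming a strict contract where an empty kernel is rejected rather than treated as a copy. The `separable` function, the `KernelError` type and the clamped edge handling are all hypothetical and only serve to show that [1.0] on both axes is the identity, while [1.0] on one axis alone is an anisotropic filter.

```rust
#[derive(Debug, PartialEq)]
enum KernelError {
    Empty,
}

/// 1D convolution with clamped edges; `kernel` must be non-empty.
fn convolve_1d(src: &[f32], kernel: &[f32]) -> Vec<f32> {
    let half = (kernel.len() / 2) as isize;
    let last = src.len() as isize - 1;
    (0..src.len() as isize)
        .map(|i| {
            kernel
                .iter()
                .enumerate()
                .map(|(k, &w)| {
                    let idx = (i + k as isize - half).clamp(0, last) as usize;
                    w * src[idx]
                })
                .sum::<f32>()
        })
        .collect()
}

/// Separable filter over a row-major image: rows first, then columns.
fn separable(
    image: &[f32],
    width: usize,
    height: usize,
    row_kernel: &[f32],
    column_kernel: &[f32],
) -> Result<Vec<f32>, KernelError> {
    // Strict contract (assumption of this sketch): empty kernels are an error.
    if row_kernel.is_empty() || column_kernel.is_empty() {
        return Err(KernelError::Empty);
    }
    assert!(width > 0 && height > 0, "image must be non-empty");
    assert_eq!(image.len(), width * height, "buffer size mismatch");

    // Horizontal pass.
    let mut tmp = Vec::with_capacity(image.len());
    for row in image.chunks(width) {
        tmp.extend(convolve_1d(row, row_kernel));
    }

    // Vertical pass.
    let mut out = vec![0.0; image.len()];
    for x in 0..width {
        let column: Vec<f32> = (0..height).map(|y| tmp[y * width + x]).collect();
        for (y, v) in convolve_1d(&column, column_kernel).into_iter().enumerate() {
            out[y * width + x] = v;
        }
    }
    Ok(out)
}
```

Under this contract, the lenient "copy the input" behaviour in the diff would have to be documented as a deliberate extension, since convolution itself has no meaning for an empty kernel.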

src/imageops/filter_1d.rs

197g · 2025-06-19T13:53:59Z

src/images/dynimage.rs

@@ -854,8 +855,8 @@ impl DynamicImage {
    /// This method typically assumes that the input is scene-linear light.
    /// If it is not, color distortion may occur.
    #[must_use]
-    pub fn blur(&self, sigma: f32) -> DynamicImage {
-        dynamic_map!(*self, ref p => imageops::blur(p, sigma))
+    pub fn blur(&self, kernel_size: usize, sigma: f32) -> DynamicImage {


We're already preparing a major version bump on the main branch so I'd do the change now. However, another idea is given by your question regarding kernel sizes. Small kernel sizes are faster (just see our cutoff point for ring queue in this). Considering most of our users we should at least suggest the common choices.

What if we were to introduce a 'BlurKernel' type that wraps those choices and gives a few constructors / static constants of common cases? Then indeed blur_{by,with}(BlurKernel) may make sense with blur(f32) yielding a somewhat common default.

awxkee · 2025-06-19T21:17:56Z

A ton of WebP tests started failing today. I'm not sure if this PR is related to that.

Shnatsel · 2025-06-19T21:39:30Z

That should be the fixes shipped in https://crates.io/crates/image-webp v0.2.3 altering the enshrined hashes.

In the long run something like image-webp's pixel difference threshold should be implemented, or image-rs/image-webp#146 should be fixed at which point we could enshrine hashes again.
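For illustration, a pixel difference threshold could be as simple as the following sketch. The function names are hypothetical and this is not image-webp's actual test helper. It compares raw 8-bit sample buffers and accepts them when no channel differs by more than a chosen amount.

```rust
/// Largest absolute per-channel difference between two buffers,
/// or `None` when the buffers have different lengths.
fn max_channel_diff(expected: &[u8], actual: &[u8]) -> Option<u8> {
    if expected.len() != actual.len() {
        return None;
    }
    Some(
        expected
            .iter()
            .zip(actual)
            .map(|(&a, &b)| a.abs_diff(b))
            .max()
            .unwrap_or(0),
    )
}

/// True when the buffers have equal length and differ by at most `threshold`
/// in every channel.
fn images_match(expected: &[u8], actual: &[u8], threshold: u8) -> bool {
    matches!(max_channel_diff(expected, actual), Some(d) if d <= threshold)
}
```

Unlike an enshrined hash, such a check tolerates small decoder rounding changes while still catching real regressions, at the cost of keeping reference pixel data around.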

But for now they should probably just be regenerated. I'll open a PR with that against the main branch.

awxkee · 2025-06-20T11:03:18Z

I realized that I benchmarked the GenericImageView implementation instead of DynamicImage, so here are the updated numbers.

Benchmark


fast blur: sigma 3.0    time:   [24.121 ms 24.321 ms 24.525 ms]
                        change: [+4.5411% +5.6756% +6.8759%] (p = 0.00 < 0.05)
                        Performance has regressed.

fast blur: sigma 7.0    time:   [23.648 ms 23.834 ms 24.030 ms]
                        change: [-12.527% -5.4979% -0.7447%] (p = 0.07 > 0.05)
                        No change in performance detected.

fast blur: sigma 50.0   time:   [25.422 ms 26.036 ms 26.734 ms]
                        change: [+6.1714% +8.8062% +12.002%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) high mild
  8 (8.00%) high severe

gaussian blur: sigma 3.0
                        time:   [9.1108 ms 9.1780 ms 9.2486 ms]
                        change: [-2.8948% -1.7866% -0.6687%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild

gaussian blur: sigma 7.0
                        time:   [21.107 ms 21.351 ms 21.645 ms]
                        change: [-2.4164% -1.0423% +0.5379%] (p = 0.18 > 0.05)
                        No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
  13 (13.00%) high mild
  1 (1.00%) high severe

Benchmarking gaussian blur: sigma 50.0: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 17.7s, or reduce sample count to 20.
gaussian blur: sigma 50.0
                        time:   [172.45 ms 173.46 ms 174.53 ms]
                        change: [-3.2896% -2.4175% -1.5235%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  7 (7.00%) high mild
  1 (1.00%) high severe

gaussian blur (dynamic image): sigma 3.0
                        time:   [5.2648 ms 5.3129 ms 5.3641 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

gaussian blur (dynamic image): sigma 7.0
                        time:   [12.699 ms 12.860 ms 13.099 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

Benchmarking gaussian blur (dynamic image): sigma 50.0: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.7s, or reduce sample count to 50.
gaussian blur (dynamic image): sigma 50.0
                        time:   [99.216 ms 100.11 ms 101.01 ms]

awxkee added 6 commits June 18, 2025 16:53

Gaussian blur

83cef78

Gaussian blur

d706452

Gaussian blur

159d03e

Merge remote-tracking branch 'origin/main'

7fa1ba8

# Conflicts: # src/imageops/sample.rs # src/images/dynimage.rs

Gaussian blur

393b4d0

Merge branch 'main' into gauss_b

29f05ca

# Conflicts: # benches/blur.rs # src/imageops/sample.rs # src/images/dynimage.rs

awxkee force-pushed the gauss_b branch from e3aa92d to b0e3aaa Compare June 18, 2025 16:17

awxkee mentioned this pull request Jun 18, 2025

blur function is too slow #986

Open

awxkee commented Jun 18, 2025

awxkee force-pushed the gauss_b branch from b0e3aaa to dcba181 Compare June 18, 2025 17:46

fintelia reviewed Jun 19, 2025

197g reviewed Jun 19, 2025

awxkee force-pushed the gauss_b branch from dcba181 to d3ae62a Compare June 19, 2025 19:43

awxkee requested a review from 197g June 19, 2025 19:44

awxkee force-pushed the gauss_b branch 3 times, most recently from 6f1ac04 to 6dc369e Compare June 19, 2025 21:05

Gaussian blur

9a1b1ec

awxkee force-pushed the gauss_b branch from 6dc369e to 9a1b1ec Compare June 19, 2025 22:14

awxkee added 2 commits June 20, 2025 07:43

Merge branch 'image-rs:main' into main

a553421

Merge branch 'main' into gauss_b

61667e1

awxkee force-pushed the gauss_b branch 3 times, most recently from 338e71a to b564431 Compare June 20, 2025 07:34

Disable subnormals,NaNs and Inf on sigma

50f329b

awxkee force-pushed the gauss_b branch from b564431 to 50f329b Compare June 20, 2025 11:04

Change blurring u16 to Q0.15 fixed point

97f9195

