Enable Avx2 optimizations on Porter-Duff operations. #2328

JimBobSquarePants · 2023-01-26T07:22:37Z

JimBobSquarePants
Jan 26, 2023
Maintainer

Given that #1433 is not achievable within a reasonable timescale for v3, and might well be better implemented using static virtual members in interfaces, I'd like to propose that we enhance the existing Vector4-based Porter-Duff implementations with Vector256<float> equivalents.

This should give us up to a 2x performance improvement over the existing implementation on supported devices, since each Vector256<float> holds two RGBA pixels. We badly need that if we are ever to release a v1 of ImageSharp.Drawing.

We should be able to implement each separable method in the same fashion as the existing methods. The pixel blender implementations can call these.

For example, the existing Over method...

public static Vector4 Over(Vector4 destination, Vector4 source, Vector4 blend)
{
    // calculate weights
    float blendW = destination.W * source.W;
    float dstW = destination.W - blendW;
    float srcW = source.W - blendW;

    // calculate final alpha
    float alpha = dstW + source.W;

    // calculate final color
    Vector4 color = (destination * dstW) + (source * srcW) + (blend * blendW);

    // unpremultiply
    color /= MathF.Max(alpha, 0.001F);
    color.W = alpha;

    return color;
}

...could be implemented in the following manner:

// Processes two RGBA pixels at once: lanes 0-3 hold the first pixel, lanes 4-7 the second.
// Every float operation used here is an AVX instruction, so Avx is used throughout;
// the Avx2 class only exposes them through inheritance.
public static Vector256<float> Over(Vector256<float> destination, Vector256<float> source, Vector256<float> blend)
{
    // Bits 3 and 7 set: take W from alpha, keep X, Y, Z from color.
    const byte BlendAlphaControl = 0b_10_00_10_00;

    // Broadcast element 3 (W) across each 128-bit lane.
    const byte ShuffleAlphaControl = 0b_11_11_11_11;

    // calculate weights
    Vector256<float> sW = Avx.Shuffle(source, source, ShuffleAlphaControl);
    Vector256<float> dstW = Avx.Shuffle(destination, destination, ShuffleAlphaControl);
    Vector256<float> blendW = Avx.Multiply(sW, dstW);

    dstW = Avx.Subtract(dstW, blendW);
    Vector256<float> srcW = Avx.Subtract(sW, blendW);

    // calculate final alpha
    Vector256<float> alpha = Avx.Add(dstW, sW);

    // calculate final color
    Vector256<float> color = Avx.Add(
        Avx.Add(Avx.Multiply(destination, dstW), Avx.Multiply(source, srcW)),
        Avx.Multiply(blend, blendW));

    // unpremultiply, guarding against division by zero
    Vector256<float> alphaEpsilon = Avx.Max(alpha, Vector256.Create(0.001F));
    color = Avx.Divide(color, alphaEpsilon);
    return Avx.Blend(color, alpha, BlendAlphaControl);
}
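
For reference, both versions compute the same per-pixel arithmetic, which can be written out as below. The symbols are my own shorthand rather than anything from the codebase: D, S and B stand for the destination, source and blend vectors, and d_a, s_a for the destination and source alpha (W) components.

```latex
\begin{aligned}
w_b &= d_a \, s_a, \qquad w_d = d_a - w_b, \qquad w_s = s_a - w_b \\
\alpha &= w_d + s_a = d_a + s_a - d_a \, s_a \\
C_{xyz} &= \frac{w_d \, D_{xyz} + w_s \, S_{xyz} + w_b \, B_{xyz}}{\max(\alpha,\ 0.001)}, \qquad C_w = \alpha
\end{aligned}
```

The Vector256<float> version evaluates these expressions for two pixels at once, which is why the alpha values are first broadcast across each 128-bit lane.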

For someone with a good grasp of the SIMD APIs this should not be a huge undertaking.
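
As a rough sketch of how a pixel blender might call the new method, something like the following could work. PorterDuffSketch and BlendOver are hypothetical names, not existing ImageSharp APIs, and passing the source as the blend argument (a plain "normal" blend) is an assumption made for illustration. The two Over overloads are repeated only so the snippet compiles on its own.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical container type; the name is not taken from ImageSharp.
public static class PorterDuffSketch
{
    // Scalar path: the existing Vector4 method quoted above.
    public static Vector4 Over(Vector4 destination, Vector4 source, Vector4 blend)
    {
        float blendW = destination.W * source.W;
        float dstW = destination.W - blendW;
        float srcW = source.W - blendW;
        float alpha = dstW + source.W;

        Vector4 color = (destination * dstW) + (source * srcW) + (blend * blendW);
        color /= MathF.Max(alpha, 0.001F);
        color.W = alpha;
        return color;
    }

    // Vector path: the proposed method, two RGBA pixels per call.
    public static Vector256<float> Over(Vector256<float> destination, Vector256<float> source, Vector256<float> blend)
    {
        const byte BlendAlphaControl = 0b_10_00_10_00;
        const byte ShuffleAlphaControl = 0b_11_11_11_11;

        Vector256<float> sW = Avx.Shuffle(source, source, ShuffleAlphaControl);
        Vector256<float> dstW = Avx.Shuffle(destination, destination, ShuffleAlphaControl);
        Vector256<float> blendW = Avx.Multiply(sW, dstW);
        dstW = Avx.Subtract(dstW, blendW);
        Vector256<float> srcW = Avx.Subtract(sW, blendW);
        Vector256<float> alpha = Avx.Add(dstW, sW);

        Vector256<float> color = Avx.Add(
            Avx.Add(Avx.Multiply(destination, dstW), Avx.Multiply(source, srcW)),
            Avx.Multiply(blend, blendW));
        color = Avx.Divide(color, Avx.Max(alpha, Vector256.Create(0.001F)));
        return Avx.Blend(color, alpha, BlendAlphaControl);
    }

    // Hypothetical span blender: composites source over destination in place.
    public static void BlendOver(Span<Vector4> destination, ReadOnlySpan<Vector4> source)
    {
        // Throws if source is shorter than destination.
        source = source.Slice(0, destination.Length);
        int i = 0;

        if (Avx.IsSupported)
        {
            // Reinterpret pairs of Vector4 pixels as Vector256<float> values.
            // An odd trailing pixel is excluded by the cast and handled below.
            Span<Vector256<float>> d = MemoryMarshal.Cast<Vector4, Vector256<float>>(destination);
            ReadOnlySpan<Vector256<float>> s = MemoryMarshal.Cast<Vector4, Vector256<float>>(source);

            for (int j = 0; j < d.Length; j++)
            {
                d[j] = Over(d[j], s[j], s[j]);
            }

            i = d.Length * 2;
        }

        // Scalar path: the remainder, or everything when AVX is unavailable.
        for (; i < destination.Length; i++)
        {
            destination[i] = Over(destination[i], source[i], source[i]);
        }
    }
}
```

The Avx.IsSupported check keeps the Vector4 path as the fallback, so behaviour on hardware without AVX would be unchanged.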

Thoughts @antonfirsov @br3aker ??

Replies: 0 comments