Given that #1433 is not achievable within a reasonable timescale for v3, and might well be better implemented with a pattern based on static virtual members in interfaces, I'd like to propose that we enhance the existing Vector4-based Porter-Duff implementations with Vector256<float> equivalents.
This should give us close to a 2x performance improvement over the existing implementation on supported devices, since each Vector256<float> holds two RGBA pixels. We badly need that if we are ever to release a v1 of ImageSharp.Drawing.
We should be able to implement each separable method in the same fashion as the existing methods. The pixel blender implementations can call these.
For example, the existing Over method...
    public static Vector4 Over(Vector4 destination, Vector4 source, Vector4 blend)
    {
        // calculate weights
        float blendW = destination.W * source.W;
        float dstW = destination.W - blendW;
        float srcW = source.W - blendW;

        // calculate final alpha
        float alpha = dstW + source.W;

        // calculate final color
        Vector4 color = (destination * dstW) + (source * srcW) + (blend * blendW);

        // unpremultiply
        color /= MathF.Max(alpha, 0.001F);
        color.W = alpha;

        return color;
    }
...could be implemented in the following manner.
    public static Vector256<float> Over(Vector256<float> destination, Vector256<float> source, Vector256<float> blend)
    {
        const int BlendAlphaControl = 0b_10_00_10_00;
        const int ShuffleAlphaControl = 0b_11_11_11_11;

        // calculate weights
        var sW = Avx.Shuffle(source, source, ShuffleAlphaControl);
        var dstW = Avx.Shuffle(destination, destination, ShuffleAlphaControl);

        var blendW = Avx.Multiply(sW, dstW);
        dstW = Avx.Subtract(dstW, blendW);
        var srcW = Avx.Subtract(sW, blendW);

        // calculate final alpha
        Vector256<float> alpha = Avx2.Add(dstW, sW);

        // calculate final color
        Vector256<float> color = Avx2.Add(Avx2.Add(Avx2.Multiply(destination, dstW), Avx2.Multiply(source, srcW)), Avx2.Multiply(blend, blendW));

        // unpremultiply
        Vector256<float> alphaEpsilon = Avx2.Max(alpha, Vector256.Create(0.001f));
        color = Avx2.Divide(color, alphaEpsilon);

        return Avx.Blend(color, alpha, BlendAlphaControl);
    }
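To make the "pixel blender implementations can call these" part concrete, here is a rough sketch of how a row blender might use the Vector256<float> overload. The names OverSketch and BlendRow are hypothetical and not part of ImageSharp, and passing the source as the blend argument is my assumption for a plain "normal" blend. It reinterprets pairs of Vector4 pixels as Vector256<float> and falls back to the Vector4 method for an odd trailing pixel or when AVX is unavailable. Every float operation used here is defined on Avx, so this sketch only checks Avx.IsSupported.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static class OverSketch
{
    // Scalar path: same arithmetic as the existing Vector4 method.
    public static Vector4 Over(Vector4 destination, Vector4 source, Vector4 blend)
    {
        float blendW = destination.W * source.W;
        float dstW = destination.W - blendW;
        float srcW = source.W - blendW;
        float alpha = dstW + source.W;
        Vector4 color = (destination * dstW) + (source * srcW) + (blend * blendW);
        color /= MathF.Max(alpha, 0.001F);
        color.W = alpha;
        return color;
    }

    // Vector path: each Vector256<float> carries two RGBA pixels.
    public static Vector256<float> Over(Vector256<float> destination, Vector256<float> source, Vector256<float> blend)
    {
        const byte BlendAlphaControl = 0b_10_00_10_00;   // take W (elements 3 and 7) from alpha
        const byte ShuffleAlphaControl = 0b_11_11_11_11; // broadcast W within each 128-bit lane
        Vector256<float> sW = Avx.Shuffle(source, source, ShuffleAlphaControl);
        Vector256<float> dstW = Avx.Shuffle(destination, destination, ShuffleAlphaControl);
        Vector256<float> blendW = Avx.Multiply(sW, dstW);
        dstW = Avx.Subtract(dstW, blendW);
        Vector256<float> srcW = Avx.Subtract(sW, blendW);
        Vector256<float> alpha = Avx.Add(dstW, sW);
        Vector256<float> color = Avx.Add(
            Avx.Add(Avx.Multiply(destination, dstW), Avx.Multiply(source, srcW)),
            Avx.Multiply(blend, blendW));
        color = Avx.Divide(color, Avx.Max(alpha, Vector256.Create(0.001F)));
        return Avx.Blend(color, alpha, BlendAlphaControl);
    }

    // Hypothetical row blender: two pixels per iteration when AVX is available,
    // with the scalar method handling whatever is left over.
    public static void BlendRow(Span<Vector4> destination, ReadOnlySpan<Vector4> source)
    {
        int count = Math.Min(destination.Length, source.Length);
        int i = 0;
        if (Avx.IsSupported)
        {
            Span<Vector256<float>> d = MemoryMarshal.Cast<Vector4, Vector256<float>>(destination.Slice(0, count));
            ReadOnlySpan<Vector256<float>> s = MemoryMarshal.Cast<Vector4, Vector256<float>>(source.Slice(0, count));
            for (int j = 0; j < d.Length; j++)
            {
                // Assumption: "normal" blending, so the blend color is the source itself.
                d[j] = Over(d[j], s[j], s[j]);
            }

            i = d.Length * 2;
        }

        for (; i < count; i++)
        {
            destination[i] = Over(destination[i], source[i], source[i]);
        }
    }

    public static void Main()
    {
        // Three pixels, so both the paired path and the scalar tail are exercised.
        Vector4[] dst = { new(0.2F, 0.4F, 0.6F, 1F), new(0.9F, 0.1F, 0.3F, 0.5F), new(0F, 0F, 0F, 0F) };
        Vector4[] src = { new(1F, 0F, 0F, 0.5F), new(0F, 1F, 0F, 0.25F), new(0F, 0F, 1F, 0.75F) };
        BlendRow(dst, src);
        foreach (Vector4 pixel in dst)
        {
            Console.WriteLine(pixel);
        }
    }
}
```

A real implementation would presumably also want the opacity handling and the other separable blend functions; this only covers the Over composition step.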
For someone with a good grasp of the SIMD APIs this should not be a huge undertaking.
Thoughts @antonfirsov @br3aker ??