Add default implementation for determinant so that it works with -target cpu #7505

Merged

Conversation

juliusikkala (Contributor)

Fixes #7504. The equations were found using a small program that computes the Laplace expansions and prints out which entries were accessed. I simplified the N==4 case by hand a bit.
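
As a sketch, such a generator could be as small as the following (hypothetical, not the actual program used): it builds the Laplace-expansion expression symbolically, so every `m[r][c]` that appears in the output is an entry the expansion accesses.

import itertools  # not needed here, but handy if you extend to permutation-based det

def laplace(rows, cols):
    # Base case: a 1x1 minor is just the entry itself.
    if len(rows) == 1:
        return f"m[{rows[0]}][{cols[0]}]"
    r = rows[0]
    terms = []
    for i, c in enumerate(cols):
        # Expand along the first remaining row; the minor drops row r and column c.
        minor = laplace(rows[1:], cols[:i] + cols[i + 1:])
        sign = "+" if i % 2 == 0 else "-"
        terms.append(f"{sign} m[{r}][{c}]*({minor})")
    return " ".join(terms).lstrip("+ ")

for n in (2, 3, 4):
    idx = list(range(n))
    print(f"det{n} = {laplace(idx, idx)}")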

@juliusikkala juliusikkala added the pr: non-breaking PRs without breaking changes label Jun 21, 2025
@juliusikkala juliusikkala requested a review from a team as a code owner June 21, 2025 21:43
natevm (Contributor) left a comment

I was able to confirm that all the default cases added here are mathematically correct. Instrumenting my code with these defaults also empirically produces correct values.

float1x1 and float2x2 should be pretty obvious. The 2x2 case is the same as what a 2D cross product gives, which generalizes to the "wedge" product.
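
Spelled out:

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc,$$

which is the z-component of the cross product of the padded column vectors $(a, c, 0)$ and $(b, d, 0)$.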

For float3x3, here are two corroborating references:

For float4x4, here are another two:

With respect to numerical stability, here's a small script comparing these expansions against NumPy's double-precision determinant:

import numpy as np
import random
random.seed(6222025)
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
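    # Laplace (cofactor) expansion along the first row.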
    return (
        m[0][0]*(m[1][1]*m[2][2] - m[1][2]*m[2][1])
        - m[0][1]*(m[1][0]*m[2][2] - m[1][2]*m[2][0])
        + m[0][2]*(m[1][0]*m[2][1] - m[1][1]*m[2][0])
    )

def det4(m):
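    # a..f are the six 2x2 minors of rows 2 and 3, each shared by two cofactors below.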
    a = m[2][2] * m[3][3] - m[2][3] * m[3][2]
    b = m[2][1] * m[3][3] - m[2][3] * m[3][1]
    c = m[2][1] * m[3][2] - m[2][2] * m[3][1]
    d = m[2][0] * m[3][3] - m[2][3] * m[3][0]
    e = m[2][0] * m[3][2] - m[2][2] * m[3][0]
    f = m[2][0] * m[3][1] - m[2][1] * m[3][0]
    return (
        m[0][0] * (m[1][1] * a - m[1][2] * b + m[1][3] * c)
        - m[0][1] * (m[1][0] * a - m[1][2] * d + m[1][3] * e)
        + m[0][2] * (m[1][0] * b - m[1][1] * d + m[1][3] * f)
        - m[0][3] * (m[1][0] * c - m[1][1] * e + m[1][2] * f)
    )

def det_generic(m):
    n = len(m)
    if n == 1: return m[0][0]
    if n == 2: return det2(m)
    if n == 3: return det3(m)
    if n == 4: return det4(m)
    raise ValueError("Size not supported")

def random_col_matrix(n):
    # Random entries; det(M) == det(M^T), so row- vs column-major layout doesn't matter.
    return [[random.uniform(-10.0, 10.0) for _ in range(n)] for _ in range(n)]

max_err = [0.0, 0.0, 0.0, 0.0]
for n in (1, 2, 3, 4):
    for i in range(4000):
        m_col = random_col_matrix(n)
        m_row = np.array([[m_col[c][r] for c in range(n)] for r in range(n)], dtype=float)
        ref = float(np.linalg.det(m_row))
        ours = float(det_generic(m_col))
        max_err[n-1] = max(max_err[n-1], abs(ref - ours))

print(f" float1x1 max|error| = {max_err[0]:.3e}")
print(f" float2x2 max|error| = {max_err[1]:.3e}")
print(f" float3x3 max|error| = {max_err[2]:.3e}")
print(f" float4x4 max|error| = {max_err[3]:.3e}")

The reported errors are:
float1x1 max|error| = 1.776e-15
float2x2 max|error| = 9.948e-14
float3x3 max|error| = 2.501e-12
float4x4 max|error| = 3.638e-11

I figure if a user really needs a precise determinant, they can use the template with doubles.

juliusikkala (Contributor, Author)

Thanks for the analysis! Yeah, this isn't the most numerically stable way to compute determinants. FWIW I've run into stability issues with the built-in determinant() in GLSL as well. A more expensive method would be to use a Householder reflection to simplify the determinant computation into a more numerically stable form, as outlined on the last page of the supplemental document for "BRDF Importance Sampling for Polygonal Lights" (Christoph Peters, 2021).
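
As a rough Python illustration of that idea (my sketch, not necessarily the paper's exact formulation), one Householder reflection zeroes the first column below the diagonal, leaving only a 2x2 cofactor:

import numpy as np

def det3_householder(m):
    # Reflect the first column onto e1 with a Householder matrix H
    # (det(H) = -1); then det(M) = -det(H @ M), and H @ M has zeros
    # below the diagonal in its first column.
    m = np.asarray(m, dtype=float)
    v = m[:, 0].copy()
    # Sign chosen opposite to v[0] so that v[0] - alpha never cancels.
    alpha = -np.copysign(np.linalg.norm(v), v[0])
    v[0] -= alpha
    vv = v @ v
    if vv == 0.0:  # first column is all zeros => matrix is singular
        return 0.0
    hm = m - np.outer(v, (2.0 / vv) * (v @ m))  # H @ M without forming H
    # det(H @ M) = alpha * (2x2 cofactor); hm[0, 0] == alpha by construction.
    return -hm[0, 0] * (hm[1, 1] * hm[2, 2] - hm[1, 2] * hm[2, 1])

The cancellation-prone subtractions then happen on reflected entries whose magnitudes are bounded by the column norm, which is where the stability gain comes from.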

But since the built-in functions in general focus more on performance than on the best possible numerical precision, I suppose it should be OK to go with this faster version.

@juliusikkala juliusikkala added this pull request to the merge queue Jun 22, 2025
Merged via the queue into shader-slang:master with commit b1fc75f Jun 22, 2025
17 checks passed
Labels
pr: non-breaking PRs without breaking changes
Development

Successfully merging this pull request may close these issues.

[bug] Using determinant with -target cpu and target exe results in undefined behavior and core dumps