-
-
Notifications
You must be signed in to change notification settings - Fork 362
i.cca: fix indexing, buffer size issues and matrix computation #5948
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Have you seen #3239? |
I reviewed #3239 in depth. The patch I have proposed in this PR addresses the same root issues: segmentation faults and incorrect array indexing in As @marisn rightly pointed out, while the module now executes without crashing, correctness of the output remains uncertain. To investigate this, I constructed a synthetic test case based on the LASDOC methodology as suggested there to verify the mathematical part of the canonical transformation. Test SetupI generated synthetic rasters with two classes, each having 2D Gaussian distributions with distinct means and covariance matrices:
The signature file generated by
These values are acceptably close for testing purposes. Expected vs Actual CCA OutputUsing LASDOC's canonical projection direction, the expected separation along the first canonical axis should have absolute difference between the means nearly equal to 2.8. However, the actual results from the patched
These results suggest that while the implementation runs, the transformation may not be projecting the data along the true canonical direction. The cause could lie in incorrect eigenvector handling, normalization, or covariance preprocessing. I will now dive into the matrix computation steps other than |
How big of an error/deviation would an off by one issue cause in the same synthetic data, before concluding if it is a close enough result? |
Yes, the off-by-one bug was critical, but even after fixing it, the numerical deviation suggests that other computational inaccuracies remain. I tried fixing a precision issues in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should do a calculation on paper based on the original description and then create a test case based on this manually validated data. Take look at some other modules to see examples (e.g. r.kappa, r.mapcalc) – the dataset does not have to be large one (e.g. 5x5 raster should be fine) to make it easy to calculate by hand. This should be enough to validate that the algorithm is implemented correctly. Then the next step can be tracking down numerical stability issues.
A test case to merge this PR is a must!
I tried the numerical example and while debugging I found out that in the original
The subtraction of
I tried to fix the logic which now separates the classwise accumulation and the global mean subtraction:
With this fix:
I will add this change to the PR once it is verified. Thank you! |
902e95f
to
689999f
Compare
This looks good to me, but if somebody else more proficient in linear algebra wants to review this, that would be better. |
@jayneel-shah18 could you please add the test for this in this PR? |
689999f
to
fab1a8e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Unless there is anything else from others, I plan to merge this.
Seems ready to merge and no other mention of it |
This PR resolves a segmentation fault and memory misalignment in the
i.cca
module. (#5947)Changes Made:
nclass
with allocations correctly sized bybands
.<prefix>.1
,<prefix>.2
, etc.Testing:
The changes were tested using the same reproduction steps outlined in the linked issue. The module now runs successfully and produces canonical component outputs without error. This fix brings
i.cca
in line with standard C indexing practices, preventing both runtime crashes and silent memory corruption.Fix Matrix Calculation and Numeric Stability
Summary of new changes
μ̄ * μ̄ᵀ
in each loop iteration, which caused B to be mis-scaled.Normalize canonical vectors
Switch outputs to DCELL
Tests Added
test_valid_outputs
: Performs regression check by comparing the statistical properties of the generated canonical components against pre-calculated, expected values ensuring the numerical stability and correctness of the algorithm.test_orthogonality_of_components
: Validates mathematical property of CCA, that the resulting components are uncorrelated. This is confirmed by computing the full covariance matrix of the output maps and asserting that the off-diagonal elements are approximately zero.test_covariance_matrix_symmetry
: Confirms that the covariance matrix from is symmetric, as expected for valid linear transformations.test_components_lack_linear_dependency
: Complements the covariance check by assessing the statistical independence of components.