Replies: 2 comments
-
We talked about this during our weekly OSCAR Wednesday meeting in Kaiserslautern today. In a nutshell, we don't think a PR reviewer is expected (or even able, in general!) to "verify mathematical correctness in all boundary cases". So aiming for "approximate" correctness is fine. It is on the PR authors to try to write correct code -- but of course with the expectation that this will not always work out, as mistakes just happen. A reviewer can of course point out potentially problematic spots and ask things like "what happens in this and that boundary case?". They might also suggest that test cases be added to cover specific such cases.

In the end, catching *all* boundary cases is most likely impossible, akin to the halting problem. We can just try our best, learn from past mistakes, and fix bugs as they are reported.

To me this is mostly orthogonal to the speed of our test suite, too: if such tests make the test suite noticeably slower, then that just means we need to prioritize improving the situation there (be it by splitting the test suite into multiple parts, some of which are run more often than others; by speeding up code in general; by running tests on our own, faster hardware; or by any combination of these and other techniques).
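For illustration, here is a minimal sketch of such a split in a Julia `test/runtests.jl`. The group names, file layout, and the `OSCAR_TEST_GROUP` environment variable are all made up for this example; Oscar's actual test setup may look quite different.

```julia
using Test

# Hypothetical grouping of test files: fast groups can run on every
# push, slow ones less often (e.g. nightly).
const GROUPS = Dict(
    "short" => ["rings.jl", "groups.jl"],
    "long"  => ["polyhedral.jl", "schemes.jl"],
)

# Select a group via an environment variable; default to running everything.
group = get(ENV, "OSCAR_TEST_GROUP", "all")
files = group == "all" ? reduce(vcat, collect(values(GROUPS))) : GROUPS[group]

@testset "Oscar ($group)" begin
    for f in files
        include(f)
    end
end
```

CI could then invoke the fast subset on every PR (`OSCAR_TEST_GROUP=short`) and the full suite on a schedule.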
-
Ok, to summarize:
-
I am having a hard time verifying the mathematical correctness of the PRs that I am reviewing. Basically, for a more complicated PR, verifying that it is mathematically correct would take me about as long as implementing the feature myself. Usually I can see fairly quickly that everything is roughly fine, but catching the boundary cases is very hard.
Oscar has for some time now moved beyond the phase where it was merely a wrapper around the cornerstone modules; it is now also getting its own unique functionality. Previously a very reduced testsuite sufficed, since the mathematical correctness was tested within the cornerstones. But I think that Oscar now needs more tests, especially ones catching boundary cases. One example: many parts of Oscar now use polyhedra, but several places have failed to handle polyhedra with lineality.
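To illustrate, here is a minimal sketch of what such a boundary-case test could look like, assuming Oscar's `polyhedron`/`lineality_dim` interface for H-representations (names and signatures should be double-checked against the actual API):

```julia
using Oscar
using Test

@testset "polyhedra with lineality" begin
    # The strip -1 <= x <= 1 in the plane, given by A*x <= b:
    # its lineality space is the y-axis.
    P = polyhedron([1 0; -1 0], [1, 1])
    @test lineality_dim(P) == 1
    @test dim(P) == 2
    # Any functionality consuming polyhedra should also be exercised
    # on P here, not only on bounded/pointed examples.
end
```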
There are some issues about testing:
https://github.com/oscar-system/Oscar.jl/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc+label%3Atestsuite
and the speed of the testsuite is mentioned several times. Adding more tests will worsen this problem; one solution could be to split up the testsuite.
However I do not know how to solve the overall problem of catching more boundary cases. Code coverage will only get us so far.
Another point is serialization: it could be useful to collect boundary-case objects in serialized form so that they can easily be reused elsewhere.
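For illustration, a minimal sketch of that idea, assuming Oscar's `save`/`load` serialization; the file name and directory layout are made up:

```julia
using Oscar
using Test

# Written once, e.g. when the boundary case was first identified:
#   save("test/data/strip_with_lineality.mrdi", P)

# Any test file can then pick up the same tricky object:
P = load(joinpath(@__DIR__, "data", "strip_with_lineality.mrdi"))
@test lineality_dim(P) >= 1  # the stored object really has lineality
```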
I'd be interested in ideas on how to approach this problem. Or is this only my problem? Another question: should code reviews verify mathematical correctness in all boundary cases at all? So far I have only aimed for "approximate" correctness.