Recommendation to check in proptest regressions seems off #440
I see a few related but different topics here:
We're looking at adding first-class support for these so you can share test code between arbitrary cases and well-known edge cases. I think that, longer term, this will be an ergonomic way to deal with this: you can disable persistence and just list out your explicit cases.
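A minimal sketch of what that can look like today, assuming the standard `proptest!` macro API and `FileFailurePersistence::Off` to turn off the regression file (the `normalize` function and the properties are made up for illustration):

```rust
use proptest::prelude::*;
use proptest::test_runner::FileFailurePersistence;

// Hypothetical function under test, for illustration only.
fn normalize(s: &str) -> String {
    s.trim().to_lowercase()
}

proptest! {
    // Disable seed persistence so no proptest-regressions file is written.
    #![proptest_config(ProptestConfig {
        failure_persistence: Some(Box::new(FileFailurePersistence::Off)),
        ..ProptestConfig::default()
    })]

    #[test]
    fn normalize_is_idempotent(s in ".*") {
        prop_assert_eq!(normalize(&normalize(&s)), normalize(&s));
    }
}

// Well-known edge cases listed explicitly as ordinary tests.
#[test]
fn normalize_handles_empty_input() {
    assert_eq!(normalize(""), "");
}
```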
I don't have a perfect answer for this one. It's definitely an issue, especially for areas that have some churn in the underlying data that gets generated. I don't think this makes it fundamentally useless, but there are definitely cases where it becomes borderline useless against its original goal. The few alternative/complementary approaches I can think of off the top of my head aren't foolproof either:
This one isn't quite right. We don't count the persisted cases in our total number of runs, so if you ask for 10 cases and you have 5 persisted seeds, we'll run the 5 persisted seeds (with the count still at 0) and then run the 10 cases with newly generated data. This comes with the downside of increased execution time, but if we assume the fidelity of the persisted seeds is good, it's effectively the same run time as if you had manually written out the 5 explicit cases.
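A small sketch of the setup being described, assuming the standard `proptest!` macro API (the property is a throwaway placeholder; the replay-before-counting behavior is what's described above, not something this config enforces):

```rust
use proptest::prelude::*;

proptest! {
    // With cases = 10, proptest generates 10 fresh inputs per run.
    // Seeds already recorded in the proptest-regressions file are
    // replayed first and, per the explanation above, do not count
    // toward those 10.
    #![proptest_config(ProptestConfig::with_cases(10))]

    #[test]
    fn addition_commutes(a in any::<u32>(), b in any::<u32>()) {
        // Illustrative property only; wrapping_add avoids overflow panics.
        prop_assert_eq!(a.wrapping_add(b), b.wrapping_add(a));
    }
}
```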
I'd like to think there are almost no "misuses" of proptest out there -- if you're getting some value out of testing with it, then that's a pretty good use. That said, there are definitely different ways to use proptest, and we probably don't have the best ergonomics for all of them. The ones I'm most familiar with in my own projects fall into two rough categories:
For the first category, it's fair to say that a workflow where you write the manual test case without the indirection of persisting RNG seeds may work better for you. For the second category it's a bit less straightforward: the generated data interacts with a stateful system, and it's not as trivial to stage an explicit test that can elicit the exact behavior of a bug.
FWIW, I find property tests hard enough to read and write that extracting the regression into a standalone simple test case is still usually a better way of documenting the regression than trying to verify it through property testing (plus it's typically faster to evaluate the regression too). There's just too much else that can go wrong beyond the arbitraries that are generated (e.g. changes to custom strategies I imagine can't really be caught, and even aside from that, changes to the written property-test assertions can invalidate the test without any signal to the harness).
My experience is with JS's fastcheck, which is where I have applied property testing in production the most, and there I found myself adding explicit test cases when encountering a bug. That way I can document the specific conditions that lead to the bug in an explicit regression test, while leaving the property test to continue randomly exploring for other potential bugs. The other advantage is that even if something subtle about the property generation changes, the regression test remains durably in the code, whereas the old regressed seed is now testing an irrelevant case and not providing the protection intended.
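For what it's worth, a hedged sketch of that pattern in proptest terms (the `parse_percent` function and the specific bug are invented purely to illustrate the shape of it):

```rust
use proptest::prelude::*;

// Hypothetical parser, used only to illustrate the pattern.
fn parse_percent(s: &str) -> Option<u8> {
    let n: u8 = s.strip_suffix('%')?.parse().ok()?;
    if n <= 100 { Some(n) } else { None }
}

// Explicit, durable regression test: it documents the exact input that
// once triggered a bug, independent of any strategy or seed file.
#[test]
fn regression_rejects_values_over_100_percent() {
    assert_eq!(parse_percent("101%"), None);
}

// The property test keeps exploring fresh inputs on every run.
proptest! {
    #[test]
    fn valid_percentages_round_trip(n in 0u8..=100) {
        prop_assert_eq!(parse_percent(&format!("{n}%")), Some(n));
    }
}
```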
The other problem is that every regression you add removes one random fuzz attempt from your test run. At the limit, if you have 250 issues in your crate found through property testing, you won't notice that your property test is now just rerunning those regressions on each run instead of trying to find new issues. Yes, you can increase the number of cases, but you have to remember to do this & the value of running regressions through proptest may not be as good (proptest will be slower at running those regressions + the ability to catch a regression depends on the strategies never changing & the property test remaining static).
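For completeness, one way to bump the budget in code, assuming current proptest config behavior (the property is a placeholder; if I recall correctly the `PROPTEST_CASES` environment variable can also override this at run time, but either way someone has to remember to do it):

```rust
use proptest::prelude::*;

proptest! {
    // Raise the number of freshly generated cases so exploration isn't
    // crowded out by replayed regressions.
    #![proptest_config(ProptestConfig::with_cases(1_000))]

    #[test]
    fn lossy_utf8_decoding_never_panics(bytes in any::<Vec<u8>>()) {
        let _ = String::from_utf8_lossy(&bytes);
    }
}
```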
Of course, maybe I'm misusing proptest & it's expected to be used differently.
This might be related to #439: if proptest only recorded the last seed attempted at the start of the run and removed it on success (thus rerunning it if proptest aborted for any reason), things would work a bit more smoothly.