-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When updating baselines, canonify them #28
Conversation
I've learned to deal with and workaround the old behavior, but often wondered whether it should really behave like this PR, so I find it favorable. Recently @timwoj also ran into hassle/confusion caused by the old behavior. |
I've been debating this one for a while as well, but haven't convinced myself yet that recording the canonified baselines is the right thing to do. Couple challenges with that:
(1) is really the main point for me, (2) is a nice-to-have. Personally, what bothers me most about the current baselines, is the user home directories that get recorded all over the place. I'm wondering if we could instead introduce a new "baseline scrubber" script that (1) replaces just stuff we really don't want in git, and (2) is expected to do so in a way that doesn't change the semantics of the original raw file (e.g., replace user's home directories with another valid dummy path). |
Interesting points. I just thought about this a bit more. To avoid the idempotence question we could ensure that canonifiers just run once:
That would seem to address your first point. The second point is also true ... I feel I could live with that, though. Overall we change the canonifiers rather infrequently. And, technically this also applies to the scrubbers. I agree btw that with this change the baselines become more opaque, for example when we sort the whole file. They'd become more of a mechanism to detect a failing test. For my workflow I feel that would be fine, but I can see others disagree. If the main guideline is "we only commit baseline updates when a test outcome requires it", I think we need canonicalized baselines, because otherwise noise will remain in certain settings (variations in line order, or the column content ordering problem for sets). |
I like that. Yes, that addresses my primary concern. It will also allow to keep existing baselines working and update them over time (for Zeek we may just want to do a general update, but other btests out there won't notice a difference). |
7a9dbdd
to
f220c54
Compare
btest-diff
Outdated
@@ -119,15 +163,20 @@ elif [ "$TEST_MODE" == "UPDATE_INTERACTIVE" ]; then | |||
|
|||
echo -n "$TEST_NAME ..." >/dev/tty | |||
|
|||
# If we have a canonified version of the input file, use it as the | |||
# new baseline. Otherwise, just use the input. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one question: is this still storing the non-cannonfied version the first time we run a new test (i.e., when we don't have a baseline yet)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good catch Robin — it didn't. I pushed a few updates that fix this and expand the test case to cover it.
I also realized that running the canonifier as part of the baseline updates means there's a new error possibility, namely that the canonifier can fail. I added that possibility, and since I got a bit lost in the various exit codes I added them to the file. It's pretty tricky ... let me know if you think we should tweak further.
This leverages the same canonicalization btest-diff already applies during test output comparison against baselines also when updating those baselines (usually via btest -U/-u). This removes a bunch of noise from the baselines, including personal home directories, timestamps, etc. Since btest-diff doesn't know whether the baseline has undergone this canonicalization, it continues to canonicalize the basline prior to diffing. To track whether it has canonicalized a baseline when updating, btest-diff now also prepends a header to the generated baseline that warns users about the fact that it is auto-generated. The presence of this header doubles as a marker for canonicalization. Includes a test-case.
f220c54
to
a7b772c
Compare
Merge & pushed. I haven't updated the Zeek-side submodule yet as I we might be causing some pain there: once we start recording baselines in the new format, people who haven't upgraded btest yet will see errors popping up. Actually just noticed this in Spicy as well, see: https://github.com/zeek/spicy/pull/539/checks?check_run_id=1227182400. I don't have a good idea how to avoid this, so I guess people will just need to update. Any other thoughts? |
Robin that link doesn't seem to find that run ID any longer, but the problem is that when we start to update baselines to the new format and people are still using an older btest, then baseline comparisons will blow up because the header won't get removed (plus we have the potential non-idempotence of the canonifiers that you flagged) — right?
|
To your first point - that does not actually work either. I typically run the Zeek tests just calling "btest -j" directly in the correct directory. Which uses my local installation of btest, not the packaged one :) I kind of like the idea of introducing a minimum version requirement field in btest.cfg - that way you at least get a meaningful error message if we do something like this again. |
I like version requirement as well. One more thought for the issue at hand: we could include a note into that new header saying "you need at least btest x.x now"? |
Okay, I'll get on the version requirement, via a separate PR. Also thumbs up to the extra detail in the header. Should we move to 0.63 for this? |
btest-diff
currently uses anyTEST_DIFF_CANONIFIER
during the comparison to the baseline. When updating the baseline, it stores the "raw" output as produced by the tests. This makes the baselines somewhat "git-unfriendly": any update to the baseline file ends up getting committed, even if the canonifier will subsequently strip it. For larger updates it also makes it tough to distinguish harmless noise (like a timestamp) from relevant changes to the baseline.This change simply uses the same diff inputs
btest-diff
already produces internally also when updating the baseline.I've run this over the Zeek testsuite and I think I like the resulting baselines much better. After a run that converts the baselines the testsuite still passes. When one requires the "raw" baseline for understanding why a test fails, it's still available as before, in the .tmp directory.
What do you think?