Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When updating baselines, canonify them #28

Merged
merged 4 commits into from
Oct 9, 2020

Conversation

ckreibich
Copy link
Member

btest-diff currently uses any TEST_DIFF_CANONIFIER during the comparison to the baseline. When updating the baseline, it stores the "raw" output as produced by the tests. This makes the baselines somewhat "git-unfriendly": any update to the baseline file ends up getting committed, even if the canonifier will subsequently strip it. For larger updates it also makes it tough to distinguish harmless noise (like a timestamp) from relevant changes to the baseline.

This change simply uses the same diff inputs btest-diff already produces internally also when updating the baseline.

I've run this over the Zeek testsuite and I think I like the resulting baselines much better. After a run that converts the baselines the testsuite still passes. When one requires the "raw" baseline for understanding why a test fails, it's still available as before, in the .tmp directory.

What do you think?

@jsiwek
Copy link
Contributor

jsiwek commented Oct 5, 2020

I've learned to deal with and workaround the old behavior, but often wondered whether it should really behave like this PR, so I find it favorable. Recently @timwoj also ran into hassle/confusion caused by the old behavior.

@rsmmr
Copy link
Member

rsmmr commented Oct 6, 2020

I've been debating this one for a while as well, but haven't convinced myself yet that recording the canonified baselines is the right thing to do. Couple challenges with that:

  1. It's not clear that a canonifier script will be idem-potent when running on an already processed baseline. Depending on what kind of substitutions it already made, it may very well fail (like when putting place holders in that now happen to match the very patterns it's looking for differently). We could put the burden on the script writers here, but it seems risky to do that at this point where various scripts are already out there that may subtly be affected by this change (it even seems difficult to figure out if a script would be).

  2. Storing the original output allows to change the canonifiers anytime, they'll just transparently continue to work, vs. having to fully update all baselines.

(1) is really the main point for me, (2) is a nice-to-have.

Personally, what bothers me most about the current baselines, is the user home directories that get recorded all over the place. I'm wondering if we could instead introduce a new "baseline scrubber" script that (1) replaces just stuff we really don't want in git, and (2) is expected to do so in a way that doesn't change the semantics of the original raw file (e.g., replace user's home directories with another valid dummy path).

@ckreibich
Copy link
Member Author

Interesting points. I just thought about this a bit more. To avoid the idempotence question we could ensure that canonifiers just run once:

  • btest-diff could always prefix the canonicalized baseline with its own header. This could include the usual warning of "this is a generated file, do not edit", which might be a good idea anyway. (I'm certainly guilty of manually tweaking baselines ... many a foot has been lost in the process.)
  • When comparing, btest-diff then knows that when the header is present, btest-diff has generated it, so doesn't need to canonicalize again. It can just strip the header and diff on the rest.

That would seem to address your first point. The second point is also true ... I feel I could live with that, though. Overall we change the canonifiers rather infrequently. And, technically this also applies to the scrubbers.

I agree btw that with this change the baselines become more opaque, for example when we sort the whole file. They'd become more of a mechanism to detect a failing test. For my workflow I feel that would be fine, but I can see others disagree.

If the main guideline is "we only commit baseline updates when a test outcome requires it", I think we need canonicalized baselines, because otherwise noise will remain in certain settings (variations in line order, or the column content ordering problem for sets).

@rsmmr
Copy link
Member

rsmmr commented Oct 7, 2020

  • When comparing, btest-diff then knows that when the header is present, btest-diff has generated it, so doesn't need to canonicalize again. It can just strip the header and diff on the rest.

I like that. Yes, that addresses my primary concern. It will also allow to keep existing baselines working and update them over time (for Zeek we may just want to do a general update, but other btests out there won't notice a difference).

@ckreibich ckreibich force-pushed the topic/christian/canonify-baselines branch from 7a9dbdd to f220c54 Compare October 7, 2020 23:32
@ckreibich
Copy link
Member Author

@rsmmr, terrific! See how you like the above. I revamped the branch a bit: first the indentation cleanup, then the new features, then a shellcheck pass. If we dislike its suggestions, we can just drop that one. Some of it helps prep for #21.

This doesn't yet update the baslines. ;-)

btest-diff Outdated
@@ -119,15 +163,20 @@ elif [ "$TEST_MODE" == "UPDATE_INTERACTIVE" ]; then

echo -n "$TEST_NAME ..." >/dev/tty

# If we have a canonified version of the input file, use it as the
# new baseline. Otherwise, just use the input.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question: is this still storing the non-cannonfied version the first time we run a new test (i.e., when we don't have a baseline yet)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good catch Robin — it didn't. I pushed a few updates that fix this and expand the test case to cover it.

I also realized that running the canonifier as part of the baseline updates means there's a new error possibility, namely that the canonifier can fail. I added that possibility, and since I got a bit lost in the various exit codes I added them to the file. It's pretty tricky ... let me know if you think we should tweak further.

This leverages the same canonicalization btest-diff already applies
during test output comparison against baselines also when updating
those baselines (usually via btest -U/-u). This removes a bunch of
noise from the baselines, including personal home directories,
timestamps, etc.

Since btest-diff doesn't know whether the baseline has undergone this
canonicalization, it continues to canonicalize the basline prior to
diffing. To track whether it has canonicalized a baseline when
updating, btest-diff now also prepends a header to the generated
baseline that warns users about the fact that it is
auto-generated. The presence of this header doubles as a marker for
canonicalization.

Includes a test-case.
@ckreibich ckreibich force-pushed the topic/christian/canonify-baselines branch from f220c54 to a7b772c Compare October 8, 2020 19:19
@rsmmr rsmmr merged commit 57f53ad into master Oct 9, 2020
@rsmmr
Copy link
Member

rsmmr commented Oct 9, 2020

Merge & pushed.

I haven't updated the Zeek-side submodule yet as I we might be causing some pain there: once we start recording baselines in the new format, people who haven't upgraded btest yet will see errors popping up. Actually just noticed this in Spicy as well, see: https://github.com/zeek/spicy/pull/539/checks?check_run_id=1227182400. I don't have a good idea how to avoid this, so I guess people will just need to update. Any other thoughts?

@ckreibich ckreibich deleted the topic/christian/canonify-baselines branch October 11, 2020 04:21
@ckreibich
Copy link
Member Author

Robin that link doesn't seem to find that run ID any longer, but the problem is that when we start to update baselines to the new format and people are still using an older btest, then baseline comparisons will blow up because the header won't get removed (plus we have the potential non-idempotence of the canonifiers that you flagged) — right?
That is true ... and I don't see a great solution for it. A couple of thoughts:

  • For "larger" projects it might make sense to include btest via a submodule, as Zeek does. That way you could ensure the btest version is in line with the Baselines. But it won't scale — we can't expect any Zeek package to do this, for example.
  • We could start capturing btest version requirements in btest.cfg. That will only help with similar problems in the future, not for this current instance, but I like the idea.

@0xxon
Copy link
Member

0xxon commented Oct 11, 2020

To your first point - that does not actually work either.

I typically run the Zeek tests just calling "btest -j" directly in the correct directory. Which uses my local installation of btest, not the packaged one :)

I kind of like the idea of introducing a minimum version requirement field in btest.cfg - that way you at least get a meaningful error message if we do something like this again.

@rsmmr
Copy link
Member

rsmmr commented Oct 11, 2020

I like version requirement as well.

One more thought for the issue at hand: we could include a note into that new header saying "you need at least btest x.x now"?

@ckreibich
Copy link
Member Author

Okay, I'll get on the version requirement, via a separate PR. Also thumbs up to the extra detail in the header. Should we move to 0.63 for this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants