Skip to content

Implement log test in go #1061

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

europaul
Copy link
Contributor

Since lim/log_test has been acting flaky lately I decided to give it a try and reimplement it in Go using evetestkit.

Let's see if that works and fixes the issue. If so - then we can reimplement the rest of the tests in the lim package. However it requires more work, since they are used in other tests as well and those would need to be reimplemented as well. That's why we leave them untouched for now.

@europaul europaul requested a review from uncleDecart as a code owner May 27, 2025 20:02
@europaul europaul requested review from Copilot, shjala and rene and removed request for uncleDecart May 27, 2025 20:03
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR reimplements a flaky log test in Go using evetestkit and updates configuration naming from AdamLogLevel to RemoteLogLevel. Key changes include:

  • Removal of the legacy shell-based log test file and legacy TestMain implementation.
  • Introduction of a new Go test (log_test.go) with proper setup, teardown, and log capture.
  • Updates across configuration and utility files to reflect the renaming of logging parameters.

Reviewed Changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/lim/testdata/log_test.txt Removed legacy shell test file.
tests/lim/log_test.go Added new Go log test implementation using evetestkit.
tests/lim/lim_test.go Removed obsolete TestMain code for log/info/metric tests.
tests/lim/eden.lim.tests.txt Updated test run command for the new log test.
tests/lim/Makefile Added debug target for testing.
tests/fsstress/testdata/fsstress_test.txt Updated remote logging configuration parameter reference.
tests/escript/go-internal/testscript/* Updated build tags to support new Go versions.
pkg/utils/config.go Renamed configuration field from AdamLogLevel to RemoteLogLevel.
pkg/openevec/* Updated logging and configuration references to match the new naming.
pkg/evetestkit/utils.go Modified file existence check command to allow globbing.
pkg/defaults/, pkg/controller/, pkg/controller/elog/elog.go Updated defaults and logging functions to reflect renaming changes.
Comments suppressed due to low confidence (1)

tests/lim/log_test.go:53

  • The variable 'tc' is used without prior declaration. Please declare 'tc' (e.g., var tc *testcontext.TestContext) before assigning it.
tc = eveNode.GetTestContext()

@europaul europaul force-pushed the implement-log-test-in-go branch from 7d98142 to 06ed85d Compare May 27, 2025 20:07
@europaul europaul requested a review from OhmSpectator May 27, 2025 20:09
@europaul europaul force-pushed the implement-log-test-in-go branch 2 times, most recently from 223915e to 246ebfe Compare May 28, 2025 14:42
Copy link
Member

@OhmSpectator OhmSpectator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope it works, but I'm pretty confused with the test as it is...

// TODO: change this to use the metric of when the last config was applied / changed
// the waiting period includes the time for EVE to get & parse the config
// as well as the time for the logs that were written to disk before the config was applied to be sent
time.Sleep(60 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shjala, don't we have a nice function to wait for a specific log line in our logs? It would be handy. Here, we could wait for the config diff to be printed, which contains the expected values...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really make sense in a log test, since we are testing exactly this functionality

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not for the testing purposes, but to ensure the device received the config, instead of hoping it comes after the time awaited.


const (
projectName = "lim"
logToLookFor = "Disconnected"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we define this constant close to the place where we use it?...

// initialize the channel for signaling
foundTheLog := make(chan struct{}, 1)
go func() {
if err := eveNode.GetLogsFromAdam(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why do we parse the Controller logs here?... Don't we need to check the logs on the device? Or is it a test that logs go from the device to the controller?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, good point, I took this functionality from another test that I did for the controller

logInfof("STEP 3: start capturing logs")
// initialize the channel for signaling
foundTheLog := make(chan struct{}, 1)
go func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks super tricky... However, it may be my general issue with understanding asynchronous code...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, our goal should be to eventually create a test framework that does not require starting Go routines inside tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand... why is that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO tests should be simple and easily readable. Mostly just a sequence of actions (onboard device, run ssh command, change device config) and one-line asserts. For things that we need to wait for I would prefer something like gomega.Eventually(device.HasLog(...)).Should(BeTrue()) rather than doing complicated things with Go routines and Go channels (that one has to replicate in every test that checks for some log message).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Milan, tests should be easily debugable and utilize golang (or libraries) features like gomega.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright guys, I see your point about the simplicity and that parallelism in tests is not contributing to that. I'll change that accordingly.

I just have two points:

  1. it was implemented with a background job in the original escript implementation
  2. to me using a standard goroutine to check the logs in the background is much clearer than relying on a third party library, not knowing how this gomega.Eventually thing is implemented - is there a deferred function in the background, a goroutine, something else 🤷 I have no idea

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the point 2.: why does the internal implementation matter? We could make that argument for any 3rd party Go package used from eden or eve. The semantic is well defined: https://pkg.go.dev/github.com/onsi/gomega#Eventually
And if you need to know the implementation details, gomega is open-sourced.
gomega.Eventually was just an example, there are other testing frameworks with similar concepts.
But my overall point is that we should aim towards a simple human-readable Domain-Specific Language (DSL) for EVE testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But my overall point is that we should aim towards a simple human-readable Domain-Specific Language (DSL) for EVE testing.

tbh I thought that we are trying to get rid of DSL - that's why we are rewriting the tests in plain go instead of escrypt

}()

logInfof("STEP 4: open and close SSH connection to EVE to generate some logs")
_, err := eveNode.EveRunCommand("exit")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A rather unusual method for generating logs, I would say. I believe that a significant part of the issue with the test being flaky stems from this SSH connection. Can we replace it with something more stable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it, but for any kind of log generation we would need a connection to EVE. So I guess if no SSH connection is there, there is no connection at all?

@@ -614,6 +615,22 @@ func (openEVEC *OpenEVEC) EdenLog(outputFormat types.OutputFormat, follow bool,
return nil
}

// EdenFindLogs finds logs in the controller and processes them using the provided handler function.
func (openEVEC *OpenEVEC) EdenFindLogs(handler elog.HandlerFunc, timeout time.Duration) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a simpler interface here, but I'm not sure it's possible to simplify it...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of interface you're thinking? To me functional approach seems reasonable...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like a simple synchronous function EdenExpectLog(expectedLine string, timeout time.Duration). It will cover 90% cases of checking logs, including this one.

const (
projectName = "lim"
logToLookFor = "Disconnected"
logTimeout = 30 * time.Second // 30 sec should be enough since it only takes 10 sec to upload logs with fast upload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we certain the runners (which are VMs) can handle it consistently in 30 seconds? It's better than 10 seconds used previously, but nevertheless...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main issue is like you mentioned not getting the logs, but getting an SSH connection to EVE. this is the flaky part and I sadly still have no idea where it comes from. So if the ssh connection is not established during 30 seconds, the test will fail. 30 sec is definitely enough to get the logs

@uncleDecart
Copy link
Member

Just brining @milan-zededa findings here, seems like this is related.

2025-05-30 07:08:23.428|info|pillar|| panic: runtime error: index out of range [0] with length 0
2025-05-30 07:08:23.428|info|pillar|| null
2025-05-30 07:08:23.428|info|pillar|| goroutine 89 [running]:
2025-05-30 07:08:23.436|info|pillar|| github.com/lf-edge/eve/pkg/pillar/vault.(*Ext4Handler).removeProtectorIfAny(0xc000bf3a20, {0x2d8c59e, 0xe})
2025-05-30 07:08:23.437|info|pillar||   /pillar/vault/handler_ext4.go:228 +0x5d9
2025-05-30 07:08:23.437|info|pillar|| github.com/lf-edge/eve/pkg/pillar/vault.(*Ext4Handler).createVault(0xc000bf3a20, {0x2d8c59e, 0xe})
2025-05-30 07:08:23.437|info|pillar||   /pillar/vault/handler_ext4.go:446 +0x175
2025-05-30 07:08:23.438|info|pillar|| github.com/lf-edge/eve/pkg/pillar/vault.(*Ext4Handler).setupVault(0xc000bf3a20, {0x2d8c59e, 0xe}, 0x0)
2025-05-30 07:08:23.438|info|pillar||   /pillar/vault/handler_ext4.go:357 +0x2cd
2025-05-30 07:08:23.438|info|pillar|| github.com/lf-edge/eve/pkg/pillar/vault.(*Ext4Handler).SetupDefaultVault(0xc000bf3a20)
2025-05-30 07:08:23.438|info|pillar||   /pillar/vault/handler_ext4.go:116 +0x45
2025-05-30 07:08:23.440|info|pillar|| github.com/lf-edge/eve/pkg/pillar/cmd/vaultmgr.handleVaultKeyFromControllerImpl({0x2b14600?, 0xc000ad9b30}, {0xc0013c62d0, 0x16}, {0x2c21140, 0xc00143eb70})
2025-05-30 07:08:23.440|info|pillar||   /pillar/cmd/vaultmgr/vaultmgr.go:475 +0x39c
2025-05-30 07:08:23.441|info|pillar|| github.com/lf-edge/eve/pkg/pillar/cmd/vaultmgr.handleVaultKeyFromControllerCreate({0x2b14600?, 0xc000ad9b30?}, {0xc0013c62d0?, 0x2c21140?}, {0x2c21140?, 0xc00143eb70?})
2025-05-30 07:08:23.441|info|pillar||   /pillar/cmd/vaultmgr/vaultmgr.go:386 +0x2c
2025-05-30 07:08:23.447|info|pillar|| github.com/lf-edge/eve/pkg/pillar/pubsub.handleModify({0x2c94900?, 0xc00030a800}, {0xc0013c62d0, 0x16}, {0xc0022cac80, 0x3a, 0x3c})
2025-05-30 07:08:23.447|info|pillar||   /pillar/pubsub/subscribe.go:269 +0x717
2025-05-30 07:08:23.448|info|pillar|| github.com/lf-edge/eve/pkg/pillar/pubsub.(*SubscriptionImpl).ProcessChange(0xc00030a800, {0x3, {0xc0013c62d0, 0x16}, {0xc0022cac80, 0x3a, 0x3c}})
2025-05-30 07:08:23.449|info|pillar||   /pillar/pubsub/subscribe.go:133 +0x3b8
2025-05-30 07:08:23.450|info|pillar|| github.com/lf-edge/eve/pkg/pillar/cmd/vaultmgr.Run(0xc000c32630, 0x0?, 0x10000?, {0x496aba0, 0x0, 0x0}, {0x0, 0x0})
2025-05-30 07:08:23.453|info|pillar||   /pillar/cmd/vaultmgr/vaultmgr.go:312 +0xd5b
2025-05-30 07:08:23.475|info|pillar|| main.startAgentAndDone({0x2e74b90?, 0x8?}, {0xc000bdb608, 0x8}, 0xc0000a3730?, 0x0?, 0x0?, {0x496aba0, 0x0, 0x0})
2025-05-30 07:08:23.476|info|pillar||   /pillar/zedbox/zedbox.go:245 +0x63
2025-05-30 07:08:23.476|info|pillar|| created by main.handleService in goroutine 1
2025-05-30 07:08:23.477|info|pillar||   /pillar/zedbox/zedbox.go:234 +0x334

europaul added 5 commits July 17, 2025 12:19
We have internal dependency on testing package that was changed in go
version 1.23

Signed-off-by: Paul Gaiduk <[email protected]>
Those binaries are used when running tests and don't need to
be checked into git.

Signed-off-by: Paul Gaiduk <[email protected]>
The config Eve.AdamLogLevel is confusing and without context might be
mistaken for the log level of the Adam service. Renaming it to
Eve.RemoteLogLevel makes it more consistent with EVE's own config
parameter RemoteLogLevel.

Signed-off-by: Paul Gaiduk <[email protected]>
FindLogOnAdam queries Adam for specified log entry. The user can query
the past or future logs and specify a timeout.

Signed-off-by: Paul Gaiduk <[email protected]>
In the spirit of writing tests purely in Go and since this
particular test has been acting flaky for a while, we
reimplement it using evetestkit.

The rest of the tests in this package (info_test, metrics_test)
remain as they are in escript. If the current reimplementation
proves a success they can also be reimplemented.

TestLog from lim_test.go is not called by this current test suite
anymore, but is used in others so we leave it be for now.

Signed-off-by: Paul Gaiduk <[email protected]>
@europaul europaul force-pushed the implement-log-test-in-go branch from 246ebfe to a1bbdf8 Compare July 17, 2025 10:35
@europaul
Copy link
Contributor Author

It seems like re-implementing the test in go has no effect on ssh issues. And since I included most useful commits from this PR in #1083 I think it's fair to close this one.

@europaul europaul closed this Jul 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants