-
Notifications
You must be signed in to change notification settings - Fork 52
Implement log test in go #1061
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement log test in go #1061
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR reimplements a flaky log test in Go using evetestkit and updates configuration naming from AdamLogLevel to RemoteLogLevel. Key changes include:
- Removal of the legacy shell-based log test file and legacy TestMain implementation.
- Introduction of a new Go test (log_test.go) with proper setup, teardown, and log capture.
- Updates across configuration and utility files to reflect the renaming of logging parameters.
Reviewed Changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
tests/lim/testdata/log_test.txt | Removed legacy shell test file. |
tests/lim/log_test.go | Added new Go log test implementation using evetestkit. |
tests/lim/lim_test.go | Removed obsolete TestMain code for log/info/metric tests. |
tests/lim/eden.lim.tests.txt | Updated test run command for the new log test. |
tests/lim/Makefile | Added debug target for testing. |
tests/fsstress/testdata/fsstress_test.txt | Updated remote logging configuration parameter reference. |
tests/escript/go-internal/testscript/* | Updated build tags to support new Go versions. |
pkg/utils/config.go | Renamed configuration field from AdamLogLevel to RemoteLogLevel. |
pkg/openevec/* | Updated logging and configuration references to match the new naming. |
pkg/evetestkit/utils.go | Modified file existence check command to allow globbing. |
pkg/defaults/, pkg/controller/, pkg/controller/elog/elog.go | Updated defaults and logging functions to reflect renaming changes. |
Comments suppressed due to low confidence (1)
tests/lim/log_test.go:53
- The variable 'tc' is used without prior declaration. Please declare 'tc' (e.g., var tc *testcontext.TestContext) before assigning it.
tc = eveNode.GetTestContext()
7d98142
to
06ed85d
Compare
223915e
to
246ebfe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope it works, but I'm pretty confused with the test as it is...
// TODO: change this to use the metric of when the last config was applied / changed | ||
// the waiting period includes the time for EVE to get & parse the config | ||
// as well as the time for the logs that were written to disk before the config was applied to be sent | ||
time.Sleep(60 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shjala, don't we have a nice function to wait for a specific log line in our logs? It would be handy. Here, we could wait for the config diff to be printed, which contains the expected values...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't really make sense in a log test, since we are testing exactly this functionality
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not for the testing purposes, but to ensure the device received the config, instead of hoping it comes after the time awaited.
|
||
const ( | ||
projectName = "lim" | ||
logToLookFor = "Disconnected" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we define this constant close to the place where we use it?...
tests/lim/log_test.go
Outdated
// initialize the channel for signaling | ||
foundTheLog := make(chan struct{}, 1) | ||
go func() { | ||
if err := eveNode.GetLogsFromAdam( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why do we parse the Controller logs here?... Don't we need to check the logs on the device? Or is it a test that logs go from the device to the controller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, good point, I took this functionality from another test that I did for the controller
tests/lim/log_test.go
Outdated
logInfof("STEP 3: start capturing logs") | ||
// initialize the channel for signaling | ||
foundTheLog := make(chan struct{}, 1) | ||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks super tricky... However, it may be my general issue with understanding asynchronous code...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, our goal should be to eventually create a test framework that does not require starting Go routines inside tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand... why is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO tests should be simple and easily readable. Mostly just a sequence of actions (onboard device, run ssh command, change device config) and one-line asserts. For things that we need to wait for I would prefer something like gomega.Eventually(device.HasLog(...)).Should(BeTrue())
rather than doing complicated things with Go routines and Go channels (that one has to replicate in every test that checks for some log message).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with Milan, tests should be easily debugable and utilize golang (or libraries) features like gomega.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alright guys, I see your point about the simplicity and that parallelism in tests is not contributing to that. I'll change that accordingly.
I just have two points:
- it was implemented with a background job in the original escript implementation
- to me using a standard goroutine to check the logs in the background is much clearer than relying on a third party library, not knowing how this
gomega.Eventually
thing is implemented - is there a deferred function in the background, a goroutine, something else 🤷 I have no idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To the point 2.: why does the internal implementation matter? We could make that argument for any 3rd party Go package used from eden or eve. The semantic is well defined: https://pkg.go.dev/github.com/onsi/gomega#Eventually
And if you need to know the implementation details, gomega is open-sourced.
gomega.Eventually
was just an example, there are other testing frameworks with similar concepts.
But my overall point is that we should aim towards a simple human-readable Domain-Specific Language (DSL) for EVE testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But my overall point is that we should aim towards a simple human-readable Domain-Specific Language (DSL) for EVE testing.
tbh I thought that we are trying to get rid of DSL - that's why we are rewriting the tests in plain go instead of escrypt
}() | ||
|
||
logInfof("STEP 4: open and close SSH connection to EVE to generate some logs") | ||
_, err := eveNode.EveRunCommand("exit") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A rather unusual method for generating logs, I would say. I believe that a significant part of the issue with the test being flaky stems from this SSH connection. Can we replace it with something more stable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it, but for any kind of log generation we would need a connection to EVE. So I guess if no SSH connection is there, there is no connection at all?
pkg/openevec/eden.go
Outdated
@@ -614,6 +615,22 @@ func (openEVEC *OpenEVEC) EdenLog(outputFormat types.OutputFormat, follow bool, | |||
return nil | |||
} | |||
|
|||
// EdenFindLogs finds logs in the controller and processes them using the provided handler function. | |||
func (openEVEC *OpenEVEC) EdenFindLogs(handler elog.HandlerFunc, timeout time.Duration) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see a simpler interface here, but I'm not sure it's possible to simplify it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kind of interface you're thinking? To me functional approach seems reasonable...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like a simple synchronous function EdenExpectLog(expectedLine string, timeout time.Duration). It will cover 90% cases of checking logs, including this one.
const ( | ||
projectName = "lim" | ||
logToLookFor = "Disconnected" | ||
logTimeout = 30 * time.Second // 30 sec should be enough since it only takes 10 sec to upload logs with fast upload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we certain the runners (which are VMs) can handle it consistently in 30 seconds? It's better than 10 seconds used previously, but nevertheless...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the main issue is like you mentioned not getting the logs, but getting an SSH connection to EVE. this is the flaky part and I sadly still have no idea where it comes from. So if the ssh connection is not established during 30 seconds, the test will fail. 30 sec is definitely enough to get the logs
Just brining @milan-zededa findings here, seems like this is related.
|
We have internal dependency on testing package that was changed in go version 1.23 Signed-off-by: Paul Gaiduk <[email protected]>
Those binaries are used when running tests and don't need to be checked into git. Signed-off-by: Paul Gaiduk <[email protected]>
The config Eve.AdamLogLevel is confusing and without context might be mistaken for the log level of the Adam service. Renaming it to Eve.RemoteLogLevel makes it more consistent with EVE's own config parameter RemoteLogLevel. Signed-off-by: Paul Gaiduk <[email protected]>
FindLogOnAdam queries Adam for specified log entry. The user can query the past or future logs and specify a timeout. Signed-off-by: Paul Gaiduk <[email protected]>
In the spirit of writing tests purely in Go and since this particular test has been acting flaky for a while, we reimplement it using evetestkit. The rest of the tests in this package (info_test, metrics_test) remain as they are in escript. If the current reimplementation proves a success they can also be reimplemented. TestLog from lim_test.go is not called by this current test suite anymore, but is used in others so we leave it be for now. Signed-off-by: Paul Gaiduk <[email protected]>
246ebfe
to
a1bbdf8
Compare
It seems like re-implementing the test in go has no effect on ssh issues. And since I included most useful commits from this PR in #1083 I think it's fair to close this one. |
Since lim/log_test has been acting flaky lately I decided to give it a try and reimplement it in Go using
evetestkit
.Let's see if that works and fixes the issue. If so - then we can reimplement the rest of the tests in the lim package. However it requires more work, since they are used in other tests as well and those would need to be reimplemented as well. That's why we leave them untouched for now.