Skip to content

File logger sometimes misses last log line sent from output extension #3843

@nadiamoe

Description

@nadiamoe

Brief summary

In SM we use an output extension to output metrics in a particular format for our consumption. This output extension also outputs a log line for each check in the script:

https://github.com/grafana/synthetic-monitoring-agent/blob/d9a40a5294f7c94e15d15fdbbc8570ac8447b6c4/xk6/sm/output.go#L153

Which looks like this:

time="2024-07-12T21:11:22+02:00" level=info msg="check result" check="response code was 200" metric=checks_total scenario=default source=sm value=1

We have recently noticed, however, that in some executions this log line does not appear, for the same script.

I made a hacky script that runs k6 continuously and checks if the log line is missing, and this seems to happen once every ~100 iterations, meaning that in ~1% of the cases that log line is not printed.

With the reproduction script I was able to narrow down a bit what's going on:

  • This does not seem to happen for log lines printed within the script via console.log, or at least not in 10k iterations
  • This only happens when --log-output is a file
  • This happens more often if --verbose is used.

Digging a bit I suspect the reason for this lies here:

k6/log/file.go

Lines 113 to 128 in 04b53e4

select {
case entry := <-h.loglines:
if _, err := h.bw.Write(entry); err != nil {
h.fallbackLogger.Errorf("failed to write a log message to a logfile: %w", err)
}
case <-ctx.Done():
if err := h.bw.Flush(); err != nil {
h.fallbackLogger.Errorf("failed to flush buffer: %w", err)
}
if err := h.w.Close(); err != nil {
h.fallbackLogger.Errorf("failed to close logfile: %w", err)
}
return
}

I suspect it might be possible for the context to get cancelled while there are still log lines on the channel. I crudely patched k6 to yell if there are still stuff on the channel when the context is done:

diff --git a/log/file.go b/log/file.go
index 6da7e94b..ab34e9f6 100644
--- a/log/file.go
+++ b/log/file.go
@@ -116,6 +116,12 @@ func (h *fileHook) Listen(ctx context.Context) {
 				h.fallbackLogger.Errorf("failed to write a log message to a logfile: %w", err)
 			}
 		case <-ctx.Done():
+			select {
+			case <-h.loglines:
+				h.fallbackLogger.Errorf("Pending log lines found")
+			default:
+			}
+
 			if err := h.bw.Flush(); err != nil {
 				h.fallbackLogger.Errorf("failed to flush buffer: %w", err)
 			}

Running the reproduction script, I can see Pending log lines found in the same execution where the log line is missing.

From what I can see, k6 is doing everything right concurrency wise, but I think this may be happening because logrus might be calling hooks on a separate goroutine, which is not synchronized with anything. If this were the case, it would explain this sometimes failing, as the output extension's Stop() method is invoked reasonably shortly before calling stopLoggers(), which triggers the cancellation of that context (by means of

k6/cmd/root.go

Line 277 in 04b53e4

cancel()
)

I'm not sure what a possible solution for this would look like.

k6 version

k6 v0.52.0 (go1.22.5, linux/amd64)

OS

Arch linux

Docker version and image (if applicable)

No response

Steps to reproduce the problem

  1. Build k6 with an output extension that logs something on Stop()
  2. Run any k6 script with --log-ouput=file=something and --verbose (--verbose makes it more likely to happen)
  3. Repeat 2 100 times or so
  4. Observe that sometimes, the log line is missing

Expected behaviour

Log lines should always be there.

Actual behaviour

Log line is sometimes missing.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions