de-duplication of key/value pairs#415
Conversation
|
This issue is currently awaiting triage. If klog contributors determine this is a relevant issue, they will accept it by applying the The DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Recent golangci-lint warns about shadowing the print function.
1.21 enables the usage of generics and the slices package.
When testing with Go v1.23.0 in the PATH, `go test go_vet` panics. Bumping the
tools dependency adds 1.23 as toolchain requirement and bumps the Go
version. This is limited to the examples module and thus does not affect users
of klog.
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x62d74f]
goroutine 97 [running]:
go/types.(*Checker).handleBailout(0xc0005bb340, 0xc0001afc88)
/nvme/gopath/go-1.23.0/src/go/types/check.go:412 +0x88
panic({0x6e6c40?, 0x954490?})
/nvme/gopath/go-1.23.0/src/runtime/panic.go:785 +0x132
go/types.(*StdSizes).Sizeof(0x0, {0x7a5420, 0x957e40})
/nvme/gopath/go-1.23.0/src/go/types/sizes.go:229 +0x30f
go/types.(*Config).sizeof(...)
/nvme/gopath/go-1.23.0/src/go/types/sizes.go:334
go/types.representableConst.func1({0x7a5420?, 0x957e40?})
/nvme/gopath/go-1.23.0/src/go/types/const.go:77 +0x86
go/types.representableConst({0x7a7548, 0x94bb80}, 0xc0005bb340, 0x957e40, 0xc0001af588)
/nvme/gopath/go-1.23.0/src/go/types/const.go:93 +0x173
go/types.(*Checker).representation(0xc0005bb340, 0xc00015be00, 0x957e40)
/nvme/gopath/go-1.23.0/src/go/types/const.go:257 +0x65
go/types.(*Checker).implicitTypeAndValue(0xc0005bb340, 0xc00015be00, {0x7a5588, 0xc0000ec380})
/nvme/gopath/go-1.23.0/src/go/types/expr.go:377 +0x2d7
go/types.(*Checker).assignment(0xc0005bb340, 0xc00015be00, {0x7a5588, 0xc0000ec380}, {0x738a90, 0x14})
/nvme/gopath/go-1.23.0/src/go/types/assignments.go:70 +0x451
go/types.(*Checker).initConst(0xc0005bb340, 0xc000720a80, 0xc00015be00)
/nvme/gopath/go-1.23.0/src/go/types/assignments.go:144 +0x1da
go/types.(*Checker).constDecl(0xc0005bb340, 0xc000720a80, {0x7a6580, 0xc0002828a0}, {0x7a6580, 0xc0002828c0}, 0x0)
/nvme/gopath/go-1.23.0/src/go/types/decl.go:482 +0x2e8
go/types.(*Checker).objDecl(0xc0005bb340, {0x7a9e78, 0xc000720a80}, 0x0)
/nvme/gopath/go-1.23.0/src/go/types/decl.go:185 +0xa09
go/types.(*Checker).packageObjects(0xc0005bb340)
/nvme/gopath/go-1.23.0/src/go/types/resolver.go:714 +0x454
go/types.(*Checker).checkFiles(0xc0005bb340, {0xc00016cb10, 0x3, 0x3})
/nvme/gopath/go-1.23.0/src/go/types/check.go:467 +0x15a
go/types.(*Checker).Files(0xc0001b2000?, {0xc00016cb10?, 0xc000474120?, 0x6?})
/nvme/gopath/go-1.23.0/src/go/types/check.go:430 +0x75
golang.org/x/tools/go/packages.(*loader).loadPackage(0xc0001b2000, 0xc000432780)
/nvme/gopath/pkg/mod/golang.org/x/[email protected]/go/packages/packages.go:961 +0x70f
golang.org/x/tools/go/packages.(*loader).loadRecursive.func1()
/nvme/gopath/pkg/mod/golang.org/x/[email protected]/go/packages/packages.go:799 +0x1a9
sync.(*Once).doSlow(0x0?, 0x0?)
/nvme/gopath/go-1.23.0/src/sync/once.go:76 +0xb4
sync.(*Once).Do(...)
/nvme/gopath/go-1.23.0/src/sync/once.go:67
golang.org/x/tools/go/packages.(*loader).loadRecursive(0x0?, 0x0?)
/nvme/gopath/pkg/mod/golang.org/x/[email protected]/go/packages/packages.go:787 +0x3b
golang.org/x/tools/go/packages.(*loader).loadRecursive.func1.1(0x0?)
/nvme/gopath/pkg/mod/golang.org/x/[email protected]/go/packages/packages.go:794 +0x26
created by golang.org/x/tools/go/packages.(*loader).loadRecursive.func1 in goroutine 96
/nvme/gopath/pkg/mod/golang.org/x/[email protected]/go/packages/packages.go:793 +0x94
FAIL k8s.io/klog/examples/go_vet 0.438s
FAIL
bdddf29 to
ab1b250
Compare
|
This should be ready for a review. @mengjiao-liu, @harshanarayana, who has time? |
|
Checking @pohly |
| t.Errorf("Output mismatch. Expected:\n%s\nActual:\n%s\n", expectedWithPlaceholder, actual) | ||
| t.Errorf("Output mismatch.\n\nExpected with placeholders:\n%s\nExpected without placeholds:\n%s\nActual:\n%s\n", expectedWithPlaceholder, expected, actual) | ||
| } else { | ||
| t.Errorf("Output mismatch. klog:\n%s\nExpected:\n%s\nActual:\n%s\n", test.expectedOutput, expectedWithPlaceholder, actual) | ||
| t.Errorf("Output mismatch. klog:\n%s\n\nExpected with placeholders:\n%s\nExpected without placeholders:\n%s\nActual:\n%s\n", test.expectedOutput, expectedWithPlaceholder, expected, actual) |
There was a problem hiding this comment.
Should the code in lines 844-850 also be changed? Their output is similar.
Lines 844 to 850 in 7f02688
Printing the actual expected string makes it easier to run a diff, something that we cannot do here automatically because we cannot depend on testify or go-cmp.
Span and trace ID are potentially going to be duplicated a lot.
MergeKVs is about to become a lot simpler. Removing the tests and benchmarks simplifies the before/after performance comparison.
Previously, de-duplication was limited such that keys passed to a log call as
parameters overwrote keys passed earlier to WithValues. Using the same key in
different WithValues calls or multiple times in a WithValues call or the
parameters led to duplicates in the output.
Now all key/value pairs get checked for duplicates. The last one wins and is
printed at the place where it was added. A potential improvement would be to
keep it at the place where it first appeared if the value is the same because
then different log entries are more likely to have it at the beginning of their
key/value pairs because it was added by a common WithValues call.
Because the new implementation avoids memory allocations, the performance
impact is minimal. Some cases even become faster.
goos: linux
goarch: amd64
pkg: k8s.io/klog/examples/benchmarks
cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
│ before │ after1 │
│ sec/op │ sec/op vs base │
Output/klog/fail-verbosity-non-standard-struct-key-check/objects/0-36 385.5n ± 9% 358.9n ± 5% -6.91% (p=0.041 n=6)
Output/klog/fail-verbosity-non-standard-struct-key-check/objects/10-36 374.3n ± 4% 357.3n ± 5% -4.55% (p=0.015 n=6)
Output/klog/fail-verbosity-non-standard-struct-key-check/objects/20-36 388.4n ± 3% 359.2n ± 4% -7.53% (p=0.002 n=6)
...
Output/zapr/fail-verbosity-non-standard-int-key-check/objects/90-36 380.9n ± 4% 379.4n ± 2% ~ (p=0.937 n=6)
Output/zapr/fail-verbosity-non-standard-int-key-check/objects/100-36 375.1n ± 3% 381.9n ± 4% ~ (p=0.093 n=6)
geomean 3.461µ 3.442µ -0.55%
pkg: k8s.io/klog/examples/output_test
ZaprOutput/regular_error_types_as_value-36 3.270µ ± 10% 3.530µ ± 9% ~ (p=0.132 n=6)
ZaprOutput/struct_values-36 3.932µ ± 11% 4.107µ ± 10% ~ (p=1.000 n=6)
ZaprOutput/klog.Format-36 4.276µ ± 12% 4.439µ ± 7% ~ (p=0.132 n=6)
ZaprOutput/regular_error_types_when_using_logr.Error-36 2.790µ ± 7% 2.834µ ± 15% ~ (p=0.699 n=6)
ZaprOutput/map_values-36 5.802µ ± 9% 5.829µ ± 3% ~ (p=1.000 n=6)
ZaprOutput/log_with_name_and_values-36 4.051µ ± 4% 3.864µ ± 6% -4.63% (p=0.015 n=6)
...
geomean 5.581µ 5.646µ +1.16%
pkg: k8s.io/klog/v2
Header-36 1.339µ ± 5% 1.311µ ± 6% ~ (p=0.513 n=6)
...
KlogOutput/klog.Format-36 4.207µ ± 10% 4.667µ ± 3% +10.95% (p=0.009 n=6)
KlogOutput/cyclic_list-36 151.3µ ± 1% 151.1µ ± 1% ~ (p=0.818 n=6)
KlogOutput/override_single_value-36 2.308µ ± 9% 3.501µ ± 15% +51.66% (p=0.002 n=6)
KlogOutput/multiple_WithValues-36 10.91µ ± 9% 12.46µ ± 7% +14.16% (p=0.004 n=6)
...
KlogOutput/duplicates-36 21.34µ ± 2% 22.52µ ± 7% ~ (p=0.065 n=6)
...
geomean 252.6n 260.6n +3.18%
pkg: k8s.io/klog/v2/textlogger
TextloggerOutput/html_characters-36 1.982µ ± 12% 1.978µ ± 6% ~ (p=0.937 n=6)
TextloggerOutput/map_values-36 3.215µ ± 6% 3.429µ ± 4% +6.67% (p=0.041 n=6)
...
TextloggerOutput/mixed_duplicates-36 19.33µ ± 5% 19.59µ ± 7% ~ (p=0.818 n=6)
...
TextloggerOutput/print_duplicate_keys_in_arguments-36 2.067µ ± 7% 2.388µ ± 6% +15.53% (p=0.002 n=6)
...
TextloggerOutput/duplicates-36 16.80µ ± 11% 17.06µ ± 4% ~ (p=0.310 n=6)
...
geomean 2.798µ 2.860µ +2.24%
Suppose the same key/value pair gets logged multiple times:
WithValues("sameKey", 1).WithValues("newKey", 2).WithValues("sameKey", 1)
Previously, the de-duplication code would have emitted it near the end:
"hello world" newKey=2 sameKey=1
Now, it gets emitted at the place where it first appeared:
"hello world" sameKey=1 newKey=2
This makes log entries more consistent when some repeat the key/value pair and
others don't.
Performance is about the same as before.
goos: linux
goarch: amd64
pkg: k8s.io/klog/examples/benchmarks
cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
│ after1 │ after2 │
│ sec/op │ sec/op vs base │
...
geomean 3.442µ 3.445µ +0.09%
pkg: k8s.io/klog/examples/output_test
geomean 5.646µ 5.631µ -0.26%
pkg: k8s.io/klog/v2
geomean 260.6n 260.5n -0.07%
pkg: k8s.io/klog/v2/textlogger
TextloggerOutput/odd_WithValues-36 6.956µ ± 4% 7.263µ ± 4% +4.42% (p=0.041 n=6)
...
TextloggerOutput/log_with_multiple_names_and_values-36 3.353µ ± 4% 3.172µ ± 2% -5.40% (p=0.002 n=6)
...
TextloggerOutput/duplicates-36 17.06µ ± 4% 17.16µ ± 9% ~ (p=0.937 n=6)
...
TextloggerOutput/mixed_duplicates-36 19.59µ ± 7% 20.35µ ± 5% ~ (p=0.065 n=6)
...
TextloggerOutput/print_duplicate_keys_in_arguments-36 2.388µ ± 6% 2.218µ ± 7% -7.12% (p=0.026 n=6)
...
geomean 2.860µ 2.849µ -0.38%
This is a follow-up which removes several functions which don't provide enough additional value anymore. This wasn't done earlier to keep the previous commits simple.
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: harshanarayana, mengjiao-liu, pohly The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/assign @dims For approval - I don't want to self-approve this. Please lift the hold if you agree to merge this. |
|
/hold cancel thanks @pohly @harshanarayana |
|
Thx! |
What this PR does / why we need it:
Previously, de-duplication was limited such that keys passed to a log call as
parameters overwrote keys passed earlier to WithValues. Using the same key in
different WithValues calls or multiple times in a WithValues call or the
parameters led to duplicates in the output.
Now all key/value pairs get checked for duplicates. The last one wins and is
printed at the place where it was added. If a second occurrence is the same as
a prior one, the first one is kept and the second is discarded.
This is partly motivated by the concern that adding OpenTelemetry will lead to
duplicate IDs (cc @dashpole), partly by prior discussions around instrumentation
not being able to avoid duplicates when using WithValue when the instrumentation
is happening in unrelated packages (cc @sbueringer, if I remember correctly).
Special notes for your reviewer:
Because the new implementation avoids memory allocations does very little extra work
when there is nothing which needs to be de-duplicated, the performance
impact is minimal. Some cases even become faster. See commit messages for
benchmark results.
Release note: