Skip to content

[Bug] Race condition on monitoring.New[Uint, Int, ...] #319

@AndersonQ

Description

@AndersonQ

The function to create new variables on a registry, like monitoring.NewInt, monitoring.NewUint and so on have a race condition when called concurrent to create a variable of the same name.

Those functions try to fetch en existing variable, if none is found, they add a new variable to the registry. However the check for the existence of the variable and the creation when none is found isn't done atomically. Thus it happens 2 calls for the same variable name identify the variable do not exists, both try to add it and the second panics.

For examples, NewUint:

func NewUint(r *Registry, name string, opts ...Option) *Uint {
existingVar, r := setupMetric(r, name, opts)

it calls Get:

func setupMetric(reg *Registry, name string, opts []Option) (Var, *Registry) {
if reg == nil {
return nil, Default
}
vis := reg.Get(name)
if vis == nil {
return nil, reg
}

and if none is found, it adds a new one:

v := &Uint{}
addVar(r, name, opts, v, makeExpvar(func() string {
return strconv.FormatUint(v.Get(), 10)
}))
return v
}

However there is no synchronisation between the call to Get and to addVar. Thus it happens:

  • goroutine A: calls NewUint, which calls Get and there is no variable
  • goroutine B: calls NewUint, which calls Get and there is no variable
  • goroutine B: NewUint calls addVar and the new variable is created
  • goroutine A: NewUint calls addVar and, as goroutine B already created the variable, addVar panics

To reproduce the issue, add the following test and run it multiple times:

diff --git a/monitoring/registry_test.go b/monitoring/registry_test.go
index c9fdeca..6be4934 100644
--- a/monitoring/registry_test.go
+++ b/monitoring/registry_test.go
@@ -20,6 +20,7 @@
 package monitoring

 import (
+       "sync"
        "testing"

        "github.com/stretchr/testify/assert"
@@ -103,6 +104,23 @@ func TestRegistryGet(t *testing.T) {
        assert.Equal(t, v, v3)
 }

+func TestRegistryGet_concurrent(t *testing.T) {
+       reg := NewRegistry()
+       name := "foo"
+
+       wg := sync.WaitGroup{}
+       assert.NotPanics(t, func() {
+               for i := 0; i < 5; i++ {
+                       wg.Add(1)
+                       go func() {
+                               defer wg.Done()
+                               NewInt(reg, name)
+                       }()
+               }
+       })
+       wg.Wait()
+}
+
 func TestRegistryRemove(t *testing.T) {
        defer func(t *testing.T) {
                err := Clear()
go test -run TestRegistryGet_concurrent -count 10000 .

Metadata

Metadata

Assignees

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions