Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panics when opa running as a server #7117

Open
alam-chime opened this issue Oct 12, 2024 · 5 comments
Open

Panics when opa running as a server #7117

alam-chime opened this issue Oct 12, 2024 · 5 comments

Comments

@alam-chime
Copy link

Short description

We're running opa server as a sidecar in kubernetes. At the time of the issue, both memory and cpu usage were well below the defined limits. For the majority of requests, we are receiving the expected outcomes. But there are a few instances where we're seeing HTTP 502s and 504s from opa. There are no differences between the inputs of the failing requests and those that succeed.

  • OPA version: 0.66.0, but we're seeing this behavior with 0.69.0 also
  • Policy bundle created using
                "opa", 
                "build",
		"--bundle", "./source",
		"--optimize", "2",
		"--output", "policy-bundle.tar.gz",
  • Server commands
                "opa",
		"run",
		"--server",
		"--addr", "localhost:8181",
		"--diagnostic-addr", ":8282",
		"--log-level", "error",
		"--config-file", "/config.yaml",
		"--authorization=basic",
		"--bundle", "policy-bundle.tar.gz",
  • Server error logs
2024/10/12 15:21:01 http: panic serving 127.0.0.1:46936: runtime error: invalid memory address or nil pointer dereference
goroutine 415141 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x114a0e0?, 0x2102610?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc000ed5cb0})
/src/internal/deepcopy/deepcopy.go:19 +0x173
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc0007529f0})
/src/internal/deepcopy/deepcopy.go:19 +0x14e
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc000b0b230})
/src/internal/deepcopy/deepcopy.go:19 +0x14e
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc000b0b1d0})
/src/internal/deepcopy/deepcopy.go:19 +0x14e
github.com/open-policy-agent/opa/plugins/logs.maskRuleSet.Mask({0xc00096ab20, {0xc0007284e0, 0x3, 0x4}, 0x0}, 0xc001136668)
/src/plugins/logs/mask.go:339 +0x11d
github.com/open-policy-agent/opa/plugins/logs.(*Plugin).maskEvent(0xc0001d1ce0, {0x1804978, 0xc000de66f0}, {0x17fa720, 0xc000de6ae0}, {0x1807280, 0xc001250000}, 0xc001136668)
/src/plugins/logs/plugin.go:1039 +0x25b
github.com/open-policy-agent/opa/plugins/logs.(*Plugin).Log(0xc0001d1ce0, {0x1804978, 0xc000de66f0}, 0xc000158c60)
/src/plugins/logs/plugin.go:704 +0x4b8
github.com/open-policy-agent/opa/runtime.(*Runtime).decisionLogger(0x17f4710?, {0x1804978, 0xc000de66f0}, 0xc000158c60)
/src/runtime/runtime.go:789 +0x6a
github.com/open-policy-agent/opa/server.decisionLogger.Log({0xc000de6b40?, {0x0?, 0xf?}, 0xc0002e1510?}, {0x1804978, 0xc000de66f0}, {0x17fa720, 0xc000de6ae0}, {0xc00036803e, 0xd}, ...)
/src/server/server.go:2992 +0x7b0
github.com/open-policy-agent/opa/server.(*Server).v1DataPost(0xc0001cc6c8, {0x1802e40, 0xc0001ecbd0}, 0xc000ba30e0)
/src/server/server.go:1796 +0xf0f
net/http.HandlerFunc.ServeHTTP(0x0?, {0x1802e40?, 0xc0001ecbd0?}, 0xc000def1c0?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/open-policy-agent/opa/internal/prometheus.(*Provider).InstrumentHandler.func1({0x7f21a71208a0, 0xc000de6600}, 0xc000ba30e0)
/src/internal/prometheus/prometheus.go:89 +0x136
net/http.HandlerFunc.ServeHTTP(0x1802e40?, {0x7f21a71208a0?, 0xc000de6600?}, 0xc000def2c0?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func1({0x1802e40, 0xc0001ec018}, 0xc000ba30e0)
/src/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:97 +0xfd
net/http.HandlerFunc.ServeHTTP(0xc000ba2fc0?, {0x1802e40?, 0xc0001ec018?}, 0x0?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000000000, {0x1802e40, 0xc0001ec018}, 0xc000ba2480)
/src/vendor/github.com/gorilla/mux/mux.go:212 +0x1e2
github.com/open-policy-agent/opa/server/authorizer.(*Basic).ServeHTTP(0xc000789620, {0x1802e40, 0xc0001ec018}, 0xc0001ec018?)
/src/server/authorizer/authorizer.go:129 +0x5a4
net/http.HandlerFunc.ServeHTTP(0x0?, {0x1802e40?, 0xc0001ec018?}, 0xc0006a3690?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/open-policy-agent/opa/internal/prometheus.(*Provider).InstrumentHandler.func1({0x7f21a71208a0, 0xc000b941e0}, 0xc000ba2000)
/src/internal/prometheus/prometheus.go:89 +0x136
net/http.HandlerFunc.ServeHTTP(0x1802150?, {0x7f21a71208a0?, 0xc000b941e0?}, 0x41af91?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func1({0x1802150, 0xc000b9a0a0}, 0xc000ba2000)
/src/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:97 +0xfd
net/http.HandlerFunc.ServeHTTP(0xc000b94150?, {0x1802150?, 0xc000b9a0a0?}, 0x12de5b5?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/open-policy-agent/opa/server.(*Server).initHandlerCompression.CompressHandler.func1({0x1802150, 0xc000b9a0a0}, 0xc000ba2000)
/src/server/handlers/compress.go:41 +0x175
net/http.HandlerFunc.ServeHTTP(0xc0006a3ad8?, {0x1802150?, 0xc000b9a0a0?}, 0x11?)
/usr/local/go/src/net/http/server.go:2166 +0x29
github.com/open-policy-agent/opa/runtime.(*LoggingHandler).ServeHTTP(0xc0005ebd10, {0x1801be0, 0xc00078e0e0}, 0xc000ba2000)
/src/runtime/logging.go:116 +0xad2
net/http.serverHandler.ServeHTTP({0x17fe1f0?}, {0x1801be0?, 0xc00078e0e0?}, 0x6?)
/usr/local/go/src/net/http/server.go:3137 +0x8e
net/http.(*conn).serve(0xc000b96000, {0x1804978, 0xc0003ae0f0})
/usr/local/go/src/net/http/server.go:2039 +0x5e8
created by net/http.(*Server).Serve in goroutine 111
/usr/local/go/src/net/http/server.go:3285 +0x4b4

Steps To Reproduce

We haven't been able to reproduce this issue locally, but we'll provide an update if we're successful.

Expected behavior

there shouldn't be a panic and opa server should respond back with the decision.

Additional context

This issue happens randomly, with no difference in the input between the requests that panic and the ones that succeed. The policy and data files are too big to share here, but I can create a smaller example if needed. Maybe the error logs are helpful for now?

@alam-chime alam-chime added the bug label Oct 12, 2024
@anderseknert
Copy link
Member

Hey, and thanks for filing this issue! Those error logs are helpful indeed. A few of us looked into this briefly today, and tbh, this is quite a mystery 🕵️ There's nothing obvious on those lines that should cause a nil dereference under any normal circumstances... and we've not yet managed to come up with even exceptional circumstance that provably cause one. If there's some obvious case we've overlooked, I'd be happy to hear about it!

@alam-chime
Copy link
Author

thanks @anderseknert for taking a look. Do you have any suggestions for debugging this issue? I'm also adding more error logs related to the panics we're seeing. I'll get back with a small example for our setup soon.

2024/10/11 09:12:51 http: panic serving 127.0.0.1:37098: runtime error: makeslice: len out of range
goroutine 950702 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x114a0e0?, 0x17f50f0?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.(*encodeState).marshal.func1()
/usr/local/go/src/encoding/json/encode.go:293 +0x6d
panic({0x114a0e0?, 0x17f50f0?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.mapEncoder.encode({0x100000002?}, 0xc000dc8240, {0x1135cc0?, 0x666390?, 0x1135cc0?}, {0x5f?, 0x85?})


2024/10/12 16:27:42 http: panic serving 127.0.0.1:41396: can't call pointer on a non-pointer Value
goroutine 623239 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x10cf800?, 0x17f6460?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.(*encodeState).marshal.func1()
/usr/local/go/src/encoding/json/encode.go:293 +0x6d
panic({0x10cf800?, 0x17f6460?})
/usr/local/go/src/runtime/panic.go:770 +0x132
reflect.Value.pointer(...)
/usr/local/go/src/reflect/value.go:110
reflect.Value.lenNonSlice({0x1135cc0?, 0xc000b90cf0?, 0x41a205?})


2024/10/13 22:14:10 http: panic serving 127.0.0.1:52366: reflect: call of reflect.Value.IsNil on zero Value
goroutine 1459006 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x112a800?, 0xc000879c68?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.(*encodeState).marshal.func1()
/usr/local/go/src/encoding/json/encode.go:293 +0x6d
panic({0x112a800?, 0xc000879c68?})
/usr/local/go/src/runtime/panic.go:770 +0x132
reflect.Value.IsNil(...)
/usr/local/go/src/reflect/value.go:1574
encoding/json.interfaceEncoder(0xc0005c1080, {0x1115f80?, 0xc000682f40?, 0x43?}, {0xe0?, 0x5f?})
/usr/local/go/src/encoding/json/encode.go:654 +0x110


2024/10/13 14:40:25 http: panic serving 127.0.0.1:35712: unexpected map key type
goroutine 1730644 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x10cf800?, 0x17f5430?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.(*encodeState).marshal.func1()
/usr/local/go/src/encoding/json/encode.go:293 +0x6d
panic({0x10cf800?, 0x17f5430?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.resolveKeyName({0x10cf800?, 0xc000ac6210?, 0xc000ac0660?})


2024/10/14 20:20:17 http: panic serving 127.0.0.1:41322: runtime error: hash of unhashable type ast.String
goroutine 586544 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x114a0e0?, 0xc0011b1070?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/open-policy-agent/opa/topdown.(*baseCache).Put(...)
/src/topdown/cache.go:126
github.com/open-policy-agent/opa/topdown.(*eval).resolveReadFromStorage(0xc0011cc400, {0xc000578828, 0x3, 0x3}, {0x0, 0x0})
/src/topdown/eval.go:1641 +0x58f
github.com/open-policy-agent/opa/topdown.(*evalResolver).Resolve(0xc0011d3ec0, {0xc000578828, 0x3, 0x3})
/src/topdown/eval.go:1563 +0x657
github.com/open-policy-agent/opa/topdown.(*eval).Resolve(...)
/src/topdown/eval.go:1480


2024/10/14 19:43:45 http: panic serving 127.0.0.1:42696: reflect: call of reflect.Value.Bool on zero Value
goroutine 1640899 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x112a800?, 0xc0010d2ae0?})
/usr/local/go/src/runtime/panic.go:770 +0x132
encoding/json.(*encodeState).marshal.func1()
/usr/local/go/src/encoding/json/encode.go:293 +0x6d
panic({0x112a800?, 0xc0010d2ae0?})
/usr/local/go/src/runtime/panic.go:770 +0x132
reflect.flag.mustBe(...)
/usr/local/go/src/reflect/value.go:233
reflect.Value.panicNotBool({0x10cfc00?, 0x2025768?, 0x532965?})
/usr/local/go/src/reflect/value.go:302 +0x7e
reflect.Value.Bool(...)
/usr/local/go/src/reflect/value.go:296


2024/10/14 18:57:46 http: panic serving 127.0.0.1:47100: runtime error: invalid memory address or nil pointer dereference
goroutine 227179 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x114a0e0?, 0x2102610?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc000742870})
/src/internal/deepcopy/deepcopy.go:19 +0x173
github.com/open-policy-agent/opa/internal/deepcopy.Map(...)
/src/internal/deepcopy/deepcopy.go:28
github.com/open-policy-agent/opa/internal/deepcopy.DeepCopy({0x1135cc0?, 0xc001165980})


2024/10/12 13:12:22 http: panic serving 127.0.0.1:37580: illegal value: ast.String
goroutine 485802 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1898 +0xbe
panic({0x10cf800?, 0xc0003a8310?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/open-policy-agent/opa/ast.sortOrder({0x11d3b40?, 0xc00039fae0?})
/src/ast/compare.go:302 +0x3a5
github.com/open-policy-agent/opa/ast.Compare({0x11d3b40?, 0xc00039fbb0?}, {0x11d3b40?, 0xc00039fae0?})
/src/ast/compare.go:68 +0x10e
github.com/open-policy-agent/opa/ast.objectElemSlice.Less(...)
/src/ast/term.go:2087

@philipaconrad
Copy link
Contributor

I'll get back with a small example for our setup soon.

@alam-chime Did you happen to put together the example? If you're able to share any additional steps or information about your OPA config, it could be helpful for debugging. 🙂

@alam-chime
Copy link
Author

sorry @philipaconrad I missed this message. Will share an example this week.

Copy link

stale bot commented Dec 14, 2024

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue.

@stale stale bot added the inactive label Dec 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants