Implement SQLCommenter support #176
Conversation
intercalate :: (Monoid a, Foldable f) => a -> f a -> a
intercalate delim l = mconcat (intersperse delim l)
{-# INLINE intercalate #-}
These are foldable versions of the list equivalent.
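For example (a small illustration only, using a plain list as the Foldable and String as the Monoid, assuming this module's intercalate is in scope):

example :: String
example = intercalate ", " ["foo='bar'", "baz='qux'"]
-- example == "foo='bar', baz='qux'"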
-- SQL Commenter spec wants them escaped with a slash, but this should
-- probably solve the same issue
unreservedQS :: Word8Set
unreservedQS = foldr insert SqlCommenter.empty $ map c2w "-_.~'"
This builds a data structure that can perform fast checks to determine whether to escape a character or not for a URL. This is similar to @ekmett's charset package, but with a much more constrained purpose, and is intended to avoid incurring extra dependencies. https://hackage.haskell.org/package/charset-0.3.11/docs/Data-CharSet.html
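As a rough sketch of how such a set would be consulted while escaping (illustrative only; the member-style predicate and the exact escaping rules here are assumptions, not this module's API):

import Data.Word (Word8)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import Text.Printf (printf)

percentEncode :: (Word8 -> Bool) -> BS.ByteString -> BS.ByteString
percentEncode isUnreserved = BS.concatMap encodeByte
  where
    -- Bytes in the unreserved set (or plain alphanumerics) pass through
    -- untouched; everything else becomes a %XX escape.
    encodeByte w
      | isUnreserved w || isAlphaNum w = BS.singleton w
      | otherwise = BC.pack (printf "%%%02X" w)
    isAlphaNum w =
      (w >= 0x30 && w <= 0x39) || (w >= 0x41 && w <= 0x5A) || (w >= 0x61 && w <= 0x7A)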
sqlCommenterKey :: Ctxt.Key (Map Text Text)
sqlCommenterKey = unsafePerformIO $ Ctxt.newKey "sqlcommenter-attributes"
{-# NOINLINE sqlCommenterKey #-}

lookupSqlCommenterAttributes :: Ctxt.Context -> Map Text Text
lookupSqlCommenterAttributes ctxt = case Ctxt.lookup sqlCommenterKey ctxt of
  Nothing -> mempty
  Just attrs -> attrs

getSqlCommenterAttributes :: IO (Map Text Text)
getSqlCommenterAttributes = lookupSqlCommenterAttributes <$> TL.getContext
I guess sqlCommenterKey has to be a global variable b/c of how Vault works, and since it's an implementation detail that never leaks beyond the scope of this module there shouldn't be a problem here?

Checking my understanding: we can't make sqlCommenterKey :: IO _ because we need lookupSqlCommenterAttributes to have a pure API for any callers who might need to use it in a pure context (even though getSqlCommenterAttributes :: IO _).
Right. The main thing is that sqlCommenterKey needs to be stable across uses. If we had users initialize the key and pass it in any time they want to use it to look up the sqlcommenter attributes, then we could make it an IO value. However, I don't think that's a great experience, and it is relatively error prone compared to just making sure that there's a stable reference available.

And, while it's not exported currently, there might be a world in which we want to support adding attributes at a thread-local storage level, which necessitates the stable key.
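For example, that could look roughly like this (a sketch only; addSqlCommenterAttributes is hypothetical and not part of this PR, and it assumes adjustContext from OpenTelemetry.Context.ThreadLocal):

import qualified Data.Map.Strict as M
import Data.Text (Text)
import qualified OpenTelemetry.Context as Ctxt
import OpenTelemetry.Context.ThreadLocal (adjustContext)

-- Hypothetical helper: merge new attributes into the thread-local context,
-- relying on the single stable sqlCommenterKey defined above.
addSqlCommenterAttributes :: M.Map Text Text -> IO ()
addSqlCommenterAttributes newAttrs = adjustContext $ \ctxt ->
  let existing = maybe mempty id (Ctxt.lookup sqlCommenterKey ctxt)
   in Ctxt.insert sqlCommenterKey (M.union newAttrs existing) ctxt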
I didn't see this sort of context in the newKey docs, so it might be worth adding to a Haddock on sqlCommenterKey at least. Up to you, though.
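Something along these lines, maybe (the wording is illustrative, not the PR's actual Haddock):

-- | Stable, module-level key under which sqlcommenter attributes are stored in
-- the thread-local 'Ctxt.Context'. It must be created exactly once (hence
-- 'unsafePerformIO' with NOINLINE): vault-style lookups only succeed against
-- the same key value used for insertion, and callers of
-- 'lookupSqlCommenterAttributes' need a pure API rather than threading an
-- IO-created key through their code.
sqlCommenterKey :: Ctxt.Key (Map Text Text)
sqlCommenterKey = unsafePerformIO $ Ctxt.newKey "sqlcommenter-attributes"
{-# NOINLINE sqlCommenterKey #-}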
  & M.insert "traceparent" (unsafeConvert traceparent)
  & M.insert "tracestate" (unsafeConvert tracestate)
  where
    unsafeConvert = B.toText . B.unsafeFromByteString
encodeSpanContext gives us two ByteStrings.

It looks like these are produced by traceparentHeader, which appears to construct them directly using a combination of primitives (char7 '-', word8HexFixed on the Word8 stored in TraceFlags) & Base16-encoded SpanId & TraceId values.

You've got tests for this, so presumably something upholds the invariant that all of these must be UTF-8 encoded text, but I'm having trouble understanding what exactly that is (if it is anything besides convention).
Good question. The W3C traceparent and tracestate headers by definition are US-ASCII encoded, so we don't have to think super hard about this, assuming that encodeSpanContext (from OpenTelemetry.Propagator.W3CTraceContext) is implemented correctly.
Aha okay, that's the missing link for me; I would maybe add this in a comment onto unsafeConvert, in the same manner that Rust encourages documenting assumed invariants around unsafe blocks, just so that it's clear.
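For example, the existing where clause with a suggested safety note (the comment text here is illustrative, not part of the PR):

  where
    -- SAFETY: the two ByteStrings come from encodeSpanContext, which renders
    -- the W3C traceparent / tracestate header values. Those are US-ASCII by
    -- specification, so reinterpreting the bytes as UTF-8 Text here cannot
    -- produce invalid Text.
    unsafeConvert = B.toText . B.unsafeFromByteString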
specify "Does not add comment if the query already has a comment" $ do | ||
let queries = | ||
[ "SELECT * FROM table -- noodle" | ||
, "SELECT * FROM table -- noodle\n" | ||
, "SELECT * FROM table /* noodle\n poodle\n*/" | ||
] | ||
someAttrs = M.fromList [("foo", "bar")] | ||
for_ queries $ \query -> do | ||
sqlCommenter query someAttrs `shouldBe` query |
I don't quite follow what this is testing; someAttrs is just ignored entirely and the query is returned as-is because there is an existing comment?
Yes, exactly.
Per the sqlcommenter spec:
If a comment already exists within a SQL statement, we MUST NOT mutate that statement.
specify "Span attributes are picked up from thread-local context" $ do | ||
tp <- createTracerProvider [] emptyTracerProviderOptions | ||
let t = makeTracer tp (InstrumentationLibrary "test" "test") tracerOptions | ||
inSpan t "test" defaultSpanArguments $ do | ||
attrs <- getSqlCommenterAttributesWithTraceData | ||
M.lookup "traceparent" attrs `shouldSatisfy` isJust | ||
M.lookup "tracestate" attrs `shouldSatisfy` isJust | ||
specify "Parsing a queries reads attributes from the first comment" $ do | ||
let query1 = "SELECT * FROM table -- foo='bar'\n-- bar='baz'" | ||
query2 = "SELECT * FROM table /* noodle='poodle',wibble='wobble'*/ /* foo='bar' */" | ||
parseFirstSqlComment query1 `shouldBe` M.fromList [("foo", "bar")] | ||
parseFirstSqlComment query2 `shouldBe` M.fromList [("noodle", "poodle"), ("wibble", "wobble")] | ||
specify "Parsing a query decodes encoded bytes" $ do | ||
hedgehog $ do | ||
let attrList = M.fromList <$> Gen.list (Range.linear 0 30) ((,) <$> textValGen <*> textValGen) | ||
kvs <- forAll attrList | ||
kvs === parseFirstSqlComment (sqlCommenter "" kvs) |
I don't think any of these tests cover the case where sqlCommenter actually modifies the query, right?

I think it'd be useful to add an end-to-end test checking that span attributes get picked up and tacked onto a query in comments according to the sqlcommenter example documentation (or something).
Very good point. Will add tests for this.
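Something in this direction, perhaps (a sketch only; it round-trips through parseFirstSqlComment to avoid assuming the exact serialized comment format):

specify "Adds attributes to an uncommented query" $ do
  let query = "SELECT * FROM table"
      attrs = M.fromList [("framework", "yesod"), ("route", "/users")]
      commented = sqlCommenter query attrs
  -- The query must actually change...
  commented `shouldNotBe` query
  -- ...and the attributes must survive a round trip through the parser.
  parseFirstSqlComment commented `shouldBe` attrs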
I think this could do with a bit of explanation for how to actually use it? As far as I can tell, there are several things here:
I guess the expectation is that on top of this you:
I guess I'm a bit unsure about the "action-at-a-distance" transmission of the attributes. This requires knowing to put attributes in a special place, that is a) specific to databases, and b) specific to a particular method of adding information into queries. As a user, I maybe don't even know that the

An alternative would be to use something like the existing support for "carry-on" attributes. Then if a higher-scope piece of code wants to add an attribute that should get into the attributes of a database span, it can add it to the carry-ons?
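For reference, the wiring I can piece together from the exports looks roughly like this (a sketch; runQuery is a stand-in for whatever function actually executes the query):

import Data.Text (Text)

-- Annotate a query with the current thread-local / trace attributes right
-- before handing it off to whatever runs it.
runAnnotated :: (Text -> IO a) -> Text -> IO a
runAnnotated runQuery query = do
  attrs <- getSqlCommenterAttributesWithTraceData
  runQuery (sqlCommenter query attrs)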
This adds support for the sqlcommenter spec, as outlined by folks at Google. We're using it at https://github.com/MercuryTechnologies for improved correlation between SQL queries and the rest of the OTel ecosystem. The main area where we're seeing benefits is via https://pganalyze.com/, which has native support for the format. Namely, we can have "explain analyze" traces added to our existing application traces to help us understand why certain queries behave badly in production.

We've used this in prod for about 10 months, so I'm not particularly concerned about the efficacy of the implementation, but I do want to ensure that it builds against API changes & is updated on a similar cadence to other packages in this repo.
Why should this package live in this repo? Because Google donated it to the OpenTelemetry community: https://cloud.google.com/blog/products/databases/sqlcommenter-merges-with-opentelemetry