Implement SQLCommenter support #176

iand675 · 2025-02-13T18:23:39Z

This adds support for the sqlcommenter spec, as outlined by folks at Google. We're using it at https://github.com/MercuryTechnologies for improved correlation between SQL queries and the rest of the OTel ecosystem. The main area we're seeing benefits is via https://pganalyze.com/, which has native support for the format. Namely, we can have "explain analyze" traces added to our existing application traces to help us understand why certain queries behave badly in production.

We've used this in prod for about 10 months, so I'm not particularly concerned about the efficacy of the implementation, but I do want to ensure that it builds against API changes & is updated on a similar cadence to other packages in this repo.

Why should this package live in this repo? Because Google donated it to the OpenTelemetry community: https://cloud.google.com/blog/products/databases/sqlcommenter-merges-with-opentelemetry

iand675 · 2025-02-13T18:28:24Z

utils/sqlcommenter/src/SqlCommenter.hs

+
+intercalate :: (Monoid a, Foldable f) => a -> f a -> a
+intercalate delim l = mconcat (intersperse delim l)
+{-# INLINE intercalate #-}


These are foldable versions of the list equivalent.
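
For readers without the full diff, here is a minimal sketch of what Foldable-generalized versions of these helpers can look like. The names `intersperseF` and `intercalateF` are hypothetical, and going through `toList` is an assumption; the PR's own `intersperse` may be implemented differently.

```haskell
import Data.Foldable (toList)
import qualified Data.List as L

-- Hypothetical names; the PR defines its own intersperse/intercalate.
-- Place the delimiter between the elements of any Foldable container.
intersperseF :: Foldable f => a -> f a -> [a]
intersperseF delim = L.intersperse delim . toList

-- Concatenate the elements of any Foldable, separated by the delimiter.
intercalateF :: (Monoid a, Foldable f) => a -> f a -> a
intercalateF delim = mconcat . intersperseF delim
```

With this sketch, `intercalateF "," ["a", "b", "c"]` evaluates to `"a,b,c"`, and the same function also accepts a `Maybe`, a `Set`, or any other Foldable.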

iand675 · 2025-02-13T18:30:25Z

utils/sqlcommenter/src/SqlCommenter.hs

+-- SQL Commenter spec wants them escaped with a slash, but this should
+-- probably solve the same issue
+unreservedQS :: Word8Set
+unreservedQS = foldr insert SqlCommenter.empty $ map c2w "-_.~'"


This builds a data structure that can perform fast checks to determine whether a character needs to be escaped for a URL. It is similar to @ekmett's charset package, but with a much more constrained purpose, and is intended to avoid incurring extra dependencies. https://hackage.haskell.org/package/charset-0.3.11/docs/Data-CharSet.html
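
To illustrate the idea, a byte set with constant-time membership can be sketched as a 256-bit bitmap. Everything below is an assumption for illustration: `ByteSet`, `insertByte`, `memberByte` and `charToByte` are hypothetical names, and backing the set with an `Integer` is a simplification that may not match the PR's `Word8Set` representation.

```haskell
import Data.Bits (setBit, testBit)
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical stand-in for the PR's Word8Set: bit n is set when byte n is a member.
newtype ByteSet = ByteSet Integer

emptyByteSet :: ByteSet
emptyByteSet = ByteSet 0

insertByte :: Word8 -> ByteSet -> ByteSet
insertByte w (ByteSet bits) = ByteSet (setBit bits (fromIntegral w))

memberByte :: Word8 -> ByteSet -> Bool
memberByte w (ByteSet bits) = testBit bits (fromIntegral w)

-- ASCII-only stand-in for c2w; only meaningful for characters below 256.
charToByte :: Char -> Word8
charToByte = fromIntegral . ord

-- The same characters as unreservedQS in the diff above.
unreservedSketch :: ByteSet
unreservedSketch = foldr (insertByte . charToByte) emptyByteSet "-_.~'"
```

An encoder would then leave a byte untouched when `memberByte` returns `True` and percent-encode it otherwise.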

jkachmar · 2025-02-13T18:44:25Z

utils/sqlcommenter/src/SqlCommenter.hs

+sqlCommenterKey :: Ctxt.Key (Map Text Text)
+sqlCommenterKey = unsafePerformIO $ Ctxt.newKey "sqlcommenter-attributes"
+{-# NOINLINE sqlCommenterKey #-}
+
+lookupSqlCommenterAttributes :: Ctxt.Context -> Map Text Text
+lookupSqlCommenterAttributes ctxt = case Ctxt.lookup sqlCommenterKey ctxt of
+  Nothing -> mempty
+  Just attrs -> attrs
+
+getSqlCommenterAttributes :: IO (Map Text Text)
+getSqlCommenterAttributes = lookupSqlCommenterAttributes <$> TL.getContext


I guess sqlCommenterKey has to be a global variable b/c of how Vault works, and since it's an implementation detail that never leaks beyond the scope of this module there shouldn't be a problem here?

checking my understanding: we can't make sqlCommenterKey :: IO _ because we need lookupSqlCommenterAttributes to have a pure API for any callers who might need to use it in a pure context (even though getSqlCommenterAttributes :: IO _).

Right. The main thing is that sqlCommenterKey needs to be stable across uses. If we had users initialize the key and pass it in any time they want to look up the sqlcommenter attributes, then we could make it an IO value. However, I don't think that's a great experience, and it is relatively error-prone compared to just making sure that there's a stable reference available.

And, while it's not exported currently, there might be a world in which we want to support adding attributes at a thread-local storage level, which necessitates the stable key.

I didn't see this sort of context in newKey docs, so it might be worth adding to a Haddock on sqlCommenterKey at least. up to you, though
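
A rough sketch of the kind of Haddock being suggested, together with a demonstration of why the key has to be stable. This uses `Data.Unique` from base as a stand-in for `Ctxt.Key`, on the assumption that context keys behave the same way (each `newKey` call yields a distinct key); `stableKey` and `freshKeysMatch` are hypothetical names.

```haskell
import Data.Unique (Unique, newUnique)
import System.IO.Unsafe (unsafePerformIO)

-- | Process-wide key, created once.
--
-- Every call to the allocator returns a fresh, distinct key, so a value stored
-- under one key can only be found again with that same key. The NOINLINE pragma
-- keeps GHC from duplicating the allocation at use sites.
stableKey :: Unique
stableKey = unsafePerformIO newUnique
{-# NOINLINE stableKey #-}

-- Two independently allocated keys never compare equal, which is why
-- allocating a key per lookup would never find anything.
freshKeysMatch :: IO Bool
freshKeysMatch = (==) <$> newUnique <*> newUnique
```

Here `stableKey == stableKey` holds on every use, while `freshKeysMatch` returns `False`.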

jkachmar · 2025-02-13T18:57:29Z

utils/sqlcommenter/src/SqlCommenter.hs

+    & M.insert "traceparent" (unsafeConvert traceparent)
+    & M.insert "tracestate" (unsafeConvert tracestate)
+  where
+    unsafeConvert = B.toText . B.unsafeFromByteString


encodeSpanContext gives us two ByteStrings.

it looks like these are produced by traceparentHeader, which appears to construct these directly using a combination of primitives (char7 '-', word8HexFixed on the Word8 stored in TraceFlags) & Base16-encoded SpanId & TraceId values.

You've got tests for this, so presumably something upholds the invariant that all of these must be UTF-8 encoded text, but I'm having trouble understanding what exactly that is (if it is anything besides convention).

Good question. The W3C traceparent and tracestate headers by definition are US-ASCII encoded, so we don't have to think super hard about this assuming that import OpenTelemetry.Propagator.W3CTraceContext (encodeSpanContext) is implemented correctly.

aha okay, that's the missing link for me; I would maybe add this in a comment onto unsafeConvert in the same manner that Rust encourages documenting assumed invariants around unsafe blocks just so that it's clear.
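
As a sketch of how that invariant could be written down, or checked in a debug build, here is a hypothetical checked counterpart to `unsafeConvert`. It uses plain `bytestring` and `text` calls rather than whatever module `B` refers to in the diff, and `checkedConvert` is not part of the PR.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- INVARIANT (assumed from the W3C Trace Context headers being US-ASCII):
-- the traceparent and tracestate bytes are all below 0x80, so they are
-- also valid UTF-8 and can be reinterpreted as text without validation.

-- Hypothetical checked variant: returns Nothing if the invariant is violated.
checkedConvert :: BS.ByteString -> Maybe Text
checkedConvert bs
  | BS.all (< 0x80) bs = Just (TE.decodeUtf8 bs) -- ASCII is a subset of UTF-8
  | otherwise = Nothing
```

The unchecked conversion in the diff skips this scan, which is reasonable as long as the comment records where the guarantee comes from.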

jkachmar · 2025-02-13T19:05:43Z

utils/sqlcommenter/test/Spec.hs

+  specify "Does not add comment if the query already has a comment" $ do
+    let queries =
+          [ "SELECT * FROM table -- noodle"
+          , "SELECT * FROM table -- noodle\n"
+          , "SELECT * FROM table /* noodle\n  poodle\n*/"
+          ]
+        someAttrs = M.fromList [("foo", "bar")]
+    for_ queries $ \query -> do
+      sqlCommenter query someAttrs `shouldBe` query


I don't quite follow what this is testing; someAttrs is just ignored entirely and the query is returned as-is because there is an existing comment?

Yes, exactly.

Per the sqlcommenter spec:

If a comment already exists within a SQL statement, we MUST NOT mutate that statement.
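
A minimal sketch of that rule, assuming a purely textual check for comment markers. `hasComment` and `appendComment` are hypothetical names, and the check is deliberately naive: it would also match `--` or `/*` inside a string literal, which the PR's implementation may or may not handle.

```haskell
import Data.List (isInfixOf)

-- Naive detection: any line-comment or block-comment opener counts.
hasComment :: String -> Bool
hasComment q = "--" `isInfixOf` q || "/*" `isInfixOf` q

-- Append an already-rendered comment body, leaving commented queries untouched.
appendComment :: String -> String -> String
appendComment body q
  | hasComment q = q
  | otherwise = q ++ " /*" ++ body ++ "*/"
```

All three queries in the test above contain a comment marker, so this sketch returns them unchanged.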

jkachmar · 2025-02-13T19:22:07Z

utils/sqlcommenter/test/Spec.hs

+  specify "Span attributes are picked up from thread-local context" $ do
+    tp <- createTracerProvider [] emptyTracerProviderOptions
+    let t = makeTracer tp (InstrumentationLibrary "test" "test") tracerOptions
+    inSpan t "test" defaultSpanArguments $ do
+      attrs <- getSqlCommenterAttributesWithTraceData
+      M.lookup "traceparent" attrs `shouldSatisfy` isJust
+      M.lookup "tracestate" attrs `shouldSatisfy` isJust
+  specify "Parsing a queries reads attributes from the first comment" $ do
+    let query1 = "SELECT * FROM table -- foo='bar'\n-- bar='baz'"
+        query2 = "SELECT * FROM table /* noodle='poodle',wibble='wobble'*/ /* foo='bar' */"
+    parseFirstSqlComment query1 `shouldBe` M.fromList [("foo", "bar")]
+    parseFirstSqlComment query2 `shouldBe` M.fromList [("noodle", "poodle"), ("wibble", "wobble")]
+  specify "Parsing a query decodes encoded bytes" $ do
+    hedgehog $ do
+      let attrList = M.fromList <$> Gen.list (Range.linear 0 30) ((,) <$> textValGen <*> textValGen)
+      kvs <- forAll attrList
+      kvs === parseFirstSqlComment (sqlCommenter "" kvs)


I don't think any of these tests cover the case where sqlCommenter actually modifies the query, right?

I think it'd be useful to add an end-to-end test to check that span attributes get picked up and tacked onto a query as comments, according to the sqlcommenter example documentation (or something).

Very good point. Will add tests for this.
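
For reference, the shape such a test would assert on can be sketched with a small stand-alone renderer. This is not the PR's `sqlCommenter`: `encodeValue`, `renderComment` and `addComment` are hypothetical, the input is assumed to be ASCII-only, and the exact spacing before the comment is an assumption. It follows the sqlcommenter convention of sorted keys, URL-encoded values wrapped in single quotes, and pairs separated by commas.

```haskell
import Data.Char (isAlphaNum, isAscii, ord, toUpper)
import Data.List (intercalate)
import qualified Data.Map.Strict as M
import Numeric (showHex)

-- Percent-encode everything except ASCII letters, digits and "-_.~".
-- Assumes ASCII input; non-ASCII characters are not handled correctly here.
encodeValue :: String -> String
encodeValue = concatMap encodeChar
  where
    encodeChar c
      | isAscii c && (isAlphaNum c || c `elem` "-_.~") = [c]
      | otherwise = '%' : pad (map toUpper (showHex (ord c) ""))
    pad [d] = ['0', d]
    pad ds = ds

-- Render key='value' pairs in ascending key order inside a block comment.
renderComment :: M.Map String String -> String
renderComment attrs =
  "/*" ++ intercalate "," [encodeValue k ++ "='" ++ encodeValue v ++ "'" | (k, v) <- M.toAscList attrs] ++ "*/"

-- Leave the query alone when there is nothing to add.
addComment :: String -> M.Map String String -> String
addComment q attrs
  | M.null attrs = q
  | otherwise = q ++ " " ++ renderComment attrs
```

Under these assumptions, `addComment "SELECT 1" (M.fromList [("route", "/users"), ("action", "list")])` yields `SELECT 1 /*action='list',route='%2Fusers'*/`, which is the kind of literal expectation an end-to-end test could pin down.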

michaelpj · 2025-02-14T18:23:14Z

I think this could do with a bit of explanation for how to actually use it? As far as I can tell, there are several things here:

- Support for adding sqlcommenter comments to queries. This seems useful and arguably deserves its own library, which need not depend on any OTel bits?
- Support for a named section of the thread-local context which is used for special attributes associated with sqlcommenter
- Support for adding some OTel specific attributes into the special section

I guess the expectation is that on top of this you:

- Modify the instrumentation libraries for databases to call getSqlCommenterAttributesWithTraceData and then use sqlcommenter to actually put them in queries.
- Tell people to add things into the special sqlcommenter section in order to get them into their queries.

I guess I'm a bit unsure about the "action-at-a-distance" transmission of the attributes. This requires knowing to put attributes in a special place, that is a) specific to databases, and b) specific to a particular method of adding information into queries. As a user, I maybe don't even know that the postgresql-simple instrumentation library is using sqlcommenter - should I need to know that? If we were going to have a special section for such attributes, perhaps it should be called postgres or database?

An alternative would be to use something like the existing support for "carry-on" attributes. Then if a higher-scope piece of code wants to add an attribute that should get into the attributes of a database span, it can add it to the carry-ons?

iand675 added 2 commits May 29, 2024 09:46

Add sqlcommenter package

9b7d632

Merge remote-tracking branch 'origin/main' into sqlcommenter

dfda522

iand675 requested review from ocharles, michaelpj, lf- and 9999years February 13, 2025 18:23

Add sqlcommenter to cabal.project

4294258
