Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add db.cosmosdb.partition_key as a span attr on item ops #23664

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

kjg
Copy link
Contributor

@kjg kjg commented Oct 28, 2024

It would be useful to have the partition_key as an attribute on Item operations to make it easier to see if there are any performance based patterns in certain partitions, what are you thoughts on including it?

I know that db.cosmosdb.partition_key isn't in the OpenTelemetry Semantic Conventions (yet) as an attribute, but we could open a PR for that as well.

  • The purpose of this PR is explained in this or a referenced issue.
  • The PR does not update generated files.
  • Tests are included and/or updated for code changes.
  • Updates to module CHANGELOG.md are included.
  • MIT license headers are included in each file.

@github-actions github-actions bot added Community Contribution Community members are working on the issue Cosmos customer-reported Issues that are reported by GitHub users external to the Azure organization. labels Oct 28, 2024
Copy link

Thank you for your contribution @kjg! We will review the pull request and get back to you soon.

@Pilchie
Copy link
Member

Pilchie commented Oct 30, 2024

@FabianMeiswinkel @sourabh1007 @jcocchi - any thoughts here on the impact of this?

@analogrelay
Copy link
Member

My main concern is PII leakage. The query text and parameters are in the Semantic Conventions but with explicit notes about sanitization. The partition key could be considered PII in some cases. I don't know for sure if we currently emit db.query.text or db.query.parameter.<key> but they're all in the same "class" of value to me, and should be opt-in.

I'm OK having it as an option, it does seem quite useful if you know your PKs are not PII.

@kjg
Copy link
Contributor Author

kjg commented Oct 30, 2024

azcosmos does not yet emit db.query.text but it actually is on my list of PRs I'd like to make. I'll see what I can do about making it optional. Any thoughts on the best place for consumers to specify the option and if it should be opt-in or opt-out?

For my team's use case, I could see us wanting to emit db.query.text and db.cosmosdb.partition_key but probably not db.query.parameter.<key>, so I'm not sure if we'd want to make them separately optional?

@analogrelay
Copy link
Member

azcosmos does not yet emit db.query.text but it actually is on my list of PRs I'd like to make.

I absolutely think we need db.query.text to be opt-in, given the requirement in the OpenTelemetry guidance (which aligns with our own privacy/security guidance) to sanitize query text. I don't think it's feasible for us to add sanitization logic, so having it be opt-in is the appropriate alternative.


One thought is to have a list of "enabled tracing attributes" in azcosmos.ClientOptions. This would be a list of opt-in tracing attributes which the user is opting in to. Alas, my Go is rusty, but I don't think you can initialize a slice inside a struct initializer. What I'd like is to be able to write this:

options := azcosmos.ClientOptions {
    EnabledTracingAttributes: []string { "db.query.text", "db.cosmosdb.partition_key" },
}

But I don't know if that's feasible.

There's also the element of having to scan this list every time we build a span, to identify which attributes to specify.

I'd be curious what, if anything, other SDKs (both other Azure services' Go SDKs and Azure SDKs for other languages) do here 🤔 . Might poke around a bit and see what I can find.

@sourabh1007
Copy link

azcosmos does not yet emit db.query.text but it actually is on my list of PRs I'd like to make.

I absolutely think we need db.query.text to be opt-in, given the requirement in the OpenTelemetry guidance (which aligns with our own privacy/security guidance) to sanitize query text. I don't think it's feasible for us to add sanitization logic, so having it be opt-in is the appropriate alternative.

One thought is to have a list of "enabled tracing attributes" in azcosmos.ClientOptions. This would be a list of opt-in tracing attributes which the user is opting in to. Alas, my Go is rusty, but I don't think you can initialize a slice inside a struct initializer. What I'd like is to be able to write this:

options := azcosmos.ClientOptions {
    EnabledTracingAttributes: []string { "db.query.text", "db.cosmosdb.partition_key" },
}

But I don't know if that's feasible.

There's also the element of having to scan this list every time we build a span, to identify which attributes to specify.

I'd be curious what, if anything, other SDKs (both other Azure services' Go SDKs and Azure SDKs for other languages) do here 🤔 . Might poke around a bit and see what I can find.

@analogrelay We have added db.query.text as an opt-in attribute (ref. here) in cosmos DB SDK.

@sourabh1007
Copy link

@FabianMeiswinkel @sourabh1007 @jcocchi - any thoughts here on the impact of this?

As @analogrelay pointed out, the primary concern here is protecting PII, particularly if referencing logical partition keys tied to customer data. Instead, we could use physical identifiers such as partitionid (and replicaid) or pkrangeid, which would help in identifying potential hot partition issues without exposing sensitive information.

In the Cosmos DB SDK, we currently don't emit traces at the network level, as network interactions can involve multiple partition IDs per operation. Therefore, we haven't included this data in traces yet. However, we are planning to introduce it as an optional dimension in network-level OpenTelemetry (OTel) metrics (ref here). This would allow users to opt in as needed, helping to manage cardinality effectively.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Community Contribution Community members are working on the issue Cosmos customer-reported Issues that are reported by GitHub users external to the Azure organization.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants