
Enhancing instrumentation manual install #1632

Open
LikeTheSalad wants to merge 4 commits into open-telemetry:main from LikeTheSalad:enhancing-instrumentation-install


@LikeTheSalad
Contributor

Related to: #1541

These changes are intended to address "challenge 1", as explained in this comment, by making the OpenTelemetryRum implementation responsible for creating the InstallationContext object, so that it contains the same contextual dependencies that were used to create the OpenTelemetry instance.

LikeTheSalad changed the title from "Enhancing instrumentation install" to "Enhancing instrumentation manual install" on Mar 4, 2026
@codecov

codecov bot commented Mar 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 61.80%. Comparing base (e4fb5a0) to head (d84bfa5).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1632      +/-   ##
==========================================
+ Coverage   61.67%   61.80%   +0.12%     
==========================================
  Files         159      159              
  Lines        3418     3427       +9     
  Branches      348      349       +1     
==========================================
+ Hits         2108     2118      +10     
  Misses       1215     1215              
+ Partials       95       94       -1     


LikeTheSalad marked this pull request as ready for review on March 4, 2026 16:01
LikeTheSalad requested a review from a team as a code owner on March 4, 2026 16:01
Member

@fractalwrench fractalwrench left a comment


Thanks for looking at this. I left a few comments inline but would be broadly happy with this sort of approach.

*
* @param instrumentation The instrumentation to install.
*/
fun install(instrumentation: AndroidInstrumentation)
Member


Did we want to consider marking this as an incubating/beta API in some way?
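One way to mark the new method as incubating, sketched here with Kotlin's standard `@RequiresOptIn` mechanism. The annotation name `IncubatingRumApi` and the simplified interfaces are assumptions made for illustration; they are not taken from the OTel Android codebase.

```kotlin
// Hypothetical marker annotation; the name is an assumption, not an existing OTel Android API.
@RequiresOptIn(
    message = "This API is incubating and may change without notice.",
    level = RequiresOptIn.Level.WARNING,
)
@Retention(AnnotationRetention.BINARY)
@Target(AnnotationTarget.FUNCTION, AnnotationTarget.CLASS)
annotation class IncubatingRumApi

// Simplified stand-ins for the types discussed in this PR.
interface AndroidInstrumentation {
    fun install()
}

interface OpenTelemetryRum {
    @IncubatingRumApi
    fun install(instrumentation: AndroidInstrumentation)
}

class FakeRum : OpenTelemetryRum {
    val installed = mutableListOf<AndroidInstrumentation>()

    // Overrides of an opt-in API carry the marker as well.
    @IncubatingRumApi
    override fun install(instrumentation: AndroidInstrumentation) {
        instrumentation.install()
        installed.add(instrumentation)
    }
}

// Callers have to opt in explicitly, which makes the unstable status visible at the call site.
@OptIn(IncubatingRumApi::class)
fun installManually(rum: OpenTelemetryRum, instrumentation: AndroidInstrumentation) {
    rum.install(instrumentation)
}
```

With `Level.WARNING`, callers that do not opt in get a compiler warning; `Level.ERROR` would make it a hard requirement.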

override fun install(instrumentation: AndroidInstrumentation) {
val ctx = InstallationContext(context, openTelemetrySdk, sessionProvider, clock)
instrumentation.install(ctx)
manuallyInstalledInstrumentations.add(instrumentation)
Member


Is it worth guarding against misuse by checking whether the reference is already in the collection?
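A minimal sketch of such a guard, assuming simplified stand-in types (`InstallationContext` here has no fields, and `RumSketch` is hypothetical). It tracks installed instrumentations by reference and skips a second install of the same instance.

```kotlin
import java.util.Collections
import java.util.IdentityHashMap

// Simplified stand-ins; the real types carry more dependencies.
class InstallationContext

interface AndroidInstrumentation {
    fun install(ctx: InstallationContext)
}

// Hypothetical class illustrating the guard; not the actual OpenTelemetryRumImpl.
class RumSketch {
    // Identity-based set: membership is decided by reference, not by equals().
    private val manuallyInstalled: MutableSet<AndroidInstrumentation> =
        Collections.newSetFromMap(IdentityHashMap<AndroidInstrumentation, Boolean>())

    /** Returns true if installed now, false if this reference was already installed. */
    fun install(instrumentation: AndroidInstrumentation): Boolean {
        if (instrumentation in manuallyInstalled) return false
        instrumentation.install(InstallationContext())
        // Recorded only after install() returns, so a failed install can be retried.
        manuallyInstalled.add(instrumentation)
        return true
    }
}
```

This version is not thread-safe; concurrent callers would need synchronization around the check and the add.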


override fun install(instrumentation: AndroidInstrumentation) {
val ctx = InstallationContext(context, openTelemetrySdk, sessionProvider, clock)
instrumentation.install(ctx)
Member


Worth a try-catch for the install/uninstall functions, given that it's user-submitted code that probably shouldn't terminate the entire process?
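A sketch of what that could look like, assuming simplified stand-in types and a hypothetical `SafeInstaller` wrapper with an `onError` callback (neither is part of this PR). Failures in user-supplied `install`/`uninstall` code are reported instead of propagating.

```kotlin
// Simplified stand-ins for illustration only.
class InstallationContext

interface AndroidInstrumentation {
    fun install(ctx: InstallationContext)
    fun uninstall(ctx: InstallationContext) {}
}

// Hypothetical wrapper; onError is an assumed hook for logging or diagnostics.
class SafeInstaller(private val onError: (Throwable) -> Unit = {}) {
    private val installed = mutableListOf<AndroidInstrumentation>()

    /** Returns true when install() completed, false when it threw. */
    fun install(instrumentation: AndroidInstrumentation): Boolean =
        try {
            instrumentation.install(InstallationContext())
            installed.add(instrumentation)
            true
        } catch (e: Exception) {
            // A failing instrumentation is reported and skipped, not rethrown.
            onError(e)
            false
        }

    fun uninstallAll() {
        for (instrumentation in installed) {
            try {
                instrumentation.uninstall(InstallationContext())
            } catch (e: Exception) {
                onError(e)
            }
        }
        installed.clear()
    }
}
```

Catching `Exception` rather than `Throwable` leaves errors such as `OutOfMemoryError` untouched, which is a design choice rather than a requirement.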

Contributor

@breedx-splk breedx-splk left a comment


I'm failing to understand how adding the install(instrumentation) method on our public/stable OpenTelemetryRum interface is making things better or what problem we're solving. A couple of problems I see with this approach:

  1. I think this change makes things somewhat MORE confusing for users -- who now have to decide if their instrumentation should be added via OpenTelemetryRumBuilder.addInstrumentation(instrumentation) or if they should wait until after they build the OpenTelemetryRum instance in order to call install(instrumentation) on that.
  2. You've introduced mutable state to the OpenTelemetryRumImpl instance. Before this change, basically all the instrumentation state was configured and established at "build time". Once the rum instance was created, things didn't change...and there are numerous benefits to this. Along these same lines, I think it's confusing that a user can now just choose at any old random time to install an instrumentation. I think it's easier to reason about the code when all setup/installation is done at creation time.
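To illustrate the "everything is set up at build time" model described in point 2, here is a rough sketch with hypothetical stand-in types (`RumBuilderSketch` and `RumSketch` are not the real builder or implementation). Instrumentations are collected on the builder, installed once in `build()`, and the resulting instance exposes no way to add more.

```kotlin
// Simplified stand-ins for illustration only.
class InstallationContext

interface AndroidInstrumentation {
    fun install(ctx: InstallationContext)
}

// The built instance only holds a read-only snapshot; there is no install() method.
class RumSketch internal constructor(val instrumentations: List<AndroidInstrumentation>)

class RumBuilderSketch {
    private val pending = mutableListOf<AndroidInstrumentation>()

    fun addInstrumentation(instrumentation: AndroidInstrumentation): RumBuilderSketch = apply {
        pending.add(instrumentation)
    }

    fun build(): RumSketch {
        // One context, created by the builder, shared by every instrumentation.
        val ctx = InstallationContext()
        val snapshot = pending.toList()
        snapshot.forEach { it.install(ctx) }
        return RumSketch(snapshot)
    }
}
```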

I've read through the related issues and comments a couple times, but I still can't make sense of what this helps.

There's some suggestion that the InstallationContext is more accurate or something with this approach, but I think it's just passing along the same instances as before.

@LikeTheSalad
Contributor Author

LikeTheSalad commented Mar 6, 2026

Thanks for the detailed feedback, @breedx-splk

It's been a while since I explained the issues in this comment, and I wasn't aware of your opinion on it until now, which is why I proceeded to write the changes. Still, I'm glad that we can discuss it further, because it'd certainly be a big change if this gets through, so it's better to align on the details beforehand.

The problem

The current instrumentation API looks like this:

[Image: Screenshot 2026-03-06 at 10 16 31, diagram of the current instrumentation API]

It's all public, which means nothing stops any user from using it. For example, they can implement their own instrumentation and manually call install() at any time.

However, if they choose to do so, they'd need to construct an InstallationContext object, which requires all the fields shown in the diagram above. So the main roadblock is, how can a user construct an InstallationContext object?

Here's what we provide our users when initializing the agent:

[Image: Screenshot 2026-03-06 at 10 20 31, what users get when initializing the agent]

Which means that, out of all the dependencies from InstallationContext, we only help with 1 of them (the OpenTelemetry object). So here's what happens with the rest:

  • For Context, they can get it themselves from the Android SDK. However, we need a specific type of Context, the "Application Context". This matters because if an instrumentation implementation stores the context for whatever reason and the user provides a screen-related context (such as an Activity), that can cause a memory leak.
  • For SessionProvider, users would have to create their own object, which would naturally not provide the same session.id values as the one we configure internally when initializing OpenTelemetryRum. This can cause session inconsistency issues across the app.
  • For Clock, users would either have to provide their own implementation or use the default one, neither of which is the implementation that we use when initializing the agent. This can cause time-related inconsistencies.
  • An additional issue is that we intend for InstallationContext to be "extendable" so it can adapt to future instrumentation use cases, in case extra dependencies are needed by then. But once the API is stable, we won't be able to enforce those new dependencies in the constructor, as that would be a breaking change. So our only option would be to add them with default args, which would most likely suffer from inconsistencies similar to those of the previously mentioned dependencies.
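The SessionProvider point can be made concrete with a small sketch. All types here are hypothetical stand-ins (the real SessionProvider and InstallationContext differ); the sketch only assumes that each provider instance owns its own session id.

```kotlin
import java.util.UUID

// Hypothetical stand-in: each instance generates and keeps its own session id.
class SessionProviderSketch {
    private val sessionId: String = UUID.randomUUID().toString()
    fun getSessionId(): String = sessionId
}

// Hypothetical stand-in carrying only the dependency relevant to this example.
class InstallationContextSketch(val sessionProvider: SessionProviderSketch)

// True only when the context reports the same session as the agent's own provider.
fun sharesAgentSession(
    ctx: InstallationContextSketch,
    agentProvider: SessionProviderSketch,
): Boolean = ctx.sessionProvider.getSessionId() == agentProvider.getSessionId()

// A context built by the agent reuses the agent's provider.
fun agentBuiltContext(agentProvider: SessionProviderSketch) =
    InstallationContextSketch(agentProvider)

// A context built by a user has to bring its own provider, which yields a different id.
fun userBuiltContext() = InstallationContextSketch(SessionProviderSketch())
```

An instrumentation installed with the user-built context would tag its telemetry with a session id that the rest of the agent never uses.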

Potential solutions

You can find more details in this issue. Essentially, two options were discussed there:

  • Adding getters for all of the InstallationContext dependencies to OpenTelemetryRum. That removes the need to instantiate a third object to install instrumentations: we'd only need to pass OpenTelemetryRum to an instrumentation, which could find everything it needs there, and it's extendable without introducing breaking changes. This option was considered dangerous because it would give instrumentations the ability to shut the agent down, given that the OpenTelemetryRum API allows it.
  • Removing the need for users to construct the InstallationContext themselves, which would not only make it easier for them to manually install instrumentations, but also ensure that the instrumentations are consistent with the whole context used in OpenTelemetryRum. This option sounded like the least problematic in the thread, hence I created this PR with it.
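The shutdown concern raised for the first option can be sketched as follows. The interfaces are simplified, hypothetical stand-ins that keep only `getRumSessionId()` and `shutdown()`, the two members mentioned in this thread.

```kotlin
// Hypothetical, reduced view of the rum API for illustration.
interface OpenTelemetryRumSketch {
    fun getRumSessionId(): String
    fun shutdown()
}

// First option: instrumentations receive the rum instance itself.
interface InstrumentationSketch {
    fun install(rum: OpenTelemetryRumSketch)
}

class FakeRum : OpenTelemetryRumSketch {
    var isShutDown = false
        private set

    override fun getRumSessionId(): String = "session-1"

    override fun shutdown() {
        isShutDown = true
    }
}

// Nothing in the type system stops an instrumentation from doing this.
class MisbehavingInstrumentation : InstrumentationSketch {
    override fun install(rum: OpenTelemetryRumSketch) {
        rum.shutdown()
    }
}
```

With the second option, instrumentations only see InstallationContext, which carries the dependencies but not the lifecycle controls.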

Based on your comments, it sounds like you're only taking into account the ability to install instrumentations at the time of initializing OpenTelemetryRum. This option is also valid, in the sense that it would prevent the issues mentioned above. For it to work, though, we should explain in the docs that there's a "recommended way" of installing instrumentations, and that we wouldn't support issues that may arise from attempting to install them via other methods. We should also make the API more restricted if possible, to reduce the chances of users trying to install instrumentations by calling install() themselves. It probably wouldn't be a strictly enforced requirement, given that it's still technically possible to install instrumentations after the agent is initialized, but at least it would make maintenance easier, so I'm not opposed to going with this option. But we need to be aligned, so let me know what you think.

@breedx-splk
Contributor

Thanks @LikeTheSalad for the reply and discussion.

> It's all public, which means nothing stops any user from using it. For example, they can implement their own instrumentation and manually call install() at any time.

Sure, but I actually don't want users to call install() on instrumentation. This should be a responsibility of the agent (and/or OpenTelemetryRum startup/creation).

> However, if they choose to do so, they'd need to construct an InstallationContext object, which requires all the fields shown in the diagram above.

Right, and if you recall, we used to not have an InstallationContext class at all -- it used to be something like install(app, otel, etc)...and we intentionally added the InstallationContext class so that we could modify what gets passed to instrumentations without breaking the contract. It's much easier to change the internals of the InstallationContext implementation instead of changing the signature of the install() method (or that was the thinking at the time anyway).

> So the main roadblock is, how can a user construct an InstallationContext object?

So that's the thing: I don't want users to ever have to create instances of this! It's not a user's responsibility. I wish I could hide it or make it internal or something. If they have some custom AndroidInstrumentation that they've built, then they should wire that up to the agent (or builder) and away we go. The agent is then also responsible for managing the lifecycle and ensuring that installed bits are uninstalled, etc...without the user needing to manage this.

If we want to relax this and allow users to install AndroidInstrumentation implementations willy-nilly whenever they choose, then this is a pretty big change. If we do that, then maybe it's less bad to let the OpenTelemetryRumImpl contain the instance of the InstallationContext (it's effectively a singleton), and that could be accessed by users doing self-service install or whatever. One thing that I don't like about this approach immediately is that it then provides multiple ways to get at the same things (e.g. otelRum.getRumSessionId() and otelRum.getInstrumentationInstallationContext().getSessionProvider().getSessionId()) and it probably exposes more public interface than we want.

Another idea: I think if we put getClock() on OpenTelemetryRum we could delete the InstallationContext class and just go back to install(Context, OpenTelemetryRum).
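A rough sketch of that shape, using hypothetical stand-ins throughout (`AppContextSketch` replaces the Android Context, and `ClockSketch` is a reduced clock with a single `now()` method). It only shows that an instrumentation could read the clock and session id from the rum instance directly.

```kotlin
// Hypothetical stand-ins; none of these are the real Android or OTel types.
class AppContextSketch

interface ClockSketch {
    fun now(): Long
}

interface OpenTelemetryRumSketch {
    fun getRumSessionId(): String
    fun getClock(): ClockSketch // the accessor proposed in the comment above
}

// Proposed shape: no InstallationContext, only the context and the rum instance.
interface InstrumentationSketch {
    fun install(context: AppContextSketch, rum: OpenTelemetryRumSketch)
}

class TimestampingInstrumentation : InstrumentationSketch {
    var installedAt: Long = -1
        private set
    var sessionId: String? = null
        private set

    override fun install(context: AppContextSketch, rum: OpenTelemetryRumSketch) {
        // Both values come from the agent's own instances, so they stay consistent.
        installedAt = rum.getClock().now()
        sessionId = rum.getRumSessionId()
    }
}

// Test double with a fixed clock value.
class FixedRum(private val fixedNow: Long) : OpenTelemetryRumSketch {
    private val clock = object : ClockSketch {
        override fun now(): Long = fixedNow
    }

    override fun getRumSessionId(): String = "session-1"

    override fun getClock(): ClockSketch = clock
}
```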

@LikeTheSalad
Contributor Author

LikeTheSalad commented Mar 9, 2026

> So that's the thing: I don't want users to ever have to create instances of this! It's not a user's responsibility. I wish I could hide it or make it internal or something.

I think we all agree on this. The changes proposed in this PR are an option to prevent users from having to manually instantiate InstallationContext, while still keeping this type around.

> If we want to relax this and allow users to install AndroidInstrumentation implementations willy-nilly whenever they choose, then this is a pretty big change.

I think it's the other way around. Users have always had the option to install instrumentations willy-nilly whenever they choose. We could instead do what I mentioned earlier and at least document that there's only one "supported way" of doing so, and that we wouldn't provide support for issues that may arise from installing instrumentations in a "non-supported" way. It would probably still cause issues, though, as it would only be a virtual constraint.

> Another idea: I think if we put getClock() on OpenTelemetryRum we could delete the InstallationContext class and just go back to install(Context, OpenTelemetryRum).

This sounds like the first option I mentioned earlier, where we'd get rid of InstallationContext and instead provide its dependencies via OpenTelemetryRum. I've always liked that option, though it was considered dangerous because we'd expose the shutdown() method to instrumentations. But I like that it reduces the API surface, and I guess the shutdown concern can be verified for all the instrumentation implementations that we define in OTel Android, so it should be manageable.



3 participants