Note: This repository contains the guide documentation source. To view the guide in published form, view it on the Open Liberty website.
Learn how to add manual instrumentation to collect custom spans in traces and custom metrics from microservices by using MicroProfile Telemetry and the Grafana stack.
Automatic instrumentation in MicroProfile Telemetry makes it easy to capture traces, metrics, and logs without changing your code. However, it might not capture all the details that you need. Often, you need to monitor specific operations, track custom events, or measure performance indicators unique to your application.
Manual instrumentation lets you add custom spans and metrics directly to your code. These custom signals are collected alongside the automatically captured data, allowing you to view and analyze everything together in the same observability backend.
In this guide, you’ll learn how to add manual instrumentation to your microservices by creating custom spans in traces and custom metrics to extend the default telemetry. You’ll use the Grafana Docker OpenTelemetry LGTM image, a preconfigured OpenTelemetry observability backend based on the Grafana stack.
This guide builds on the Enabling observability in microservices with traces, metrics, and logs using OpenTelemetry and Grafana guide. If you are not familiar with enabling automatic telemetry collection, it will be helpful to read that guide before you proceed.
The diagram shows a distributed environment with multiple services. For simplicity, this guide configures only the system and inventory services.
The system service provides system load information, while the inventory service retrieves and stores this data by calling the system service through a MicroProfile REST Client. Both services expose endpoints built with Jakarta RESTful Web Services.
In addition, the inventory service makes periodic background requests to the system service every 15 seconds to refresh system load information for all stored systems.
Before you begin, ensure that Docker is installed and running on your system. For installation instructions, see the official Docker documentation.
Start a container from the grafana/otel-lgtm Docker image by running the following command:
docker run -d --name otel-lgtm -p 3000:3000 -p 4317:4317 -p 4318:4318 --rm -ti grafana/otel-lgtm
You can monitor the container startup by viewing its logs:
docker logs otel-lgtm
It may take a minute for the container to start. After you see the following message, your observability stack is ready:
The OpenTelemetry collector and the Grafana LGTM stack are up and running.
When the container is running, you can access the Grafana dashboard at the http://localhost:3000 URL.
The finish directory in the root of this guide contains the finished application. The finished application includes manual instrumentation that adds custom spans and custom metrics in addition to the default telemetry. Give it a try before you proceed.
To try out the application, go to the finish directory and run the following Maven goal to build the system service and deploy it to Open Liberty:
On Windows:
mvnw.cmd -pl system liberty:run
On macOS or Linux:
./mvnw -pl system liberty:run
Next, open another command-line session in the finish directory and run the following command to start the inventory service:
On Windows:
mvnw.cmd -pl inventory liberty:run
On macOS or Linux:
./mvnw -pl inventory liberty:run
After you see the following message in both command-line sessions, both of your services are ready:
The defaultServer server is ready to run a smarter planet.
When both services are running, visit the http://localhost:9081/inventory/systems/localhost URL. This action triggers the inventory service to retrieve and store system load information for localhost by making a request to the system service at http://localhost:9080/system/systemLoad. The inventory service also refreshes stored systems every 15 seconds in the background. These requests are traced and included in the collected telemetry.
Open the Grafana dashboard at the http://localhost:3000 URL.
To view the traces, open the Explore view from the left menu. Select Tempo as the data source, set Query type to Search, and click the blue Run query button. Find and open the trace named GET /inventory/systems/{hostname}. You see a result similar to the following:
The trace contains five spans, four from the inventory service and one from the system service. Expand a span to view request and response timing, span attributes, and events. For example, the Inventory Manager GetSystemLoad span shows the hostname attribute with the value localhost and a Retrieved system load event.
To view the metrics, go to Drilldown → Metrics from the left menu. In the metrics view, use the A_ prefix filters control on the left menu to select inventory. At the upper right, set the time range to Last 5 minutes and set Auto refresh interval to 10s to watch updates. You see a result similar to the following:
After you’re finished reviewing the application, stop the Open Liberty instances by pressing CTRL+C in the command-line sessions where you ran the system and inventory services. Alternatively, you can run the following goals from the finish directory in another command-line session:
On Windows:
mvnw.cmd -pl system liberty:stop
mvnw.cmd -pl inventory liberty:stop
On macOS or Linux:
./mvnw -pl system liberty:stop
./mvnw -pl inventory liberty:stop
When OpenTelemetry is enabled in Open Liberty, automatic instrumentation applies only to Jakarta RESTful Web Services (JAX-RS) servers and clients, and MicroProfile REST Clients. To capture more details, such as internal logic or interactions with external systems like databases, you can manually instrument your application code to enhance observability.
Navigate to the start directory to begin.
The mpTelemetry feature, which enables MicroProfile Telemetry support in Open Liberty, is already enabled in the server.xml file of both the system and inventory services. The OpenTelemetry SDK is also enabled by setting the otel.sdk.disabled property to false in the bootstrap.properties file of each service, allowing telemetry data to be collected from each.
Start the services to begin collecting telemetry data.
When you run Open Liberty in dev mode, dev mode listens for file changes and automatically recompiles and deploys your updates whenever you save a new change. Run the following command to start the system service in dev mode:
On Windows:
mvnw.cmd -pl system liberty:dev
On macOS or Linux:
./mvnw -pl system liberty:dev
Open another command-line session and run the following command to start the inventory service in dev mode:
On Windows:
mvnw.cmd -pl inventory liberty:dev
On macOS or Linux:
./mvnw -pl inventory liberty:dev
When you see the following message, your Liberty instances are ready in dev mode:
**************************************************************
* Liberty is running in dev mode.
Dev mode holds your command-line session to listen for file changes. Open another command-line session to continue, or open the project in your editor.
You can trace your Jakarta CDI beans by annotating a method with the @WithSpan annotation.
Replace the InventoryManager class.
inventory/src/main/java/io/openliberty/guides/inventory/InventoryManager.java
The list(), getSystemLoad(), and set() methods are annotated with the @WithSpan annotation. In this example, the default span name assigned to the list() method is automatically generated through the instrumentation. You can also provide a custom span name, such as Inventory Manager GetSystemLoad for the getSystemLoad() method, and Inventory Manager Set for the set() method. Each time one of these methods is called, the annotation creates a span and establishes the appropriate relationships with the current trace context.
Optionally, you can include parameters and their values in the span by using the @SpanAttribute annotation. For example, the @SpanAttribute annotations on the host parameter capture its value as the hostname attribute in the spans generated by the getSystemLoad() and set() methods.
To learn more about how to use OpenTelemetry annotations to instrument code, see the OpenTelemetry Annotations documentation.
You can now view the traces that are generated by the @WithSpan annotation.
Visit the http://localhost:9081/inventory/systems URL to view the inventory, then open the Grafana dashboard at the http://localhost:3000 URL. In the Explore view, select Tempo as the data source, set Query type to Search, and click the blue Run query button. Find and click the trace ID for the request named GET /inventory/systems. You see the following result:
Verify that there are two spans from the inventory service. You see the InventoryManager.list span, which is created by the @WithSpan annotation.
To check out the information generated by the @SpanAttribute annotation, visit the http://localhost:9081/inventory/systems/localhost URL to fetch and store the localhost system information in inventory. Rerun your Grafana query, then find and click the trace ID for the request named GET /inventory/systems/{hostname}. You see the following result:
Verify that there are four spans from the inventory service and one from the system service. Expand the Inventory Manager GetSystemLoad and the Inventory Manager Set spans and check their Span attributes field. You see the hostname attribute with the value localhost in each span, created by the @SpanAttribute annotation.
Open Liberty provides access to the underlying OpenTelemetry Tracer instance through the MicroProfile Telemetry feature, allowing you to manually create spans in your application. You can inject the Tracer into any class to record custom spans.
When you explore the application traces in Grafana, you can see traces named Inventory Manager GetSystemLoad and Inventory Manager Set continuously generated by the inventory service. These traces originate from the SystemLoadRefreshScheduler.refreshSystemLoads() method, which runs every 15 seconds to fetch load for all stored systems and update the inventory by calling getSystemLoad() and set() in the InventoryManager class.
Because these methods are annotated for tracing, each call creates a span. If at least one system exists in the inventory, such as the localhost system that is added, the scheduler repeatedly invokes these methods, generating spans on every cycle.
To make these refresh operations easier to identify, use the Tracer API to create a custom root span for the refreshSystemLoads() method.
Replace the SystemLoadRefreshScheduler class.
inventory/src/main/java/io/openliberty/guides/inventory/SystemLoadRefreshScheduler.java
The OpenTelemetry Tracer bean is injected into the SystemLoadRefreshScheduler class to manually instrument code for trace collection. The refreshSystemLoads() method creates a custom span named RefreshingSystemLoads using the spanBuilder() and startSpan() methods to track each periodic update of system load data.
After the span is created, the makeCurrent() method sets it as the current span. The current span serves as the parent for any new spans that are created in the same thread, whether automatically by Open Liberty or manually through the API. The makeCurrent() method returns a Scope object that must be closed to restore the previous context. The code uses a try-with-resources block so that the Scope is closed automatically at the end of the block.
The OpenTelemetry Span API is used to record runtime information within the active span. The setAttribute() method adds key-value pairs that persist for the duration of the span, such as the systems.total attribute, which stores the total number of systems in the current inventory, and the systems.refreshed attribute, which records the number of systems that were successfully refreshed. The addEvent() method records events within the span. In this example, the span records one event per refresh cycle: either a message that no systems were found to refresh, or a message that reports how many systems were successfully refreshed.
Each span must be properly ended by calling end() on the span. If the span is not ended, it will not be recorded or displayed in the trace. The method ensures that end() is always called by placing it inside a finally block.
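The lifecycle described above (start a span, make it current, close the Scope, always end the span) can be modeled with the Java standard library alone. The following is a minimal sketch and not the OpenTelemetry implementation: the SpanLifecycleSketch class, its ToySpan type, and the thread-local holder are hypothetical names invented for illustration. It shows only why the Scope is closed by try-with-resources and why end() belongs in a finally block.

```java
import java.util.ArrayList;
import java.util.List;

public class SpanLifecycleSketch {
    // Hypothetical stand-in for a span; this is NOT the OpenTelemetry Span interface.
    public static final class ToySpan {
        public final String name;
        public final ToySpan parent;
        public ToySpan(String name, ToySpan parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    // Hypothetical stand-in for a scope; close() restores the previous current span.
    public interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    // The span that is "current" on this thread.
    private static final ThreadLocal<ToySpan> CURRENT = new ThreadLocal<>();
    // In this model, a span is recorded only when it is ended.
    public static final List<String> EXPORTED = new ArrayList<>();

    // A new span takes whatever span is current as its parent.
    public static ToySpan startSpan(String name) {
        return new ToySpan(name, CURRENT.get());
    }

    public static Scope makeCurrent(ToySpan span) {
        ToySpan previous = CURRENT.get();
        CURRENT.set(span);
        return () -> CURRENT.set(previous);
    }

    public static void end(ToySpan span) {
        String parent = span.parent == null ? "none" : span.parent.name;
        EXPORTED.add(span.name + " (parent=" + parent + ")");
    }

    public static String currentName() {
        ToySpan span = CURRENT.get();
        return span == null ? "none" : span.name;
    }

    // Mirrors the shape of a refresh cycle: a root span with one child span.
    public static void refresh(boolean fail) {
        ToySpan root = startSpan("RefreshingSystemLoads");
        try (Scope ignored = makeCurrent(root)) {
            ToySpan child = startSpan("Inventory Manager GetSystemLoad");
            try {
                if (fail) {
                    throw new IllegalStateException("system service unavailable");
                }
            } finally {
                end(child); // ended even when the call fails
            }
        } finally {
            end(root); // runs after the scope is closed, on success or failure
        }
    }

    public static void main(String[] args) {
        refresh(false);
        try {
            refresh(true);
        } catch (IllegalStateException e) {
            // Expected: both spans were still ended before the exception propagated.
        }
        EXPORTED.forEach(System.out::println);
        System.out.println("current span after refresh: " + currentName());
    }
}
```

In this model, the child span picks up the root span as its parent only because the root was made current first, and the previous context is restored when the try-with-resources block exits, even when an exception is thrown.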
Because you updated the inventory service, its data is cleared.
To view the trace data from the RefreshingSystemLoads span, open the Grafana dashboard at the http://localhost:3000 URL. In the Explore view, select the Tempo data source and click the blue Run query button. Select a trace named RefreshingSystemLoads. You should see one span. Expand it and check the Span attributes field. Both the systems.total and systems.refreshed attributes show 0. In the Events field, you see No systems found to refresh.
Next, visit the http://localhost:9081/inventory/systems/localhost URL to retrieve and add the localhost system to the inventory. Wait for about 15 seconds, then return to the Grafana dashboard and rerun the query. Select the latest RefreshingSystemLoads trace, which appears after a GET /inventory/systems/{hostname} trace. You now see five spans, four from the inventory service and one from the system service. Expand the RefreshingSystemLoads span and check the Span attributes and Events fields. You see that the systems.total and systems.refreshed attributes both have a value of 1, and the Events field shows Refreshed system load for 1 hosts.
To simulate a case where the localhost system in the inventory is unavailable, stop the Liberty instance for the system microservice by pressing CTRL+C in its dev mode console. Wait for 15 seconds and rerun the Grafana query. Select the latest trace named RefreshingSystemLoads. This time, you see two spans. Expand the RefreshingSystemLoads span and check the Span attributes and Events fields. The systems.total attribute has a value of 1, but systems.refreshed shows 0, and the Events field shows Refreshed system load for 0 hosts.
You can add more context to an existing span to make your traces easier to interpret.
When viewing the RefreshingSystemLoads trace where the localhost system is down, you see only two spans in the trace. In contrast, when the system is running, the trace includes five spans. These include spans from both the inventory and system services for retrieving system load data, and an Inventory Manager Set span for updating the inventory.
Rerun the same Grafana query in the Explore view, then select a trace named RefreshingSystemLoads. Expand the Inventory Manager GetSystemLoad span. Notice that the trace does not indicate whether the request to the system service succeeded or failed.
You can add context to a span by using the Span.current() method to retrieve the active span and record events or attributes.
Replace the InventoryManager class.
inventory/src/main/java/io/openliberty/guides/inventory/InventoryManager.java
The Span.current() method is used to retrieve the current active span so that additional data can be added to it. The addEvent() method adds events to the Inventory Manager GetSystemLoad span when system load data is received or when it fails to get the system load from the system service. The setAttribute() method adds attributes to the Inventory Manager Set span to record when a system is first added to the inventory or when its information is updated.
You can now view the new traces with the added span details. Restart the dev mode for the system microservice:
On Windows:
mvnw.cmd -pl system liberty:dev
On macOS or Linux:
./mvnw -pl system liberty:dev
When the system service is running again, visit the http://localhost:9081/inventory/systems/localhost URL. Open the Grafana dashboard and rerun the same query in the Explore view. Find and open the trace for GET /inventory/systems/{hostname}. You see the following result:
Expand the Inventory Manager GetSystemLoad span to see the Retrieved system load event. Expand the Inventory Manager Set span to see the operation attribute set to add.
If you revisit the http://localhost:9081/inventory/systems/localhost URL, the Inventory Manager Set span from the GET /inventory/systems/{hostname} trace now shows the operation attribute as update.
Next, visit the http://localhost:9081/inventory/systems/unknown URL to simulate a request for a nonexistent host. Rerun your Grafana query and open the latest trace for GET /inventory/systems/{hostname}. In the Inventory Manager GetSystemLoad span, you now see the Cannot get system load event. Because the request failed, the set() method is not called to add or update any system in the inventory.
For more information about OpenTelemetry distributed tracing, see the OpenTelemetry distributed tracing API documentation.
When you enable the MicroProfile Telemetry feature, Open Liberty automatically collects and exports a default set of metrics. For the full list, see the MicroProfile Telemetry metrics reference list. In addition, you can create custom metrics in your application by using the OpenTelemetry Metrics API.
To define custom metrics, instrument your application code manually.
Replace the InventoryManager class.
inventory/src/main/java/io/openliberty/guides/inventory/InventoryManager.java
The OpenTelemetry Meter bean is injected into the InventoryManager class. This bean provides the Metrics API to create instruments such as gauges, counters, and histograms.
The gaugeBuilder() call defines a gauge named inventory.size that reports the current number of systems in the inventory. The buildWithCallback() call supplies a callback so the gauge always reflects the latest size.
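The reason a callback keeps the gauge current can be shown with a small standard-library model. This is a sketch under stated assumptions, not the OpenTelemetry Metrics API: the CallbackGaugeSketch class, the systems map, and the collect() method are hypothetical stand-ins, with collect() playing the role of the periodic metric collection.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

public class CallbackGaugeSketch {
    // Hypothetical inventory: hostname mapped to its last known system load.
    public static final Map<String, Double> systems = new ConcurrentHashMap<>();

    // The callback is registered once; it does not store a value itself.
    public static final LongSupplier inventorySize = systems::size;

    // Stand-in for one collection cycle: the callback is invoked at read time.
    public static long collect() {
        return inventorySize.getAsLong();
    }

    public static void main(String[] args) {
        System.out.println("before: " + collect());
        systems.put("localhost", 0.42);
        systems.put("127.0.0.1", 0.57);
        System.out.println("after: " + collect());
    }
}
```

Because the value is read when the collection runs instead of being pushed on every change, the code that modifies the inventory does not need to update the gauge.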
The counterBuilder() calls define two counters, inventory.system.load.successes and inventory.system.load.errors. In getSystemLoad(), systemLoadSuccessCounter increments after a successful response, and systemLoadErrorCounter increments when the request fails. These counters track how many system load retrievals succeeded or failed.
The histogramBuilder() call defines a histogram named inventory.system.load.duration that measures retrieval latency in seconds. A startTime is captured before calling the system service, and the duration is recorded in the histogram after the response is received.
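The counting and timing pattern described above can be sketched with the standard library alone. This is an illustrative model, not the guide's InventoryManager code: the LoadMetricsSketch class and its timedCall() method are hypothetical, and a LongAdder and a list of doubles stand in for the OpenTelemetry counter and histogram instruments. The text does not say whether a duration is also recorded for failed calls, so this sketch assumes that only successful calls are timed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.LongAdder;

public class LoadMetricsSketch {
    // Stand-ins for the success counter, error counter, and duration histogram.
    public static final LongAdder successes = new LongAdder();
    public static final LongAdder errors = new LongAdder();
    public static final List<Double> durationsSeconds = new ArrayList<>();

    // Runs the call, counts the outcome, and records latency in seconds on success.
    public static <T> T timedCall(Callable<T> call) {
        long startTime = System.nanoTime(); // monotonic clock, suited to measuring elapsed time
        try {
            T result = call.call();
            double seconds = (System.nanoTime() - startTime) / 1_000_000_000.0;
            durationsSeconds.add(seconds);
            successes.increment();
            return result;
        } catch (Exception e) {
            errors.increment();
            return null; // assumption: the caller treats null as "no load available"
        }
    }

    public static void main(String[] args) {
        timedCall(() -> {
            Thread.sleep(20); // simulated request to the system service
            return 0.42;
        });
        timedCall(() -> {
            throw new IllegalStateException("host unreachable");
        });
        System.out.println("successes=" + successes.sum()
            + " errors=" + errors.sum()
            + " durations(s)=" + durationsSeconds);
    }
}
```

Measuring with System.nanoTime() and converting to a floating-point number of seconds keeps sub-second precision, which matters because a typical retrieval is much shorter than one second.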
The metric names follow OpenTelemetry’s general naming guidelines to support compatibility across observability tools.
For more information about OpenTelemetry metrics, see the OpenTelemetry metrics API documentation.
You can now view the custom metrics.
Visit the http://localhost:9081/inventory/systems/localhost URL to add your local system to the inventory. Then, open Grafana at the http://localhost:3000 URL and go to Drilldown → Metrics from the left menu. In the metrics view, use the A_ prefix filters control on the left menu to select inventory. At the upper right, set the time range to Last 5 minutes and set Auto refresh interval to 10s to watch updates.
OpenTelemetry metrics are exported to the collector every 60 seconds by default, so it may take up to a minute for new metrics to appear. If you don’t see them right away, wait for the next update cycle.
You see a result similar to the following:
Note the difference between the metric names that are defined in code and the names that appear in Grafana. For details on how OpenTelemetry metrics are transformed when exported to Prometheus, see the OTLP metric points to Prometheus section of the specification.
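As a rough illustration of that difference, the following sketch applies three of the transformations described in the specification: characters that are not valid in a Prometheus metric name become underscores, the unit is appended as a suffix, and monotonic counters gain a _total suffix. It is a simplification under stated assumptions: the PrometheusNameSketch class and its method are hypothetical, only the s unit is expanded, the counters are assumed to be created without a unit, and the exact names that you see can vary with the collector version and configuration.

```java
public class PrometheusNameSketch {
    // Simplified mapping from an OpenTelemetry metric name to a Prometheus-style name.
    public static String toPrometheusName(String otelName, String unit, boolean monotonicCounter) {
        // Replace characters that are not allowed in Prometheus metric names.
        String name = otelName.replaceAll("[^a-zA-Z0-9_:]", "_");
        // Only the "s" unit is handled here; the specification covers many more.
        if ("s".equals(unit) && !name.endsWith("_seconds")) {
            name += "_seconds";
        }
        // Monotonic counters are conventionally suffixed with _total.
        if (monotonicCounter && !name.endsWith("_total")) {
            name += "_total";
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(toPrometheusName("inventory.size", "", false));
        System.out.println(toPrometheusName("inventory.system.load.successes", "", true));
        System.out.println(toPrometheusName("inventory.system.load.duration", "s", false));
    }
}
```

Under these assumptions, a name such as inventory.system.load.duration maps to inventory_system_load_duration_seconds, which is why the inventory prefix filter in Grafana still matches the custom metrics.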
To generate more data for the custom metrics, you can add another host by visiting the http://localhost:9081/inventory/systems/127.0.0.1 URL. To simulate a nonexistent host, visit the http://localhost:9081/inventory/systems/unknown URL or any host that is not running on your machine, which increments the error counter. You can also stop and restart the system service to simulate an unreachable system in the inventory.
Because the refresh scheduler runs every 15 seconds, wait a few minutes and refresh the dashboard to observe the metrics change in real time.
Manually verify the telemetry signals by inspecting them in the Grafana dashboard. You can also run the included tests to check the basic functionality of the services. If any of the tests fail, you might have introduced a bug into the code.
Because you started Open Liberty in dev mode, you can run the tests for the system and inventory services by pressing the enter/return key from the command-line sessions where you started the services.
If the tests pass, you see an output for each service similar to the following:
------------------------------------------------------- T E S T S ------------------------------------------------------- Running it.io.openliberty.guides.system.SystemEndpointIT Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.445 s -- in it.io.openliberty.guides.system.SystemEndpointIT Results: Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
------------------------------------------------------- T E S T S ------------------------------------------------------- Running it.io.openliberty.guides.inventory.InventoryEndpointIT ... Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.855 s -- in it.io.openliberty.guides.inventory.InventoryEndpointIT Results: Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
When you are done checking out the services, exit dev mode by pressing CTRL+C in the shell sessions where you ran the system and inventory services.
Finally, run the following command to stop the container that you started from the grafana/otel-lgtm image in the Additional prerequisites section.
docker stop otel-lgtm
You just used MicroProfile Telemetry in Open Liberty to extend automatic instrumentation with custom tracing spans and custom metrics, and you visualized the telemetry in Grafana.
Try out one of the related MicroProfile guides. These guides demonstrate more technologies that you can learn to expand on what you built in this guide.







