Replies: 2 comments
-
What are your thoughts? @open-telemetry/cpp-approvers @open-telemetry/cpp-maintainers
-
Finally had some time to play around with HTTP (both JSON and binary/protobuf) as well as gRPC. Just as you say @owent, going from JSON to binary makes a huge difference, as Nlohmann JSON serialization is slow. But what surprises me is that gRPC is 2x-3x slower than HTTP with protobuf, even when built with ENABLE_ASYNC_EXPORT. And even setting gRPC aside, HTTP with protobuf is not fast enough either. I still think the Exporter should work with a background thread (or multiple).
I have looked into writing my own Exporter, using Boost.Beast for the HTTP requests and running the Boost.Asio IO context in a background thread, with a similar approach for gRPC via asio-grpc. But when looking into the Exporter SDK I noticed there are three different base classes (SpanExporter, LogRecordExporter & PushMetricExporter) that are very similar. Any reason for separating all Exporters like this? Shouldn't the Exporters be tied more to the transport mechanism and be more or less transparent to what they are actually exporting/transporting? Maybe this should be another discussion thread.
-
I have been testing the traces, metrics & logs HTTP exporters. I started out testing the trace Exporter over HTTP, first without ENABLE_ASYNC_EXPORT turned on. That was a no-go for our heavily asynchronous application (powered by Boost.Asio). Enabling ENABLE_ASYNC_EXPORT did part of the trick, but it was still quite slow and heavy on the CPU side.
I recently started testing the HTTP exporter for logging. I have been using Boost.Log for a long time and it is decent when it comes to performance; it uses different sinks running as backends (background threads).
So, I started out with ENABLE_ASYNC_EXPORT turned on from the start. First I compared the OStreamLogRecordExporter with the OtlpHttpLogRecordExporter. With Curl running asynchronously there shouldn't really be much difference, right? Well, there is: a factor of 7-8x slower. I started looking into the code; Nlohmann JSON is known for slow serialization, and switching to binary/protobuf brought some improvement. But functions like createSession, addSession, ... in OtlpHttpClient are still quite heavy and will affect the "main thread".
I then tried the same approach as Boost.Log: do as much as possible in a background thread. Directly in OtlpHttpLogRecordExporter::Export, as soon as you have the "proto::collector::logs::v1::ExportLogsServiceRequest service_request", dispatch it into a thread-safe queue to a background thread and let that thread handle the rest.
Sure, a further performance boost could be achieved by switching Curl to something faster. But this is a rather easy change to the current code, and since OtlpHttpLogRecordExporter::Export already returns "return opentelemetry::sdk::common::ExportResult::kSuccess;" directly, it shouldn't change the behaviour.
Comments on my thoughts?