Populate both origin and hostname correctly to span attributes #3097

Open
wants to merge 2 commits into
base: main
from


shivanshuraj1333
Member

@shivanshuraj1333 shivanshuraj1333 commented Dec 12, 2024

Description

Previously, the hostname was conflated with the origin; however, they are distinct concepts, as detailed in the issue below.

Fixes #3096

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Tested end to end with a demo app, available at https://github.com/shivanshuraj1333/celery-opentelemetry-instrumentation

Below are the new attributes in spans and metrics after the change:

Celery host:

 -------------- celery@7c2c2cd6a5b5 v5.4.0 (opalescent)
--- ***** ----- 
-- ******* ---- Linux-6.10.14-linuxkit-aarch64-with 2024-12-12 13:27:37
- *** --- * --- 
- ** ---------- [config]
- ** ---------- .> app:         myproject:0xffff91429160
- ** ---------- .> transport:   amqp://guest:**@rabbitmq:5672//
- ** ---------- .> results:     rpc://
- *** --- * --- .> concurrency: 11 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** ----- 
 -------------- [queues]
                .> queue1           exchange=queue1(direct) key=queue1
                .> queue2           exchange=queue2(direct) key=queue2
                .> queue3           exchange=queue3(direct) key=queue3
                .> queue4           exchange=queue4(direct) key=queue4

Debug log from the Celery worker:

[2024-12-12 13:31:30,356: DEBUG/MainProcess] TaskPool: Apply <function fast_trace_task at 0xffff91106ca0> (args:('tasks.tasks.multiply', '9717beb8-39fa-4b03-8b6a-464d2df5472b', {'argsrepr': '[3, 4]', 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'id': '9717beb8-39fa-4b03-8b6a-464d2df5472b', 'ignore_result': False, 'kwargsrepr': '{}', 'lang': 'py', 'origin': 'gen8@b98c7aca4628', 'parent_id': None, 'replaced_task_nesting': 0, 'retries': 0, 'root_id': '9717beb8-39fa-4b03-8b6a-464d2df5472b', 'shadow': None, 'stamped_headers': None, 'stamps': {}, 'task': 'tasks.tasks.multiply', 'timelimit': [None, None], 'traceparent': '00-ef2b1df22bc3e7bdd1b1dab863760cd3-ee4a9e13ef330488-01', 'properties': {'content_type': 'application/json', 'content_encoding': 'utf-8', 'application_headers': {'argsrepr': '[3, 4]', 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'id': '9717beb8-39fa-4b03-8b6a-464d2df5472b', 'ignore_result': False, 'kwargsrepr': '{}', 'lang': 'py', 'origin': 'gen8@b98c7aca4628', 'parent_id': None, 'replaced_task_nesting': 0, 'retries': 0, 'root_id':... kwargs:{})

Span attributes:

Resource SchemaURL: 
2024-12-12T13:33:33.122120922Z Resource attributes:
2024-12-12T13:33:33.122121922Z      -> telemetry.sdk.language: Str(python)
2024-12-12T13:33:33.122122964Z      -> telemetry.sdk.name: Str(opentelemetry)
2024-12-12T13:33:33.122139380Z      -> telemetry.sdk.version: Str(1.29.0)
2024-12-12T13:33:33.122140714Z      -> service.name: Str(worker1)
2024-12-12T13:33:33.122141922Z      -> telemetry.auto.version: Str(0.50b0)
2024-12-12T13:33:33.122143172Z ScopeSpans #0
2024-12-12T13:33:33.122144297Z ScopeSpans SchemaURL: 
2024-12-12T13:33:33.122149005Z InstrumentationScope opentelemetry.instrumentation.celery 0.50b0
2024-12-12T13:33:33.122150964Z Span #0
2024-12-12T13:33:33.122151922Z     Trace ID       : 53ed5fb0e35189cd037e18c2885505e7
2024-12-12T13:33:33.122152922Z     Parent ID      : 2901ac2e8ccccb1e
2024-12-12T13:33:33.122153880Z     ID             : 7166553679d195c5
2024-12-12T13:33:33.122154839Z     Name           : run/tasks.tasks.multiply
2024-12-12T13:33:33.122155755Z     Kind           : Consumer
2024-12-12T13:33:33.122156672Z     Start time     : 2024-12-12 13:33:30.452448296 +0000 UTC
2024-12-12T13:33:33.122157755Z     End time       : 2024-12-12 13:33:32.45406488 +0000 UTC
2024-12-12T13:33:33.122158714Z     Status code    : Unset
2024-12-12T13:33:33.122159589Z     Status message : 
2024-12-12T13:33:33.122160464Z Attributes:
2024-12-12T13:33:33.122161339Z      -> celery.action: Str(run)
2024-12-12T13:33:33.122162339Z      -> celery.state: Str(SUCCESS)
2024-12-12T13:33:33.122163255Z      -> messaging.conversation_id: Str(1be3f640-b36d-4951-9a88-d2fa7758c51e)
2024-12-12T13:33:33.122164297Z      -> messaging.destination: Str(queue2)
2024-12-12T13:33:33.122165964Z      -> celery.delivery_info: Str({'exchange': '', 'routing_key': 'queue2', 'priority': 0, 'redelivered': False})
2024-12-12T13:33:33.122170005Z      -> celery.hostname: Str(celery@7c2c2cd6a5b5)
2024-12-12T13:33:33.122170964Z      -> messaging.message.id: Str(1be3f640-b36d-4951-9a88-d2fa7758c51e)
2024-12-12T13:33:33.122172047Z      -> celery.reply_to: Str(4774934f-b0d9-30c3-a69d-41830cb8848a)
2024-12-12T13:33:33.122173005Z      -> celery.origin: Str(gen8@b98c7aca4628)
2024-12-12T13:33:33.122173964Z      -> celery.task_name: Str(tasks.tasks.multiply)
2024-12-12T13:33:33.122174880Z 	{"kind": "exporter", "data_type": "traces", "name": "debug"}

Metrics attributes:

Metric #20
2024-12-12T13:34:11.430502384Z Descriptor:
2024-12-12T13:34:11.430506343Z      -> Name: flower_task_prefetch_time_seconds
2024-12-12T13:34:11.430509718Z      -> Description: The time the task spent waiting at the celery worker to be executed.
2024-12-12T13:34:11.430515051Z      -> Unit: 
2024-12-12T13:34:11.430518926Z      -> DataType: Gauge
2024-12-12T13:34:11.430522551Z NumberDataPoints #0
2024-12-12T13:34:11.430526926Z Data point attributes:
2024-12-12T13:34:11.430530884Z      -> task: Str(tasks.tasks.add)
2024-12-12T13:34:11.430534384Z      -> worker: Str(celery@7c2c2cd6a5b5)

Additional Info:

The origin header was first introduced in Celery 4.0; see the release notes: https://docs.celeryq.dev/en/latest/history/whatsnew-4.0.html#whatsnew-4-0

**Hostname**
Definition:
In Celery, the hostname is the node name of the worker instance executing the task. It takes the form name@host and can be set explicitly with the -n option when starting a worker.
Example: If a worker is started with a command like celery -A app worker -n worker1@%h, %h expands to the hostname of the machine, giving a node name such as worker1@computer. The full node name is what task.request.hostname reports, which matches the celery.hostname value (celery@7c2c2cd6a5b5) in the span attributes above[1](https://github.com/celery/celery/issues/6555)[4](https://github.com/celery/celery/issues/3234).
Usage: This information is useful for monitoring and debugging, especially in environments with multiple workers running on the same host.
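
A node name such as worker1@computer has a name part and a host part. The following is a minimal sketch of splitting the two; `split_nodename` is a hypothetical helper written for illustration only and is not part of Celery or of this PR.

```python
def split_nodename(nodename: str) -> tuple[str, str]:
    """Split 'name@host' into (name, host).

    Hypothetical helper for illustration. If there is no '@',
    the whole string is treated as the name and host is ''.
    """
    name, _, host = nodename.partition("@")
    return name, host


# Values taken from the example and the worker banner in this PR.
print(split_nodename("worker1@computer"))
print(split_nodename("celery@7c2c2cd6a5b5"))
```

Keeping the two parts together in the span attribute, as the output above does, preserves the distinction between several workers on the same machine.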

**Origin**
Definition:
The origin identifies the process that sent the task rather than the worker that runs it. It holds either the sending worker's node name or the sender's process ID (PID) and hostname.
Example: Celery 4.0 introduced a new origin header with information about the process sending the task[6](https://docs.celeryq.dev/en/latest/history/whatsnew-4.0.html)[3](https://github.com/celery/celery/blob/main/docs/history/whatsnew-4.0.rst). In the log above, the origin is gen8@b98c7aca4628, while the worker executing the task is celery@7c2c2cd6a5b5.
Usage: The origin header is valuable for understanding where a task was sent from, which can be critical for debugging issues related to task processing.
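
As a rough illustration of keeping the two values apart, the sketch below builds separate celery.hostname and celery.origin attributes from a request-like object. It is not the code in this PR: `celery_span_attributes` is a hypothetical function, and the `SimpleNamespace` stands in for a real task request, assuming it exposes `hostname` and `origin` attributes.

```python
from types import SimpleNamespace


def celery_span_attributes(request) -> dict:
    """Collect hostname and origin as separate attributes.

    Hypothetical sketch: missing or empty values are skipped so that
    one attribute never overwrites or stands in for the other.
    """
    attributes = {}
    hostname = getattr(request, "hostname", None)
    origin = getattr(request, "origin", None)
    if hostname:
        attributes["celery.hostname"] = str(hostname)
    if origin:
        attributes["celery.origin"] = str(origin)
    return attributes


# Stand-in request using the values shown in the logs in this PR.
request = SimpleNamespace(
    hostname="celery@7c2c2cd6a5b5",
    origin="gen8@b98c7aca4628",
)
print(celery_span_attributes(request))
```

Each value is written under its own key, so a span can report both the worker that ran the task and the process that sent it.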

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@xrmx
Contributor

xrmx commented Dec 12, 2024

@shivanshuraj1333 Please add a changelog entry; a test to check the changed behaviour would also be nice.

Development

Successfully merging this pull request may close these issues.

Celery.Hostname getting overridden by Celery.Origin
2 participants