RFC-0008: Tracebox Explicit Daemon Management #3540
fyi: @ribalda
Two thoughts from me:
Thanks a lot for putting this together!!! 😮 A few notes:
I don't think we need this systemd unit. What does it do? perfetto is only a cmdline client, not a service. Unless you want to use this as a "group" somehow to start/stop both? (Clarify if that's the case.)
I think "ctl" is right. What he plans to do (I think) is to NOT use systemd when doing ctl start; it's some sort of "apachectl". Realistically this will just daemonize() the two services and remember their pids somewhere. We need a way to cheaply run the daemons without having to install anything.
@sashwinbalaji I have one caveat to add to the way "ctl" works. IMHO it:
If !1 && !2, it should instead do its own daemonization and store the pids in /tmp/somewhere (or just pkill to stop?)
Ah another comment.
I think in tracebox ctl we should try to use /run/perfetto/ if
(We don't need to probe for root; we should just try to access or mkdir that dir.)
The reason is the SDK: if people want to try out the SDK, it should JustWork(TM) if they start tracebox ctl as root.
If we fail to acquire the default socket path, we should print a very explicit error message saying that we are using a fallback path and that the SDK will NOT work (which is what catches everybody by surprise).
📄 RFC Doc: 0008-tracebox-explicit-daemon-management.md
Tracebox Explicit Daemon Management
Authors: @sashwinbalaji, @primiano
Status: Draft
Problem
`tracebox` is a key tool for using Perfetto, especially on Linux and for development purposes, bundling `traced`, `traced_probes`, and the `perfetto` CLI into a single binary. However, its current "autostart" mode, where invoking `tracebox` with trace arguments automatically spawns daemons, has several structural issues that create a confusing and unreliable user experience.
- Daemon Lifecycle and SDK Integration: In its default autostart mode, `tracebox` spawns `traced` and `traced_probes` only for the duration of the command's execution. This ephemeral nature, coupled with the use of private, temporary sockets, creates significant problems:
  - Applications instrumented with the Perfetto SDK (e.g., using `track_event`) expect a persistent tracing service on standard system sockets. When an SDK-instrumented application starts, it fails to connect because no daemon is running. If `tracebox` is then used to trace, it spawns its own daemons on private sockets (e.g., PID-based abstract sockets on Linux) which the already-running application cannot discover. This results in traces missing all SDK data and is a major source of developer confusion, as highlighted in issues like #3437, #2105, and #850.
  - Other producers also require a stable, accessible tracing service and face the same discovery issues.
- User Experience: It is not obvious to users whether the required daemons are running or which sockets are in use. The lack of a persistent service model on non-Android platforms is contrary to developer expectations for system tracing tools.
- System Integration: `tracebox` lacks a straightforward way to set up a persistent, system-wide tracing service. For users on `systemd`-based distributions without a Debian package, there is no simple migration path from a temporary setup to a proper service installation. Furthermore, the current autostart behavior would conflict with or hide daemons managed by a future `apt install` of Perfetto.
- Platform Inconsistency: The daemon management behavior is inconsistent across platforms. On Linux/Android, it uses PID-based abstract domain sockets (`@traced-c-PID`) that auto-cleanup, while on macOS it creates filesystem sockets (`/tmp/traced-c-PID`) that are left stale after a crash. On Windows, the autostart mode is entirely unsupported and triggers a `PERFETTO_FATAL`, making any cross-platform workflow unreliable.
- Session Clashes: Running multiple `tracebox` instances simultaneously leads to conflicts as they compete for control of system resources like ftrace and the same socket paths. This can result in catastrophic failures and undefined behavior, as noted in #2903.
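The discovery failure described under Daemon Lifecycle and SDK Integration can be reproduced in miniature. The sketch below is not Perfetto code: a plain Unix socket bound to a PID-derived name stands in for an autostarted daemon, and a connect attempt to a fixed path stands in for an SDK client. The file names are hypothetical, modelled on the paths mentioned in this RFC, and the function names are made up for illustration.

```python
import os
import socket
import tempfile


def can_connect(path):
    """Return True if something is listening on the Unix socket at `path`."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        return True
    except OSError:
        return False
    finally:
        client.close()


def demo_discovery():
    """Listen on a PID-based socket and probe both the fixed and private paths."""
    with tempfile.TemporaryDirectory() as tmp:
        # Path an SDK-style client would try (hypothetical, fixed name).
        well_known = os.path.join(tmp, "perfetto-producer.sock")
        # Path an autostarted daemon would use (hypothetical, PID-derived name).
        private = os.path.join(tmp, f"traced-p-{os.getpid()}")

        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(private)
        server.listen(1)
        try:
            return can_connect(well_known), can_connect(private)
        finally:
            server.close()


if __name__ == "__main__":
    fixed_ok, private_ok = demo_discovery()
    print(f"fixed path reachable: {fixed_ok}, private path reachable: {private_ok}")
```

Only a client that already knows the PID-derived name can connect, which is why an application started before `tracebox` never finds the daemon.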
Decision
To provide a clearer, more robust, and less mysterious model, `tracebox`'s daemon management will become explicit. This is a breaking behavioral change.
- Explicit Daemon Control by Default: Running `tracebox` with trace configuration arguments (e.g., `tracebox -c config.pbtx`) will no longer automatically start `traced` and `traced_probes`. It will require the daemons to be already running and accessible on standard system sockets (e.g., `/run/perfetto/...` or `/tmp/perfetto/...`).
- New `tracebox ctl` Applet for Daemon Management: A new applet is introduced to manage the lifecycle of daemons for users not relying on a system package manager. It supports two distinct modes of operation:
  - `tracebox ctl start`: Starts `traced` and `traced_probes` as detached background daemons for the current user, using sockets in `/tmp/`. This is for temporary, non-system-wide use.
  - `tracebox ctl stop`: Stops the daemons started via `ctl start`.
  - `tracebox ctl status`: Reports the status of these daemons.
  - `tracebox ctl install-systemd-units [--start]`: For users on `systemd`-based systems, this command generates and installs `systemd` unit files, effectively replicating a system package installation. This is a one-time setup action for a persistent, system-wide service.
- Backward Compatibility via Flag: The current self-contained execution model is preserved for existing scripts via the `--autodaemonize=session` flag.
- Clear Guidance: If `tracebox` is invoked and daemons are not detected, it will exit with an actionable error message, pointing to `systemctl` (for system-managed installs) or `tracebox ctl start` (for user-managed daemons).
Design
`tracebox` Default Mode Changes
When `tracebox` is invoked without a specific applet name (e.g., `tracebox -c config.pbtx ...`):
- `--autodaemonize=session`: This flag preserves the current behavior. Daemons are spawned on private, temporary sockets (PID-based abstract sockets on Linux/Android, `/tmp/traced-*-PID` on macOS) and live only for the duration of the command.
- `--autodaemonize=none` (or flag omitted): This is the new default behavior.
  - `tracebox` checks a list of socket locations to find active daemons. The search order is:
    1. `PERFETTO_*_SOCK_NAME`
    2. `/run/perfetto/traced-producer.sock`
    3. `/tmp/perfetto-producer.sock`
  - If daemons are found, it proceeds with the trace.
  - If no daemons are found, it exits with an error message.
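The search order above can be sketched as a small lookup function. This is an illustration only, not the `tracebox` implementation: it covers just the producer socket, takes `PERFETTO_PRODUCER_SOCK_NAME` as the producer instance of the `PERFETTO_*_SOCK_NAME` pattern, and assumes that an environment override is tried first and the default paths still follow it. The function names and the `is_live` hook are hypothetical.

```python
import os

# Default producer socket paths, in the order listed in this RFC.
RUN_SOCKET = "/run/perfetto/traced-producer.sock"
TMP_SOCKET = "/tmp/perfetto-producer.sock"


def producer_socket_candidates(env):
    """Return candidate producer sockets, highest priority first."""
    candidates = []
    override = env.get("PERFETTO_PRODUCER_SOCK_NAME")
    if override:
        candidates.append(override)
    candidates.extend([RUN_SOCKET, TMP_SOCKET])
    return candidates


def find_producer_socket(env=os.environ, is_live=os.path.exists):
    """Return the first candidate accepted by `is_live`, or None.

    The default check only tests that the path exists; a real probe would
    also have to confirm that a daemon answers on the socket.
    """
    for path in producer_socket_candidates(env):
        if is_live(path):
            return path
    return None


if __name__ == "__main__":
    found = find_producer_socket()
    print(found if found else "no daemon socket found")
```

Passing `is_live` as a parameter keeps the ordering logic separate from the liveness check, so the two can be changed independently.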
`tracebox ctl` Applet
This applet provides two distinct daemon management workflows:
- User-Session Management:
  - `tracebox ctl start`:
    - Checks for a system-wide installation (e.g., by checking for `/lib/systemd/system/perfetto.svc`) and, if one is found, guides the user to `systemctl`.
    - Checks whether user-session daemons are already running (`/tmp/perfetto-*.sock`). If so, it reports this and exits.
    - Otherwise, starts `traced` and `traced_probes` as detached processes owned by the current user, using standard daemonization techniques (e.g., double fork).
    - Creates sockets in `/tmp/` (e.g., `/tmp/perfetto-{producer,consumer}.sock`) with permissions for the current user.
    - Stores PID files (in `/tmp/`) for management.
  - `tracebox ctl stop`: Finds the daemons via their PID files and sends a termination signal.
  - `tracebox ctl status`: Checks if the user-session daemons are responsive on the `/tmp/` sockets.
- System-Wide Service Installation:
  - `tracebox ctl install-systemd-units`: This command (requiring `sudo`) generates `traced.service`, `traced_probes.service`, and a `perfetto.svc` target and installs them into `/etc/systemd/system/`. This provides a persistent setup that survives reboots.
    - The `--start` flag will also execute `systemctl daemon-reload` and `systemctl start perfetto.svc`.
    - The command yields if a system-wide installation is already detected (e.g., in `/lib/systemd/system/`).
Interaction with Debian Package
- The Debian package uses `systemd` units to manage daemons using sockets in `/run/perfetto/`. This is the canonical method.
- The `perfetto` and `tracebox` binaries will prioritize connecting to the `/run/perfetto/` sockets.
- The `tracebox ctl start` and `tracebox ctl install-systemd-units` commands will yield to a Debian package installation to prevent conflicts.
Socket Paths Summary
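| Mode | Socket paths |
| --- | --- |
| System-wide service (`systemd`) | `/run/perfetto/{producer,consumer}.sock` |
| `tracebox ctl` | `/tmp/perfetto-{producer,consumer}.sock` |
| `--autodaemonize=session` | `/tmp/traced-{c,p}-PID` |

SDK Enhancements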
The SDK could be enhanced to use mechanisms like `inotify` on Linux to detect socket creation and trigger an immediate reconnection attempt. (WIP CL)
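As a rough illustration of the `inotify` idea, the Linux-only sketch below waits for a socket file to appear in its parent directory. It is not the SDK change referenced above (the SDK is C++); it calls `inotify_init1` and `inotify_add_watch` from libc through `ctypes`, assumes the parent directory already exists, and uses a hypothetical function name. A caller would attempt its reconnect once the function returns True.

```python
import ctypes
import os
import select
import struct
import time

IN_CREATE = 0x00000100  # from <sys/inotify.h>: an entry was created in the directory
_EVENT_HEADER = struct.Struct("iIII")  # struct inotify_event: wd, mask, cookie, len


def wait_for_socket(path, timeout_s):
    """Block until `path` exists or `timeout_s` elapses. Linux only."""
    directory, name = os.path.split(os.path.abspath(path))
    libc = ctypes.CDLL(None, use_errno=True)
    fd = libc.inotify_init1(0)
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init1 failed")
    try:
        if libc.inotify_add_watch(fd, os.fsencode(directory), IN_CREATE) < 0:
            raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
        # Check after installing the watch so a creation in between is not missed.
        if os.path.exists(path):
            return True
        deadline = time.monotonic() + timeout_s
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            readable, _, _ = select.select([fd], [], [], remaining)
            if not readable:
                return False
            buf = os.read(fd, 4096)
            offset = 0
            while offset < len(buf):
                _wd, _mask, _cookie, name_len = _EVENT_HEADER.unpack_from(buf, offset)
                offset += _EVENT_HEADER.size
                created = buf[offset:offset + name_len].rstrip(b"\0")
                offset += name_len
                if created == os.fsencode(name):
                    return True
    finally:
        os.close(fd)
```

Watching the directory rather than the socket itself is what makes this work before the daemon has started, since the socket path does not exist yet at that point.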
Alternatives Considered
problems.
cleaner.
was deemed too complex and "mysterious". Reliably managing timeouts,
handling edge cases with active tracing sessions, and debugging issues
across platforms would introduce significant maintenance overhead. The
explicit-control model is simpler and more predictable.
Open Questions
- The `install-systemd-units` option: if Debian packaging goes well, that should cover most of the cases, and mixing both can lead to confusion.
- How should `tracebox ctl` commands detect a system-wide Perfetto installation to yield control to `systemctl`?
- `ctl stop` on all supported platforms (Linux, Mac, Windows).