@@ -85,143 +85,236 @@ \subsection{Use Case Details}
Other keys are also helpful to have before a synchronization point; this list is not meant to be comprehensive.
-
\section{Hybrid Programming Models}
-
\label{app:uc-hybrid-programming-models}
+
\section{Debugging}
+
\label{app:uc-debugging}
-
\subsection{Use Case Summary}
+
This use case distills the features and extensions requested in the RFCs that relate to debugging. We have identified parts of PR23 (Co-located process launch for debuggers), RFC0010 (MPIR-like query), RFC0002 (event pub/sub), and RFC0022 (Environmental Parameter Directives for Applications and Launchers) under this category.
-
Hybrid applications (i.e., applications that utilize more than one programming model, such as an MPI application that also uses OpenMP or PGAS) are growing in popularity, especially as chips with increasingly large numbers of cores and processors proliferate. Unfortunately, the various models currently operate under the assumption that they alone control execution. This leads to conflicts in hybrid applications. Deadlock of parallel applications can occur when one model prevents the other from making progress due to lack of coordination between the multiple programming models~\cite{2016:Hamidouche}. Sub-optimal performance can also occur due to uncoordinated division of hardware resources between the programming models~\cite{2018:Vallee,ompix-moc}. This use-case offers potential solutions to the problem by providing a pathway for programming models to coordinate their actions.
+
\subsection{Terminology}
-
\subsection{Use Case Details}
+
\subsubsection{Tools vs Debuggers}
+
+
A \texttt{tool} is a process designed to monitor, record, analyze, or control the execution of another process, typically for the purposes of profiling and debugging. A \texttt{first-party tool} runs within the address space of the application process, while a \texttt{third-party tool} runs within its own process. A \texttt{debugger} is a third-party tool that inspects and controls an application process's execution using system-level debug APIs (e.g., \code{ptrace}).
+
+
\subsubsection{Parallel Launching Methods}
+
A \texttt{starter} program is a program responsible for launching a parallel runtime, such as \ac{MPI}. \ac{PMIx} supports two primary methods for launching parallel applications under tools and debuggers: indirect and direct. In the indirect launching method, the tool is attached to the starter. In the direct launching method, the tool takes the place of the starter.
+
\ac{PMIx} also supports attaching to already running programs via the \texttt{Process Acquisition} interfaces.
+
+
\subsubsection{Process Synchronization}
+
Process Synchronization is the technique tools use to start the processes of a parallel application such that the tools can still attach to each process early in its lifetime. Said another way, the tool must be able to start the application processes without them ``running away'' from the tool. In the case of \ac{MPI}, this means stopping the application processes before they return from \code{MPI_Init}.
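As a rough illustration of the stop-then-release pattern described above, the following C sketch uses plain POSIX signals rather than any PMIx interface. The helper name \code{launch_stopped_then_release} is hypothetical, and raising \code{SIGSTOP} in the child is only a stand-in (an assumption made for illustration) for a runtime that blocks inside its initialization routine. The sketch shows only that the child cannot proceed to application work until the parent, playing the role of the tool, has observed it and released it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of PMIx): start a child that stops itself
 * before doing any application work, wait until it is stopped, then release
 * it. Returns the child's exit code, or -1 on error. */
int launch_stopped_then_release(int child_exit_code)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: halt at the synchronization point. */
        raise(SIGSTOP);
        /* Application work would begin here once released. */
        _exit(child_exit_code);
    }

    int status = 0;
    /* Parent ("tool"): block until the child reports that it has stopped. */
    if (waitpid(pid, &status, WUNTRACED) < 0 || !WIFSTOPPED(status)) {
        return -1;
    }
    /* A real tool would attach here (for example via ptrace). */
    if (kill(pid, SIGCONT) < 0) {
        return -1;
    }
    /* Wait for the released child to finish. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Calling \code{launch_stopped_then_release(7)} should return 7 once the child has been stopped, released, and allowed to exit.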
\subsubsection{Identifying Active Programming Models}
+
Process Acquisition is the technique tools use to locate all of the processes, local and remote, of a given parallel application. This typically boils down to collecting, for every process in the parallel application, the hostname or IP of the machine running the process, the executable name, and the process ID.
-
The current state-of-the-practice for programming models to detect one another is via set environment variables. For example, OpenMP looks for environment variables to indicate that MPI is active. Unfortunately, this technique is not completely reliable as environment variables change over time and with new software versions. Also, the fact that an environment variable is present doesn't guarantee that a particular programming model is in active use since Resource Managers routinely set environment variables ``just in case'' the application needs them. PMIx provides a reliable mechanism by which each library can determine that another library is in operation.
+
\subsection{Use Case Details}
+
\subsubsection{Direct-Launch Debugger Tool}
-
When initializing PMIx, programming models can register themselves, including their name, version, and threading model. This information is then cached locally and can then be read asynchronously by other programming models using PMIx's Event Notification system (see next section for more details).
+
PMIx can support the tool itself using the PMIx spawn options to control the app’s startup, including directing the RM/application as to when to block and wait for tool attachment, or stipulating that an interceptor library be preloaded. However, this means that the user is restricted to whatever command line options the tool vendor has provided for operations such as process placement and binding, which places a significant burden on the tool vendor. An example might look like the following: \code{dbgr -n 3 ./myapp}.
-
This initialization mechanism also allows libraries to share knowledge of each other's resources and intended resource utilization. For example, if OpenMP knows which hardware threads MPI is using, it could potentially avoid processor and cache contention.
+
Where the environment supports it, debugger daemons can be co-launched in this use-case by adding a \code{pmix_app_t} to the \refapi{PMIx_Spawn} command and setting the \refattr{PMIX_DEBUGGER_DAEMONS} attribute to indicate that the resulting processes are debugger daemons.
The PMIx Event Notification system provides a mechanism by which the resource manager can communicate system events to applications, thus providing applications with an opportunity to generate an appropriate response. Hybrid applications can leverage these events for cross-library coordination.
+
\refconst{PMIX_DEBUG_WAITING_FOR_NOTIFY} \\
+
\refconst{PMIX_DEBUGGER_RELEASE}
-
Programming models can access the information provided by other programming models during their initialization using the event notification system. In this case, programming models should register a callback for the \refconst{PMIX_MODEL_DECLARED} event.
+
\subsubsection{Indirect-Launch Debugger Tool}
-
Programming models can also use the PMIx event notification system to communicate dynamic information, such as entering a new application phase (\refconst{PMIX_MODEL_PHASE_NAME}) or a change in resources used (\refconst{PMIX_MODEL_RESOURCES}). This dynamic information can be broadcast to other programming models using the \refapi{PMIx_Notify_event} function. Other programming models can register callback functions to run when these events occur using \refapi{PMIx_Register_event_handler}.
+
Executing a program under a tool using an intermediate launcher such as mpiexec can also be made possible. This requires some degree of coordination between the tool and the launcher. Ultimately, it is the launcher that is going to launch the application, and the tool must somehow inform it (and the application) that this is being done in a debug session so that the application knows to ``block'' until the tool attaches to it.
-
\littleheader{Code Example}
+
In this operational mode, the user invokes a tool (typically on a non-compute, or ``head'', node) that in turn uses mpiexec to launch their application. A typical command line might look like the following: \code{dbgr -dbgoption mpiexec -n 32 ./myapp}.
-
Registering a callback to run when another programming model initializes:
PMIx supports attaching to an already running parallel job in two ways. In the first way, the main process of a tool calls \refapi{PMIx_Query_info} with the \refattr{PMIX_QUERY_PROC_TABLE} attribute. This returns an array of structs containing the information required for \hyperref[subsubsec:process-acq]{process acquisition}. This includes remote hostnames, executable names, and process IDs. In the second way, every tool daemon calls \refapi{PMIx_Query_info} with the \refattr{PMIX_QUERY_LOCAL_PROC_TABLE} attribute. This returns a similar array of structs but only for processes on the same node.
+
+
An example of this use-case may look like the following: \code{mpiexec -n32~./myApp \& dbgr attach \$!}.
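The difference between the full process table and the node-local one can be sketched in C as follows. The record type \code{proc_entry} and the helper \code{select_local} are hypothetical names introduced only for illustration; they are not the structures returned by \refapi{PMIx_Query_info}. The sketch assumes a table holding the hostname, executable name, and process ID of each process, and selects the entries that run on a given host, which is roughly the subset a per-node tool daemon would need.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical record holding the three items needed per process.
 * This is NOT the PMIx structure returned by the query. */
typedef struct {
    const char *hostname;   /* node running the process */
    const char *executable; /* executable name */
    pid_t pid;              /* process ID on that node */
} proc_entry;

/* Copy the entries of `table` that run on `host` into `out`, which has room
 * for `max_out` entries. Returns the number of entries copied. */
size_t select_local(const proc_entry *table, size_t n, const char *host,
                    proc_entry *out, size_t max_out)
{
    size_t count = 0;
    for (size_t i = 0; i < n && count < max_out; i++) {
        if (strcmp(table[i].hostname, host) == 0) {
            out[count++] = table[i];
        }
    }
    return count;
}
```

A tool's main process would work with the whole table, whereas a daemon on a single node would only attach to the entries that \code{select_local} returns for its own hostname.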
Tools can benefit from a mechanism for interacting with a local PMIx server that has opted to accept such connections, along with support for tool connections to system-level PMIx servers and a logging feature. To support tool connections to a specified system-level PMIx server, environments could choose to launch a set of PMIx servers to support a given allocation. These servers will (if so instructed) provide a tool rendezvous point that is tagged with their pid and typically placed in an allocation-specific temporary directory to allow for possible multi-tenancy scenarios. Supporting such operations requires a system-level PMIx connection that is not associated with a specific user or allocation. A new key has been added to direct the PMIx server to expose a rendezvous point specifically for this purpose.
+
{\large\refapi{PMIx_Query_info_nb}}
+
\pasteSignature{PMIx_Query_info_nb}
-
\subsubsection{Coordinating at Runtime with Multiple Event Handlers}
+
{\large\refapi{PMIx_Register_event_handler}}
+
\pasteSignature{PMIx_Register_event_handler}
-
Coordinating with a threading library such as OpenMP creates the need for separate event handlers for threads of the same process. For example, in an MPI+OpenMP hybrid application, the MPI thread and the main OpenMP thread may both want to be notified anytime an OpenMP worker thread enters a parallel region. This requires support for multiple threads to potentially register different event handlers against the same status code.
+
{\large\refapi{PMIx_Deregister_event_handler}}
+
\pasteSignature{PMIx_Deregister_event_handler}
-
Multiple event handlers registered against the same event are processed in a chain-like manner based on the order in which they were registered, as modified by directive. Registrations against specific event codes are processed first, followed by registrations against multiple event codes and then any default registrations. At each point in the chain, an event handler is called by the PMIx progress thread and given a function to call when that handler has completed its operation. The handler callback notifies PMIx that the handler is done, returning a status code to indicate the result of its work. The results are appended to the array of prior results, with the returned values combined into an array within a single \code{pmix_info_t} as follows:
-
\begin{itemize}
-
\item\texttt{array[0]}: the key holds the event handler name provided at registration (possibly empty if no string name was given), and the value holds the \code{pmix_status_t} returned by the handler
-
\item\texttt{array[*]}: the array of results returned by the handler, if any.
-
\end{itemize}
+
{\large\refapi{PMIx_Notify_event}}
+
\pasteSignature{PMIx_Notify_event}
-
The current PMIx standard does not actually specify a default ordering for event handlers as they are being registered. However, it does include an inherent ordering for invocation. Specifically, PMIx stipulates that handlers be called in the following categorical order:
+
{\large\refapi{PMIx_server_init}}
+
\pasteSignature{PMIx_server_init}
-
\begin{itemize}
-
\item single status event handlers - i.e., handlers that were registered against a single specific status.
-
\item multi status event handlers - those registered against more than one specific status
-
\item default event handlers - those registered against no specific status
-
\end{itemize}
+
\littleheader{Job-specific events}
+
\code{PMIX_EVENT_JOB_LEVEL /* debugger attached, process failure */}
-
\littleheader{Code Example}
+
\littleheader{Environment events}
+
\code{PMIX_EVENT_ENVIRO_LEVEL /* ECC errors, temperature excursions */}
-
From the OpenMP master thread:
+
\littleheader{Errors detected by clients/peers}
+
\code{Network fabric manager detects data corruption}
\subsubsection{Environmental Parameter Directives for Applications and Launchers}
-
From the MPI thread:
+
It is sometimes desirable or required that standard environmental variables (e.g., \code{PATH}, \code{LD_LIBRARY_PATH}, \code{LD_PRELOAD}) be modified prior to executing an application binary or a starter such as mpiexec. This is particularly true when tools/debuggers are used to start the application. This RFC proposes the definition of a new PMIx structure (\refstruct{pmix_envar_t}) and associated attributes for specifying such operations.
Resource managers and launchers must scan for relevant directives, modifying environmental parameters as directed. Directives are to be processed in the order in which they were given, starting with job-level directives (applied to each app) followed by app-level directives.
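The ordering rule above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the PMIx definition: the \code{env_directive} type, the four operations, and the helper names are all hypothetical, and the sketch modifies the calling process's own environment with the POSIX \code{setenv} and \code{unsetenv} calls, as a launcher might do in a child process just before executing the application.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical directive kinds; the names are assumptions for illustration. */
typedef enum { ENV_SET, ENV_UNSET, ENV_PREPEND, ENV_APPEND } env_op;

/* Hypothetical directive record (not the PMIx structure). */
typedef struct {
    env_op op;
    const char *name;
    const char *value; /* ignored for ENV_UNSET */
    char separator;    /* used only by ENV_PREPEND and ENV_APPEND */
} env_directive;

/* Apply one directive to the current environment. Returns 0 on success. */
static int apply_one(const env_directive *d)
{
    const char *old = getenv(d->name);
    char buf[4096];
    int len;

    switch (d->op) {
    case ENV_SET:
        return setenv(d->name, d->value, 1);
    case ENV_UNSET:
        return unsetenv(d->name);
    case ENV_PREPEND:
    case ENV_APPEND:
        if (old == NULL || old[0] == '\0') {
            /* Nothing to join with: the new value stands alone. */
            return setenv(d->name, d->value, 1);
        }
        if (d->op == ENV_PREPEND) {
            len = snprintf(buf, sizeof buf, "%s%c%s",
                           d->value, d->separator, old);
        } else {
            len = snprintf(buf, sizeof buf, "%s%c%s",
                           old, d->separator, d->value);
        }
        if (len < 0 || (size_t)len >= sizeof buf) {
            return -1; /* joined value does not fit in the buffer */
        }
        return setenv(d->name, buf, 1);
    }
    return -1;
}

/* Apply job-level directives first, then app-level directives, each list in
 * the order given. Returns 0 on success, -1 on the first failure. */
int apply_directives(const env_directive *job, size_t njob,
                     const env_directive *app, size_t napp)
{
    for (size_t i = 0; i < njob; i++) {
        if (apply_one(&job[i]) != 0) {
            return -1;
        }
    }
    for (size_t i = 0; i < napp; i++) {
        if (apply_one(&app[i]) != 0) {
            return -1;
        }
    }
    return 0;
}
```

Because the lists are applied in sequence, an app-level directive sees the result of every job-level directive; for example, an app-level append to a path variable lands after a job-level prepend has already been applied.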