Provide access to resource usage for processes and nodes
Operating systems typically maintain a running measure of resource
utilization by active processes. This includes metrics on CPU
utilization, disk accesses, memory size, and network activity.
Define a set of attributes by which these metrics can be
requested and returned.
Attributes are used as a means of providing for later extension
to include a broader range of metrics.
Signed-off-by: Ralph Castain <[email protected]>
Monitor the resources specified in the provided \refstruct{pmix_data_array_t}. Resource types may
include any of the following:

\begin{itemize}
\item\refattr{PMIX_MONITOR_RESOURCE_RATE}. If not provided, then the request will be treated as a one-shot
sampling of resource usage.
\item\refattr{PMIX_PROC_RESOURCE_USAGE}. If the \refstruct{pmix_data_array_t} is empty, then
all process resource usage values shall be returned for all processes in the session.
Optionally, the array of \refstruct{pmix_info_t} can specify the processes to be monitored, and/or the particular attributes to be included. Note that the values in the provided structures will be
ignored (i.e., only the attribute keys are relevant) except where noted, and that the
\refattr{PMIX_PROC_SAMPLE_TIME} will always be included in the returned data (there is no
need to include it in the request). Optional attributes include:
\begin{itemize}
\item\refattr{PMIX_PROCID}. Optionally specify the process to be monitored. Can include a
\refconst{PMIX_RANK_WILDCARD} to indicate that all processes
from a given namespace are to be included. If omitted, then
all processes in the session will be monitored. May be included
multiple times to fully specify all processes to be included.
\item\refattr{PMIX_HOSTNAME}. Include the hostname where the process is located.
\item\refattr{PMIX_PROC_PID}. Optionally specify the process to be monitored.
\item\refattr{PMIX_PROC_OS_STATE}
\item\refattr{PMIX_PROC_TIME}
\item\refattr{PMIX_PROC_PERCENT_CPU}
\item\refattr{PMIX_PROC_PRIORITY}
\item\refattr{PMIX_PROC_NUM_THREADS}
\item\refattr{PMIX_PROC_PSS}
\item\refattr{PMIX_PROC_VSIZE}
\item\refattr{PMIX_PROC_RSS}
\item\refattr{PMIX_PROC_PEAK_VSIZE}
\item\refattr{PMIX_PROC_CPU}
\item\refattr{PMIX_PROC_SAMPLE_TIME}
\end{itemize}
\item\refattr{PMIX_NODE_RESOURCE_USAGE}. If the \refstruct{pmix_data_array_t} is empty, then
all node resource usage values shall be returned for all nodes in the session.
Optionally, the array of \refstruct{pmix_info_t} can specify the nodes to be monitored (using the \refattr{PMIX_HOSTNAME} or \refattr{PMIX_NODEID} attributes), and/or the particular attributes to be included. Note that the values in the provided structures will be
ignored (i.e., only the attribute keys are relevant) except where noted, and that the
\refattr{PMIX_NODE_SAMPLE_TIME} will always be included in the returned data (there is no
need to include it in the request). Optional
attributes include:
\begin{itemize}
\item\refattr{PMIX_HOSTNAME}. Optionally specify the node to be monitored. May be included multiple
times to fully specify all nodes to be included. Only
hostname or node ID need be included (not both). If omitted, then all nodes in the session
shall be monitored.
\item\refattr{PMIX_NODEID}. Optionally specify the node to be monitored. May be included multiple
times to fully specify all nodes to be included. Only
hostname or node ID need be included (not both). If omitted, then all nodes in the session
shall be monitored.
\item\refattr{PMIX_NODE_LOAD_AVG}
\item\refattr{PMIX_NODE_LOAD_AVG5}
\item\refattr{PMIX_NODE_LOAD_AVG15}
\item\refattr{PMIX_NODE_MEM_TOTAL}
\item\refattr{PMIX_NODE_MEM_FREE}
\item\refattr{PMIX_NODE_MEM_BUFFERS}
\item\refattr{PMIX_NODE_MEM_CACHED}
\item\refattr{PMIX_NODE_MEM_SWAP_CACHED}
\item\refattr{PMIX_NODE_MEM_SWAP_TOTAL}
\item\refattr{PMIX_NODE_MEM_SWAP_FREE}
\item\refattr{PMIX_NODE_MEM_MAPPED}
\item\refattr{PMIX_DISK_RESOURCE_USAGE}. If the \refstruct{pmix_data_array_t} is empty, then
all disk resource usage values shall be returned for all disks attached to the node.
Optionally, the array of \refstruct{pmix_info_t} can specify the disks to be monitored (using the \refattr{PMIX_DISK_ID} attribute), and/or the particular attributes to be included. Note that the values in the provided structures will be
ignored (i.e., only the attribute keys are relevant) except where noted, and that the
\refattr{PMIX_DISK_SAMPLE_TIME} will always be included in the returned data (there is no
need to include it in the request). Optional
attributes include:
\begin{itemize}
\item\refattr{PMIX_DISK_ID}. Optionally specify the disk to be monitored. If omitted, then all disks
attached to the node will be monitored.
\item\refattr{PMIX_DISK_READ_COMPLETED}
\item\refattr{PMIX_DISK_READ_MERGED}
\item\refattr{PMIX_DISK_READ_SECTORS}
\item\refattr{PMIX_DISK_READ_MILLISEC}
\item\refattr{PMIX_DISK_WRITE_COMPLETED}
\item\refattr{PMIX_DISK_WRITE_MERGED}
\item\refattr{PMIX_DISK_WRITE_SECTORS}
\item\refattr{PMIX_DISK_WRITE_MILLISEC}
\item\refattr{PMIX_DISK_IO_IN_PROGRESS}
\item\refattr{PMIX_DISK_IO_MILLISEC}
\item\refattr{PMIX_DISK_IO_WEIGHTED}
\end{itemize}
\item\refattr{PMIX_NETWORK_RESOURCE_USAGE}. If the \refstruct{pmix_data_array_t} is empty, then
all network resource usage values shall be returned for all interfaces on the node.
Optionally, the array of \refstruct{pmix_info_t} can specify the networks to be monitored (using the \refattr{PMIX_NETWORK_ID} attribute), and/or the particular attributes to be included. Note that the values in the provided structures will be
ignored (i.e., only the attribute keys are relevant) except where noted, and that the
\refattr{PMIX_NET_SAMPLE_TIME} will always be included in the returned data (there is no
need to include it in the request). Optional
attributes include:
\begin{itemize}
\item\refattr{PMIX_NETWORK_ID}. Optionally specify the interface to be monitored. If omitted, then all
interfaces on the node will be monitored.
\item\refattr{PMIX_NET_RECVD_BYTES}
\item\refattr{PMIX_NET_RECVD_PCKTS}
\item\refattr{PMIX_NET_RECVD_ERRS}
\item\refattr{PMIX_NET_SENT_BYTES}
\item\refattr{PMIX_NET_SENT_PCKTS}
\item\refattr{PMIX_NET_SENT_ERRS}
\end{itemize}
\end{itemize}
\end{itemize}
}

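The rule that only attribute keys matter in a request can be sketched in C as follows. This is an illustration under stated assumptions, not PMIx code: sketch_info_t and build_proc_request are hypothetical stand-ins for pmix_info_t and the caller's own logic, and the attribute names appear as plain strings rather than the real key definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pmix_info_t. In a request only the key matters,
 * except for identifying keys such as PMIX_PROCID, whose value is used. */
typedef struct {
    const char *key;    /* attribute name, shown here as a plain string */
    const char *value;  /* NULL where the value would be ignored */
} sketch_info_t;

/* Fill `req` with a request for two metrics of every rank in `nspace`.
 * Returns the number of entries written, or 0 if `cap` is too small. */
static size_t build_proc_request(sketch_info_t *req, size_t cap,
                                 const char *nspace)
{
    assert(req != NULL && nspace != NULL);
    if (cap < 3) {
        return 0;
    }
    /* Identifying key: the value is significant. Here it holds only the
     * namespace; a real request would carry a process identifier whose
     * rank is PMIX_RANK_WILDCARD to select every rank in that namespace. */
    req[0].key = "PMIX_PROCID";
    req[0].value = nspace;
    /* Metric keys: values are ignored, so none is supplied. */
    req[1].key = "PMIX_PROC_RSS";
    req[1].value = NULL;
    req[2].key = "PMIX_PROC_PERCENT_CPU";
    req[2].value = NULL;
    /* PMIX_PROC_SAMPLE_TIME is not requested: it is always returned. */
    return 3;
}
```

Passing an empty array instead would, per the text above, request every available process metric for all processes in the session.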
%%%%%%%%%%%
\versionMarkerProvisional{6.0}
\subsection{Resource usage attributes}
\label{api:struct:attributes:resusage}

Operating systems typically maintain a running measure of resource utilization by active processes,
attached disks, and local network interfaces.
Though the precise values being tracked can vary by \ac{OS} flavor and local configuration, the following
attributes are defined to provide a means for requesting and returning the available metrics.
An array of \refstruct{pmix_info_t} describing the resource usage of the specified process, with
the first element containing the ID of the process (marked by either the \refattr{PMIX_PROCID}
or \refattr{PMIX_PROC_PID} key)
whose usage is reported in the array. The list of included information may vary across
implementations and \acp{OS}, depending upon availability and access restrictions. Except for
the process ID as the first element, ordering of information in the array is arbitrary.
}

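Because only the first element has a fixed position, a consumer of a returned usage array should look metrics up by key rather than by index, and should tolerate missing entries. The C sketch below illustrates that pattern. It is not PMIx code: sketch_metric_t and find_metric are hypothetical, attribute names are shown as plain strings, and every value is simplified to a double, whereas real values carry their own data types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one returned pmix_info_t entry. */
typedef struct {
    const char *key;   /* attribute name, shown here as a plain string */
    double value;      /* simplified: real values carry their own types */
} sketch_metric_t;

/* Look up `key` among the metrics that follow the identifying first element.
 * Returns a pointer to the entry, or NULL if the metric was not reported. */
static const sketch_metric_t *find_metric(const sketch_metric_t *arr,
                                          size_t n, const char *key)
{
    assert(key != NULL);
    for (size_t i = 1; i < n; i++) {   /* element 0 is the ID, so start at 1 */
        if (strcmp(arr[i].key, key) == 0) {
            return &arr[i];
        }
    }
    return NULL;
}
```

A NULL result is an expected outcome rather than an error, since the set of reported metrics may vary across implementations and operating systems.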
Optional information that may be included (see \href{https://www.kernel.org/doc/html/latest/filesystems/proc.html}{PROCSTATS} for a detailed description of the following fields):
\begin{itemize}
\item\refattr{PMIX_HOSTNAME}. Either the hostname or \refattr{PMIX_NODEID} may be provided.
\item\refattr{PMIX_PROC_PID}
\item\refattr{PMIX_CMD_LINE}. Typically limited solely to the argv[0] for the process
An array of \refstruct{pmix_info_t} describing the resource usage of the specified disk, with
the first element containing the string name of the disk (marked by the \refattr{PMIX_DISK_ID} key)
whose usage is reported in the array. The list of included information may vary across
implementations and \acp{OS}, depending upon availability and access restrictions. Except for
the disk ID as the first element, ordering of information in the array is arbitrary.
}

Optional information that may be included (see \href{https://www.kernel.org/doc/html/latest/admin-guide/iostats.html}{IOSTATS} for a detailed description of the following fields):
An array of \refstruct{pmix_info_t} describing the resource usage of the specified network, with
the first element containing the string name of the interface (marked by the \refattr{PMIX_NETWORK_ID} key)
whose usage is reported in the array. The list of included information may vary across
implementations and \acp{OS}, depending upon availability and access restrictions. Except for
the network ID as the first element, ordering of information in the array is arbitrary.
}

Optional information that may be included (see \href{https://www.kernel.org/doc/html/latest/networking/statistics.html}{NETSTATS} for a detailed description of the following fields):
An array of \refstruct{pmix_info_t} describing the overall resource usage on the specified node,
with the first element containing
the ID of the node (marked by the \refattr{PMIX_HOSTNAME} or \refattr{PMIX_NODEID} key) whose usage
is reported in the array. The list of included information may vary across
implementations and \acp{OS}, depending upon availability and access restrictions. Except for
the node ID as the first element, ordering of information in the array is arbitrary.
}

Optional information that may be included (see \href{https://www.kernel.org/doc/html/latest/filesystems/proc.html#kernel-data}{KERNEL} and \href{https://www.kernel.org/doc/html/latest/filesystems/proc.html#meminfo}{MEMINFO} for a detailed description of the following fields):
files which have been mmapped, such as libraries. Note that some kernel configurations might consider all pages part of a larger allocation (e.g., THP) as “mapped”, as soon as a single page is mapped. In MBytes
}
\item\refattr{PMIX_DISK_RESOURCE_USAGE}. One for each disk attached to the node.
\item\refattr{PMIX_NETWORK_RESOURCE_USAGE}. One for each network interface on the node.