Improving Hwgraph to support large rendering #97
Conversation
Force-pushed from 5786e11 to bc67c4b
On "Clarify directory names": is "scaling" easy to understand? Wouldn't something like "smp_scaling" be more appropriate? It would encompass thread, core, or NUMA domain scaling.
Typo in commit "Adding --same-rack option": studing -> studying
print -> prints
Again on --same-rack: I'm not sure I understand why the servers should be in the same rack for this graph. Why does it matter? At least for --same-chassis, we assume they share the PSUs. What do they share in --same-rack? Why can't it be any arbitrary --group-of-servers?
When plotting large graphs with crossing or nearly parallel lines, the "o" marker is too large relative to the line. As a consequence, the graph is hard to read and some parts of the lines are nearly unreadable. This commit uses the smallest marker that remains visible, increasing the readability of the graphs. Signed-off-by: Erwan Velu <[email protected]>
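As a rough illustration of the marker change described above, here is a hedged matplotlib sketch (the helper name `plot_dense_series` is hypothetical, not hwgraph's actual API): a small "." marker keeps dense, crossing lines readable where the larger "o" marker would cover them.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

def plot_dense_series(ax, x, series):
    """Plot many nearly parallel series with the smallest visible marker."""
    for label, y in series.items():
        # "." is the smallest standard matplotlib marker that stays visible;
        # "o" at default size would blot out crossing lines
        ax.plot(x, y, marker=".", markersize=3, label=label)
    ax.legend()

fig, ax = plt.subplots()
x = list(range(100))
plot_dense_series(ax, x, {"s1": list(x), "s2": [v + 1 for v in x]})
```

This is only a sketch of the idea; the real change swaps the marker style in hwgraph's existing plotting code.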
The lower-case 's' makes no sense here. Signed-off-by: Erwan Velu <[email protected]>
Force-pushed from df80e9b to 5c2785e
The current directory structure was:

- <host1>
- <host2>
- <host.n>
- individual
- scaling

This was a bit confusing to read, both in the names and the structure. This commit offers a new scheme:

- environment / {by_host | by_chassis} / {host1, host2, host.n} / {metric1, metric2, ...} / {graph1, graph2, ...}
- max_versus / {metric1, metric2, ...} / {job1, job2, ...} / {graph1, graph2, ...}
- smp_scaling / {metric1, metric2, ...} / {graph1, graph2, ...}

To match these new semantics, the --no-individual argument is renamed --no-versus.

Signed-off-by: Erwan Velu <[email protected]>
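The new layout above can be sketched with a small path builder. This is a hypothetical helper (`graph_path` is not hwgraph's real API), assuming the three top-level modes named in the commit message:

```python
from pathlib import Path

def graph_path(outdir: str, mode: str, *parts: str) -> Path:
    """Build an output path under one of the new top-level modes,
    e.g. <outdir>/smp_scaling/<metric>/<graph>."""
    # the three modes introduced by the restructuring commit
    assert mode in {"environment", "max_versus", "smp_scaling"}
    return Path(outdir).joinpath(mode, *parts)

p = graph_path("graphs", "smp_scaling", "power", "run1.png")
```

The point of the scheme is that the mode comes first, so all graphs of one kind group together regardless of host or metric.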
Some benchmarks are run with external scenarios like:

- shutting down a PSU for a period of time
- removing liquid cooling during a portion of a run
- removing a fan
- ...

When rendering the graphs, it was not yet possible to mark the specific periods of the benchmark where such external events occurred. This commit adds an --events option to hwgraph that adds visual indications on the graph that events occurred during some period of the benchmark. Each event is declared as <event_name>:<start_time>:<duration>; an example looks like the following:

uv run hwgraph graph --traces mydir/results.json:my_server_name:BMC.Server --title 'Full CPU load 10 minutes - no coolant for 90 secs' --outdir mydir/graphs --events no-coolant:0:90

In this example, a nearly transparent background box is added for the first 90 seconds (start_time=0s, duration=90s) of the benchmark, and a new entry is added to the legend: "Event no-coolant". As with the --traces option, multiple events can be defined; each new event is colored differently to keep the graph easy to read.

Signed-off-by: Erwan Velu <[email protected]>
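A hedged sketch of how the <event_name>:<start_time>:<duration> syntax could be parsed and rendered (function names `parse_event` and `draw_events` are assumptions, not hwgraph's actual code), using matplotlib's axvspan for the translucent background box:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

def parse_event(spec: str):
    """Split 'name:start:duration' into (name, start_s, duration_s)."""
    name, start, duration = spec.split(":")
    return name, float(start), float(duration)

def draw_events(ax, specs):
    """Draw one translucent box plus a legend entry per event spec."""
    colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]
    for i, spec in enumerate(specs):
        name, start, duration = parse_event(spec)
        # nearly transparent box over the event window, distinct color per event
        ax.axvspan(start, start + duration, alpha=0.15,
                   color=colors[i % len(colors)], label=f"Event {name}")
    ax.legend()

fig, ax = plt.subplots()
draw_events(ax, ["no-coolant:0:90"])
```

For the example command above, this would shade the first 90 seconds of the run and label it "Event no-coolant" in the legend.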
Signed-off-by: Erwan Velu <[email protected]>
When too many boxes are rendered at once, the horizontal labels are unreadable. If more than 10 boxes are rendered, switch the associated labels to vertical mode. Signed-off-by: Erwan Velu <[email protected]>
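The orientation rule is simple enough to sketch directly (the `label_rotation` helper and the threshold constant are illustrative, not the commit's actual code):

```python
# More than this many boxes and horizontal labels start to overlap
MAX_HORIZONTAL_BOXES = 10

def label_rotation(box_count: int) -> str:
    """Pick the tick-label orientation based on how many boxes are drawn."""
    return "vertical" if box_count > MAX_HORIZONTAL_BOXES else "horizontal"

# With matplotlib this would typically translate to something like:
#   ax.tick_params(axis="x", labelrotation=90 if box_count > 10 else 0)
```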
Force-pushed from 5c2785e to b573fd1
When performing full-rack benchmarking, it can be interesting to plot some specific graphs studying the following metrics over the full run:

- the sum of the servers' power consumption
- the sum of the CPUs' power consumption
- the sum of the differences between each server's power and its CPU power consumption

This patch also prints the percentage ratio of the CPU power, and of the other components, versus the servers' power consumption. It can be interesting to see how much of a server's power is consumed by the CPUs. This is about the same idea as --same-chassis but at the scale of a group of servers. This feature will be useful to graph the results of a given rack.

Signed-off-by: Erwan Velu <[email protected]>
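The three aggregates and the printed ratio can be sketched as follows. This is a hypothetical reduction over per-server samples (the `rack_power_summary` name and the input shape are assumptions, not hwgraph's data model):

```python
def rack_power_summary(samples):
    """samples: {server_name: {"server_w": float, "cpu_w": float}}.
    Returns (sum of server power, sum of CPU power,
             sum of server-minus-CPU power, CPU share in percent)."""
    server_total = sum(s["server_w"] for s in samples.values())
    cpu_total = sum(s["cpu_w"] for s in samples.values())
    other_total = server_total - cpu_total  # non-CPU components
    cpu_pct = 100.0 * cpu_total / server_total
    return server_total, cpu_total, other_total, cpu_pct

totals = rack_power_summary({
    "srv1": {"server_w": 500.0, "cpu_w": 300.0},
    "srv2": {"server_w": 450.0, "cpu_w": 250.0},
})
```

With the sample numbers above, the CPUs account for 550 W of a 950 W rack total, i.e. roughly 58% of server power.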
The max_versus mode is useful to compare the same run over several machines, or the same machine in various conditions. If a single trace is provided, the produced graphs are not very interesting to read. Let's disable them and print a message explaining why this feature was automatically disabled. Signed-off-by: Erwan Velu <[email protected]>
The current version of hwgraph crashes if a trace file is missing PDU or IPC metrics. This patch ignores and reports a missing metric instead of crashing with KeyError exceptions. This improves hwgraph's backward compatibility with older traces. Signed-off-by: Erwan Velu <[email protected]>
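The fix amounts to replacing direct indexing with a tolerant lookup. A minimal sketch of the pattern (the `get_metric` helper and the trace shape are hypothetical, not hwgraph's real structures):

```python
def get_metric(trace: dict, name: str, host: str):
    """Look up a metric; report it as missing instead of raising KeyError,
    so graphs for older traces without PDU or IPC data still render."""
    value = trace.get(name)  # was: trace[name], which raised KeyError
    if value is None:
        print(f"{host}: metric '{name}' is missing from the trace, skipping")
    return value

trace = {"CPU": [42.0, 43.5]}
cpu = get_metric(trace, "CPU", "srv1")   # present: returned as-is
pdu = get_metric(trace, "PDU", "srv1")   # absent: reported, returns None
```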
The current code plotted each iteration of the benchmark, which generated tons of nearly useless graphs. This commit keeps the maximum values and drops the intermediate ones. Signed-off-by: Erwan Velu <[email protected]>
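The reduction can be sketched as picking the peak iteration instead of plotting every one (the `keep_max` helper and the tuple shape are illustrative assumptions):

```python
def keep_max(iterations):
    """iterations: list of (iteration_id, value) pairs.
    Return only the iteration with the maximum value; intermediates
    would previously each have produced their own graph."""
    return max(iterations, key=lambda item: item[1])

peak = keep_max([("run1", 310.0), ("run2", 355.5), ("run3", 340.2)])
```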
time_to_next_sync() returned the number of seconds before the next meeting point, but also the "next_sync" absolute time. As next_sync is not used at all, let's remove this unused return value and simplify the function. Signed-off-by: Erwan Velu <[email protected]>
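A hedged before/after sketch of the simplification (the body below is a plausible reconstruction, not hwgraph's actual implementation):

```python
import time

def time_to_next_sync(interval: float = 10.0) -> float:
    """Return only the seconds remaining until the next sync point.
    Previously this also returned the absolute next_sync time:
        return next_sync - now, next_sync
    but no caller ever used the second value."""
    now = time.monotonic()
    next_sync = (now // interval + 1) * interval
    return next_sync - now

delay = time_to_next_sync()
```

Callers that did `delay, _ = time_to_next_sync()` simplify to `delay = time_to_next_sync()`.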
Force-pushed from d80648f to 865b1d8
Repushed, sorry, I forgot to fix the typos you reported :/
LGTM
This pull request fixes bugs in hwgraph and adds a few features to ease the rendering of large datasets.