-
Notifications
You must be signed in to change notification settings - Fork 710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for collecting CLR event through EventPipe #1291
base: main
Are you sure you want to change the base?
Conversation
chrisnas
commented
Oct 16, 2020
- Add -eventpipe option to activate the EventPipe collection
- Add -providers provider:keyword:verbosity:tags,... option to allow fine grained tuning of dotnet-trace execution
- Add -sdk-path to point to an already installed dotnet SDK (otherwise it will be install in /tmp/dotnet_sdk_tool)
- When the collection is done, a trace.nettrace and eventpipe.log files will be part of the resulting zip file that perfview is now able to leverage
785af82
to
4c4c6de
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chrisnas, @gleocadie thank you very much for your contribution! I have a few questions and comments, but this is looking quite good.
src/perfcollect/perfcollect
Outdated
@@ -642,6 +642,10 @@ usePerf=1 | |||
# Use LTTng | |||
useLTTng=1 | |||
|
|||
# Use EventPipe to collect CLR events | |||
useEventPipe=0 | |||
sdkAndToolDir="/tmp/dotnet_sdk_tool" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can we make this something like /tmp/perfcollect-dotnet-sdk
so that it's clear where it came from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
src/perfcollect/perfcollect
Outdated
@@ -1377,18 +1382,33 @@ ProcessArguments() | |||
elif [ "-nolttng" == "$arg" ] | |||
then | |||
useLTTng=0 | |||
elif [ "-eventpipe" == "$arg" ] | |||
then | |||
useEventPipe=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this - if you enable EventPipe, LTTng is disabled.
useLTTng=0 | ||
elif [ "-providers" == "$arg" ] | ||
then | ||
providers=$rawvalue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to propose that if -eventpipe
isn't specified then this writes a FatalError
that -eventpipe
must be specified. This ensures that users of LTTng don't inadvertently specify this, but get no results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good one. I will add a validation step after the arguments are parsed.
elif [ "-noperf" == "$arg" ] | ||
then | ||
usePerf=0 | ||
elif [ "-gccollectonly" == "$arg" ] | ||
then | ||
gcCollectOnly=1 | ||
usePerf=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may have mis-understood, but I think you were wanting a way to make it possible to capture -gccollectonly
with perf
enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that I would do that in a second step (PR), we are not blocked now. what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, no problem.
src/perfcollect/perfcollect
Outdated
BuildEventPipeArgs() | ||
{ | ||
if [ "$collectionPid" == "" ] | ||
then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we should have an argument validation step after the arguments are parsed to handle this and some of the other possible argument-related issues that I mentioned above. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are aligned :) I will add a validation step
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks!
src/perfcollect/perfcollect
Outdated
WriteStatus "Installing dotnet sdk in $sdkAndToolDir" | ||
ResetText | ||
RunSilent "mkdir $sdkAndToolDir" | ||
RunSilent "curl -OL https://dot.net/v1/dotnet-install.sh" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the SDK directory should be created first so that you can download dotnet-install.sh and store it in this directory as opposed to the current working directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see thank, I missed that.
src/perfcollect/perfcollect
Outdated
then | ||
FatalError "dotnet-trace tool was installed correctly." | ||
fi | ||
LogAppend 'dotnet-trace version:' `$sdkAndToolDir/dotnet trace --version` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the version information to the log unconditionally!
src/perfcollect/perfcollect
Outdated
@@ -2153,6 +2312,11 @@ ProcessArguments $@ | |||
# Ensure prerequisites are installed. | |||
EnsurePrereqsInstalled | |||
|
|||
if [ "$useEventPipe" == "1" ] && [ "$1" != "stop" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like this to follow the same pattern as we do for the other prerequisites, and make download and install of the SDK part of perfcollect install
. This ensures that any disk changes other than for the traces themselves are intentional on the part of the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brianrob one question: the .NET sdk and the tool will unconditionally be installed in /tmp/perfcollect-dotnet-sdk when using perfcollect install.
The DiscoverCommands and InitializeLog functions will take use this path but not the one provided by -sdk-path option.
I was wondering if the -sdk-path was still worth. Maybe we can remove it no. What do you think ? (I might have missed something)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You bring up a good point here. I was trying to have more flexibility here, but I feel like the more I think about it, the more complicated things get, and that we should go back to a more simple plan as you are suggesting.
Here's what I think we should do, let me know what you think:
- If there is a global dotnet SDK installed, use it. If not, install to
/tmp/perfcollect-dotnet-sdk
. - Install
dotnet-trace
using whichever SDK we have. - When collecting, discover the SDK to use based on whether or not there is a global one or one in
/tmp
.
This ensures that if someone doesn't want to install another SDK, but already has one that they can use it. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good to me.
src/perfcollect/perfcollect
Outdated
@@ -642,6 +642,10 @@ usePerf=1 | |||
# Use LTTng | |||
useLTTng=1 | |||
|
|||
# Use EventPipe to collect CLR events | |||
useEventPipe=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to propose that instead of using the term eventpipe
that we use dotnetTrace
and/or dotnet-trace
since the actual tool being used here is dotnet-trace
. This is more of a forward-looking thing, so that it's clear what collector is being used should there be multiple, or if people don't know what eventpipe
is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
src/perfcollect/perfcollect
Outdated
then | ||
EnsureDotNetTraceToolIsInstalled | ||
fi | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a regression test that uses dotnet-trace
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I missed the test folder. I will add a regression test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brianrob I tried to add a regression test (it allowed me to find stuff I forgot to add in the script: installing curl). But, to run the script with -dotnet-trace, we need an .NET app running in the container. I'm not sure it's possible to do that in the current state. Do you want me to add this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would be great.
f307984
to
e8b179f
Compare
e8b179f
to
9f100e8
Compare
Hi, I'd like to push this PR forward to be able to merge criteo-forks@747f2a8 so that we stop relying on the perfview fork internally :) I discussed with @chrisnas and @gleocadie, and it seems that some tests were missing. I also see that 1 test in the current build failed, but the build result is long gone. @brianrob could you please re-trigger the test if there's such an option? |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
Thanks @ezsilmar. Just triggered a new CI run. |
Hello! @brianrob I got some time to come back to this PR and would be glad to get a code review. I mainly fought test instability:
About the last point, something weird is happening. If I attach to the process with dotnet-trace right after the process is started, dotnet-trace hangs forever printing |
The test failing currently is OOM of |
02df7c7
to
65bd20b
Compare
65bd20b
to
d1fa55b
Compare
Hello, this PR is hanging for almost 3 years but it is still relevant in our context, and I think it'd be beneficial for the community as well. To remind what this is all about, we often use PerfView on Windows to analyze the behavior of dotnet apps running on Linux. Relevant to this PR, PerfView can understand:
In the days of net2, using This PR is a quality of life change that allows the nettrace file to be packaged in .trace.zip, alongside perf data. A nice side-effect is we can zip .nettrace file which is important for sharing long sessions. The PR also modifies perfcollect to be able to use dotnet-trace under the hood, however this part is not important for my particular usecase as we run perf and zip directly in our troubleshooting code. If modifying perfcollect is not something you'd like to support, we could just merge the change in Mentioning @brianrob as the last reviewer |