Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating the recommendation for Executor GC heuristic in case of high executor GC #442

Open
wants to merge 1 commit into
base: tuning_20190221
Choose a base branch
from

Conversation

ShubhamGupta29
Copy link
Contributor

Description

Adding some more recommendations in ExecutorGC heuristic when high GC is witnessed in the application.

@@ -55,11 +55,13 @@ class ExecutorGcHeuristic(private val heuristicConfigurationData: HeuristicConfi

//adding recommendations to the result, severityTimeA corresponds to the ascending severity calculation
if (evaluator.severityTimeA.getValue > Severity.LOW.getValue) {
resultDetails = resultDetails :+ new HeuristicResultDetails("Gc ratio high", "The job is spending too much time on GC. We recommend increasing the executor memory.")
resultDetails = resultDetails :+ new HeuristicResultDetails("Gc ratio high",
"The job is spending too much time on GC. Recommended to increase the executor memory and also can enable ParallelGC using spark.executor.extraJavaOptions or reducing number of UDF calls.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this so quickly. For a more customized recommendation, could this check the value for "spark.executor.extraJavaOptions", to see if ParallelGC is already set? I don't think there's an easy way to check for UDFs unfortunately.

For LinkedIn, it would also be nice to include a link to the wiki: https://iwww.corp.linkedin.com/wiki/cf/display/DWH/Big+Data+Engineering+-+Advanced+Spark+SQL+Tuning+Techniques#BigDataEngineering-AdvancedSparkSQLTuningTechniques-GCtuning

Including the link would not work for open source, and there is also the danger of the link changing.

Another option is to provide the details mentioned in the wiki as part of the html explanation/recommendations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes check for spark.executor.extraJavaOptions will be good, but for adding this link I need to check if this is possible to represent it as a proper html link on the heuristic page.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with Min and Fangshi, and the current plan is to add ParallelGC as part of the cluster configuration, in which case we don't need to list this as a recommendation, at least for LinkedIn -- it could still be useful for outside users.

@@ -102,6 +104,9 @@ object ExecutorGcHeuristic {
throw new Exception("No executor information available.")
}

val isParallelGcEnabled: Boolean = appConfigurationProperties.getOrElse("spark.executor.extraJavaOptions","").contains("XX:+UseParallelGC")
val parallelGcRecommendation: String = if (isParallelGcEnabled) "" else "Enable ParallelGc using spark.executor.extraJavaOptions."
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change "ParallelGc" to "ParallelGC", in case of copy-paste.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@ShubhamGupta29 ShubhamGupta29 force-pushed the executorGCHeuristicRecommedation branch from 6c4a047 to afd36d6 Compare October 3, 2018 06:55
<p>Enabling ParallelGC using spark.executor.extraJavaOptions could help.</p>
<p>Also recommended to reduce the number of UDF calls.</p>
<p>For more help refer <a href="https://iwww.corp.linkedin.com/wiki/cf/display/DWH/Spark+SQL+Tuning+Techniques" target="_blank">here</a></p>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a LinkedIn internal page -- is this OK to add to Dr. Elephant?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other thoughts are adding the information to the html detailed recommendations, or creating a wiki on the external Dr. Elephant wiki.

@edwinalu
Copy link

Looks good, thanks!

@varunsaxena varunsaxena force-pushed the customSHSWork branch 2 times, most recently from fc7de94 to ffe4d17 Compare October 16, 2018 10:13
@ShubhamGupta29 ShubhamGupta29 changed the base branch from customSHSWork to tuning_20190221 October 18, 2019 10:24
@ShubhamGupta29
Copy link
Contributor Author

@edwinalu thanks for the review. There were some merge conflicts introduced to some efforts to merge this branch with master. I have resolved the conflicts, if feasible kindly have a look through the changes and let me know if I should merge the changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants