Commit 73795ce

Updating Executor GC heuristic recommendation for high executor GC
1 parent 79eb945 commit 73795ce

2 files changed, 18 additions and 3 deletions


app/com/linkedin/drelephant/spark/heuristics/ExecutorGcHeuristic.scala

Lines changed: 9 additions & 2 deletions
@@ -55,11 +55,13 @@ class ExecutorGcHeuristic(private val heuristicConfigurationData: HeuristicConfi
 
     //adding recommendations to the result, severityTimeA corresponds to the ascending severity calculation
     if (evaluator.severityTimeA.getValue > Severity.LOW.getValue) {
-      resultDetails = resultDetails :+ new HeuristicResultDetails("Gc ratio high", "The job is spending too much time on GC. We recommend increasing the executor memory.")
+      resultDetails = resultDetails :+ new HeuristicResultDetails("Gc ratio high",
+        "The job is spending too much time on GC. Recommended to increase the executor memory." + evaluator.parallelGcRecommendation + "Can also try reducing number of UDF calls.")
     }
     //severityTimeD corresponds to the descending severity calculation
     if (evaluator.severityTimeD.getValue > Severity.LOW.getValue) {
-      resultDetails = resultDetails :+ new HeuristicResultDetails("Gc ratio low", "The job is spending too little time in GC. Please check if you have asked for more executor memory than required.")
+      resultDetails = resultDetails :+ new HeuristicResultDetails("Gc ratio low",
+        "The job is spending too little time in GC. Please check if you have asked for more executor memory than required.")
     }
 
     val result = new HeuristicResult(
@@ -103,6 +105,11 @@ object ExecutorGcHeuristic {
       throw new Exception("No executor information available.")
     }
 
+    val sparkExecutorExtraJavaOptions = appConfigurationProperties.getOrElse("spark.executor.extraJavaOptions","")
+    val isParallelGCEnabled: Boolean = sparkExecutorExtraJavaOptions.contains("XX:+UseParallelGC")
+    val isG1GCenabled: Boolean = sparkExecutorExtraJavaOptions.contains("XX:+UseG1GC")
+    val gcRecommendation: String = if (isParallelGCEnabled || isG1GCenabled) "" else "Enable ParallelGC or G1GC using spark.executor.extraJavaOptions."
+
     lazy val appConfigurationProperties: Map[String, String] =
       data.appConfigurationProperties
     var (jvmTime, executorRunTimeTotal) = getTimeValues(executorSummaries)
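
The collector check and the message assembly in this file can be sketched in isolation as follows. This is a minimal standalone sketch, not the Dr. Elephant code: `GcRecommendationSketch` and `highGcMessage` are hypothetical names, and the application configuration is modelled as a plain `Map[String, String]`. The diff concatenates the three sentences without separators and reads `evaluator.parallelGcRecommendation`, while the evaluator hunk defines `gcRecommendation`; the sketch assumes the latter name and joins the non-empty parts with a single space.

```scala
object GcRecommendationSketch {
  // Hypothetical helper mirroring the evaluator logic in the diff: returns an
  // extra suggestion unless ParallelGC or G1GC is already requested through
  // spark.executor.extraJavaOptions.
  def gcRecommendation(appConfigurationProperties: Map[String, String]): String = {
    val extraJavaOptions =
      appConfigurationProperties.getOrElse("spark.executor.extraJavaOptions", "")
    val isParallelGcEnabled = extraJavaOptions.contains("XX:+UseParallelGC")
    val isG1GcEnabled = extraJavaOptions.contains("XX:+UseG1GC")
    if (isParallelGcEnabled || isG1GcEnabled) ""
    else "Enable ParallelGC or G1GC using spark.executor.extraJavaOptions."
  }

  // Hypothetical helper: builds the "Gc ratio high" text, dropping the
  // collector suggestion when it is empty so no stray spaces remain.
  def highGcMessage(appConfigurationProperties: Map[String, String]): String =
    Seq(
      "The job is spending too much time on GC. Recommended to increase the executor memory.",
      gcRecommendation(appConfigurationProperties),
      "Can also try reducing number of UDF calls."
    ).filter(_.nonEmpty).mkString(" ")

  def main(args: Array[String]): Unit = {
    // No collector configured: the collector suggestion is included.
    println(highGcMessage(Map.empty))
    // G1GC already requested: the collector suggestion is omitted.
    println(highGcMessage(Map("spark.executor.extraJavaOptions" -> "-XX:+UseG1GC")))
  }
}
```

Filtering empty parts before `mkString` keeps the message well spaced whether or not the collector suggestion applies.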

app/views/help/spark/helpExecutorGcHeuristic.scala.html

Lines changed: 9 additions & 1 deletion
@@ -17,4 +17,12 @@
 <p>This analysis shows how much time a job is spending in GC. To normalise the results across all jobs, the ratio of the time a job spends in Gc to the total run time of the job is calculated. </p>
 <p>A job is flagged if the ratio is too high, meaning the job spends too much time in GC.</p>
 <h3>Suggestions</h3>
-<p>We recommend increasing the executor memory.</p>
+<ul>
+  <li>We recommend increasing the executor memory.</li>
+  <li>Enabling G1GC or ParallelGC using spark.executor.extraJavaOptions could help.</li>
+  <ul>
+    <li>User can enable G1GC or ParallelGC by adding <b>-XX:+UseG1GC</b> or <b>-XX:+UseParallelGC</b> respectively to Spark configuration spark.executor.extraJavaOptions</li>
+  </ul>
+  <li>High GC can occur if the number of UDF calls made is high, especially if the UDFs are inefficient or use a lot of memory.</li>
+</ul>
+<p>For some general guideline about how to tune GC for your Spark application refer <a href="https://spark.apache.org/docs/latest/tuning.html#garbage-collection-tuning" target="_blank">here</a></p>
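
The ratio described in the help text (GC time divided by total run time, summed over executors) can be sketched as below. This is an illustration under stated assumptions rather than the heuristic's actual code: `GcRatioSketch`, the `ExecutorSummary` case class, and its `totalGCTime` and `totalDuration` fields are stand-ins introduced here, and the zero-run-time guard is an added assumption. The severity thresholds are not part of this commit, so they are left out.

```scala
object GcRatioSketch {
  // Hypothetical stand-in for per-executor metrics; both times in milliseconds.
  final case class ExecutorSummary(totalGCTime: Long, totalDuration: Long)

  // Sums GC time and run time over all executors, in the spirit of the
  // (jvmTime, executorRunTimeTotal) pair destructured in the diff.
  def getTimeValues(executors: Seq[ExecutorSummary]): (Long, Long) =
    executors.foldLeft((0L, 0L)) { case ((gcTime, runTime), executor) =>
      (gcTime + executor.totalGCTime, runTime + executor.totalDuration)
    }

  // Ratio of GC time to total run time; returns 0.0 when no run time was
  // recorded (an assumption of this sketch, to avoid dividing by zero).
  def gcRatio(executors: Seq[ExecutorSummary]): Double = {
    val (jvmTime, executorRunTimeTotal) = getTimeValues(executors)
    if (executorRunTimeTotal == 0L) 0.0
    else jvmTime.toDouble / executorRunTimeTotal
  }

  def main(args: Array[String]): Unit = {
    // 300 ms of GC over 3000 ms of total executor run time.
    val executors = Seq(ExecutorSummary(200L, 1000L), ExecutorSummary(100L, 2000L))
    println(gcRatio(executors))
  }
}
```

Normalising by total run time is what lets a short job and a long job be compared on the same scale.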
