Adding autocompletion + scripts + standard deviation in 'mapper time' heuristic #275

Open · wants to merge 288 commits into master

Conversation

alexandre32

Hello,
I added some features:
Autocompletion in the search field of the 'job history' tab (populated from the DB).
Two scripts for automatically adding a new heuristic.
Calculation of the standard deviation in the 'mapper time' heuristic.
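The standard-deviation feature can be sketched as follows, assuming the heuristic receives the runtimes of all map tasks in milliseconds (the class and method names here are hypothetical, not the PR's actual code):

```java
public class MapperTimeStats {
    // Arithmetic mean of the mapper runtimes.
    public static double mean(long[] runtimes) {
        double sum = 0;
        for (long t : runtimes) {
            sum += t;
        }
        return sum / runtimes.length;
    }

    // Population standard deviation: sqrt of the mean squared distance from the mean.
    // A large value relative to the mean suggests skewed mapper runtimes.
    public static double stdDev(long[] runtimes) {
        double avg = mean(runtimes);
        double sumSq = 0;
        for (long t : runtimes) {
            sumSq += (t - avg) * (t - avg);
        }
        return Math.sqrt(sumSq / runtimes.length);
    }
}
```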

Akshay Rai and others added 30 commits December 13, 2014 10:10
…plying it to the query instead of asking the query to perform the case-insensitive comparison.
Changed HadoopJobData to include finishTime since that is needed for
metrics.
Changed the signature of getJobCounter to include jobConf and jobData
so that it can publish metrics
Updated README.md

Tested locally on my box and on spades

RB=406817
BUGS=HADOOP-7814
R=fli,mwagner
A=fli
The java file DaliMetricsAPI.java has a flavor of the APIs that we will be exposing from the dali library.
We can split these classes into individual files when we move this functionality to the dali library.

Changed start script to look for a config file that configures a publisher. If the file is present,
then dr-elephant is started with an option that has the file name. If the file is not present,
then the behavior is unchanged (i.e. no metrics are published).

If the file is parsed correctly then dr-elephant publishes metrics in HDFS (one avro file per job)
for jobs that are configured to publish the metrics.

The job needs to set something like mapreduce.job.publish-counters='org.apache.hadoop.examples.WordCount$AppCounter:*'
to publish all counters in the given group. The format is 'groupName:counterName', where counterName can be an
asterisk to indicate all counters in the group. See the class DaliMetricsAPI.CountersToPublish.
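The 'groupName:counterName' format could be parsed along these lines (a sketch only; DaliMetricsAPI.CountersToPublish is the real class, and this is not its code):

```java
public class CounterSpec {
    private final String groupName;
    private final String counterName; // "*" means every counter in the group

    // Split on the last ':' so group names containing '$' or '.' are unaffected.
    public CounterSpec(String spec) {
        int sep = spec.lastIndexOf(':');
        if (sep < 0) {
            throw new IllegalArgumentException("expected groupName:counterName, got " + spec);
        }
        this.groupName = spec.substring(0, sep);
        this.counterName = spec.substring(sep + 1);
    }

    public boolean matches(String group, String counter) {
        return groupName.equals(group)
            && ("*".equals(counterName) || counterName.equals(counter));
    }
}
```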

The HDFSPublisher is configured with a base path under which metrics are published. The date/hour hierarchy is added
to the base path.
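The date/hour hierarchy could be built along these lines (a sketch; the PR does not show the HDFSPublisher's actual layout, so the UTC yyyy/MM/dd/HH pattern here is an assumption):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class MetricsPathBuilder {
    // Appends a UTC yyyy/MM/dd/HH hierarchy (assumed layout) to the configured base path.
    public static String build(String basePath, long timestampMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy/MM/dd/HH");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return basePath + "/" + fmt.format(new Date(timestampMillis));
    }
}
```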

The XML file for configuring dr-elephant is checked in as a template. A config file needs to be added to the
'conf' path of dr-elephant (manually, as per meeting with hadoop-admin) on clusters where we want dr-elephant
to publish metrics.

RB=409443
BUGS=HADOOP-7814
R=fli,csteinba,mwagner,cbotev,ahsu
A=fli,ahsu
hadoop-1 does not have JobStatus.getFinishTime(). This causes dr-elephant to hang.

Set the start time to be the same as the finish time for h1 jobs.

For consistency, reverted to the old method of scraping the job tracker url so that we get only
start time, and set the finish time to be equal to start time for retired jobs as well.

RB=417975
BUGS=HADOOP-8640
R=fli,mwagner
A=fli
RB=417448
BUGS=HADOOP-8648
R=fli
A=fli
…increasing mapred.min.split.size for too many mappers, NOT mapred.max.split.size
…name

RB=468832
BUGS=HADOOP-10405
R=fli
A=fli,ahsu
nntnag17 and others added 28 commits January 24, 2017 16:16
Jobs that put large files (> 500 MB) in the distributed cache are flagged.
Files from the following settings are considered:
  mapreduce.job.cache.files
  mapreduce.job.cache.archives
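The flagging rule above can be sketched as follows (a hypothetical helper, not the PR's code; in the real heuristic the sizes would come from the file statuses behind mapreduce.job.cache.files and mapreduce.job.cache.archives):

```java
public class DistributedCacheChecker {
    // 500 MB threshold from the commit message above.
    static final long LIMIT_BYTES = 500L * 1024 * 1024;

    // Flags the job if any distributed-cache file or archive exceeds the limit.
    public static boolean shouldFlag(long[] fileSizesBytes) {
        for (long size : fileSizesBytes) {
            if (size > LIMIT_BYTES) {
                return true;
            }
        }
        return false;
    }
}
```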
…p2 (linkedin#203)

(1) Use ArrayList instead
(2) Add unit test for this

This commit allows Dr. Elephant to fetch Spark logs without universal
read access to eventLog.dir on HDFS. SparkFetcher would use SparkRestClient
instead of SparkLogClient if configured as

    <params>
      <use_rest_for_eventlogs>true</use_rest_for_eventlogs>
    </params>

The default behaviour is to fetch the logs via SparkLogClient/WebHDFS.
…alone fetcher (linkedin#232)

Remove the backup for the Rest Fetcher and make the legacy FSFetcher the top-level fetcher. Change the default fetcher in the config.
* Fix SparkMetricsAggregator so it does not produce negative ResourceUsage.
* We had been ignoring failed tasks when calculating resource usage. This handles that.
* Fix the Exception heuristic, which was supposed to give the stack trace.
…autocompletion search field in the 'job history' tab
Contributor

@akshayrai akshayrai left a comment

Sorry for the delayed review. I added the comments but forgot to publish them.

Can you clarify the motivation behind the scripts for automatically adding heuristics?

Also, please ensure that you avoid addressing multiple issues in one PR.

db_user=root
db_password=""
db_user=drelephant
db_password="Dr-elephant123"
Contributor

Change this back to root and "".

compile.sh Outdated

# Echo the value of pwd in the script so that it is clear what is being removed.
rm -rf ${project_root}/dist
mkdir dist

play_command $OPTS clean test compile dist
play_command $OPTS clean compile dist
Contributor

Can you add the test back?

[repositories]
local
maven-central
cloudera:https://repository.cloudera.com/cloudera/cloudera-repos/
Contributor

Are these required? Please remove them.

@@ -682,6 +687,13 @@ private static Result getJobHistory(Version version) {
boolean hasSparkJob = false;
// get the graph type
String graphType = form.get("select-graph-type");

SqlQuery q = Ebean.createSqlQuery("select distinct job_def_id from yarn_app_result order by job_def_id;");
Contributor

I am not in favor of merging this. In prod environments there will be millions of entries, and this can become overkill.
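One way to address this concern (a sketch, not part of the PR) is to filter on the typed prefix and cap the result size, so the autocomplete query stays bounded even with millions of rows:

```java
public class AutocompleteQuery {
    // Builds a bounded autocomplete query; the prefix is meant to be passed as a
    // bind parameter (:prefix) rather than concatenated, to avoid SQL injection.
    public static String build(int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        return "select distinct job_def_id from yarn_app_result"
             + " where job_def_id like :prefix"
             + " order by job_def_id limit " + limit;
    }
}
```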
