-
Notifications
You must be signed in to change notification settings - Fork 0
Add files via upload #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
iross
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's in good shape overall! There are a few things worth addressing here and I think we should go ahead and fix them as part of this PR so that you can try out that aspect of the PR workflow. Commits to this branch will automatically be included in the PR, and you can mark all the comments as resolved as you push fixes.
|
|
||
| ## 📁 Repository Structure | ||
|
|
||
| - hold_classifer.py # Diagnoses held jobs and groups by hold reasons |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the files have been renamed since this readme was written.
| if wall_time: | ||
| runtimes.append(wall_time) | ||
|
|
||
| from statistics import median |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've already imported statistics, so no need to do it again here: you can just use statistics.median
| print(f"{'Efficiency Notes':^80}") | ||
| print("=" * 80) | ||
|
|
||
| def warn(resource, efficiency): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the upper and lower bounds should be arguments. I can definitely see value in being able to set different thresholds for each resource type.
|
|
||
| if __name__ == "__main__": | ||
| if len(sys.argv) != 2: | ||
| print("Usage: python htcondor_cluster_summary.py <ClusterId>") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Name of the script is wrong
| filepath = os.path.join(data_dir, f"cluster_{cluster_id}_jobs.csv") | ||
|
|
||
| if not os.path.exists(filepath): | ||
| print(f"File not found: {filepath}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't be an informative error message for a user.
| code = ad.eval("HoldReasonCode") | ||
| subcode = ad.eval("HoldReasonSubCode") | ||
|
|
||
| reason = ad.eval("HoldReason").split('. ')[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment here to explain why the split. I'm guessing it eliminates a lot of the EP-specific things that would muddy up the similarities?
| Dict[int, List[Tuple[str, int]]]: A dictionary mapping each HoldReasonCode to a list of (HoldReason, HoldReasonSubCode) tuples. | ||
| """ | ||
| def group_by_code(cluster_id): | ||
| global total_jobs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a global? It seems to only be used in this function.
|
|
||
| while hits and len(all_hits) < MAX_RESULTS: | ||
| remaining = MAX_RESULTS - len(all_hits) | ||
| to_add = hits[:remaining] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing the reason for this. If hits is just the next page of results from ES, I don't see why we'd want to potentially truncate that collection.
| def load_job_data(cluster_id, folder="cluster_data"): | ||
| filepath = os.path.join(folder, f"cluster_{cluster_id}_jobs.csv") | ||
| if not os.path.exists(filepath): | ||
| print(f"File not found: {filepath}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, a user who sees this message won't necessarily understand that it was a result of a bad ClusterID (or at least a Cluster whose csv isn't where the program expected)
| return jobs | ||
|
|
||
| # safe conversion to float | ||
| def safe_float(val): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things like this (and the validate_params) which are repeated patterns can be moved into utils so that definition changes only need to happen once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! It looks like you started to define things there, but haven't imported them from there.
No description provided.